C# Collection Types Demystified

Source: Internet
Author: User

Collections are an important part of the .NET FCL (Framework Class Library), and we deal with them constantly when writing C# code. The FCL provides a rich, easy-to-use set of collection types that makes writing code much more convenient. That very convenience, however, leaves many of us familiar with collections on the surface yet unfamiliar with how they work: many readers never get past simply using them. So today let's take a closer look at the various collections in C#.

First, let's look at the collection interfaces that the FCL provides:

The FCL provides both generic and non-generic collection types. Because non-generic collections store their elements as object, value types incur the performance overhead of boxing and unboxing, and the non-generic collections have steadily lost ground to their generic counterparts. This article therefore focuses on the generic collections, although the two families do not differ much.

IEnumerable and IEnumerator

The IEnumerable interface is the ancestor interface of all collection types; its role among collections is like that of Object among types. If a type implements IEnumerable, it can be iterated over, and it can be called a collection type (an enumerable). The interface definition is very simple: it has a single method, GetEnumerator(), which returns an iterator of type IEnumerator.

We can think of an iterator as a cursor in a database: a position within a sequence (collection) that can only move forward. Each call to MoveNext() moves the iterator to the next element if there is one, and Current returns the element at the present position. Because the calling code gets only one element at a time, the iterator has to keep track of which position in the sequence it has reached. Reset() resets that state, but in practice it is rarely used.

Several iterators may operate on the same sequence (collection) at once, which amounts to traversing the collection several times simultaneously. If they shared state, the traversals would interfere with one another. How is this avoided?

Collection classes do not implement the IEnumerator interface directly. Instead, they implement IEnumerable, whose only method, GetEnumerator, returns an object that implements IEnumerator. Each call to GetEnumerator() creates a new iterator object, and each iterator stores its own state, recording which element it has reached. Such an iterator is like a cursor in a sequence: you can have several cursors and move any one of them through the collection without affecting the others.
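
The independence of iterators can be seen in a small sketch. The class name IndependentEnumerators and the method Trace are made up for this illustration; only List<T> and IEnumerator<T> come from the FCL.

```csharp
using System;
using System.Collections.Generic;

public static class IndependentEnumerators
{
    // Advances two iterators over the same list by different amounts
    // and reports where each one ended up.
    public static string Trace()
    {
        var letters = new List<string> { "a", "b", "c" };
        using (IEnumerator<string> first = letters.GetEnumerator())
        using (IEnumerator<string> second = letters.GetEnumerator())
        {
            first.MoveNext();    // first  is now at "a"
            first.MoveNext();    // first  is now at "b"
            second.MoveNext();   // second is at "a", unaffected by first
            return first.Current + "," + second.Current;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Trace());   // prints "b,a"
    }
}
```

Each GetEnumerator() call returns its own cursor, so moving one leaves the other where it was.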

How is foreach implemented?

A for loop relies on the Length property and the index operator ([]). With the Length property, the C# compiler can use a for statement to iterate over every element of an array. This works for arrays because their length is fixed and they always support the index operator, but not every collection has a known number of elements. In addition, many collection classes, including Stack, Queue, and Dictionary, do not support retrieving elements by index. A more general way of iterating over a collection is therefore needed: as long as you can determine the first element, the next element, and the last element, you need neither the element count nor index-based access. This is the context in which foreach emerged. Internally, foreach uses the iterator's MoveNext and Current to traverse the elements. For a List<int>, a foreach loop that prints each number is expanded into roughly the following code:

    List<int> list = new List<int>();
    List<int>.Enumerator enumerator = list.GetEnumerator();
    try
    {
        int number;
        while (enumerator.MoveNext())
        {
            number = enumerator.Current;
            Console.WriteLine(number);
        }
    }
    finally
    {
        enumerator.Dispose();
    }

Implementing a Custom Collection

We can build our own custom collection by implementing the IEnumerable and IEnumerator interfaces ourselves.

A custom enumerable type:

    using System;
    using System.Collections;

    public class MySet : IEnumerable
    {
        internal object[] values;

        public MySet(object[] values)
        {
            this.values = values;
        }

        // Each call hands out a new iterator with its own position.
        public IEnumerator GetEnumerator()
        {
            return new MySetIterator(this);
        }
    }

A hand-written custom iterator:

    public class MySetIterator : IEnumerator
    {
        MySet set;

        /// <summary>
        /// Stores the position the iteration has reached.
        /// </summary>
        int position;

        internal MySetIterator(MySet set)
        {
            this.set = set;
            position = -1;   // before the first element
        }

        public object Current
        {
            get
            {
                // Invalid before the first MoveNext() and after the end.
                if (position == -1 || position == set.values.Length)
                {
                    throw new InvalidOperationException();
                }
                int index = position;
                return set.values[index];
            }
        }

        public bool MoveNext()
        {
            if (position != set.values.Length)
            {
                position++;
            }
            return position < set.values.Length;
        }

        public void Reset()
        {
            position = -1;
        }
    }

Test program:

    object[] values = { "a", "b", "c", "d", "e" };
    MySet mySet = new MySet(values);
    foreach (var item in mySet)
    {
        Console.WriteLine(item);
    }

This example also confirms that foreach internally uses the iterator's MoveNext and Current to perform the traversal.

Writing the iterator by hand, as in the example above, is tedious, but it was the only option in C# 1.0. Since C# 2.0 we can use the yield syntactic sugar to simplify iterators.

    // Replaces MySet.GetEnumerator; MySetIterator is no longer needed.
    public IEnumerator GetEnumerator()
    {
        for (int i = 0; i < values.Length; i++)
        {
            yield return values[i];
        }
    }

IEnumerable and IEnumerator are simple to implement and have only a few members, yet they are the foundation on which the whole edifice of C# collections is built.

ICollection<T> and ICollection

As the interface hierarchy shows, ICollection<T> inherits from the IEnumerable<T> interface and extends it.

The main additions are:

    1. Adds the Count property, which records the number of elements in the collection
    2. Supports adding and removing elements
    3. Supports checking whether the collection contains an element
    4. Supports clearing the collection, and so on

For any collection that implements ICollection<T>, we can get the current number of elements through the Count property, so these collections are also called countable collections.
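
As a minimal sketch of these members in use (CollectionDemo and Exercise are hypothetical names chosen for this example), the same code can work against any ICollection<T> implementation:

```csharp
using System;
using System.Collections.Generic;

public static class CollectionDemo
{
    // Works with any ICollection<T>, for example List<T> or HashSet<T>.
    public static int Exercise(ICollection<string> items)
    {
        items.Clear();                      // empty the collection
        items.Add("x");                     // add elements
        items.Add("y");
        items.Remove("x");                  // remove an element
        bool hasY = items.Contains("y");    // membership test
        return hasY ? items.Count : -1;     // Count needs no enumeration
    }

    public static void Main()
    {
        Console.WriteLine(Exercise(new List<string>()));      // prints 1
        Console.WriteLine(Exercise(new HashSet<string>()));   // prints 1
    }
}
```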

IList<T> and IList

The IList<T> interface inherits directly from the ICollection<T> and IEnumerable<T> interfaces, and adds the ability to manipulate the collection by index.

The main additions are:

    1. Get an element of the collection by index
    2. Get the index of a given element in the collection
    3. Insert an element into the collection at a specified index
    4. Remove the element at a specified index
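
The four index-based operations can be sketched as follows. ListIndexDemo and Exercise are hypothetical names; the members called on items are those of IList<T>.

```csharp
using System;
using System.Collections.Generic;

public static class ListIndexDemo
{
    public static string Exercise(IList<string> items)
    {
        items.Add("a");
        items.Add("c");
        items.Insert(1, "b");                    // insert at index 1: a, b, c
        int indexOfC = items.IndexOf("c");       // index of an element: 2
        items.RemoveAt(0);                       // remove by index: b, c
        items[0] = items[0].ToUpperInvariant();  // read and write by index: B, c
        return string.Join(",", items) + " / " + indexOfC;
    }

    public static void Main()
    {
        Console.WriteLine(Exercise(new List<string>()));   // prints "B,c / 2"
    }
}
```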

IDictionary<TKey, TValue> and IDictionary

The IDictionary<TKey, TValue> interface inherits directly from the ICollection<T> and IEnumerable<T> interfaces. Its elements are key-value pairs, and it adds the ability to manipulate the collection by key.

The main additions are:

    1. Get a value by its key
    2. Insert a new key-value pair {key: value}
    3. Check whether a key is present
    4. Remove a key-value pair by its key
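
A short sketch of the key-based operations, with DictionaryInterfaceDemo and Exercise as hypothetical names and a plain Dictionary standing in for any IDictionary<TKey, TValue>:

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryInterfaceDemo
{
    public static string Exercise(IDictionary<string, int> ages)
    {
        ages.Add("ann", 30);                     // insert; throws if the key already exists
        ages["bob"] = 25;                        // the indexer inserts or overwrites
        bool hadAnn = ages.ContainsKey("ann");   // key test
        ages.Remove("ann");                      // remove by key
        int bobAge;
        bool found = ages.TryGetValue("bob", out bobAge);   // lookup without exceptions
        return hadAnn + " " + found + " " + bobAge + " " + ages.Count;
    }

    public static void Main()
    {
        // prints "True True 25 1"
        Console.WriteLine(Exercise(new Dictionary<string, int>()));
    }
}
```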

That covers the main collection interfaces. Next, let's look at the concrete collection types.

Associative generic collection classes

1. Dictionary

Dictionary offers the fastest lookups of all the collection classes, because it is implemented internally with a hash function plus two arrays, so a lookup can be considered O(1). Dictionary is a classic case of trading space for time (the two arrays).

How Dictionary adds a new element:

Dictionary has two arrays. The first, named buckets, holds the head pointers of the static linked lists formed by synonyms (keys whose hash codes map to the same bucket): each slot stores the index, in the other array, of the first element of its list, and a value of -1 means that no element has that hash address. The second, named entries, holds the actual data of the hash table; the entries are organized into several singly linked lists through their next field. The entries array contains Entry structures, and each Entry consists of four parts: hashCode, next, key, and value.

Dictionary maps a key's hash code to a bucket by taking a remainder (the hash code modulo the length of buckets). Different keys can land in the same bucket, so collisions must be resolved. Dictionary resolves them by chaining.

We can trace this process by following the source code:

When the first element is added, the buckets and entries arrays are allocated with their initial size; the default is 3, and the array sizes are prime numbers. The key 1 is hashed; assuming the hash value of the first element is 9, targetBucket = 9 % buckets.Length (3) = 0. The first element is stored in the first slot of the entries array (index 0), and finally buckets[0] is set to 0. The internal structure is now buckets = [0, -1, -1], with entries[0] holding the first element and next = -1.

Next the second element is inserted. The key 2 is hashed; assuming the hash value of the second element is 3, targetBucket = 3 % buckets.Length (3) = 0, so the second element maps to bucket 0 as well. That bucket is already occupied, so there is a collision. Dictionary resolves it by chaining: the new element is stored in the next free slot of entries (index 1) and linked in front of the existing element, with its next field recording the relationship (next = 0), and finally buckets[0] is updated to 1. The internal structure is now buckets = [1, -1, -1], with entries[1].next = 0 and entries[0].next = -1.
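
The two insertions can be replayed with a toy model of the buckets/entries layout. This is a simplified sketch, not the FCL source: MiniTable, its members, and the idea of passing the hash code in as a parameter (so that the assumed values 9 and 3 can be reproduced) are all made up for illustration, and resizing is left out.

```csharp
using System;

public class MiniTable
{
    public struct Entry
    {
        public int HashCode;   // hash code of the key
        public int Next;       // index of the next entry in the chain, -1 if none
        public int Key;
        public string Value;
    }

    public readonly int[] Buckets = { -1, -1, -1 };   // -1 marks an empty bucket
    public readonly Entry[] Entries = new Entry[3];   // fixed size: no resizing in this sketch
    private int count;

    public void Add(int key, int hashCode, string value)
    {
        int target = hashCode % Buckets.Length;
        Entries[count] = new Entry
        {
            HashCode = hashCode,
            Next = Buckets[target],   // link to the previous head of the chain
            Key = key,
            Value = value
        };
        Buckets[target] = count;      // the new entry becomes the head
        count++;
    }

    public string Find(int key, int hashCode)
    {
        // Walk the chain that starts at the bucket for this hash code.
        for (int i = Buckets[hashCode % Buckets.Length]; i >= 0; i = Entries[i].Next)
        {
            if (Entries[i].HashCode == hashCode && Entries[i].Key == key)
            {
                return Entries[i].Value;
            }
        }
        return null;
    }

    public static void Main()
    {
        var table = new MiniTable();
        table.Add(1, 9, "first");     // 9 % 3 == 0, so buckets[0] = 0
        table.Add(2, 3, "second");    // 3 % 3 == 0, collision, buckets[0] = 1
        Console.WriteLine(table.Buckets[0]);        // prints 1
        Console.WriteLine(table.Entries[1].Next);   // prints 0
        Console.WriteLine(table.Find(1, 9));        // prints "first"
    }
}
```

Find shows why the chain matters: looking up key 1 starts at entries[1], sees a different key, and follows next to entries[0].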

We can confirm this analysis by looking at how Dictionary finds an element.

How Dictionary finds an element:

Dictionary can find elements quickly because the hash table stores each element's location: from the key's hash code we go straight to the bucket, which gives the index of the element in entries, and from there to the value. Every strength has its flip side, though, and Dictionary's shortcoming is just as obvious: the data is unordered, so traversing or searching it in a particular order is very inefficient.

2. SortedDictionary

SortedDictionary is similar to Dictionary, and the difference is in the name: Dictionary is unordered, whereas SortedDictionary is ordered. Keys must be unique and also kept in order, which naturally brings the binary search tree to mind. SortedDictionary uses a balanced binary search tree, the red-black tree, as its storage structure. Because lookups are binary searches through the tree, adding, finding, and removing elements are all O(log n). Compared with the SortedList described below, SortedDictionary is faster at adding and removing elements. If you need fast lookups together with good support for ordering, and elements are added and removed frequently, use SortedDictionary.
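
A small sketch of the ordering guarantee, with SortedDictionaryDemo and KeysInOrder as hypothetical names: keys come back sorted no matter in which order they were added or removed.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDictionaryDemo
{
    public static string KeysInOrder()
    {
        var names = new SortedDictionary<int, string>();
        names.Add(30, "thirty");
        names.Add(10, "ten");
        names.Add(20, "twenty");
        names.Remove(20);            // removal keeps the tree ordered
        names.Add(15, "fifteen");
        return string.Join(",", names.Keys);   // enumeration is in key order
    }

    public static void Main()
    {
        Console.WriteLine(KeysInOrder());   // prints "10,15,30"
    }
}
```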

[Figure: SortedDictionary source code for adding a new element]

3. SortedList

When both fast lookup and ordering are required, Dictionary cannot help, because it relies on a hash function and does not keep its elements in a linear order. The SortedList collection class handles this scenario.

SortedList is implemented internally with arrays, so adding and removing elements is O(n). Lookups use binary search, so finding an element is O(log n). SortedList therefore supports ordering, but at the cost of some lookup efficiency compared with Dictionary's O(1).

Both SortedList and SortedDictionary support fast lookup and ordering. The advantage of SortedList is that it uses less memory than SortedDictionary, while SortedDictionary inserts and removes unsorted data faster: O(log n) versus O(n) for SortedList. SortedList is therefore ideal when you need fast lookup and ordering but add and remove elements rarely.
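
Because SortedList keeps its keys and values in sorted arrays, it also allows access by position, which SortedDictionary does not offer. A minimal sketch (SortedListDemo and Exercise are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

public static class SortedListDemo
{
    public static string Exercise()
    {
        var prices = new SortedList<int, string>();
        prices.Add(30, "thirty");   // each Add keeps the backing arrays sorted
        prices.Add(10, "ten");
        prices.Add(20, "twenty");

        int position = prices.IndexOfKey(20);            // binary search: 1
        int smallest = prices.Keys[0];                   // access by position: 10
        string last = prices.Values[prices.Count - 1];   // "thirty"
        return position + "," + smallest + "," + last;
    }

    public static void Main()
    {
        Console.WriteLine(Exercise());   // prints "1,10,thirty"
    }
}
```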

[Figures: SortedList source code for the internal structure, getting a value by key, IndexOfKey, and adding a new element]

Non-associative generic collection classes

1. List

The generic List class provides a collection with no fixed length limit, and internally it stores its data in an array. Arrays have a fixed length, so List has to manage the length of the array itself in order to appear unbounded. In practice List maintains an array of a certain length (4 by default). When the number of elements exceeds 4, or the initial capacity, List creates a new array twice as long as the current one and copies the contents of the old array into it.
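
The growth pattern can be observed through the Capacity property. The sketch below assumes the doubling behaviour of the current Microsoft implementations; ListGrowthDemo and CapacityHistory are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ListGrowthDemo
{
    // Records every distinct Capacity value seen while adding elements.
    public static string CapacityHistory(int adds)
    {
        var list = new List<int>();
        var history = new List<int> { list.Capacity };
        for (int i = 0; i < adds; i++)
        {
            list.Add(i);
            if (list.Capacity != history[history.Count - 1])
            {
                history.Add(list.Capacity);   // the backing array was replaced
            }
        }
        return string.Join(",", history);
    }

    public static void Main()
    {
        Console.WriteLine(CapacityHistory(9));   // prints "0,4,8,16"

        // Passing the expected size up front avoids the intermediate arrays.
        var presized = new List<int>(9);
        Console.WriteLine(presized.Capacity);    // prints 9
    }
}
```

A list created with the default constructor starts with an empty array and only allocates the array of length 4 on the first Add.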

We can confirm this by looking at the List source code, for example with ILSpy.

[Figures: List source code for the key internal fields, adding an element, ensuring the array capacity, and the actual array growth]

Growing the array involves creating a new object and copying the elements, which is relatively expensive. If you can specify a suitable initial capacity, you avoid frequent allocation and copying. Furthermore, because the underlying structure is an array, insertions and removals have to shift elements, so List is not suited to frequent insertion and removal, although elements can be found quickly by array index. List therefore suits read-heavy, write-light scenarios.

2. LinkedList

We said above that List suits read-heavy, write-light scenarios, so there must be a list for write-heavy, read-light scenarios, and that is LinkedList. Readers familiar with data structures will have guessed why: LinkedList is implemented internally as a linked list, and a doubly linked one at that.

Because the internal structure is a linked list, a new element can be inserted before or after any node.
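
A brief sketch of node-relative insertion with the public API (LinkedListDemo and Build are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListDemo
{
    public static string Build()
    {
        var list = new LinkedList<string>();
        LinkedListNode<string> b = list.AddLast("b");   // keep a handle to the node
        list.AddBefore(b, "a");    // a, b
        list.AddAfter(b, "c");     // a, b, c
        list.AddFirst("start");    // start, a, b, c
        list.Remove(b);            // unlinking a known node shifts nothing
        return string.Join(",", list);
    }

    public static void Main()
    {
        Console.WriteLine(Build());   // prints "start,a,c"
    }
}
```

None of these calls moves existing elements; each one only rewires the links of the neighbouring nodes.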

[Figures: LinkedList source code for the node definition and for inserting a new element before a node]

Note that in the insert operation the pointer updates must be performed in a fixed order; the steps cannot be swapped.

3. HashSet

HashSet is an unordered collection that guarantees uniqueness. We can think of HashSet as a simplified Dictionary: Dictionary stores key-value pairs, whereas HashSet stores plain objects. Its internal implementation is essentially the same as Dictionary's, again a hash function plus two arrays; the difference is that the stored Slot structure no longer has a key.

Internal data structure:

The Slot structures are stored in m_slots, and each Slot consists of three parts: hashCode, next, and value.

How HashSet adds a new element:

The implementation of adding a new element is essentially the same as in Dictionary.
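
The uniqueness guarantee is visible in the return value of Add. A minimal sketch (HashSetDemo and Exercise are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    public static string Exercise()
    {
        var seen = new HashSet<string>();
        bool addedFirst = seen.Add("apple");   // true: a new element
        bool addedAgain = seen.Add("apple");   // false: already present, set unchanged
        seen.Add("pear");
        return addedFirst + " " + addedAgain + " " + seen.Count + " " + seen.Contains("pear");
    }

    public static void Main()
    {
        Console.WriteLine(Exercise());   // prints "True False 2 True"
    }
}
```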

4. SortedSet

SortedSet is to HashSet what SortedDictionary is to Dictionary. SortedSet keeps its elements in order, its internal implementation is also a red-black tree, and the way it operates on the tree is exactly the same as in SortedDictionary, so we will not analyze it further.

5. Stack

Stack is a last-in, first-out (LIFO) structure. The C# Stack is implemented with an array; given the LIFO nature of a stack, an array is the natural choice.

[Figures: Stack source code for the push and pop operations]
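
As a rough sketch of how an array-backed stack can push and pop (a simplified illustration, not the FCL source; ArrayStack and its initial size of 4 are assumptions made for this example):

```csharp
using System;

public class ArrayStack<T>
{
    private T[] items = new T[4];
    private int size;             // number of elements; also the next free index

    public int Count { get { return size; } }

    public void Push(T item)
    {
        if (size == items.Length)
        {
            Array.Resize(ref items, items.Length * 2);   // grow by doubling
        }
        items[size++] = item;     // the top of the stack is the end of the array
    }

    public T Pop()
    {
        if (size == 0)
        {
            throw new InvalidOperationException("Stack is empty.");
        }
        T item = items[--size];
        items[size] = default(T); // drop the reference so it can be collected
        return item;
    }
}

public static class ArrayStackDemo
{
    public static void Main()
    {
        var stack = new ArrayStack<int>();
        for (int i = 1; i <= 5; i++) stack.Push(i);   // the fifth push grows the array
        Console.WriteLine(stack.Pop());               // prints 5
        Console.WriteLine(stack.Count);               // prints 4
    }
}
```

Both operations touch only the end of the array, so no elements ever need to be shifted.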

6. Queue

Queue is a first-in, first-out (FIFO) structure. The C# Queue is also implemented with an array, and as we have seen, an array-based implementation inevitably involves growing the array. The C# Queue is in fact a circular queue, which can be pictured simply as a queue whose tail is joined to its head. Why do it this way? To save storage space and reduce the movement of elements. Shifting all the remaining elements forward after each dequeue would be very expensive, but without shifting, the freed slots at the front would stay idle and waste memory. A circular queue solves this problem.

[Figures: Queue source code for the enqueue and dequeue operations]
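
The circular-queue idea can be sketched as follows. This is a simplified illustration rather than the FCL source; RingQueue, its field names, and the initial size of 4 are assumptions made for this example.

```csharp
using System;

public class RingQueue<T>
{
    private T[] items = new T[4];
    private int head;   // index of the oldest element
    private int tail;   // index where the next element will be written
    private int size;

    public int Count { get { return size; } }

    public void Enqueue(T item)
    {
        if (size == items.Length) Grow();
        items[tail] = item;
        tail = (tail + 1) % items.Length;   // wrap around to reuse freed slots
        size++;
    }

    public T Dequeue()
    {
        if (size == 0) throw new InvalidOperationException("Queue is empty.");
        T item = items[head];
        items[head] = default(T);
        head = (head + 1) % items.Length;   // advance instead of shifting elements
        size--;
        return item;
    }

    private void Grow()
    {
        // Copy the elements in queue order so that head starts at 0 again.
        var bigger = new T[items.Length * 2];
        for (int i = 0; i < size; i++)
        {
            bigger[i] = items[(head + i) % items.Length];
        }
        items = bigger;
        head = 0;
        tail = size;
    }
}

public static class RingQueueDemo
{
    public static void Main()
    {
        var queue = new RingQueue<int>();
        for (int i = 1; i <= 4; i++) queue.Enqueue(i);
        queue.Dequeue();     // removes 1
        queue.Dequeue();     // removes 2
        queue.Enqueue(5);    // written into the slot freed by 1
        queue.Enqueue(6);    // written into the slot freed by 2
        queue.Enqueue(7);    // array is full, so it grows
        while (queue.Count > 0) Console.Write(queue.Dequeue() + " ");   // prints "3 4 5 6 7 "
    }
}
```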

Thread-Safe Collection classes

Note that none of the collections described above is thread-safe, so thread-safety problems can arise in a multithreaded environment. When multiple threads only read, the ordinary collections are fine. When multiple threads add, update, or delete, we can lock manually to ensure thread safety, but we must pay attention to the scope and granularity of the locks: improper locking can degrade performance or even cause deadlocks.

A better choice is the thread-safe collections provided by .NET (namespace: System.Collections.Concurrent). These collections use a variety of algorithms to minimize thread blocking.

    1. ConcurrentQueue: the thread-safe version of the queue
    2. ConcurrentStack: the thread-safe version of the stack
    3. ConcurrentBag: a thread-safe unordered collection of objects
    4. ConcurrentDictionary: the thread-safe dictionary
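
As a minimal sketch of updating a shared dictionary from several threads without explicit locks (ConcurrentDemo and CountEven are hypothetical names; the workload of 1000 iterations is arbitrary):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentDemo
{
    // Counts even and odd numbers in parallel and returns the even count.
    public static int CountEven()
    {
        var counts = new ConcurrentDictionary<string, int>();
        Parallel.For(0, 1000, i =>
        {
            string key = (i % 2 == 0) ? "even" : "odd";
            // Adds 1 for a new key, otherwise applies the update atomically per key.
            counts.AddOrUpdate(key, 1, (k, current) => current + 1);
        });
        return counts["even"];
    }

    public static void Main()
    {
        Console.WriteLine(CountEven());   // prints 500
    }
}
```

Doing the same with a plain Dictionary would require a lock around every update.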

Summary

Writing this, I suddenly realized that I had wandered into data structures. Program = data structures + algorithms. Choosing the right collection type for each scenario is, in essence, choosing the right data structure.




Source: http://songwenjie.cnblogs.com/