Dictionary, SortedDictionary, and SortedList are the three generic classes in the .NET Framework that support lookup by key. They all belong to the System.Collections.Generic namespace. They are so similar in function that we often confuse them in actual use, so it is worth comparing them.
1. Implementation
The following information comes from MSDN:
The Dictionary(Of TKey, TValue) generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by its key is very fast, close to O(1), because the Dictionary(Of TKey, TValue) class is implemented as a hash table.
The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey.
It can be seen that Dictionary is basically a hash table. However, it is faster than the Hashtable class because it supports generics. (An experiment later in this post shows that even a Dictionary of Object is a little faster than Hashtable.)
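To make the point about hash quality concrete, here is a minimal sketch of a custom key type. PointKey and its hash formula are hypothetical and only for illustration; the assumption is that a key used in a Dictionary should override Equals and GetHashCode consistently, and that a hash spreading keys well keeps lookups close to O(1).

```vbnet
Imports System
Imports System.Collections.Generic

Module HashKeyDemo
    ' Hypothetical key type, not part of the original article.
    Class PointKey
        Public ReadOnly X As Integer
        Public ReadOnly Y As Integer

        Public Sub New(ByVal x As Integer, ByVal y As Integer)
            Me.X = x
            Me.Y = y
        End Sub

        ' Two keys are equal when both coordinates match.
        Public Overrides Function Equals(ByVal obj As Object) As Boolean
            Dim other As PointKey = TryCast(obj, PointKey)
            Return other IsNot Nothing AndAlso other.X = X AndAlso other.Y = Y
        End Function

        ' Equal keys must return equal hash codes. Shift and Xor are used
        ' because they cannot raise an overflow exception in VB.
        Public Overrides Function GetHashCode() As Integer
            Return (X << 16) Xor Y
        End Function
    End Class

    Sub Main()
        Dim table As New Dictionary(Of PointKey, String)
        table.Add(New PointKey(1, 2), "A")

        ' A different instance with the same coordinates finds the entry.
        Dim found As String = Nothing
        If table.TryGetValue(New PointKey(1, 2), found) Then
            Console.WriteLine("found: " & found)
        End If
    End Sub
End Module
```

If GetHashCode returned the same number for every key, the dictionary would still work, but every lookup would have to walk through all colliding entries.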
--- Gorgeous split line ---
The SortedDictionary(Of TKey, TValue) generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary. In this respect, it is similar to the SortedList(Of TKey, TValue) generic class. The two classes have similar object models, and both have O(log n) retrieval. Where the two classes differ is in memory use and in the speed of insertion and removal:
SortedList(Of TKey, TValue) uses less memory than SortedDictionary(Of TKey, TValue).
SortedDictionary(Of TKey, TValue) has faster insertion and removal operations for unsorted data: O(log n), as opposed to O(n) for SortedList(Of TKey, TValue).
If the list is populated all at once from sorted data, SortedList(Of TKey, TValue) is faster than SortedDictionary(Of TKey, TValue).
Each key/value pair can be retrieved as a KeyValuePair(Of TKey, TValue) structure, or as a DictionaryEntry through the nongeneric IDictionary interface.
Keys must be immutable as long as they are used as keys in the SortedDictionary(Of TKey, TValue). Every key in a SortedDictionary(Of TKey, TValue) must be unique. A key cannot be a null reference (Nothing in Visual Basic), but a value can be, if the value type TValue is a reference type.
SortedDictionary(Of TKey, TValue) requires a comparer to perform key comparisons. You can specify an implementation of the IComparer(Of T) generic interface by using a constructor that accepts a comparer parameter; if you do not specify one, the default generic comparer Comparer(Of T).Default is used. If type TKey implements the System.IComparable(Of T) generic interface, the default comparer uses that implementation.
The foreach statement of the C# language (for each in C++, For Each in Visual Basic) requires the type of each element in the collection. Because each element of a SortedDictionary(Of TKey, TValue) is a key/value pair, the element type is neither the key type nor the value type. Instead, it is KeyValuePair(Of TKey, TValue).
It can be seen that SortedDictionary is a balanced binary search tree (a red-black tree in the .NET implementation, which for our purposes behaves much like an AVL tree). Since it is a BST, we can of course traverse it in order. There are two ways:
1. For Each
2. GetEnumerator
Experiment:
    Dim testObject As New SortedDictionary(Of Integer, Integer)
    With testObject
        .Add(7, 2)
        .Add(0, 1)
        .Add(5, 3)
        .Add(1, 1)
        .Add(4, 4)
    End With
    ' Each element is a key/value pair, enumerated in ascending key order.
    For Each kvp As Collections.Generic.KeyValuePair(Of Integer, Integer) In testObject
        MsgBox(kvp.Key)
    Next
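The resulting sequence is 0, 1, 4, 5, 7 (SortedList gives the same result).
However, if you replace SortedDictionary with Dictionary, the result is 7, 0, 5, 1, 4. That happens to be the insertion order here; Dictionary does not guarantee any particular order.
Another way to traverse: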
    With testObject.GetEnumerator()
        While .MoveNext()
            MsgBox(.Current.Key)
        End While
    End With
--- Gorgeous split line ---
MSDN describes the SortedList(Of TKey, TValue) generic class as a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary. In this respect, it is similar to the SortedDictionary(Of TKey, TValue) generic class. The two classes have similar object models, and both have O(log n) retrieval. Where the two classes differ is in memory use and in the speed of insertion and removal:
SortedList(Of TKey, TValue) uses less memory than SortedDictionary(Of TKey, TValue).
SortedDictionary(Of TKey, TValue) has faster insertion and removal operations for unsorted data: O(log n), as opposed to O(n) for SortedList(Of TKey, TValue).
If the list is populated all at once from sorted data, SortedList(Of TKey, TValue) is faster than SortedDictionary(Of TKey, TValue).
Another difference between the SortedDictionary(Of TKey, TValue) and SortedList(Of TKey, TValue) classes is that SortedList(Of TKey, TValue) supports efficient indexed retrieval of keys and values through the collections returned by the Keys and Values properties. The lists do not need to be regenerated when these properties are accessed, because they are just wrappers around the internal arrays of keys and values.
How can inserting into a binary tree be O(n)?
One explanation found online is that SortedList holds two arrays, so filling it works like an O(n^2) insertion sort (each insert is O(n)), while inserting already-ordered data is extremely fast because nothing has to be shifted (each insert becomes roughly O(1)). The same applies when deleting data.
    Dim testObject As New SortedList(Of Integer, Integer)
    For i As Integer = 1 To 1000000
        testObject.Add(i, randomGenerator.Next())
    Next
Here randomGenerator is, of course, our random number generator:
    Dim randomGenerator As New Random
The preceding code runs quite quickly because the keys of the inserted data are in ascending order.
If i is changed to 1000000 - i, it immediately becomes terribly slow.
The same thing happens when i is replaced with a random number, and after a period of waiting an error occurs, because keys cannot be repeated.
Seen this way, SortedList does not behave like a binary tree at all.
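The two-array explanation can be sketched as follows. This is a hypothetical model, not the real SortedList source: it assumes keys and values live in two parallel lists, finds the slot with a binary search, and counts how many existing entries each insert has to shift. The names SortedArraySketch and AddSorted are made up for the illustration.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical model of a sorted list backed by two parallel arrays.
Module SortedArraySketch
    ' Inserts key/value at the sorted position and returns the number of
    ' existing entries that had to be moved to make room.
    Function AddSorted(ByVal keys As List(Of Integer), ByVal vals As List(Of Integer),
                       ByVal key As Integer, ByVal value As Integer) As Integer
        Dim index As Integer = keys.BinarySearch(key)   ' O(log n) search
        If index >= 0 Then Throw New ArgumentException("Duplicate key")
        index = Not index                               ' complement gives the insertion point
        Dim shifted As Integer = keys.Count - index
        keys.Insert(index, key)                         ' shifts everything after index
        vals.Insert(index, value)
        Return shifted
    End Function

    Sub Main()
        Dim keys As New List(Of Integer)
        Dim vals As New List(Of Integer)

        Dim ascendingShifts As Long = 0
        For i As Integer = 1 To 1000
            ascendingShifts += AddSorted(keys, vals, i, 0)
        Next

        keys.Clear()
        vals.Clear()
        Dim descendingShifts As Long = 0
        For i As Integer = 1000 To 1 Step -1
            descendingShifts += AddSorted(keys, vals, i, 0)
        Next

        Console.WriteLine("Ascending keys shifted {0} entries", ascendingShifts)
        Console.WriteLine("Descending keys shifted {0} entries", descendingShifts)
    End Sub
End Module
```

With 1000 keys, the ascending run shifts 0 entries in total and the descending run shifts 499500, which is the O(1) versus O(n) per insert behaviour described above.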
SortedList can also directly access the key and the value whose key ranks k-th.
The methods are object.Keys(k) and object.Values(k).
This confirms the explanation found online.
I think SortedList is of little use, unless the data is basically ordered or memory is tight. If all you need is to add rank-k lookup to a BST, you can use a classic technique: add a leftSize field to each node that stores the size of its left subtree. (Of course, you can also use cqf's SBT, which maintains sizes anyway...)
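The leftSize idea can be sketched like this. It is a minimal, unbalanced BST written only to show the rank lookup; a real implementation would also rebalance and update LeftSize during rotations and deletions. All names (RankedBstSketch, Node, Insert, KeyAtRank) are hypothetical.

```vbnet
Imports System

' Hypothetical sketch: a plain BST whose nodes track the size of their
' left subtree, so the k-th smallest key is found in O(height) steps.
Module RankedBstSketch
    Class Node
        Public Key As Integer
        Public LeftSize As Integer   ' number of nodes in the left subtree
        Public Left As Node
        Public Right As Node
    End Class

    ' Inserts a key (assumed not already present) and returns the subtree root.
    Function Insert(ByVal root As Node, ByVal key As Integer) As Node
        If root Is Nothing Then Return New Node With {.Key = key}
        If key < root.Key Then
            root.Left = Insert(root.Left, key)
            root.LeftSize += 1
        Else
            root.Right = Insert(root.Right, key)
        End If
        Return root
    End Function

    ' Returns the key with 0-based rank k, assuming 0 <= k < node count.
    Function KeyAtRank(ByVal root As Node, ByVal k As Integer) As Integer
        Dim current As Node = root
        While current IsNot Nothing
            If k < current.LeftSize Then
                current = current.Left
            ElseIf k = current.LeftSize Then
                Return current.Key
            Else
                ' Skip the left subtree and the current node.
                k -= current.LeftSize + 1
                current = current.Right
            End If
        End While
        Throw New ArgumentOutOfRangeException("k")
    End Function

    Sub Main()
        Dim root As Node = Nothing
        For Each key As Integer In New Integer() {7, 0, 5, 1, 4}
            root = Insert(root, key)
        Next
        Console.WriteLine(KeyAtRank(root, 0))   ' smallest key: 0
        Console.WriteLine(KeyAtRank(root, 2))   ' third smallest key: 4
        Console.WriteLine(KeyAtRank(root, 4))   ' largest key: 7
    End Sub
End Module
```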
2. Functions
The functions of these three classes have mostly been mentioned above, because the implementation determines the functionality. Here is a summary.
Functions of Dictionary:
Add(key, value), Clear, ContainsKey / ContainsValue, Count, enumerator (unordered), Item(key), Remove(key)
Additional feature of SortedDictionary:
The enumerator is ordered, corresponding to an in-order traversal of the BST.
Additional functions of SortedList:
Capacity (Set/Get): after all, it is backed by arrays
IndexOfKey, IndexOfValue (returns the rank of the key that corresponds to the value, not the rank of the value)
Keys(k), Values(k): return the k-th element of the arrays sorted by key
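The SortedList members listed above can be tried with a short example. The sample data reuses the keys from the earlier experiment; the module name and the BuildSample helper are made up for the illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module SortedListRankDemo
    ' Builds a small list; the keys end up stored as 0, 1, 4, 5, 7.
    Function BuildSample() As SortedList(Of Integer, String)
        Dim list As New SortedList(Of Integer, String)
        list.Add(7, "seven")
        list.Add(0, "zero")
        list.Add(5, "five")
        list.Add(1, "one")
        list.Add(4, "four")
        Return list
    End Function

    Sub Main()
        Dim list As SortedList(Of Integer, String) = BuildSample()
        Console.WriteLine(list.Keys(2))               ' key at rank 2: 4
        Console.WriteLine(list.Values(2))             ' its value: four
        Console.WriteLine(list.IndexOfKey(5))         ' rank of key 5: 3
        Console.WriteLine(list.IndexOfValue("one"))   ' rank of the key holding "one": 1
    End Sub
End Module
```

Ranks are 0-based, and IndexOfValue has to scan the value array, so it is a linear search rather than a binary one.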
3. Speed
True knowledge comes from practice. (some celebrity)
Theory that ignores practice goes wrong. (thity)
Our test program:
    Imports System
    Imports System.Collections.Generic
    Imports System.Diagnostics

    Module DictionarySpeedTest

        Dim randomGenerator As New Random
        Dim arrayListData As New List(Of Key_n_data)

        ' Change this type to SortedDictionary or SortedList to test the other classes.
        Dim testObject As New Dictionary(Of Long, Long)

        Structure Key_n_data
            Dim Key As Int64
            Dim Data As Int64
        End Structure

        Const item_count As Integer = 1000000
        Const test_count As Integer = 500000

        Dim lastTick As Long

        ' Prints a label and records the start time (in 100 ns ticks).
        Sub TimerStart(ByVal text As String)
            Console.Write(text)
            lastTick = Now.Ticks
        End Sub

        ' Prints the time elapsed since TimerStart, in milliseconds.
        Sub TimerEnd()
            Dim t As Long = Now.Ticks - lastTick
            Console.WriteLine((t \ 10000).ToString() & "ms")
        End Sub

        Sub Main()
            Process.GetCurrentProcess.PriorityClass = ProcessPriorityClass.High
            Console.WriteLine(testObject.GetType().ToString())

            TimerStart("generating data ...")
            For i As Integer = 1 To item_count
                Dim thisKeyData As Key_n_data
                ' Combine two 31-bit random numbers into one 62-bit value.
                thisKeyData.Key = (CLng(randomGenerator.Next()) << 31) Or randomGenerator.Next()
                thisKeyData.Data = (CLng(randomGenerator.Next()) << 31) Or randomGenerator.Next()
                arrayListData.Add(thisKeyData)
            Next
            TimerEnd()

            TimerStart("Test 1: add data test ...")
            For Each item As Key_n_data In arrayListData
                testObject.Add(item.Key, item.Data)
            Next
            TimerEnd()

            TimerStart("Test 2: find data test ...")
            For i As Integer = 1 To test_count
                With arrayListData.Item(randomGenerator.Next(0, item_count))
                    If Not Equals(testObject(.Key), .Data) Then MsgBox("error!")
                End With
            Next
            TimerEnd()

            TimerStart("Test 3: remove data test ...")
            For i As Integer = 1 To test_count
                ' Removing a key that is already gone simply returns False.
                testObject.Remove(arrayListData.Item(randomGenerator.Next(0, item_count)).Key)
            Next
            TimerEnd()
        End Sub

    End Module
By changing the type of testObject, we can easily compare the speed of the three classes. Test results:
                    Add        Find       Remove
Dictionary          265 ms     203 ms     187 ms
SortedDictionary    1843 ms    828 ms     1234 ms
SortedList          N/A
Reducing item_count and test_count by a factor of 10 gives:
                    Add        Find       Remove
Dictionary          15 ms      31 ms      15 ms
SortedDictionary    93 ms      46 ms      38 ms
SortedList          8031 ms    15 ms      6046 ms
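In this run, random lookup in SortedList comes out faster than in Dictionary and SortedDictionary (hash table and BST), although timings this small are close to the roughly 15 ms granularity of Now, so the ordering should be read with caution. Either way, SortedList does not seem to be just a simple array. (But I still think it is useless.)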
4. Summary
If you only need lookup by key, use Dictionary.
To find the smallest elements or traverse the elements in order, use SortedDictionary.
If elements are inserted and deleted in ascending key order, or reads far outnumber modifications, or you need to access the element of rank k, or memory is extremely tight, use SortedList. (So it is actually the most useful one... orz)
PS: Microsoft seems rather stingy: SortedDictionary only sorts in ascending order (with the default comparer). If we want descending order, we have to write a comparer ourselves.
    Class MyComparer
        Inherits Comparer(Of Long)
        ' Swapping the arguments reverses the default ordering.
        Public Overrides Function Compare(ByVal x As Long, ByVal y As Long) As Integer
            Return Comparer(Of Long).Default.Compare(y, x)
        End Function
    End Class
    Dim testObject As New SortedList(Of Long, Long)(New MyComparer)
Now we can begin the Dictionary vs. Hashtable showdown.
    Const item_count As Integer = 1000000
    Const test_count As Integer = 500000
                                 Add       Find      Remove
Dictionary(Of Long, Long)        271 ms    203 ms    187 ms
Dictionary(Of Object, Object)    468 ms    312 ms    234 ms
Hashtable                        859 ms    390 ms    218 ms
Conclusion: prefer Dictionary over Hashtable.