Dictionary, SortedDictionary, and SortedList comparison

Source: Internet
Author: User

Http://hi.baidu.com/keeper/blog/item/1119d01ba481ddd3ad6e75e7.html

Dictionary, SortedDictionary, and SortedList are three generic classes that support lookup by key, and all of them belong to the System.Collections.Generic namespace. Their names and functions are very similar, so they are easy to confuse. It is therefore worth comparing them.

1. Implementation
The following information comes from MSDN:
The Dictionary(Of TKey, TValue) generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by its key is very fast, close to O(1), because the Dictionary(Of TKey, TValue) class is implemented as a hash table.
The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey.

It can be seen that Dictionary is essentially a Hashtable. However, it is faster than the Hashtable class because it supports generics. (An experiment at the end of this article shows that even a Dictionary of Object is a little faster than Hashtable.)
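As a minimal sketch of what hash-based lookup looks like in use (the variable ages and the sample keys are illustrative assumptions, not part of the original post), TryGetValue retrieves a value by key without throwing when the key is missing:

```vbnet
Imports System
Imports System.Collections.Generic

Module DictionaryLookupSketch
    Sub Main()
        ' Hypothetical sample data, chosen only for illustration.
        Dim ages As New Dictionary(Of String, Integer)
        ages.Add("alice", 30)
        ages.Add("bob", 25)

        Dim age As Integer
        ' TryGetValue hashes the key and returns False instead of throwing
        ' KeyNotFoundException when the key is absent.
        If ages.TryGetValue("alice", age) Then
            Console.WriteLine("alice -> " & age)
        End If
        If Not ages.TryGetValue("carol", age) Then
            Console.WriteLine("carol is not in the dictionary")
        End If
    End Sub
End Module
```

The indexer ages("carol") would throw for a missing key, which is why TryGetValue is usually the safer choice when the key may be absent.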

--- Gorgeous split line ---

The SortedDictionary(Of TKey, TValue) generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary. In this respect it is similar to the SortedList(Of TKey, TValue) generic class. The two classes have similar object models, and both have O(log n) retrieval. Where they differ is in memory use and in the speed of insertion and removal:

SortedList(Of TKey, TValue) uses less memory than SortedDictionary(Of TKey, TValue).

SortedDictionary(Of TKey, TValue) has faster insertion and removal operations for unsorted data: O(log n), as opposed to O(n) for SortedList(Of TKey, TValue).

If the list is populated all at once from sorted data, SortedList(Of TKey, TValue) is faster than SortedDictionary(Of TKey, TValue).

Each key/value pair can be retrieved as a KeyValuePair(Of TKey, TValue) structure, or as a DictionaryEntry through the non-generic IDictionary interface.

Keys must be immutable for as long as they are used as keys in a SortedDictionary(Of TKey, TValue). Every key in a SortedDictionary(Of TKey, TValue) must be unique. A key cannot be a null reference (Nothing in Visual Basic), but a value can be null if the value type TValue is a reference type.

SortedDictionary(Of TKey, TValue) requires a comparer to perform key comparisons. You can specify an implementation of the IComparer(Of T) generic interface by using a constructor that accepts a comparer parameter. If no implementation is specified, the default generic comparer Comparer(Of T).Default is used. If the type TKey implements the System.IComparable(Of T) generic interface, the default comparer uses that implementation.

The foreach statement of the C# language (for each in C++, For Each in Visual Basic) requires the type of each element in the collection. Because each element of a SortedDictionary(Of TKey, TValue) is a key/value pair, the element type is neither the key type nor the value type. Instead, it is the KeyValuePair(Of TKey, TValue) type.

It can be seen that SortedDictionary is a balanced binary search tree (a red-black tree in the .NET implementation, rather than an AVL tree). Since it is a BST, we can of course traverse it in order. There are two ways:
1. For Each
2. Object.GetEnumerator

Experiment:

Code:

Dim testObject As New SortedDictionary(Of Integer, Integer)
With testObject
    .Add(7, 2)
    .Add(0, 1)
    .Add(5, 3)
    .Add(1, 1)
    .Add(4, 4)
End With

' Each element is a key/value pair, enumerated in ascending key order.
For Each kvp As Collections.Generic.KeyValuePair(Of Integer, Integer) In testObject
    MsgBox(kvp.Key)
Next

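The keys are shown in the order 0, 1, 4, 5, 7 (SortedList gives the same result).
However, if SortedDictionary is replaced with Dictionary, the result is 7, 0, 5, 1, 4, which is simply the insertion order here. Dictionary makes no guarantee about enumeration order.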

Another traversal method:

Code:

With testObject.GetEnumerator()
    ' MoveNext advances to the next pair in key order and returns False at the end.
    While .MoveNext()
        MsgBox(.Current.Key)
    End While
End With

--- Gorgeous split line ---

The SortedList(Of TKey, TValue) generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary. In this respect it is similar to the SortedDictionary(Of TKey, TValue) generic class. The two classes have similar object models, and both have O(log n) retrieval. Where they differ is in memory use and in the speed of insertion and removal:

SortedList(Of TKey, TValue) uses less memory than SortedDictionary(Of TKey, TValue).

SortedDictionary(Of TKey, TValue) has faster insertion and removal operations for unsorted data: O(log n), as opposed to O(n) for SortedList(Of TKey, TValue).

If the list is populated all at once from sorted data, SortedList(Of TKey, TValue) is faster than SortedDictionary(Of TKey, TValue).

Another difference between the SortedDictionary(Of TKey, TValue) and SortedList(Of TKey, TValue) classes is that SortedList(Of TKey, TValue) supports efficient indexed retrieval of keys and values through the collections returned by the Keys and Values properties. The lists do not need to be regenerated when these properties are accessed, because they are only wrappers around the internal arrays of keys and values.

Question: how can insertion into a binary tree be O(n)?

One explanation found online is that SortedList holds two arrays, so filling it behaves like an O(n^2) insertion sort (each insertion is O(n)), whereas inserting data that is already sorted is extremely fast (each insertion becomes O(1)). The same applies when data is removed.

Code:

Dim testObject As New SortedList(Of Integer, Integer)
' Keys are inserted in ascending order, so every insertion lands at the end.
For i As Integer = 1 To 1000000
    testObject.Add(i, randomGenerator.Next())
Next

Here randomGenerator is, of course, our random number generator:

Code:

Dim randomGenerator As New Random

The code above runs quite quickly, because the keys are inserted in ascending order.
If the key i is changed to 1000000 - i, it immediately becomes terribly slow.
The same happens when i is replaced with a random number, and after a long wait an exception is thrown, because keys must be unique and a random key eventually repeats.
So SortedList does not behave like a binary tree.
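This behaviour is what one would expect from a sorted array. The following is a minimal sketch of that idea, not the actual SortedList source; SortedArrayMap and its members are hypothetical names. Binary search finds the position in O(log n), but List.Insert has to shift every later element, so inserting near the front costs O(n), while appending at the end is cheap.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical illustration only; not the real SortedList implementation.
Class SortedArrayMap
    Private ReadOnly keys As New List(Of Integer)
    Private ReadOnly values As New List(Of Integer)

    Public Sub Add(ByVal newKey As Integer, ByVal newValue As Integer)
        ' BinarySearch returns the bitwise complement of the insertion
        ' point when the key is not found.
        Dim index As Integer = keys.BinarySearch(newKey)
        If index >= 0 Then Throw New ArgumentException("Duplicate key")
        index = Not index
        ' Insert shifts all elements after index one slot to the right.
        keys.Insert(index, newKey)
        values.Insert(index, newValue)
    End Sub

    ' Returns the key with zero-based rank (0 is the smallest key).
    Public Function KeyAt(ByVal rank As Integer) As Integer
        Return keys(rank)
    End Function
End Class

Module SortedArraySketch
    Sub Main()
        Dim map As New SortedArrayMap
        map.Add(7, 2)
        map.Add(0, 1)
        map.Add(5, 3)
        Console.WriteLine(map.KeyAt(0)) ' smallest key: 0
        Console.WriteLine(map.KeyAt(2)) ' largest key: 7
    End Sub
End Module
```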

SortedList can also directly access the key and the value of the entry whose key has rank k.
The methods are Object.Keys(k) and Object.Values(k).
This supports the explanation found online.
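A short sketch of this rank-based access (the sample data is an assumption made for illustration):

```vbnet
Imports System
Imports System.Collections.Generic

Module SortedListIndexSketch
    Sub Main()
        ' Hypothetical sample data.
        Dim list As New SortedList(Of Integer, String)
        list.Add(7, "seven")
        list.Add(0, "zero")
        list.Add(5, "five")

        ' Keys and Values are index-addressable views sorted by key.
        ' Index 1 is the entry with the second-smallest key.
        Console.WriteLine(list.Keys(1))   ' 5
        Console.WriteLine(list.Values(1)) ' five
    End Sub
End Module
```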

I think SortedList is of little use, unless the data is already mostly sorted or memory is tight. If all you need is to add a search for the k-th ranked node to a BST, there is a classic technique: add a LeftSize field to each node to store the size of its left subtree. (Of course, you can also use CQF's SBT, the Size Balanced Tree, and its Maintain operation... ~)
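The LeftSize technique can be sketched as follows. This is an illustrative, unbalanced BST under assumed names (RankNode, Insert, SelectKth); a practical version would also rebalance the tree and update LeftSize on removal.

```vbnet
Imports System

' Hypothetical node type for illustration.
Class RankNode
    Public Key As Integer
    Public LeftSize As Integer ' number of nodes in the left subtree
    Public Left As RankNode
    Public Right As RankNode
End Class

Module RankedBstSketch
    Function Insert(ByVal node As RankNode, ByVal newKey As Integer) As RankNode
        If node Is Nothing Then Return New RankNode With {.Key = newKey}
        If newKey < node.Key Then
            ' The new node goes into the left subtree, so it grows by one.
            node.LeftSize += 1
            node.Left = Insert(node.Left, newKey)
        Else
            node.Right = Insert(node.Right, newKey)
        End If
        Return node
    End Function

    ' Returns the key with zero-based rank k (k = 0 is the smallest).
    Function SelectKth(ByVal node As RankNode, ByVal k As Integer) As Integer
        While node IsNot Nothing
            If k < node.LeftSize Then
                node = node.Left
            ElseIf k = node.LeftSize Then
                Return node.Key
            Else
                ' Skip the left subtree and this node.
                k -= node.LeftSize + 1
                node = node.Right
            End If
        End While
        Throw New ArgumentOutOfRangeException("k")
    End Function

    Sub Main()
        Dim root As RankNode = Nothing
        For Each item As Integer In New Integer() {7, 0, 5, 1, 4}
            root = Insert(root, item)
        Next
        Console.WriteLine(SelectKth(root, 2)) ' third-smallest key: 4
    End Sub
End Module
```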

2. Functions
The functions of these three classes have mostly been covered above, since the implementation determines the functionality. Here is a summary.
Functions of Dictionary:
Add(key, value), Clear, ContainsKey / ContainsValue, Count, enumerator (unordered), Item(key), Remove(key)
Additional features of SortedDictionary:
The enumerator is ordered, corresponding to an in-order traversal of the BST.
Additional features of SortedList:
Capacity (Get/Set): it is array-based, after all
IndexOfKey, IndexOfValue (IndexOfValue returns the rank of the key that corresponds to the value, not the rank of the value itself)
Keys(k), Values(k): return the k-th element of the arrays, which are sorted by key
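To make the IndexOfValue remark concrete, here is a small sketch with assumed sample data, in which the values are deliberately in the opposite order to the keys:

```vbnet
Imports System
Imports System.Collections.Generic

Module SortedListRankSketch
    Sub Main()
        ' Hypothetical sample data.
        Dim list As New SortedList(Of Integer, String)
        list.Add(30, "a")
        list.Add(10, "c")
        list.Add(20, "b")

        ' Sorted by key, the entries are (10, c), (20, b), (30, a).
        Console.WriteLine(list.IndexOfKey(20))    ' 1
        Console.WriteLine(list.IndexOfValue("a")) ' 2: position of key 30, not the rank of "a"
        Console.WriteLine(list.IndexOfKey(99))    ' -1: key not present
    End Sub
End Module
```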

3. Speed
Practice yields true knowledge. (a celebrity)
Theory and practice do not always agree. (somebody)

Our test program:

Code:

Imports System.Collections.Generic
Imports System.Diagnostics

Module DictionarySpeedTest
    Dim randomGenerator As New Random
    Dim arrayListData As New List(Of Key_N_Data)
    ' Change this type to SortedDictionary or SortedList to test the other classes.
    Dim testObject As New Dictionary(Of Long, Long)

    Structure Key_N_Data
        Dim Key As Int64
        Dim Data As Int64
    End Structure

    Const item_count As Integer = 1000000
    Const test_count As Integer = 500000

    Dim lastTick As Long

    Sub TimerStart(ByVal text As String)
        Console.Write(text)
        lastTick = Now.Ticks
    End Sub

    Sub TimerEnd()
        ' One tick is 100 ns, so 10000 ticks make 1 ms.
        Dim t As Long = Now.Ticks - lastTick
        Console.WriteLine((t \ 10000).ToString() & " ms")
    End Sub

    Sub Main()
        Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High
        Console.WriteLine(testObject.GetType().ToString())

        TimerStart("Generating data...")
        For i As Integer = 1 To item_count
            Dim thisKeyData As Key_N_Data
            ' Combine two 31-bit random numbers into one 62-bit value.
            thisKeyData.Key = (CLng(randomGenerator.Next()) << 31) Or randomGenerator.Next()
            thisKeyData.Data = (CLng(randomGenerator.Next()) << 31) Or randomGenerator.Next()
            arrayListData.Add(thisKeyData)
        Next
        TimerEnd()

        TimerStart("Test 1: add data test...")
        For Each item As Key_N_Data In arrayListData
            testObject.Add(item.Key, item.Data)
        Next
        TimerEnd()

        TimerStart("Test 2: find data test...")
        For i As Integer = 1 To test_count
            With arrayListData.Item(randomGenerator.Next(0, item_count))
                If Not Object.Equals(testObject(.Key), .Data) Then MsgBox("Error!")
            End With
        Next
        TimerEnd()

        TimerStart("Test 3: remove data test...")
        For i As Integer = 1 To test_count
            ' Remove returns False if this key was already removed; that is fine here.
            testObject.Remove(arrayListData.Item(randomGenerator.Next(0, item_count)).Key)
        Next
        TimerEnd()
    End Sub
End Module
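The timer in the program above reads Now.Ticks, whose resolution is limited by the system clock, so short runs are measured rather coarsely. As an alternative sketch, not the setup used for the measurements in this article, System.Diagnostics.Stopwatch can be used instead; the loop size of 100000 is an arbitrary assumption.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics

Module StopwatchSketch
    Sub Main()
        Dim testObject As New Dictionary(Of Long, Long)

        ' StartNew creates a stopwatch and starts it immediately.
        Dim watch As Stopwatch = Stopwatch.StartNew()
        For i As Long = 1 To 100000
            testObject.Add(i, i)
        Next
        watch.Stop()

        Console.WriteLine("Add: " & watch.ElapsedMilliseconds & " ms")
    End Sub
End Module
```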

By changing the type of testObject, we can easily compare the speed of the three classes. Test results:

                    Add        Find      Remove
Dictionary          265 ms     203 ms    187 ms
SortedDictionary    1843 ms    828 ms    1234 ms
SortedList          N/A

We can reduce item_count and test_count by a factor of 10:

                    Add        Find      Remove
Dictionary          15 ms      31 ms     15 ms
SortedDictionary    93 ms      46 ms     38 ms
SortedList          8031 ms    15 ms     6046 ms

In random lookups, SortedList is faster than both Dictionary and SortedDictionary (the hash table and the BST). So SortedList does not seem to be just a simple array. (But I still think it is of little use.)

4. Summary
If you only need an index (lookup by key), use Dictionary.
If you need to find the smallest elements or traverse the elements in order, use SortedDictionary.
If elements are inserted and removed in ascending key order, or reads far outnumber modifications, or you need to access the k-th ranked element, or you are extremely stingy with memory, use SortedList. (So it actually has the most uses... orz)
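As a sketch of the second case, the smallest entries of a SortedDictionary can be read from the front of its enumeration. The example assumes System.Linq is available (First and Take are LINQ extension methods), and the sample data is made up:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SmallestKeySketch
    Sub Main()
        ' Hypothetical sample data.
        Dim scores As New SortedDictionary(Of Integer, String)
        scores.Add(7, "g")
        scores.Add(0, "a")
        scores.Add(5, "e")

        ' Enumeration is in ascending key order, so the first pair has the smallest key.
        Dim smallest As KeyValuePair(Of Integer, String) = scores.First()
        Console.WriteLine(smallest.Key) ' 0

        ' Take(2) yields the two entries with the smallest keys: 0 and 5.
        For Each kvp As KeyValuePair(Of Integer, String) In scores.Take(2)
            Console.WriteLine(kvp.Key & " = " & kvp.Value)
        Next
    End Sub
End Module
```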

PS: Microsoft seems rather stingy: SortedDictionary only supports ascending order with the default comparer. If we want descending order, we have to write a comparer ourselves.

Code:

Class MyComparer
    Inherits Comparer(Of Long)

    ' Swap the arguments to reverse the default ascending order.
    Public Overrides Function Compare(ByVal x As Long, ByVal y As Long) As Integer
        Return Comparer(Of Long).Default.Compare(y, x)
    End Function
End Class

Dim testObject As New SortedList(Of Long, Long)(New MyComparer)
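The same idea works for SortedDictionary. The sketch below is a self-contained variant that implements IComparer(Of Long) directly instead of inheriting from Comparer(Of Long); DescendingComparer and the sample entries are hypothetical names and data.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical comparer that reverses the natural order of Long keys.
Class DescendingComparer
    Implements IComparer(Of Long)

    Public Function Compare(ByVal x As Long, ByVal y As Long) As Integer _
        Implements IComparer(Of Long).Compare
        Return y.CompareTo(x)
    End Function
End Class

Module DescendingSketch
    Sub Main()
        Dim sorted As New SortedDictionary(Of Long, String)(New DescendingComparer)
        sorted.Add(1, "one")
        sorted.Add(3, "three")
        sorted.Add(2, "two")

        ' Enumeration now runs from the largest key to the smallest: 3, 2, 1.
        For Each kvp As KeyValuePair(Of Long, String) In sorted
            Console.WriteLine(kvp.Key)
        Next
    End Sub
End Module
```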

Now we can start the Dictionary vs. Hashtable showdown.

Code:

Const item_count As Integer = 1000000
Const test_count As Integer = 500000

                                 Add       Find      Remove
Dictionary(Of Long, Long)        271 ms    203 ms    187 ms
Dictionary (of Object)           468 ms    312 ms    234 ms
Hashtable                        859 ms    390 ms    218 ms
