A Look Inside C# Collections: Hashtable

I plan to write a series of articles on Hashtable, Dictionary, HashSet, SortedList, List, and other collection classes, analyzing how they work internally so that the right one can be chosen in practice. This article starts with Hashtable.

First, a few questions:

1. Why are Hashtable lookups fast while additions are relatively slow, with the two differing by roughly an order of magnitude?
2. What is the load factor, and what is Hashtable's default load factor?
3. Are the elements in a Hashtable stored in order?
4. What is the default length of the bucket array in a Hashtable, and why must the length be a prime number?

The data in a Hashtable is stored in an internal array of bucket structures. Like an ordinary array, it has a fixed capacity, and values are read by array index. The following sections walk through what Hashtable does internally.

1. Creating a Hashtable: Hashtable ht = new Hashtable();

Hashtable has several constructors; the most commonly used is the parameterless one. When it runs, it calls Hashtable(int capacity, float loadFactor) with capacity 0 and loadFactor 1. The bucket array is initialized with a length of 3, and the effective load factor is 0.72: the loadFactor argument is multiplied by 0.72, a value Microsoft settled on as a trade-off. This can be seen by decompiling the class with Reflector.

2. Adding an element: ht.Add("a", "123");

Hashtable first checks whether the ratio of the element count to the bucket array length exceeds the load factor of 0.72.

1) Ratio below 0.72: hash the key "a", take the hash modulo the length of the bucket array, and store the key together with its value "123" at the resulting index. Because different keys can map to the same index (a hash collision; look the term up if it is unfamiliar), Hashtable resolves collisions with open addressing. Concretely, HashOf(k) % Array.Length becomes (HashOf(k) + d(k)) % Array.Length, which yields another slot for the key "a". Here d is an increment function. If the new slot is also occupied, the increment is applied again, and so on until an empty slot in the array is found.

2) Ratio above 0.72: expand the bucket array.
   a. Create a new array whose length is the smallest prime greater than twice the current length. For example, if the current length is 3, the new length is 7.
   b. Copy the elements of the old array into the new one. Because the length of the bucket array has changed, every key must be rehashed (this is a major factor in Hashtable performance).
   c. Insert the new element as described in 1).

3. Getting a value by key: var v = ht["a"];

1) Compute the hash of "a".
2) Take the result modulo the length of the bucket array. This gives the index at which "123" is expected to be stored.
3) Because of collisions, the key stored at that index may differ from the key being looked up. In that case, keep probing the next location until the matching key is found.

Hashtable has many other methods, such as Clear, Remove, ContainsKey, and ContainsValue. For reasons of space they are not covered here.

Now to answer the questions from the beginning of the article.

1. Lookups are fast because the element is located directly by array index; the only real cost is computing the hash of the key. Additions are slower because:
   a. When an element is added, its slot may already be occupied, and another slot has to be found.
   b. When the array is expanded, it has to be copied and every key in the old array rehashed.

2. The load factor is the ratio "number of elements / length of the internal bucket array". If the ratio is too large, collisions become more likely; if it is too small, space is wasted. The default value is 0.72, a balanced value obtained by Microsoft after a large number of experiments. The loadFactor argument must satisfy 0.1 <= loadFactor <= 1.0; otherwise an ArgumentOutOfRangeException is thrown.

3. No. An element's position is determined by the hash of its key and by collision handling, not by the order in which it was added.

4. The default length is 3. I took this from the decompiled code of .NET Framework 4.5; I do not know whether other versions of the framework use the same value. Why must the length of the array be a prime number? A prime is divisible only by 1 and itself. If the length were not prime, hash values sharing a factor with the length would map to only a fraction of the indexes, so collisions would cluster, and a probe increment sharing a factor with the length would never reach some slots.

I wrote this article in one sitting today, so it may well contain mistakes or unreasonable points. I hope you will point them out; as a beginner, I will gladly correct them.

This article covered Hashtable. The next one will cover Dictionary, followed by a comprehensive analysis of a scenario from a real project.
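
The Add, expand, and lookup steps described in this article can be sketched roughly as follows. This is a simplified illustration, not the actual Hashtable source: the SketchTable class, the NextPrime helper, and the exact form of the increment function are assumptions made for the example, and duplicate keys and removal are not handled.

```csharp
using System;

// Simplified open-addressing table; NOT the real System.Collections.Hashtable code.
class SketchTable
{
    private const double LoadFactor = 0.72;   // default ratio quoted in the article
    private string[] keys = new string[3];    // bucket array starts with length 3
    private string[] values = new string[3];
    private int count;

    public int BucketCount => keys.Length;

    public void Add(string key, string value)
    {
        // Expand first if one more element would exceed the load factor.
        if (count + 1 > keys.Length * LoadFactor) Expand();
        Insert(keys, values, key, value);
        count++;
    }

    public string Get(string key)
    {
        int length = keys.Length;
        int hash = key.GetHashCode() & 0x7FFFFFFF;       // force non-negative
        int index = hash % length;
        int step = 1 + hash % (length - 1);              // d(k): always 1..length-1
        for (int i = 0; i < length; i++)
        {
            if (keys[index] == null) return null;        // empty slot: key is absent
            if (keys[index] == key) return values[index];
            index = (index + step) % length;             // collision: probe next slot
        }
        return null;
    }

    private static void Insert(string[] k, string[] v, string key, string value)
    {
        int length = k.Length;
        int hash = key.GetHashCode() & 0x7FFFFFFF;
        int index = hash % length;
        int step = 1 + hash % (length - 1);
        // A prime length means any step eventually visits every slot,
        // and the load factor guarantees that an empty slot exists.
        while (k[index] != null) index = (index + step) % length;
        k[index] = key;
        v[index] = value;
    }

    private void Expand()
    {
        int newLength = NextPrime(keys.Length * 2);      // 3 -> 7 -> 17 in this sketch
        var newKeys = new string[newLength];
        var newValues = new string[newLength];
        // Indexes depend on the array length, so every key is rehashed.
        for (int i = 0; i < keys.Length; i++)
            if (keys[i] != null) Insert(newKeys, newValues, keys[i], values[i]);
        keys = newKeys;
        values = newValues;
    }

    private static int NextPrime(int n)
    {
        while (!IsPrime(n)) n++;
        return n;
    }

    private static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }
}

class Program
{
    static void Main()
    {
        var table = new SketchTable();
        table.Add("a", "123");
        table.Add("b", "456");
        Console.WriteLine(table.BucketCount);   // 3
        table.Add("c", "789");                  // third Add triggers the expansion
        Console.WriteLine(table.BucketCount);   // 7
        Console.WriteLine(table.Get("a"));      // 123
    }
}
```

The sketch makes the cost difference visible: Get only hashes and probes, while Add may additionally allocate a new array and rehash every existing key.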
