On the similarities and differences between Hashtable and Dictionary


Until now, my understanding of these two collection classes stopped at whether they support generics. Over the past few days, while reading Introduction to Algorithms, I took the opportunity to study how the two classes are implemented internally.

From a data-structure point of view, Hashtable and Dictionary are both hash tables: each one hashes the key and maps it to a slot in the table. The difference lies in how they handle collisions. A hash function may map different keys to the same slot; we call this a collision, and to insert the data we need some way to resolve it.

Chaining

In chaining, all elements that hash to the same slot are placed in a linked list. The slot holds a pointer to the head of the list, or nil if the list is empty. For a hash table that holds n elements in m slots, we define the load factor α as n/m, which is the average number of elements stored in a chain.

Insertion, deletion, and lookup under chaining are essentially the basic linked-list operations, so I will not go into them here.
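As a rough illustration of the idea, here is a minimal chaining sketch. The class name ChainedTable and its members are hypothetical and are not part of .NET; string keys, int values, and the absence of a duplicate-key check in Add are simplifying assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical teaching class, not part of .NET: a hash table that resolves
// collisions by chaining. Each slot holds a linked list, or null if it is empty.
public class ChainedTable
{
    private readonly LinkedList<KeyValuePair<string, int>>[] slots;
    private int count;

    public ChainedTable(int slotCount)
    {
        slots = new LinkedList<KeyValuePair<string, int>>[slotCount];
    }

    // Load factor = n / m: the average number of elements per chain.
    public double LoadFactor => (double)count / slots.Length;

    // Clear the sign bit so the modulo result is never negative.
    private int SlotOf(string key) => (key.GetHashCode() & 0x7fffffff) % slots.Length;

    // Inserts at the head of the chain. Duplicate keys are not checked here.
    public void Add(string key, int value)
    {
        int i = SlotOf(key);
        if (slots[i] == null) slots[i] = new LinkedList<KeyValuePair<string, int>>();
        slots[i].AddFirst(new KeyValuePair<string, int>(key, value));
        count++;
    }

    // Walks the chain of the key's slot until the key is found.
    public bool TryGet(string key, out int value)
    {
        var chain = slots[SlotOf(key)];
        if (chain != null)
        {
            foreach (var pair in chain)
            {
                if (pair.Key == key) { value = pair.Value; return true; }
            }
        }
        value = 0;
        return false;
    }

    // Unlinks the node that holds the key, if there is one.
    public bool Remove(string key)
    {
        var chain = slots[SlotOf(key)];
        if (chain == null) return false;
        for (var node = chain.First; node != null; node = node.Next)
        {
            if (node.Value.Key == key) { chain.Remove(node); count--; return true; }
        }
        return false;
    }
}
```

With 4 slots and 5 elements the load factor is 1.25, which is allowed under chaining because the chains live outside the slot array.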

Open addressing

In open addressing, all elements are stored in the hash table itself rather than in external lists as in chaining. Because all the data lives in the table, the number of slots m must be at least the number of elements n, which means the load factor is at most 1.

In open addressing, when inserting an element we pass the key and a probe number (counting up from 0) to the hash function, which returns the corresponding slot. Insertion first checks slot hash(key, 0). If that slot is not empty, the probe number is incremented and the next slot is checked, until an empty slot is found or the table turns out to be full. Lookup follows the same probe sequence. If the search reaches an empty slot, the lookup ends, because if the key were present it would have been stored in that slot.

The trickiest operation in open addressing is deletion. If deleting simply sets the slot to null, there is a problem. Suppose that while inserting key k we found slot i occupied and therefore stored k in a later slot of the probe sequence. If we later delete the element in slot i by setting the slot to null, a search for k will stop at slot i and never find k. We can solve this problem with a flag bit. The concrete implementation is described below.
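The insert, lookup, and delete behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the class ProbingSet is hypothetical, it stores int keys, and it uses linear probing (probe number added to the home slot) as the simplest stand-in for hash(key, probe). A per-slot state plays the role of the flag.

```csharp
using System;

// Hypothetical teaching class, not part of .NET: open addressing with linear
// probing and a "deleted" marker so that removals do not break later lookups.
public class ProbingSet
{
    private enum State : byte { Empty, Occupied, Deleted }

    private readonly int[] keys;
    private readonly State[] states;

    public ProbingSet(int slotCount)
    {
        keys = new int[slotCount];
        states = new State[slotCount];
    }

    // hash(key, probe): probe 0 is the home slot, each further probe moves one slot on.
    private int Slot(int key, int probe) =>
        ((key & 0x7fffffff) % keys.Length + probe) % keys.Length;

    // Returns the slot index that holds the key, or -1 if the key is absent.
    private int Find(int key)
    {
        for (int probe = 0; probe < keys.Length; probe++)
        {
            int i = Slot(key, probe);
            if (states[i] == State.Empty) return -1;   // the key would have been stored here
            if (states[i] == State.Occupied && keys[i] == key) return i;
            // Deleted slots and slots holding other keys: keep probing.
        }
        return -1;
    }

    public bool Contains(int key) => Find(key) >= 0;

    // Returns false if the key is already present or the table is full.
    public bool Add(int key)
    {
        if (Find(key) >= 0) return false;
        for (int probe = 0; probe < keys.Length; probe++)
        {
            int i = Slot(key, probe);
            if (states[i] != State.Occupied)           // empty and deleted slots are reusable
            {
                keys[i] = key;
                states[i] = State.Occupied;
                return true;
            }
        }
        return false;
    }

    // Marks the slot as deleted instead of resetting it to empty.
    public bool Remove(int key)
    {
        int i = Find(key);
        if (i < 0) return false;
        states[i] = State.Deleted;
        return true;
    }
}
```

In a 7-slot table, keys 3 and 10 share home slot 3, so 10 ends up one slot further on. After Remove(3), Contains(10) still succeeds because the search treats the deleted slot as "keep going" rather than "stop".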

Double hashing

There are several probing methods for open addressing. Only double hashing is covered here, because it is one of the best methods and it is the one Hashtable uses.

Double hashing uses a hash function of the form h(k, i) = (h1(k) + i * h2(k)) mod m, where h1 and h2 are auxiliary hash functions. The first probe goes to position h1(k); each subsequent probe position is offset from the previous one by h2(k), modulo m. One point needs mentioning: for the probe sequence to cover the entire hash table, h2(k) must be coprime with the table size m. We will see later how the Hashtable class meets this condition.
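The effect of the coprimality condition can be checked with a small experiment. The helper below is a hypothetical sketch (the names DoubleHashing, Probe, and DistinctSlots are made up for illustration); it takes the two hash values as plain non-negative integers and counts how many distinct slots the first m probes reach.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of .NET.
public static class DoubleHashing
{
    // Probe position number i: (h1 + i * h2) mod m. Assumes h1 >= 0, h2 >= 0, m > 0.
    public static int Probe(int h1, int h2, int i, int m) =>
        (int)(((long)h1 + (long)i * h2) % m);

    // Number of different slots visited by probes 0 .. m-1.
    public static int DistinctSlots(int h1, int h2, int m)
    {
        var seen = new HashSet<int>();
        for (int i = 0; i < m; i++) seen.Add(Probe(h1, h2, i, m));
        return seen.Count;
    }
}
```

DistinctSlots(5, 3, 11) returns 11, because 3 and 11 are coprime and the sequence reaches every slot. DistinctSlots(5, 2, 8) returns 4, because 2 and 8 share a factor and the sequence cycles through slots 5, 7, 1, 3 only.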

Having covered chaining and open addressing, let us turn to Hashtable and Dictionary.

The Hashtable class resolves collisions with open addressing. Here is part of one of the Hashtable constructors:

 
 
    this.loadFactor = 0.72f * loadFactor;
    double num = ((float) capacity) / this.loadFactor;
    if (num > 2147483647.0)
    {
        throw new ArgumentException(Environment.GetResourceString("Arg_HTCapacityOverflow"));
    }
    // Use at least 11 buckets, and always a prime number of them.
    int num2 = (num > 11.0) ? HashHelpers.GetPrime((int) num) : 11;
    this.buckets = new bucket[num2];
    // Number of elements at which the table must grow.
    this.loadsize = (int) (this.loadFactor * num2);
    this.isWriterInProgress = false;

The constructor multiplies the supplied load factor by 0.72, a value that Microsoft considers ideal. As mentioned above, double hashing requires the probe increment (the value of the second hash function) to be coprime with the table size m. It is enough to make m prime and keep the increment between 1 and m - 1, so that the two are always coprime. HashHelpers.GetPrime returns a prime that is at least num, which guarantees that num2 is always prime. The bucket array is then allocated with that size.

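In terms of the double hashing formula, (this.GetHash(key) & 0x7fffffff) corresponds to the first hash function, and 1 + ((uint) (((seed >> 5) + 1) % (hashsize - 1))) corresponds to the second, which gives the probe increment. The increment therefore always lies between 1 and hashsize - 1.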
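To make the two expressions concrete, here is a small sketch that evaluates them for a given hash code. It assumes the increment is computed as 1 + (((seed >> 5) + 1) % (hashsize - 1)) and that the first slot is seed % hashsize; the class HashtableProbe and the method FirstSlots are hypothetical names used only for this illustration, not the framework's own code.

```csharp
using System;

// Hypothetical helper that mimics the seed/increment computation discussed above.
public static class HashtableProbe
{
    // rawHash stands in for the result of GetHash(key).
    public static uint InitHash(int rawHash, int hashsize, out uint seed, out uint incr)
    {
        uint hashcode = (uint)(rawHash & 0x7fffffff);   // clear the sign bit
        seed = hashcode;
        // Always in the range 1 .. hashsize - 1, so it is coprime with a prime hashsize.
        incr = 1 + (uint)(((seed >> 5) + 1) % ((uint)hashsize - 1));
        return hashcode;
    }

    // The first few slots that would be probed for this hash code.
    public static int[] FirstSlots(int rawHash, int hashsize, int count)
    {
        InitHash(rawHash, hashsize, out uint seed, out uint incr);
        var slots = new int[count];
        int index = (int)(seed % (uint)hashsize);
        for (int n = 0; n < count; n++)
        {
            slots[n] = index;
            index = (int)(((long)index + incr) % (uint)hashsize);
        }
        return slots;
    }
}
```

For a hash code of 100 and a table of 11 buckets, the seed is 100, the increment is 1 + ((3 + 1) % 10) = 5, and the first three probed slots are 1, 6, and 0.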

The hash_coll field of a bucket holds the hash code of the key, and its highest bit records whether a collision has occurred at that bucket. When an insertion collides at a bucket, the highest bit of that bucket is set to 1. During a search, if the highest bit is 1, the search continues probing. Note the loop in the Contains method:

 
 
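    do
    {
        bucket = buckets[index];
        // An empty bucket ends the search: the key is not in the table.
        if (bucket.key == null)
        {
            return false;
        }
        // Compare the stored hash code (without the collision bit), then the key itself.
        if (((bucket.hash_coll & 0x7fffffff) == num3) && this.KeyEquals(bucket.key, key))
        {
            return true;
        }
        // Move to the next bucket in the probe sequence.
        index = (int) ((index + num2) % ((ulong) buckets.Length));
    }
    // hash_coll < 0 means the collision bit is set, so keep probing.
    while ((bucket.hash_coll < 0) && (++num4 < buckets.Length));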

Incidentally, when I first read this method I thought the loop condition could instead test bucket.key == this.buckets, because the Remove method sets bucket.key = this.buckets whenever bucket.hash_coll < 0. After thinking about it for a while, I concluded that testing bucket.hash_coll < 0 is more efficient. I will not say why here; readers who enjoy a puzzle are welcome to write their answer in the comments.

The Add method checks the element count. Once the count reaches the threshold, the Hashtable has to be expanded: the new capacity is a prime number greater than twice the current capacity, and the existing elements are then rehashed, which amounts to reinserting them into the new bucket array. I still have some questions about the role of the index variable in the Insert method. If any reader knows, please tell me in the comments.
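The growth rule can be sketched as "take the first prime above twice the current size". The helper below is a hypothetical illustration using trial division; the names Growth, IsPrime, and NextCapacity are made up, and the real framework may pick its primes differently, so the exact sizes it chooses may not match.

```csharp
using System;

// Hypothetical helper, not the framework's own prime selection.
public static class Growth
{
    // Simple trial-division primality test.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (int d = 3; (long)d * d <= n; d += 2)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Smallest prime strictly greater than twice the current capacity.
    public static int NextCapacity(int current)
    {
        int candidate = checked(2 * current + 1);   // odd, so stepping by 2 stays odd
        while (!IsPrime(candidate)) candidate = checked(candidate + 2);
        return candidate;
    }
}
```

Starting from 11 buckets this rule gives 23, then 47, then 97. After each growth step every existing element has to be reinserted, because its slot depends on the table size.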

The generic class Dictionary<TKey, TValue> resolves collisions by chaining. Each bucket stores the index of an entry. An entry plays the role of a linked-list node and stores the index of the next entry that collided into the same bucket. The slight difference from an ordinary linked list is that the entries live in an array.

 
  
  
    public struct Entry<TKey, TValue>
    {
        public int hashCode;   // hash code of the key
        public int next;       // index of the next entry in the same chain
        public TKey key;
        public TValue value;
    }

The Add operation of Dictionary first computes the hash code of the key, then uses the hash code to find the corresponding bucket, stores the value in an entry, and points the bucket at that entry. A lookup finds the corresponding bucket from the hash code and then follows the index stored in the bucket into the entries array.

One thing worth mentioning is the Remove method. So that the entries of removed nodes can be reused, Dictionary has a freeList field. When a node is removed, its index is assigned to freeList, and on a later Add, if a freed entry is available, the data is inserted into the entry that freeList points to.
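The buckets/entries layout and the free list can be sketched together in a small fixed-capacity class. This is an illustration only, not the real Dictionary: the name MiniDictionary, the freeCount counter, the use of -1 as the "no entry" index, and the lack of resizing are all assumptions made to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fixed-capacity sketch of the buckets/entries layout.
public class MiniDictionary<TKey, TValue>
{
    private struct Entry
    {
        public int hashCode;   // lower 31 bits of the hash code, -1 if the entry is free
        public int next;       // index of the next entry in the chain, -1 if none
        public TKey key;
        public TValue value;
    }

    private readonly int[] buckets;   // index of the first entry of each chain, -1 if empty
    private readonly Entry[] entries;
    private int count;                // number of entry slots handed out so far
    private int freeList = -1;        // index of the most recently freed entry
    private int freeCount;            // number of freed entries waiting for reuse

    public MiniDictionary(int capacity)
    {
        buckets = new int[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = -1;
        entries = new Entry[capacity];
    }

    public int Count => count - freeCount;

    private int FindEntry(TKey key, int hash)
    {
        for (int i = buckets[hash % buckets.Length]; i >= 0; i = entries[i].next)
        {
            if (entries[i].hashCode == hash &&
                EqualityComparer<TKey>.Default.Equals(entries[i].key, key)) return i;
        }
        return -1;
    }

    // Returns false for a duplicate key or when no entry slot is left.
    public bool TryAdd(TKey key, TValue value)
    {
        int hash = key.GetHashCode() & 0x7fffffff;
        if (FindEntry(key, hash) >= 0) return false;
        int index;
        if (freeCount > 0)
        {
            index = freeList;                   // reuse a freed entry first
            freeList = entries[index].next;
            freeCount--;
        }
        else
        {
            if (count == entries.Length) return false;   // a real dictionary would resize
            index = count++;
        }
        int b = hash % buckets.Length;
        entries[index].hashCode = hash;
        entries[index].next = buckets[b];       // new entry becomes the head of the chain
        entries[index].key = key;
        entries[index].value = value;
        buckets[b] = index;
        return true;
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        int i = FindEntry(key, key.GetHashCode() & 0x7fffffff);
        value = i >= 0 ? entries[i].value : default(TValue);
        return i >= 0;
    }

    public bool Remove(TKey key)
    {
        int hash = key.GetHashCode() & 0x7fffffff;
        int b = hash % buckets.Length;
        int last = -1;
        for (int i = buckets[b]; i >= 0; last = i, i = entries[i].next)
        {
            if (entries[i].hashCode != hash ||
                !EqualityComparer<TKey>.Default.Equals(entries[i].key, key)) continue;
            if (last < 0) buckets[b] = entries[i].next;   // unlink from the chain
            else entries[last].next = entries[i].next;
            entries[i].hashCode = -1;
            entries[i].next = freeList;                   // push onto the free list
            entries[i].key = default(TKey);
            entries[i].value = default(TValue);
            freeList = i;
            freeCount++;
            return true;
        }
        return false;
    }
}
```

With a capacity of 2, adding "a" and "b" fills both entries and a third TryAdd fails. After Remove("a"), adding "c" succeeds because it takes over the entry that "a" released.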

Original link: http://www.cnblogs.com/MichaelYin/archive/2011/02/14/1954724.html



