The implementation principle of HashMap in Java

Source: Internet
Author: User

In a recent interview I was asked how HashMap works in Java and found myself at a loss for words, so I learned my lesson and studied it properly.

First, hashCode and equals in Java

1. About hashCode
    1. hashCode exists mainly to make lookups fast. Hash-based structures such as Hashtable and HashMap use it to determine where an object is stored.
    2. If two objects are equal according to equals(java.lang.Object), their hashCode values must be the same.
    3. If a class overrides equals, it should override hashCode as well, and hashCode must be computed from the same fields that equals compares; otherwise rule 2 above is violated.
    4. Two objects with the same hashCode are not necessarily equal according to equals(java.lang.Object). It only means that in a hash-based structure such as Hashtable they are "stored in the same basket".

Again, hashCode is used for lookup, and equals is used to compare two objects for equality.
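The rules above can be illustrated with a small sketch. The class UserId and its fields are hypothetical names invented for this example; the point is only that equals and hashCode are derived from the same fields, so that an equal but distinct instance finds the same entry in a HashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class EqualsHashCodeDemo {
    // Hypothetical key class, used only for illustration.
    static final class UserId {
        private final int id;
        private final String region;

        UserId(int id, String region) {
            this.id = id;
            this.region = region;
        }

        // equals compares exactly the fields that hashCode uses.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof UserId)) return false;
            UserId other = (UserId) o;
            return id == other.id && Objects.equals(region, other.region);
        }

        // Equal objects therefore always produce the same hash code.
        @Override
        public int hashCode() {
            return Objects.hash(id, region);
        }
    }

    static String lookup() {
        Map<UserId, String> names = new HashMap<>();
        names.put(new UserId(9, "eu"), "alice");
        // A different instance that is equal to the stored key.
        return names.get(new UserId(9, "eu"));
    }

    public static void main(String[] args) {
        System.out.println(lookup());
    }
}
```

If hashCode were left at the default inherited from Object, the second UserId would most likely land in a different bucket and the lookup would return null even though equals says the keys match.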

The following explanation of hashCode is excerpted from other blogs:

1. hashCode is used for lookup. If you have studied data structures, you will remember this from the chapter on searching and sorting.
For example, suppose memory has the following locations:
0 1 2 3 4 5 6 7
I have a class with a field called id, and I want to store an instance in one of the 8 locations above. If I store it at an arbitrary location without using a hashCode, then to find it later I have to search all eight locations one by one, or use something like binary search.
Using a hashCode makes this far more efficient.
Define the hashCode of the class as id % 8 and store each object at the location given by that remainder. For example, if the id is 9, the remainder of 9 divided by 8 is 1, so the object goes in location 1; if the id is 13, the remainder is 5, so it goes in location 5. When looking an object up, dividing its id by 8 and taking the remainder leads straight to its location.
2. But what if two objects have the same hashCode (assume the id above is not unique)? For example, 9 and 17 both leave a remainder of 1 when divided by 8. Is that legal? The answer is yes. So how do we tell the objects apart? This is where equals comes in.
In other words, hashCode first determines which bucket an object belongs to. A bucket may hold many objects, so equals is then used to find the one we want inside that bucket.
So why override hashCode() when overriding equals()?
To find something in a bucket you have to find the bucket first. If you override only equals() and not hashCode(), the right bucket cannot be found, so overriding equals() alone is of little use.
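The eight-location example can be sketched in a few lines. BucketDemo, Item, bucketOf and find are hypothetical names chosen for this illustration, and comparing the id field stands in for equals; it is a toy model of the idea, not how HashMap itself is written.

```java
import java.util.ArrayList;
import java.util.List;

class BucketDemo {
    static final int BUCKETS = 8;

    // Hypothetical record with an id field, as in the text.
    static final class Item {
        final int id;
        final String name;

        Item(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // One list per location 0..7.
    private final List<List<Item>> buckets = new ArrayList<>();

    BucketDemo() {
        for (int i = 0; i < BUCKETS; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // The "hash code" from the text: the remainder of id divided by 8.
    static int bucketOf(int id) {
        return Math.floorMod(id, BUCKETS);
    }

    void add(Item item) {
        buckets.get(bucketOf(item.id)).add(item);
    }

    // Step 1: jump to the bucket. Step 2: compare within that bucket only.
    Item find(int id) {
        for (Item candidate : buckets.get(bucketOf(id))) {
            if (candidate.id == id) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        BucketDemo demo = new BucketDemo();
        demo.add(new Item(9, "first"));
        demo.add(new Item(13, "second"));
        demo.add(new Item(17, "third")); // shares bucket 1 with id 9
        System.out.println(bucketOf(9) + " " + bucketOf(13) + " " + bucketOf(17));
        System.out.println(demo.find(17).name);
    }
}
```

Ids 9 and 17 share bucket 1, so the final comparison inside the bucket is what tells them apart.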
2. About equals

1. equals and ==
== compares primitive values and references, and behaves differently for each:
When comparing primitive types, the result is true if the two values are the same.
When comparing references, the result is true if both references point to the same object in memory.

equals() performs object comparison as a method. Java does not allow the == operator to be overridden, which limits what it can express. So we override the equals() method to compare object contents, which the == operator cannot do.

2. The equals() method of the Object class compares references by default, just like ==. Some classes override it with this comparison rule: return true if the two objects are of the same type and have the same content. These classes include:
java.io.File, java.util.Date, java.lang.String, and the wrapper classes (Integer, Double, etc.)
String s1 = new String("abc");
String s2 = new String("abc");
System.out.println(s1 == s2);
System.out.println(s1.equals(s2));
The output is false, then true.

Second, the implementation principle of HashMap

1. HashMap overview

HashMap is an unsynchronized implementation of the Map interface based on a hash table. It provides all of the optional map operations and permits null values and the null key. The class makes no guarantees about the order of the map; in particular, it does not guarantee that the order will remain constant over time.
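A short sketch of the null handling described above, using only java.util.HashMap. The class name NullKeyDemo and the stored strings are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class NullKeyDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for the null key"); // a single null key is allowed
        map.put("k", null);                      // null values are allowed too
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.get(null));
        // get() returns null both for a missing key and for a key mapped to null,
        // so containsKey() is needed to tell the two cases apart.
        System.out.println(map.containsKey("k"));
        System.out.println(map.containsKey("missing"));
    }
}
```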

In Java, there are two basic building blocks for data structures: the array and the reference (an analog of a pointer). All data structures can be built from these two, and HashMap is no exception. HashMap is in fact a "linked-list hash" structure (separate chaining), that is, a combination of an array and linked lists.

At the bottom of HashMap is an array, and each slot of the array holds a linked list. When a new HashMap is created, this array is initialized.

The Java source code (from an older JDK, before Java 8, where every bucket is a plain linked list) is as follows:

transient Entry[] table;

static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    final int hash;
    ...
}

As you can see, Entry is the element type of the array. Each Map.Entry is a key-value pair that also holds a reference to the next element, which is what forms the linked list.

2. Storing and reading in HashMap

1) Storage

public V put(K key, V value) {
    // HashMap allows null keys and null values.
    // When key is null, putForNullKey is called and the value is placed in the first position of the array.
    if (key == null)
        return putForNullKey(value);
    // Rehash the key's hashCode.
    int hash = hash(key.hashCode());
    // Find the index in the table that corresponds to this hash value.
    int i = indexFor(hash, table.length);
    // If the Entry at index i is not null, walk the chain through each element's next reference.
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            // The key already exists: store the new value and return the old one.
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    // No Entry with this key exists at index i.
    modCount++;
    // Add the key-value pair at index i.
    addEntry(hash, key, value, i);
    return null;
}

The position (index) of the element in the array is derived from the hash value. If other elements are already stored at that position, the elements there are kept as a linked list: the newly added element goes at the head of the chain and the earliest added ends up at the tail. If the position is empty, the element is placed directly in that slot of the array.

The hash(int h) method rehashes the key's hashCode. It mixes the high-order bits into the low-order bits, so that hash codes differing only in their high bits do not collide once the index is taken from the low bits.

static int hash(int h) {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

As we can see, to find an element in a HashMap we compute its position in the array from the hash of its key. How that position is computed is the hash algorithm. As noted earlier, HashMap's data structure combines an array with linked lists, so we want elements to be distributed as evenly as possible, ideally with only one element per position. Then, when the hash algorithm yields a position, we know immediately that the element there is the one we want without traversing a linked list, which greatly improves query efficiency.

According to the source of the put method above, when the program puts a key-value pair into a HashMap, it first determines the Entry's storage location from the return value of the key's hashCode(): if the hashCode() values of two Entry keys are the same, they are stored at the same location. If the two keys then compare as equal via equals, the value of the newly added Entry overwrites the value of the existing Entry, but the key is not overwritten. If equals returns false, the newly added Entry forms a chain with the existing Entry, with the new Entry at the head of the chain. For details, see the addEntry() method.

This is how HashMap resolves hash collisions.
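To see why mixing in the high bits matters, here is a small sketch. The spread method reproduces the supplemental hash as found in older (pre-Java 8) JDK sources, and indexFor is written under the assumption that the table length is a power of two so that a bit mask selects the slot; the class name HashSpreadDemo and the sample hash codes are made up for the demonstration.

```java
class HashSpreadDemo {
    // Supplemental hash as in older JDK sources: folds high bits into low bits.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Assumes length is a power of two, so (length - 1) is a mask of low bits.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        int a = 0x10000; // these two hash codes differ only in their high bits
        int b = 0x20000;

        // Without the extra mixing, both fall into slot 0.
        System.out.println(indexFor(a, length) + " " + indexFor(b, length));

        // With the mixing, they are placed in different slots (1 and 2).
        System.out.println(indexFor(spread(a), length) + " " + indexFor(spread(b), length));
    }
}
```

Because the mask keeps only the low four bits of a 16-slot table, any key whose hash code varies only in higher bits would always collide unless those bits are folded down first.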

2) Read

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    // Walk the chain in the bucket that this hash maps to.
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

When getting an element from a HashMap, the hash of the key is computed first to find the corresponding position in the array; the required element is then found in the linked list at that position using the key's equals method.

3) Summary

Put simply, HashMap treats each key-value pair as a single unit at the bottom layer, and that unit is an Entry object. HashMap uses an Entry[] array to hold all key-value pairs. When an Entry needs to be stored, the hash algorithm determines its position in the array, and the equals method determines its place in the linked list at that position. When an Entry needs to be retrieved, the hash algorithm again locates the position in the array, and the equals method picks the Entry out of the linked list at that position.

3. Resizing a HashMap
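The storage and read paths summarized above can be condensed into a minimal chained hash map. This is a simplified sketch rather than the JDK code: MiniMap and Node are hypothetical names, the table has a fixed size and never resizes, the index is taken with a plain remainder instead of the supplemental hash, and a null key simply uses hash 0.

```java
import java.util.Objects;

class MiniMap<K, V> {
    // One node of a bucket's singly linked list.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;

    @SuppressWarnings("unchecked")
    MiniMap(int capacity) {
        table = (Node<K, V>[]) new Node[capacity];
    }

    private int indexOf(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return Math.floorMod(h, table.length);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int i = indexOf(key);
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (Objects.equals(e.key, key)) { // same key: overwrite the value
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // Different key in the same bucket (or empty bucket): insert at the head.
        table[i] = new Node<>(key, value, table[i]);
        return null;
    }

    V get(Object key) {
        for (Node<K, V> e = table[indexOf(key)]; e != null; e = e.next) {
            if (Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MiniMap<Integer, String> map = new MiniMap<>(8);
        map.put(9, "nine");
        map.put(17, "seventeen"); // collides with 9: both map to slot 1
        map.put(9, "NINE");       // equal key: value replaced, chain unchanged
        System.out.println(map.get(9) + " " + map.get(17));
    }
}
```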

As more and more elements are added to a HashMap, the probability of collisions rises, because the length of the array is fixed. To keep queries efficient, the HashMap array has to be expanded. Array expansion also occurs in ArrayList, so it is a common operation. Many people have doubts about its performance, but once we recall the "amortization" principle we can relax. After the HashMap array is expanded comes the most expensive step: every entry in the original array must have its position in the new array recalculated and be placed there. This is resize.
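The need to recalculate positions can be seen with a few numbers. The sketch below assumes the index is taken as hash & (length - 1) with a power-of-two length; RehashDemo and the sample hash values are made up for the illustration.

```java
class RehashDemo {
    // Assumes length is a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int[] hashes = {1, 17, 33, 49};
        for (int h : hashes) {
            // All four share slot 1 in a 16-slot table,
            // but split between slots 1 and 17 once the table has 32 slots.
            System.out.println(h + ": " + indexFor(h, 16) + " -> " + indexFor(h, 32));
        }
    }
}
```

Doubling the length exposes one more bit of each hash, so every entry either stays at its old index or moves up by the old length, and each one has to be checked.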

So when does a HashMap expand? When the number of elements in the HashMap exceeds array size * loadFactor, the array is expanded. The default loadFactor is 0.75 and the default array size is 16, so when the number of elements exceeds 16 * 0.75 = 12, the array is expanded to 2 * 16 = 32, that is, doubled, and the position of every element in the array is recalculated. This is a very expensive operation, so if we can predict the number of elements in the HashMap, presizing it can noticeably improve performance. For example, if we have 1000 elements we might write new HashMap(1000). In theory new HashMap(1024) is more appropriate, but as Annegu has pointed out, even if 1000 is passed, HashMap automatically rounds it up to 1024. However, new HashMap(1024) is still not the best choice, because 0.75 * 1024 = 768 < 1000. To get 0.75 * size > 1000 we need new HashMap(2048), which both takes the & index calculation into account (the size stays a power of two) and avoids a resize.
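The arithmetic in this paragraph can be checked with a short sketch. The helper names (roundUpToPowerOfTwo, threshold, capacityFor) are hypothetical and written for this example; they model the rule described in the text rather than call into HashMap itself.

```java
class CapacityDemo {
    // Smallest power of two that is >= requested.
    static int roundUpToPowerOfTwo(int requested) {
        int capacity = 1;
        while (capacity < requested) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Number of entries beyond which the table is expanded.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Smallest power-of-two capacity whose threshold exceeds the expected entry count.
    static int capacityFor(int expectedEntries, float loadFactor) {
        int capacity = 1;
        while (threshold(capacity, loadFactor) <= expectedEntries) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));       // default table: 12
        System.out.println(roundUpToPowerOfTwo(1000));  // 1024
        System.out.println(threshold(1024, 0.75f));     // 768, below 1000
        System.out.println(capacityFor(1000, 0.75f));   // 2048
    }
}
```

A capacity of 2048 gives a threshold of 1536, so 1000 entries fit without triggering an expansion, at the cost of a larger and mostly empty array.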

Summary: The implementation principle of HashMap:

    1. The index of an element in the array is computed from the hashCode of its key.
    2. When storing, if a key with the same hash value already exists, there are two cases: (1) if the keys are equal, the original value is overwritten; (2) if the keys are different (a collision), the new key-value pair is added to the linked list.
    3. When reading, the index for the hash value is located directly, and the matching value is then found by checking whether the keys are equal.
    4. Once this process is understood, it is easy to see how HashMap resolves hash collisions: the core is array storage, with colliding keys placed in a linked list, and a further comparison is done within the list whenever a collision occurs.

1. When reprinting, please credit the source: http://www.cnblogs.com/yuanblog/p/4441017.html

2. This article consists of personal notes and experience and may quote other articles, so it is offered only for private study and reference.

Reference blog:

http://www.cnblogs.com/yxnchinahlj/archive/2010/09/27/1836556.html

http://blog.csdn.net/fenglibing/article/details/8905007

http://zhangshixi.iteye.com/blog/672697
