As we all know, a HashMap is a collection of key-value pairs, and each key-value pair is also called an Entry. These entries are stored in an array, which is the backbone of the HashMap.
The initial value of each element of the HashMap array is null.
For HashMap, we use two methods most often: get and put.
1. The principle of the put method
What happens when you invoke the put method?
For example, call hashMap.put("apple", 0) to insert an element with the key "apple". At this point we need a hash function to determine the insertion position (index) of the entry:
index = Hash("apple")
Suppose the computed index is 2. The entry is then stored at position 2 of the array.
However, because the length of the HashMap array is limited, index collisions become inevitable as more and more entries are inserted, no matter how good the hash function is. For example, a new entry may map to index 2, where an entry is already stored.
What should we do in this case? We can resolve the collision with a linked list.
Each element of the HashMap array is not just an Entry object but also the head node of a linked list. Each Entry object points to the next Entry node through its next pointer. When a new entry maps to an array position that is already occupied, it only needs to be inserted into the linked list at that position.
Note that a new entry node is inserted at the head of the list (head insertion). Why it is not appended to the tail is explained later. This describes HashMap in JDK 7 and earlier; since JDK 8, new entries are appended to the tail instead.
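The put steps described above can be sketched in a few lines of Java. This is only a simplified illustration, not the real java.util.HashMap: the class name HeadInsertMap, the fixed array length of 16, and the helper indexFor are assumptions made for the example, and indexFor uses the bit-mask formula that the article introduces further down. The keys "Aa" and "BB" are chosen because they happen to share the same hashCode in Java, so they are guaranteed to collide.

```java
// Simplified sketch of a chained hash table; NOT the real java.util.HashMap.
class HeadInsertMap {
    // One node of a bucket's linked list.
    static final class Entry {
        final String key;
        int value;
        Entry next;

        Entry(String key, int value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Backbone array; every slot starts out as null.
    final Entry[] table = new Entry[16];

    // Hypothetical helper: maps a key to an array position.
    int indexFor(String key) {
        return key.hashCode() & (table.length - 1);
    }

    void put(String key, int value) {
        int index = indexFor(key);
        // If the key is already in this bucket, just overwrite its value.
        for (Entry e = table[index]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                e.value = value;
                return;
            }
        }
        // Head insertion: the new entry becomes the head and points to the old head.
        table[index] = new Entry(key, value, table[index]);
    }

    public static void main(String[] args) {
        HeadInsertMap map = new HeadInsertMap();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashCode as "Aa", so it lands in the same bucket
        Entry head = map.table[map.indexFor("Aa")];
        System.out.println(head.key);      // BB (inserted last, now at the head)
        System.out.println(head.next.key); // Aa
    }
}
```

The bucket is scanned for an existing key first so that a repeated put updates the value rather than growing the chain.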
2. The principle of the get method
What happens when you use the get method to look up a value by its key?
First, the input key is hashed to obtain the corresponding index:
index = Hash("apple")
Because of the hash collisions just mentioned, several entries may be stored at the same position. In that case we start at the head node of the corresponding linked list and check the nodes one by one. Suppose the key we are looking for is "apple":
In the first step, we look at the head node, Entry6. The key of Entry6 is "banana", which is obviously not what we are looking for.
In the second step, we look at the next node, Entry1. The key of Entry1 is "apple", which is exactly what we are looking for.
Entry6 was placed at the head because the designers of HashMap reasoned that an entry inserted later is more likely to be looked up.
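The lookup walk described above can be sketched as follows. The class ChainLookup and the method find are hypothetical names for this illustration, and the chain is built by hand to mirror the example (Entry6 with key "banana" at the head, followed by Entry1 with key "apple"). The value 6 stored for "banana" is made up; only the value 0 for "apple" comes from the earlier put example.

```java
// Minimal sketch of walking one bucket's linked list; NOT the real java.util.HashMap.
class ChainLookup {
    static final class Entry {
        final String key;
        final int value;
        final Entry next;

        Entry(String key, int value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Returns the value stored for key, or null if no node in the chain has that key.
    static Integer find(Entry head, String key) {
        for (Entry e = head; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Entry entry1 = new Entry("apple", 0, null);    // inserted first
        Entry entry6 = new Entry("banana", 6, entry1); // inserted later, so it is the head
        System.out.println(find(entry6, "apple"));     // 0, found on the second step
        System.out.println(find(entry6, "cherry"));    // null, key is not in the chain
    }
}
```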
Earlier, a hash function was used to map a key to its position in the HashMap array:
index = Hash("apple")
How can the hash function achieve a uniform distribution? The index is derived from the hashCode value of the key. One obvious idea is the modulo operation:
index = HashCode(Key) % Length
Modulo works, but it is relatively slow. For efficiency, HashMap uses bit arithmetic instead. The formula is as follows (Length is the length of the HashMap array):
index = HashCode(Key) & (Length - 1)
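To make the relationship between the two formulas concrete, here is a small sketch that compares them. The class ModVsMask and both method names are assumptions for this example. Math.floorMod is used for the modulo variant because Java's % operator can return a negative result for a negative hashCode, which would not be a valid array index.

```java
// Sketch comparing the modulo formula with the bit-mask formula.
class ModVsMask {
    // Modulo variant; floorMod keeps the result non-negative for negative hash codes.
    static int byModulo(int hashCode, int length) {
        return Math.floorMod(hashCode, length);
    }

    // Bit-mask variant from the formula above.
    static int byMask(int hashCode, int length) {
        return hashCode & (length - 1);
    }

    public static void main(String[] args) {
        int[] samples = {0, 9, 17, 3029737, -7, Integer.MIN_VALUE};
        for (int h : samples) {
            // With a power-of-two length such as 16, both formulas agree.
            System.out.println(h + ": " + byModulo(h, 16) + " vs " + byMask(h, 16));
        }
    }
}
```

The two variants agree only when the length is a power of 2. For other lengths the mask no longer behaves like a remainder, which is the problem the rest of the article works through.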
Here is the entire process, using the key "book" as an example:
1. Compute the hashCode of "book". The result is 3029737 in decimal, or 101110001110101110 1001 in binary.
2. Assuming the HashMap length is the default 16, Length - 1 is 15 in decimal, or 1111 in binary.
3. AND the two results: 101110001110101110 1001 & 1111 = 1001, which is 9 in decimal, so index = 9.
In other words, the index produced by the hash algorithm depends entirely on the last few bits of the key's hashCode value.
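The three steps above can be reproduced directly in Java. BookIndexDemo and indexFor are hypothetical names for this sketch. It follows the article's simplified formula; the real java.util.HashMap additionally mixes the high bits of the hashCode into the low bits before masking, so the bucket it actually picks can differ.

```java
// Reproduces the worked example for the key "book" using the simplified formula.
class BookIndexDemo {
    static int indexFor(int hashCode, int length) {
        return hashCode & (length - 1);
    }

    public static void main(String[] args) {
        int h = "book".hashCode();
        System.out.println(h);                          // 3029737
        System.out.println(Integer.toBinaryString(h));  // 1011100011101011101001
        System.out.println(Integer.toBinaryString(15)); // 1111
        System.out.println(indexFor(h, 16));            // 9
    }
}
```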
Now assume the length of the HashMap is 10 and repeat the same steps. Length - 1 is 9 in decimal, or 1001 in binary:
101110001110101110 1001 & 1001 = 1001
Taken alone, this result looks fine. Let's try a new hashCode, 101110001110101110 1011:
101110001110101110 1011 & 1001 = 1001
And another hashCode, 101110001110101110 1111:
101110001110101110 1111 & 1001 = 1001
Although the second and third bits from the right changed from 0 to 1, the result of the operation is still 1001. In other words, when the HashMap length is 10, some index values are more likely to appear, and some index values will never appear (such as 0111).
This clearly violates the principle that a hash algorithm should distribute values uniformly.
By contrast, when the length is 16 or another power of 2, every bit of Length - 1 is 1. In that case the index is simply equal to the last few bits of the hashCode. As long as the input hashCode values are themselves evenly distributed, the result of the hash algorithm is uniform.
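The difference between a length of 10 and a length of 16 can be checked with a short experiment. This sketch (the class IndexSpread and the method reachable are hypothetical names) feeds the hash codes 0 to 999 through the mask formula and collects every index that shows up.

```java
import java.util.Set;
import java.util.TreeSet;

// Collects the indexes that hashCode & (length - 1) can produce for a given length.
class IndexSpread {
    static Set<Integer> reachable(int length) {
        Set<Integer> seen = new TreeSet<>();
        for (int h = 0; h < 1000; h++) {
            seen.add(h & (length - 1));
        }
        return seen;
    }

    public static void main(String[] args) {
        // Length 10: the mask is 1001, so only four slots are ever used.
        System.out.println(reachable(10));        // [0, 1, 8, 9]
        // Length 16: the mask is 1111, so all sixteen slots are used.
        System.out.println(reachable(16).size()); // 16
    }
}
```

With a length of 10, six of the ten slots stay empty regardless of the input, so the chains in the remaining four slots grow longer than necessary.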