Java Collections Framework: HashMap (JDK 1.8) Source Code Analysis
I spent the past few days studying the underlying implementation of HashMap. There are plenty of blog posts about it, but almost none of them cover JDK 1.8.0_25, and comparing against that release shows that the implementation has changed substantially. This post was written on and off over a day; if anything here is unclear or wrong, please correct me.
In JDK 1.6, HashMap is implemented with a bucket array plus linked lists, i.e. chaining is used to resolve conflicts: all entries whose hashes map to the same bucket are stored in one linked list. When a bucket holds many elements, that is, many keys hash to the same index, lookup by key degrades to a linear scan of that list. In JDK 1.8, HashMap uses a bucket array, linked lists, and red/black trees: when a bucket's linked list grows past a threshold (8), it is converted to a red/black tree, greatly reducing search time within that bucket.
The relevant source code is pasted below:
1. Data structures involved: the bucket array, linked lists, and red/black trees used to resolve hash conflicts
// Node is a singly linked list node; it implements the Map.Entry interface
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    // Constructor: hash, key, value, next
    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    // Two nodes are equal if both key and value are equal;
    // a node always compares equal to itself
    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
// Red/black tree node
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // parent node
    TreeNode<K,V> left;    // left subtree
    TreeNode<K,V> right;   // right subtree
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;           // color attribute

    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }

    // Returns the root of the tree containing this node
    final TreeNode<K,V> root() {
        for (TreeNode<K,V> r = this, p;;) {
            if ((p = r.parent) == null)
                return r;
            r = p;
        }
    }
    // ... (other tree operations omitted)
}

transient Node<K,V>[] table;  // the array of buckets
With these three data structures in mind, anyone with a basic grounding in data structures can roughly guess the implementation. At its core, HashMap is an array of linked lists (loosely speaking). When a key-value pair is added, the hash of the key is computed first to determine its position in the array; but elements with the same bucket index may already occupy that position. In that case the new element is appended to them: entries sharing an array slot form a linked list, so the array stores linked lists. When a list grows too long, it is converted to a red/black tree, which greatly improves search efficiency.
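To make this picture concrete, here is a deliberately simplified sketch of the bucket + linked list idea. This is not the real HashMap code; the class and method names are invented for illustration, and resizing and treeification are omitted:

```java
import java.util.Objects;

// A toy hash map: a fixed array of buckets, each bucket a singly linked list.
// Purely illustrative; the real HashMap adds resizing and treeification.
class ToyHashMap<K, V> {
    static class Entry<K, V> {
        final K key; V value; Entry<K, V> next;
        Entry(K key, V value, Entry<K, V> next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = new Entry[16]; // fixed 16 buckets

    private int index(Object key) {
        // Same trick as HashMap: hash & (length - 1) picks the bucket
        return Objects.hashCode(key) & (table.length - 1);
    }

    public void put(K key, V value) {
        int i = index(key);
        // If the key already exists in this bucket's list, overwrite its value
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (Objects.equals(e.key, key)) { e.value = value; return; }
        }
        // Otherwise prepend a new entry to the bucket's list
        table[i] = new Entry<>(key, value, table[i]);
    }

    public V get(Object key) {
        for (Entry<K, V> e = table[index(key)]; e != null; e = e.next)
            if (Objects.equals(e.key, key)) return e.value;
        return null;
    }

    public static void main(String[] args) {
        ToyHashMap<String, Integer> m = new ToyHashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("a", 3); // overwrite
        System.out.println(m.get("a")); // prints 3
        System.out.println(m.get("c")); // prints null
    }
}
```

Entries landing in the same bucket simply chain together, which is exactly the conflict handling described above; the real HashMap differs in that it appends to the tail, resizes, and treeifies long chains.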
The following describes the code implementation:
2. Main attributes of HashMap
The default load factor (fill ratio) is 0.75: when the number of stored elements reaches 75% of the allocated capacity, the table must be resized. A larger load factor means the table is used more fully before resizing, so less space is wasted, but the chains get longer and lookups slower (although the red/black tree in the latest version mitigates this considerably). HashMap essentially trades space for time, so there is no need to pack the table too tightly; on the other hand, too small a load factor wastes space. In short: if memory is the concern, the load factor can be slightly larger; if lookup performance is the concern, it can be slightly smaller.
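As a usage sketch (not part of the source): when the number of entries is known in advance, choosing an initial capacity of expected / loadFactor + 1 keeps the element count below the resize threshold, avoiding intermediate resizes:

```java
import java.util.HashMap;
import java.util.Map;

class PresizeDemo {
    public static void main(String[] args) {
        int expected = 1000;
        // The requested capacity is rounded up to a power of two internally;
        // expected / 0.75 + 1 = 1334 rounds up to 2048, whose threshold
        // (2048 * 0.75 = 1536) exceeds 1000, so no resize occurs while filling.
        Map<Integer, String> m = new HashMap<>((int) (expected / 0.75f) + 1);
        for (int i = 0; i < expected; i++)
            m.put(i, "v" + i);
        System.out.println(m.size()); // prints 1000
    }
}
```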
public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

    private static final long serialVersionUID = 362498820763181265L;

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
    static final int MAXIMUM_CAPACITY = 1 << 30;        // maximum capacity
    static final float DEFAULT_LOAD_FACTOR = 0.75f;     // default fill ratio

    // When a bucket's linked list reaches this length, it is converted to a red/black tree
    static final int TREEIFY_THRESHOLD = 8;
    // When a tree bin shrinks below this size during resize, it is converted back to a list
    static final int UNTREEIFY_THRESHOLD = 6;
    // The smallest table capacity for which bins may be treeified
    static final int MIN_TREEIFY_CAPACITY = 64;

    transient Node<K,V>[] table;            // the array storing the elements
    transient Set<Map.Entry<K,V>> entrySet;
    transient int size;                     // the number of stored key-value pairs
    transient int modCount;                 // modification count, used for fail-fast iteration
    int threshold;                          // resize when size exceeds this (capacity * fill ratio)
    final float loadFactor;                 // fill ratio

    // ... (remaining members follow later)
3. Constructors
HashMap has four constructors. The main parameters involved are the initial capacity, the fill ratio, and a Map used for initialization. The code speaks for itself:
/* ---------------- Public operations -------------- */

// Constructor 1
public HashMap(int initialCapacity, float loadFactor) {
    // The initial capacity must not be negative
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    // Cap the initial capacity at the maximum capacity
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // The fill ratio must be positive
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity); // the new resize threshold
}

// Constructor 2
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

// Constructor 3
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

// Constructor 4: initialize from another Map
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}
4. Expansion Mechanism
When a hash table is constructed without a specified initial size, the default capacity is 16 (that is, the Node array has 16 slots). Once the number of elements in the Node[] array reaches the threshold (fill ratio * Node.length), the table is resized to twice its current capacity.
// Used both to initialize the HashMap's table and to resize it to twice its current size
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            // The capacity already exceeds 1 << 30 and cannot grow further;
            // just raise the threshold to Integer.MAX_VALUE
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY) // old capacity at least 16
            newThr = oldThr << 1; // double the resize threshold
    }
    else if (oldThr > 0) // oldCap == 0, oldThr > 0: initial capacity was stored in threshold
        newCap = oldThr;
    else {               // oldCap == 0, oldThr == 0: use the default capacity and fill ratio
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    // Move the old entries into the new table;
    // red/black tree bins and linked list bins are handled separately
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // (e.hash & oldCap) == 0 means the entry stays at index j;
                        // otherwise it moves to index j + oldCap
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
Clearly, resizing is time-consuming: every element of the old array has to be copied into the new array, so it pays to size the map appropriately up front.
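The lo/hi split in resize() relies on a property worth checking separately: when the capacity doubles from oldCap to 2 * oldCap, an entry's new index is either its old index j or j + oldCap, and which of the two is decided by the single bit hash & oldCap. A small standalone check (illustrative code, not part of the JDK source):

```java
class ResizeSplitDemo {
    // New index after doubling, computed the straightforward way
    static int newIndex(int hash, int oldCap) {
        return hash & (2 * oldCap - 1);
    }

    // New index computed the way resize() reasons about it:
    // keep index j if the oldCap bit of the hash is 0, else move to j + oldCap
    static int splitIndex(int hash, int oldCap) {
        int j = hash & (oldCap - 1); // old index
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        for (int hash = 0; hash < 1000; hash++) {
            if (newIndex(hash, oldCap) != splitIndex(hash, oldCap))
                throw new AssertionError("mismatch at hash " + hash);
        }
        System.out.println("ok"); // prints ok
    }
}
```

This is why resize() never recomputes indices from scratch: testing one bit suffices, and each bucket's chain splits into at most two chains.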
5. Locating an element's position in the Node[] array (used by put/get)
// Spread the high 16 bits of the hash code into the low 16; a null key hashes to 0
static final int hash(Object key) { int h; return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); }
public native int hashCode();
First, hash(key) computes the hash value h from the key, and h & (length - 1) then yields the array index. Among the common hashing schemes for hash tables (direct addressing, the division/remainder method, and so on), the remainder method is easy to compute and spreads keys well, reducing conflicts.
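Why does hash(key) XOR the high 16 bits into the low 16? Because the index is h & (length - 1), only the low bits of h participate for small tables; two hash codes that differ only in their high bits would otherwise always collide. A small illustration (the hash values are chosen arbitrarily):

```java
class SpreadDemo {
    // The same spreading step as HashMap.hash(key), applied to a raw hashCode
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int length = 16;
        int h1 = 0x10000; // differs from h2 only in the high bits
        int h2 = 0x20000;

        // Without spreading, both land in bucket 0 and always collide
        System.out.println(h1 & (length - 1)); // prints 0
        System.out.println(h2 & (length - 1)); // prints 0

        // After spreading, the high bits influence the index
        System.out.println(spread(h1) & (length - 1)); // prints 1
        System.out.println(spread(h2) & (length - 1)); // prints 2
    }
}
```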
Hashtable, for instance, uses the remainder (modulo) method to compute the bucket index:
int index = (hash & 0x7FFFFFFF) % tab.length;
However, the division inherent in the modulo operation is relatively slow. HashMap instead uses h & (length - 1), which yields the same index whenever length is a power of two and is much more efficient.
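The equivalence of the two index computations can be verified directly: for a power-of-two length, h & (length - 1) agrees with the non-negative remainder of h, even for negative hash values (a quick check, illustrative only):

```java
class IndexDemo {
    public static void main(String[] args) {
        int length = 16; // must be a power of two
        for (int h = -1000; h <= 1000; h++) {
            int byAnd = h & (length - 1);
            int byMod = Math.floorMod(h, length); // non-negative remainder
            if (byAnd != byMod)
                throw new AssertionError("mismatch at h = " + h);
        }
        System.out.println("equivalent"); // prints equivalent
    }
}
```

Note that a plain `%` would not work here: `-3 % 16` is -3 in Java, while `-3 & 15` is the usable bucket index 13, which is why `Math.floorMod` is used for the comparison.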
The HashMap implementation also contains the following code (replacing the while loop used in JDK 1.6), which guarantees that the hash table capacity is always a power of two, with shift operations taking the place of the loop:
// This code ensures that the HashMap capacity is always a power of two
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

As the source shows, tableSizeFor is called directly or indirectly from the HashMap constructors. The reason is this: a length that is a power of two guarantees that every bit of length - 1 (in binary) is 1, so in the index computation h & (length - 1) each bit of the result can be either 0 or 1, preserving the uniformity of the hash. Conversely, if length were odd, the last bit of length - 1 would be 0, so the last bit of every computed index would also be 0, i.e. every index would be even; the odd slots of the array would never hold elements, wasting a lot of space.
In short, keeping length a power of two makes every bit of the bitwise AND significant, including the last one, and makes the hash table's distribution more even.
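The bit-smearing in tableSizeFor can be checked by copying it into a small harness (MAXIMUM_CAPACITY reproduced from the source): each shift-and-OR doubles the run of 1-bits below the highest set bit of n, so after five steps all 32 low bits are filled and n + 1 is the smallest power of two >= cap.

```java
class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Copied from the HashMap source: smallest power of two >= cap
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(1));  // prints 1
        System.out.println(tableSizeFor(10)); // prints 16
        System.out.println(tableSizeFor(16)); // prints 16
        System.out.println(tableSizeFor(17)); // prints 32
    }
}
```

The initial `cap - 1` is what makes an exact power of two map to itself (16 stays 16) rather than jumping to the next power.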
6. The following describes the most common operations of HashMap: put and get.
Note that keys and values in HashMap can both be null.
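A quick demonstration of null handling (contrast with Hashtable, which throws NullPointerException for both null keys and null values):

```java
import java.util.HashMap;
import java.util.Map;

class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put(null, "nullKey"); // a null key hashes to 0 and goes into bucket 0
        m.put("k", null);       // null values are allowed too

        System.out.println(m.get(null));        // prints nullKey
        // get returning null is ambiguous: the value may be null,
        // or the key may be absent; containsKey tells them apart
        System.out.println(m.get("k"));         // prints null
        System.out.println(m.containsKey("k")); // prints true
    }
}
```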
Straight to the code:
/* ---------------------------------- get ---------------------------------- */
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // hash & (length - 1) locates the bucket holding the key
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            // If the first node is a TreeNode, this bucket resolves conflicts
            // with a red/black tree: traverse the tree to find the node
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // Otherwise walk the linked list
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
/* ---------------------------------- put ---------------------------------- */
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // If the table is null or empty, allocate it via resize()
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // (n - 1) & hash locates the put position; if the bucket is empty, insert directly
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // The first node has the same hash and an equal key as the one being inserted
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            // This bucket resolves conflicts with a red/black tree
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            // This bucket resolves conflicts with a linked list; p starts at the head
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // e is null: no node with an equal key was found,
                    // so append a new node at the tail
                    p.next = newNode(hash, key, value, null);
                    // After adding the node, if the list length reaches the
                    // threshold, convert the linked list to a red/black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // An equal key was found (null == null is allowed)
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e; // advance p to the next node
            }
        }
        if (e != null) { // existing mapping for key: update the value
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
The process of adding a key-value pair with put(key, value) is as follows (the code logic is actually quite clear):
1. Check whether the key-value array tab[] is null or empty; if so, call resize() to allocate it.
2. Compute the array index i from the hash of the key. If tab[i] == null, insert a new node directly; otherwise go to 3.
3. Handle the hash conflict in that bucket: check the type of the first node to decide whether the bucket is a linked list or a red/black tree, then insert accordingly. If a node with an equal key already exists, its value is replaced and the old value returned; after appending to a linked list, the list is converted to a red/black tree once its length reaches TREEIFY_THRESHOLD; finally, if the new size exceeds the threshold, the table is resized.
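The observable behavior of the steps above can be exercised through the public API (a usage sketch; putIfAbsent corresponds to calling putVal with onlyIfAbsent = true):

```java
import java.util.HashMap;
import java.util.Map;

class PutDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();

        // Fresh key: put returns null because there was no previous mapping
        System.out.println(m.put("a", 1)); // prints null

        // Existing key: the value is replaced and the old value returned
        System.out.println(m.put("a", 2)); // prints 1

        // onlyIfAbsent semantics: the existing value is kept and returned
        System.out.println(m.putIfAbsent("a", 9)); // prints 2

        System.out.println(m.get("a")); // prints 2
        System.out.println(m.size());   // prints 1
    }
}
```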