Java Collections: HashMap source code analysis
I. HashMap Overview
HashMap is a hash-table-based implementation of the Map interface. It provides all of the optional map operations and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees about the order of the mappings; in particular, it does not guarantee that the order stays the same over time.
Note that HashMap is NOT thread-safe. If you need a thread-safe map, you can wrap a HashMap with the static synchronizedMap method of the Collections class:
Map map = Collections.synchronizedMap(new HashMap());
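As a small illustration of the two points above (null keys/values and the synchronized wrapper), here is a minimal sketch. The class name SynchronizedMapDemo and its helper methods are made up for this example; only Collections.synchronizedMap and the Map methods come from the JDK. Iterating over a synchronized map still requires locking on the map itself, which the sketch does in countEntries.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class SynchronizedMapDemo {
    // Builds a small synchronized map that holds a null key and a null value.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> map =
                Collections.synchronizedMap(new HashMap<String, Integer>());
        map.put(null, 1);   // HashMap permits a single null key
        map.put("a", null); // and null values
        map.put("b", 2);
        return map;
    }

    // Iteration over a synchronized map must be guarded by the map's own lock.
    static int countEntries(Map<String, Integer> map) {
        int count = 0;
        synchronized (map) {
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = buildMap();
        System.out.println(map.get(null));        // value stored under the null key
        System.out.println(map.containsKey("a")); // key is present although its value is null
        System.out.println(countEntries(map));
    }
}
```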
II. Data Structure of HashMap
The underlying structure of HashMap is built mainly on an array and linked lists. Lookups are fast because the storage location is determined by computing a hash code. HashMap derives the hash value from the key's hashCode; as long as the hashCode is the same, the computed hash value is the same. When many objects are stored, different objects may produce the same hash value, which is known as a hash collision. Anyone who has studied data structures knows there are many ways to resolve hash collisions; HashMap resolves them with linked lists.
In the figure, each element of the array is the head node of a singly linked list. The linked lists resolve collisions: if different keys map to the same array position, they are placed in the same singly linked list.
As we can see, the hash table is composed of an array and linked lists. In an array of length 16, each element (a bucket) stores the head node of a linked list. By what rule are elements placed in the array? Generally the index is obtained as hash(key) % len, that is, the hash value of the element's key modulo the array length. For example, in the hash table above, 12 % 16 = 12, 28 % 16 = 12, 108 % 16 = 12, and 140 % 16 = 12. Therefore 12, 28, 108, and 140 are all stored at array index 12.
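The modulo rule can be checked with a few lines of Java. This is only a sketch of the hash(key) % len idea described above, not HashMap's actual index computation; the class BucketIndexDemo and the method indexFor are names chosen for the example, and the hash values are assumed to be non-negative.

```java
// Hypothetical demo class illustrating hash % length bucket selection.
class BucketIndexDemo {
    // Maps a (non-negative) hash value to a bucket index.
    static int indexFor(int hash, int length) {
        return hash % length;
    }

    public static void main(String[] args) {
        int length = 16;
        int[] hashes = {12, 28, 108, 140};
        for (int hash : hashes) {
            // Every value in this list lands in bucket 12.
            System.out.println(hash + " -> bucket " + indexFor(hash, length));
        }
    }
}
```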
At its core, HashMap is a linear array: the container that stores the data is an array. This may be puzzling. How can a linear array store and retrieve data as key-value pairs? HashMap does some extra work to make this possible.
First, HashMap defines a static nested class, Entry, whose important fields are key, value, and next. From the key and value fields we can see that Entry is the basic bean that represents a key-value pair in HashMap. The linear array mentioned above is an Entry[], and the contents of the Map are stored in its Entry objects.
Let's look at the Entry class code in HashMap:
/**
 * Entry is a node of a singly linked list.
 * It is the linked-list node used by HashMap's chained storage.
 * It implements the Map.Entry interface, that is, getKey(), getValue(),
 * setValue(V value), equals(Object o), and hashCode().
 */
static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    // points to the next node in the chain
    Entry<K,V> next;
    final int hash;

    // Constructor.
    // Parameters: hash value (h), key (k), value (v), next node (n)
    Entry(int h, K k, V v, Entry<K,V> n) {
        value = v;
        next = n;
        key = k;
        hash = h;
    }

    public final K getKey() {
        return key;
    }

    public final V getValue() {
        return value;
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    // Determines whether two entries are equal.
    // Returns true only if both the keys and the values are equal.
    public final boolean equals(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry e = (Map.Entry) o;
        Object k1 = getKey();
        Object k2 = e.getKey();
        if (k1 == k2 || (k1 != null && k1.equals(k2))) {
            Object v1 = getValue();
            Object v2 = e.getValue();
            if (v1 == v2 || (v1 != null && v1.equals(v2)))
                return true;
        }
        return false;
    }

    // Implements hashCode(): the key's hash XOR the value's hash
    public final int hashCode() {
        return (key == null ? 0 : key.hashCode()) ^
               (value == null ? 0 : value.hashCode());
    }

    public final String toString() {
        return getKey() + "=" + getValue();
    }

    // Called when put() overwrites the value of a key that is already in the HashMap.
    // Nothing is done here.
    void recordAccess(HashMap<K,V> m) {
    }

    // Called when an entry is removed from the HashMap.
    // Nothing is done here.
    void recordRemoval(HashMap<K,V> m) {
    }
}
In short, HashMap is an array of Entry objects. Each Entry holds a key and a value; its next field is itself an Entry, which is used to handle hash collisions by chaining entries into a linked list.
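To make the "array of linked-list heads" idea concrete, here is a deliberately tiny sketch of a chained hash table. It is not the JDK implementation: the class TinyChainedMap, its Node type, and the helper methods are invented for illustration, the table never resizes, and the index is computed with a plain modulo on a non-negative hash.

```java
// Hypothetical, simplified chained hash table; not the JDK's HashMap.
class TinyChainedMap<K, V> {
    // Node mirrors the hash, key, value, and next fields of Entry.
    static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;

    @SuppressWarnings("unchecked")
    TinyChainedMap(int capacity) {
        table = (Node<K, V>[]) new Node[capacity];
    }

    private static int hashOf(Object key) {
        return key == null ? 0 : key.hashCode();
    }

    // Clears the sign bit so the modulo result is never negative.
    private int indexFor(int hash) {
        return (hash & 0x7fffffff) % table.length;
    }

    private static boolean sameKey(Object a, Object b) {
        return a == b || (a != null && a.equals(b));
    }

    // Inserts or overwrites a mapping; returns the previous value or null.
    V put(K key, V value) {
        int hash = hashOf(key);
        int i = indexFor(hash);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (n.hash == hash && sameKey(key, n.key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // Collision or empty bucket: the new node becomes the head of the chain.
        table[i] = new Node<K, V>(hash, key, value, table[i]);
        return null;
    }

    // Walks the chain in the key's bucket; returns null if the key is absent.
    V get(Object key) {
        int hash = hashOf(key);
        for (Node<K, V> n = table[indexFor(hash)]; n != null; n = n.next) {
            if (n.hash == hash && sameKey(key, n.key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap<Integer, String> map = new TinyChainedMap<Integer, String>(16);
        // 12, 28, and 108 all fall into bucket 12 and form one chain.
        map.put(12, "a");
        map.put(28, "b");
        map.put(108, "c");
        System.out.println(map.get(28));
        System.out.println(map.get(44)); // same bucket, but never inserted
    }
}
```

Inserting at the head of the chain keeps put short in this sketch; a real implementation also has to track size and resize the table.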
III. HashMap source code analysis
Let's take a look at some key attributes in the HashMap class:
transient Entry[] table;   // array that stores the entries
transient int size;        // number of key-value mappings stored
int threshold;             // resize when size exceeds this value; threshold = capacity * loadFactor
final float loadFactor;    // load factor
transient int modCount;    // number of times the map has been modified
The load factor (loadFactor) indicates how full the hash table is allowed to get.
The larger the load factor, the more elements are packed into the table. The advantage is high space utilization, but the chance of collisions increases, the linked lists grow longer, and lookups become slower.
Conversely, the smaller the load factor, the fewer elements are packed into the table. The advantage is a lower chance of collisions, but space is wasted: the data in the table is too sparse (much of the space is still unused when the table starts to expand).
The greater the chance of collisions, the higher the lookup cost.
Therefore a balance must be struck between the chance of collisions and space utilization. This is essentially the well-known time-space trade-off in data structures.
If memory is plentiful and you want faster lookups, you can set the load factor to a smaller value. If memory is tight and lookup speed is not critical, you can set it to a larger value. In general, however, there is no need to set it at all, and the default value of 0.75 is fine.
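The relationship threshold = capacity * loadFactor can be tried out directly. The sketch below only repeats that arithmetic; ThresholdDemo and its threshold method are names made up for the example.

```java
// Hypothetical demo of how the resize threshold follows from capacity and load factor.
class ThresholdDemo {
    // Same arithmetic as the threshold field: capacity * loadFactor, truncated to int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // With the defaults (capacity 16, load factor 0.75) the table
        // is resized once it holds more than 12 entries.
        System.out.println(threshold(16, 0.75f));
        // A smaller load factor resizes earlier (fewer collisions, more memory).
        System.out.println(threshold(16, 0.5f));
        // A larger load factor resizes later (less memory, longer chains).
        System.out.println(threshold(16, 1.0f));
    }
}
```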
2. Constructor
Let's take a look at several construction methods of HashMap:
public HashMap(int initialCapacity, float loadFactor) {
    // make sure the arguments are valid
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);

    // Find a power of 2 >= initialCapacity
    int capacity = 1;   // initial capacity
    // Doubling from 1 guarantees that capacity is a power of 2:
    // the smallest power of 2 that is >= initialCapacity
    while (capacity < initialCapacity)
        capacity <<= 1;

    this.loadFactor = loadFactor;
    threshold = (int) (capacity * loadFactor);
    table = new Entry[capacity];
    init();
}

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    threshold = (int) (DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
    table = new Entry[DEFAULT_INITIAL_CAPACITY];
    init();
}
As we can see, when constructing a HashMap, the first constructor is called if we specify the load factor and initial capacity; otherwise the default constructor is used, with a default initial capacity of 16 and a default load factor of 0.75. Note the while loop in the first constructor: it ensures that the capacity is a power of 2, namely the smallest power of 2 that is greater than or equal to initialCapacity. Why the capacity must be a power of 2 is explained later.
Next, we analyze put and get, the two most frequently used methods of HashMap.
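The rounding done by the constructor's loop can be tried in isolation. The sketch below copies only that loop into a helper; CapacityDemo and roundUpToPowerOfTwo are names invented for the example, and the input is assumed to be already capped (the real constructor limits it to MAXIMUM_CAPACITY first, which keeps the shift from overflowing).

```java
// Hypothetical demo of the constructor's "smallest power of 2 >= initialCapacity" loop.
class CapacityDemo {
    // Assumes initialCapacity is small enough that doubling cannot overflow.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1; // double until large enough
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(10)); // rounded up to 16
        System.out.println(roundUpToPowerOfTwo(16)); // already a power of 2, stays 16
        System.out.println(roundUpToPowerOfTwo(17)); // rounded up to 32
    }
}
```

So a request such as new HashMap(10) ends up with a table of 16 buckets rather than 10.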
3. Data Storage