Java HashMap: Underlying Principles and Source Code Analysis


While introducing HashMap, I will also point out how it differs from Hashtable and ConcurrentHashMap, but this article is mainly about HashMap. All three work on a similar principle: they store data in an array combined with linked lists. Everything described here refers to JDK 1.8. Before going into detail, look at the inheritance diagram of the Map family:


Among them, TreeMap is based on a tree (a red-black tree); the other three are hash table structures.

The main differences between HashMap and Hashtable are:

1. Hashtable is thread-safe, while HashMap is not. Most Hashtable methods are declared synchronized to guarantee thread safety, so HashMap performs better when no synchronization is needed. To use a HashMap from multiple threads, wrap it with Collections.synchronizedMap() to obtain a thread-safe map.

2. HashMap allows null as a key, while Hashtable does not (the same holds for null values).

3. The default initial capacity of HashMap is 16, while that of Hashtable is 11; the default load factor of both is 0.75. When HashMap expands, it doubles the current capacity (capacity * 2); when Hashtable expands, it doubles the capacity and adds one (capacity * 2 + 1). Expansion and the load factor are discussed later.

4. The two use different hash algorithms. HashMap first takes the hashCode of the key and then XORs its high 16 bits into its low 16 bits. The source code is as follows:

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
i = (n - 1) & hash   // array index, where n is the array length

The value returned by hash(key) is then ANDed with n - 1, where n is the current length of the array (16 by default), to determine the array slot in which the key-value pair is stored. Why not use the key's own hashCode directly, instead of XORing it with itself shifted right by 16 bits? My understanding is this: when the array is small, (n - 1) & hash uses only the low bits of the hash, so two hash codes that have the same low bits but different high bits would collide. XORing the low 16 bits with the high 16 bits mixes the high bits into the low ones and makes the result more random. The fewer the collisions and the more evenly the elements are distributed, the better.
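The effect of the mixing step can be checked with two hash codes that differ only in their high 16 bits. The following is a minimal sketch; the class name HashSpreadDemo and its helper methods are hypothetical names chosen for illustration, not JDK code.

```java
// Illustrative sketch: class and method names are hypothetical, not JDK code.
class HashSpreadDemo {
    // Same mixing step as described above: XOR the high 16 bits into the low 16 bits.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Index into a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int h1 = 0x00010000; // h1 and h2 differ only in the high 16 bits
        int h2 = 0x00020000;

        // Without mixing, both hash codes land in slot 0 and collide.
        System.out.println(indexFor(h1, n) + " " + indexFor(h2, n));                 // prints "0 0"
        // With mixing, they land in different slots.
        System.out.println(indexFor(spread(h1), n) + " " + indexFor(spread(h2), n)); // prints "1 2"
    }
}
```

With a table of length 16 only the lowest four bits select the slot, so the unmixed values collide while the mixed ones do not.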

Hashtable computes the position as follows:

int hash = key.hashCode();
int index = (hash & 0x7FFFFFFF) % tab.length;

It takes the key's hash code directly, ANDs it with 0x7FFFFFFF (2^31 - 1) to clear the sign bit so that the value is non-negative, and then takes the remainder modulo the array length.
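To see why the sign bit has to be cleared, consider a negative hash code. In Java the % operator keeps the sign of the dividend, so a plain remainder could yield a negative, invalid index. A small sketch follows; HashtableIndexDemo is a hypothetical name used only for illustration.

```java
// Illustrative sketch: the class name is hypothetical, not JDK code.
class HashtableIndexDemo {
    // The Hashtable-style index computation described above.
    static int indexFor(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    public static void main(String[] args) {
        int hash = -1;   // a negative hash code
        int length = 11; // Hashtable's default capacity

        System.out.println(hash % length);          // prints -1, unusable as an array index
        System.out.println(indexFor(hash, length)); // prints 1
    }
}
```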

How HashMap Stores and Retrieves Elements

HashMap stores its data in a hash table. What is a hash table? In essence, it is an array plus linked lists. The default array length of a HashMap is 16. Each element of the array holds a reference to the head node of a linked list (or null). When put(key, value) is called on a HashMap, the hash algorithm computes the hash value of the key, which is then ANDed with the array length minus one to determine the array slot in which the key-value pair should be stored. If there is no element at that slot, meaning that the head node of the list is null, a new Node is created that stores the hash, the key, the value and next. Part of the source code of the Node class is as follows:

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;   // hash value computed by hash(key)
    final K key;
    V value;
    Node<K,V> next;   // next node in the same bucket, or null

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }
    // .... (remaining methods omitted)
}

Only part of the class is shown here. As you can see, the Node that holds a key-value pair is a nested class that implements Map.Entry; its fields are the hash value (computed by the hash algorithm above), key, value, and next. When put(key, value) is executed and the computed array slot already contains an element, a collision has occurred (the key maps to the same array index as every element on the singly linked list at that slot, although their hash values may differ). The program then walks along that list comparing keys. If it finds an equal key, it replaces the old value with the new one; if it does not, it creates a new Node holding the hash value, key and value, and appends it to the tail of the list.

After the insertion, the program checks whether the number of nodes on this list exceeds the limit (8 by default). Note that this is the number of nodes on this one list, not the total number of elements, and it is independent of the lists at other array slots. If the limit is exceeded, HashMap converts the list into a red-black tree, provided the array length is at least 64; for a smaller array it resizes instead. (Readers unfamiliar with red-black trees can look them up.) The purpose is to speed up get(key): the lookup cost within the bucket drops from O(n) to O(log n).

That is not the end of it. Once the new node has been inserted, the program checks whether the size of the HashMap (the total number of key-value pairs) exceeds the threshold, which is the load factor multiplied by the array capacity. If it does, the program calls resize() to expand the capacity. HashMap doubles the capacity directly. During the expansion, each linked list of the old array is split into two sub-lists according to one additional bit of the hash: one stays at the original index and the other moves to the original index plus the old capacity. This shortens the lists and improves lookup efficiency, but the expansion itself is time-consuming.
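Where an element ends up after the capacity doubles can be worked out from a single bit of its hash. The sketch below assumes a power-of-two capacity; ResizeSplitDemo and newIndex are hypothetical names, and the code models only the index arithmetic, not the JDK resize() method itself.

```java
// Illustrative sketch: names are hypothetical; only the index arithmetic is modeled.
class ResizeSplitDemo {
    // New index of an element after the table grows from oldCap to 2 * oldCap.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = (oldCap - 1) & hash;
        // The bit with value oldCap decides: 0 keeps the index, 1 moves it up by oldCap.
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53}; // all share slot 5 in a table of length 16
        for (int h : hashes) {
            System.out.println(h + ": " + ((oldCap - 1) & h) + " -> " + newIndex(h, oldCap));
        }
        // prints:
        // 5: 5 -> 5
        // 21: 5 -> 21
        // 37: 5 -> 5
        // 53: 5 -> 21
    }
}
```

The result always agrees with recomputing (2 * oldCap - 1) & hash, which is why the list can be split in one pass without rehashing each key.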

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // default initial array capacity (16)

static final int MAXIMUM_CAPACITY = 1 << 30; // maximum capacity

static final float DEFAULT_LOAD_FACTOR = 0.75f; // default load factor (0.75)

static final int TREEIFY_THRESHOLD = 8; // a list with more than 8 nodes is converted into a red-black tree

int threshold; // resize threshold: the map expands once its size exceeds this value
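As a worked example of how these values interact, the threshold for the default settings is 16 * 0.75 = 12, and it doubles along with the capacity. The sketch below is a simplification with hypothetical names (ThresholdDemo, threshold); it ignores the clamping that applies near the maximum capacity.

```java
// Illustrative sketch: names are hypothetical; clamping at the maximum capacity is ignored.
class ThresholdDemo {
    // Threshold = capacity * load factor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16;
        for (int i = 0; i < 3; i++) {
            System.out.println(capacity + " -> " + threshold(capacity, 0.75f));
            capacity <<= 1; // each resize doubles the capacity
        }
        // prints:
        // 16 -> 12
        // 32 -> 24
        // 64 -> 48
    }
}
```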

The following gives a general picture of how HashMap stores elements:


Now look at the source analysis of the put operation:


Now look at the source analysis of the get operation:
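The put and get paths described in this section can be sketched as a toy map. This is not the JDK source: TinyChainedMap is a hypothetical class with a fixed table of 16 slots, and resizing and conversion to red-black trees are deliberately left out so that only the array-plus-linked-list logic remains.

```java
import java.util.Objects;

// Toy model of the array-plus-linked-list layout. Names are hypothetical;
// resizing and treeification are deliberately omitted.
class TinyChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];
    private int size;

    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Stores the pair and returns the previous value, or null if the key was absent.
    V put(K key, V value) {
        int h = hash(key);
        int i = (table.length - 1) & h;
        Node<K, V> e = table[i];
        if (e == null) {
            table[i] = new Node<>(h, key, value, null); // empty slot: new head node
        } else {
            while (true) {
                if (e.hash == h && Objects.equals(e.key, key)) {
                    V old = e.value; // same key: replace the old value
                    e.value = value;
                    return old;
                }
                if (e.next == null) break;
                e = e.next;
            }
            e.next = new Node<>(h, key, value, null); // key not found: append at the tail
        }
        size++;
        return null;
    }

    // Walks the list at the computed slot and returns the value, or null if absent.
    V get(Object key) {
        int h = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & h]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) return e.value;
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyChainedMap<Integer, String> map = new TinyChainedMap<>();
        map.put(1, "one");
        map.put(17, "seventeen");                   // 1 and 17 share slot 1, so they are chained
        System.out.println(map.put(1, "uno"));      // prints "one" (old value replaced)
        System.out.println(map.get(17));            // prints "seventeen"
        System.out.println(map.size());             // prints 2
    }
}
```

Comparing the stored hash before calling equals mirrors the description above: keys in the same slot may still have different hash values, and the cheap integer comparison filters most of them out.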


In fact, the initial capacity and load factor of a HashMap can be changed through the constructors that take parameters; if the no-argument constructor is used, both take their default values. That concludes the introduction to the underlying implementation of HashMap. What follows is a brief look at Hashtable and ConcurrentHashMap.

Both are thread-safe and neither accepts null keys or null values, but they differ in how they synchronize. Hashtable declares its methods synchronized, which is in effect an object lock: the whole table is locked. Once a Hashtable grows large, performance drops sharply under contention, because operations such as iteration must hold the lock for a long time.

ConcurrentHashMap was designed to address this problem. In JDK 1.7 and earlier it introduces segments (Segment): it can be understood as splitting one large map into N small Hashtables, and put uses the hash to decide which Segment stores the entry. Looking at the source of Segment's put, the synchronization is still lock-based, but only one part of the map (a Segment) is locked; only put operations that target the same Segment are affected, rather than the entire map as with Hashtable. (JDK 1.8 drops Segment-based locking and instead uses CAS operations plus synchronized on the head node of an individual bucket, which makes the lock granularity finer still.) Compared with Hashtable's synchronized methods, the lock granularity is much finer, which improves performance in multithreaded environments, so Hashtable is now rarely used in new code.
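The null-key behaviour and the thread safety of ConcurrentHashMap can be observed with a short program. This is a minimal sketch: MapBehaviorDemo and its helper methods are hypothetical names, while the map classes and the merge method are standard java.util APIs.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: the class and helper names are hypothetical.
class MapBehaviorDemo {
    // Returns true if the given map accepts a null key.
    static boolean acceptsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 1);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Two threads each increment the same counter perThread times.
    static int concurrentCount(int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));           // prints true
        System.out.println(acceptsNullKey(new Hashtable<>()));         // prints false
        System.out.println(acceptsNullKey(new ConcurrentHashMap<>())); // prints false
        System.out.println(concurrentCount(10_000));                   // prints 20000
    }
}
```

Running the same counting loop against a plain HashMap without external synchronization may lose updates, which is the situation the thread-safe maps are meant to prevent.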




