HashMap, a common Map data structure in Java

Source: Internet
Author: User
Tags array length rehash

I have written about HashMap earlier on this blog. Thoroughly understanding the relationship between HashMap, Hashtable, and ConcurrentHashMap: http://www.cnblogs.com/wang-meng/p/5808006.html. The difference between HashMap and Hashtable: http://www.cnblogs.com/wang-meng/p/5720805.html. Today I compare HashMap in JDK7 and JDK8, because JDK8 made some small changes to HashMap, and interviewers often ask about exactly these points.

One: HashMap in JDK7

At the bottom layer, HashMap maintains an array called table, and each item in the array is an Entry in key/value form. The objects we put into a HashMap are actually stored in this array: the keys and values of the map are kept there as Entry objects. The position in the array where an Entry is placed is computed from the hashCode of its key. Such a position is also called a hash bucket.

The hash value computed this way is then mapped to an index in table by the indexFor method.

This method effectively takes the hash modulo table.length. When the hashCode computations of two keys lead to the same index, a hash collision occurs. HashMap resolves hash collisions with a linked list: when a collision occurs, the entry already in the array slot becomes the next of the new entry. Note what this means in practice. Suppose A and B both map to index i and A is already there. After map.put(B), B is placed at index i and A becomes B's next, so the new entry is stored in the array and the old entry hangs off the new entry's linked list.
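The index computation and the head insertion described above can be sketched as follows. This is a simplified illustration, not the real java.util.HashMap: the class Jdk7StyleBuckets, its Entry type, and a put method that takes the hash as an explicit argument are all assumptions made for the example, so that a collision can be forced by hand.

```java
// Minimal sketch of a JDK7-style bucket array (illustrative only, not the JDK class).
final class Jdk7StyleBuckets {
    // Hypothetical stand-in for HashMap.Entry: key, value, and a link to the next entry.
    static final class Entry {
        final String key;
        String value;
        Entry next;

        Entry(String key, String value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    final Entry[] table = new Entry[16];

    // For a power-of-two length this is the same as hash modulo length.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Head insertion: the new entry becomes the bucket head, the old head becomes its next.
    void put(int hash, String key, String value) {
        int i = indexFor(hash, table.length);
        for (Entry e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) { // same key: overwrite the old value
                e.value = value;
                return;
            }
        }
        table[i] = new Entry(key, value, table[i]);
    }

    public static void main(String[] args) {
        Jdk7StyleBuckets m = new Jdk7StyleBuckets();
        m.put(12, "A", "first");  // A lands in bucket 12
        m.put(28, "B", "second"); // 28 & 15 == 12, so B collides with A
        Entry head = m.table[12];
        System.out.println(head.key + " -> " + head.next.key); // B -> A
    }
}
```

Passing the hash in by hand keeps the sketch independent of any particular hashCode implementation; a real map would derive it from the key.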

For example, take an array of length 16 in which each element stores the head node of a linked list. By what rule are elements placed in the array? In general the index is obtained as hash(key) % len, that is, the hash value of the element's key modulo the array length. In such a hash table, 12 % 16 = 12, 28 % 16 = 12, 108 % 16 = 12, and 140 % 16 = 12, so 12, 28, 108, and 140 are all stored at array index 12. Internally the table is an array of Entry objects, each with the fields key, value, and next.

Then look at the put method (the line numbers refer to the JDK7 source listing):

Line 469: if the key is null, the entry is placed in the first slot of the array.
Line 471: the hash value of the key is computed.
Line 472: the indexFor method maps the hash value to an index in table.
Line 473: table[i] is read to get the entries already stored at that index. If the slot is not empty, the hash value of the key and equals are used to decide whether the new entry has the same key as an existing entry. If it does, the old value is overwritten and returned.
Line 484: a new Entry is added to the array.

When an entry is added and the number of entries has reached threshold, which is (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1) and by default 16 * 0.75 = 12, the table is expanded once a further condition is met. This process is also known as rehash:

Line 559: a new Entry array is created.
Line 564: the entries of the old array are transferred to the new Entry array.
Line 565: the resize threshold is recomputed.

For the concrete implementation, see the HashMap source code in JDK7.

Two: HashMap in JDK8

Up to JDK7 the structure of HashMap is this simple: an array plus linked lists. When hash values collide, the corresponding nodes are stored in a linked list.
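As a check on the arithmetic in the JDK7 walkthrough above (the four keys that share index 12, and the default threshold of 16 * 0.75), here is a small sketch. The class and method names are made up for the illustration, and the hash values are used directly rather than derived from real keys.

```java
// Illustrative arithmetic only; BucketArithmetic is a hypothetical helper, not a JDK class.
final class BucketArithmetic {
    // Index for a table whose length is a power of two.
    // For non-negative hashes this equals hash % length.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Resize threshold as described in the text: capacity * loadFactor, truncated to int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int[] hashes = {12, 28, 108, 140};
        for (int h : hashes) {
            // Each of the four values maps to index 12 in a table of length 16.
            System.out.println(h + " -> index " + indexFor(h, 16));
        }
        System.out.println("threshold = " + threshold(16, 0.75f)); // threshold = 12
    }
}
```

All four entries end up chained in one bucket, which is exactly the array-plus-linked-list layout that JDK7 uses.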
The performance of such a HashMap is open to doubt. If hundreds of nodes collide and are stored in one linked list, finding one of them inevitably costs O(n) lookup time, which is a serious performance loss. This problem was finally addressed in JDK8. In the worst case a linked-list lookup takes O(n), whereas a red-black tree lookup always takes O(log n), which improves the efficiency of HashMap.

HashMap in JDK7 uses a bucket array plus linked lists, which is what we usually call a hash list. HashMap in JDK8 uses a bucket array plus linked lists or red-black trees, and it is likewise not thread-safe. When the linked list of a bucket reaches a certain threshold, the list is converted into a red-black tree.

In JDK8, when an element is added to a bucket that already holds at least 8 nodes, the bucket is no longer stored as a singly linked list but is converted to a red-black tree (provided the table has at least 64 slots; a smaller table is resized instead). This is the biggest difference between the JDK7 and JDK8 implementations of HashMap.
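The worst case described above is easy to provoke. The sketch below uses a hypothetical key type, BadKey, whose hashCode is deliberately constant, so every entry falls into the same bucket. On JDK8 and later such a bucket is expected to be converted to a tree, but the sketch only checks that the map still behaves correctly; it does not measure timing or inspect the internal structure.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: BadKey is a made-up key type that forces every key to collide.
final class CollidingKeysDemo {
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; } // constant on purpose

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        // A Comparable key gives a tree bin an ordering to work with.
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Builds a map in which all n keys share one bucket.
    static Map<BadKey, Integer> fill(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = fill(1000);
        System.out.println(map.size());               // 1000
        System.out.println(map.get(new BadKey(777))); // 777
    }
}
```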
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // If the table has not been allocated yet, call resize() and store its length in n
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // If the target slot holds no element, wrap the pair in a Node, store it there, and we are done
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        // Otherwise the slot already holds an element
        else {
            Node<K,V> e; K k;
            // If the key of that element equals the key being inserted, remember it for replacement
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // 1. If the current node is a TreeNode, call putTreeVal
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // Otherwise walk the chain, which is no different from JDK7
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        // 2. After the append, check whether treeifyBin must be called
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null) // true || ...
                    e.value = value;
                afterNodeAccess(e); // 3.
                return oldValue;
            }
        }
        ++modCount;
        // Compare against the threshold and decide whether to resize
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict); // 4.
        return null;
    }
treeifyBin() converts a linked list into a red-black tree. The earlier indexFor() method is gone; the index is computed directly as (tab.length - 1) & hash, so wherever you see this expression it denotes the array index. For the concrete red-black tree implementation, see HashMap in JDK8.

Three: Points to note

The importance of hashCode, once more

As mentioned earlier, HashMap rehashes the hashCode of the key to guard against bad hashCode values produced by poor hash algorithms. Why guard against a bad hashCode? A bad hashCode means hash collisions, that is, several different keys may get the same hashCode. A poor hash algorithm raises the probability of collisions, which lowers the performance of HashMap in two ways:

1. Suppose there are 10 keys and 6 of them have the same hashCode, while the entries of the other four keys are spread evenly over table positions. One position then has 6 entries chained to it. This defeats the purpose of HashMap. The premise for the high performance of the HashMap data structure is that entries are evenly distributed across table positions, but here the distribution is 1 1 1 1 6. We therefore require hashCode to be highly random, so that the distribution of entries is as random as possible and HashMap stays efficient.

2. The code with which HashMap traverses the linked list at a table position is: if (e.hash == hash && ((k = e.key) == key || key.equals(k))). Because the && operator is used, the hash values are compared first. If they differ, the entry is skipped directly and equals is never called. Comparing hash values is fast because they are int values, whereas an equals method often compares a series of fields and is slower. A high probability of hash collisions means more equals calls, which inevitably reduces the efficiency of HashMap.
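To illustrate why the hash is reworked before the index is taken, here is a small sketch. The spread method mirrors the XOR-shift step of JDK8's HashMap.hash(), and index mirrors the (n - 1) & hash expression from putVal; the class name HashSpreadDemo and the sample values are assumptions chosen for the example.

```java
// Illustrative sketch; HashSpreadDemo is a hypothetical helper, not part of the JDK.
final class HashSpreadDemo {
    // Same form as JDK8's HashMap.hash(): XOR the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of length n, where n is a power of two.
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        // Two Integer keys whose hash codes differ only in the high bits.
        int a = 0x10000, b = 0x20000;
        // Without spreading, only the low 4 bits count, so both keys collide.
        System.out.println(index(a, n) + " " + index(b, n));                 // 0 0
        // After spreading, the high bits influence the index and the keys separate.
        System.out.println(index(spread(a), n) + " " + index(spread(b), n)); // 1 2
    }
}
```

Because the table length is a power of two, (n - 1) & hash selects only the low bits of the hash; folding the high bits in first keeps keys that differ only in their high bits from piling into one bucket.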
Why is the HashMap table transient?

A small detail: the field is declared as transient Entry[] table. The table is marked transient, that is, the contents of the table are not serialized by default. Have you ever thought about the reason for writing it this way? In my opinion it is very necessary. HashMap is built on hashCode, and hashCode, as a method of Object, is native:

public native int hashCode();

This means that hashCode depends on the underlying implementation, and different virtual machines may use different hashCode algorithms. In other words, the same key may have hashCode=1 on virtual machine A, hashCode=2 on virtual machine B, and hashCode=3 on virtual machine C.

That is a problem. Since its inception Java has made cross-platform support its biggest selling point. If table were not marked transient, a program that works on virtual machine A could stop working on virtual machine B, and the cross-platform property would be lost, because:

1. On virtual machine A the key has hashCode=100 and lands in table[4].
2. On virtual machine B the key has hashCode=101, so the lookup goes to table[5], where the key obviously cannot be found.

The whole program would then misbehave. To avoid this, HashMap implements its own serialization of the table: writeObject appends the keys and values to the end of the serialized stream:
    private void writeObject(java.io.ObjectOutputStream s)
        throws IOException
    {
        Iterator<Map.Entry<K,V>> i =
            (size > 0) ? entrySet0().iterator() : null;

        // Write out the threshold, loadfactor, and any hidden stuff
        s.defaultWriteObject();

        // Write out number of buckets
        s.writeInt(table.length);

        // Write out size (number of Mappings)
        s.writeInt(size);

        // Write out keys and values (alternating)
        if (size > 0) {
            for (Map.Entry<K,V> e : entrySet0()) {
                s.writeObject(e.getKey());
                s.writeObject(e.getValue());
            }
        }
    }
And readObject rebuilds the HashMap data structure:
    private void readObject(java.io.ObjectInputStream s)
         throws IOException, ClassNotFoundException
    {
        // Read in the threshold (ignored), loadfactor, and any hidden stuff
        s.defaultReadObject();
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new InvalidObjectException("Illegal load factor: " +
                                               loadFactor);

        // Set hashSeed (can only happen after VM boot)
        Holder.UNSAFE.putIntVolatile(this, Holder.HASHSEED_OFFSET,
                sun.misc.Hashing.randomHashSeed(this));

        // Read in number of buckets and allocate the bucket array;
        s.readInt(); // ignored

        // Read number of mappings
        int mappings = s.readInt();
        if (mappings < 0)
            throw new InvalidObjectException("Illegal mappings count: " +
                                               mappings);

        int initialCapacity = (int) Math.min(
                // capacity chosen by number of mappings
                // and desired load (if >= 0.25)
                mappings * Math.min(1 / loadFactor, 4.0f),
                // we have limits...
                HashMap.MAXIMUM_CAPACITY);
        int capacity = 1;
        // find smallest power of two which holds all mappings
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }

        table = new Entry[capacity];
        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        useAltHashing = sun.misc.VM.isBooted() &&
                (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);

        init();  // Give subclass a chance to do its thing.

        // Read the keys and values, and put the mappings in the HashMap
        for (int i=0; i<mappings; i++) {
            K key = (K) s.readObject();
            V value = (V) s.readObject();
            putForCreate(key, value);
        }
    }
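The effect of this pair of methods can be observed from the outside with a serialization round trip. The sketch below uses only standard java.io classes; the class name HashMapRoundTrip and the helper roundTrip are made up for the illustration, and the in-memory byte array stands in for a file or network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

// Illustrative sketch; HashMapRoundTrip is a hypothetical helper class.
final class HashMapRoundTrip {
    // Serializes the map to bytes and reads it back.
    // On reading, HashMap re-inserts every key/value pair instead of restoring the old table.
    @SuppressWarnings("unchecked")
    static HashMap<String, Integer> roundTrip(HashMap<String, Integer> original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (HashMap<String, Integer>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, Integer> original = new HashMap<>();
        original.put("one", 1);
        original.put("two", 2);
        HashMap<String, Integer> copy = roundTrip(original);
        System.out.println(copy.equals(original)); // true
        System.out.println(copy.get("two"));       // 2
    }
}
```

Since the entries are re-inserted on the reading side, each key is placed according to the hashCode computed there, which is the point of not serializing the table itself.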
This is a troublesome approach, but it guarantees cross-platform behavior. The example also tells us that although the virtual machine in use is HotSpot in most cases, that is no reason to ignore other virtual machines; keeping cross-platform behavior in mind is a good thing.

The difference between HashMap and Hashtable

HashMap and Hashtable are similar key-value collections, and their differences are another frequently asked question. Here is a brief summary:

1. Hashtable is thread-safe: all of its public methods are synchronized. HashMap is not thread-safe.
2. Hashtable does not allow null values; a null value causes a NullPointerException (the same holds for null keys). HashMap has no such restriction.
3. The two points above are the main differences. One more difference is minor, and I mention it only in passing: the two classes use different rehash algorithms. Hashtable's rehash uses a hashSeed produced by the randomHashSeed method of the sun.misc.Hashing class, while HashMap's rehash algorithm is the one already seen above.
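The difference in null handling from point 2 can be checked directly. The sketch below uses only the standard HashMap and Hashtable classes; the class NullHandlingDemo and its two helper methods are made up for the illustration.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Illustrative sketch; NullHandlingDemo is a hypothetical helper class.
final class NullHandlingDemo {
    // Returns true if the map accepts a null key, false if it throws NullPointerException.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Returns true if the map accepts a null value, false if it throws NullPointerException.
    static boolean acceptsNullValue(Map<String, String> map) {
        try {
            map.put("key", null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));     // true
        System.out.println(acceptsNullValue(new HashMap<>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<>()));   // false
        System.out.println(acceptsNullValue(new Hashtable<>())); // false
    }
}
```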
