A comprehensive analysis of the HashMap class in Java

Source: Internet
Author: User

HashMap and HashSet are two important members of the Java Collections Framework: HashMap is a commonly used implementation of the Map interface, and HashSet is a commonly used implementation of the Set interface. Although HashMap and HashSet implement different interfaces, their underlying hash storage mechanism is exactly the same; HashSet is in fact implemented on top of HashMap.
HashSet and HashMap have a great deal in common. For a HashSet, the hash algorithm determines where each element is stored, so that elements can be saved and fetched quickly. A HashMap treats each key-value pair as a single unit and always uses the hash algorithm to compute where the pair is stored, so that key-value pairs can be saved and fetched quickly.
Before looking at how collections store data, note that although a collection is said to store Java objects, it does not actually hold the objects themselves; it holds references to them. In other words, a Java collection is a collection of reference variables that point to the actual Java objects.

First, the basic characteristics of HashMap

The class-level comments in the JDK source file HashMap.java describe many of HashMap's characteristics, summarized below.

HashMap allows both keys and values to be null, whereas Hashtable allows neither.

HashMap is not thread-safe, whereas Hashtable is thread-safe.

The order of the elements in a HashMap is not guaranteed to stay constant; the position of the same element may change over time, for example after a resize.

The time needed to iterate over a HashMap is proportional to its capacity plus the number of elements it holds (size). To keep iteration efficient, do not set the initial capacity too high or the load factor too low.

As with the list classes, because HashMap is not thread-safe, its iterators are fail-fast: if the structure of the map is modified during iteration other than through the iterator itself, the iterator throws a ConcurrentModificationException. A synchronized map can be obtained with Collections.synchronizedMap(hashMap).
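The characteristics listed above can be observed through the public API. The following is a minimal sketch; the class name HashMapTraitsDemo and its helper methods are made up for illustration and are not part of the JDK.

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
final class HashMapTraitsDemo {

    // HashMap accepts a null key and null values.
    static boolean hashMapAcceptsNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for the null key");
        map.put("k", null);
        return map.containsKey(null) && map.containsKey("k") && map.get("k") == null;
    }

    // Hashtable rejects a null key with a NullPointerException.
    static boolean hashtableRejectsNullKey() {
        try {
            new Hashtable<String, String>().put(null, "v");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Modifying the map directly while iterating trips the fail-fast check.
    static boolean iterationFailsFast() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change made outside the iterator
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Wraps a HashMap so that every method call is synchronized.
    static Map<String, Integer> newSynchronizedMap() {
        return Collections.synchronizedMap(new HashMap<String, Integer>());
    }

    public static void main(String[] args) {
        System.out.println("HashMap accepts nulls: " + hashMapAcceptsNulls());
        System.out.println("Hashtable rejects null key: " + hashtableRejectsNullKey());
        System.out.println("Iteration fails fast: " + iterationFailsFast());
    }
}
```

Note that a map returned by Collections.synchronizedMap still requires the caller to synchronize on the map while iterating over its views.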

Second, analysis of the hash table data structure

A hash table (also called a hash list) is a data structure that accesses a storage location directly based on a key. In other words, a hash table establishes a direct mapping between a key and a storage address.

As shown in the following figure, the key is passed through a hash function to obtain an index into the array of buckets.

Computing an index with a hash function inevitably maps some different keys to the same index; this is called a collision (conflict). Here is a brief description of two ways to resolve collisions:

Open addressing: the basic idea is that when a collision occurs, the table is probed position by position, and the entry is stored in the first free slot found. The details of the algorithm are not covered here; only a schematic is given.

Separate chaining (the "zipper" method): the basic idea is that when a collision occurs, the entries that share the same index are linked together in a list. The details of the algorithm are not covered here; only a schematic is given.
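As a rough illustration of open addressing, here is a minimal linear-probing sketch. It assumes String keys, a fixed-size table, and no deletion; the class LinearProbingSketch is hypothetical and is not how any JDK class is implemented.

```java
// Hypothetical sketch of open addressing with linear probing.
final class LinearProbingSketch {
    private final String[] slots;

    LinearProbingSketch(int capacity) {
        slots = new String[capacity];
    }

    // Stores the key and returns the slot index it ended up in, or -1 if the table is full.
    int insert(String key) {
        int start = Math.floorMod(key.hashCode(), slots.length);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length; // probe the next position, wrapping around
            if (slots[i] == null || slots[i].equals(key)) {
                slots[i] = key;
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingSketch table = new LinearProbingSketch(8);
        // "Aa" and "BB" have the same hashCode (2112), so they collide.
        System.out.println(table.insert("Aa")); // 0
        System.out.println(table.insert("BB")); // 1
    }
}
```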

HashMap in the JDK resolves collisions with separate chaining.
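To make separate chaining concrete, here is a minimal sketch of a chained hash table. It is a simplification under stated assumptions (fixed capacity, no resizing, a plain modulo index); ChainedTableSketch and Node are hypothetical names, not JDK classes.

```java
import java.util.Objects;

// Hypothetical, simplified hash table that resolves collisions by chaining.
final class ChainedTableSketch<K, V> {

    // One node of a bucket's singly linked list.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] buckets;
    private int size;

    @SuppressWarnings("unchecked")
    ChainedTableSketch(int capacity) {
        buckets = (Node<K, V>[]) new Node[capacity];
    }

    private int indexOf(Object key) {
        return key == null ? 0 : Math.floorMod(key.hashCode(), buckets.length);
    }

    // Replaces the value if the key exists (returning the old value), else links a new node.
    V put(K key, V value) {
        int i = indexOf(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // insert at the head of the chain
        size++;
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = buckets[indexOf(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

Colliding keys such as "Aa" and "BB" (which share a hashCode) end up in the same bucket and are told apart by equals while walking the chain.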

Third, HashMap source code analysis (JDK 1.7)

1. Reading and writing elements in HashMap

Entry
The elements stored in a HashMap are of type Entry. The source code of Entry is given below:

static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    int hash;

    Entry(int h, K k, V v, Entry<K,V> n) {
        value = v;
        next = n;
        key = k;
        hash = h;
    }

    // The get and set methods for key and value are omitted here;
    // they are used by the iterators shown later.
    ...

    public final boolean equals(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry e = (Map.Entry) o;
        Object k1 = getKey();
        Object k2 = e.getKey();
        if (k1 == k2 || (k1 != null && k1.equals(k2))) {
            Object v1 = getValue();
            Object v2 = e.getValue();
            if (v1 == v2 || (v1 != null && v1.equals(v2)))
                return true;
        }
        return false;
    }

    // The hashCode of an entry is the XOR of the key's hashCode and the value's hashCode
    public final int hashCode() {
        return Objects.hashCode(getKey()) ^ Objects.hashCode(getValue());
    }

    public final String toString() {
        return getKey() + "=" + getValue();
    }

    /**
     * This method is invoked whenever the value in an entry is
     * overwritten by an invocation of put(k,v) for a key k that's already
     * in the HashMap.
     */
    void recordAccess(HashMap<K,V> m) {
    }

    /**
     * This method is invoked whenever the entry is
     * removed from the table.
     */
    void recordRemoval(HashMap<K,V> m) {
    }
}

An Entry holds the key, the value, the hash, and a reference to the next Entry, so it is clearly a node of a singly linked list that also implements the Map.Entry interface.

recordAccess(HashMap<K,V>) and recordRemoval(HashMap<K,V>) have empty implementations in HashMap. In LinkedHashMap, however, these two methods are used to implement the LRU algorithm.
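The LRU behavior mentioned above can be tried out through LinkedHashMap's public API. The sketch below does not touch the package-private recordAccess hook; it relies on the access-order constructor and the protected removeEldestEntry method instead. The class name LruCacheSketch and the maxEntries parameter are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache built on LinkedHashMap's access-order mode.
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by access, not by insertion
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, Integer> cache = new LruCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", 3);   // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // [a, c]
    }
}
```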

Get: reading an element
To obtain the corresponding Entry from the HashMap, get is called. The get-related source code is given below:

public V get(Object key) {
    // The key is null
    if (key == null)
        return getForNullKey();
    // Find the Entry according to the key
    Entry<K,V> entry = getEntry(key);
    return null == entry ? null : entry.getValue();
}

getForNullKey source:

private V getForNullKey() {
    if (size == 0) {
        return null;
    }
    // Traverse the collision chain at table[0]
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null)
            return e.value;
    }
    return null;
}

The Entry whose key is null is stored in table[0], but the collision chain at table[0] does not necessarily contain a null key and may hold other entries, so the chain has to be traversed.

getEntry, which finds the Entry for a key:

final Entry<K,V> getEntry(Object key) {
    if (size == 0) {
        return null;
    }
    int hash = (key == null) ? 0 : hash(key);
    // Use the hash to get the index in the table, then find the key by traversing the collision chain
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}

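The above is the process by which HashMap reads an Entry, together with its source code. The average time complexity is O(1), provided the hash function spreads the keys evenly; a long collision chain degrades a lookup toward O(n).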

Put: writing an element
The put operation in HashMap is relatively complex, because the HashMap may be resized during a put.
put writes a new element; if the HashMap already contains the key being written, the value is replaced instead, which is equivalent to an update. Here is the put source code:

public V put(K key, V value) {
    // If the table is empty, allocate it according to the threshold
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    // Store the Entry whose key is null
    if (key == null)
        return putForNullKey(value);
    // Compute the hash and map it to an index
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    // Traverse the collision chain at this index to see whether the key is already present
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        // If the key is present, replace the old value and return it
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    // The key of the new Entry is not present in the collision chain
    modCount++;
    // Insert a new Entry
    addEntry(hash, key, value, i);
    return null;
}

addEntry and createEntry source:

void addEntry(int hash, K key, V value, int bucketIndex) {
    // Before inserting a new Entry, compare the current size with the threshold to decide whether to resize
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}

void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    // Head insertion: the new Entry is placed in front of the first Entry of the chain at this index
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}

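The above is the process by which HashMap writes an Entry, together with its source code. The average time complexity is O(1); a put that triggers a resize costs O(n), because every entry is transferred to the new table.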
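The insert-versus-update behavior of put can be checked with a few lines against the public API. This is only a small sketch; PutSemanticsDemo is a made-up class name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo of what put returns for a new key and for an existing key.
final class PutSemanticsDemo {

    static String describe() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("k", 1);  // new key: nothing to replace, returns null
        Integer second = map.put("k", 2); // existing key: value replaced, old value returned
        return first + "," + second + "," + map.get("k") + "," + map.size();
    }

    public static void main(String[] args) {
        System.out.println(describe()); // null,1,2,1
    }
}
```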

Remove: deleting an element

final Entry<K,V> removeEntryForKey(Object key) {
    if (size == 0) {
        return null;
    }
    // Compute the hash value from the key and get the index
    int hash = (key == null) ? 0 : hash(key);
    int i = indexFor(hash, table.length);
    // Linked-list deletion with two pointers; prev is the predecessor of e
    Entry<K,V> prev = table[i];
    Entry<K,V> e = prev;
    // Traverse the collision chain, looking for the Entry with this key
    while (e != null) {
        Entry<K,V> next = e.next;
        Object k;
        // Found it
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k)))) {
            modCount++;
            size--;
            // The node to delete is the first node of the chain
            if (prev == e)
                table[i] = next;
            else
                prev.next = next;
            e.recordRemoval(this);
            return e;
        }
        prev = e;
        e = next;
    }
    return e;
}

The above is the process by which HashMap deletes an Entry, together with its source code. The average time complexity is O(1), under the same assumption of short collision chains.
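The two-pointer unlinking used in the removal code can be isolated from the rest of HashMap. The following sketch works on a bare singly linked chain of String keys; ChainRemovalSketch, Node, chain, and render are hypothetical helpers introduced only for this illustration.

```java
// Hypothetical sketch of removing a node from a singly linked chain with prev/e pointers.
final class ChainRemovalSketch {

    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Builds a chain in the given order and returns its head.
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    // Unlinks the first node with the given key and returns the (possibly new) head.
    static Node remove(Node head, String key) {
        Node prev = head;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e) {
                    head = next;      // the match is the head of the chain
                } else {
                    prev.next = next; // bypass the matched node
                }
                return head;
            }
            prev = e;
            e = next;
        }
        return head; // key not found, chain unchanged
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(remove(chain("a", "b", "c"), "b"))); // a->c
        System.out.println(render(remove(chain("a", "b", "c"), "a"))); // b->c
    }
}
```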

2. How HashMap hashes keys (the hash function)

The hash function in HashMap is implemented by hash(Object k) and indexFor(int h, int length). The source code is shown below:

final int hash(Object k) {
    int h = hashSeed;
    if (0 != h && k instanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }
    h ^= k.hashCode();
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    // In other words, it reduces the chance of collisions.
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

Source for computing the index:

static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length - 1);
}
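The reason indexFor insists on a power-of-two length is that, only in that case, the bit mask behaves like a non-negative modulo. A small sketch to check this, using the standard Math.floorMod for comparison; IndexForSketch and the sample values are assumptions chosen for illustration.

```java
// Hypothetical check that h & (length - 1) equals a non-negative modulo for power-of-two lengths.
final class IndexForSketch {

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Compares the mask with Math.floorMod over a handful of sample hash values.
    static boolean matchesFloorMod(int length) {
        int[] samples = {0, 1, 15, 16, 17, 12345, -1, -17, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : samples) {
            if (indexFor(h, length) != Math.floorMod(h, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matchesFloorMod(16)); // true: 16 is a power of 2
        System.out.println(matchesFloorMod(10)); // false: for example 15 & 9 is 9, but 15 mod 10 is 5
    }
}
```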

HashMap uses a hash function to map a key to an index within the interval [0, table.length - 1]. There are generally two ways to compute the index:

hash(key) % table.length, where length is preferably a prime number. Hashtable in the JDK uses this modulo-based approach.
The reasons for preferring a prime can be found in the literature on hashing algorithms and are not repeated here.

hash(key) & (table.length - 1), where length must be a power of 2. HashMap in the JDK uses this approach.
Because length is a power of 2, hash(key) & (table.length - 1) always falls within [0, length - 1]. Doing only this, however, would cause a large number of collisions: a Java hashCode is 32 bits wide, and when the HashMap capacity is small, for example 16, the AND operation always discards the high bits and uses only the low bits, which raises the probability of collisions.

So, to reduce the probability of collisions, the hash function performs several shift and XOR operations that mix the high bits into the low bits.
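The effect of the extra bit mixing can be seen on two hash codes that differ only in their high bits. The sketch below copies the shift-and-XOR step of the JDK 1.7 hash function (leaving out hashSeed and the String special case); HashSpreadSketch and the sample values 0x10000 and 0x20000 are chosen purely for illustration.

```java
// Hypothetical demo of how the JDK 1.7 bit mixing changes the bucket index.
final class HashSpreadSketch {

    // The mixing step from the JDK 1.7 hash function, without hashSeed.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000; // only bit 16 set
        int b = 0x20000; // only bit 17 set

        // Without mixing, the low 4 bits are all zero, so both land in bucket 0.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16)); // 0 0

        // With mixing, the high bits influence the low bits and the buckets differ.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16)); // 1 2
    }
}
```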

3. HashMap memory allocation policy

Member variables capacity and loadFactor
HashMap requires its capacity to be a power of 2; the default capacity is 1 << 4 = 16. HashMap also has a load factor (loadFactor): a higher load factor reduces the storage space used but increases the time for lookups (including those performed by put and get). The default loadFactor of 0.75 is a trade-off between time cost and space cost.

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
static final int MAXIMUM_CAPACITY = 1 << 30;
static final float DEFAULT_LOAD_FACTOR = 0.75f;
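To make the interplay of capacity and load factor concrete: the table is resized once the number of entries reaches roughly capacity * loadFactor. The sketch below reuses the threshold formula from the JDK 1.7 source; ThresholdSketch and thresholdFor are hypothetical names.

```java
// Hypothetical helper that computes the resize threshold for a given capacity and load factor.
final class ThresholdSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same formula the JDK 1.7 source uses when it allocates or resizes the table.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        System.out.println(thresholdFor(16, 0.75f)); // 12
        System.out.println(thresholdFor(32, 0.75f)); // 24
        System.out.println(thresholdFor(16, 1.0f));  // 16
    }
}
```

With the defaults, a table of 16 buckets is therefore a candidate for doubling once it holds 12 entries.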

Constructor of HashMap
The HashMap constructor sets the initial values of the capacity and the loadFactor:

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    threshold = initialCapacity;
    init();
}

As noted earlier, the capacity of a HashMap must be a power of 2, yet the constructor does not enforce this. So how is the capacity guaranteed to be a power of 2?
During a put operation, the source checks whether the hash table is still empty; if it is, inflateTable(int toSize) is called:

private void inflateTable(int toSize) {
    // Find a power of 2 >= toSize
    int capacity = roundUpToPowerOf2(toSize);
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];
    initHashSeedAsNeeded(capacity);
}

Here roundUpToPowerOf2 returns the smallest power of 2 that is greater than or equal to the given argument:

private static int roundUpToPowerOf2(int number) {
    // assert number >= 0 : "number must be non-negative";
    return number >= MAXIMUM_CAPACITY
            ? MAXIMUM_CAPACITY
            : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
}

Integer.highestOneBit(int) keeps only the highest set bit of its argument and clears all the other bits. Put simply, for a positive argument it returns the largest power of 2 that is less than or equal to the argument.

If number is a power of 2, subtracting 1 moves the highest set bit one position lower, and shifting left by 1 bit moves it back to the original highest position, so the result is number itself.
If number is not a power of 2, subtracting 1 leaves the highest set bit where it was, and shifting left by 1 bit moves it one position higher, so the result is the next power of 2 above number.
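The two cases described above can be checked directly. The sketch below repeats the rounding expression from the source in a standalone class so it can be run; RoundUpSketch is a hypothetical name.

```java
// Hypothetical standalone copy of the rounding logic, for experimentation.
final class RoundUpSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of 2 that is >= number, capped at MAXIMUM_CAPACITY.
    static int roundUpToPowerOf2(int number) {
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOf2(16));   // 16: (15 << 1) = 30, highest bit is 16
        System.out.println(roundUpToPowerOf2(17));   // 32: (16 << 1) = 32, highest bit is 32
        System.out.println(roundUpToPowerOf2(1000)); // 1024
        System.out.println(roundUpToPowerOf2(0));    // 1
    }
}
```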

Resizing:
A put operation on a HashMap can trigger a resize. The source code is as follows:

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    // The hash table has already reached the maximum capacity, 1 << 30
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    // Transfer the entries of the old table to newTable; the return value of
    // initHashSeedAsNeeded determines whether the hash values are recomputed
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    // Recompute the threshold
    threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    // Traverse the old table
    for (Entry<K,V> e : table) {
        // Traverse the collision chain
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                // Recompute the hash value
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            // Head insertion: the element becomes the new head of the chain
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}

The above is the whole process of HashMap memory allocation. To sum up: when putting an Entry, HashMap compares the current size with the threshold to decide whether to resize. Each resize doubles the table to 2 * table.length. During the resize, whether the hash values are recomputed depends on the return value of initHashSeedAsNeeded.
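One consequence of doubling a power-of-two table is worth spelling out: when the hash values are not recomputed, an entry either keeps its bucket index or moves up by exactly the old capacity, because the new mask adds a single extra bit. A small sketch of that observation; ResizeSplitSketch and its method names are made up for illustration.

```java
// Hypothetical check of where an entry lands after the table is doubled.
final class ResizeSplitSketch {

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if the new index is the old index, or the old index plus the old capacity.
    static boolean staysOrMovesByOldCapacity(int hash, int oldCapacity) {
        int oldIndex = indexFor(hash, oldCapacity);
        int newIndex = indexFor(hash, 2 * oldCapacity);
        return newIndex == oldIndex || newIndex == oldIndex + oldCapacity;
    }

    public static void main(String[] args) {
        // hash 5: bucket 5 in a table of 16, still bucket 5 in a table of 32
        System.out.println(indexFor(5, 16) + " -> " + indexFor(5, 32));   // 5 -> 5
        // hash 21: bucket 5 in a table of 16, bucket 21 (5 + 16) in a table of 32
        System.out.println(indexFor(21, 16) + " -> " + indexFor(21, 32)); // 5 -> 21
    }
}
```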

Fourth, the HashMap iterators

The iterators in HashMap, such as ValueIterator, KeyIterator, and EntryIterator, are all based on HashIterator. Its source is shown below:

private abstract class HashIterator<E> implements Iterator<E> {
    Entry<K,V> next;        // next entry to return
    int expectedModCount;   // for fail-fast
    int index;              // current slot (table index)
    Entry<K,V> current;     // current entry

    HashIterator() {
        expectedModCount = modCount;
        // Find the first Entry in the hash table
        if (size > 0) {
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
    }

    public final boolean hasNext() {
        return next != null;
    }

    final Entry<K,V> nextEntry() {
        // HashMap is not thread-safe, so first check whether the table structure was modified
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        Entry<K,V> e = next;
        if (e == null)
            throw new NoSuchElementException();
        if ((next = e.next) == null) {
            // Find the next Entry
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
        current = e;
        return e;
    }

    public void remove() {
        if (current == null)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        Object k = current.key;
        current = null;
        HashMap.this.removeEntryForKey(k);
        expectedModCount = modCount;
    }
}

The key, value, and entry iterators are wrapped in three collection views: keySet, values, and entrySet. These three views support the remove, removeAll, and clear operations, which write through to the HashMap, but they do not support the add and addAll operations.
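The behavior of the three views can be confirmed with the public API. A minimal sketch follows; ViewsDemo is a made-up class name, and the keys and values are arbitrary.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo of the write-through collection views of a HashMap.
final class ViewsDemo {

    static String run() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        // Removing through a view removes the entry from the map itself.
        map.keySet().remove("a");

        // Removing through the iterator is the safe way to delete while iterating.
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() == 2) {
                it.remove();
            }
        }

        // Adding through a view is not supported.
        boolean addRejected;
        try {
            map.keySet().add("z");
            addRejected = false;
        } catch (UnsupportedOperationException expected) {
            addRejected = true;
        }
        return map + "," + addRejected;
    }

    public static void main(String[] args) {
        System.out.println(run()); // {c=3},true
    }
}
```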
