HashMap Structure and Use

Source: Internet
Author: User

Data structure of HashMap

HashMap stores its data in an array. As we all know, it hashes each key, and hashing can give different keys the same hash value. HashMap resolves these hash collisions with linked lists. HashMap declares the following field:

transient Entry[] table;

Entry is the class that HashMap uses to store each key-value pair. It has the following fields:

final K key;
V value;
final int hash;
Entry<K,V> next;

Notice the next field. It exists to handle hash collisions. For example, suppose hashing places a new element at index 10 of the array, but index 10 already holds an entry. The new element is still placed at index 10, and the entry that was there is assigned to the next field of the new entry. In other words, each array slot stores the head of a linked list, and the lists exist to resolve hash collisions.
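The head insertion described above can be sketched in a few lines. This is a minimal illustration, not the JDK code: the names BucketDemo, Node and addAtHead are hypothetical, and the keys and values are fixed to String and int to keep it short.

```java
// Minimal sketch of chaining inside one array slot (hypothetical names).
class BucketDemo {
    // Simplified stand-in for HashMap's Entry class.
    static final class Node {
        final String key;
        final int value;
        final Node next;

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Puts a new node at the head of the chain in table[index];
    // the previous head becomes the new node's next.
    static void addAtHead(Node[] table, int index, String key, int value) {
        table[index] = new Node(key, value, table[index]);
    }

    public static void main(String[] args) {
        Node[] table = new Node[16];
        addAtHead(table, 10, "first", 1);
        addAtHead(table, 10, "second", 2); // collides with "first" at index 10
        for (Node n = table[10]; n != null; n = n.next) {
            System.out.println(n.key + " = " + n.value);
        }
    }
}
```

The loop prints "second" before "first", because the most recently added node is always the head of the chain.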

A few key fields

The array that stores the data (already mentioned above)

transient Entry[] table;

Default capacity

static final int DEFAULT_INITIAL_CAPACITY = 16;

Maximum capacity

static final int MAXIMUM_CAPACITY = 1 << 30;

Default load factor. The load factor is a ratio: HashMap expands its capacity once the number of stored entries exceeds capacity * load factor

static final float DEFAULT_LOAD_FACTOR = 0.75f;

Resize threshold. When the actual data size exceeds threshold, HashMap expands its capacity; threshold = capacity * load factor

int threshold;

Load factor of this instance

final float loadFactor;

How HashMap is initialized
Constructor 1

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);

    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;

    this.loadFactor = loadFactor;
    threshold = (int) (capacity * loadFactor);
    table = new Entry[capacity];
    init();
}

Pay particular attention to this loop:

while (capacity < initialCapacity) capacity <<= 1;

The table's initial capacity is capacity, not initialCapacity. This deserves special attention: capacity is the smallest power of 2 that is >= initialCapacity, so new HashMap(9, 0.75f) creates a table of 16 slots, not 9.
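To check the rounding by hand, the loop can be pulled out into a small standalone program. This is only a sketch of the constructor logic quoted above: CapacityDemo, roundUpToPowerOfTwo and threshold are hypothetical helper names, not part of HashMap.

```java
// Sketch: the capacity rounding and threshold arithmetic from constructor 1.
class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Returns the smallest power of 2 that is >= initialCapacity (capped at MAXIMUM_CAPACITY).
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // threshold = capacity * load factor, truncated to an int as in the constructor.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = roundUpToPowerOfTwo(9);
        System.out.println("capacity  = " + capacity);                  // 16
        System.out.println("threshold = " + threshold(capacity, 0.75f)); // 12
    }
}
```

A requested capacity of 16 stays at 16, while 17 is rounded up to 32.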

Constructor 2

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

Constructor 3, which uses the default values throughout

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    threshold = (int) (DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
    table = new Entry[DEFAULT_INITIAL_CAPACITY];
    init();
}

Constructor 4

public HashMap(Map<? extends K, ? extends V> m) {
    this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1, DEFAULT_INITIAL_CAPACITY),
         DEFAULT_LOAD_FACTOR);
    putAllForCreate(m);
}

How to Hash
HashMap does not use the key's hashCode directly as the hash. Instead it applies some further calculations to the key's hashCode to get the final hash. The resulting hash is still not the position in the array; the position is derived from it in a separate step. Whether in get, put, or any other method, the hash is calculated with this statement:

int hash = hash(key.hashCode());

The hash function is as follows:

static int hash(int h) {
    return useNewHash ? newHash(h) : oldHash(h);
}

useNewHash is declared as follows:

private static final boolean useNewHash;
static { useNewHash = false; }

This shows that useNewHash is always false and cannot change, so the useNewHash check in the hash function is superfluous.

private static int oldHash(int h) {
    h += ~(h << 9);
    h ^= (h >>> 14);
    h += (h << 4);
    h ^= (h >>> 10);
    return h;
}

private static int newHash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

In practice, then, HashMap's hash function is always oldHash.
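The article does not say why the extra mixing step exists. The usual explanation is that the array index is taken from the low bits of the hash only (the article's indexFor computes h & (length - 1)), so hash codes that differ only in their high bits would otherwise all land in the same slot. The sketch below copies oldHash from the listing and compares the slot chosen with and without it; HashDemo and the sample hash codes are made up for illustration.

```java
// Sketch: effect of the supplemental hash on the chosen array slot.
class HashDemo {
    // Same bit operations as the article's oldHash.
    static int oldHash(int h) {
        h += ~(h << 9);
        h ^= (h >>> 14);
        h += (h << 4);
        h ^= (h >>> 10);
        return h;
    }

    // Same calculation as the article's indexFor.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        // These hash codes differ only above the low 4 bits,
        // so their raw index in a 16-slot table is always 0.
        int[] hashCodes = {0x10000, 0x20000, 0x30000, 0x40000};
        for (int hc : hashCodes) {
            System.out.println(Integer.toHexString(hc)
                    + "  raw index = " + indexFor(hc, length)
                    + "  mixed index = " + indexFor(oldHash(hc), length));
        }
    }
}
```

Running it shows a raw index of 0 for every sample, while the mixed index depends on the high bits as well.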


How the position of the data is determined
Look at the following two lines:

int hash = hash(k.hashCode());
int i = indexFor(hash, table.length);

The first line computes the hash value. The second line computes the element's position in the array from that hash: the position is the bitwise AND of the hash value and the array length minus 1.

static int indexFor(int h, int length) {
    return h & (length - 1);
}
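A short experiment shows why this works only because the table length is a power of 2. For such lengths, h & (length - 1) gives the same result as a non-negative remainder (Math.floorMod), including for negative hashes. The class name IndexForDemo and the sample values are illustrative only.

```java
// Sketch: indexFor on power-of-two and non-power-of-two lengths.
class IndexForDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int[] hashes = {17, 33, -1, 65536};
        for (int h : hashes) {
            System.out.println(h + " -> " + indexFor(h, 16)
                    + " (floorMod gives " + Math.floorMod(h, 16) + ")");
        }
        // With length 10 the mask is 9 (binary 1001), so only the
        // indices 0, 1, 8 and 9 can ever be produced.
        for (int h = 0; h < 10; h++) {
            System.out.print(indexFor(h, 10) + " ");
        }
        System.out.println();
    }
}
```

This is one reason the constructor rounds the requested capacity up to a power of 2: with any other length, some slots would never be used.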

What exactly did the put method do?

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

If key is null, it is handled separately; see the putForNullKey method:

private V putForNullKey(V value) {
    int hash = hash(NULL_KEY.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        if (e.key == NULL_KEY) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, (K) NULL_KEY, value, i);
    return null;
}

The declaration of NULL_KEY:

static final Object NULL_KEY = new Object();

This piece of code handles hash collisions. An array slot may hold more than one object, because it is the head of a linked list. After the list is located from the hash value, the list is traversed; if an entry with an equal key is found, its value is replaced and the old value is returned.

for (Entry<K,V> e = table[i]; e != null; e = e.next) {
    if (e.key == NULL_KEY) {
        V oldValue = e.value;
        e.value = value;
        e.recordAccess(this);
        return oldValue;
    }
}

If no entry in the linked list has an equal key, the new object is added to the list.

modCount++;
addEntry(hash, (K) NULL_KEY, value, i);
return null;

Now look at the addEntry method

void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
    if (size++ >= threshold)
        resize(2 * table.length);
}

The statement table[bucketIndex] = new Entry<K,V>(hash, key, value, e) creates a new Entry object and puts it at the head of the entry list at the current position. The Entry constructor below shows how: the assignment next = n links the new entry to the previous head.

Entry(int h, K k, V v, Entry<K,V> n) {
    value = v;
    next = n;
    key = k;
    hash = h;
}
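Putting the pieces together, the put logic can be tried out with a stripped-down map. This is a sketch under simplifying assumptions, not the JDK class: TinyChainedMap is a hypothetical name, the table is fixed at 16 slots and never resized, null keys are not supported, and the key's hashCode is used without the supplemental hash.

```java
// Sketch: put and get with chaining, following the structure described above.
class TinyChainedMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;
        final int hash;
        Entry<K, V> next;

        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = (Entry<K, V>[]) new Entry[16];
    private int size;

    private static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Replaces the value if the key exists and returns the old value;
    // otherwise inserts at the head of the chain and returns null.
    public V put(K key, V value) {
        int hash = key.hashCode();
        int i = indexFor(hash, table.length);
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key))) {
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }
        table[i] = new Entry<>(hash, key, value, table[i]);
        size++;
        return null;
    }

    public V get(Object key) {
        int hash = key.hashCode();
        for (Entry<K, V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key)))
                return e.value;
        }
        return null;
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyChainedMap<String, Integer> map = new TinyChainedMap<>();
        // "Aa" and "BB" have the same hashCode (2112), so they share one chain.
        System.out.println(map.put("Aa", 1)); // null
        System.out.println(map.put("BB", 2)); // null
        System.out.println(map.put("Aa", 3)); // 1
        System.out.println(map.get("BB"));    // 2
        System.out.println(map.size());       // 2
    }
}
```

The two keys collide on purpose, so the example exercises both branches: replacing a value found in the chain and inserting a new head.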

How to expand?

When an element is put and the capacity limit is reached, HashMap expands; the new capacity is always twice the original.
The addEntry method above contains this statement:

if (size++ >= threshold) resize(2 * table.length);

This is the check that triggers expansion. Note that HashMap does not wait until the data size reaches its full capacity before expanding; it expands as soon as the size reaches the value of threshold, where threshold = capacity * load factor. Now look at the resize method:

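void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int) (newCapacity * loadFactor);
}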

Pay particular attention to the transfer method:

void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}

The transfer method recomputes the array position of every element. The hash stored in each entry is reused, but because the new capacity is larger, indexFor can place an element at a different position than before.
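A small calculation makes the position change concrete. When the capacity doubles, one more bit of the hash takes part in the mask, so an element either keeps its index or moves up by exactly the old capacity. The class name ResizeIndexDemo and the sample hashes below are chosen only for illustration.

```java
// Sketch: where entries go when the table grows from 16 to 32 slots.
class ResizeIndexDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int oldCapacity = 16;
        int newCapacity = 32;
        // All four hashes share slot 5 in a 16-slot table.
        int[] hashes = {5, 21, 37, 53};
        for (int h : hashes) {
            System.out.println("hash " + h + ": index "
                    + indexFor(h, oldCapacity) + " -> " + indexFor(h, newCapacity));
        }
        // hash 5:  index 5 -> 5
        // hash 21: index 5 -> 21
        // hash 37: index 5 -> 5
        // hash 53: index 5 -> 21
    }
}
```

The chain that held four entries at index 5 is split across indices 5 and 21 after the resize.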



Correct use of HashMap
1: Do not use HashMap in concurrent scenarios
HashMap is not thread-safe. If one instance is modified by multiple threads, unpredictable problems can occur. According to Sun, a concurrent expansion can leave a closed loop in a bucket's linked list; a later get on that bucket then loops forever, and the consequence is 100% CPU usage.
Look at the for loop in the get method: it ends only when e becomes null, which never happens in a circular list.

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
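When a map really has to be shared between threads, the standard library offers two common options: wrapping a HashMap with Collections.synchronizedMap, or using java.util.concurrent.ConcurrentHashMap. The sketch below fills each from several threads; SharedMapDemo and fillFromThreads are hypothetical names, and the thread and key counts are arbitrary.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: writing to a shared map from several threads using thread-safe maps.
class SharedMapDemo {
    // Starts `threads` writers that each put `perThread` distinct keys,
    // waits for all of them, and returns the final size of the map.
    static int fillFromThreads(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ex);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> wrapped = Collections.synchronizedMap(new HashMap<>());
        Map<Integer, Integer> concurrent = new ConcurrentHashMap<>();
        System.out.println(fillFromThreads(wrapped, 4, 1000));    // 4000
        System.out.println(fillFromThreads(concurrent, 4, 1000)); // 4000
    }
}
```

Passing a plain new HashMap<>() to the same method is exactly the misuse described above; the result is then unpredictable.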

2: If the data size is fixed, then it is best to set a reasonable capacity value for HashMap

According to the analysis above, the default initial capacity of HashMap is 16 and the default load factor is 0.75. That is, with the default HashMap constructor, once the number of entries exceeds 16 * 0.75 = 12, the HashMap expands. Expansion brings a series of operations: a new array of twice the size is created and every existing element is transferred to its new position. If you have thousands or tens of thousands of entries and use the default constructor, the result is painful, because the HashMap keeps expanding and keeps rehashing, and the resulting memory and CPU overhead may be fatal in some cases. Also remember that, wherever HashMap is used, multiple threads must not share one instance unless it is wrapped and synchronized.
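The cost can be estimated with a small simulation of the size++ >= threshold rule quoted earlier. This is a sketch of the article's version of the algorithm, not a measurement of a real HashMap: PresizeDemo, capacityFor and countResizes are hypothetical helpers, and capacityFor borrows the formula used by constructor 4.

```java
// Sketch: counting table doublings with and without a pre-sized capacity.
class PresizeDemo {
    // Capacity request that holds expectedSize entries without a resize,
    // using the same formula as constructor 4.
    static int capacityFor(int expectedSize, float loadFactor) {
        return (int) (expectedSize / loadFactor) + 1;
    }

    // Smallest power of 2 >= initialCapacity, as in constructor 1.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Simulates n insertions of distinct keys and counts how often the table doubles.
    static int countResizes(int initialCapacity, float loadFactor, int n) {
        int capacity = roundUpToPowerOfTwo(initialCapacity);
        int threshold = (int) (capacity * loadFactor);
        int size = 0;
        int resizes = 0;
        for (int k = 0; k < n; k++) {
            if (size++ >= threshold) {
                capacity *= 2;
                threshold = (int) (capacity * loadFactor);
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        int n = 10000;
        System.out.println(countResizes(16, 0.75f, n));                      // 10
        System.out.println(countResizes(capacityFor(n, 0.75f), 0.75f, n));   // 0
    }
}
```

For 10,000 entries the default capacity leads to ten doublings, each of which transfers every entry stored so far, while new HashMap(capacityFor(10000, 0.75f)) would need none.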

Transferred from: http://www.java3z.com/cwbwebhome/article/article8/81388.html?id=3973
