Java basics: parsing HashMap and HashSet

Last Update: 2015-07-03. Source: Internet

Author: User


I. HashMap

HashMap is a hash-table-based data structure that stores references to key-value objects.

A key stored in a HashMap must implement two methods:

(1) equals(): determines whether two keys are the same, which guarantees that every stored key is unique;

(2) hashCode(): computes the position of the key-value reference in the hash table from its key.
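
The two requirements above can be illustrated with a small key class. This is only a sketch: the Point type and the KeyContractDemo class are hypothetical names made up for this example, not part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyContractDemo {
    // Hypothetical key type used only for illustration.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            // Equal points must produce equal hash codes.
            return Objects.hash(x, y);
        }
    }

    // Stores a value under one Point and looks it up with an equal Point.
    static String lookup() {
        Map<Point, String> labels = new HashMap<>();
        labels.put(new Point(1, 2), "first");
        // A different but equal instance finds the same entry.
        return labels.get(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // first
    }
}
```

If Point did not override hashCode(), the two equal instances would most likely land in different buckets and the lookup would return null.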

The underlying structure of HashMap is an array:

    transient Entry<K,V>[] table;

Entry is defined as follows:

    static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        int hash;
    }

Each entry holds the key, the value, and the hash value. More importantly, it has a next pointer to the following node.

As the put method described below shows, HashMap is backed by a hash table and resolves collisions by separate chaining (linked lists).
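
To make the chaining idea concrete, here is a minimal sketch of a fixed-size chained hash table. It is not the JDK implementation: the class name ChainedTableSketch, the Node type, and the fixed capacity of 16 are assumptions made for illustration, and resizing is left out entirely.

```java
import java.util.Objects;

class ChainedTableSketch<K, V> {
    // One node of a bucket's linked list, with the same four fields as Entry.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        final int hash;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed capacity (a power of two); this sketch never resizes.
    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private int indexFor(int hash) {
        return hash & (table.length - 1);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int hash = Objects.hashCode(key); // 0 for a null key
        int i = indexFor(hash);
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // No match in the chain: insert the new node at the head.
        table[i] = new Node<>(hash, key, value, table[i]);
        return null;
    }

    V get(Object key) {
        int hash = Objects.hashCode(key);
        for (Node<K, V> e = table[indexFor(hash)]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedTableSketch<Integer, String> t = new ChainedTableSketch<>();
        t.put(1, "one");
        t.put(17, "seventeen"); // 17 & 15 == 1, so it shares a bucket with key 1
        System.out.println(t.get(1) + " " + t.get(17));
    }
}
```

Keys 1 and 17 collide in bucket 1, and both remain reachable because the bucket holds a linked list rather than a single slot.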


1. public V put(K key, V value)

The source code is as follows:

    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        // A null key is handled separately
        if (key == null)
            return putForNullKey(value);
        // Compute the hash value from the key
        int hash = hash(key);
        // Map the hash value to an index in the hash table
        int i = indexFor(hash, table.length);
        // Search the linked list at that index for an entry with the same key.
        // Note that table[i] points to the head node of the Entry<K,V> linked list.
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                // A node with the same key was found (by identity or equals()):
                // overwrite the old value with the new one and return the old value
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        // Create a new Entry<K,V> and insert it at the head of the list at index i
        addEntry(hash, key, value, i);
        return null;
    }

A null key is handled as follows:

    private V putForNullKey(V value) {
        // Check index 0 of the hash table for an existing node whose key is null
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        // No such node exists: insert a new one at index 0
        addEntry(0, null, value, 0);
        return null;
    }

The hash value is computed from the key as follows:

    final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }
        // Call the hashCode() method of the key to compute the hash value
        h ^= k.hashCode();
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
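
The effect of the shift-and-XOR steps can be seen in a small sketch. Two assumptions are made here: hashSeed is taken to be 0, and indexFor (whose body is not shown in this article) is assumed to mask the hash with length - 1, which requires the table length to be a power of two. The class name HashSpreadSketch is made up for this example.

```java
class HashSpreadSketch {
    // The supplemental hash from the excerpt, assuming hashSeed == 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Assumed index computation: keep only the low bits of the hash.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // Two hash codes that differ only in their high bits.
        int a = 0x10000;
        int b = 0x20000;
        // Without spreading, the low four bits are identical, so both map to bucket 0.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));
        // After spreading, the high bits influence the low bits and the buckets differ.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16));
    }
}
```

Because the index only uses the low bits, hash codes that differ solely in their high bits would always collide in a small table without this extra mixing step.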

The following conclusions can be drawn:

(1) When a key-value pair is inserted and the key already exists, the new value overwrites the old value;

(2) When the inserted key is null, the entry is placed at index 0 of the hash table, and there is at most one node whose key is null;

(3) When a key-value pair is inserted, the hashCode() method of the key is used to compute the index in the hash table, and the equals() method of the key is used to decide whether two keys are the same; if they are, the existing entry is overwritten. The key class must therefore implement both methods.
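
The first two conclusions are easy to check against java.util.HashMap itself. The class name PutBehaviorDemo and the sample keys are made up for this short demonstration.

```java
import java.util.HashMap;
import java.util.Map;

class PutBehaviorDemo {
    // Builds a map by inserting a duplicate key and a null key twice each.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("a", 2);    // same key: the old value 1 is overwritten
        map.put(null, 10);
        map.put(null, 20);  // only one entry with a null key is kept
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1)); // null (no previous value)
        System.out.println(map.put("a", 2)); // 1 (the overwritten value)

        Map<String, Integer> built = build();
        System.out.println(built.get("a"));  // 2
        System.out.println(built.get(null)); // 20
        System.out.println(built.size());    // 2
    }
}
```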

2. public V get(Object key)

Looking up a value by its key is relatively simple:

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }

The getEntry(key) method is implemented as follows:

    final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }
        // Compute the hash value from the key
        int hash = (key == null) ? 0 : hash(key);
        // indexFor(hash, table.length) maps the hash value to an index in the hash table.
        // Search the linked list at that index for a node with the same key and return it.
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }
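
One consequence of the get code above is that a null result is ambiguous: it is returned both when no entry exists and when the entry's value is null. A minimal sketch of telling the two cases apart with containsKey follows; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class NullLookupDemo {
    // Distinguishes the two cases in which get() returns null.
    static String describe(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) {
            return "mapped to " + value;
        }
        return map.containsKey(key) ? "mapped to null" : "absent";
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("a", "1");
        map.put("b", null);
        System.out.println(describe(map, "a")); // mapped to 1
        System.out.println(describe(map, "b")); // mapped to null
        System.out.println(describe(map, "c")); // absent
    }
}
```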

3. Capacity and resizing

HashMap has a default load factor loadFactor = 0.75, and the default length of the entry array is 16. The purpose of the load factor is to keep some spare room in the entry array; by default, 25% of the array is kept as headroom. When the number of entries in the HashMap exceeds 12 (16 * 0.75), the entry array is resized for the first time, and later expansions follow the same rule.

Each resize doubles the capacity of the HashMap, and during the resize the index of every existing entry must be recomputed for the new array, which is time-consuming. In normal use, if you can estimate the approximate size of the HashMap, you can set the load factor loadFactor and the initial length of the entry array accordingly to avoid resize operations and make put more efficient.
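
As a sketch of that advice, the initial capacity can be derived from the expected number of entries so that the threshold is not exceeded. The helper names capacityFor and newPresizedMap are hypothetical; only the HashMap(int, float) constructor is part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;

class PresizedMapSketch {
    static final float LOAD_FACTOR = 0.75f;

    // Smallest capacity whose threshold (capacity * loadFactor)
    // is at least the expected number of entries.
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / (double) LOAD_FACTOR);
    }

    // Creates a HashMap that should not need to resize while it holds
    // up to expectedEntries entries.
    static <K, V> Map<K, V> newPresizedMap(int expectedEntries) {
        return new HashMap<>(capacityFor(expectedEntries), LOAD_FACTOR);
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(12));   // 16
        System.out.println(capacityFor(1000)); // 1334
        Map<Integer, String> map = newPresizedMap(1000);
        for (int i = 0; i < 1000; i++) {
            map.put(i, "value" + i);
        }
        System.out.println(map.size());        // 1000
    }
}
```

HashMap rounds the requested capacity up to a power of two internally, so the actual table may be somewhat larger than the computed value.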

The resize operation is implemented as follows:

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
        // Create a new hash table with the new capacity
        Entry[] newTable = new Entry[newCapacity];
        // Copy the nodes one by one into the new hash table, which is relatively time-consuming
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }
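
Why the nodes cannot simply be copied to the same positions can be shown with a short sketch. It assumes that indexFor masks the hash with length - 1 (its body is not shown in this article); the class name and the sample hash values are made up for illustration.

```java
class ResizeIndexSketch {
    // Assumed index computation for a power-of-two table length.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // Four hash values that all share bucket 1 in a table of length 16.
        int[] hashes = {1, 17, 33, 49};
        for (int h : hashes) {
            // After doubling to 32, each one either stays at index 1
            // or moves to index 1 + 16 = 17.
            System.out.println(h + ": " + indexFor(h, 16) + " -> " + indexFor(h, 32));
        }
    }
}
```

Since the mask gains one bit when the capacity doubles, every entry has to be re-examined, which is the cost the transfer step pays.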

4. Similar structures

(1) Hashtable:

Hashtable is the earlier counterpart of HashMap. Its underlying layer is similar to HashMap: it also stores data in a hash table and resolves collisions by separate chaining, and it ensures thread safety through the synchronized keyword. Its put method is similar to the put method of HashMap:

    public synchronized V put(K key, V value) {
        // A null value is not allowed
        if (value == null) {
            throw new NullPointerException();
        }
        Entry tab[] = table;
        // Use the hashCode() method of the key to compute the hash value
        int hash = hash(key);
        // Compute the index by taking the remainder
        int index = (hash & 0x7FFFFFFF) % tab.length;
        // Use the equals() method to find a node with the same key and overwrite its value
        for (Entry<K,V> e = tab[index]; e != null; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        modCount++;
        if (count >= threshold) {
            // Rehash the table if the threshold is exceeded
            rehash();
            tab = table;
            hash = hash(key);
            index = (hash & 0x7FFFFFFF) % tab.length;
        }
        // Insert a new node
        Entry<K,V> e = tab[index];
        tab[index] = new Entry<>(hash, key, value, e);
        count++;
        return null;
    }
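
One visible difference from HashMap follows from this code: Hashtable rejects null values explicitly, and a null key also fails because the key's hashCode() is called on it. The sketch below checks both maps; the class and helper names are hypothetical.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullHandlingDemo {
    // Returns true if the map accepts a null key.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Returns true if the map accepts a null value.
    static boolean acceptsNullValue(Map<String, String> map) {
        try {
            map.put("key", null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));     // true
        System.out.println(acceptsNullValue(new HashMap<>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<>()));   // false
        System.out.println(acceptsNullValue(new Hashtable<>())); // false
    }
}
```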

(2) ConcurrentHashMap

ConcurrentHashMap is the thread-safe version of HashMap. Its underlying layer is similar to HashMap: it also stores data in a hash table, resolves collisions by separate chaining, computes the hash table index with the hashCode() method of the key, and uses the equals() method of the key to determine whether two keys are the same.

The difference is that ConcurrentHashMap introduces the concept of a "segment" for the hash table: when a key-value pair is inserted, the lock is acquired per segment rather than on the whole table.

Summary: The underlying implementation of the three is essentially the same; they differ in whether they are thread-safe and in how thread safety is implemented. HashMap is not thread-safe and is the simple, efficient version; Hashtable is a thread-safe HashMap implemented with the synchronized keyword; and the newer ConcurrentHashMap is a thread-safe HashMap implemented with a more flexible approach, segmentation.

So whatever the reason, we recommend that you no longer use the old Hashtable. If you do not need concurrent access, we recommend HashMap; otherwise, we recommend the thread-safe ConcurrentHashMap.
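
As a rough illustration of the recommendation, the sketch below lets several threads update one counter in a ConcurrentHashMap. It assumes Java 8 or later (for lambdas and the merge method); the class name, the "hits" key, and the thread counts are made up for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCountDemo {
    // Starts the given number of threads; each increments the same key repeatedly.
    static int countWithThreads(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    // merge performs the read-modify-write atomically for this key.
                    counts.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000)); // 4000
    }
}
```

With a plain HashMap in place of ConcurrentHashMap, the same code could lose updates or corrupt the map, because its operations are not synchronized.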

II. HashSet

HashSet is a hash-based container designed for storing individual elements. Its advantage is fast access.

The design of HashSet is rather "lazy": it is simply a wrapper around HashMap:

    public class HashSet<E>
        extends AbstractSet<E>
        implements Set<E>, Cloneable, java.io.Serializable
    {
        static final long serialVersionUID = -5024744406713321676L;

        private transient HashMap<E,Object> map;

        // Dummy value to associate with an Object in the backing Map
        private static final Object PRESENT = new Object();

We can see that the underlying layer is a HashMap. The elements of the set are stored as keys, and the value associated with every key is the same dummy object, PRESENT.

The add method is as follows:

    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

The following method returns the iterator:

    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }
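
The wrapper idea can be reproduced in a few lines. The class below is a simplified sketch, not the JDK's HashSet: the name MiniHashSet and the contains and size methods are additions made for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;

class MiniHashSet<E> implements Iterable<E> {
    // Shared dummy value stored for every element.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // Returns true only if the element was not already in the set,
    // because put returns null when the key is new.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    @Override
    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }

    public static void main(String[] args) {
        MiniHashSet<String> set = new MiniHashSet<>();
        System.out.println(set.add("a"));      // true
        System.out.println(set.add("a"));      // false, already present
        System.out.println(set.contains("a")); // true
        System.out.println(set.size());        // 1
    }
}
```

The boolean returned by add comes for free from the return value of HashMap.put, which is why the set needs no lookup logic of its own.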

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
