Java Containers: Learning HashMap and HashSet
While learning Java I came across the HashMap and HashSet classes. Until now I had only known them at the usage level, so I read the JDK source code and wrote up these notes (many companies ask about the underlying implementation in interviews).
Source code: src.zip under JDK 1.7
HashMap is a key-value type. It provides a data structure for storing key-value pairs and implements the Map interface. Keys are unique: at any point in time, a key maps to at most one value.
Here are several of its members (not a complete list):
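A minimal sketch of the key uniqueness described above, using only the public java.util.HashMap API. The class name KeyUniquenessDemo and the sample keys are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyUniquenessDemo {
    // Puts the same key twice and returns the resulting map.
    static Map<String, Integer> build() {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);
        ages.put("alice", 31); // same key: the old value is replaced
        ages.put("bob", 25);
        return ages;
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = build();
        System.out.println(ages.size());       // 2
        System.out.println(ages.get("alice")); // 31
    }
}
```

The second put on "alice" does not add an entry; it overwrites the value held by the existing one.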
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
static final float DEFAULT_LOAD_FACTOR = 0.75f;
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
final float loadFactor;
int threshold;
table is an array; every key-value pair is stored in it.
Entry implements Map.Entry, that is, it represents one key-value pair in the Map. (It contains other methods as well; skip them for now.)
static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    int hash;

    /**
     * Creates new entry.
     */
    Entry(int h, K k, V v, Entry<K,V> n) {
        value = v;
        next = n;
        key = k;
        hash = h;
    }
    …
Entry is a static nested class of HashMap. Its members include key, value, and hash, plus next, which links entries into a chain. Entry also overrides the basic hashCode() method. That covers its composition.
The data structure of HashMap is an array (table) in which each slot holds a linked list of Entry objects.
Now let's look at HashMap's put(K key, V value) method.
public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        // the table is empty: allocate space for it
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        // if the key being inserted already exists, replace its value
        // with the new one and return the old value
        Object k;
        // equal hash values may just be a collision, so the keys are compared too;
        // the key class may override equals, hence the ||: same reference or equals()
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, key, value, i); // add to the head of the linked list
    return null;
}
As the code above shows, the storage location of each Entry is determined by its key alone; the value can be regarded as an attachment to the key. Next, look at the hash() method.
final int hash(Object k) {
    int h = hashSeed;
    if (0 != h && k instanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }

    h ^= k.hashCode();

    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
That is, a hash function maps the key to an int hash value (I am still unclear on hashCode and on how h is derived ......), and indexFor then determines the key's index in the table array.
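One way to see what the shifts in hash() are for is to feed in two hash codes that differ only in their high bits. The sketch below copies the shift-and-XOR lines from the method above and assumes hashSeed is 0; the class name SpreadDemo, the method name spread, and the two sample values are made up for illustration.

```java
public class SpreadDemo {
    // Same shifts as the JDK 1.7 hash() shown above, with hashSeed assumed to be 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000, b = 0x20000; // identical in the low 4 bits
        // Without spreading, both land in bucket 0 of a 16-slot table.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));                 // 0 0
        // With spreading, the high bits influence the index.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16)); // 1 2
    }
}
```

Since indexFor only looks at the low bits, folding the high bits downward first keeps such keys from piling up in one bucket.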
static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length-1);
}
length is always a power of 2, so h & (length-1) keeps only the low bits of h and always yields an index between 0 and length-1. This handles hash values that are larger than the array length.
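A small check of the masking trick, as a sketch: for a power-of-2 length, h & (length-1) should agree with a floor-modulo, including for negative hashes. The class name IndexForDemo and the helper maskMatchesFloorMod are hypothetical, and Math.floorMod assumes Java 8 or later rather than the JDK 1.7 discussed here.

```java
public class IndexForDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if masking and floor-modulo agree for every sample hash.
    static boolean maskMatchesFloorMod(int length, int... hashes) {
        for (int h : hashes) {
            if (indexFor(h, length) != Math.floorMod(h, length)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] samples = {0, 5, 17, 1_000_003, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        System.out.println(maskMatchesFloorMod(16, samples)); // true
        System.out.println(maskMatchesFloorMod(12, samples)); // false: 12 is not a power of 2
    }
}
```

The second call fails because 12 - 1 is 1011 in binary, so the mask skips a bit and no longer behaves like a remainder.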
Let's take a look at the addEntry method:
void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }

    createEntry(hash, key, value, bucketIndex);
}

void createEntry(int hash, K key, V value, int bucketIndex) {
    // Note that the new Entry is inserted at the head of the linked list.
    // As the Entry constructor above shows, e is the previous head
    // and becomes next of the new Entry.
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}
As you can see, if different keys map to the same bucket index, addEntry is called and separate chaining (the linked-list method) is used to resolve the conflict: the new entry is placed at the head of the linked list.
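Head insertion can be sketched in isolation with a single bucket. Node below is a simplified, hypothetical stand-in for HashMap.Entry (it keeps only key and next), and the class name HeadInsertDemo is made up.

```java
public class HeadInsertDemo {
    // Simplified, hypothetical stand-in for HashMap.Entry.
    static class Node {
        final String key;
        final Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Inserts keys one by one at the head of a single bucket and
    // returns the keys in chain order.
    static String chainOrder(String... keys) {
        Node head = null;
        for (String k : keys) {
            head = new Node(k, head); // the previous head becomes next
        }
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(chainOrder("a", "b", "c")); // c -> b -> a
    }
}
```

The most recently inserted key ends up first in the chain, so insertion needs no traversal.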
In addition, when the number of entries reaches threshold (and the target bucket is already occupied), the table is resized to twice its size. resize uses a transfer method to copy the contents of the old table into the new table.
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        // already at 1 << 30: the table cannot grow any further
        threshold = Integer.MAX_VALUE; // 2^31 - 1
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    // copy the contents of the current table into the new table
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    // recalculate threshold
    threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
That completes the whole put process of HashMap.
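The arithmetic in resize() can be tried out on its own. This sketch reuses the threshold formula and indexFor from the source above; the class name ResizeDemo and the sample hash values 21 and 37 are made up for illustration.

```java
public class ResizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same formula as the last line of resize() above.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24
        // After doubling, an entry either moves up by the old capacity or stays put.
        System.out.println(indexFor(21, 16) + " -> " + indexFor(21, 32)); // 5 -> 21
        System.out.println(indexFor(37, 16) + " -> " + indexFor(37, 32)); // 5 -> 5
    }
}
```

With the default capacity of 16 and load factor of 0.75, the first resize is therefore considered once the map holds 12 entries, and two entries that shared bucket 5 may be split apart by the larger mask.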
Get method:
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    Entry<K,V> entry = getEntry(key);

    return null == entry ? null : entry.getValue();
}

final Entry<K,V> getEntry(Object key) {
    if (size == 0) {
        return null;
    }

    int hash = (key == null) ? 0 : hash(key);
    // table[indexFor(hash, table.length)] maps the value computed by
    // indexFor directly to an index in the array
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) { // move on to the next Entry in the chain
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}
Search process: find the bucket the key maps to, then compare the key with the elements of that bucket's linked list one by one until the target is found. Because this is a hash table, several keys can map to the same index, which is why the comparison against the list elements is needed.
HashMap performs best when each bucket holds only a single Entry, that is, when no Entry chain has formed through the next pointer and no slot of the table array holds a linked list.
HashMap summary:
Under the hood, HashMap treats each key-value pair as a whole, namely an Entry object, and uses an Entry[] array to store all key-value pairs. When an Entry needs to be stored, its storage location is determined by the hash algorithm; when an Entry needs to be retrieved, the same hash algorithm locates it and it is fetched directly. So the reason HashMap can store and retrieve its entries quickly is similar to what our mothers taught us in real life: put different things in different places, and you can find them quickly when you need them.
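The need to compare keys inside a bucket can be made visible with a deliberately bad key. In this sketch, BadKey is a hypothetical class whose hashCode is constant, so every instance lands in the same bucket and lookups must fall back on equals; the class name CollisionLookupDemo is also made up.

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionLookupDemo {
    // Hypothetical key whose hashCode is constant, so every instance collides.
    static final class BadKey {
        final String name;
        BadKey(String name) { this.name = name; }

        @Override public int hashCode() { return 42; }

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
    }

    static Map<BadKey, Integer> build() {
        Map<BadKey, Integer> m = new HashMap<>();
        m.put(new BadKey("x"), 1);
        m.put(new BadKey("y"), 2);
        m.put(new BadKey("z"), 3);
        return m;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> m = build();
        System.out.println(m.get(new BadKey("y"))); // 2
        System.out.println(m.get(new BadKey("w"))); // null
    }
}
```

The map still returns correct answers, but every get has to walk the colliding entries, which is the slow case the notes warn about.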
Implementation of HashSet
HashSet is implemented on top of HashMap: under the hood, a HashSet uses a HashMap to store all of its elements, so its implementation is relatively simple. If you open the HashSet source code, you will see the following:
public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{
    // use the keys of a HashMap to store all the elements of the HashSet
    private transient HashMap<E,Object> map;

    // a dummy Object used as the value in the HashMap
    private static final Object PRESENT = new Object();
    ...
    // initialize the HashSet; under the hood a HashMap is created
    public HashSet() {
        map = new HashMap<E,Object>();
    }

    // create a HashSet with the specified initialCapacity and loadFactor;
    // in fact this just creates a HashMap
    public HashSet(int initialCapacity, float loadFactor) {
        map = new HashMap<E,Object>(initialCapacity, loadFactor);
    }

    public HashSet(int initialCapacity) {
        map = new HashMap<E,Object>(initialCapacity);
    }

    HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = new LinkedHashMap<E,Object>(initialCapacity, loadFactor);
    }

    // call the map's keySet to return all the keys
    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }

    // call HashMap's size() to return the number of entries,
    // which is the number of elements in the Set
    public int size() {
        return map.size();
    }

    // call HashMap's isEmpty() to determine whether the HashSet is empty;
    // when the HashMap is empty, the corresponding HashSet is empty too
    public boolean isEmpty() {
        return map.isEmpty();
    }

    // call HashMap's containsKey to determine whether the specified element is present;
    // all elements of the HashSet are stored as keys of the HashMap
    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    // put the specified element into the HashSet,
    // that is, put the element into the HashMap as a key
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    // call HashMap's remove method to delete the specified Entry,
    // which deletes the corresponding element from the HashSet
    public boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    // call the map's clear method to remove all entries,
    // which removes all elements from the HashSet
    public void clear() {
        map.clear();
    }
    ...
}
From the source code above, we can see that the implementation of HashSet is very simple. It just wraps a HashMap object that stores all the set elements: every element of a HashSet is stored as a key of the HashMap, while every value is PRESENT, a single static Object.
Most HashSet methods are implemented by calling the corresponding HashMap method, so HashSet and HashMap are essentially the same in implementation.
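The delegation pattern is small enough to rebuild as a sketch. SimpleSet below is a hypothetical, stripped-down class, not the JDK's HashSet: it keeps only add, contains, remove, and size, and follows the same PRESENT trick shown in the source above.

```java
import java.util.HashMap;

public class SimpleSet<E> {
    // Dummy value shared by every key, in the spirit of PRESENT above.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns null only when the key was absent, so add reports "newly added".
    public boolean add(E e) { return map.put(e, PRESENT) == null; }

    public boolean contains(Object o) { return map.containsKey(o); }

    // remove returns the old value, which is PRESENT only if the key existed.
    public boolean remove(Object o) { return map.remove(o) == PRESENT; }

    public int size() { return map.size(); }

    public static void main(String[] args) {
        SimpleSet<String> s = new SimpleSet<>();
        System.out.println(s.add("a"));      // true
        System.out.println(s.add("a"));      // false: already present
        System.out.println(s.contains("a")); // true
        System.out.println(s.remove("a"));   // true
        System.out.println(s.size());        // 0
    }
}
```

Set uniqueness comes for free here: it is simply the key uniqueness of the wrapped HashMap.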
Reference: Crazy Java
That is all the analysis for now. This is my first time reading the JDK source code, so if you spot a problem or have a correction, please leave a comment ......