See: http://blog.yemou.net/article/query/info/tytfjhfascvhzxcyt3591
1. Data structure of HashMap
Among data structures, arrays and linked lists can both be used to store data, but they sit at two extremes.
Array
An array occupies a contiguous block of memory, which must be reserved up front, so its space cost is high. In return, access by index takes O(1) time (a binary search over a sorted array takes O(log N)). Arrays are characterized by easy addressing but difficult insertion and deletion.
Linked list
A linked list occupies scattered memory locations and allocates nodes only as needed, so its space cost is low, but looking up an element takes up to O(N) time. Linked lists are characterized by difficult addressing but easy insertion and deletion.
Hash table
Can we combine the strengths of both into a data structure that is easy to address and also easy to insert into and delete from? The answer is yes: that is the hash table. A hash table makes lookups convenient without taking up too much memory, and it is very convenient to use.
There are many different implementations of the hash table. The one explained next is the most commonly used, separate chaining (also called the zipper method), which can be understood as an "array of linked lists".
In the illustrated example, the hash table consists of an array plus linked lists: an array of length 16 in which each slot stores the head node of a linked list. By what rule are elements placed into the array? In general the slot is hash(key) % len, that is, the hash value of the element's key modulo the array length. For example, with an array of length 16, 12 % 16 = 12, 28 % 16 = 12, 108 % 16 = 12, and 140 % 16 = 12, so 12, 28, 108, and 140 are all stored in the array slot with index 12.
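The modulo rule above can be tried out directly. The following is a minimal sketch, not HashMap's own code: the class and method names are made up for illustration, and it assumes non-negative hash values (for an Integer key the hash is the int value itself).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, used only to illustrate hash(key) % len.
class BucketIndexSketch {
    // Maps a non-negative hash value to a slot of an array of the given length.
    static int bucketIndex(int hash, int len) {
        return hash % len;
    }

    // Collects the keys that land in the given slot.
    static List<Integer> keysInBucket(int[] keys, int len, int slot) {
        List<Integer> result = new ArrayList<>();
        for (int key : keys) {
            // Integer.hashCode(int) returns the int value itself
            if (bucketIndex(Integer.hashCode(key), len) == slot) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] keys = {12, 28, 108, 140, 5};
        System.out.println(keysInBucket(keys, 16, 12)); // prints [12, 28, 108, 140]
    }
}
```

The key 5 falls into slot 5, so it is not part of the chain at index 12.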
HashMap is in fact backed by a linear array, so the container holding the data can be thought of as a linear array. This may seem puzzling: how does a linear array store and retrieve data by key-value pairs? HashMap does some extra work to make this possible.
First, HashMap implements a static inner class, Entry, whose important fields are key, value, and next. From key and value it is clear that Entry is the basic bean HashMap uses to hold a key-value pair. The linear array mentioned above is Entry[]: the contents of the map are all stored in an Entry[] array.
/**
 * The table, resized as necessary. Length MUST always be a power of two.
 */
transient Entry[] table;
2. HashMap Access Implementation
Since it is a linear array, how can it be accessed randomly by key? HashMap uses a small algorithm here, roughly as follows:
When storing:
int hash = key.hashCode(); // hashCode() is not detailed here; just treat each key's hash as a fixed int value
int index = hash % table.length; // simplified: a negative hash would give a negative index
table[index] = value;
When retrieving:
int hash = key.hashCode();
int index = hash % table.length;
return table[index];
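The simplified scheme above can be written as a small runnable sketch. It is not how HashMap is implemented; the class name is hypothetical, the table length of 16 is an assumption, and Math.floorMod is swapped in for % so that negative hash codes still give a valid index. The sketch deliberately ignores collisions, so two keys with the same index overwrite each other.

```java
// Hypothetical class: one value per slot, no chaining.
class NaiveArrayMapSketch {
    private final Object[] slots = new Object[16];

    // Simplified index computation; floorMod keeps the result non-negative.
    private int indexOf(Object key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    void put(Object key, Object value) {
        slots[indexOf(key)] = value; // overwrites whatever was stored there
    }

    Object get(Object key) {
        return slots[indexOf(key)];
    }

    public static void main(String[] args) {
        NaiveArrayMapSketch map = new NaiveArrayMapSketch();
        map.put(12, "a");
        map.put(28, "b"); // 28 % 16 == 12, the same slot as key 12
        System.out.println(map.get(12)); // prints b: the value for key 12 was lost
    }
}
```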
1) put
Question: if two keys get the same index through hash % table.length, is there a risk that one overwrites the other?
Here HashMap uses a chained data structure. As mentioned above, the Entry class has a next field that refers to the next Entry. For example, suppose the first key-value pair A arrives and the hash of its key gives index = 0, so table[0] = A. Later a key-value pair B arrives whose index is also 0. What now? HashMap does this: B.next = A, table[0] = B. If C then arrives with index 0 as well, then C.next = B, table[0] = C. The slot at index 0 therefore holds three key-value pairs, A, B, and C, linked together through the next field, so nothing is overwritten. It also means that the array slot itself always holds the most recently inserted element. At this point the general implementation of HashMap should be clear.
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value); // a null key always goes into the first linked list of the array
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    // Traverse the linked list
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        // If the key already exists in the linked list, replace its value with the new one
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e); // the parameter e becomes the new entry's next
    // If size exceeds the threshold, double the table size and rehash
    if (size++ >= threshold)
        resize(2 * table.length);
}
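The head insertion performed by addEntry can be seen in isolation with a small sketch. The Node class and the putAt and chain methods are hypothetical names for illustration; the bucket index is passed in by the caller, and there is no duplicate-key check or resizing.

```java
// Hypothetical sketch of head insertion into one bucket.
class HeadInsertSketch {
    // Minimal node; the real Entry also caches the hash.
    static final class Node {
        final String key;
        final String value;
        final Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] table = new Node[16];

    // The new node points at the old head and then becomes the head itself.
    void putAt(int index, String key, String value) {
        table[index] = new Node(key, value, table[index]);
    }

    // Returns the keys of one bucket, head first.
    String chain(int index) {
        StringBuilder sb = new StringBuilder();
        for (Node e = table[index]; e != null; e = e.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        HeadInsertSketch map = new HeadInsertSketch();
        map.putAt(0, "A", "1");
        map.putAt(0, "B", "2");
        map.putAt(0, "C", "3");
        System.out.println(map.chain(0)); // prints C -> B -> A
    }
}
```

The most recently inserted pair, C, sits in the array slot, and the older pairs hang off it through next.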
HashMap also includes some optimizations, worth mentioning here. For example, the length of Entry[] is fixed, so as more and more data is put into the map, the chains at each index grow longer; will that hurt performance? HashMap has a load factor for this: as the size of the map grows, Entry[] is expanded according to a fixed rule.
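To make the expansion rule concrete, here is a rough sketch that only simulates the bookkeeping, using the `size++ >= threshold` check from addEntry and the threshold formula `(int) (capacity * loadFactor)`. The class and method names are hypothetical, and the values 16 and 0.75 are assumed here as the documented defaults of java.util.HashMap.

```java
// Hypothetical simulation of when the table doubles; no entries are stored.
class ThresholdSketch {
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Returns the table capacity after the given number of distinct insertions.
    static int capacityAfter(int insertions, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int threshold = threshold(capacity, loadFactor);
        int size = 0;
        for (int i = 0; i < insertions; i++) {
            if (size++ >= threshold) {   // same check as in addEntry
                capacity *= 2;           // stands in for resize(2 * table.length)
                threshold = threshold(capacity, loadFactor);
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));         // prints 12
        System.out.println(capacityAfter(12, 16, 0.75f)); // prints 16
        System.out.println(capacityAfter(13, 16, 0.75f)); // prints 32
    }
}
```

Under these assumptions the table keeps 16 slots through the 12th insertion and doubles to 32 on the 13th.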
2) get
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    // Locate the array element first, then traverse the linked list at that element
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
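Putting put and get together, a stripped-down chained map might look like the sketch below. It is an illustration under simplifying assumptions, not the JDK class: MiniChainedMap and Node are hypothetical names, the table is fixed at 16 slots, null keys are not supported, and the extra hash() mixing step is skipped.

```java
// Hypothetical minimal map using separate chaining.
class MiniChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Replaces the value if the key exists, otherwise inserts at the head of the chain.
    V put(K key, V value) {
        int hash = key.hashCode();
        int i = indexFor(hash, table.length);
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key))) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i] = new Node<>(hash, key, value, table[i]);
        return null;
    }

    // Walks the chain of the key's bucket until a matching key is found.
    V get(Object key) {
        int hash = key.hashCode();
        for (Node<K, V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key))) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MiniChainedMap<Integer, String> map = new MiniChainedMap<>();
        map.put(12, "a");
        map.put(28, "b");                 // same bucket as 12, chained instead of overwritten
        System.out.println(map.get(12));  // prints a
        System.out.println(map.get(28));  // prints b
    }
}
```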
3) Accessing the null key
An entry with a null key is always stored in the linked list at the first element (index 0) of the Entry[] array.
private V putForNullKey(V value) {
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(0, null, value, 0);
    return null;
}
private V getForNullKey() {
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null)
            return e.value;
    }
    return null;
}
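From the caller's side, the null key behaves like any other key. A short usage sketch with java.util.HashMap follows; the class name and the sample values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; only java.util.HashMap's public API is used.
class NullKeySketch {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "first"); // HashMap accepts a single null key
        map.put("k", "v");
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        String old = map.put(null, "second"); // replaces the value, returns the old one
        System.out.println(old);              // prints first
        System.out.println(map.get(null));    // prints second
        System.out.println(map.size());       // prints 2
    }
}
```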
4) Determining the array index: hashCode % table.length (modulo)
When HashMap is accessed, it must work out which element of the Entry[] array the current key corresponds to, that is, the array index. The algorithm is as follows:
/**
 * Returns index for hash code h.
 */
static int indexFor(int h, int length) {
    return h & (length - 1);
}
The bitwise AND here has the same effect as the modulo (remainder, %) operation, provided that length is a power of two.
It also means that two keys with the same array index do not necessarily have the same hashCode.
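A few concrete values show where the mask and the remainder agree and where they differ. This is a small sketch; the indexFor body is the one-liner quoted above, while the class name and the chosen inputs are only illustrative.

```java
// Hypothetical demo comparing h & (length - 1) with h % length.
class IndexMaskSketch {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // Power-of-two length, non-negative hash: mask and remainder agree.
        System.out.println(indexFor(140, 16) + " " + (140 % 16)); // prints 12 12

        // Different hash codes can share one index.
        System.out.println(indexFor(12, 16) + " " + indexFor(28, 16)); // prints 12 12

        // Negative hash: % yields a negative remainder, the mask stays in range.
        System.out.println(indexFor(-20, 16) + " " + (-20 % 16)); // prints 12 -4

        // Length that is not a power of two: the mask no longer matches the remainder.
        System.out.println(indexFor(12, 12) + " " + (12 % 12)); // prints 8 0
    }
}
```

The negative case is one practical reason to prefer the mask: the result is always a valid array index, which a plain % does not guarantee in Java.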
5) Table Initial Size
public HashMap(int initialCapacity, float loadFactor) {
    ...
    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;
    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
    table = new Entry[capacity];
    init();
}
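Note that the initial size of the table is not the initialCapacity passed to the constructor,
but the smallest power of two (2^n) that is >= initialCapacity.
Why is it designed this way? Because the index is computed as h & (length - 1), which only spreads keys over every slot when length is a power of two.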
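The rounding loop from the constructor can be run on its own to see which table sizes result. In this sketch the method name roundUpToPowerOfTwo is hypothetical, and the input is assumed to be a small non-negative value (the elided part of the constructor handles the limits).

```java
// Hypothetical wrapper around the constructor's rounding loop.
class CapacitySketch {
    // Smallest power of two >= initialCapacity; assumes 0 <= initialCapacity <= 2^30.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(10)); // prints 16
        System.out.println(roundUpToPowerOfTwo(16)); // prints 16
        System.out.println(roundUpToPowerOfTwo(17)); // prints 32
    }
}
```

So a request for capacity 10 yields a table of 16 slots, and a request for 17 yields 32.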
3. Ways to resolve hash conflicts
Open addressing (linear probing, quadratic probing, pseudo-random probing)
Rehashing with a second hash function
Separate chaining (the chain address method)
A shared overflow area
HashMap in Java resolves collisions with separate chaining.
4. The rehash process
When the number of entries in the hash table exceeds the threshold (capacity * load factor), the table must be resized. If the capacity has already reached the maximum possible value, the method sets the threshold to Integer.MAX_VALUE and returns. Otherwise a new table is created and the entries of the original table are moved into it.
/**
 * Rehashes the contents of this map into a new array with a
 * larger capacity. This method is called automatically when the
 * number of keys in this map reaches its threshold.
 *
 * If current capacity is MAXIMUM_CAPACITY, this method does not
 * resize the map, but sets threshold to Integer.MAX_VALUE.
 * This has the effect of preventing future calls.
 *
 * @param newCapacity the new capacity, MUST be a power of two;
 *        must be greater than current capacity unless current
 *        capacity is MAXIMUM_CAPACITY (in which case value
 *        is irrelevant).
 */
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}
/**
 * Transfers all entries from the current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                // Recalculate the index
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
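The effect of the transfer loop on a single chain can be observed with a small stand-alone sketch. The Node class, the static transfer and chain helpers, and the sample hashes 12, 28, 108, 140 are assumptions for illustration; the loop body mirrors the head insertion in the listing above.

```java
// Hypothetical sketch: moving one bucket from a 16-slot table to a 32-slot table.
class TransferSketch {
    static final class Node {
        final int hash;
        Node next;

        Node(int hash, Node next) {
            this.hash = hash;
            this.next = next;
        }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Re-buckets every node of src into a new table, inserting at the head each time.
    static Node[] transfer(Node[] src, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (int j = 0; j < src.length; j++) {
            Node e = src[j];
            src[j] = null;
            while (e != null) {
                Node next = e.next;
                int i = indexFor(e.hash, newCapacity); // recalculate the index
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    // Returns the hashes of one bucket, head first.
    static String chain(Node[] table, int index) {
        StringBuilder sb = new StringBuilder();
        for (Node e = table[index]; e != null; e = e.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(e.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] old = new Node[16];
        for (int h : new int[]{12, 28, 108, 140}) {
            old[12] = new Node(h, old[12]); // head insertion, all four land in bucket 12
        }
        System.out.println(chain(old, 12));    // prints 140 -> 108 -> 28 -> 12
        Node[] bigger = transfer(old, 32);
        System.out.println(chain(bigger, 12)); // prints 12 -> 108 -> 140
        System.out.println(chain(bigger, 28)); // prints 28
    }
}
```

After doubling, the key with hash 28 moves to bucket 28, and the entries that stay in bucket 12 appear in reversed order because each one is inserted at the head of the new chain.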
The implementation principle of JAVA HashMap