The implementation principle of Java HashMap

Source: Internet
Author: User

See: http://blog.yemou.net/article/query/info/tytfjhfascvhzxcyt359

1. Data structure of HashMap

Among data structures, both arrays and linked lists can be used to store data, but they sit at two opposite extremes.

Array

An array occupies a contiguous block of memory, so a large array needs a large contiguous allocation and its space cost is high. In return, looking up an element by index takes O(1) time (binary search on a sorted array takes O(log N)). In short, arrays are easy to address but hard to insert into and delete from.

Linked list

A linked list is stored in scattered memory locations, so it uses memory flexibly and its space cost is low, but finding an element takes up to O(N) time. In short, linked lists are hard to address but easy to insert into and delete from.

Hash table

Can we combine the strengths of both into a data structure that is easy to address and also easy to insert into and delete from? The answer is yes: that is the hash table. A hash table makes lookups convenient without occupying too much memory, and it is easy to use.

There are a number of different implementations of the hash table. What I'll explain next is the most commonly used one, separate chaining (the "zipper" method), which can be understood as an "array of linked lists".

Such a hash table consists of an array plus linked lists. Take an array of length 16, each slot of which stores the head node of a linked list. By what rule are elements placed in the array? In general the slot is hash(key) % len, that is, the hash value of the element's key modulo the array length. For example, with an array of length 16: 12 % 16 = 12, 28 % 16 = 12, 108 % 16 = 12, 140 % 16 = 12. So 12, 28, 108 and 140 are all stored in the slot with index 12.
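The rule above can be checked with a small sketch. The class and method names (BucketIndexDemo, bucketIndex) are made up for this example, and each key is treated as its own hash value, as in the figures quoted above. It is an illustration, not the way java.util.HashMap is written.

```java
// Illustrative sketch only: treats each int key as its own hash value.
class BucketIndexDemo {
    // Maps a hash value to a slot in an array of the given length.
    // Math.floorMod keeps the result non-negative even for negative hashes.
    static int bucketIndex(int hash, int length) {
        return Math.floorMod(hash, length);
    }

    public static void main(String[] args) {
        int[] keys = {12, 28, 108, 140};
        for (int key : keys) {
            // Every key in this list lands in slot 12 of a 16-slot array.
            System.out.println(key + " -> slot " + bucketIndex(key, 16));
        }
    }
}
```

All four keys end up in slot 12, which is why that slot has to hold a chain rather than a single element.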

HashMap is in fact backed by a linear array; the container that holds the data is a plain array. This may seem puzzling: how can a linear array store and retrieve data by key-value pairs? HashMap does some extra work to make this possible.

First, HashMap implements a static inner class Entry, whose important fields are key, value and next. From key and value it is clear that Entry is the basic bean that represents one key-value pair in HashMap. The linear array mentioned above is an Entry[], and the contents of the map are stored in this Entry[].

/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

2. HashMap Access Implementation

Since the storage is a linear array, how can it be accessed by key? HashMap uses a small algorithm, which roughly works as follows:

When storing:
int hash = key.hashCode(); // hashCode() is not detailed here; just treat each key's hash as a fixed int value
int index = hash % table.length;
table[index] = value;

When retrieving:
int hash = key.hashCode();
int index = hash % table.length;
return table[index];

1) put

Question: if two keys get the same index from hash % table.length, is there a risk that one overwrites the other?

HashMap handles this with a chained data structure. As mentioned above, the Entry class has a next field that points to the next Entry. For example, the first key-value pair A comes in, and the hash of its key gives index = 0, so table[0] = A. Later a key-value pair B arrives whose index is also 0. What now? HashMap does this: B.next = A, table[0] = B. If C then arrives with index 0 as well, C.next = B, table[0] = C. The slot at index 0 therefore holds all three key-value pairs A, B and C, linked through their next fields, so nothing is lost. It also means the array slot always holds the most recently inserted element. At this point the overall implementation of HashMap should be clear.
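The head insertion described above can be sketched as follows. Node, ChainDemo and insertAtHead are hypothetical names chosen for this example, and all three keys are simply assumed to land in slot 0.

```java
// Illustrative sketch of head insertion into one bucket; not the JDK code.
class ChainDemo {
    // A minimal chain node: key, value and a reference to the next node.
    static final class Node {
        final String key;
        final String value;
        final Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    final Node[] table = new Node[16];

    // The new node points at the old head and then becomes the head itself.
    void insertAtHead(int index, String key, String value) {
        table[index] = new Node(key, value, table[index]);
    }

    // Walks the chain in a slot and joins the keys, head first.
    String chain(int index) {
        StringBuilder sb = new StringBuilder();
        for (Node n = table[index]; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChainDemo demo = new ChainDemo();
        demo.insertAtHead(0, "A", "a");
        demo.insertAtHead(0, "B", "b");
        demo.insertAtHead(0, "C", "c");
        System.out.println(demo.chain(0)); // prints "C -> B -> A"
    }
}
```

The chain reads C, B, A from the head, matching the statement that the array slot holds the last inserted element.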

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value); // a null key is always placed in the first bucket of the array
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    // Traverse the linked list in that bucket
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        // If the key already exists in the linked list, replace its value
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e); // the parameter e becomes the new entry's next
    // If size has reached the threshold, double the table and rehash
    if (size++ >= threshold)
        resize(2 * table.length);
}

HashMap also includes some optimizations, described here. For example: the length of the Entry[] is fixed, so as more and more data is put into the map, the chain at a given index grows longer and longer. Won't that hurt performance? HashMap keeps a load factor for this: as the map's size grows, the Entry[] is expanded according to a fixed rule (it is doubled once size reaches the threshold).
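As a rough illustration of that rule: addEntry doubles the table once size reaches threshold, and the constructor shown in part 5) computes threshold as capacity * loadFactor. The sketch below assumes a starting capacity of 16 and a load factor of 0.75 (the JDK defaults, which this article does not state), and the class name is made up.

```java
// Illustrative sketch: at what size would a table of a given capacity be resized?
class ThresholdDemo {
    // Same formula as in the HashMap constructor: (int)(capacity * loadFactor).
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16;        // assumed starting capacity
        float loadFactor = 0.75f; // assumed load factor
        for (int i = 0; i < 3; i++) {
            System.out.println("capacity " + capacity
                    + " -> threshold " + threshold(capacity, loadFactor));
            capacity *= 2;        // resize(2 * table.length) doubles the table
        }
    }
}
```

Under these assumptions the thresholds are 12, 24 and 48, so chains are kept short on average at the cost of some unused slots.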

2) get

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    // Locate the array element first, then traverse the linked list at that element
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
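To see put and get working together, here is a minimal runnable sketch modelled on the two methods above. It is a simplification under stated assumptions: it leaves out the supplemental hash() step, null keys and resizing, and the class name TinyChainedMap is made up for this example.

```java
// Illustrative sketch: a tiny fixed-size chained map mirroring put/get above.
// Assumptions: no resizing, no null keys, key.hashCode() used directly.
class TinyChainedMap<K, V> {
    private static final class Entry<K, V> {
        final int hash;
        final K key;
        V value;
        final Entry<K, V> next;

        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = (Entry<K, V>[]) new Entry[16];
    private int size;

    private static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Replaces the value if the key exists, otherwise inserts at the chain head.
    V put(K key, V value) {
        int hash = key.hashCode();
        int i = indexFor(hash, table.length);
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key))) {
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }
        table[i] = new Entry<>(hash, key, value, table[i]);
        size++;
        return null;
    }

    // Finds the bucket, then walks its chain looking for an equal key.
    V get(Object key) {
        int hash = key.hashCode();
        for (Entry<K, V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key)))
                return e.value;
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyChainedMap<Integer, String> map = new TinyChainedMap<>();
        map.put(12, "twelve");
        map.put(28, "twenty-eight"); // same slot as 12 in a table of length 16
        System.out.println(map.get(12));
        System.out.println(map.get(28));
    }
}
```

The keys 12 and 28 share a slot, yet both values can still be read back because get compares keys while walking the chain.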

3) Accessing the null key

A null key is always stored in the first element of the Entry[] array, table[0].

private V putForNullKey(V value) {
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(0, null, value, 0);
    return null;
}

private V getForNullKey() {
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null)
            return e.value;
    }
    return null;
}

4) Determining the array index: hashCode % table.length

Whenever HashMap is accessed, it has to work out which element of the Entry[] array the current key corresponds to, that is, the array index. The algorithm is as follows:

/**
 * Returns index for hash code h.
 */
static int indexFor(int h, int length) {
    return h & (length - 1);
}

The bitwise AND here has the same effect as the modulo (remainder, %) operation for non-negative h, because the table length is always a power of two.

Note that two keys having the same array index does not mean that their hash codes are the same.
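A short sketch can make both points concrete. It reuses the indexFor expression from above; the class name and the sample values are chosen for illustration only.

```java
// Illustrative sketch: bit mask versus remainder for computing a bucket index.
class IndexForDemo {
    // Same expression as indexFor in the article.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // a power of two
        // For non-negative h and a power-of-two length, mask and remainder agree.
        // Four different hash values also share one index here.
        for (int h : new int[] {12, 28, 108, 140}) {
            System.out.println(h + ": mask=" + indexFor(h, length)
                    + ", mod=" + (h % length));
        }
        // With a length that is not a power of two the two results can differ.
        System.out.println("h=12, length=10: mask=" + indexFor(12, 10)
                + ", mod=" + (12 % 10));
        // For a negative h the mask still gives a valid index, unlike %.
        System.out.println("h=-1, length=16: mask=" + indexFor(-1, 16)
                + ", mod=" + (-1 % 16));
    }
}
```

The mask only behaves like a remainder when the length is a power of two, which is the constraint the table declaration insists on.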

5) Initial table size

public HashMap(int initialCapacity, float loadFactor) {
    ...
    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;
    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
    table = new Entry[capacity];
    init();
}

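Note that the initial size of the table is not the initialCapacity passed to the constructor, but the smallest power of 2 that is >= initialCapacity.

Why is it designed this way? Because indexFor computes the index as h & (length - 1), and that mask only spreads keys over every slot of the table when length is a power of 2.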
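The rounding loop from the constructor can be tried in isolation. The class and method names below are hypothetical, and the sketch assumes initialCapacity is at most 1 << 30 so that the shift cannot overflow.

```java
// Illustrative sketch: the constructor's rounding loop, extracted on its own.
class CapacityDemo {
    // Smallest power of 2 >= initialCapacity (assumes initialCapacity <= 1 << 30).
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    public static void main(String[] args) {
        for (int requested : new int[] {10, 16, 17, 100}) {
            System.out.println("requested " + requested
                    + " -> table size " + roundUpToPowerOfTwo(requested));
        }
    }
}
```

Requests of 10 and 16 both give a table of 16 slots, while 17 gives 32 and 100 gives 128.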

3. Ways to resolve hash conflicts
    1. Open addressing (linear probing, quadratic probing, pseudo-random probing)

    2. Rehashing (applying another hash function)

    3. Separate chaining

    4. A common overflow area

Java's HashMap resolves collisions with separate chaining.

4. The rehash process

When the number of entries in the hash table reaches the threshold (capacity * load factor), the table must be resized. If the capacity has already reached the maximum possible value, the method only sets the threshold to Integer.MAX_VALUE and returns. Otherwise a new table is created and the entries of the original table are transferred to it.

/**
 * Rehashes the contents of this map into a new array with a
 * larger capacity.  This method is called automatically when the
 * number of keys in this map reaches its threshold.
 *
 * If current capacity is MAXIMUM_CAPACITY, this method does not
 * resize the map, but sets threshold to Integer.MAX_VALUE.
 * This has the effect of preventing future calls.
 *
 * @param newCapacity the new capacity, MUST be a power of two;
 *        must be greater than current capacity unless current
 *        capacity is MAXIMUM_CAPACITY (in which case value
 *        is irrelevant).
 */
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

/**
 * Transfers all entries from the current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                // Recalculate the index for the new table
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
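What recalculating the index means in practice can be seen with a small sketch. It assumes the table doubles from 16 to 32, as resize(2 * table.length) would do, and reuses the indexFor expression; the class name and sample hash values are made up for illustration.

```java
// Illustrative sketch: where entries go when the table doubles from 16 to 32.
class ResizeIndexDemo {
    // Same expression as indexFor in the article.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int oldCapacity = 16;
        int newCapacity = 2 * oldCapacity;
        // All four hashes share index 12 in the old table.
        for (int h : new int[] {12, 28, 108, 140}) {
            System.out.println("hash " + h + ": index " + indexFor(h, oldCapacity)
                    + " -> " + indexFor(h, newCapacity));
        }
    }
}
```

After doubling, each entry either keeps its old index (12) or moves to the old index plus the old capacity (28), so a long chain is split across two slots.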
