HashMap Source Reading (2): Collisions and Capacity Expansion

Source: Internet
Author: User
Tags: rehash

The previous article, HashMap Source Reading (1): Initial Values, Data Structure, and Hash Calculation, covered the initial size of HashMap, its underlying storage structure, and how the hash value and the array index are computed. This article continues with hash collisions and capacity expansion in HashMap.

1) Hash collision

Any discussion of hashing has to address hash collisions. Put simply, a hash collision occurs when different keys produce the same hash value. As far as the author's limited knowledge goes, there are several ways to resolve hash collisions:

1. Open addressing:

When a conflict occurs, a probe sequence is generated over the hash table using some probing technique. The table is searched along this sequence until either the given key is found or an open address (an empty cell) is reached. On insertion, the new node is stored in that open cell. On lookup, reaching an open address means the key is not in the table, so the lookup fails. Open addressing has several variants for generating the probe sequence (linear probing and quadratic probing, for example); they are not covered in detail here, and readers can look them up.
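To make the probing idea concrete, here is a minimal sketch of open addressing with linear probing. It is an illustration only and is not how HashMap works; the class name LinearProbingSet, the fixed capacity, and the use of key mod capacity as the hash are all assumptions made for this example. Deletion is left out, since it needs extra bookkeeping.

```java
/** Minimal open-addressing table with linear probing (illustrative only). */
final class LinearProbingSet {
    private final Integer[] slots;   // null marks an open (empty) address

    LinearProbingSet(int capacity) {
        slots = new Integer[capacity];
    }

    /** Home slot of a key; floorMod keeps negative keys in range. */
    private int home(int key) {
        return Math.floorMod(key, slots.length);
    }

    /** Inserts key; returns the slot used, or -1 if the table is full. */
    int add(int key) {
        for (int step = 0; step < slots.length; step++) {
            int i = (home(key) + step) % slots.length;   // probe: home, home+1, ...
            if (slots[i] == null || slots[i] == key) {
                slots[i] = key;
                return i;
            }
        }
        return -1;
    }

    /** Looks up key; reaching an open address means the key is absent. */
    boolean contains(int key) {
        for (int step = 0; step < slots.length; step++) {
            int i = (home(key) + step) % slots.length;
            if (slots[i] == null) return false;   // open address: lookup failed
            if (slots[i] == key) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        LinearProbingSet set = new LinearProbingSet(5);
        System.out.println(set.add(21));        // 1: home slot is free
        System.out.println(set.add(36));        // 2: slot 1 is taken, probe moves on
        System.out.println(set.contains(36));   // true
        System.out.println(set.contains(11));   // false: probe reaches open slot 3
    }
}
```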

2. Re-hashing:

This method constructs several different hash functions at the same time:

H_i = RH_i(key),  i = 1, 2, ..., k

When the hash address H_1 = RH_1(key) conflicts, compute H_2 = RH_2(key), and so on, until the conflict no longer occurs. This method is less prone to clustering, but it increases computation time.
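The re-hashing scheme can be sketched as follows. This is a toy example under stated assumptions: the class RehashingTable and the three hash functions used in main are hypothetical and chosen only so that the collisions are easy to follow by hand.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntUnaryOperator;

/** Toy table that resolves collisions by trying RH_1, RH_2, ... in order. */
final class RehashingTable {
    private final Integer[] slots;
    private final List<IntUnaryOperator> hashFunctions;

    RehashingTable(int capacity, List<IntUnaryOperator> hashFunctions) {
        this.slots = new Integer[capacity];
        this.hashFunctions = hashFunctions;
    }

    /** Returns the slot used, or -1 if every hash function collides. */
    int add(int key) {
        for (IntUnaryOperator rh : hashFunctions) {
            int i = Math.floorMod(rh.applyAsInt(key), slots.length);
            if (slots[i] == null || slots[i] == key) {
                slots[i] = key;
                return i;
            }
        }
        return -1;
    }

    /** Follows the same sequence of functions; an empty slot means absent. */
    boolean contains(int key) {
        for (IntUnaryOperator rh : hashFunctions) {
            int i = Math.floorMod(rh.applyAsInt(key), slots.length);
            if (slots[i] == null) return false;
            if (slots[i] == key) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical functions: RH_1(k) = k, RH_2(k) = k / 5, RH_3(k) = 7k + 3
        RehashingTable t = new RehashingTable(5,
                Arrays.<IntUnaryOperator>asList(k -> k, k -> k / 5, k -> k * 7 + 3));
        System.out.println(t.add(21));   // 1: RH_1 gives a free slot
        System.out.println(t.add(36));   // 2: RH_1 collides, RH_2 gives slot 2
        System.out.println(t.add(11));   // 0: RH_1 and RH_2 collide, RH_3 gives slot 0
    }
}
```

Each extra collision costs one more hash evaluation, which is the added computation time mentioned above.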

3. Separate chaining (linked-list method):

The basic idea of this method is that when a hash collision occurs, a linked list is built at the colliding slot, and all entries that collide there are kept in that list.

How does the HashMap code solve the problem of hash collisions? Let's take a look at the put function of HashMap:

    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            // The inflateTable function mentioned earlier: it initializes
            // the table and rounds the capacity up to a power of 2
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);                  // hash value computed with bit operations
        int i = indexFor(hash, table.length);  // returns hash & (table.length - 1)
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            // If the loop finds the same key, update its value
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        // No identical key exists: build an Entry from key, value, hash
        // and index (i) and store it in the table array
        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

    /**
     * Builds an Entry from key, value, hash and index and stores it in the table array
     */
    void addEntry(int hash, K key, V value, int bucketIndex) {
        // Has the size reached the threshold, and is the bucket already occupied?
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);                    // double the capacity
            hash = (null != key) ? hash(key) : 0;        // recompute the hash after expansion
            bucketIndex = indexFor(hash, table.length);  // recompute the index after expansion
        }
        createEntry(hash, key, value, bucketIndex);      // create the Entry in the table array
    }

    void createEntry(int hash, K key, V value, int bucketIndex) {
        // Current head of the bucket (used to handle collisions by chaining)
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

    /**
     * Creates an Entry in linked-list form, with a link to the previous value.
     */
    Entry(int h, K k, V v, Entry<K,V> n) {
        value = v;
        next = n;
        key = k;
        hash = h;
    }

As the code and comments show, when a collision occurs, HashMap first reads the current entry e at that table position, then makes e the next of the newly created Entry, and stores the new Entry in the table slot. For example, suppose the hash function is (k % 5) and there are two numbers, 21 and 36. Both hash to 1. Entry(21) is first stored at table[1]; when 36 is then put into the HashMap, Entry(36) takes the place at table[1] and links to 21: Entry(36).next = Entry(21).
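The 21 and 36 example can be reproduced with a stripped-down sketch of head insertion. The class ChainDemo and its Node type are hypothetical stand-ins for HashMap and its Entry; the sketch stores keys only, assumes non-negative keys, and skips the duplicate-key check that put performs.

```java
/** Head insertion into a bucket chain, using the toy hash k % 5 from the text. */
final class ChainDemo {
    /** Simplified stand-in for HashMap's Entry: a key and a next pointer. */
    static final class Node {
        final int key;
        final Node next;

        Node(int key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    private final Node[] table = new Node[5];

    /** Mirrors createEntry: the new node points at the old head of the bucket. */
    void put(int key) {
        int i = key % 5;
        Node e = table[i];
        table[i] = new Node(key, e);
    }

    /** Renders one bucket's chain from head to tail. */
    String chain(int i) {
        StringBuilder sb = new StringBuilder();
        for (Node e = table[i]; e != null; e = e.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChainDemo demo = new ChainDemo();
        demo.put(21);
        demo.put(36);
        System.out.println(demo.chain(1));   // prints "36 -> 21"
    }
}
```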


2) Expansion:

The previous article mentioned DEFAULT_LOAD_FACTOR. From the earlier discussion, the reader will have gathered that this parameter is used to compute the expansion threshold of HashMap: threshold = capacity * DEFAULT_LOAD_FACTOR. When the size of the HashMap reaches threshold (and, in this version of addEntry, the target bucket is already occupied), the HashMap doubles its capacity (the reason for doubling was covered in the previous article).
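As a quick worked example of the threshold formula, the sketch below uses the JDK defaults of capacity 16 and load factor 0.75. The class ThresholdDemo is hypothetical, and the calculation is simplified: it omits the MAXIMUM_CAPACITY cap that the real resize applies.

```java
/** Worked example of threshold = capacity * loadFactor (simplified, no upper cap). */
final class ThresholdDemo {
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);   // truncated to int
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;   // value of DEFAULT_LOAD_FACTOR
        int capacity = 16;          // default initial capacity
        for (int round = 0; round < 3; round++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity, loadFactor));
            capacity *= 2;          // each expansion doubles the table
        }
        // capacity=16 threshold=12
        // capacity=32 threshold=24
        // capacity=64 threshold=48
    }
}
```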

HashMap's resize function is simple:

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

It creates a new table array of double the size and then moves every entry of the old table array into the new one. The key step is the transfer function. Recall that the index is computed from table.length; now that the table has doubled in size, each entry's index has to be recalculated, which is what transfer does:

    /**
     * Transfers all entries from the current table to newTable.
     */
    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            // Walk the linked list of colliding entries
            while (null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity);  // recalculate the index
                e.next = newTable[i];                   // link to the current head
                newTable[i] = e;
                e = next;
            }
        }
    }
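The effect of the transfer loop can be traced on a tiny example. The sketch below is a simplified stand-in, not the JDK code: TransferDemo and Node are hypothetical names, nodes carry only a hash, and the optional rehash step is left out. It moves one bucket from a table of 16 to a table of 32.

```java
/** Traces the transfer loop on one bucket when a table grows from 16 to 32. */
final class TransferDemo {
    static final class Node {
        final int hash;
        Node next;

        Node(int hash, Node next) {
            this.hash = hash;
            this.next = next;
        }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);   // length must be a power of 2
    }

    /** Same loop shape as transfer, without the rehash step. */
    static Node[] transfer(Node[] oldTable, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (Node e : oldTable) {
            while (e != null) {
                Node next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];   // head insertion into the new bucket
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    static String chain(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node e = head; e != null; e = e.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(e.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] old = new Node[16];
        // Hashes 5, 21 and 37 all satisfy (hash & 15) == 5
        old[5] = new Node(5, new Node(21, new Node(37, null)));
        Node[] bigger = transfer(old, 32);
        System.out.println(chain(bigger[5]));    // prints "37 -> 5"
        System.out.println(chain(bigger[21]));   // prints "21"
    }
}
```

Two things are visible here. With a power-of-two length, an entry either keeps its index or moves up by the old capacity (5 or 5 + 16 = 21). And because entries are inserted at the head of the new bucket, entries that stay together end up in reverse order (5 came before 37, and now follows it).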

That concludes this walkthrough of the basic principles of HashMap. Next time, I hope to analyze the ConcurrentHashMap source.

Link to this article: HashMap Source Reading (2): Collisions and Capacity Expansion

Author of this article: Vick

When reprinting, please credit the source: http://www.iyowei.cn/2015/03/hashmap-collision-resize/