Several questions about Java's HashMap

Source: Internet
Author: User
Tags array length

Several ways of handling hash conflicts

I. Open addressing method

Hi = (H(key) + di) MOD m, i = 1, 2, ..., k (k <= m-1), where H(key) is the hash function, m is the length of the hash table, and di is the increment sequence.

The open addressing method can be divided into 3 types depending on how the increment di is chosen:

1) Linear probing: di = 1, 2, 3, ..., m-1
Starting from the position where the conflict occurred, probe forward one slot at a time, wrapping around at the end of the table, until an empty slot is found and the element is inserted there. If the whole loop finds no empty slot, the table is full. It is like looking for a table on a street of restaurants: the first one tells you it is full, so you go next door and ask again.
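The probing loop described above can be sketched as follows. This is a minimal illustration, not how java.util.HashMap works (HashMap uses chaining); the class LinearProbingTable and its methods are hypothetical names, and the table stores bare String keys for brevity.

```java
// Minimal open-addressing table for non-null String keys (illustrative only).
class LinearProbingTable {
    private final String[] slots;

    LinearProbingTable(int m) {
        slots = new String[m];
    }

    // H(key): maps the key to its home slot in [0, m-1].
    private int home(String key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    // Tries H(key), H(key)+1, ..., H(key)+(m-1), all MOD m.
    // Returns the slot used, or -1 if every slot is occupied.
    int insert(String key) {
        int m = slots.length;
        for (int d = 0; d < m; d++) {
            int i = (home(key) + d) % m;
            if (slots[i] == null || slots[i].equals(key)) {
                slots[i] = key;
                return i;
            }
        }
        return -1;
    }

    // Follows the same probe path; reaching an empty slot ends the search.
    boolean contains(String key) {
        int m = slots.length;
        for (int d = 0; d < m; d++) {
            int i = (home(key) + d) % m;
            if (slots[i] == null) return false;
            if (slots[i].equals(key)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        LinearProbingTable t = new LinearProbingTable(4);
        // "Aa" and "BB" have the same String.hashCode() (2112), so they collide.
        System.out.println(t.insert("Aa"));   // 0
        System.out.println(t.insert("BB"));   // 1 (next door)
        System.out.println(t.contains("BB")); // true
    }
}
```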

2) Linear compensation probing: the increment grows by a fixed step q, so the i-th probe is Hi = (H(key) + i*q) MOD m, i = 1, 2, ..., k (k <= m-1). q and m must be coprime so that every cell in the hash table can be probed.
Continuing the example: instead of asking next door, you take out a calculator, work out the step, and ask every q-th restaurant whether it has a table.
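Why q and m need to be coprime is easy to check by listing the slots a fixed step actually visits. The helper names below (StepProbe, probeSequence) are made up for this sketch, and it assumes start >= 0 and q >= 1.

```java
import java.util.ArrayList;
import java.util.List;

class StepProbe {
    // Greatest common divisor (Euclid's algorithm).
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Slots visited by start, start+q, start+2q, ... (MOD m) until the sequence repeats.
    static List<Integer> probeSequence(int start, int q, int m) {
        List<Integer> visited = new ArrayList<>();
        int first = start % m;
        int i = first;
        do {
            visited.add(i);
            i = (i + q) % m;
        } while (i != first);
        return visited;
    }

    public static void main(String[] args) {
        // gcd(3, 8) = 1: all 8 slots are reached.
        System.out.println(probeSequence(0, 3, 8)); // [0, 3, 6, 1, 4, 7, 2, 5]
        // gcd(2, 8) = 2: only half of the slots are ever probed.
        System.out.println(probeSequence(0, 2, 8)); // [0, 2, 4, 6]
    }
}
```

In general the sequence covers m / gcd(q, m) slots, so only a coprime step covers the whole table.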

3) Pseudo-random probing: di = a pseudo-random number sequence. In the same example, you pick the next restaurant to ask purely on a whim.
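For pseudo-random probing to work, insertion and lookup have to replay the same sequence of increments. One way to get that, assumed here for illustration, is to seed java.util.Random with a fixed value; the class and method names are hypothetical, and the sketch assumes m >= 2.

```java
import java.util.Random;

class PseudoRandomProbe {
    // Returns the first `count` probe positions for a key whose home slot is `home`.
    // The same seed always yields the same increments d1, d2, ...
    static int[] probes(int home, int m, int count, long seed) {
        Random rnd = new Random(seed);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            int d = rnd.nextInt(m - 1) + 1; // increment in [1, m-1]
            out[i] = (home + d) % m;
        }
        return out;
    }
}
```

Unlike a coprime fixed step, a random sequence may revisit slots, so this sketch does not guarantee that every cell is probed.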

Disadvantages:

    • When conflicts are frequent, entries in a table built this way tend to cluster together, which slows down lookups;
    • Deleting a node cannot simply set its slot to null, because that would cut off the probe path to synonym nodes that were inserted after it. Deletion in a hash table that resolves conflicts by open addressing therefore only marks the node as deleted; the slot is not actually emptied.
    • When the table is full, an overflow table has to be created to hold the extra elements.
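The deletion mark mentioned above is often called a tombstone. The sketch below shows one possible way to implement it on top of linear probing; TombstoneTable and the DELETED sentinel are names invented for this example.

```java
class TombstoneTable {
    // Sentinel object compared by identity, so it never clashes with a real key.
    private static final String DELETED = new String("<deleted>");
    private final String[] slots;

    TombstoneTable(int m) {
        slots = new String[m];
    }

    private int home(String key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    // Inserts the key, reusing the first tombstone on its probe path if there is one.
    boolean insert(String key) {
        int m = slots.length;
        int firstFree = -1;
        for (int d = 0; d < m; d++) {
            int i = (home(key) + d) % m;
            if (slots[i] == null) {
                if (firstFree < 0) firstFree = i;
                break;
            }
            if (slots[i] == DELETED) {
                if (firstFree < 0) firstFree = i;
                continue;
            }
            if (slots[i].equals(key)) return true; // already present
        }
        if (firstFree < 0) return false; // table is full
        slots[firstFree] = key;
        return true;
    }

    // A tombstone does not stop the search; only a truly empty slot does.
    boolean contains(String key) {
        int m = slots.length;
        for (int d = 0; d < m; d++) {
            int i = (home(key) + d) % m;
            if (slots[i] == null) return false;
            if (slots[i] != DELETED && slots[i].equals(key)) return true;
        }
        return false;
    }

    // Marks the slot as deleted instead of setting it to null.
    boolean remove(String key) {
        int m = slots.length;
        for (int d = 0; d < m; d++) {
            int i = (home(key) + d) % m;
            if (slots[i] == null) return false;
            if (slots[i] != DELETED && slots[i].equals(key)) {
                slots[i] = DELETED;
                return true;
            }
        }
        return false;
    }
}
```

If remove set the slot to null instead, a later contains call for a synonym stored further along the probe path would stop early and wrongly report that the key is missing.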

II. Re-hashing method

Hi = RHi(key), i = 1, 2, ..., k
The RHi are different hash functions: when synonyms produce an address conflict, the address is recomputed with the next hash function until no conflict occurs. This method is unlikely to produce clustering.

Cons: increased calculation time.
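A small sketch of the idea, assuming three hash functions picked only for illustration (the original does not say which functions to use): the key's hashCode, the hashCode of the reversed key, and the sum of its characters.

```java
class RehashTable {
    private static final int K = 3; // number of hash functions available
    private final String[] slots;

    RehashTable(int m) {
        slots = new String[m];
    }

    // RHi(key) for i = 1..K; the three functions are arbitrary choices for this sketch.
    static int rh(int i, String key, int m) {
        int h;
        switch (i) {
            case 1:
                h = key.hashCode();
                break;
            case 2:
                h = new StringBuilder(key).reverse().toString().hashCode();
                break;
            default:
                h = key.chars().sum();
                break;
        }
        return Math.floorMod(h, m);
    }

    // Tries RH1, RH2, ... in turn; returns the slot used, or -1 if all K addresses conflict.
    int insert(String key) {
        for (int i = 1; i <= K; i++) {
            int idx = rh(i, key, slots.length);
            if (slots[idx] == null || slots[idx].equals(key)) {
                slots[idx] = key;
                return idx;
            }
        }
        return -1;
    }

    // Replays the same sequence of hash functions used by insert.
    boolean contains(String key) {
        for (int i = 1; i <= K; i++) {
            int idx = rh(i, key, slots.length);
            if (slots[idx] == null) return false;
            if (slots[idx].equals(key)) return true;
        }
        return false;
    }
}
```

With a table of length 8, "Aa" lands in slot 0; "BB" conflicts under the first two functions and is only placed by the third, which illustrates the extra computation this method costs.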

III. Public overflow area

Assume the hash function takes values in [0, m-1]. Set up a vector HashTable[0..m-1] as the base table, with each component holding one record, and a second vector OverTable[0..v] as the overflow table. Any record whose key is a synonym of a key already in the base table is placed in the overflow table when the conflict occurs, whatever hash address the hash function gives it.

In short, conflicting elements are stored in a separate table.
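A minimal sketch of a base table with a shared overflow area follows. It assumes a growable ArrayList in place of the fixed vector OverTable[0..v], and the class name OverflowTable is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class OverflowTable {
    private final String[] base;                             // base table, one record per slot
    private final List<String> overflow = new ArrayList<>(); // shared overflow area

    OverflowTable(int m) {
        base = new String[m];
    }

    private int home(String key) {
        return Math.floorMod(key.hashCode(), base.length);
    }

    // A key goes to its home slot if that is free; otherwise to the overflow area.
    void insert(String key) {
        int i = home(key);
        if (base[i] == null) {
            base[i] = key;
        } else if (!base[i].equals(key) && !overflow.contains(key)) {
            overflow.add(key);
        }
    }

    // Checks the home slot first, then scans the overflow area linearly.
    boolean contains(String key) {
        int i = home(key);
        if (base[i] == null) return false;
        return base[i].equals(key) || overflow.contains(key);
    }

    int overflowSize() {
        return overflow.size();
    }
}
```

The overflow area is searched sequentially, so lookups slow down as more conflicts accumulate there.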

IV. Chain address method (separate chaining, or zipper method)

All records whose keys are synonyms are stored in the same linear linked list; that is, the elements that conflict at one position are chained into a linked list.
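The following is a bare-bones sketch of a chained table, written to show the structure rather than to mirror java.util.HashMap. ChainedTable and Node are illustrative names, and values are plain ints to keep it short.

```java
class ChainedTable {
    // One node of a bucket's singly linked list.
    private static final class Node {
        final String key;
        int value;
        Node next;

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] buckets;

    ChainedTable(int m) {
        buckets = new Node[m];
    }

    private int index(String key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    // Updates the value if the key is already in the chain, otherwise prepends a new node.
    void put(String key, int value) {
        int i = index(key);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                n.value = value;
                return;
            }
        }
        buckets[i] = new Node(key, value, buckets[i]);
    }

    // Returns the stored value, or null if the key is absent.
    Integer get(String key) {
        for (Node n = buckets[index(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) return n.value;
        }
        return null;
    }

    // Deletion just unlinks the node from its chain; no deletion marks are needed.
    boolean remove(String key) {
        int i = index(key);
        Node prev = null;
        for (Node n = buckets[i]; n != null; prev = n, n = n.next) {
            if (n.key.equals(key)) {
                if (prev == null) buckets[i] = n.next;
                else prev.next = n.next;
                return true;
            }
        }
        return false;
    }
}
```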

Advantages of the zipper method:

    • Conflicts are simple to handle and there is no clustering, that is, non-synonyms never conflict, so the average search length is short;
    • Because the node space of each linked list is allocated dynamically, the method suits cases where the table length cannot be determined before the table is built.
    • Deleting a node is easy: simply unlink the corresponding node from its list.

Disadvantages of the Zipper method:

    • Pointers need extra space, so when nodes are small, open addressing is more space-efficient. If the space saved on pointers is used to enlarge the hash table instead, the load factor becomes smaller, which reduces conflicts in open addressing and thus improves the average search speed.

Why does HashMap in Java 1.8 use red-black trees?

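Since JDK 1.8, HashMap has been improved: when the linked list in a bucket grows longer than 8 nodes, the bucket is converted into a red-black tree to speed up retrieval (if the table still has fewer than 64 buckets, the table is resized instead). A lookup in such a bucket then takes O(log n) rather than O(n) comparisons.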

Example: The put operation of HashMap:
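A rough sketch of the main steps of put might look like this. It is a simplification written for this article, not the JDK source: PutSketch is a hypothetical class, the table has a fixed length of 16 and never resizes, and instead of building a red-black tree it only sets a flag (treeifyRequested) at the point where JDK 1.8 would convert the bucket or resize the table. The hash method follows the JDK 1.8 idea of XOR-ing the high 16 bits of hashCode() into the low 16 bits.

```java
import java.util.Objects;

class PutSketch {
    static final int TREEIFY_THRESHOLD = 8; // same value as the constant in java.util.HashMap

    static final class Node {
        final Object key;
        Object value;
        Node next;

        Node(Object key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    final Node[] table = new Node[16];  // fixed size; the real HashMap resizes
    boolean treeifyRequested = false;   // stands in for the tree conversion

    // Spreads the high bits of hashCode() into the low bits used for indexing.
    static int hash(Object key) {
        int h;
        return key == null ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for the key, or null if there was none.
    Object put(Object key, Object value) {
        int i = hash(key) & (table.length - 1); // 1. locate the bucket
        Node n = table[i];
        if (n == null) {                        // 2. empty bucket: store directly
            table[i] = new Node(key, value);
            return null;
        }
        int count = 1;                          // nodes seen so far in this chain
        while (true) {
            if (Objects.equals(n.key, key)) {   // 3. same key: replace the value
                Object old = n.value;
                n.value = value;
                return old;
            }
            if (n.next == null) break;
            n = n.next;
            count++;
        }
        n.next = new Node(key, value);          // 4. new key: append at the tail
        if (count + 1 > TREEIFY_THRESHOLD) {    // 5. chain now longer than 8 nodes
            treeifyRequested = true;
        }
        return null;
    }
}
```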

Why is the length of the array in HashMap a power of 2?

HashMap (in JDK versions before 1.8) uses the indexFor method to compute the position where an object is saved:

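    // Maps a hash value to a bucket index; length must be a power of 2.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Mixes the higher bits of h into the lower bits that indexFor uses.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }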

The array length is a power of 2 for two reasons:

    1. h & (table.length - 1) gives the position where the object is saved. A power of 2 minus 1 has every binary digit set to 1, so the AND gives the same result as h modulo the length while being much faster to compute;
    2. With a power-of-2 length, different hash values are less likely to map to the same index, because every low bit of the hash takes part in the AND. Fewer hash collisions mean a query less often has to traverse a linked list, so queries are more efficient.
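Both points can be checked with a few lines of code. The sketch below compares the mask with a true modulo and counts how many slots the mask can reach for a given table length; IndexDemo and reachableSlots are names made up for this example.

```java
class IndexDemo {
    // Same expression as the indexFor method quoted above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Counts the distinct slots indexFor produces for the hash values 0..limit-1.
    static int reachableSlots(int length, int limit) {
        boolean[] seen = new boolean[length];
        int count = 0;
        for (int h = 0; h < limit; h++) {
            int i = indexFor(h, length);
            if (!seen[i]) {
                seen[i] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Power of 2: the mask agrees with the mathematical modulo, even for negative hashes.
        System.out.println(indexFor(-5, 16) == Math.floorMod(-5, 16)); // true
        // Length 16: mask is 1111 in binary, so all 16 slots are used.
        System.out.println(reachableSlots(16, 1000)); // 16
        // Length 15: mask is 1110 in binary, so odd slots are never used.
        System.out.println(reachableSlots(15, 1000)); // 8
    }
}
```

With length 15 almost half of the array would stay empty and the remaining slots would receive twice as many entries, which is exactly the extra collision cost the second point describes.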



Reference:
Links: https://www.jianshu.com/p/dff8f4641814

Links: 75331926
