[Repost] Java Learning: The Internal Working Mechanism of HashMap and HashSet

Source: Internet
Author: User

Original: https://www.toutiao.com/i6593863882484220430/

Internal working mechanism of HashMap and HashSet

How do HashMap and HashSet work internally? What is a hash function?

HashMap is not only a commonly used data structure, but also a popular interview topic.

Q1. How does HashMap store data?

A1. Data is stored as key/value pairs. You use the key to store a value and to retrieve it later.
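As a minimal illustration of storing and retrieving by key, the sketch below uses the standard java.util.HashMap API; the class name PutGetDemo and the sample ages are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class PutGetDemo {
    // Builds a small map and looks up one value by its key.
    static Integer lookupAge(String name) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("John", 62);   // hypothetical sample data
        ages.put("Peter", 23);  // hypothetical sample data
        return ages.get(name);  // returns null when the key is absent
    }

    public static void main(String[] args) {
        System.out.println(lookupAge("John"));   // prints 62
        System.out.println(lookupAge("Nobody")); // prints null
    }
}
```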

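Q2. What is the time complexity of a HashMap lookup?

A2. The worst case is O(n), which occurs when every key lands in the same bucket and the lookup has to walk one long list. If the hashCode() method scatters the keys evenly across the buckets, as discussed below, the average is O(1).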

Q3. How is the data stored inside a HashMap?

A3. HashMap uses a backing array as its buckets, and each bucket stores its key/value pairs in a linked list.

[Figure: a backing array of buckets]

1) When an object is placed in a map with a key and a value, the key's hashCode() method is called implicitly and returns a hash code value, such as 123. Two different keys can return the same hash code. A good hashing algorithm spreads the values out. In this example, assume that the key ("John", 01/01/1956) and the key ("Peter", 01/01/1995) both return the same hash code, 123.
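To make the point about collisions concrete, here is a small sketch using two short strings that are well known to share a hash code under the documented String.hashCode() formula. The class name CollisionDemo is made up for this example.

```java
class CollisionDemo {
    // True when two objects report the same hash code.
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112
        System.out.println("Aa".hashCode());       // prints 2112
        System.out.println("BB".hashCode());       // prints 2112
        System.out.println(sameHash("Aa", "BB"));  // prints true
        System.out.println("Aa".equals("BB"));     // prints false: different keys
    }
}
```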

2) Once hashCode() has returned a value, for example 123, and given an initial HashMap capacity of 10, how does the map know which index of the backing array to use? In older JDK versions, HashMap internally calls the hash(int) and indexFor(int h, int length) methods. Together these act as the hash function.

In simplified form, the function works like this:

hashCode() % capacity

123 % 10 = 3

456 % 10 = 6

This means that the entry with hash code 123 is stored at index 3 of the backing array.

With a capacity of 10, the index is always a number between 0 and 9.
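The simplified formula above can be sketched in a few lines of Java. This follows the article's teaching model, not the real java.util.HashMap code, and the names ModuloIndexDemo and bucketIndex are made up for this example.

```java
class ModuloIndexDemo {
    // Simplified model: the bucket index is the remainder of hash code / capacity.
    // Only safe for non-negative hash codes.
    static int bucketIndex(int hashCode, int capacity) {
        return hashCode % capacity;
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex(123, 10)); // prints 3
        System.out.println(bucketIndex(456, 10)); // prints 6
    }
}
```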

Once the number of entries exceeds 75% of the capacity (the load factor defaults to 0.75), the capacity of the backing array is doubled and the entries are rehashed, that is, redistributed across the buckets for the new capacity of 20.

hashCode() % capacity

123 % 20 = 3

456 % 20 = 16
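The resize step can be sketched as follows, again using the article's simplified model with a capacity of 10. The names ResizeDemo, threshold and bucketIndex are made up for this example, and the real HashMap computes its threshold and indices differently in detail.

```java
class ResizeDemo {
    static final float LOAD_FACTOR = 0.75f; // HashMap's default load factor

    // Number of entries the table holds before it is resized.
    static int threshold(int capacity) {
        return (int) (capacity * LOAD_FACTOR);
    }

    // Simplified bucket index, valid for non-negative hash codes.
    static int bucketIndex(int hashCode, int capacity) {
        return hashCode % capacity;
    }

    public static void main(String[] args) {
        int capacity = 10;
        System.out.println(threshold(capacity));        // prints 7
        System.out.println(bucketIndex(456, capacity)); // prints 6

        int doubled = capacity * 2; // capacity after the resize
        // Rehashing: the same hash code can land in a different bucket.
        System.out.println(bucketIndex(456, doubled));  // prints 16
        System.out.println(bucketIndex(123, doubled));  // prints 3
    }
}
```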

The plain modulo approach above has a flaw. What happens if the hash code is negative? A negative index is not what you want. An improved formula therefore masks off the sign bit first and then computes the remainder with the modulo (%) operator.

(123 & 0x7FFFFFFF) % 20 = 3

(456 & 0x7FFFFFFF) % 20 = 16
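To show why the mask matters, the sketch below compares the raw remainder with the masked version for a negative hash code. The names SignBitDemo, rawIndex and maskedIndex are made up for this example.

```java
class SignBitDemo {
    // In Java the remainder takes the sign of the dividend, so this can be negative.
    static int rawIndex(int hashCode, int capacity) {
        return hashCode % capacity;
    }

    // Clearing the sign bit first guarantees a result in [0, capacity).
    static int maskedIndex(int hashCode, int capacity) {
        return (hashCode & 0x7FFFFFFF) % capacity;
    }

    public static void main(String[] args) {
        System.out.println(rawIndex(-123, 20));    // prints -3
        System.out.println(maskedIndex(-123, 20)); // prints 5
        System.out.println(maskedIndex(123, 20));  // prints 3
        System.out.println(maskedIndex(456, 20));  // prints 16
    }
}
```

Note that masking does not map -123 to the same bucket as 123; it only guarantees that the index is usable.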

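This ensures that you get a non-negative index. The Java 8 HashMap source code reaches the same goal in a different way:

a) It guards against poor hash codes by XOR-ing the upper 16 bits of the hash code into the lower 16 bits, because only the low-order bits select the bucket.

b) The index is then determined from that hash and the capacity with a bitwise AND, (capacity - 1) & hash. This works because the real capacity is always a power of two (16 by default, rather than the 10 used in the simplified example), and the result can never be negative.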
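As a rough sketch of points a) and b), the following mirrors the bit-spreading and indexing expressions found in OpenJDK 8's HashMap. The class and method names (SpreadIndexDemo, spread, indexFor) are chosen for this example, and the index step assumes the capacity is a power of two.

```java
class SpreadIndexDemo {
    // a) Fold the upper 16 bits into the lower 16 bits so they influence the index.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // b) Bitwise AND with (capacity - 1); correct only for power-of-two capacities.
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    public static void main(String[] args) {
        // Integer.hashCode() returns the int value itself.
        System.out.println(indexFor(spread(123), 16));   // prints 11
        // 65536 has only bit 16 set, so without spreading it would map to bucket 0.
        System.out.println(indexFor(65536, 16));         // prints 0
        System.out.println(indexFor(spread(65536), 16)); // prints 1
        // Negative hash codes still give a non-negative index.
        System.out.println(indexFor(spread(-123), 16) >= 0); // prints true
    }
}
```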

The actual key/value pairs are stored as entries in the linked list of each bucket.

It is important to understand that two different keys can produce the same hash code, such as 123, and are then stored in the same bucket. In the example above, these are "John, 01/01/1956" and "Peter, 01/01/1995". How do you find only "John, 01/01/1956"? At this point the equals() method of the key's class is called. The lookup iterates through each entry in the linked list of bucket "123" and uses equals() to find and return the entry whose key is "John, 01/01/1956".

This is why it is important to implement both hashCode() and equals() in your class. If you use an existing class such as Integer or String as a key, both methods are already implemented. If you use a class that you write yourself as a key, such as a "MyKey" class whose name and birth date attributes represent "John, 01/01/1956", it is your responsibility to implement these methods correctly.
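One possible way to write such a key class is sketched below. The fields, the use of java.time.LocalDate and Objects.hash, and the helper class MyKeyDemo are assumptions made for this example. The article only names "MyKey" with name and birth date attributes.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class MyKey {
    private final String name;
    private final LocalDate birthDate;

    MyKey(String name, LocalDate birthDate) {
        this.name = name;
        this.birthDate = birthDate;
    }

    // Two keys are equal when both attributes are equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof MyKey)) return false;
        MyKey other = (MyKey) o;
        return Objects.equals(name, other.name)
                && Objects.equals(birthDate, other.birthDate);
    }

    // Must use the same attributes as equals() so equal keys share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(name, birthDate);
    }
}

class MyKeyDemo {
    // Looks up a record using a freshly created but equal key object.
    static String lookup(MyKey key) {
        Map<MyKey, String> records = new HashMap<>();
        records.put(new MyKey("John", LocalDate.of(1956, 1, 1)), "John's record");
        records.put(new MyKey("Peter", LocalDate.of(1995, 1, 1)), "Peter's record");
        return records.get(key);
    }

    public static void main(String[] args) {
        // prints John's record
        System.out.println(lookup(new MyKey("John", LocalDate.of(1956, 1, 1))));
    }
}
```

Without the two overrides, the lookup would fall back to identity-based equals() and hashCode() from Object, and a new but equal key object would not find the stored entry.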

Q5. Why is it best practice to set the initial capacity of a HashMap appropriately?

A5. It reduces the number of rehash operations, because a map that starts out large enough does not have to resize and redistribute its entries as it grows.
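A minimal sketch of presizing, assuming the default load factor of 0.75 and a known expected number of entries. The names CapacityDemo, initialCapacityFor and presized are made up for this example; the HashMap(int initialCapacity) constructor is part of the standard API.

```java
import java.util.HashMap;
import java.util.Map;

class CapacityDemo {
    // Smallest capacity whose 75% threshold still covers the expected entry count.
    static int initialCapacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / 0.75);
    }

    // Creates a map sized so that adding expectedEntries items should not trigger a resize.
    static Map<Integer, String> presized(int expectedEntries) {
        return new HashMap<>(initialCapacityFor(expectedEntries));
    }

    public static void main(String[] args) {
        System.out.println(initialCapacityFor(1000)); // prints 1334
        Map<Integer, String> map = presized(1000);
        for (int i = 0; i < 1000; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println(map.size()); // prints 1000
    }
}
```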

Q6. How does HashSet store data internally?

A6. HashSet uses a HashMap internally. Each element of the set is stored as a key in that map, and the value is a shared placeholder object. (Translator's note: the value you add to a HashSet becomes the key.)
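The idea can be sketched with a tiny set class that delegates to a HashMap. This is an illustration similar in spirit to java.util.HashSet, not a copy of it, and the class name SimpleHashSet is made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class SimpleHashSet<E> {
    // One shared dummy value; only the keys carry information.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // Returns true if the element was not already in the set.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // Returns true if the element was present and has been removed.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SimpleHashSet<String> names = new SimpleHashSet<>();
        System.out.println(names.add("John"));      // prints true
        System.out.println(names.add("John"));      // prints false: duplicate
        System.out.println(names.contains("John")); // prints true
        System.out.println(names.size());           // prints 1
    }
}
```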

Q7. What is the impact of a bad hashCode() implementation on an object?

A7. Ideally, different objects should return different values from hashCode(). If different objects return the same value, more key/value pairs end up in the same bucket. This reduces the performance of HashMap and HashSet.
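To illustrate the effect, the sketch below counts how many distinct buckets a set of keys would occupy under the simplified masked-modulo formula from earlier. BadHashDemo, BadKey, GoodKey and distinctBuckets are names made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

class BadHashDemo {
    // Every instance reports the same hash code, so all keys collide.
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int hashCode() { return 42; }
    }

    // Hash code derived from the identifying field.
    static final class GoodKey {
        final int id;
        GoodKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id == id;
        }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    // Counts the distinct bucket indices the keys map to (simplified formula).
    static int distinctBuckets(Object[] keys, int capacity) {
        Set<Integer> used = new HashSet<>();
        for (Object key : keys) {
            used.add((key.hashCode() & 0x7FFFFFFF) % capacity);
        }
        return used.size();
    }

    public static void main(String[] args) {
        Object[] bad = new Object[16];
        Object[] good = new Object[16];
        for (int i = 0; i < 16; i++) {
            bad[i] = new BadKey(i);
            good[i] = new GoodKey(i);
        }
        System.out.println(distinctBuckets(bad, 16));  // prints 1
        System.out.println(distinctBuckets(good, 16)); // prints 16
    }
}
```

With the constant hash code, all 16 keys share one bucket, so every lookup degrades to a scan of that bucket using equals().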
