Principles and differences of HashMap, Hashtable, and ConcurrentHashMap

Last Update:2018-10-31 Source: Internet

Author: User

Tags: ConcurrentModificationException


Hashtable

Implemented as an array plus linked lists. Neither keys nor values can be null. It is thread-safe: thread safety is achieved by locking the entire table for each operation, which is inefficient; ConcurrentHashMap was designed to improve on this.
The initial size is 11. Expansion: newSize = oldSize * 2 + 1
Index calculation: index = (hash & 0x7FFFFFFF) % tab.length
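
The index and growth rules above can be tried out directly. The following is a minimal sketch: the class HashtableBasics and its helper methods are hypothetical names introduced for illustration and only mirror the formulas quoted above, not the internals of java.util.Hashtable.

```java
import java.util.Hashtable;

class HashtableBasics {
    // Index rule quoted above: clear the sign bit, then take the remainder.
    static int bucketIndex(Object key, int tableLength) {
        return (key.hashCode() & 0x7FFFFFFF) % tableLength;
    }

    // Growth rule quoted above: newSize = oldSize * 2 + 1.
    static int nextCapacity(int oldCapacity) {
        return oldCapacity * 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println("bucket of \"apple\" in 11 buckets: " + bucketIndex("apple", 11));
        System.out.println("capacity after first expansion: " + nextCapacity(11));

        Hashtable<String, Integer> table = new Hashtable<>();
        try {
            table.put("k", null); // null values are rejected
        } catch (NullPointerException e) {
            System.out.println("null value rejected");
        }
        try {
            table.put(null, 1); // null keys are rejected
        } catch (NullPointerException e) {
            System.out.println("null key rejected");
        }
    }
}
```

Masking with 0x7FFFFFFF keeps the index non-negative even when hashCode() returns a negative number.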

HashMap

Implemented as an array plus linked lists. It can store one null key and null values. It is not thread-safe.
The initial size is 16. Expansion: newSize = oldSize * 2; the size is always a power of 2.
Expansion applies to the entire map. On each expansion, the storage location of every element in the original array is recalculated and the element is re-inserted.
Whether to expand is decided after an element is inserted, so an expansion can turn out to be wasted: if the map is expanded after an insertion and no further element is ever inserted, that expansion was unnecessary.
Expansion is triggered when the total number of elements in the map exceeds 75% of the length of the entry array. This keeps the linked lists short and the elements more evenly distributed.
Index calculation: index = hash & (tab.length - 1)
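
The mask-based index rule works because the table length is a power of 2: for such lengths, hash & (length - 1) gives the same result as a non-negative modulo. A small sketch follows; HashMapIndexDemo and its methods are hypothetical names, and spread() is an assumption that mirrors the bit-mixing step used by java.util.HashMap since Java 8.

```java
class HashMapIndexDemo {
    // Index rule quoted above; only valid when tableLength is a power of two.
    static int bucketIndex(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    // Assumption: mixes the high bits into the low bits, as HashMap does since Java 8.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Growth rule quoted above: newSize = oldSize * 2.
    static int nextCapacity(int oldCapacity) {
        return oldCapacity * 2;
    }

    public static void main(String[] args) {
        int length = 16;
        for (int hash : new int[] {5, 37, -17, 1 << 20}) {
            System.out.println(hash + " -> " + bucketIndex(hash, length)
                    + " (floorMod: " + Math.floorMod(hash, length) + ")");
        }
        System.out.println("bucket of \"apple\": " + bucketIndex(spread("apple"), length));
    }
}
```

The mask is cheaper than the remainder operation used by Hashtable, which is one reason HashMap keeps its size a power of 2.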

The load factor should also be considered when choosing the initial capacity of a HashMap:

Hash conflict: when the hash values of several keys, taken modulo the array size, fall on the same array index, the entries form a chain. To look up a key, each element in the chain must be traversed and compared with equals().
Load factor: to reduce the probability of hash conflicts, expansion is triggered by default when the number of key-value pairs reaches 75% of the array size. Therefore, if you expect to store 100 entries, request an initial capacity of at least 100 / 0.75 ≈ 133.3, that is 134. (HashMap rounds the requested capacity up to the next power of 2, here 256.)
Trading space for time: to speed up key lookups, you can lower the load factor and increase the initial size further, which reduces the probability of hash conflicts.
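
The capacity arithmetic above can be written as a small helper. This is a sketch under stated assumptions: InitialCapacityDemo, capacityFor, and roundUpToPowerOfTwo are hypothetical names, and the rounding helper only imitates the power-of-2 rounding that HashMap applies to a requested capacity.

```java
import java.util.HashMap;
import java.util.Map;

class InitialCapacityDemo {
    // Smallest requested capacity that holds expectedEntries without triggering expansion.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    // Imitates HashMap's rounding of a requested capacity up to a power of two
    // (ignores overflow for very large inputs).
    static int roundUpToPowerOfTwo(int requested) {
        if (requested <= 1) {
            return 1;
        }
        int highest = Integer.highestOneBit(requested);
        return (highest == requested) ? requested : highest << 1;
    }

    public static void main(String[] args) {
        int requested = capacityFor(100, 0.75f);
        System.out.println("requested capacity: " + requested);
        System.out.println("rounded table size: " + roundUpToPowerOfTwo(requested));

        // Pre-sized map: 100 insertions should not trigger an expansion.
        Map<Integer, String> map = new HashMap<>(requested);
        for (int i = 0; i < 100; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println("entries stored: " + map.size());
    }
}
```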

Both HashMap and Hashtable use a hash algorithm to decide where their elements are stored, so both hash tables have the following attributes:

Capacity: the number of buckets in the hash table
Initial capacity: the number of buckets when the hash table is created. HashMap lets you specify the initial capacity in the constructor.
Size: the number of records currently in the hash table
Load factor: equal to size / capacity. A load factor of 0 indicates an empty hash table, 0.5 a half-full one, and so on. A lightly loaded hash table has fewer conflicts and suits insertion and lookup (but iterating over its elements with an Iterator is slower).

In addition, a hash table has a "load limit", a positive threshold (typically between 0 and 1) that determines the maximum fill level of the hash table. When the load factor reaches the specified "load limit", the hash table automatically grows its capacity (roughly doubling the number of buckets) and redistributes the existing objects into the new buckets. This is called rehashing.

The constructors of HashMap and Hashtable let you specify a load limit (the constructor parameter is named loadFactor). The default "load limit" of both is 0.75, meaning that rehashing occurs once 3/4 of the hash table has been filled.

The default value (0.75) of "load limit" is a compromise between time and space costs:

A higher "load limit" reduces the memory occupied by the hash table but increases the time cost of lookups, and lookup is the most frequent operation (both get() and put() of HashMap perform a lookup).
A lower "load limit" improves lookup performance but increases the memory occupied by the hash table.

Programmers can adjust the "load limit" value according to the actual situation.
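
As an illustration of adjusting the limit, both classes accept it as a second constructor argument. In this sketch, LoadLimitDemo and threshold() are hypothetical names; threshold() simply multiplies capacity by the limit to show how many entries fit before rehashing.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class LoadLimitDemo {
    // Number of entries at which rehashing is triggered for a given capacity and limit.
    static int threshold(int capacity, float loadLimit) {
        return (int) (capacity * loadLimit);
    }

    public static void main(String[] args) {
        // Lower limit: more memory per entry, shorter chains, faster lookups.
        Map<String, Integer> fastLookups = new HashMap<>(64, 0.5f);
        // Higher limit: denser table, less memory, longer chains.
        Map<String, Integer> compact = new Hashtable<>(11, 0.9f);

        fastLookups.put("a", 1);
        compact.put("a", 1);

        System.out.println("default HashMap threshold: " + threshold(16, 0.75f));
        System.out.println("64 buckets at 0.5: " + threshold(64, 0.5f));
    }
}
```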

ConcurrentHashMap

The underlying structure is a segmented array plus linked lists, and it is thread-safe. (This describes the implementation up to JDK 7; JDK 8 dropped segments in favor of locking individual buckets.)
By dividing the whole map into N segments, it provides the same thread safety while allowing up to N threads to write at once, 16 by default. (Read operations are not locked. Because the value field of HashEntry is volatile, reads see the latest value.)
Hashtable's synchronized applies to the entire hash table: every operation locks the whole table for exclusive use by one thread. ConcurrentHashMap allows concurrent modifications; the key is its use of lock separation.
Some methods must work across segments, such as size() and containsValue(). They may need to lock the entire table rather than one segment, which means acquiring the locks of all segments in order and, once the operation is complete, releasing them in order.
Expansion: expansion happens within a segment (it is triggered when the number of elements in the segment exceeds 75% of the length of that segment's entry array) and does not resize the whole map. Whether expansion is needed is checked before insertion, which avoids wasted expansion.

Both Hashtable and HashMap implement the Map interface, but Hashtable is based on the abstract class Dictionary. Java 5 introduced ConcurrentHashMap, an alternative to Hashtable with better scalability.

HashMap reads and writes data based on hashing. When a key-value pair is passed to put(), it calls hashCode() on the key object to compute the hash code and uses it to find the bucket in which to store the value. When retrieving an object, it finds the correct key-value pair with the key object's equals() method and returns the value. HashMap resolves collisions with linked lists: when a collision occurs, the new entry is stored in the next node of the list. Each list node stores a key-value pair. When two different key objects have the same hash code, they are stored in the linked list at the same bucket, and equals() on the key is used to find the right pair. Since Java 8, if the length of a linked list exceeds the threshold (TREEIFY_THRESHOLD, 8), the list is converted into a tree structure.
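
A collision is easy to provoke with a key class whose hashCode() always returns the same value. The sketch below uses a hypothetical Key class for that purpose; every key lands in the same bucket, and lookups still succeed because equals() tells the keys apart.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Hypothetical key type whose hash code deliberately collides for all instances.
    static final class Key {
        final String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return 42; // same bucket for every key
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Key && ((Key) other).name.equals(name);
        }
    }

    static Map<Key, String> build() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("a"), "first");
        map.put(new Key("b"), "second"); // collides with "a", chained in the same bucket
        map.put(new Key("a"), "replaced"); // equal key: value is overwritten, no new node
        return map;
    }

    public static void main(String[] args) {
        Map<Key, String> map = build();
        System.out.println(map.get(new Key("a")));
        System.out.println(map.get(new Key("b")));
        System.out.println(map.size());
    }
}
```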

In HashMap, null can be used as a key. There can be only one such key, but any number of keys can map to null values. When get() returns null, it can mean either that the HashMap does not contain the key or that the value mapped to the key is null. Therefore, get() cannot be used to determine whether a key exists in a HashMap; use containsKey() instead. In Hashtable, neither keys nor values can be null.
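
The ambiguity of a null result from get() can be seen in a few lines. NullKeyDemo and sample() are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class NullKeyDemo {
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for the null key"); // a single null key is allowed
        map.put("present", null);                // null values are allowed
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sample();

        // get() returns null in both cases, so it cannot tell them apart.
        System.out.println(map.get("present"));
        System.out.println(map.get("absent"));

        // containsKey() distinguishes a null value from a missing key.
        System.out.println(map.containsKey("present"));
        System.out.println(map.containsKey("absent"));

        System.out.println(map.get(null));
    }
}
```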

Hashtable is thread-safe: its methods are synchronized, so it can be used directly in a multi-threaded environment. HashMap is not thread-safe; in a multi-threaded environment it must be synchronized manually.
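
One common way to synchronize a HashMap manually is to wrap it with Collections.synchronizedMap. The sketch below is only one option among several; SynchronizedMapDemo and fillFromThreads are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedMapDemo {
    // Several threads insert disjoint keys into one wrapped HashMap; returns the final size.
    static int fillFromThreads(int threads, int perThread) {
        Map<Integer, Integer> map = Collections.synchronizedMap(new HashMap<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i); // each call locks the whole wrapper
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromThreads(4, 1000));
    }
}
```

Like Hashtable, the wrapper uses a single lock for the whole map, and iterating over it still requires the caller to hold that lock in a synchronized block.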

Another difference between Hashtable and HashMap is that HashMap's iterator is fail-fast, while Hashtable's Enumeration is not. When another thread changes the structure of the HashMap (adding or removing elements) during iteration, ConcurrentModificationException is thrown, whereas the iterator's own remove() method does not cause ConcurrentModificationException. This behavior is not guaranteed, however, and depends on the JVM; it is detected only on a best-effort basis.
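
Fail-fast behavior does not need a second thread to show up; changing the map directly while iterating in the same thread triggers it too. A minimal sketch, with FailFastDemo and its methods as hypothetical names:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Removes through the map while iterating; reports whether the iterator noticed.
    static boolean modifyDuringIteration() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change behind the iterator's back
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    // Removes odd values through the iterator itself; returns the remaining size.
    static int removeViaIterator() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 1) {
                it.remove(); // safe: the iterator keeps its own bookkeeping in sync
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("exception on direct removal: " + modifyDuringIteration());
        System.out.println("entries left after iterator removal: " + removeViaIterator());
    }
}
```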

Let's take a look at the simple class diagram:

From the class diagram, we can see that in its storage structure ConcurrentHashMap has one more class than HashMap, Segment, and that Segment is a reentrant lock.

ConcurrentHashMap uses lock segmentation (also known as lock striping) to ensure thread safety.

Lock segmentation: the data is first divided into segments for storage, and each segment is assigned its own lock. While one thread holds a lock to access the data in one segment, the data in other segments can still be accessed by other threads.
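
The idea of lock segmentation can be sketched with an array of locks, each guarding its own small map. This is an illustrative toy and not the ConcurrentHashMap implementation: StripedMap is a hypothetical class, and unlike the description above it also takes the lock for reads to keep the sketch simple.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class StripedMap<K, V> {
    private final ReentrantLock[] locks;
    private final Map<K, V>[] segments;

    @SuppressWarnings({"unchecked", "rawtypes"})
    StripedMap(int stripes) {
        locks = new ReentrantLock[stripes];
        segments = (Map<K, V>[]) new Map[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new ReentrantLock();
            segments[i] = new HashMap<>();
        }
    }

    // Picks the segment (and therefore the lock) responsible for a key.
    private int stripeFor(Object key) {
        return (key.hashCode() & 0x7FFFFFFF) % locks.length;
    }

    V put(K key, V value) {
        int s = stripeFor(key);
        locks[s].lock(); // only this segment is blocked for other threads
        try {
            return segments[s].put(key, value);
        } finally {
            locks[s].unlock();
        }
    }

    V get(Object key) {
        int s = stripeFor(key);
        locks[s].lock();
        try {
            return segments[s].get(key);
        } finally {
            locks[s].unlock();
        }
    }

    // Cross-segment operation: acquire every lock in order, then release them all.
    int size() {
        for (ReentrantLock lock : locks) {
            lock.lock();
        }
        try {
            int total = 0;
            for (Map<K, V> segment : segments) {
                total += segment.size();
            }
            return total;
        } finally {
            for (ReentrantLock lock : locks) {
                lock.unlock();
            }
        }
    }

    public static void main(String[] args) {
        StripedMap<String, Integer> map = new StripedMap<>(16);
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(map.get("a") + " " + map.size());
    }
}
```

Two threads writing keys that fall into different segments never wait for each other, while size() shows why cross-segment operations are more expensive.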

ConcurrentHashMap provides a different locking mechanism from Hashtable and synchronizedMap. Hashtable locks the entire hash table at once, so only one thread can operate on it at a time, while ConcurrentHashMap locks one segment (a group of buckets) at a time.

By default, ConcurrentHashMap divides the hash table into 16 segments. Common operations such as get, put, and remove need to lock only the segment currently in use. Where previously only one thread could enter at a time, up to 16 writer threads can now execute simultaneously, so the improvement in concurrent performance is obvious.
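
From the caller's side, none of this locking is visible. The sketch below has several threads update shared counters through merge(), which ConcurrentHashMap performs atomically; ConcurrentCounterDemo and countFromThreads are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCounterDemo {
    // Each thread increments one of 16 counters perThread times; returns the grand total.
    static int countFromThreads(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.merge("key" + (i % 16), 1, Integer::sum); // atomic per key
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counts.values().stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        System.out.println(countFromThreads(8, 10_000));
    }
}
```

With a plain HashMap the same code could lose updates or corrupt the table; here no external synchronization is needed.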

Note: partially adapted from https://www.cnblogs.com/heyonggang/p/9112731.html
