ConcurrentHashMap, Collections.synchronizedMap, and Hashtable discussion

Source: Internet
Author: User

The first associative collection class to appear in the Java class library was Hashtable, part of JDK 1.0. Hashtable provides an easy-to-use, thread-safe associative map, which is of course convenient. That thread safety comes at a cost, however: all of Hashtable's methods are synchronized, and even uncontended synchronization carries a considerable performance cost. Hashtable's successor, HashMap, appeared as part of the collections framework in JDK 1.2. It addresses the thread-safety question by providing an unsynchronized base class together with a synchronized wrapper, Collections.synchronizedMap.
By separating the basic functionality from thread safety, Collections.synchronizedMap lets users who need synchronization have it, while users who do not need it avoid paying for it. The simple approach that Hashtable and synchronizedMap take to synchronization (synchronizing every method on the Hashtable or on the synchronized Map wrapper object) has two major deficiencies. First, it is a barrier to scalability, because only one thread can access the hash table at a time. Second, it is still not enough to provide true thread safety: many common compound operations still require additional synchronization.
Although simple operations such as get() and put() can be done safely without additional synchronization, there are some common sequences of operations, such as iteration or put-if-absent, that require external synchronization to avoid data races. Collection wrappers such as synchronizedMap and synchronizedList are sometimes described as conditionally thread-safe: each individual operation is thread-safe, but a sequence of multiple operations can still result in a data race, because the control flow in the sequence depends on the results of the preceding operations.
Consider put-if-absent: if an entry is not in the map, add it. Unfortunately, between the time containsKey() returns and the time put() is called, another thread may insert a value with the same key. If you want to make sure there is only one insertion, you must wrap the pair of statements in a block synchronized on the Map. Iteration has a similar problem: the result of List.size() may become invalid during the execution of a loop, because another thread can remove entries from the list. If an entry is deleted by another thread just before the last iteration of the loop, List.get() will return null, and a NullPointerException may be thrown. So what can be done to avoid this? If another thread may be accessing the list while you are iterating over it, you must wrap the iteration in a block synchronized on the list, locking the entire list.
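The put-if-absent race described above can be closed by synchronizing the containsKey()/put() pair on the map. A minimal sketch (the putIfAbsent helper name is ours for illustration, not a method the Map interface offered at the time):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class PutIfAbsentDemo {
    // Illustrative helper: atomically add an entry only if its key is absent.
    // The containsKey()/put() pair is wrapped in one block synchronized on the
    // map; without it, another thread could insert the same key between the calls.
    static String putIfAbsent(Map<String, String> map, String key, String value) {
        synchronized (map) {            // lock the whole map for the compound action
            if (!map.containsKey(key)) {
                map.put(key, value);
                return null;            // no previous value: our insert won
            }
            return map.get(key);        // entry already present: keep the old value
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = Collections.synchronizedMap(new HashMap<>());
        System.out.println(putIfAbsent(map, "k", "first"));   // null
        System.out.println(putIfAbsent(map, "k", "second"));  // first
        System.out.println(map.get("k"));                     // first
    }
}
```

Note that synchronizing on the wrapper returned by Collections.synchronizedMap is essential: the wrapper's own methods lock on that same object, so the compound action and the individual operations exclude each other.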
This solves the data-race problem, but at a cost to concurrency: locking the entire list during iteration blocks other threads from accessing it for a long time. The collections framework introduced iterators for traversing a list or other collection, which streamlines the process of iterating over a collection's elements. However, the iterators implemented in the java.util collection classes are fail-fast: if one thread is traversing a collection through an Iterator while another thread modifies the collection, the next call to Iterator.hasNext() or Iterator.next() throws ConcurrentModificationException.
To prevent ConcurrentModificationException, you must wrap the iteration in a block synchronized on the list, locking the entire list for the duration. (Alternatively, you can call List.toArray() and iterate over the resulting array without synchronization, but this is expensive if the list is large.)
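Holding the list's lock for the whole traversal looks like the following sketch (the totalLength helper is our own example, not library code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SafeIterationDemo {
    // Sum the lengths of all strings in a synchronized list.
    // The entire iteration runs inside a block synchronized on the list,
    // so no other thread can modify it mid-traversal and trigger a
    // ConcurrentModificationException.
    static int totalLength(List<String> list) {
        int total = 0;
        synchronized (list) {           // lock the whole list for the traversal
            for (String s : list) {     // uses the fail-fast Iterator internally
                total += s.length();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        Collections.addAll(list, "ab", "cde");
        System.out.println(totalLength(list)); // 5
    }
}
```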

ConcurrentHashMap, originally part of Doug Lea's util.concurrent package and now integrated into JDK 5.0, provides a much higher degree of concurrency than Hashtable or synchronizedMap. Moreover, for most successful get() operations it avoids locking entirely, and the result is very good throughput for concurrent applications.
1 Optimized for throughput
ConcurrentHashMap uses several techniques to achieve high concurrency and avoid locking, including using multiple write locks for different hash buckets and exploiting the uncertainties of the JMM to minimize the time locks are held, or to avoid acquiring locks at all. It is optimized for the most common usage, which is retrieving a value that is likely already in the map. In fact, most successful get() operations run without any locking at all. (Warning: don't try this yourself! Outsmarting the JMM is not as easy as it looks. The util.concurrent classes were written by concurrency experts and subjected to rigorous peer review for JMM safety.)
2 Multiple write locks
Recall that the main obstacle to the scalability of Hashtable (and of the Collections.synchronizedMap alternative) is that it uses a single map-wide lock. That lock must be held for the entirety of an insert, remove, or retrieve operation, and sometimes even for the entirety of an iteration. As long as the lock is held, other threads are prevented from accessing the map, even when idle processors are available, which greatly restricts concurrency.
Instead of a single map-wide lock, ConcurrentHashMap uses a collection of 32 locks, each of which protects a subset of the hash buckets. Locks are used primarily by the mutating operations (put() and remove()). Having 32 independent locks means that up to 32 threads can modify the map at the same time. This does not necessarily mean that when fewer than 32 threads are writing to the map concurrently, no write will ever block: 32 is the theoretical concurrency limit for writers, and in practice it may not be reached. Still, 32 is much better than 1, and is more than sufficient for most applications running on current-generation computer systems.
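The lock-striping idea can be sketched as follows. This is an illustrative toy (the StripedMap class is ours and is far simpler than the real ConcurrentHashMap internals); it only shows how a key's hash selects one of N independent locks, so writers to different stripes never contend:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of lock striping, not the real ConcurrentHashMap code:
// N independent locks each guard a disjoint subset of the buckets, so up to
// N threads can modify different stripes concurrently.
public class StripedMap<K, V> {
    private static final int N_LOCKS = 32;
    private final Object[] locks = new Object[N_LOCKS];
    @SuppressWarnings("unchecked")
    private final Map<K, V>[] stripes = new Map[N_LOCKS];

    public StripedMap() {
        for (int i = 0; i < N_LOCKS; i++) {
            locks[i] = new Object();
            stripes[i] = new HashMap<>();
        }
    }

    // Map the key's hash code to one of the 32 stripes (mask keeps it non-negative).
    private int stripeFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % N_LOCKS;
    }

    public V put(K key, V value) {
        int i = stripeFor(key);
        synchronized (locks[i]) {       // lock only this key's stripe, not the map
            return stripes[i].put(key, value);
        }
    }

    public V get(Object key) {
        int i = stripeFor(key);
        synchronized (locks[i]) {       // the real class often avoids even this
            return stripes[i].get(key);
        }
    }
}
```

Unlike this sketch, the real implementation also lets most get() calls complete without taking the stripe lock at all, which is where the JMM subtleties discussed above come in.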
3 Map-wide operations
With 32 separate locks, each protecting a subset of the hash buckets, exclusive access to the whole map requires acquiring all 32 locks. Some map-wide operations, such as size() and isEmpty(), may be able to avoid locking the entire map at once (by suitably qualifying the semantics of these operations), but other operations, such as rehashing (expanding the number of hash buckets and redistributing elements as the map grows), must guarantee exclusive access. The Java language provides no simple way to acquire a variable-sized set of locks. Fortunately this is rarely necessary, and when it is, it can be done recursively.
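The recursive trick works because a synchronized block must be lexically scoped: to hold N monitors at once, you take one per stack frame. A minimal sketch under that assumption (the class and method names here are ours, purely for illustration):

```java
import java.util.function.Supplier;

// Illustrative sketch: acquiring a variable-sized set of monitor locks.
// synchronized blocks nest lexically, so the way to hold all N monitors
// simultaneously is to recurse, taking one lock per stack frame.
public class LockAllDemo {
    private final Object[] locks;

    public LockAllDemo(int n) {
        locks = new Object[n];
        for (int i = 0; i < n; i++) {
            locks[i] = new Object();
        }
    }

    // Runs 'action' while holding every lock from index i upward.
    private <T> T withLocksFrom(int i, Supplier<T> action) {
        if (i == locks.length) {
            return action.get();        // all locks held: do the exclusive work
        }
        synchronized (locks[i]) {       // take lock i, then recurse for i + 1
            return withLocksFrom(i + 1, action);
        }
    }

    // Acquire all locks in a fixed order (index order, which also avoids deadlock),
    // run the map-wide action, then release them as the recursion unwinds.
    public <T> T withAllLocks(Supplier<T> action) {
        return withLocksFrom(0, action);
    }
}
```

Acquiring the locks in a fixed index order is what makes this safe to combine with the per-stripe locking: any two threads that each need several locks always request them in the same order, so they cannot deadlock.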
4 JMM Overview
Before getting into the implementation of put(), get(), and remove(), let's take a brief look at the JMM. The JMM governs how a thread's actions on memory (reads and writes) affect how other threads see memory. Because of the performance gains from using processor registers and per-processor caches to speed up memory access, the Java Language Specification (JLS) permits some memory operations not to be immediately visible to all other threads. Two language mechanisms can be used to ensure that memory operations are consistent across threads: synchronized and volatile.
According to the JLS, "in the absence of explicit synchronization, an implementation is free to update the main memory in an order that may be surprising." This means that without synchronization, a given sequence of writes in one thread may appear in a different order to another thread, and the time it takes for a memory update to propagate from one thread to another is unpredictable.
Although the most common reason for using synchronization is to guarantee atomic access to critical sections of code, synchronization actually provides three separate functions: atomicity, visibility, and ordering. Atomicity is straightforward: synchronization implements a reentrant mutual-exclusion lock, preventing more than one thread at a time from executing a block of code protected by a given monitor. Unfortunately, most articles focus only on atomicity and ignore the other aspects. But synchronization also plays an important role in the JMM, causing the JVM to execute memory barriers when it acquires and releases a monitor.
After a thread acquires a monitor, it executes a read barrier, invalidating variables cached in thread-local memory (such as processor caches or registers), which forces the processor to re-read from main memory the variables used by the synchronized block. Similarly, when the monitor is released, the thread executes a write barrier, writing all modified variables back to main memory. The combination of mutual exclusion and memory barriers means that as long as a program follows the correct synchronization rules (synchronizing whenever writing a variable that may later be read by another thread, and whenever reading a variable that may have been modified by another thread), each thread will see the correct values of the shared variables it uses.
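The "synchronize on both the read and the write" rule can be made concrete with a small sketch (the VisibilityDemo class is ours, for illustration only). Both accessors lock the same monitor, so the write barrier on release and the read barrier on acquire guarantee that a value set by one thread is visible to any thread that subsequently reads it:

```java
public class VisibilityDemo {
    private int value;                  // shared state, guarded by 'this'

    // Both methods synchronize on the same monitor (this), so every write is
    // flushed to main memory on release, and every read re-fetches from main
    // memory on acquire: readers always see the latest committed value.
    public synchronized void set(int v) {
        value = v;
    }

    public synchronized int get() {
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        VisibilityDemo d = new VisibilityDemo();
        Thread writer = new Thread(() -> d.set(42));
        writer.start();
        writer.join();
        System.out.println(d.get());    // 42, guaranteed visible to this thread
    }
}
```

If value were read without synchronization (or without being declared volatile), the JMM would permit a reader thread to see a stale cached copy indefinitely.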
