Optimizations for concurrent throughput performance
ConcurrentHashMap uses several techniques to achieve high concurrent performance while avoiding or minimizing locking. These techniques include:
- Using multiple write locks, each responsible for a different subset of the hash buckets (that is, for keys whose hash values fall in a different range);
- Exploiting the latitude the JMM (Java Memory Model) allows to minimize the time a lock is held, or to avoid locking altogether.
ConcurrentHashMap is optimized for the most common scenarios, such as retrieving a value that already exists in the map. In fact, most successful get() operations use no locking at all, which is possible precisely because of the latitude the JMM allows.
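A minimal usage sketch of this lock-free read path (the thread setup here is purely illustrative; only the ConcurrentHashMap calls are the point):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class GetDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("hits", 1);

        // A writer and a reader run concurrently with no external locking;
        // a successful get() does not block even while put() is in flight.
        new Thread(() -> map.put("hits", 2)).start();
        new Thread(() -> System.out.println("read: " + map.get("hits"))).start();
    }
}
```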
Multiple write locks
Recall that Hashtable's thread safety comes from a single, map-wide lock that is held during every insertion, deletion, and query, and even while an iterator traverses the entire map. While one thread holds the lock, all other threads are prevented from accessing the map and simply sit idle. This single-lock mechanism severely limits concurrent performance.
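To make the cost concrete, here is a simplified sketch of the single-lock pattern (illustrative only, not Hashtable's actual source): every method synchronizes on the same monitor, so even read-only operations serialize behind one another.

```java
import java.util.HashMap;
import java.util.Map;

// Hashtable-style locking: one monitor guards the whole map, so a slow
// traversal or write blocks every reader and writer.
public class SingleLockMap<K, V> {
    private final Map<K, V> map = new HashMap<>();

    public synchronized V get(K key)          { return map.get(key); }
    public synchronized V put(K key, V value) { return map.put(key, value); }
    public synchronized V remove(K key)       { return map.remove(key); }
}
```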
Instead of a single map-wide lock, ConcurrentHashMap uses 32 locks, each responsible for a subset of the hash buckets (that is, each guards the keys whose hash values fall in its range). These locks are taken only by operations that modify the map's contents, such as put() and remove(). Having 32 separate locks means that up to 32 threads can modify the map at the same time. This does not mean that fewer than 32 concurrent writers will never block one another: 32 is only the theoretical upper bound on concurrent write operations, and in practice it is not always reached. Even so, on current hardware and for most programs, 32 locks perform far better than one.
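The sketch below illustrates the lock-striping idea. It is a hypothetical simplification, not ConcurrentHashMap's actual implementation: the segment count of 32, the hash-to-stripe mapping, and the class name are all illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Lock striping: each of 32 locks guards its own segment of buckets, so
// writes to keys in different segments proceed in parallel.
public class StripedMap<K, V> {
    static final int NUM_LOCKS = 32;
    final ReentrantLock[] locks = new ReentrantLock[NUM_LOCKS];
    final Map<K, V>[] segments;

    @SuppressWarnings("unchecked")
    public StripedMap() {
        segments = new Map[NUM_LOCKS];
        for (int i = 0; i < NUM_LOCKS; i++) {
            locks[i] = new ReentrantLock();
            segments[i] = new HashMap<>();
        }
    }

    // Map a key's hash value to one of the 32 stripes.
    int stripeFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % NUM_LOCKS;
    }

    public V put(K key, V value) {
        int i = stripeFor(key);
        locks[i].lock();                  // blocks only 1 of 32 stripes
        try {
            return segments[i].put(key, value);
        } finally {
            locks[i].unlock();
        }
    }

    public V remove(K key) {
        int i = stripeFor(key);
        locks[i].lock();
        try {
            return segments[i].remove(key);
        } finally {
            locks[i].unlock();
        }
    }
}
```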
Map-wide locking
With 32 separate locks, each responsible for a subset of the hash buckets, any operation that needs exclusive access to the whole map must acquire all 32 locks. Rehashing is one example: as the number of entries grows, the bucket array is expanded and every entry must be redistributed, and this must happen under exclusive access. However, the Java language provides no simple way to acquire a variable-length set of locks at once, and since such operations are infrequent, ConcurrentHashMap acquires the full set of map-wide locks by recursion.
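A hypothetical method added to the StripedMap sketch above shows the recursive idea: each level of recursion acquires one lock, and when the recursion bottoms out all 32 locks are held, so the action runs with exclusive access; unwinding the recursion releases them in reverse order.

```java
// Recursively acquire every stripe lock, then run a map-wide exclusive
// operation (e.g. a rehash). The rehash itself is omitted here.
void withAllLocks(int lockIndex, Runnable exclusiveAction) {
    if (lockIndex == NUM_LOCKS) {
        exclusiveAction.run();                // all 32 locks held
        return;
    }
    locks[lockIndex].lock();
    try {
        withAllLocks(lockIndex + 1, exclusiveAction);
    } finally {
        locks[lockIndex].unlock();            // released while unwinding
    }
}

// Usage: withAllLocks(0, () -> { /* redistribute entries */ });
```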
JMM (Java Memory Model) overview
The JMM is the Java Memory Model, which defines how Java threads interact through memory. Briefly: the JVM has a main memory (the Java heap), in which all Java variables are stored and shared among all threads. Each thread also has its own working memory, holding copies of some of the variables in main memory. A thread operates only on the variables in its working memory; threads cannot access each other's working memory directly, and values pass between threads through main memory. On multicore CPUs, working-memory variables mostly live in processor caches and registers, and while a value sits in a cache rather than in main memory it is not visible to other threads. A detailed introduction to the JMM will appear in a later article.
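The visibility hazard this model permits can be shown with a small program. This is an illustrative sketch: whether the reader actually hangs depends on the JVM and hardware, and the 100 ms sleep merely gives the reader a head start.

```java
// Without synchronization, the reader's working memory may never be
// refreshed, so it can spin on a stale copy of `done` indefinitely.
public class VisibilityDemo {
    static boolean done = false;   // deliberately NOT volatile

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!done) {
                // may loop forever: `done` can stay cached in a register
            }
            System.out.println("reader saw done = true");
        });
        reader.start();
        Thread.sleep(100);         // let the reader start spinning
        done = true;               // this write may never reach the reader
    }
}
```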
As you know, the CPU works with data in its registers whenever possible, which greatly improves performance. The JLS (Java Language Specification) permits some memory operations not to become visible to other threads immediately. It also provides two language-level mechanisms for keeping memory operations consistent across threads: synchronized and volatile. As the JLS puts it, "in the absence of explicit synchronization, an implementation is free to update the main memory in an order that may be surprising." This means that without synchronization, the order in which one thread's writes reach main memory may differ from the order in which another thread performed them, and an updated variable may become visible to other threads only after an indeterminate delay.
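Declaring the flag volatile is the lighter-weight of the two mechanisms and repairs the VisibilityDemo sketch above: every write to a volatile variable is flushed to main memory, and every read of it goes back to main memory.

```java
// One-line fix to the sketch above: the write to `done` is now guaranteed
// to become visible, so the reader's spin loop must terminate.
static volatile boolean done = false;
```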
The most fundamental reason to use synchronized is to guarantee that threads access a critical section atomically. In fact, synchronized provides three guarantees: atomicity, visibility, and ordering. Atomicity is the easiest to understand: it makes a region mutually exclusive, preventing more than one thread at a time from executing the protected code. Unfortunately, many articles emphasize only the atomicity of synchronized and ignore the other two guarantees. Synchronization plays an important role in the JMM: when a monitor (lock) is acquired and released, the JVM executes memory barriers.
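A classic illustration of the atomicity guarantee, as a minimal sketch: count++ is a read-modify-write sequence, and without mutual exclusion two threads can interleave it and lose an update.

```java
// Atomicity via synchronized: at most one thread at a time runs the
// read-modify-write in increment(), so no updates are lost.
public class Counter {
    private int count = 0;

    public synchronized void increment() { count++; }
    public synchronized int get()        { return count; }
}
```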
When a thread acquires a lock, it executes a read barrier, which invalidates the values cached in its thread-local memory (CPU caches and registers) and forces it to re-read those variables from main memory. Similarly, when a thread releases a lock, it executes a write barrier, flushing any variables it has modified back to main memory. The combination of mutual exclusion and memory barriers means that as long as a program follows the correct synchronization rule (every write to a shared variable happens under synchronization, and every read of it happens under the same synchronization), each thread will see the correct values of the shared variables it uses.
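The rule can be restated in code. In this minimal sketch, publish() and read() synchronize on the same monitor, so the write barrier executed when publish() releases the lock pairs with the read barrier executed when read() acquires it, and the reader is guaranteed to see the writer's values.

```java
// Release/acquire pairing on a single monitor: the unlock in publish()
// flushes x, y, and published to main memory; the lock in read() re-reads
// them, so a read() after a publish() observes a consistent pair.
public class Publisher {
    private int x, y;              // plain fields; the lock provides visibility
    private boolean published;

    public synchronized void publish(int px, int py) {
        x = px;
        y = py;
        published = true;
    }

    public synchronized int[] read() {
        return published ? new int[] { x, y } : null;
    }
}
```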
If shared variables are accessed without synchronization, strange things happen: some changes propagate to other threads quickly, while others take a long time to appear. The upshot is that without synchronized you cannot be sure of seeing a consistent memory view (related variables may have inconsistent values across threads, and some of those values may be stale). The usual, and recommended, way to avoid such stale data is of course to use synchronized correctly. In certain cases, though, such as in a widely used base class like ConcurrentHashMap, it is worth applying extra expertise and development effort to achieve high performance.
How ConcurrentHashMap improves throughput under concurrency