We know that ConcurrentHashMap (JDK 1.8) is a thread-safe concurrent collection, yet if you read the source code of its get operation you will find that get takes no lock at all. That is the question this post discusses: why does get not need a lock?
Introduction to ConcurrentHashMap
Most readers probably already know that JDK 1.7 implemented it with Segment + HashEntry + ReentrantLock, while 1.8 abandoned the bloated Segment design and instead uses Node + CAS + synchronized to guarantee concurrent safety.
- The JDK 1.8 implementation reduces the lock granularity: in JDK 1.7 the unit of locking is a Segment, which contains multiple HashEntry buckets, whereas in JDK 1.8 the unit of locking is a single bucket's first node (a simplified sketch of the 1.8-style write path follows this list).
- The JDK 1.8 data structure is simpler, which also makes the operations clearer. Because synchronization is done with synchronized, the concept of a segmented lock is no longer needed and the Segment structure disappears; on the other hand, the finer granularity increases the implementation complexity.
- JDK 1.8 uses red-black trees to optimize long linked lists: traversing a long list is slow, while traversing a red-black tree is fast, so once a list exceeds a certain threshold it is converted into a tree, and the two structures complement each other.
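To make the granularity difference concrete, here is a minimal sketch in the spirit of the 1.8 write path. It is not the JDK source: the class BinLockedMap and everything in it are made up for illustration, and resizing and treeification are omitted. An empty bin is filled with CAS, and a non-empty bin is updated while holding only the monitor of its head node, so writes to different bins never contend.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Simplified illustration of 1.8-style lock granularity (not the JDK implementation):
// an empty bin is filled with CAS, a non-empty bin is updated while holding only the
// monitor of its head node, so writers touching different bins do not block each other.
class BinLockedMap<K, V> {
    static final class Node<K, V> {
        final K key;
        volatile V val;
        volatile Node<K, V> next;
        Node(K key, V val) { this.key = key; this.val = val; }
    }

    private final AtomicReferenceArray<Node<K, V>> bins = new AtomicReferenceArray<>(16);

    private int indexFor(Object key) {
        return (bins.length() - 1) & (key.hashCode() & 0x7fffffff);
    }

    public void put(K key, V val) {
        int i = indexFor(key);
        for (;;) {
            Node<K, V> head = bins.get(i);
            if (head == null) {
                // Empty bin: publish the first node with CAS, no lock at all.
                if (bins.compareAndSet(i, null, new Node<>(key, val))) return;
                continue; // lost the race, retry against the freshly installed head
            }
            synchronized (head) { // lock granularity: this bin's head node only
                // Nodes are never removed in this sketch, so the installed head cannot change here.
                for (Node<K, V> e = head; ; e = e.next) {
                    if (e.key.equals(key)) { e.val = val; return; }
                    if (e.next == null) { e.next = new Node<>(key, val); return; }
                }
            }
        }
    }

    public V get(Object key) {
        int i = indexFor(key);
        for (Node<K, V> e = bins.get(i); e != null; e = e.next) {
            if (e.key.equals(key)) return e.val; // no lock: val and next are volatile
        }
        return null;
    }
}
```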
Get Operation Source Code
- First the hash is computed to locate the index in the table; if the first node of that bucket matches, it is returned.
- If a resize is in progress, the find method of the ForwardingNode (the marker node used during expansion) is called to look up the key, and a match is returned.
- If neither of the above matches, the list is traversed node by node; a match is returned, otherwise null.
You will find that there is no lock anywhere in this code:

```java
public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode()); // compute the hash
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) { // read the first node of the bucket
        if ((eh = e.hash) == h) { // if the first node matches, return it
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        // A negative hash marks a special node:
        // eh == -1: the node is a ForwardingNode and is being migrated; its find method
        //           looks the key up in nextTable.
        // eh == -2: the node is a TreeBin; its find method traverses the red-black tree.
        //           Because the tree may be rotating/recoloring, find uses a read-write lock.
        // eh >= 0:  the node heads an ordinary linked list; traverse it directly.
        else if (eh < 0)
            return (p = e.find(h, key)) != null ? p.val : null;
        while ((e = e.next) != null) { // neither the first node nor a ForwardingNode: walk the list
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}
```
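As a small usage demonstration (this demo is mine, not part of the article's source), one thread can keep writing while another keeps reading; the reader never blocks, it simply sees whatever value has been published so far:

```java
import java.util.concurrent.ConcurrentHashMap;

// Tiny demo: get() runs concurrently with put() without the caller taking any lock;
// every value the reader observes was fully published by the writer.
public class GetWithoutLockDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

        Thread writer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                map.put("counter", i); // internally synchronizes on the bin's head node
            }
        });

        Thread reader = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                Integer v = map.get("counter"); // lock-free read: null or a published value
            }
        });

        writer.start();
        reader.start();
        writer.join();
        reader.join();
        System.out.println("final value = " + map.get("counter"));
    }
}
```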
If get takes no lock, how does ConcurrentHashMap guarantee that the data it reads is not stale? Enter volatile.
Java provides the volatile keyword to guarantee visibility and ordering, but it does not guarantee atomicity.
An ordinary shared variable does not guarantee visibility: after it is modified, it is undetermined when the new value will be written back to main memory, so another thread that reads it may still see the old value.
- For a basic type, the volatile modifier keeps reads by multiple threads consistent; for a reference type such as an array or an entity bean, volatile only guarantees visibility of the reference itself, not visibility of the referenced content.
- Instruction reordering is prohibited.
Background: to improve processing speed, the processor does not operate on main memory directly; it first reads data from system memory into its internal caches (L1, L2, etc.) and operates on those, and it is unknown when the result will be written back to main memory.
- When a variable declared volatile is written, the JVM sends the processor an instruction to write the cache line containing that variable back to system memory. However, even after the write-back, other processors' caches may still hold the old value, so subsequent computations there would be wrong.
- On multiprocessor systems, a cache coherence protocol keeps the caches of all processors consistent: when one CPU writes data and finds that the variable is shared, the other CPUs are notified that their cache line for that variable is invalid, so when another CPU reads the variable and finds its cache line invalid, it reloads the value from main memory.
To summarize (a small visibility demo follows this list):
- First: using the volatile keyword forces the modified value to be written to main memory immediately;
- Second: with the volatile keyword, when thread 2 modifies the variable, the cache line holding that variable in thread 1's working memory is invalidated (at the hardware level, the corresponding cache line in the CPU's L1 or L2 cache is invalidated);
- Third: because the cache line holding the variable in thread 1's working memory is invalid, thread 1 goes back to main memory the next time it reads the value.
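Here is a minimal sketch of what that means in practice (the class and field names are made up for illustration): with volatile on the flag, the worker thread is guaranteed to observe the write; without it, the loop might spin forever on a stale cached value.

```java
// Visibility demo: the volatile write to `running` invalidates the reader's cached copy,
// so its next read goes back to main memory and sees false.
public class VolatileVisibilityDemo {
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // spin until the change to `running` becomes visible
            }
            System.out.println("worker observed running = false");
        });
        worker.start();

        Thread.sleep(100);
        running = false; // volatile write: published to main memory immediately
        worker.join();
    }
}
```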
Is the array declared volatile?
```java
/**
 * The array of bins. Lazily initialized upon first insertion.
 * Size is always a power of two. Accessed directly by iterators.
 */
transient volatile Node<K,V>[] table;
```
We know that volatile can be applied to an array field, but it does not mean what it may appear to mean. For example, declaring volatile int[] array means that the reference to the array is volatile, not that the values of the array's elements are volatile.
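A small sketch of that distinction (illustrative only, not from the JDK): the volatile modifier covers the array reference, so ordinary element reads and writes get no extra guarantee; a class such as java.util.concurrent.atomic.AtomicIntegerArray is what provides per-element volatile access. ConcurrentHashMap does not rely on the field's modifier for element reads either; in the JDK 8 source the bin head is read through tabAt, which performs a volatile array access via Unsafe.getObjectVolatile.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// volatile on the field only covers the reference; plain element access gets no guarantee.
class ArrayVolatileExample {
    volatile int[] plainArray = new int[10];                      // reference is volatile
    AtomicIntegerArray atomicArray = new AtomicIntegerArray(10);  // per-element volatile access

    void demo() {
        plainArray[0] = 1;           // plain write: no visibility guarantee for this element
        int[] snapshot = plainArray; // volatile read of the reference only
        atomicArray.set(0, 1);       // volatile write of element 0
        int v = atomicArray.get(0);  // volatile read of element 0
    }
}
```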
Node with a volatile modifier
The get operation can avoid locking because Node's element val and the pointer next are declared volatile: in a multithreaded environment, when thread A modifies a node's val or appends a new node because of a hash collision, the change is visible to thread B.
```java
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    // Note that both val and next are declared volatile
    volatile V val;
    volatile Node<K,V> next;

    Node(int hash, K key, V val, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.val = val;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return val; }
    public final int hashCode()    { return key.hashCode() ^ val.hashCode(); }
    public final String toString() { return key + "=" + val; }
    public final V setValue(V value) {
        throw new UnsupportedOperationException();
    }

    public final boolean equals(Object o) {
        Object k, v, u; Map.Entry<?,?> e;
        return ((o instanceof Map.Entry) &&
                (k = (e = (Map.Entry<?,?>)o).getKey()) != null &&
                (v = e.getValue()) != null &&
                (k == key || k.equals(key)) &&
                (v == (u = val) || v.equals(u)));
    }

    /**
     * Virtualized support for map.get(); overridden in subclasses.
     */
    Node<K,V> find(int h, Object k) {
        Node<K,V> e = this;
        if (k != null) {
            do {
                K ek;
                if (e.hash == h &&
                    ((ek = e.key) == k || (ek != null && k.equals(ek))))
                    return e;
            } while ((e = e.next) != null);
        }
        return null;
    }
}
```
Since declaring the array volatile does not help get read its elements, what is the point of adding volatile to the array?
It is there to make the new Node array visible to other threads when the table is replaced during a resize.
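A minimal sketch of that publication effect (the class name, the String[] element type, and the methods below are made up for illustration): because the field is volatile, the assignment of the new, larger array is immediately visible to readers that subsequently load the reference.

```java
// Sketch of why the array field itself is volatile: replacing the table during a resize
// is a volatile write, so readers that load `table` afterwards see the new array rather
// than a stale reference.
class TablePublication {
    private volatile String[] table = new String[16];

    void resize() {
        String[] newTable = new String[table.length * 2];
        System.arraycopy(table, 0, newTable, 0, table.length);
        table = newTable; // volatile write: publishes the new array
    }

    String get(int i) {
        String[] tab = table; // volatile read: sees the most recently published array
        return (i >= 0 && i < tab.length) ? tab[i] : null;
    }
}
```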
Summary
- In 1.8, ConcurrentHashMap's get operation needs no lock; this is one reason it is safer and more efficient than other thread-safe collections such as Hashtable or a HashMap wrapped with Collections.synchronizedMap().
- get needs no lock because Node's member val (and the next pointer) is declared volatile; this has nothing to do with the array being declared volatile.
- Declaring the array volatile mainly guarantees visibility when the array is replaced during a resize.