Java Concurrency Programming: ConcurrentHashMap, a Concurrent Container (reprint)
The following content is reproduced from:
http://www.haogongju.net/art/2350374
JDK 5 added the java.util.concurrent package, whose concurrent containers improve concurrent performance relative to the synchronized containers. A synchronized container serializes all access to the container's state; this guarantees thread safety, but at the cost of severely reduced concurrency: when multiple threads contend for the container, throughput drops sharply. Java 5.0 therefore introduced containers designed for concurrent access by multiple threads. Compared with Vector, Hashtable, and the Collections.synchronizedXxx() wrappers, the concurrent containers in java.util.concurrent solve two problems:
1) They are designed for specific scenarios, avoiding synchronized wherever possible to provide better concurrency.
2) They define some concurrency-safe compound operations, and they ensure that iteration in a concurrent environment does not go wrong.
Iteration over a java.util.concurrent container does not need to be wrapped in a synchronized block. It is guaranteed not to throw an exception, but it is not guaranteed to see the "latest, current" data on every pass: the iterators are weakly consistent.
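To make that concrete, here is a minimal sketch (class name and data are illustrative) of iterating a ConcurrentHashMap with no external synchronization; the weakly consistent iterator tolerates the concurrent put():

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentIteration {
    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        map.put("a", 1);
        map.put("b", 2);

        // No synchronized block is needed around the iteration, and the
        // concurrent modification below never causes a
        // ConcurrentModificationException. The iterator simply may or may
        // not reflect the entry added while it is running.
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            map.put("c", 3); // modify the map mid-iteration
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```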
The following is a brief introduction to concurrent containers:
ConcurrentHashMap replaces the synchronized map (Collections.synchronizedMap(new HashMap())). The synchronized map locks the whole map for every operation, whereas ConcurrentHashMap stores its entries in segments according to their hash values and locks only the segment that a given hash value maps to, which improves concurrent performance. ConcurrentHashMap also adds support for commonly needed compound operations such as "add if absent", replace, and conditional removal.
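For illustration, a minimal sketch of those compound operations via the standard ConcurrentMap interface that ConcurrentHashMap implements (names and values are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CompoundOperations {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();

        // "add if absent": an atomic check-then-act, no external lock needed
        map.putIfAbsent("hits", 0);

        // conditional replace: succeeds only if "hits" is currently mapped to 0
        boolean replaced = map.replace("hits", 0, 1);

        // conditional remove: succeeds only if "hits" is still mapped to 1
        boolean removed = map.remove("hits", 1);

        System.out.println(replaced + ", " + removed); // prints: true, true
    }
}
```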
CopyOnWriteArrayList and CopyOnWriteArraySet replace List and Set, mainly standing in for the synchronized list and synchronized set when traversal is the dominant operation. This is the idea described above: besides locking, another way to keep iteration from going wrong is to "clone" the container object, as the sketch below shows.
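A minimal sketch of that idea with CopyOnWriteArrayList (names illustrative): the iterator reads a snapshot of the backing array, so a write during iteration neither disturbs it nor throws:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteIteration {
    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<String>();
        list.add("a");
        list.add("b");

        // The for-each loop iterates over the snapshot taken when the
        // iterator was created; each add() below copies the array instead
        // of mutating it, so the loop prints only "a" and "b".
        for (String s : list) {
            list.add("c");
            System.out.println(s);
        }
    }
}
```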
ConcurrentLinkedQueue is a Queue implementation, a FIFO queue. Unlike the blocking queue implementations, its operations do not block: if the queue is empty, the operation that takes an element returns null. The queue is implemented with linked nodes (as in LinkedList), since a queue has no random-access requirement, and this gives better concurrency. A sketch of the non-blocking behavior follows.
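A minimal sketch of the non-blocking FIFO behavior (values illustrative):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class NonBlockingQueue {
    public static void main(String[] args) {
        Queue<String> queue = new ConcurrentLinkedQueue<String>();
        queue.offer("first");
        queue.offer("second");

        System.out.println(queue.poll()); // "first"  (FIFO order)
        System.out.println(queue.poll()); // "second"
        System.out.println(queue.poll()); // null: an empty queue returns
                                          // null instead of blocking
    }
}
```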
ConcurrentSkipListMap can replace SortedMap under efficient concurrency (for example, a TreeMap wrapped with Collections.synchronizedMap).
ConcurrentSkipListSet can replace SortedSet under efficient concurrency (for example, a TreeSet wrapped with Collections.synchronizedSet).
This article focuses on two concurrent containers, ConcurrentHashMap and CopyOnWriteArrayList. ConcurrentHashMap is covered below; CopyOnWriteArrayList is described in the next article.
Original link: http://www.iteye.com/topic/1103980
Everyone knows that HashMap is not thread-safe while Hashtable is, but Hashtable achieves this with synchronized, so all reading and writing threads compete for one and the same lock, which makes it very inefficient.
ConcurrentHashMap can read data without locking, and its internal structure lets write operations keep the lock granularity as fine as possible, without locking the whole ConcurrentHashMap.
The internal structure of ConcurrentHashMap
To improve its concurrency, ConcurrentHashMap internally uses a structure called Segment. A Segment is essentially a hash table in its own right: it maintains an array of linked lists. A ConcurrentHashMap, then, is an array of Segments, each of which holds an array of linked lists of entries.
From this structure we can see that locating an element takes two hash operations: the first hash locates the Segment, and the second locates the head of the linked list holding the element. The side effect of this structure is that hashing takes longer than in an ordinary HashMap, but the benefit is that a write operation only needs to lock the element's Segment and does not affect the other Segments. In the ideal case, ConcurrentHashMap can support as many simultaneous writes as it has Segments (provided the writes are spread evenly across them), so this structure greatly improves its concurrency.
Segment
Let's take a detailed look at Segment's data structure:
```java
static final class Segment<K,V> extends ReentrantLock implements Serializable {
    transient volatile int count;
    transient int modCount;
    transient int threshold;
    transient volatile HashEntry<K,V>[] table;
    final float loadFactor;
}
```
The member variables of Segment mean the following:
- count: the number of elements in the Segment
- modCount: the number of operations that affect the size of the table (such as put or remove)
- threshold: the threshold; once the number of elements in the Segment exceeds this value, the Segment is expanded
- table: an array of linked lists; each element of the array is the head of one linked list
- loadFactor: the load factor, used to compute threshold (threshold = capacity * loadFactor)
HashEntry
The elements of a Segment are stored in the linked-list array as HashEntry nodes. Here is the structure of HashEntry:
```java
static final class HashEntry<K,V> {
    final K key;
    final int hash;
    volatile V value;
    final HashEntry<K,V> next;
}
```
Notice one feature of HashEntry: except for value, all of its fields are final. This prevents the structure of the linked list from being modified after construction, so concurrent modification cannot break the list.
Initialization of ConcurrentHashMap
Let's analyze the implementation of ConcurrentHashMap concretely, alongside the source code, starting with the initialization method:
```java
public ConcurrentHashMap(int initialCapacity,
                         float loadFactor, int concurrencyLevel) {
    if (!(loadFactor > 0) || initialCapacity < 0 || concurrencyLevel <= 0)
        throw new IllegalArgumentException();

    if (concurrencyLevel > MAX_SEGMENTS)
        concurrencyLevel = MAX_SEGMENTS;

    // Find power-of-two sizes best matching arguments
    int sshift = 0;
    int ssize = 1;
    while (ssize < concurrencyLevel) {
        ++sshift;
        ssize <<= 1;
    }
    segmentShift = 32 - sshift;
    segmentMask = ssize - 1;
    this.segments = Segment.newArray(ssize);

    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    int c = initialCapacity / ssize;
    if (c * ssize < initialCapacity)
        ++c;
    int cap = 1;
    while (cap < c)
        cap <<= 1;

    for (int i = 0; i < this.segments.length; ++i)
        this.segments[i] = new Segment<K,V>(cap, loadFactor);
}
```
ConcurrentHashMap's initialization takes three parameters: initialCapacity, the initial capacity; loadFactor, the load factor; and concurrencyLevel, the number of Segments inside the ConcurrentHashMap. Once specified, concurrencyLevel is immutable: if growth in the number of elements later forces the ConcurrentHashMap to expand, it does not increase the number of Segments, only the capacity of the linked-list array inside a Segment. The benefit is that expansion never requires rehashing the whole ConcurrentHashMap, only the elements within one Segment.
The whole initialization method is quite simple. First the Segment array is sized from concurrencyLevel: the number of Segments is the smallest power of two that is not less than concurrencyLevel (capped at MAX_SEGMENTS). Since the number of Segments is always a power of two, hashing can be done with shift operations, which speeds it up. Next, the capacity of each Segment is determined from initialCapacity; each Segment's capacity is also a power of two, again to speed up hashing.
Pay special attention here to two variables, segmentShift and segmentMask, which play a big role later. If the constructor determines the number of Segments to be 2 to the power n, then segmentShift equals 32 minus n, and segmentMask equals 2 to the power n minus 1. For example, with the default concurrencyLevel of 16 we get n = 4, so segmentShift = 28 and segmentMask = 15.
The get operation of ConcurrentHashMap
As mentioned earlier, ConcurrentHashMap's get operation does not lock. Let's look at its implementation:
```java
public V get(Object key) {
    int hash = hash(key.hashCode());
    return segmentFor(hash).get(key, hash);
}
```
Note the call to segmentFor(). This function determines which Segment an operation should act on, and almost every operation of ConcurrentHashMap needs it. Here is its implementation:
```java
final Segment<K,V> segmentFor(int hash) {
    return segments[(hash >>> segmentShift) & segmentMask];
}
```
This function uses bit operations to determine the Segment: the given hash value is shifted right (unsigned) by segmentShift bits and then ANDed with segmentMask. Combining this with the segmentShift and segmentMask values discussed earlier gives the following conclusion: if the number of Segments is 2 to the power n, then an element's Segment is determined by the high n bits of the element's hash value.
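A worked example of that computation, using the values the default concurrencyLevel of 16 produces (segmentShift = 28, segmentMask = 15); the hash value itself is an arbitrary illustration:

```java
public class SegmentIndexExample {
    public static void main(String[] args) {
        int segmentShift = 28; // 32 - 4, since 16 segments = 2^4
        int segmentMask = 15;  // 16 - 1

        int hash = 0xA73F92C4; // an arbitrary example hash value
        int index = (hash >>> segmentShift) & segmentMask;

        System.out.println(index); // prints 10 (0xA): the high 4 bits of hash
    }
}
```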
After the Segment to operate on has been determined, the next step is to call that Segment's get method:
```java
V get(Object key, int hash) {
    if (count != 0) { // read-volatile
        HashEntry<K,V> e = getFirst(hash);
        while (e != null) {
            if (e.hash == hash && key.equals(e.key)) {
                V v = e.value;
                if (v != null)
                    return v;
                return readValueUnderLock(e); // recheck
            }
            e = e.next;
        }
    }
    return null;
}
```
First look at the check on count, which holds the number of elements in the Segment. Its definition is:
```java
transient volatile int count;
```
You can see that count is volatile, which takes advantage of volatile's semantics:
> A write to a volatile field happens-before every subsequent read of that same field.
Because put, remove, and the other mutating operations also update count, when contention occurs the volatile semantics guarantee that the write happens before subsequent reads, which makes the write visible to them. This allows the rest of get to observe complete element contents.
Then getFirst() is called to obtain the head of the linked list:
```java
HashEntry<K,V> getFirst(int hash) {
    HashEntry<K,V>[] tab = table;
    return tab[hash & (tab.length - 1)];
}
```
Again, a bit operation determines the list head: the hash value is ANDed with the table length minus one. Since the table length is a power of two, the result is the low n bits of the hash, where n is log2 of the table length; for example, with a table of length 16 the index is the low 4 bits of the hash.
Once the list head is determined, the whole list can be traversed. When an entry with a matching key is found, its value is read. If the value is null, the key/value pair may be in the middle of a concurrent put; in that case readValueUnderLock(e) is called to take the lock and make sure the value read is complete. If the value is not null, it is returned directly.
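For reference, readValueUnderLock() in the JDK 6 source is essentially the following: it re-reads the value while holding the Segment's lock so the transient null cannot be observed:

```java
// Segment extends ReentrantLock, so lock()/unlock() lock this Segment.
V readValueUnderLock(HashEntry<K,V> e) {
    lock();
    try {
        return e.value; // re-read under the lock; a fully linked entry's
                        // value is guaranteed to be visible here
    } finally {
        unlock();
    }
}
```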
The put operation of ConcurrentHashMap
Having seen the get operation, let's look at put. The first part of put, locating the Segment, is the same as before and is not repeated here; we go straight to the crucial Segment put method:
```java
V put(K key, int hash, V value, boolean onlyIfAbsent) {
    lock();
    try {
        int c = count;
        if (c++ > threshold) // ensure capacity
            rehash();
        HashEntry<K,V>[] tab = table;
        int index = hash & (tab.length - 1);
        HashEntry<K,V> first = tab[index];
        HashEntry<K,V> e = first;
        while (e != null && (e.hash != hash || !key.equals(e.key)))
            e = e.next;

        V oldValue;
        if (e != null) {
            oldValue = e.value;
            if (!onlyIfAbsent)
                e.value = value;
        }
        else {
            oldValue = null;
            ++modCount;
            tab[index] = new HashEntry<K,V>(key, hash, first, value);
            count = c; // write-volatile
        }
        return oldValue;
    } finally {
        unlock();
    }
}
```
First, the Segment's put operation takes the lock. Then, if the number of elements in the Segment exceeds the threshold (computed from loadFactor in the constructor), the Segment must be expanded and rehashed; you can read up on the rehashing process yourself, it is not detailed here.
The computation of index and the read of tab[index] repeat the getFirst() logic to determine the position of the list head.
The while loop then searches the list for an element with the same key as the one being put. If one is found, its value is simply updated (unless onlyIfAbsent is set); if none is found, a new HashEntry is created and placed at the head of the list, and then modCount and count are updated.
The remove operation of ConcurrentHashMap
The first part of remove, like the get and put operations before it, locates the Segment and then calls that Segment's remove method:
```java
V remove(Object key, int hash, Object value) {
    lock();
    try {
        int c = count - 1;
        HashEntry<K,V>[] tab = table;
        int index = hash & (tab.length - 1);
        HashEntry<K,V> first = tab[index];
        HashEntry<K,V> e = first;
        while (e != null && (e.hash != hash || !key.equals(e.key)))
            e = e.next;

        V oldValue = null;
        if (e != null) {
            V v = e.value;
            if (value == null || value.equals(v)) {
                oldValue = v;
                // All entries following removed node can stay
                // in list, but all preceding ones need to be cloned.
                ++modCount;
                HashEntry<K,V> newFirst = e.next;
                for (HashEntry<K,V> p = first; p != e; p = p.next)
                    newFirst = new HashEntry<K,V>(p.key, p.hash,
                                                  newFirst, p.value);
                tab[index] = newFirst;
                count = c; // write-volatile
            }
        }
        return oldValue;
    } finally {
        unlock();
    }
}
```
The remove operation also begins by locating the element to be deleted, but the deletion is not done by simply pointing the preceding element's next at the element after the deleted one. As we already said, HashEntry's next field is final and cannot be modified once assigned. So after locating the element to delete, the program copies every element in front of it, relinking each copy at the head of the remaining list, one after another.
Suppose the list originally holds the elements 1 -> 2 -> 3 -> 4 -> 5 and element 3 is deleted. The list after deletion is 2 -> 1 -> 4 -> 5: the elements behind the removed node stay in place, while the ones in front of it are cloned and come out in reverse order.
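To see why the preceding nodes come out reversed, here is a small standalone model of the cloning loop (Node is a hypothetical stand-in for HashEntry with a final next field):

```java
public class RemoveCloneModel {
    static final class Node {
        final int key;
        final Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    public static void main(String[] args) {
        // Build the list 1 -> 2 -> 3 -> 4 -> 5
        Node first = null;
        for (int k = 5; k >= 1; k--)
            first = new Node(k, first);

        // Locate the node to delete (key 3), as Segment.remove() does
        Node e = first;
        while (e.key != 3)
            e = e.next;

        // Clone everything in front of e, relinking each clone at the head;
        // this is the same loop as in remove()
        Node newFirst = e.next;
        for (Node p = first; p != e; p = p.next)
            newFirst = new Node(p.key, newFirst);

        for (Node n = newFirst; n != null; n = n.next)
            System.out.print(n.key + " "); // prints: 2 1 4 5
    }
}
```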
The size operation of ConcurrentHashMap
The previous sections all dealt with operations within a single Segment, but ConcurrentHashMap also has operations that span Segments, such as size. The size operation likewise uses a rather ingenious scheme to avoid locking all the Segments whenever possible.
We mentioned earlier that every Segment has a modCount variable representing the number of operations that affect the Segment's element count; this value only ever increases. The size operation traverses the Segments twice, recording each Segment's modCount on each traversal, and then compares the two sets of modCounts. If they are identical, no write occurred in between and the result summed during the traversal is returned; if they differ, the whole process is repeated; if they still differ, size locks all the Segments and counts them one by one. The concrete implementation can be found in the ConcurrentHashMap source and is not reproduced here, but a simplified sketch follows.
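A simplified sketch of that strategy, for illustration only; the real JDK 6 code differs in details (it caps the retries with a RETRIES_BEFORE_LOCK constant and folds the passes together more compactly):

```java
// Simplified sketch of the size() strategy, not the actual JDK source.
public int size() {
    final Segment<K,V>[] segments = this.segments;
    for (int retries = 0; retries < 2; retries++) {
        int[] mc = new int[segments.length];
        long sum = 0;
        // Pass 1: sum the counts and record each Segment's modCount
        for (int i = 0; i < segments.length; i++) {
            sum += segments[i].count;
            mc[i] = segments[i].modCount;
        }
        // Pass 2: if no modCount changed, the sum is a consistent snapshot
        boolean clean = true;
        for (int i = 0; i < segments.length; i++)
            if (mc[i] != segments[i].modCount)
                clean = false;
        if (clean)
            return (int) Math.min(sum, Integer.MAX_VALUE);
    }
    // Fall back: lock every Segment, count, then unlock
    for (int i = 0; i < segments.length; i++)
        segments[i].lock();
    try {
        long sum = 0;
        for (int i = 0; i < segments.length; i++)
            sum += segments[i].count;
        return (int) Math.min(sum, Integer.MAX_VALUE);
    } finally {
        for (int i = 0; i < segments.length; i++)
            segments[i].unlock();
    }
}
```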
Two more articles on the principles of ConcurrentHashMap:
Details of the implementation of ConcurrentHashMap: http://www.iteye.com/topic/344876
Talking Concurrency (4): An in-depth analysis of ConcurrentHashMap: http://ifeve.com/ConcurrentHashMap/