Java Collections: ConcurrentHashMap Source Code Analysis


Note: This article analyzes the ConcurrentHashMap source from JDK 8.

ConcurrentHashMap is an upgraded version of HashMap: HashMap is not a thread-safe collection, while ConcurrentHashMap supports concurrent operations.
HashMap is one of the collections we use most in everyday development; ConcurrentHashMap, even if used less often, is certainly something you have heard a lot about.
In JDK 1.8, HashMap is implemented with an array plus linked lists plus red-black trees, and so is ConcurrentHashMap. From the many algorithms we have encountered before, we know that thread safety can be achieved either by explicitly using locks, which keeps the code very simple, or by using lock-free CAS algorithms, which makes the code more complex but gives better concurrent performance. Obviously, for better concurrency, ConcurrentHashMap chooses CAS, so its code is relatively complex.
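To make the CAS idea concrete before diving into the source, here is a minimal retry-loop sketch using AtomicInteger. The class and method names are mine, not from the JDK; the real ConcurrentHashMap performs its CAS through Unsafe internally, but the retry pattern is the same one seen later in putVal.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    static final AtomicInteger counter = new AtomicInteger(0);

    // The CAS primitive underlying lock-free updates: the write succeeds
    // only if the current value is still the one we read; otherwise we
    // retry (the "start over" loops seen later in putVal).
    static void increment() {
        for (;;) {
            int cur = counter.get();
            if (counter.compareAndSet(cur, cur + 1))
                return; // CAS succeeded
            // else: another thread changed the value first; retry
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) increment();
        System.out.println(counter.get()); // 5
    }
}
```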
Since ConcurrentHashMap and HashMap use the same data structure, it is best to understand HashMap before analyzing ConcurrentHashMap, which makes it easier to follow; for HashMap, refer to the earlier article: Java Collection HashMap Source Code Analysis.

Data Structure


Above is the logical storage structure of ConcurrentHashMap: a hash table (table) for hashing, where each slot of the table may hold a linked list or a red-black tree.
From this logical structure, a simple analysis tells us that operations on data at different slots of the table (linked lists or trees) are independent of each other, i.e. naturally thread-safe, while operations on data at the same slot need synchronization; in addition, resizing the table also needs synchronization. This is only a rough analysis and not completely accurate; below we look step by step at what ConcurrentHashMap actually does.

Linked list node – Node

Node is defined as follows:
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;          // volatile to ensure visibility
    volatile Node<K,V> next;

    Node(int hash, K key, V val, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.val = val;
        this.next = next;
    }

    // changing the value through setValue is not allowed
    public final V setValue(V value) {
        throw new UnsupportedOperationException();
    }
    ... // some methods omitted
}

Node is ConcurrentHashMap's storage unit and is similar to the Node in HashMap, but here val and next are volatile to guarantee visibility across threads.

The following are definitions of some of the attributes in ConcurrentHashMap; some of them are the same as in HashMap.

/**
 * The array of bins. Lazily initialized upon the first insertion.
 * Size is always a power of two. Accessed directly by iterators.
 */
transient volatile Node<K,V>[] table;   // the hash table

/**
 * The next table to use; non-null only while resizing.
 */
private transient volatile Node<K,V>[] nextTable;   // transition table used during expansion

/**
 * The largest possible table capacity. This value must be
 * exactly 1<<30 to stay within Java array allocation and indexing
 * bounds for power-of-two table sizes, and is further required
 * because the top two bits of 32-bit hash fields are used for control.
 */
// the maximum size of the hash table; must be a power of two
private static final int MAXIMUM_CAPACITY = 1 << 30;

/**
 * The default initial table capacity. Must be a power of 2
 * (i.e., at least 1) and at most MAXIMUM_CAPACITY.
 */
// the initial default size of the hash table
private static final int DEFAULT_CAPACITY = 16;

/**
 * The load factor for this table. Overrides of this value in
 * constructors affect only the initial table capacity. The
 * actual floating-point value isn't normally used -- it is
 * simpler to use expressions such as {@code n - (n >>> 2)} for
 * the associated resizing threshold.
 */
private static final float LOAD_FACTOR = 0.75f;

/**
 * The bin count threshold for using a tree rather than a list for a
 * bin. Bins are converted to trees when adding an element to a
 * bin with at least this many nodes. The value must be greater
 * than 2, and should be at least 8 to mesh with assumptions about
 * removal back to plain bins upon shrinkage.
 */
// list-to-red-black-tree threshold
static final int TREEIFY_THRESHOLD = 8;

/**
 * The bin count threshold for untreeifying a (split) bin during a
 * resize operation. Should be less than TREEIFY_THRESHOLD, and at
 * most 6 to mesh with shrinkage detection under removal.
 */
// red-black-tree-to-list threshold
static final int UNTREEIFY_THRESHOLD = 6;

/**
 * The smallest table capacity for which bins may be treeified.
 * (Otherwise the table is resized if there are too many nodes in a bin.)
 * The value should be at least 4 * TREEIFY_THRESHOLD to avoid
 * conflicts between resizing and treeification thresholds.
 */
// minimum table capacity before lists are converted to red-black trees
static final int MIN_TREEIFY_CAPACITY = 64;

/**
 * Minimum number of rebinnings per transfer step. Ranges are
 * subdivided to allow multiple resizer threads. This value
 * serves as a lower bound to avoid resizers encountering
 * excessive memory contention. The value should be at least
 * DEFAULT_CAPACITY.
 */
// data-moving step size used during expansion
// (the following attributes are used for expansion or to control the sizeCtl variable)
private static final int MIN_TRANSFER_STRIDE = 16;

/**
 * The number of bits used for the generation stamp in sizeCtl.
 * Must be at least 6 for 32-bit arrays.
 */
private static int RESIZE_STAMP_BITS = 16;

/**
 * The maximum number of threads that can help resize.
 * Must fit in 32 - RESIZE_STAMP_BITS bits.
 */
private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;

/**
 * The bit shift for recording the size stamp in sizeCtl.
 */
private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

/*
 * Encodings for Node hash fields. See above for explanation.
 */
static final int MOVED     = -1; // indicates a ForwardingNode
static final int TREEBIN   = -2; // indicates a TreeBin node
static final int HASH_BITS = 0x7fffffff; // usable bits of normal node hash, used to generate hash values

For the attributes above, a general understanding is enough here; we will analyze them together with the code below.

Red-black tree node – TreeNode

static final class TreeNode<K,V> extends Node<K,V> {
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;

    TreeNode(int hash, K key, V val, Node<K,V> next,
             TreeNode<K,V> parent) {
        super(hash, key, val, next);
        this.parent = parent;
    }
    ...
}

TreeNode is the node used to build a red-black tree, but ConcurrentHashMap's TreeNode differs somewhat from HashMap's. In HashMap, when a slot of the hash table stores a tree, the slot holds the TreeNode-typed root directly. ConcurrentHashMap is different: its hash table stores the tree wrapped in a TreeBin, that is, a TreeBin object rather than a TreeNode object. TreeBin carries a read-write lock: when the tree needs to be restructured, it must be locked to guarantee thread safety.

TreeBin object

/**
 * TreeNodes used at the heads of bins. TreeBins do not hold user
 * keys or values, but instead point to a list of TreeNodes and
 * their root. They also maintain a parasitic read-write lock
 * forcing writers (who hold the bin lock) to wait for readers
 * (who do not) to complete before restructuring operations.
 */
static final class TreeBin<K,V> extends Node<K,V> {
    TreeNode<K,V> root;           // the root
    volatile TreeNode<K,V> first; // head of the tree's chained structure
    volatile Thread waiter;       // waiting thread
    volatile int lockState;       // lock state
    // values for lockState
    static final int WRITER = 1;  // set while holding write lock
    static final int WAITER = 2;  // set when waiting for write lock
    static final int READER = 4;  // increment value for setting read lock
    ...
}

We will analyze the concrete usage later together with the code.

Transition node – ForwardingNode

/**
 * A node inserted at the head of bins during transfer operations.
 */
static final class ForwardingNode<K,V> extends Node<K,V> {
    final Node<K,V>[] nextTable;
    ForwardingNode(Node<K,V>[] tab) {
        super(MOVED, null, null, null); // its hash value is the MOVED marker
        this.nextTable = tab;
    }
}

ForwardingNode is the transition node used while the hash table is being expanded. When the table's data is being transferred, it would certainly be bad for other threads to keep adding data to the original table, so the ForwardingNode is introduced: during the transfer, when a slot of the original table is empty (or already moved), a ForwardingNode is stored there, indicating that the hash table is in the data-transfer phase of an expansion. Other threads that encounter a ForwardingNode while operating thus know the current state of the table and can help participate in the expansion.

At this point, the important data structures of ConcurrentHashMap are basically covered: the hash table (table), the linked list node Node, and the red-black tree node TreeNode.

Construction Methods

1. No-arg constructor

/**
 * Creates a new, empty map with the default initial table size.
 */
public ConcurrentHashMap() {
}

There is nothing in it; the hash table is initialized when data is first put.
2. Specifying the hash table size

public ConcurrentHashMap(int initialCapacity) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException();
    int cap = ((initialCapacity >= (MAXIMUM_CAPACITY >>> 1)) ?
               MAXIMUM_CAPACITY :
               tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1));
    this.sizeCtl = cap;
}

/**
 * Returns a power of two table size for the given desired capacity.
 * See Hackers Delight, sec 3.2
 */
// ensures that the table size is always a power of two
private static final int tableSizeFor(int c) {
    int n = c - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
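The bit-smearing trick in tableSizeFor can be checked in isolation. The following standalone sketch (the class name is mine) reproduces the method and shows a few results. Note that the constructor above does not use initialCapacity directly: new ConcurrentHashMap<>(16) computes tableSizeFor(16 + 8 + 1) = 32.

```java
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same trick as ConcurrentHashMap.tableSizeFor: smear the highest
    // set bit of (c - 1) into all lower positions, then add one, which
    // yields the next power of two >= c.
    static int tableSizeFor(int c) {
        int n = c - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
        System.out.println(tableSizeFor(25)); // 32  (what new ConcurrentHashMap<>(16) stores in sizeCtl)
    }
}
```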

3. Initializing from another map

public ConcurrentHashMap(Map<? extends K, ? extends V> m) {
    this.sizeCtl = DEFAULT_CAPACITY;
    putAll(m);
}

There are other constructors that you can look at yourself; they are not all enumerated here.

Putting Data

public V put(K key, V value) {
    return putVal(key, value, false);
}

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    // compute the hash value
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        // initialize the table if needed
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        // compute the index in the table from the hash value;
        // if there is no data at that slot, try to put directly
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // CAS the new node into the table; if it succeeds,
            // this put is basically finished
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        // MOVED as the node state at this slot means the table is in the
        // middle of expanding and moving data
        else if ((fh = f.hash) == MOVED)
            // help with the expansion
            tab = helpTransfer(tab, f);
        else {
            // the slot holds data: it may be a linked list or a tree
            V oldVal = null;
            synchronized (f) {           // lock the bin head to ensure thread safety
                // after locking, proceed only if the slot still holds
                // the same node we locked
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {       // hash value >= 0 means a linked list
                        binCount = 1;
                        // traverse the list
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            // if the same key exists, overwrite its value
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            // the key is not present: append new data
                            // at the end of the list
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    // the slot is a red-black tree, stored as a TreeBin
                    // object (note: TreeBin, not TreeNode)
                    else if (f instanceof TreeBin) {
                        Node<K,V> p;
                        binCount = 2;
                        // add the data to the red-black tree via the
                        // putTreeVal method in TreeBin
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                              value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            // data added; need to check
            if (binCount != 0) {
                // binCount tracks the list length; if it exceeds the
                // threshold, convert the list at index i to a red-black tree
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    // increase the ConcurrentHashMap's count by 1 and check whether
    // an expansion is needed
    addCount(1L, binCount);
    return null;
}

Let's first comb through the rough logic:
1. Compute the key's hash value:

int hash = spread(key.hashCode());

static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}
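The spread function can be verified in isolation. The sketch below (the class name is mine) reproduces it and shows that the result is always non-negative, since negative hashes are reserved for the MOVED and TREEBIN control nodes:

```java
public class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff;

    // XOR the high 16 bits into the low 16 bits, so bin indexing with
    // (n - 1) & hash sees the high bits too, then clear the sign bit
    // so normal node hashes are always non-negative.
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    public static void main(String[] args) {
        System.out.println(spread(-1) >= 0);             // true
        // with table size n = 16, the bin index uses the low bits:
        System.out.println((16 - 1) & spread(0x12345678)); // 12
    }
}
```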

2. If the table (hash table) has not been initialized, initialize it.
3. Use the hash value to get the index i into the table; if there is no data at that slot, store the data there directly via CAS.
4. If there is data at that slot and its hash value is MOVED, the table is being expanded, so help with the expansion and data moving.
5. If the data at that slot is valid (real stored data), lock the slot, then perform the following actions.
6. If the slot holds a linked list, traverse it; if the key exists, decide according to onlyIfAbsent whether to overwrite the value; if it does not exist, append to the end of the list.
7. If the slot holds a tree structure, perform the insert on the tree.
8. If the data was added to a linked list, check whether the list length exceeds the threshold; if so, convert the list to a red-black tree.
9. If any of the steps above fails under multithreading (preempted by a change from another thread), restart from step 2.
10. Increase the map's count and check whether an expansion is needed (addCount).
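The onlyIfAbsent flag of step 6 surfaces in the public API as put versus putIfAbsent, and the null check from the first line of putVal is observable too. A small usage sketch (class and method names are mine):

```java
import java.util.concurrent.ConcurrentHashMap;

public class OnlyIfAbsentDemo {
    // put(...) calls putVal(key, value, false): an existing mapping is overwritten.
    static int afterPut() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("a", 2);
        return map.get("a");
    }

    // putIfAbsent(...) calls putVal(key, value, true): the existing value wins.
    static int afterPutIfAbsent() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.putIfAbsent("a", 99);
        return map.get("a");
    }

    // null keys or values are rejected up front, unlike in HashMap.
    static boolean rejectsNullKey() {
        try {
            new ConcurrentHashMap<String, Integer>().put(null, 1);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(afterPut());          // 2
        System.out.println(afterPutIfAbsent());  // 1
        System.out.println(rejectsNullKey());    // true
    }
}
```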

Overall, the whole flow is clear and follows the same logic as HashMap. Because ConcurrentHashMap must guarantee thread safety, operations subject to contention use CAS or locking, and when an operation fails it must start over. With this general understanding, we next analyze the specific steps one by one.

Initialize Table

Under multithreading, we must ensure that the initialization of the table is performed only once.

/**
 * Initializes table, using the size recorded in sizeCtl.
 */
private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        // sizeCtl < 0 means another thread is initializing the table;
        // wait for the initialization to complete
        if ((sc = sizeCtl) < 0)
            // yield the CPU
            Thread.yield(); // lost initialization race; just spin
        // about to perform the initialization: CAS sizeCtl to -1
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                    table = tab = nt;
                    sc = n - (n >>> 2); // 0.75 * n
                }
            } finally {
                // no CAS needed here: the code above guarantees that only
                // one thread can be performing the initialization
                sizeCtl = sc; // sizeCtl set to 0.75 * n
            }
            break;
        }
    }
    return tab;
}

sizeCtl defaults to 0; if ConcurrentHashMap is instantiated with a capacity argument, sizeCtl is a power of two. The thread that performs the first put executes the Unsafe.compareAndSwapInt method to set sizeCtl to -1, and only one thread can succeed; the other threads give up their CPU time slice via Thread.yield() and wait for the table initialization to complete. The whole process is clear.
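The one-shot initialization pattern can be sketched with AtomicInteger standing in for the Unsafe-based CAS on sizeCtl. All names here are mine; this is a simplified imitation of initTable, not the JDK code:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.CountDownLatch;

public class InitRaceDemo {
    // Mimics initTable's use of sizeCtl: -1 marks "initializing", so a
    // CAS from the read value to -1 lets exactly one thread win the race.
    static final AtomicInteger sizeCtl = new AtomicInteger(0);
    static volatile Object[] table;
    static final AtomicInteger initCount = new AtomicInteger(0);

    static Object[] initTable() {
        while (table == null) {
            int sc = sizeCtl.get();
            if (sc < 0)
                Thread.yield();              // lost the race; just spin
            else if (sizeCtl.compareAndSet(sc, -1)) {
                try {
                    if (table == null) {
                        table = new Object[16];
                        initCount.incrementAndGet();
                        sc = 16 - (16 >>> 2); // 0.75 * n = 12
                    }
                } finally {
                    sizeCtl.set(sc);          // publish the threshold
                }
            }
        }
        return table;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 8;
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++)
            new Thread(() -> { initTable(); done.countDown(); }).start();
        done.await();
        System.out.println(initCount.get()); // 1: exactly one thread initialized
        System.out.println(sizeCtl.get());   // 12
    }
}
```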
The table-expansion part will be analyzed a little later; first let's lay some groundwork.

Adding data to a linked list or tree

The content of the synchronized block in putVal adds the data to the linked list or red-black tree; the specific code is not posted again here, see above.
Among the attribute definitions earlier there is this one: if a slot holds a red-black tree, then its root node stored in the table (the TreeBin) has the hash value

static final int TREEBIN = -2; // hash for roots of trees

Thus, by looking at the hash value of the node at a slot in the table, a negative value means a red-black tree, otherwise a linked list structure.
Adding the data to the linked list should be very simple and easy to understand; binCount is used to track the length of the list, and its use is quite ingenious.

If the slot holds a tree, the table stores a TreeBin object. TreeBin keeps a reference to the root and at the same time carries a series of red-black tree operations, so the TreeBin object's add method (putTreeVal) is invoked directly. Red-black tree operations are not covered in this article; if you are interested or not very clear about them, you can read the other two articles, which introduce red-black trees in detail: the one on drawing red-black trees, and the Java collection TreeMap source code analysis.

Converting the list to a red-black tree

When adding data to a linked list, if the length of the list exceeds the threshold, the list is converted to a red-black tree.

/**
 * Replaces all linked nodes in bin at given index unless table is
 * too small, in which case resizes instead.
 */
private final void treeifyBin(Node<K,V>[] tab, int index) {
    Node<K,V> b; int n, sc;
    if (tab != null) {
        // if the table's capacity does not meet the threshold required
        // for converting a list to a red-black tree, enlarge the table instead
        if ((n = tab.length) < MIN_TREEIFY_CAPACITY)
            tryPresize(n << 1);
        else if ((b = tabAt(tab, index)) != null && b.hash >= 0) {
            // lock the slot in the table
            synchronized (b) {
                if (tabAt(tab, index) == b) {
                    TreeNode<K,V> hd = null, tl = null;
                    // traverse the list and build a TreeNode list
                    for (Node<K,V> e = b; e != null; e = e.next) {
                        TreeNode<K,V> p =
                            new TreeNode<K,V>(e.hash, e.key, e.val,
                                              null, null);
                        if ((p.prev = tl) == null)
                            hd = p;
                        else
                            // use TreeNode's next field to link the TreeNode list
                            tl.next = p;
                        tl = p;
                    }
                    // new TreeBin<K,V>(hd) turns the TreeNode list into a
                    // red-black tree; replace the original list head at this
                    // slot of the table with the TreeBin object
                    setTabAt(tab, index, new TreeBin<K,V>(hd));
                }
            }
        }
    }
}

The rough process:
1. If the table's capacity does not meet the threshold required for converting a list to a red-black tree, enlarge the table instead.
2. Lock the slot in the table, traverse the linked list, and turn the Node list into a TreeNode list.
3. Through the TreeBin object, build the red-black tree structure from the TreeNode list; the root is stored in the TreeBin object.
4. Store the TreeBin object at the slot that originally held the list head; the conversion from list to red-black tree is complete.
A TreeNode list is built before the linked list is converted to a red-black tree. This way, after the red-black tree is constructed, it can be traversed not only through the tree structure but also through the chain between TreeNodes, which preserves the relationship between the initial linked list and the red-black tree. This structure also exists in HashMap, and there are related diagrams (in the last part); refer to: Java Collection HashMap Source Code Analysis.
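Treeification itself is internal and cannot be observed through the public API, but its correctness can be: the sketch below (all names are mine) forces every key into one bin, so the bin grows past TREEIFY_THRESHOLD (8) in a table at least MIN_TREEIFY_CAPACITY (64) large, and checks that lookups still work:

```java
import java.util.concurrent.ConcurrentHashMap;

public class CollisionDemo {
    // A deliberately terrible key: every instance hashes to the same bin.
    // Implementing Comparable lets the tree order colliding keys.
    static final class BadKey implements Comparable<BadKey> {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int compareTo(BadKey o) { return Integer.compare(id, o.id); }
    }

    // Capacity 128 keeps the table above MIN_TREEIFY_CAPACITY, so the
    // colliding bin is treeified rather than triggering a resize.
    static ConcurrentHashMap<BadKey, Integer> build(int n) {
        ConcurrentHashMap<BadKey, Integer> map = new ConcurrentHashMap<>(128);
        for (int i = 0; i < n; i++)
            map.put(new BadKey(i), i);
        return map;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<BadKey, Integer> map = build(50);
        System.out.println(map.size());              // 50
        System.out.println(map.get(new BadKey(17))); // 17
    }
}
```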

When a TreeBin object is constructed, the red-black tree is built from the TreeNode list passed in; you can look at TreeBin's constructor, which is not posted here.

Expansion

When the table's capacity is insufficient, that is, when the number of elements in the table reaches the capacity threshold sizeCtl, the table needs to be enlarged.
The whole expansion has two parts: build a nextTable twice the size of table, then copy the data from table into nextTable.

Expansion is a highlight in both HashMap and ConcurrentHashMap. ConcurrentHashMap supports concurrent inserts, and there are two possible ways to handle expansion. One is like table initialization: the entire process is controlled by a single thread. That is certainly easy to implement, but it hurts performance; when the amount of data is large, moving it becomes a heavy operation, and the perfection-pursuing JDK would certainly not do it that way. Building nextTable must be done by exactly one thread, but copying data from table into nextTable can be done concurrently, which makes the implementation more complex. Lock-free thread-safe algorithms often employ one idea: helping.

The following passage was already elaborated in the SynchronousQueue analysis:
To preserve the integrity of a data structure without using locks, we must ensure that other threads can determine not only whether the first thread has completed its update or is still midway through it, but also what action is required to complete the update if the first thread is still operating. If a thread finds a data structure in the middle of an update, it can "help" the thread performing the update to finish before doing its own operation. When the first thread comes back to try to complete its own update, it will find that this is no longer needed and can simply return.

As a result, when the table's data is copied, other threads can participate in the copying together, which greatly improves efficiency; at the same time it must be guaranteed that the data structure is not corrupted. Let's look step by step at how this process is achieved.

To analyze expansion, we start from the addCount method: whenever data is added, the map's size count is incremented, and the table is checked for whether it needs to be expanded.
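Before walking through addCount, here is a stress sketch (names are mine) that forces many expansions under concurrent writers, relying on the helping mechanism described above; from the outside we can only verify that no entry is lost:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class ResizeStressDemo {
    // Start from the minimum capacity so inserts force repeated resizes
    // while several threads are writing; a writer that hits a
    // ForwardingNode helps with the transfer instead of blocking.
    static int stress(int threads, int perThread) {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;        // disjoint key ranges
            new Thread(() -> {
                for (int i = 0; i < perThread; i++)
                    map.put(base + i, base + i);
                done.countDown();
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(stress(8, 10000)); // 80000: no entry lost
    }
}
```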
