Lock-free HashMap in Java: principles and implementation in detail

Source: Internet
Author: User
Tags cas rehash

There are several options for using a HashMap in a multithreaded Java environment:

Use the thread-safe java.util.Hashtable as an alternative.
Use the java.util.Collections.synchronizedMap method to wrap an existing HashMap object so that it becomes thread-safe.
Use the java.util.concurrent.ConcurrentHashMap class as an alternative; it performs very well.

However, all of these approaches rely, to a greater or lesser degree, on mutual exclusion locks in their implementation details. Mutual exclusion locks block threads, reduce throughput, and can lead to problems such as deadlock and priority inversion.
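To make the three options concrete, here is a minimal sketch that builds each kind of map and increments one key from several threads. The class name ThreadSafeMapOptions and the helper hammer are hypothetical names chosen for this illustration; the map classes and Map.merge are standard JDK APIs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ThreadSafeMapOptions {
    // Option 1: Hashtable synchronizes every method on the table itself.
    static Map<String, Integer> hashtable() {
        return new Hashtable<>();
    }

    // Option 2: wrap a plain HashMap; every call goes through one shared lock.
    static Map<String, Integer> synchronizedMap() {
        return Collections.synchronizedMap(new HashMap<>());
    }

    // Option 3: ConcurrentHashMap, designed for concurrent access.
    static Map<String, Integer> concurrentHashMap() {
        return new ConcurrentHashMap<>();
    }

    // Hypothetical helper: increments one key from several threads
    // and returns the final value stored under that key.
    static int hammer(Map<String, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.merge("hits", 1, Integer::sum); // atomic in all three maps
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("hits");
    }
}
```

With 4 threads and 1000 increments each, all three maps should end with the value 4000 under the key; they differ in how much the threads block each other on the way there.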

CAS (Compare and Swap) is a primitive provided by the underlying hardware that atomically checks whether a value is the expected one and, if so, replaces it with a new value.

Atomic operations in Java

In the java.util.concurrent.atomic package, Java provides a number of convenient atomic types that are built on CAS operations.

For example, if we want to implement a globally common counter, we can:

    import java.util.concurrent.atomic.AtomicInteger;

    private AtomicInteger counter = new AtomicInteger(3);

    public void addCounter() {
        for (;;) {
            int oldValue = counter.get();
            int newValue = oldValue + 1;
            // Succeeds only if no other thread changed the counter in between.
            if (counter.compareAndSet(oldValue, newValue))
                return;
        }
    }

Here the compareAndSet method checks whether the current value of counter is oldValue. If it is, it sets counter to newValue, the operation succeeds, and the method returns true; otherwise the operation fails and the method returns false.

If another thread changes the counter between the read of the old value and the update, compareAndSet fails. All we need is an outer loop that keeps retrying this process, and eventually the counter will be incremented by 1. (In fact, AtomicInteger already defines the incrementAndGet and decrementAndGet methods for the common +1/-1 operations, so from here on we can simply call those.)
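The following sketch puts the hand-written retry loop next to the built-in incrementAndGet and runs both from several threads. The class name CasCounter and the helper run are hypothetical; the starting value 3 mirrors the snippet above.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasCounter {
    private final AtomicInteger counter = new AtomicInteger(3);

    // Hand-written CAS retry loop, as described in the text.
    void addCounter() {
        for (;;) {
            int oldValue = counter.get();
            if (counter.compareAndSet(oldValue, oldValue + 1)) {
                return;
            }
        }
    }

    // Library method with the same effect.
    void addCounterBuiltIn() {
        counter.incrementAndGet();
    }

    int value() {
        return counter.get();
    }

    // Hypothetical driver: each thread performs both kinds of increment.
    static int run(int threads, int perThread) {
        CasCounter c = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.addCounter();
                    c.addCounterBuiltIn();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c.value();
    }
}
```

CasCounter.run(4, 1000) returns 8003: the initial 3 plus 2 increments per iteration across 4000 iterations, with no update lost.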

In addition to AtomicInteger, the java.util.concurrent.atomic package also provides the AtomicReference and AtomicReferenceArray types, which represent an atomic reference and an array of atomic references, respectively.
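A small sketch of both types follows. The class name AtomicRefDemo is hypothetical. Note that compareAndSet on references compares by identity (==), not by equals; the example relies on string literals being interned.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

class AtomicRefDemo {
    static String demo() {
        AtomicReference<String> ref = new AtomicReference<>("a");
        boolean first = ref.compareAndSet("a", "b");   // expected value matches
        boolean second = ref.compareAndSet("a", "c");  // value is now "b", so this fails

        // All slots of a new AtomicReferenceArray start out as null.
        AtomicReferenceArray<String> slots = new AtomicReferenceArray<>(4);
        boolean claimed = slots.compareAndSet(2, null, "x"); // claims the empty slot
        boolean again = slots.compareAndSet(2, null, "y");   // slot is already taken

        return first + " " + second + " " + ref.get() + " "
                + claimed + " " + again + " " + slots.get(2);
    }
}
```

AtomicRefDemo.demo() returns "true false b true false x".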

Implementing a lock-free linked list
Before implementing the lock-free HashMap, let's look at the simpler case of a lock-free linked list.

Take the insert operation for example:

First, find node A immediately before the insertion point and node B immediately after it.
Then create a new node C and make its next pointer point to node B. (See Figure 1)
Finally, make node A's next pointer point to node C. (See Figure 2)

In the middle of this operation, however, another thread may insert a node (say D) directly between A and B. If we do not check for this, the node inserted by the other thread may be lost. (See Figure 3) We can use a CAS operation to check that A's next pointer still points to B at the moment we update it, and retry the whole insertion if A's next pointer has changed. The approximate code is as follows:

    private void listInsert(Node head, Node c) {
        for (;;) {
            // findInsertionPlace is not shown here; it locates node A.
            Node a = findInsertionPlace(head), b = a.next.get();
            c.next.set(b);
            // Link C in only if A still points to B; otherwise retry.
            if (a.next.compareAndSet(b, c))
                return;
        }
    }

(The next field of the Node class has type AtomicReference<Node>, that is, an atomic reference to a Node.)

A lookup in the lock-free linked list is no different from a lookup in an ordinary linked list. To delete a node, find node A immediately before it and node B immediately after it, then use a CAS operation to verify A's next pointer and update it to point to B.
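The insertion described above can be sketched as a small runnable class. This is an illustration under assumptions, not the author's implementation: the names LockFreeSortedList, snapshot and demo are hypothetical, keys are plain ints kept in ascending order, and removal is left out. The sketch searches inline and keeps the successor it observed during the search as B, so the CAS validates exactly the pair that was compared against the new key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class LockFreeSortedList {
    static final class Node {
        final int key;
        final AtomicReference<Node> next = new AtomicReference<>();
        Node(int key) { this.key = key; }
    }

    // Sentinel head, so every real node has a predecessor.
    private final Node head = new Node(Integer.MIN_VALUE);

    void insert(int key) {
        Node c = new Node(key);
        for (;;) {
            // A = last node with key <= new key, B = the successor seen after A.
            Node a = head;
            Node b = a.next.get();
            while (b != null && b.key <= key) {
                a = b;
                b = a.next.get();
            }
            c.next.set(b);
            // Publish C only if A still points to the B we examined; else retry.
            if (a.next.compareAndSet(b, c)) {
                return;
            }
        }
    }

    // Reads the keys in list order (meant to be called once writers are done).
    List<Integer> snapshot() {
        List<Integer> keys = new ArrayList<>();
        for (Node n = head.next.get(); n != null; n = n.next.get()) {
            keys.add(n.key);
        }
        return keys;
    }

    // Hypothetical driver: threads insert interleaved, disjoint keys.
    static List<Integer> demo(int threads, int perThread) {
        LockFreeSortedList list = new LockFreeSortedList();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.insert(i * threads + offset);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.snapshot();
    }
}
```

After demo(4, 500) the snapshot should contain the keys 0 to 1999 in ascending order, which shows that no concurrent insertion was lost or misplaced.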

Difficulties and breakthroughs in a lock-free HashMap
A HashMap has four basic operations: insert, delete, find, and rehash. A typical HashMap implementation uses an array in which each element is a linked list of nodes. For these lists we can use the operations described above to insert, delete, and find, but rehash is more difficult.

As shown in Figure 4, a typical step during rehash is to traverse each node in the old table, compute its position in the new table, and move it there. Moving a node requires changing three pointers:

A's next pointer must point to D
B's next pointer must point to C
C's next pointer must point to E
These three pointer updates must happen at the same time for the move to be atomic. But a single CAS operation can only validate and update one variable atomically, so it cannot validate and update three pointers at once.

So let's change the approach: since moving nodes is so difficult, we can keep all nodes in one permanently ordered list and avoid moving them at all. In a typical HashMap implementation the array length is always a power of two, 2^i, and mapping a hash value to an array index is simply a modulo operation (that is, keeping only the lowest i bits of the hash). When a rehash doubles the array length to 2^(i+1), every node in the j-th linked list of the old array either stays at index j in the new array or moves to index j + 2^i. The only thing that decides between the two is the (i+1)-th lowest bit of the hash value: if it is 0 the node stays at index j, otherwise it goes to index j + 2^i.
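The index arithmetic can be checked with a few lines of code. The class and method names below (BucketSplit, index, indexAfterDoubling) are hypothetical.

```java
class BucketSplit {
    // The array length is a power of two, so "hash mod length" keeps the low bits.
    static int index(int hash, int length) {
        return hash & (length - 1);
    }

    // Index after the array doubles from 2^i to 2^(i+1) entries:
    // either the old index j, or j + 2^i, decided by one bit of the hash.
    static int indexAfterDoubling(int hash, int i) {
        int j = index(hash, 1 << i);
        int decidingBit = (hash >>> i) & 1; // the (i+1)-th lowest bit
        return j + (decidingBit << i);
    }
}
```

For example, with i = 3 the hashes 3, 11 and 27 all map to index 3 in an array of 8 entries. After doubling to 16 entries, 3 stays at index 3 while 11 and 27 go to index 3 + 8 = 11.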

As shown in Figure 5, we arrange all nodes in ascending order of the bit-reversed hash value (for example, 1101->1011). When the array size is 8, 2 and 18 fall into one group, and 3, 11, and 27 into another. At the start of each group we insert a sentinel node to simplify later operations. To make sure the sentinel sorts first in its group, we set the highest bit of every normal node's hash (which becomes the lowest bit after reversal) to 1, while sentinel nodes leave it at 0.

When the array grows to 16 (see Figure 6), the second group splits into a group containing only 3 and a group containing 11 and 27, but the relative order of the nodes does not change. So a rehash does not need to move any nodes.
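The ordering can be reproduced with Integer.reverse. In this sketch the names SplitOrderDemo, regularKey, sentinelKey and order are hypothetical, sentinels are printed as "S" followed by their bucket index, and keys are compared as unsigned 32-bit values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class SplitOrderDemo {
    // Normal node: reverse the bits, then force the lowest bit to 1.
    static int regularKey(int hash) {
        return Integer.reverse(hash) | 1;
    }

    // Sentinel node: reversed bucket index, lowest bit stays 0.
    static int sentinelKey(int bucket) {
        return Integer.reverse(bucket);
    }

    // Returns labels of all nodes sorted by their unsigned split-order key.
    static List<String> order(int[] hashes, int[] buckets) {
        TreeMap<Long, String> sorted = new TreeMap<>();
        for (int b : buckets) {
            sorted.put(Integer.toUnsignedLong(sentinelKey(b)), "S" + b);
        }
        for (int h : hashes) {
            sorted.put(Integer.toUnsignedLong(regularKey(h)), String.valueOf(h));
        }
        return new ArrayList<>(sorted.values());
    }
}
```

With hashes 27, 3, 18, 11, 2 and sentinels for buckets 0, 2, 3 and 11, the resulting order is S0, S2, 2, 18, S3, 3, S11, 11, 27. The sentinel S11 created after the array grows to 16 lands exactly between 3 and 11, so the group splits without any node moving.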

Implementation Details

Because copying the array on every expansion would take a long time, we divide the whole array into blocks and create each block lazily. When an index is accessed, we only need to check whether the block containing that index has been created, and create it if not.

We also define size as the index range currently in use, with an initial value of 2; expanding the array only requires doubling size. We define count as the total number of nodes currently in the HashMap (not counting sentinel nodes).

Initially, every entry in the array is null except entry 0. Entry 0 points to a linked list that contains only one sentinel node, which is the starting point of the whole list. The initial layout is shown in Figure 7, where light green marks the index range not currently in use and dashed arrows mark blocks that exist logically but have not actually been created.
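A minimal sketch of such a lazily built bucket array follows. Everything specific here is an assumption made for illustration: the class name LazyBucketTable, the block size of 16 entries, the fixed maximum number of blocks, and the helper allocatedBlocks. Only the ideas of blocks created on demand and of the size and count fields come from the text.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

class LazyBucketTable<N> {
    private static final int BLOCK_BITS = 4;                // assumed: 16 entries per block
    private static final int BLOCK_SIZE = 1 << BLOCK_BITS;

    private final AtomicReferenceArray<AtomicReferenceArray<N>> blocks;
    final AtomicInteger size = new AtomicInteger(2);        // index range in use
    final AtomicInteger count = new AtomicInteger(0);       // nodes, sentinels excluded

    LazyBucketTable(int maxBlocks) {
        blocks = new AtomicReferenceArray<>(maxBlocks);
    }

    // A missing block simply means every entry in it is still null.
    N getBucket(int idx) {
        AtomicReferenceArray<N> block = blocks.get(idx >>> BLOCK_BITS);
        return block == null ? null : block.get(idx & (BLOCK_SIZE - 1));
    }

    void setBucket(int idx, N node) {
        int b = idx >>> BLOCK_BITS;
        AtomicReferenceArray<N> block = blocks.get(b);
        if (block == null) {
            AtomicReferenceArray<N> fresh = new AtomicReferenceArray<>(BLOCK_SIZE);
            // Only one thread installs its block; the others reuse the winner's.
            block = blocks.compareAndSet(b, null, fresh) ? fresh : blocks.get(b);
        }
        block.set(idx & (BLOCK_SIZE - 1), node);
    }

    // Counts the blocks that have actually been created so far.
    int allocatedBlocks() {
        int n = 0;
        for (int b = 0; b < blocks.length(); b++) {
            if (blocks.get(b) != null) {
                n++;
            }
        }
        return n;
    }
}
```

Writing to index 40 in an empty table creates only the block that holds indexes 32 to 47; reads from any other block keep returning null without allocating anything.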

Initialize bucket operation

A null entry in the array is considered uninitialized; initializing a bucket means creating its sentinel node. Initialization is recursive: if the parent bucket is uninitialized, the parent is initialized first. (The parent of a bucket index is the index obtained by clearing its highest set bit.) The approximate code is as follows:

    private void initializeBucket(int bucketIdx) {
        // Parent index: bucketIdx with its highest set bit cleared.
        int parentIdx = bucketIdx ^ Integer.highestOneBit(bucketIdx);
        if (getBucket(parentIdx) == null)
            initializeBucket(parentIdx);
        // The sentinel's hash is the bit-reversed bucket index.
        Node dummy = new Node();
        dummy.hash = Integer.reverse(bucketIdx);
        dummy.next = new AtomicReference<>();
        setBucket(bucketIdx, listInsert(getBucket(parentIdx), dummy));
    }

Here getBucket is a wrapper that reads the array entry at a given index, and setBucket is the corresponding writer. listInsert starts at the given node, finds the appropriate insertion point, and inserts the given node; if a node with the same hash is already present in the list it returns that existing node, otherwise it returns the newly inserted node.
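The parent relation used by the recursion can be illustrated on its own. The names BucketParents, parentOf and chainToRoot are hypothetical; the expression for the parent index is the one from the code above.

```java
import java.util.ArrayList;
import java.util.List;

class BucketParents {
    // Clears the highest set bit of the bucket index.
    static int parentOf(int bucketIdx) {
        return bucketIdx ^ Integer.highestOneBit(bucketIdx);
    }

    // Follows parents from the given bucket down to bucket 0.
    static List<Integer> chainToRoot(int bucketIdx) {
        List<Integer> chain = new ArrayList<>();
        int idx = bucketIdx;
        chain.add(idx);
        while (idx != 0) {
            idx = parentOf(idx);
            chain.add(idx);
        }
        return chain;
    }
}
```

chainToRoot(11) returns [11, 3, 1, 0]: 1011 becomes 0011, then 0001, then 0000. Since bucket 0 is initialized from the start, the recursion always terminates there.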

Insert operation

First, take the key's hashCode modulo the HashMap's size to get the array index (bucket) for the insertion.
Then check whether that bucket is null, and initialize it if it is.
Construct a new node and insert it at the appropriate position. Note that the hash stored in the node should be the original hashCode with its bits reversed and the lowest bit set to 1.
Increment the node counter by 1. If there are then too many nodes, simply change size to size*2, which stands for expanding the array (rehash).
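The last two steps can be sketched in isolation. The text does not say how many nodes count as "too many", so the threshold of 2 nodes per bucket (MAX_LOAD) is an assumption, as are the names GrowthPolicy, nodeHash and afterInsert.

```java
import java.util.concurrent.atomic.AtomicInteger;

class GrowthPolicy {
    static final int MAX_LOAD = 2; // assumed average nodes per bucket before growing

    final AtomicInteger size = new AtomicInteger(2);
    final AtomicInteger count = new AtomicInteger(0);

    // Hash stored in a normal node: bits reversed, lowest bit set to 1.
    static int nodeHash(int hashCode) {
        return Integer.reverse(hashCode) | 1;
    }

    // Called after a node has been linked into the list.
    void afterInsert() {
        int n = count.incrementAndGet();
        int s = size.get();
        if (n > s * MAX_LOAD) {
            // If another thread already doubled size, this CAS fails harmlessly.
            size.compareAndSet(s, s * 2);
        }
    }
}
```

Starting from size 2, the fifth insertion doubles size to 4 and the ninth doubles it to 8. No node is touched in either case; only the index range in use changes.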

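Find operation

Compute the array index (bucket) for the key being looked up.
Check whether that bucket is null, and initialize it if it is. (A null bucket does not by itself mean the key is absent: right after size doubles, nodes that now map to a new bucket are already in the list while that bucket's sentinel has not been created yet.)
Enter the list at the bucket's sentinel and search in order until the node is found or the search passes beyond the range of this group of nodes.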
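The insert and find steps above can be combined into one small runnable sketch. It is a simplification under stated assumptions, not the author's implementation: it stores non-negative int keys and uses each key as its own hash (so the node hash identifies the key), uses one fixed-capacity array instead of lazily created blocks, grows at an assumed load of 2 nodes per bucket, and omits deletion. The names SplitOrderedIntSet, bucket, add, contains and fill are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

class SplitOrderedIntSet {
    static final class Node {
        final int hash; // bit-reversed; lowest bit 1 = normal node, 0 = sentinel
        final AtomicReference<Node> next = new AtomicReference<>();
        Node(int hash) { this.hash = hash; }
    }

    private static final int MAX_SIZE = 1 << 16; // assumed capacity limit
    private final AtomicReferenceArray<Node> buckets = new AtomicReferenceArray<>(MAX_SIZE);
    private final AtomicInteger size = new AtomicInteger(2);
    private final AtomicInteger count = new AtomicInteger(0);

    SplitOrderedIntSet() {
        buckets.set(0, new Node(0)); // sentinel of bucket 0, start of the list
    }

    // Inserts c after start in unsigned hash order; returns the existing
    // node if one with the same hash is present, otherwise c.
    private static Node listInsert(Node start, Node c) {
        for (;;) {
            Node a = start;
            Node b = a.next.get();
            while (b != null && Integer.compareUnsigned(b.hash, c.hash) < 0) {
                a = b;
                b = a.next.get();
            }
            if (b != null && b.hash == c.hash) return b;
            c.next.set(b);
            if (a.next.compareAndSet(b, c)) return c;
        }
    }

    // Returns the bucket's sentinel, creating it (and its parents) on demand.
    private Node bucket(int idx) {
        Node sentinel = buckets.get(idx);
        if (sentinel == null) {
            int parent = idx ^ Integer.highestOneBit(idx);
            sentinel = listInsert(bucket(parent), new Node(Integer.reverse(idx)));
            buckets.set(idx, sentinel);
        }
        return sentinel;
    }

    boolean add(int key) {
        Node start = bucket(key & (size.get() - 1));
        Node c = new Node(Integer.reverse(key) | 1);
        if (listInsert(start, c) != c) return false; // already present
        int n = count.incrementAndGet();
        int s = size.get();
        if (n > 2 * s && s < MAX_SIZE) size.compareAndSet(s, 2 * s);
        return true;
    }

    boolean contains(int key) {
        int hash = Integer.reverse(key) | 1;
        Node n = bucket(key & (size.get() - 1));
        while (n != null && Integer.compareUnsigned(n.hash, hash) < 0) {
            n = n.next.get();
        }
        return n != null && n.hash == hash;
    }

    int count() { return count.get(); }

    // Hypothetical driver: threads add interleaved, disjoint keys.
    static SplitOrderedIntSet fill(int threads, int perThread) {
        SplitOrderedIntSet set = new SplitOrderedIntSet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) set.add(i * threads + offset);
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return set;
    }
}
```

In this sketch contains initializes a missing bucket before searching, so keys inserted before a doubling remain reachable afterwards even though their new bucket had no sentinel at the time.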

Delete operation

Compute the array index (bucket) for the node to be deleted.
Check whether that bucket is null, and initialize it if it is.
Locate the node to delete and remove it from the list. (Note that, thanks to the sentinel nodes, every normal element is referenced only by its single predecessor; it is never referenced by both a predecessor node and the array, so there is never a need to modify multiple pointers at once.)
Decrement the node counter by 1.
