Introduction
In a multithreaded environment we usually need to share data, but to avoid race conditions that leave the data inconsistent, certain pieces of code must execute as atomic operations. We therefore use synchronization mechanisms such as mutual exclusion (mutex) locks to protect these code sections, so that one thread at a time has exclusive access to the shared data, race conditions are avoided, and consistency is guaranteed. Unfortunately, this is blocking synchronization: all the other threads can do is wait. Lock-based multithreaded designs are also prone to deadlock, priority inversion, starvation, and similar problems that prevent some threads from making progress.
A lock-free algorithm, as the name implies, does not use locks: it can synchronize threads without them. Compared with lock-based multithreaded designs, lock-free algorithms have the following advantages:
- Immunity to deadlock, priority inversion, and similar problems: because no locks are used to coordinate threads, this is non-blocking synchronization, and the problems caused by locks simply cannot occur;
- Guaranteed overall progress: because lock-free algorithms avoid deadlock and similar situations, some thread is always able to run, which guarantees the overall progress of the program;
- Good performance: since no locks are involved, lock-free algorithms can deliver the expected performance gains under typical load.
Since JDK 1.5, the java.util.concurrent.atomic package has provided a set of classes that form an important basis for implementing lock-free algorithms. This article describes how to apply lock-free algorithms to basic data structures so that race conditions are avoided and multiple threads can access and modify the shared data in a collection at the same time. If a data structure is not itself thread-safe, then as soon as you use it in a multithreaded environment you must apply some synchronization mechanism, or race conditions are likely to occur. The lock-free data structures we are about to design are thread-safe, so no extra code is needed to prevent race conditions.
Design of the data structures
This article first presents the implementation of a lock-free stack, which gives the reader the necessary background; a stack is a fundamental last-in, first-out (LIFO) data structure. Once the reader has grasped the required techniques, we move on to the design of a relatively more complex linked list, which is a building block of many other data structures. Compared with the stack, however, the linked list faces more difficult thread-synchronization problems.
Before we start the design, we need to understand a very important primitive: compare-and-swap (CAS). Herlihy proved that CAS is a universal primitive for implementing lock-free data structures. CAS atomically compares the contents of a memory location with an expected value; if the two are the same, the contents of that memory location are replaced with a specified new value, and a result is returned indicating whether the operation succeeded. Many modern processors provide hardware implementations of CAS, such as the CMPXCHG instruction on the x86 architecture. In Java, the AtomicReference<V> class in java.util.concurrent.atomic also provides the CAS primitive, along with many other extended operations. CAS is an indispensable instruction for the lock-free algorithms implemented below.
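As a quick illustration of the CAS semantics just described, the following minimal sketch (the class name and values are our own, not from the article) uses AtomicReference.compareAndSet to update a value only when it still matches what we expect:

```java
import java.util.concurrent.atomic.AtomicReference;

public class CasDemo {
    public static void main(String[] args) {
        AtomicReference<String> ref = new AtomicReference<String>("A");

        // Succeeds: the current value is the expected "A", so it becomes "B".
        // Note: compareAndSet compares references with ==, not equals();
        // interned string literals make this comparison work here.
        boolean first = ref.compareAndSet("A", "B");

        // Fails: the current value is now "B", not the expected "A".
        boolean second = ref.compareAndSet("A", "C");

        System.out.println(first + " " + second + " " + ref.get());
    }
}
```

Running this prints "true false B": the second CAS detects that the location no longer holds the expected value and leaves it untouched, which is exactly the check the lock-free algorithms below rely on.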
Stack
A stack can use an array or a linked list as its underlying storage. Although a list-based implementation takes a little more space to store the node objects that represent the elements, it avoids the problem of array overflow, so we will use a linked list as the basis of our stack.
First, let's examine a non-thread-safe version. To keep the presentation clear and focused on the topic of the article, the code does not handle exceptions or invalid operations; readers, please take note. The code is as follows:
Listing 1. Non-thread-safe stack implementation
class Node<T> {
    Node<T> next;
    T value;

    public Node(T value, Node<T> next) {
        this.next = next;
        this.value = value;
    }
}

public class Stack<T> {
    Node<T> top;

    public void push(T value) {
        Node<T> newTop = new Node<T>(value, top);
        top = newTop;
    }

    public T pop() {
        Node<T> node = top;
        top = top.next;
        return node.value;
    }

    public T peek() {
        return top.value;
    }
}
The data member top stores the node at the top of the stack; its type is Node<T>, because our stack is based on a linked list. Node<T> represents a node and has two data members: value stores the element held in the stack, and next references the next node. The class has three methods, push, pop, and peek, which are the basic stack operations. Apart from peek, which is thread-safe, the other two methods can cause race conditions in a multithreaded environment.
Push method
Let's consider the push method first, which places an element on the stack. When push is called, it first creates a new node, sets its value member to the passed-in parameter, and assigns the current top of the stack to its next member. It then sets the top data member to the newly created node. Suppose two threads, A and B, call push at the same time. Thread A reads the current top of the stack to create its new node (the first line of push), but its time slice runs out and it is suspended. Thread B then reads the current top of the stack, creates its own new node, and sets top to that node (the second line of push). When thread A resumes and updates the top of the stack, the top it originally read has already "expired": thread B has replaced the original top with a new node, and thread A's update now throws B's node away.
Pop method
As for the pop method, it removes and returns the top element of the stack. pop stores the top node of the stack in a local variable node, then updates the top of the stack to the next node, and finally returns the value member of node. If two threads call this method at the same time, a race condition can occur: one thread assigns the current top of the stack to its node variable and is suspended just as it is about to replace the top with the next node. Meanwhile another thread calls pop, completes, and returns its result. When the suspended thread resumes, the top of the stack has already been changed by the other thread, so continuing its execution causes a synchronization problem.
Peek method
The peek method simply returns the element currently at the top of the stack; it is thread-safe and poses no synchronization problem.
In Java, the synchronization problems of push and pop can be solved with the synchronized keyword, which is a lock-based solution. Let's look instead at the lock-free solution; the following is the code of the lock-free stack implementation:
Listing 2. Lock-free stack implementation
import java.util.concurrent.atomic.*;

class Node<T> {
    Node<T> next;
    T value;

    public Node(T value, Node<T> next) {
        this.next = next;
        this.value = value;
    }
}

public class Stack<T> {
    AtomicReference<Node<T>> top = new AtomicReference<Node<T>>();

    public void push(T value) {
        boolean successful = false;
        while (!successful) {
            Node<T> oldTop = top.get();
            Node<T> newTop = new Node<T>(value, oldTop);
            successful = top.compareAndSet(oldTop, newTop);
        }
    }

    public T peek() {
        return top.get().value;
    }

    public T pop() {
        boolean successful = false;
        Node<T> newTop = null;
        Node<T> oldTop = null;
        while (!successful) {
            oldTop = top.get();
            newTop = oldTop.next;
            successful = top.compareAndSet(oldTop, newTop);
        }
        return oldTop.value;
    }
}
This new implementation is quite different from the previous one and looks more complicated. The type of the top data member has changed from Node<T> to AtomicReference<Node<T>>. The AtomicReference<V> class lets us apply CAS to the top member: it atomically compares top with an expected value and, if the two are the same, replaces it with a specified new value. As we saw above, the problem we need to solve is the top of the stack "expiring".
Push method
Now let's analyze how the new push method handles this problem and guarantees that no race condition arises. Inside the while loop, the top node of the current stack is obtained by calling AtomicReference.get() on the top data member and is held in the variable oldTop; this is the node that will be replaced later. The variable newTop is initialized to the new node. In the most important step, top.compareAndSet(oldTop, newTop) compares the two references top and oldTop to make sure that oldTop, the top of the stack we read, has not "expired", that is, has not been changed by another thread. If it has not expired, top is updated to newTop, which becomes the new top of the stack, and the boolean value true is returned. Otherwise compareAndSet returns false, and the loop runs again until it succeeds. Because compareAndSet is an atomic operation, the data is guaranteed to remain consistent.
Pop method
The pop method pops the element at the top of the stack, and its implementation is very similar to push. Inside the while loop, compareAndSet checks that the top of the stack has not been changed by other threads and that the data is consistent; if so, it updates the top data member and pops the original top. If it fails, it retries until it succeeds.
Neither push nor pop uses any lock, so no thread ever has to stop and wait for another.
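To see that concurrent callers really do make progress without losing updates, here is a small, self-contained exercise of the Listing 2 stack (the class is repeated inline, slightly condensed, so the sketch compiles on its own; the thread and element counts are arbitrary): several threads push concurrently, and afterwards we pop everything and check that no element was lost.

```java
import java.util.concurrent.atomic.AtomicReference;

public class StackDemo {
    static class Node<T> {
        Node<T> next;
        T value;
        Node(T value, Node<T> next) { this.next = next; this.value = value; }
    }

    static class Stack<T> {
        AtomicReference<Node<T>> top = new AtomicReference<Node<T>>();

        void push(T value) {
            boolean done = false;
            while (!done) {
                Node<T> oldTop = top.get();
                Node<T> newTop = new Node<T>(value, oldTop);
                done = top.compareAndSet(oldTop, newTop);  // retry on contention
            }
        }

        T pop() {
            boolean done = false;
            Node<T> oldTop = null;
            while (!done) {
                oldTop = top.get();
                if (oldTop == null) return null;  // empty stack
                done = top.compareAndSet(oldTop, oldTop.next);
            }
            return oldTop.value;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final Stack<Integer> stack = new Stack<Integer>();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < 1000; i++) stack.push(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();

        // Pop everything back off; every pushed element must still be there.
        int count = 0;
        while (stack.pop() != null) count++;
        System.out.println(count);  // 4000: no push was lost
    }
}
```

The non-thread-safe stack of Listing 1 would typically lose pushes in this test; the CAS retry loops guarantee the full count of 4000.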
Linked list
A stack is a fairly simple data structure, and its synchronization problems are relatively straightforward to solve. In many situations, however, a stack does not meet our needs, so we now introduce the linked list, which has a much wider range of applications. To keep things simple, our list offers only a few methods. The following is the non-thread-safe version of the list:
Listing 3. Non-thread-safe linked list implementation
class Node<T> {
    Node<T> next;
    T value;

    public Node(T value, Node<T> next) {
        this.value = value;
        this.next = next;
    }
}

class LinkedList<T> {
    Node<T> head;

    public LinkedList() {
        head = new Node<T>(null, null);
    }

    public void addFirst(T value) {
        addAfter(head.value, value);
    }

    public boolean addAfter(T after, T value) {
        for (Node<T> node = head; node != null; node = node.next) {
            if (isEqual(node.value, after)) {
                Node<T> newNode = new Node<T>(value, node.next);
                node.next = newNode;
                return true;
            }
        }
        return false;
    }

    public boolean remove(T value) {
        for (Node<T> node = head; node.next != null; node = node.next) {
            if (isEqual(node.next.value, value)) {
                node.next = node.next.next;
                return true;
            }
        }
        return false;
    }

    boolean isEqual(T arg0, T arg1) {
        if (arg0 == null) {
            return arg0 == arg1;
        } else {
            return arg0.equals(arg1);
        }
    }
}
The data member head is the head of the linked list. It does not store any element; instead it points directly at the first element, which makes the remove method easier to implement later. The list has three public methods, of which addAfter and remove are the more important.
AddAfter method
Consider the addAfter method first. It adds a new element after the specified element in the collection and returns a boolean indicating whether the element was added; the element is not added when the specified element is not in the collection. The method first searches for the node of the specified element in a for loop; once that node is found, a new node is created whose next member is set to the specified node's next, and the specified node's next is then pointed at the new node. The remove method, on the other hand, looks for the specified element and removes it from the collection; it also returns a boolean, returning false when the specified element is not in the collection. This method finds the element to be removed in a loop and then reconnects the elements on its left and right.
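A quick single-threaded exercise of the Listing 3 list makes the intended behavior of addAfter and remove concrete (the class is repeated inline, slightly condensed, so the sketch compiles on its own; the sample values are ours):

```java
public class ListDemo {
    static class Node<T> {
        Node<T> next;
        T value;
        Node(T value, Node<T> next) { this.value = value; this.next = next; }
    }

    static class LinkedList<T> {
        Node<T> head = new Node<T>(null, null);  // sentinel head node

        boolean addAfter(T after, T value) {
            for (Node<T> node = head; node != null; node = node.next) {
                if (isEqual(node.value, after)) {
                    node.next = new Node<T>(value, node.next);
                    return true;
                }
            }
            return false;
        }

        boolean remove(T value) {
            for (Node<T> node = head; node.next != null; node = node.next) {
                if (isEqual(node.next.value, value)) {
                    node.next = node.next.next;
                    return true;
                }
            }
            return false;
        }

        boolean isEqual(T a, T b) { return a == null ? a == b : a.equals(b); }
    }

    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<String>();
        list.addAfter(null, "A");                  // insert after the sentinel: A
        boolean added = list.addAfter("A", "B");   // list is now A-B
        boolean missing = list.addAfter("X", "C"); // "X" is not in the list
        boolean removed = list.remove("A");        // leaves just B
        System.out.println(added + " " + missing + " " + removed);
    }
}
```

This prints "true false true"; the multithreaded failures discussed next all stem from the read-then-relink steps inside these two loops not being atomic.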
In a multithreaded environment, if two threads call addAfter at the same time, or two threads call remove at the same time, or one thread calls addAfter while another calls remove at the same moment, a race condition can arise.
Imagine a list that now holds three elements: A, B, and C. Suppose one thread wants to add an element after A. It first locates node A, then creates a new node A1 whose value member stores the new element and whose next member stores a reference to node B. Just as the thread is about to connect A and A1 through A's next member, it is suspended because its time slice runs out. Another thread then also adds a new element after A: it creates node A2, breaks the original link between A and B, reconnects the nodes in the order A-A2-B, and completes its operation. Now the suspended thread resumes and reconnects the nodes in the order A-A1-B. A problem appears: the newly added node A2 has been lost. The fix is that every time we are about to connect node A to a newly created node, we check whether A's next has been altered by another thread, and connect only if it has not; this check-and-connect is done with an atomic CAS operation.
The conflict between addAfter and remove
A race condition can also occur when addAfter and remove execute at the same time. Again suppose the list holds three elements A, B, and C. One thread calls remove to delete element B: it first locates node A and is about to connect A and C by changing A's next member when it is suddenly suspended. At that point another thread calls addAfter to insert a new element B2 after node B. After the insertion completes, the suspended thread resumes and finishes the removal by changing A's next to connect A and C directly. But the just-added element B2 is lost, because node A now skips node B and links straight to node C. We therefore need a solution. Timothy L. Harris proposed one in which removal is split into two steps: logical deletion and physical deletion. Logical deletion does not actually remove the node; it only marks it as deleted. Physical deletion then actually unlinks the node from the collection. Every time a new element is to be added after a node, we must first check that the node is not marked as deleted before connecting the new node into the collection. This is done atomically with the compareAndSet method of the AtomicMarkableReference<V> class.
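The mark-bit mechanics can be seen in isolation with a short sketch (our own example, not from the article): AtomicMarkableReference couples a reference with a boolean mark, and compareAndSet succeeds only when both the reference and the mark match the expected values.

```java
import java.util.concurrent.atomic.AtomicMarkableReference;

public class MarkDemo {
    public static void main(String[] args) {
        String node = "B";
        // Reference "B" with mark=false: the node is not yet logically deleted.
        AtomicMarkableReference<String> next =
            new AtomicMarkableReference<String>(node, false);

        // Logical deletion: set the mark without changing the reference.
        boolean marked = next.attemptMark(node, true);

        // An insertion that expects mark=false now fails, because the node
        // was logically deleted in the meantime.
        boolean inserted = next.compareAndSet(node, "B2", false, false);

        System.out.println(marked + " " + inserted + " " + next.isMarked());
    }
}
```

This prints "true false true": once the mark is set, any CAS that expected an unmarked node is rejected, which is exactly how the lost-insertion scenario above is prevented.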
Remove method
There are many possible conflicts in a list; another arises when two threads execute the remove method at the same time. The problem is somewhat similar to two simultaneous addAfter calls. Suppose the collection now holds four elements A, B, C, and D, and one thread calls remove to delete element B. It locates A and C, and is about to release the A-B link and connect A to C, but before the actual removal happens the thread is suspended. Another thread then calls remove to delete element C: it removes the link between B and C, joins B and D, and completes its removal. When the first thread resumes and carries out its remaining work of connecting A and C, the earlier removal of C is undone: the final list is A-C-D, and element C has not been removed after all. Our remove method therefore needs to determine whether the next of the node preceding the element being removed has been changed. For example, when removing B, it checks whether A's next has been changed by another thread, and whether the node has been marked as logically deleted. This, too, is done with a CAS operation.
As we can see from the above, we have to apply certain checks atomically to keep the data consistent. Let's now look at how the lock-free list that solves these problems is implemented; the code is probably quite different from what the reader has seen in algorithm books. Here it is:
Listing 4. Lock-free linked list implementation
import java.util.concurrent.atomic.*;

class Node<T> {
    AtomicMarkableReference<Node<T>> next;
    T value;

    public Node(T value, Node<T> next) {
        this.next = new AtomicMarkableReference<Node<T>>(next, false);
        this.value = value;
    }
}

class LinkedList<T> {
    AtomicMarkableReference<Node<T>> head;

    public LinkedList() {
        Node<T> headNode = new Node<T>(null, null);
        head = new AtomicMarkableReference<Node<T>>(headNode, false);
    }

    public void addFirst(T value) {
        addAfter(head.getReference().value, value);
    }

    public boolean addAfter(T after, T value) {
        boolean successful = false;
        while (!successful) {
            boolean found = false;
            for (Node<T> node = head.getReference();
                 node != null && !isRemoved(node);
                 node = node.next.getReference()) {
                if (isEqual(node.value, after) && !node.next.isMarked()) {
                    found = true;
                    Node<T> nextNode = node.next.getReference();
                    Node<T> newNode = new Node<T>(value, nextNode);
                    successful = node.next.compareAndSet(nextNode, newNode, false, false);
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    public boolean remove(T value) {
        boolean successful = false;
        while (!successful) {
            boolean found = false;
            for (Node<T> node = head.getReference(), nextNode = node.next.getReference();
                 nextNode != null;
                 node = nextNode, nextNode = nextNode.next.getReference()) {
                if (!isRemoved(nextNode) && isEqual(nextNode.value, value)) {
                    found = true;
                    logicallyRemove(nextNode);
                    successful = physicallyRemove(node, nextNode);
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    void logicallyRemove(Node<T> node) {
        while (!node.next.attemptMark(node.next.getReference(), true)) {
        }
    }

    boolean physicallyRemove(Node<T> leftNode, Node<T> node) {
        Node<T> rightNode = node;
        do {
            rightNode = rightNode.next.getReference();
        } while (rightNode != null && isRemoved(rightNode));
        return leftNode.next.compareAndSet(node, rightNode, false, false);
    }

    boolean isRemoved(Node<T> node) {
        return node.next.isMarked();
    }

    boolean isEqual(T arg0, T arg1) {
        if (arg0 == null) {
            return arg0 == arg1;
        } else {
            return arg0.equals(arg1);
        }
    }
}
Unlike before, the next data member in the Node class is now an AtomicMarkableReference<V>, not a Node<T> and not an AtomicReference<V>. This is because we not only need to perform CAS operations on the next member, but also need to attach a mark to it. AtomicMarkableReference<V> carries a boolean mark bit: we set it to true to indicate that a node has been logically deleted, while false means that it has not. When a node is marked as logically deleted, the mark bit of its next data member is set to true. The head of the list is also an AtomicMarkableReference<V>, because we need to do the same thing there.
AddAfter method
First consider the addAfter method. Its design must guard both against two threads inserting at the same time and against one thread inserting while another removes. In a for loop it iterates through the collection looking for the node where after is located, and by calling getReference() it assigns the node following after to the nextNode variable. It then creates a new node to hold the newly added element, connecting the new node's next member to nextNode. Next, compareAndSet is called on node's next member. The difference from before is that it not only compares the two references to ensure that the next member has not been changed by another thread, but also compares the boolean mark bit with the expected value. As mentioned above, the AtomicMarkableReference<V> class has a mark bit, and we use it to check whether a node has been logically deleted. If the node has already been marked as logically deleted by the logicallyRemove method, compareAndSet fails and the loop continues, looking for the next qualifying node. If the specified element is not found by the end of the loop, addAfter returns false to indicate that the element cannot be added because the specified element does not exist in the collection. When compareAndSet returns true, the element was inserted successfully and the method returns.
AddFirst method
The addFirst method simply calls addAfter to insert a new element at the beginning of the collection.
Remove method
The remove method finds, in a loop, the node preceding the element to be removed, and checks that the node to be removed has not already been logically deleted. Once that is established, it first calls logicallyRemove to delete the node logically, and then calls physicallyRemove to delete it physically. In logicallyRemove we call attemptMark of AtomicMarkableReference<V> to set the mark bit. Because attemptMark can fail, it is placed inside a while loop; a node whose mark has been set by attemptMark counts as logically deleted. The physicallyRemove method first skips past any neighboring nodes that have already been marked as logically deleted, so that they are physically removed along the way. Its compareAndSet then checks both that the node has been logically deleted and that the next member of the preceding node has not been changed by another thread. This guarantees that no race condition occurs when two threads call remove at the same time, or when remove and addAfter are called at the same time.
ABA issues
Because a CAS operation only compares the contents of a memory location with an expected value, if the contents change from A to B and then back to A, CAS will still consider them unchanged. Some algorithms cannot tolerate this behavior. The problem typically occurs when memory locations are reused: in environments without garbage collection, the ABA problem must be solved with mechanisms such as version tags. But because the JVM handles memory management for us, the implementations above are sufficient to avoid the ABA problem.
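For environments where ABA does matter, java.util.concurrent.atomic also provides AtomicStampedReference, which pairs the reference with an integer stamp so that an A-to-B-to-A change is still detected. A minimal sketch (the values and stamp choices are our own illustration):

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    public static void main(String[] args) {
        String a = "A", b = "B";
        AtomicStampedReference<String> ref =
            new AtomicStampedReference<String>(a, 0);

        int stamp = ref.getStamp();  // observe stamp 0 alongside "A"

        // Another thread changes A -> B -> A, bumping the stamp each time.
        ref.compareAndSet(a, b, 0, 1);
        ref.compareAndSet(b, a, 1, 2);

        // A plain value comparison would see "A" again, but the stale
        // stamp makes this CAS fail, exposing the A-B-A change.
        boolean success = ref.compareAndSet(a, "C", stamp, stamp + 1);
        System.out.println(success + " " + ref.getStamp());
    }
}
```

This prints "false 2": even though the reference is "A" again, the stamp has moved from 0 to 2, so the stale observer's CAS is rejected.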
Conclusion
In the past, many lock-free data structures achieved thread safety through immutable objects, much like String in Java, but the excessive copying made their performance poor. After more than ten years of development, however, lock-free data structures have matured, and their performance is no longer inferior to traditional implementations. Lock-free algorithms are very difficult to get right, but because data structures are heavily reused components, applying the concept to data structures first lets a program step into the lock-free world easily and enjoy the benefits it brings.
Original: http://www.ibm.com/developerworks/cn/java/j-lo-lockfree/
Implementing lock-free data structures in Java (reprint)