[reproduced] The nature of lock-free programming

Source: Internet
Author: User

Original: http://weibo.com/p/1001603876869958445266

Sina Weibo (@NP not equal to p)

Computer Learning WeChat public account (JSJ_XX)

Does lock-free programming really involve no locks? What is the essence of a lock-free implementation? Does it need operating-system or compiler support? This article tries to answer these questions.

1 Problems caused by locks

When using locks, you have to guard especially against deadlock and livelock. Deadlock is the simple case: a thread blocks while acquiring locks because multiple locks are taken in inconsistent orders, and it can be avoided by always acquiring the locks in a fixed order. So let us look only at an example of livelock:
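(A minimal sketch of the scenario described next, assuming two pthread mutexes; lock_a, lock_b and do_something_1/do_something_2 are illustrative names standing in for the do_something_x() mentioned below.)

#include <pthread.h>

pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

static void do_something_1(void) { /* real work; never reached while the threads keep colliding */ }
static void do_something_2(void) { /* real work; never reached while the threads keep colliding */ }

/* Thread 1: take lock_a, then politely back off if lock_b is busy. */
void *thread1(void *arg)
{
    while (1) {
        pthread_mutex_lock(&lock_a);
        if (pthread_mutex_trylock(&lock_b) == 0) {
            do_something_1();
            pthread_mutex_unlock(&lock_b);
            pthread_mutex_unlock(&lock_a);
            break;                          /* done, leave the while() */
        }
        pthread_mutex_unlock(&lock_a);      /* back off and retry */
    }
    return NULL;
}

/* Thread 2: the mirror image, taking lock_b first. */
void *thread2(void *arg)
{
    while (1) {
        pthread_mutex_lock(&lock_b);
        if (pthread_mutex_trylock(&lock_a) == 0) {
            do_something_2();
            pthread_mutex_unlock(&lock_a);
            pthread_mutex_unlock(&lock_b);
            break;
        }
        pthread_mutex_unlock(&lock_b);      /* back off and retry */
    }
    return NULL;
}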

The two threads are trying to avoid deadlock, but they can (awkwardly) slide into livelock: neither thread enters do_something_x() as expected, yet neither exits its while() either; both are stuck repeatedly retrying. So the meaning of livelock is that the applicant is alive (not blocked) but still effectively locked out.

The livelock example above is really the same idea as the following example, which uses no lock at all:
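(A hypothetical reconstruction of the loop being discussed: x is a plain shared int, and each thread runs the same flip-and-test loop.)

int x = 0;   /* shared by the two threads */

/* Each thread runs this loop. If the two threads keep flipping x in
   lock-step, each one can always see x == 0 at the test, so neither
   ever leaves the loop even though both stay runnable. */
void spin_on_x(void)
{
    while (x == 0) {
        x = 1 - x;   /* flip: 0 -> 1, 1 -> 0 */
    }
}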

Without considering CPU/compiler optimizations, if two threads run this while() at the same time, it is possible that neither ever gets out of it: since both threads keep modifying x, each thread can keep finding x == 0 at the test in while (x == 0)!

2 A lock-free example

The essence of lock-free programming is to deal only with the most critical point (typically a single field of the most streamlined data structure), the proverbial "use good steel on the blade's edge": a CAS atomic operation compresses the "locking" to the smallest possible scope (so-called lock-free code still has the nature of a lock, because an atomic operation is itself a tiny lock). The CAS atomic operation is usually encapsulated in the following form (example for a 32-bit machine):

BOOL CAS32(int *pVal, int oldVal, int newVal);

pVal is the address of the value to be examined, oldVal is the expected old value, and newVal is the new value to install if the comparison succeeds. The function is equivalent to:

int compare_and_swap(int *reg, int oldval, int newval)
{
    int old_reg_val = *reg;
    if (old_reg_val == oldval)
        *reg = newval;       /* install the new value only if the old one still matches */
    return old_reg_val;
}

Thus, being lock-free amounts to converting the locking into an atomic CAS operation compressed onto the most streamlined part of the data structure, so that it looks as if there is no lock at all! Let us look at the lock-free operations of a queue (the example comes from the 1994 paper "Implementing Lock-Free Queues"; the "^." in its code is equivalent to "->"):
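(A minimal C sketch of the queue as it is described below, built on the GCC CAS builtin; the names node_t, cas, EnQueue and DeQueue and the details are a reconstruction along the lines of the paper, not its exact code. Memory reclamation is deliberately left out.)

#include <stdlib.h>
#include <stdbool.h>

typedef struct node {
    int          value;
    struct node *next;
} node_t;

node_t *head;   /* always points at the dummy node */
node_t *tail;

/* CAS on a pointer-sized field, wrapping the GCC builtin. */
static bool cas(node_t **p, node_t *oldv, node_t *newv)
{
    return __sync_bool_compare_and_swap(p, oldv, newv);
}

void queue_init(void)
{
    node_t *dummy = calloc(1, sizeof(*dummy));   /* the ever-present dummy node */
    head = tail = dummy;
}

void EnQueue(int x)
{
    node_t *q = calloc(1, sizeof(*q));
    q->value = x;
    q->next  = NULL;

    node_t *p;
    do {
        p = tail;
        if (cas(&p->next, NULL, q))      /* 1st CAS: only the first thread to reach tail links its node */
            break;
        cas(&tail, p, p->next);          /* 2nd CAS: help a lagging tail forward, then retry */
    } while (1);
    cas(&tail, p, q);                    /* 3rd CAS: the winner advances tail (another thread may already have done it) */
}

bool DeQueue(int *out)
{
    node_t *p;
    do {
        p = head;                        /* p is the current dummy */
        if (p->next == NULL)
            return false;                /* empty queue */
    } while (!cas(&head, p, p->next));   /* the only CAS: the first thread to win removes the first node */
    *out = p->next->value;               /* the old first node has now become the new dummy */
    return true;
}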

Understanding this lock-free queue requires grasping a few points:

    • The queue always contains a redundant dummy node, including at initialization. On every subsequent deletion, the dummy role is switched: head moves forward, and the node it now points to (the old first node) becomes the new dummy.

    • For EnQueue(), the first CAS guarantees that only the first thread to reach tail links its node into the queue; the second CAS lets a thread that failed (and must retry) push the tail pointer forward quickly; the third CAS is the successful enqueuer completing the tail update itself, although it is possible that some other thread's second CAS has already advanced it.

    • For DeQueue(), the single CAS guarantees that only the first thread to take the head value succeeds in removing the first node of the queue.

In summary, EnQueue and DeQueue already have the basic conditions for multi-threaded use (as long as the queue holds at least one node, an enqueue and a dequeue touch different fields and need no lock to protect them from each other); with CAS added, EnQueue() and DeQueue() each support multiple concurrent threads. So it looks beautiful: no lock!

Also be aware of abnormal situations. For example, in the EnQueue() above, if a thread hangs midway through, does it affect the other threads? If the second CAS were removed, EnQueue() would have a problem: Thread1 completes the first CAS successfully, then hangs just before the third CAS (which, with the second one removed, would actually be the second); every other thread then loops forever looking for a tail whose next field is NULL, but that tail's next stays stuck forever at the node Thread1 has just linked into the queue.

As you can see, the second CAS in EnQueue() matters. There are also other ways to keep one thread's mid-flight crash from blocking the rest, such as this EnQueue() variant:
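(A hedged sketch of such a variant, reusing node_t and cas() from the sketch above: instead of relying on tail being up to date, the enqueuer itself walks forward to the real end of the list.)

void EnQueue_v2(int x)
{
    node_t *q = calloc(1, sizeof(*q));
    q->value = x;
    q->next  = NULL;

    node_t *p    = tail;
    node_t *oldp = p;
    do {
        while (p->next != NULL)
            p = p->next;                 /* chase forward to the real last node, past a stale tail */
    } while (!cas(&p->next, NULL, q));   /* link the new node at the true end */
    cas(&tail, oldp, q);                 /* try to advance tail; if this fails, later enqueuers chase past it */
}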

The handling principle is the same: even if a CAS does not succeed, the tail pointer still has to be pushed forward!

3 Lock-free support in GCC

bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)

type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)

These builtins perform an atomic compare and swap. That is, if the current value of *ptr is oldval, then newval is written into *ptr.

The "bool" version returns True if the comparison is Successfuland newval was written. The "Val" version returns the Contentsof *ptr before the operation.

Let's take a closer look at the code generated for __sync_bool_compare_and_swap(). We focus on CMPXCHG:

cmpxchg %ecx, %ebx    ; if %eax equals %ebx, then %ecx is copied into %ebx and ZF is set to 1; otherwise %ebx is copied into %eax and ZF is set to 0

%eax holds the old value, %ecx the new value, and %ebx the value being examined. This instruction turns out to be the real source of lock-free programming! Almost all CPUs support a CAS-style atomic operation, and on x86 the corresponding instruction is CMPXCHG.

4 The ABA problem in CAS

The ABA problem refers to a state change being missed because of a context switch: the value a thread sees is "A", but in between it may have passed through several states, such as "A->B->A".

The usual solution is to add a reference-count (version) field, so that the CAS checks both the counter and the target content and succeeds only if neither has changed. The catch is that the counter field can itself overflow and wrap around.
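(A common sketch of this idea, assuming a 64-bit CAS is available: pack a 32-bit value and a 32-bit version counter into one word so a single CAS covers both. The layout and the name versioned_cas are illustrative.)

#include <stdint.h>
#include <stdbool.h>

/* value in the low 32 bits, version counter in the high 32 bits */
static volatile uint64_t slot = 0;

static bool versioned_cas(uint32_t oldval, uint32_t newval)
{
    uint64_t cur = slot;
    uint32_t ver = (uint32_t)(cur >> 32);
    if ((uint32_t)cur != oldval)
        return false;                                         /* value already changed */
    uint64_t desired = ((uint64_t)(ver + 1) << 32) | newval;  /* bump the version on every write */
    /* An A->B->A sequence no longer slips through: the version differs even if the value matches.
       Note that ver + 1 eventually wraps around -- the overflow problem mentioned above. */
    return __sync_bool_compare_and_swap(&slot, cur, desired);
}
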
5 Lock-free in the kernel

Looking at the kernel's kfifo mechanism, you can see that the kernel also uses another lock-free technique: the memory barrier. For the memory-barrier topic, please refer to section five of our earlier article "Understanding Memory Barrier".

GCC also encapsulates the memory barrier as a builtin:

__sync_synchronize (...)

This builtin issues a full memory barrier.
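(As a rough illustration of the kfifo idea, here is a much-simplified single-producer/single-consumer ring buffer that relies only on barriers, not locks; it imitates where kfifo places its barriers but is not the kernel code, and fifo_put/fifo_get are illustrative names.)

#include <stdbool.h>

#define FIFO_SIZE 16                       /* must be a power of two */

static int buf[FIFO_SIZE];
static volatile unsigned in  = 0;          /* written only by the producer */
static volatile unsigned out = 0;          /* written only by the consumer */

/* Producer: store the data first, then publish it by advancing `in`. */
bool fifo_put(int v)
{
    if (in - out == FIFO_SIZE)
        return false;                      /* full */
    buf[in & (FIFO_SIZE - 1)] = v;
    __sync_synchronize();                  /* the data must be visible before the new `in` */
    in = in + 1;
    return true;
}

/* Consumer: read the index first, then the data it covers. */
bool fifo_get(int *v)
{
    if (in == out)
        return false;                      /* empty */
    __sync_synchronize();                  /* read `in` before reading the data it publishes */
    *v = buf[out & (FIFO_SIZE - 1)];
    __sync_synchronize();                  /* finish reading the slot before giving it back */
    out = out + 1;
    return true;
}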

6 Summary

As you can see, the essence of lock-free programming is just the use of CAS and memory barriers (both of which involve the machine architecture or the compiler); it is not truly "no lock" at all...

About Us

Sina Weibo (@NP not equal to p)

Computer Learning WeChat public account (JSJ_XX)

Original technical articles: appreciate the computer and understand it thoroughly!
