This post starts a short series, planned at four or five posts, on how locks are used and built in practice. We often hear terms such as spin lock, wait-free, and lock-free. What do they mean? Can we implement a spin lock ourselves? How does it work? This series discusses these questions.
First, let's look at two basic operations: compare_and_swap and fetch_and_add. Most lock-free code is built on these two atomic primitives. The compare_and_swap operation originated on the IBM System/370. It takes three parameters: (1) a shared memory address (*p), (2) the value expected at that address (old_value), and (3) a new value (new_value). The new value is stored only when *p == old_value, in which case the operation returns true. Otherwise memory is left unchanged and the operation returns false. It is equivalent to the following code:
template <class T>
bool CAS(T *addr, T exp, T val) // Correct only if the whole function executes atomically. For the real implementation, see the assembly code below.
{
    if (*addr == exp) {
        *addr = val;
        return true;
    }
    return false;
}
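To make the semantics above concrete, here is a minimal sketch that uses the C++11 std::atomic library instead of the inline assembly shown later in this post. The helper names atomic_cas, atomic_faa and demo_cas_semantics are hypothetical and exist only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical helper: CAS on a std::atomic. `expected` is passed by value,
// so the caller's copy is not overwritten when the comparison fails.
inline bool atomic_cas(std::atomic<std::size_t>& target,
                       std::size_t expected, std::size_t desired) {
    // Stores `desired` only if target currently equals `expected`.
    return target.compare_exchange_strong(expected, desired);
}

// Hypothetical helper: fetch_and_add returns the value held before the addition.
inline std::size_t atomic_faa(std::atomic<std::size_t>& target, std::size_t delta) {
    return target.fetch_add(delta);
}

// Returns true if the operations behave as described in the text.
inline bool demo_cas_semantics() {
    std::atomic<std::size_t> x{5};
    bool swapped = atomic_cas(x, 5, 7);        // expected value matches: x becomes 7
    bool rejected = !atomic_cas(x, 5, 9);      // expected value is stale: x stays 7
    std::size_t before = atomic_faa(x, 3);     // returns 7, x becomes 10
    return swapped && rejected && before == 7 && x.load() == 10;
}
```

Calling demo_cas_semantics() from a single thread walks through one successful exchange, one rejected exchange, and one fetch_and_add.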
In the listing below, compare_and_swap uses the lock prefix (traditionally described as locking the bus) so that cmpxchg executes atomically across cores. After cmpxchg, setz copies the zero flag (ZF) into the return value; ZF is set exactly when the exchange took place. The listing is a complete program. The function void *sum(void *) compiles to different code depending on which macro is defined on the command line. In every variant, 10 threads each add 1 to a global variable 10,000,000 times. The variants use, respectively, the mutex provided by pthread, fetch_and_add, no synchronization at all, and three CAS-based methods. The sum_with_cas_imp_yield method follows essentially the same policy as a spin lock: spin on CAS for a bounded number of attempts, then yield the CPU.
In the next post, I will publish the results measured on my test machine and continue with other lock-free topics.
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#if defined(__x86_64__)
#define ATOMICOPS_WORD_SUFFIX "q" // use cmpxchgq/xaddq in a 64-bit environment
#else
#define ATOMICOPS_WORD_SUFFIX "l" // use cmpxchgl/xaddl in a 32-bit environment
#endif
static inline bool compare_and_swap(volatile size_t *p, size_t val_old, size_t val_new) {
    char ret;
    __asm__ __volatile__("lock; cmpxchg" ATOMICOPS_WORD_SUFFIX " %3, %0; setz %1" // the lock prefix makes cmpxchg atomic across cores
                         : "+m"(*p), "=q"(ret), "+a"(val_old) // setz copies ZF into ret; cmpxchg may overwrite the accumulator, so val_old is read-write
                         : "r"(val_new)
                         : "memory", "cc");
    return (bool)ret;
}
static inline size_t fetch_and_add(volatile size_t *p, size_t add) {
    size_t ret; // same width as *p, so the suffix chosen above matches the operands
    __asm__ __volatile__("lock; xadd" ATOMICOPS_WORD_SUFFIX " %0, %1"
                         : "=r"(ret), "=m"(*p)
                         : "0"(add), "m"(*p)
                         : "memory");
    return ret; // value of *p before the addition
}
struct my_cas
{
    my_cas(unsigned char t) : m_val_old(t) {}
    size_t m_val_old;
    inline void try_continue(size_t val_old, size_t val_new) {
        while (!compare_and_swap(&m_val_old, val_old, val_new)) {} // spin until the CAS succeeds
    }
    inline void add(size_t val_new) {
        fetch_and_add(&m_val_old, val_new);
    }
};
volatile size_t g_ucount;
pthread_mutex_t g_tlck = PTHREAD_MUTEX_INITIALIZER;
my_cas mutex(1);
const size_t cnt_num = 10000000;
void *sum_with_mutex_lock(void *) // protect the counter with a pthread mutex
{
    for (size_t i = 0; i < cnt_num; ++i) {
        pthread_mutex_lock(&g_tlck);
        g_ucount += 1;
        pthread_mutex_unlock(&g_tlck);
    }
    return NULL;
}
void *sum_with_f_and_a(void *) // use the fetch_and_add atomic operation to keep the result correct
{
    for (size_t i = 0; i < cnt_num; ++i) {
        fetch_and_add(&g_ucount, 1);
    }
    return NULL;
}
void *sum_with_cas(void *) // use the CAS atomic operation to simulate a lock
{
    for (size_t i = 0; i < cnt_num; ++i)
    {
        mutex.try_continue(1, 0); // acquire: spin until the flag changes from 1 (free) to 0 (held)
        g_ucount += 1;
        mutex.try_continue(0, 1); // release: set the flag back to 1
    }
    return NULL;
}
void *sum_with_cas_imp(void *)
{
    for (size_t i = 0; i < cnt_num; ++i) {
        for (;;) {
            size_t u = g_ucount;
            if (compare_and_swap(&g_ucount, u, u + 1)) { // add 1 only if g_ucount was not modified between the read above and this statement
                break; // success: leave the retry loop; otherwise retry until the CAS succeeds
            }
        }
    }
    return NULL;
}
void *sum_with_cas_imp_yield(void *)
{
    for (size_t i = 0; i < cnt_num; ++i) {
        for (;;) {
            size_t c = 1000; // number of CAS attempts before yielding
            while (c) {
                size_t u = g_ucount;
                if (compare_and_swap(&g_ucount, u, u + 1)) {
                    break;
                }
                c--;
            }
            if (!c) {
                syscall(SYS_sched_yield); // all attempts failed: give up the CPU so another thread can run. Spin locks usually apply this policy.
            } else {
                break; // the CAS succeeded, so move on to the next increment
            }
        }
    }
    return NULL;
}
void *sum_just_free(void *)
{
    for (size_t i = 0; i < cnt_num; ++i) { // no lock and no waiting at all, but the result is usually wrong
        g_ucount += 1;
    }
    return NULL;
}
void *sum(void *)
{
#ifdef M_LOCK
    sum_with_mutex_lock(NULL);
#endif
#ifdef FETCH_AND_ADD
    sum_with_f_and_a(NULL);
#endif
#ifdef FREE
    sum_just_free(NULL);
#endif
#ifdef CAS
    sum_with_cas(NULL);
#endif
#ifdef CAS_IMP
    sum_with_cas_imp(NULL);
#endif
#ifdef CAS_IMP_YIELD
    sum_with_cas_imp_yield(NULL);
#endif
    return NULL;
}
int main()
{
    pthread_t *thread = (pthread_t *)malloc(10 * sizeof(pthread_t));
    for (int i = 0; i < 10; ++i) {
        pthread_create(&thread[i], NULL, sum, NULL);
    }
    for (int i = 0; i < 10; ++i) {
        pthread_join(thread[i], NULL);
    }
    printf("g_ucount: %zu\n", (size_t)g_ucount);
    free(thread);
    return 0;
}
Use the following commands to build the six programs. The macro names are uppercase so that they do not collide with the free() and fetch_and_add() identifiers:
g++ test.cpp -o test_free -D FREE -lpthread
g++ test.cpp -o test_fetchandadd -D FETCH_AND_ADD -lpthread
g++ test.cpp -o test_mlock -D M_LOCK -lpthread
g++ test.cpp -o test_cas -D CAS -lpthread
g++ test.cpp -o test_cas_imp -D CAS_IMP -lpthread
g++ test.cpp -o test_cas_imp_yield -D CAS_IMP_YIELD -lpthread
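The spin-then-yield policy can also be wrapped in a small lock type. The following is only a sketch under stated assumptions, not the code measured in this series: it uses the C++11 std::atomic and std::thread libraries instead of inline assembly, and the names SpinLock and count_with_spinlock, as well as the limit of 1000 attempts, are chosen purely for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical spin lock: try CAS a bounded number of times, then yield the CPU.
class SpinLock {
public:
    void lock() {
        for (;;) {
            for (int spins = 0; spins < 1000; ++spins) {
                bool expected = false;
                // Succeeds only if the flag is currently false (lock is free).
                if (locked_.compare_exchange_weak(expected, true,
                                                  std::memory_order_acquire)) {
                    return;
                }
            }
            std::this_thread::yield(); // let another thread run before retrying
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Runs `threads` workers; each adds 1 to a shared counter `iterations` times
// while holding the spin lock. Returns the final counter value.
inline std::size_t count_with_spinlock(int threads, std::size_t iterations) {
    SpinLock lock;
    std::size_t counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (std::size_t i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Because every increment happens while the lock is held, count_with_spinlock(4, 10000) should return 40000. Build it with a C++11 compiler and thread support, for example with the -pthread option of g++.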
Other articles in this series: http://blog.csdn.net/pennyliang/category/746545.aspx