Atomic, spinlock and mutex Performance Comparison

I am very curious about the performance of different synchronization primitives, so I ran the following experiments to compare atomic operations, spinlocks, and mutexes:
1. No Synchronization
#include <functional>
#include <future>
#include <iostream>

volatile int value = 0;

int loop(bool inc, int limit) {
    std::cout << "Started " << inc << " " << limit << std::endl;
    for (int i = 0; i < limit; ++i) {
        if (inc) {
            ++value;
        } else {
            --value;
        }
    }
    return 0;
}

int main() {
    // launch a second thread that runs loop (C++11)
    auto f = std::async(std::launch::async, std::bind(loop, true, 20000000));
    loop(false, 10000000);
    f.wait();
    std::cout << value << std::endl;
}
Compile with clang and run:
clang++ -std=c++11 -stdlib=libc++ -O3 -o test test.cpp && time ./test
SSttaarrtteedd 10 2100000000000000

11177087

real    0m0.070s
user    0m0.089s
sys     0m0.002s

From the output we can see that increment and decrement are not atomic operations: the final value of value is indeterminate (garbage) instead of the expected 20000000 - 10000000 = 10000000. The interleaved "SSttaarrtteedd" line is another symptom of the missing synchronization: both threads write "Started ..." to std::cout at the same time. The root cause is that a single ++value is really a load-modify-store sequence, as sketched below.
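
A minimal sketch (illustrative names, not part of the test program) of what that single increment amounts to:

// What "++value" compiles to on most targets: a separate load, add and store.
// Another thread can run between any two of these steps, so concurrent
// increments and decrements get lost.
volatile int sketch_value = 0;     // illustrative variable

void unsynchronized_increment() {
    int tmp = sketch_value;        // load
    tmp = tmp + 1;                 // modify
    sketch_value = tmp;            // store -- may overwrite a concurrent update
}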


2. Assembly LOCK
#include <functional>
#include <future>
#include <iostream>

volatile int value = 0;

int loop(bool inc, int limit) {
    std::cout << "Started " << inc << " " << limit << std::endl;
    for (int i = 0; i < limit; ++i) {
        if (inc) {
            asm("LOCK");
            ++value;
        } else {
            asm("LOCK");
            --value;
        }
    }
    return 0;
}

int main() {
    // launch a second thread that runs loop (C++11)
    auto f = std::async(std::launch::async, std::bind(loop, true, 20000000));
    loop(false, 10000000);
    f.wait();
    std::cout << value << std::endl;
}
Run:
SSttaarrtteedd 10 2000000100000000

10000000

real    0m0.481s
user    0m0.779s
sys     0m0.???s

The final value is now correct, but this code is not portable: it only runs on x86 hardware, and it only compiles and runs correctly with the -O3 option. Worse, because the compiler may emit other instructions between the LOCK prefix and the increment/decrement instruction, the program is prone to crashing with an "illegal instruction" exception. A safer way to attach the prefix is sketched below.
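
A sketch (not the author's code, and still x86-only) of how the prefix can be bound to the instruction it modifies, using GCC/clang extended inline assembly so the lock prefix and the increment are emitted as a single instruction the compiler cannot split:

// Illustrative only: the lock prefix and the increment form one instruction.
volatile int asm_value = 0;        // illustrative variable

void locked_increment() {
    asm volatile("lock incl %0" : "+m"(asm_value));
}

void locked_decrement() {
    asm volatile("lock decl %0" : "+m"(asm_value));
}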


3. Atomic operations
#include <functional>
#include <future>
#include <iostream>
#include "boost/interprocess/detail/atomic.hpp"

using namespace boost::interprocess::ipcdetail;

volatile boost::uint32_t value = 0;

int loop(bool inc, int limit) {
    std::cout << "Started " << inc << " " << limit << std::endl;
    for (int i = 0; i < limit; ++i) {
        if (inc) {
            atomic_inc32(&value);
        } else {
            atomic_dec32(&value);
        }
    }
    return 0;
}

int main() {
    auto f = std::async(std::launch::async, std::bind(loop, true, 20000000));
    loop(false, 10000000);
    f.wait();
    std::cout << atomic_read32(&value) << std::endl;
}
Run:
SSttaarrtteedd 10 2100000000000000

10000000

real    0m0.457s
user    0m0.734s
sys     0m0.004s

The final result is correct, and the running time is close to the assembly LOCK version. That is no surprise: under the hood these atomic operations are implemented with the same LOCK-prefixed instructions, just wrapped in a portable interface.
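
The Boost functions are used here only because the author's clang 3.1 lacked <atomic> (see the note at the end). Assuming a compiler that ships that header, the same experiment with std::atomic would look roughly like this sketch:

// Sketch only: the atomic-operations experiment using C++11 std::atomic.
#include <atomic>
#include <future>
#include <iostream>

std::atomic<int> atomic_value(0);      // illustrative name

int atomic_loop(bool inc, int limit) {
    for (int i = 0; i < limit; ++i) {
        if (inc) {
            ++atomic_value;            // atomic read-modify-write
        } else {
            --atomic_value;
        }
    }
    return 0;
}

int main() {
    auto f = std::async(std::launch::async, atomic_loop, true, 20000000);
    atomic_loop(false, 10000000);
    f.wait();
    std::cout << atomic_value.load() << std::endl;   // expected: 10000000
}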


4. Spinlock
#include <functional>
#include <future>
#include <iostream>
#include <mutex>
#include "boost/smart_ptr/detail/spinlock.hpp"

boost::detail::spinlock lock;
volatile int value = 0;

int loop(bool inc, int limit) {
    std::cout << "Started " << inc << " " << limit << std::endl;
    for (int i = 0; i < limit; ++i) {
        std::lock_guard<boost::detail::spinlock> guard(lock);
        if (inc) {
            ++value;
        } else {
            --value;
        }
    }
    return 0;
}

int main() {
    auto f = std::async(std::launch::async, std::bind(loop, true, 20000000));
    loop(false, 10000000);
    f.wait();
    std::cout << value << std::endl;
}
Run:
SSttaarrtteedd 10 2100000000000000

10000000

real    0m0.541s
user    0m0.675s
sys     0m0.089s

The result is again correct. The spinlock is a bit slower than the raw atomic operations, but not by much. A spinlock is essentially a busy-wait loop around an atomic test-and-set, as sketched below.
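
For illustration, a minimal spinlock of this flavor can be built on std::atomic_flag. This is a sketch of the general technique, not the boost::detail::spinlock implementation:

// Minimal spinlock sketch: the lock is one atomic flag, and lock()
// busy-waits on an atomic test-and-set until the flag becomes free.
#include <atomic>

class SpinLock {
    std::atomic_flag flag_;
public:
    SpinLock() { flag_.clear(); }      // start in the unlocked state
    void lock() {
        // Spin until test_and_set() returns false, meaning the flag was clear
        // and this thread has just acquired the lock. The thread never sleeps.
        while (flag_.test_and_set(std::memory_order_acquire)) {
        }
    }
    void unlock() {
        flag_.clear(std::memory_order_release);   // release the lock
    }
};

Because it exposes lock()/unlock(), such a class drops into the same std::lock_guard<> pattern used in the test above, with SpinLock as the template argument.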


5. Mutex
#include <functional>
#include <future>
#include <iostream>
#include <mutex>

std::mutex mutex;
volatile int value = 0;

int loop(bool inc, int limit) {
    std::cout << "Started " << inc << " " << limit << std::endl;
    for (int i = 0; i < limit; ++i) {
        std::lock_guard<std::mutex> guard(mutex);
        if (inc) {
            ++value;
        } else {
            --value;
        }
    }
    return 0;
}

int main() {
    auto f = std::async(std::launch::async, std::bind(loop, true, 20000000));
    loop(false, 10000000);
    f.wait();
    std::cout << value << std::endl;
}
Run:
SSttaarrtteedd 10 2010000000000000

10000000

real    0m25.229s
user    0m7.011s
sys     0m22.667s

The mutex is far slower than all of the previous variants. Note where the time goes: almost all of it is system time, because under this level of contention the mutex keeps putting threads to sleep and waking them up through the kernel, whereas the spinlock simply busy-waits in user space.
Benchmark

Method               Time (sec)
No synchronization   0.070
Assembly LOCK        0.481
Atomic               0.457
Spinlock             0.541
Mutex                22.667

Of course, the results depend on the platform and the compiler (I ran the tests on a MacBook Air with clang). Still, it was interesting to see that the spinlock, despite its more sophisticated implementation compared to plain atomic operations, is not much slower.
Sadly, my clang 3.1 still does not support std::atomic, so I had to use Boost.
