On the efficiency of Java atomic variables versus synchronization -- it may subvert your assumptions


Conventional wisdom tells us that atomic variables are always faster than synchronized operations, and I believed that too, until a test I ran while implementing an ID generator showed that this is not always the case.


Test code:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentAdder {

    private static final AtomicInteger ATOMIC_INTEGER = new AtomicInteger(0);
    private static int i = 0;                       // shared counter for the synchronized version
    private static final Object o = new Object();   // monitor for the synchronized version
    private static volatile long start;

    public static void main(final String[] args) {
        // how many increments each thread performs
        int round = 10000000;
        // number of threads
        int threadN = 20;
        start = System.currentTimeMillis();
        atomicAdder(threadN, round);
        syncAdder(threadN, round);
    }

    static void atomicAdder(int threadN, int addTimes) {
        int stop = threadN * addTimes;
        List<Thread> list = new ArrayList<Thread>();
        for (int k = 0; k < threadN; k++) {
            list.add(startAtomic(addTimes, stop));
        }
        for (Thread each : list) {
            each.start();
        }
    }

    static Thread startAtomic(final int addTimes, final int stop) {
        Thread ret = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int j = 0; j < addTimes; j++) {
                    int v = ATOMIC_INTEGER.incrementAndGet();
                    if (stop == v) {
                        System.out.println("value: " + v);
                        System.out.println("elapsed(ms): " + (System.currentTimeMillis() - start));
                        System.exit(1);
                    }
                }
            }
        });
        ret.setDaemon(false);
        return ret;
    }

    static void syncAdder(int threadN, int addTimes) {
        int stop = threadN * addTimes;
        List<Thread> list = new ArrayList<Thread>();
        for (int k = 0; k < threadN; k++) {
            list.add(startSync(addTimes, stop));
        }
        for (Thread each : list) {
            each.start();
        }
    }

    static Thread startSync(final int addTimes, final int stop) {
        Thread ret = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int j = 0; j < addTimes; j++) {
                    synchronized (o) {
                        i++;                        // increment the shared counter under the lock
                        if (stop == i) {
                            System.out.println("value: " + i);
                            System.out.println("elapsed(ms): " + (System.currentTimeMillis() - start));
                            System.exit(1);
                        }
                    }
                }
            }
        });
        ret.setDaemon(false);
        return ret;
    }
}


This is a very simple accumulator: n threads increment a shared counter concurrently, and each thread increments it r times.


To compare the two, comment out one of the two calls in main

atomicAdder(threadN, round);   // atomic-variable accumulation
syncAdder(threadN, round);     // synchronized accumulation

and run the program once for each, so that only one accumulator runs per execution.
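Alternatively, the choice can be driven by command-line arguments, which seems to match the perf invocations shown later (they pass arguments such as 1 1000000). The following variant of main is only a sketch of that idea, not the author's original code; the argument positions and meanings are assumptions:

// Hypothetical variant of main: args[0] selects the adder (1 = atomic, 2 = synchronized),
// args[1] overrides the per-thread round count. Not the author's original code.
public static void main(final String[] args) {
    int round = args.length > 1 ? Integer.parseInt(args[1]) : 10000000;
    int threadN = 20;
    start = System.currentTimeMillis();
    if (args.length > 0 && "2".equals(args[0])) {
        syncAdder(threadN, round);    // synchronized accumulation
    } else {
        atomicAdder(threadN, round);  // atomic-variable accumulation
    }
}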


The author's machine: Intel Core i5-2520M, 2.5 GHz, four logical cores.

N = 20 (threads)

r = 10000000 (increments per thread)

Results:

Atomic accumulation: 15344 ms

Synchronized accumulation: 10647 ms


The question arises: why is the synchronized accumulator roughly 50% faster than the atomic one (10647 ms vs. 15344 ms)?



@ We know the general process of acquiring a Java lock (the built-in synchronized behaves much like an explicit lock here): a thread that wants the lock first checks whether the lock is already held; if it is, the thread joins the wait queue of that lock; if not, it acquires the lock.


In this test, a thread that acquires the lock and performs its increment immediately goes back to acquire the lock again. The other threads have not even been woken up yet, so the lock is grabbed once more by the current thread. This is the starvation problem that unfair locking can cause.
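This barging behavior is easy to see with an explicit lock, where fairness can be toggled. The sketch below is only an illustration of the concept and is not part of the author's benchmark; new ReentrantLock(false) is the default unfair lock, new ReentrantLock(true) a fair one:

import java.util.concurrent.locks.ReentrantLock;

public class BargingDemo {

    // Toggle the constructor argument: false = unfair (default), true = fair.
    private static final ReentrantLock LOCK = new ReentrantLock(false);

    public static void main(String[] args) throws InterruptedException {
        Runnable task = new Runnable() {
            @Override
            public void run() {
                for (int k = 0; k < 10; k++) {
                    LOCK.lock();
                    try {
                        // With the unfair lock the output often shows runs of the same
                        // thread name; with the fair lock the two threads alternate.
                        System.out.println(Thread.currentThread().getName());
                    } finally {
                        LOCK.unlock();
                    }
                }
            }
        };
        Thread a = new Thread(task, "thread-A");
        Thread b = new Thread(task, "thread-B");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}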


But does this alone explain a 50% performance gap? In theory, at any given moment some thread is successfully performing an increment in both versions, so the two accumulators should take roughly the same total time.


So what speeds up the synchronized accumulator, or rather, what slows down the atomic one?


@ Next, I ran each accumulator under perf:

The first run is the atomic accumulator; the second is the synchronized accumulator.


$ perf stat -e cs -e L1-dcache-load-misses java ConcurrentAdder 1 1000000
value: 100000000
elapsed(ms): 8580

 Performance counter stats for 'java ConcurrentAdder 1 1000000':

            21,841 cs
       233,140,754 L1-dcache-load-misses

       8.633037253 seconds time elapsed

$ perf stat -e cs -e L1-dcache-load-misses java ConcurrentAdder 2 1000000
value: 100000000
elapsed(ms): 5749

 Performance counter stats for 'java ConcurrentAdder 2 1000000':

            55,522 cs
        28,160,673 L1-dcache-load-misses

       5.811499179 seconds time elapsed


As we can see, the synchronized accumulator has more context switches than the atomic one, which is understandable: the lock itself adds thread switching.

On the other hand, the atomic accumulator's L1 data-cache load misses are an order of magnitude higher than the synchronized accumulator's.


That was the moment of enlightenment: contended atomic operations cause cache-coherence traffic, so the cache line holding the counter is invalidated over and over again. For the MESI cache-coherence protocol, see: http://en.wikipedia.org/wiki/MESI_protocol
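To see why, recall that AtomicInteger.incrementAndGet is essentially a compare-and-swap (CAS) retry loop. The sketch below is a simplified illustration of that idea, not the actual JDK source:

import java.util.concurrent.atomic.AtomicInteger;

class CasLoopSketch {

    // Roughly what incrementAndGet does internally: read the current value, then try to
    // CAS it to current + 1, retrying until the CAS succeeds. Every successful CAS writes
    // the cache line holding the counter, which under MESI invalidates the copies held by
    // the other cores, so their next read of the counter is an L1 miss.
    static int incrementAndGetSketch(AtomicInteger counter) {
        for (;;) {
            int current = counter.get();              // may miss in L1 if another core just wrote
            int next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;                          // this write invalidates the line elsewhere
            }
            // CAS failed: another thread won the race; loop and try again.
        }
    }
}

With 20 threads hammering the same AtomicInteger, that invalidation happens on nearly every increment, which is consistent with the 233 million L1 misses measured above.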

The synchronized accumulator, by contrast, keeps re-acquiring the lock on the same CPU within a single time slice, so the counter's cache line stays in that core's cache and is not invalidated.

Printing the ID of the thread performing each increment also confirms this: the increments of the atomic accumulator are spread across threads much more evenly.
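One way to reproduce that observation (not necessarily the author's exact instrumentation) is a variant of startAtomic, added to the ConcurrentAdder class above, that samples the worker's thread ID; the sampling interval of 1,000,000 is an arbitrary choice to keep the I/O from distorting the benchmark:

// Hypothetical instrumentation: print the worker's thread ID for one in every
// 1,000,000 increments so the output shows which threads are doing the work.
static Thread startAtomicLogged(final int addTimes, final int stop) {
    Thread ret = new Thread(new Runnable() {
        @Override
        public void run() {
            for (int j = 0; j < addTimes; j++) {
                int v = ATOMIC_INTEGER.incrementAndGet();
                if (v % 1000000 == 0) {
                    System.out.println("value " + v + " by thread " + Thread.currentThread().getId());
                }
                if (stop == v) {
                    System.out.println("elapsed(ms): " + (System.currentTimeMillis() - start));
                    System.exit(1);
                }
            }
        }
    });
    ret.setDaemon(false);
    return ret;
}

The synchronized version can be instrumented the same way inside its synchronized block; given the barging described earlier, its sampled IDs tend to come in long runs from the same thread.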


Back to the original question: why do we usually assume atomic operations are faster than locking? The example in this article is quite special. In a normal business scenario, after one increment a thread runs through a lot of business logic before it increments again, which spans many CPU time slices. The synchronized accumulator can then no longer hold on to the lock continuously, so it pays both the cost of waiting for (and waking up from) the lock and the cache-coherence cost, while the atomic accumulator only pays the latter. In the general case, the synchronized accumulator is therefore much slower.
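A hedged way to check that intuition (again, not part of the author's test) is to insert some simulated business work between increments, so that the shared counter is no longer the only thing the threads do. someBusinessWork below is a made-up stand-in:

// Hypothetical variation on startAtomic: each thread does some unrelated work between
// increments, simulating business logic that runs between counter updates.
static long someBusinessWork(long seed) {
    long x = seed;
    for (int k = 0; k < 200; k++) {
        x = x * 6364136223846793005L + 1442695040888963407L;   // cheap busy work
    }
    return x;
}

static Thread startAtomicWithWork(final int addTimes, final int stop) {
    Thread ret = new Thread(new Runnable() {
        @Override
        public void run() {
            long sink = 0;                                     // keep the work from being optimized away
            for (int j = 0; j < addTimes; j++) {
                sink = someBusinessWork(sink + j);             // "business logic" before the increment
                int v = ATOMIC_INTEGER.incrementAndGet();
                if (stop == v) {
                    System.out.println("value: " + v + " (sink=" + sink + ")");
                    System.out.println("elapsed(ms): " + (System.currentTimeMillis() - start));
                    System.exit(1);
                }
            }
        }
    });
    ret.setDaemon(false);
    return ret;
}

The synchronized counterpart would call someBusinessWork outside its synchronized block; with the lock released between increments, other threads get a real chance to acquire it, the waiting and wakeup costs described above come back into play, and the atomic version is generally expected to come out ahead.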




