Tips for efficient use of locks in Java (reposted)

Source: Internet
Author: User

Lock contention is a major cause of performance bottlenecks in multithreaded applications

It is important to distinguish the performance impact of contended locks from that of uncontended (non-competitive) locks. If a lock is used by only one thread from start to finish, the JVM can optimize away most of its cost. If a lock is used by more than one thread, but only one thread tries to acquire it at any given time, the cost is slightly higher. We refer to both of these cases as uncontended locks. The worst case occurs when multiple threads try to acquire a lock at the same time. The JVM cannot optimize this case away, and it usually involves a switch from user mode to kernel mode. Modern JVMs apply many optimizations to uncontended locks, so that they have almost no effect on performance. The following are some of the common optimizations.

    • If a lock object can only be accessed by the current thread, no other thread can ever acquire the lock or synchronize on it, so the JVM can remove the lock request (lock elision).
    • Escape analysis determines whether a reference to a local object is ever exposed to the heap. If it is not, the reference can be treated as thread-local.
    • The compiler can also perform lock coarsening: adjacent synchronized blocks that use the same lock are merged to avoid unnecessary lock acquisitions and releases.
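
The optimizations listed above can be illustrated with a minimal sketch. The class and method names here are hypothetical, and whether a particular JVM actually applies lock elision or lock coarsening to this code depends on its JIT compiler and settings; the code only shows the shapes that make these optimizations possible.

```java
final class LockOptimizationSketch {
    private final Object lock = new Object();
    private int a;
    private int b;

    // StringBuffer methods are synchronized, but 'sb' never escapes this
    // method, so a JIT compiler is free to elide its locks.
    static String joinLocally(String x, String y) {
        StringBuffer sb = new StringBuffer();
        sb.append(x);
        sb.append(y);
        return sb.toString();
    }

    // Two adjacent blocks on the same lock: a JIT compiler may coarsen
    // them into a single acquire/release pair.
    void updateBoth() {
        synchronized (lock) {
            a++;
        }
        synchronized (lock) {
            b++;
        }
    }

    int sum() {
        synchronized (lock) {
            return a + b;
        }
    }

    public static void main(String[] args) {
        LockOptimizationSketch s = new LockOptimizationSketch();
        s.updateBoth();
        System.out.println(joinLocally("un", "contended") + " " + s.sum());
    }
}
```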

So don't worry too much about the overhead of uncontended locks. Focus instead on optimizing the critical sections where lock contention really occurs.

Ways to reduce lock contention

Many developers try to minimize their use of locks because they worry about the performance cost of synchronization, and they leave even critical sections unprotected when the probability of an error seems very low. Doing so rarely improves performance, and it introduces bugs that are hard to debug, because they occur with very low probability and are difficult to reproduce.

Therefore, as long as the program must remain correct, the first step in addressing the performance cost of synchronization is not to remove locks but to reduce contention for them. In general, there are three ways to reduce lock contention: reduce the time locks are held, reduce the frequency with which locks are requested, or replace exclusive locks with other coordination mechanisms. Each of these approaches includes a number of best practices, described in the following sections.

Avoid time-consuming computations in critical sections

A common technique for making code thread-safe is to put a "big lock" around an entire function, for example by declaring a whole Java method as synchronized. However, what needs to be protected is only the shared state of the object, not the code.

Holding locks for a long time limits the scalability of the application. In the book Java Concurrency in Practice, Brian Goetz points out that if an operation holds a lock for 2 milliseconds and every operation requires that lock, the throughput of the application cannot exceed 500 operations per second (1000 ms / 2 ms), no matter how many idle processors are available. If you can reduce the hold time to 1 millisecond, the lock-related throughput limit rises to 1000 operations per second. In fact, this is a conservative estimate of the cost of holding a lock too long, because it does not include the cost of contention itself, for example the CPU time wasted on busy-waiting and on thread switches after a failed attempt to acquire the lock. The most effective way to reduce the likelihood of contention is to hold the lock for as short a time as possible. This can be achieved by moving code that does not need lock protection out of the synchronized block, especially expensive operations and potentially blocking operations such as I/O.

In Example 1, we use JLM (Java Lock Monitor) to observe lock usage in Java. foo1 protects the entire method with synchronized, while foo2 protects only the update of the variable maph. AVER_HTM shows the average hold time of each lock. You can see that when the unrelated statements are moved out of the synchronized block, the lock hold time drops and the program runs faster.

Example 1. Avoid time-consuming computations in critical sections

    import java.util.HashMap;
    import java.util.Map;

    public class TimeConsumingLock implements Runnable {
        private final Map<String, String> maph = new HashMap<String, String>();
        private int opNum;

        public TimeConsumingLock(int on) {
            opNum = on;
        }

        // The whole method, including the string work, runs under the lock.
        public synchronized void foo1(int k) {
            String key = Integer.toString(k);
            String value = key + "value";
            if (null == key) {
                return;
            } else {
                maph.put(key, value);
            }
        }

        // Only the update of the shared map runs under the lock.
        public void foo2(int k) {
            String key = Integer.toString(k);
            String value = key + "value";
            if (null == key) {
                return;
            } else {
                synchronized (this) {
                    maph.put(key, value);
                }
            }
        }

        public void run() {
            for (int i = 0; i < opNum; i++) {
                foo1(i); // time consuming
                foo2(i); // this would be better
            }
        }
    }

Results from the JLM report

    Using foo1, MON-NAME [08121048] (Object)
    %MISS     GETS   NONREC  SLOW  REC      TIER2    TIER3  %UTIL  AVER_HTM
        0  5318465  5318465   ...    0  349190349  8419428    ...      5032
    Execution time: 16106 milliseconds

    Using foo2, MON-NAME [d594c53c] (Object)
    %MISS     GETS   NONREC  SLOW  REC      TIER2    TIER3  %UTIL  AVER_HTM
        0  5635938  5635938   ...    0  373087821  8968423    ...      3322
    Execution time: 12157 milliseconds

Lock splitting and lock striping

Another way to reduce lock contention is to reduce how often threads request a lock. Lock splitting and lock striping are two ways to achieve this. Independent state variables should each be protected by a separate lock, but developers sometimes mistakenly use a single lock to protect all state variables. These techniques reduce lock granularity and achieve better scalability. However, the locks must be assigned carefully to reduce the risk of deadlock.

If a lock guards multiple state variables that are independent of each other, you may be able to improve scalability by splitting the lock so that each lock guards a different variable. After this change, each lock is requested less often. For locks under moderate contention, lock splitting can effectively turn most of them into uncontended locks, improving both performance and scalability.

In Example 2, the lock that originally protected two independent object variables is split into two locks, each protecting one variable. In the JLM results, you can see that the single lock on the SplittingLock object is replaced by two locks on java/util/HashSet objects, and that both the number of lock requests (GETS) and the level of contention (SLOW, TIER2, TIER3) drop sharply. The execution time of the program falls from 12981 milliseconds to 4797 milliseconds.

When a lock is heavily contended, splitting it in two is likely to produce two heavily contended locks. This does allow two threads to execute concurrently and gives a small improvement in scalability, but it still does not significantly increase concurrency on a system with many processors.

Lock splitting can sometimes be extended to a set of locks, each guarding an independent part of a set of objects; this is called lock striping. For example, the segment-based implementation of ConcurrentHashMap used before Java 8 has an array of 16 locks, each of which guards 1/16 of the hash buckets. Assuming that hash values are evenly distributed, this reduces the requests for any one lock to about 1/16 of the original. This technique allows ConcurrentHashMap to support 16 concurrent writers. The number of locks can be increased when better concurrency is needed under heavy access on multiprocessor systems.

In Example 3, we simulate the lock striping used in ConcurrentHashMap by protecting different parts of an array with 4 locks. In the JLM results, you can see that the single lock on the StrippingLock object is replaced by four locks on java/lang/Object instances, and that the level of lock contention (TIER2, TIER3) drops sharply. The execution time of the program falls from 5536 milliseconds to 1857 milliseconds.

Example 2. Lock splitting

    import java.util.HashSet;
    import java.util.Set;

    public class SplittingLock implements Runnable {
        private final Set<String> users = new HashSet<String>();
        private final Set<String> queries = new HashSet<String>();
        private int opNum;

        public SplittingLock(int on) {
            opNum = on;
        }

        // Both methods below contend for the same lock (this).
        public synchronized void addUser1(String u) {
            users.add(u);
        }

        public synchronized void addQuery1(String q) {
            queries.add(q);
        }

        // Each set is guarded by its own lock.
        public void addUser2(String u) {
            synchronized (users) {
                users.add(u);
            }
        }

        public void addQuery2(String q) {
            synchronized (queries) {
                queries.add(q);
            }
        }

        public void run() {
            for (int i = 0; i < opNum; i++) {
                String user = new String("user");
                user += i;
                addUser1(user);
                String query = new String("query");
                query += i;
                addQuery1(query);
            }
        }
    }

Results from the JLM report

    Using addUser1 and addQuery1, MON-NAME [d5848cb0] (Object)
    %MISS     GETS   NONREC  SLOW  REC      TIER2     TIER3  %UTIL  AVER_HTM
        0  9004711  9004711   101    0  482982391  10996987    ...      3393
    Execution time: 12981 milliseconds

    Using addUser2 and addQuery2, MON-NAME [d5928c98] java/util/HashSet (Object)
    %MISS     GETS   NONREC  SLOW  REC      TIER2     TIER3  %UTIL  AVER_HTM
        0  1875510  1875510    38    0  108706364   2546875    ...      5173
    MON-NAME [d5928c98] java/util/HashSet (Object)
    %MISS     GETS   NONREC  SLOW  REC      TIER2     TIER3  %UTIL  AVER_HTM
        0   272365   272365     0    0   15154239    352397      1      3042
    Execution time: 4797 milliseconds

Example 3. Lock striping

    public class StrippingLock implements Runnable {
        private final Object[] locks;
        private static final int N_LOCKS = 4;
        private final String[] share;
        private int opNum;
        private int N_ANUM;

        public StrippingLock(int on, int anum) {
            opNum = on;
            N_ANUM = anum;
            share = new String[N_ANUM];
            locks = new Object[N_LOCKS];
            for (int i = 0; i < N_LOCKS; i++)
                locks[i] = new Object();
        }

        public synchronized void put1(int indx, String k) {
            share[indx] = k; // acquire the object lock
        }

        public void put2(int indx, String k) {
            synchronized (locks[indx % N_LOCKS]) {
                share[indx] = k; // acquire the corresponding lock
            }
        }

        public void run() {
            // the expensive put
            /*for (int i = 0; i < opNum; i++) {
                put1(i % N_ANUM, Integer.toString(i + 1));
            }*/
            // the cheap put
            for (int i = 0; i < opNum; i++) {
                put2(i % N_ANUM, Integer.toString(i + 1));
            }
        }
    }

Results from the JLM report

    Using put1, MON-NAME [08121228] (Object)
    %MISS     GETS   NONREC  SLOW  REC      TIER2    TIER3  %UTIL  AVER_HTM
        0  4830690  4830690   460    0  229538313  5010789    ...      2552
    Execution time: 5536 milliseconds

    Using put2, four monitors of type java/lang/Object (Object)
    MON-NAME    %MISS     GETS   NONREC  SLOW  REC      TIER2    TIER3  %UTIL  AVER_HTM
    [08121388]      0  4591046  4591046  1517    0  151042525  3016162     13      1925
    [08121330]      0  1717579  1717579   523    0   50596994   958796      5      1901
    [081213E0]      0  1814296  1814296   536    0   58043786  1113454      5      1799
    [08121438]      0  3126427  3126427   901    0   96627408  1857005      9      1979
    Execution time: 1857 milliseconds

Avoid hot fields

In some applications, a shared variable is used to cache a commonly used computed result, and every update operation must modify that variable to keep it valid. Examples include the size of a queue, a counter, and the head-node reference of a linked list. In a multithreaded application, this shared variable must be protected by a lock. An optimization that is common in single-threaded applications thus becomes a hot field in multithreaded applications and limits scalability. If a queue is designed to sustain high throughput under multithreaded access, consider not updating the queue size on every enqueue and dequeue operation. ConcurrentHashMap avoids this problem by maintaining a separate counter in each segment of its array, protected by that segment's lock, instead of maintaining one global count.
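
The per-segment counter idea can be sketched as follows. This is an illustration only, not the ConcurrentHashMap implementation: the class name, the stripe count of 4, and the choice of stripe by key are all assumptions made for the example.

```java
final class StripedCounterSketch {
    private static final int N_STRIPES = 4; // hypothetical stripe count
    private final Object[] locks = new Object[N_STRIPES];
    private final long[] counts = new long[N_STRIPES];

    StripedCounterSketch() {
        for (int i = 0; i < N_STRIPES; i++) {
            locks[i] = new Object();
        }
    }

    // Each update touches only the counter of its own stripe,
    // so threads working on different stripes do not contend.
    void increment(int key) {
        int s = Math.floorMod(key, N_STRIPES);
        synchronized (locks[s]) {
            counts[s]++;
        }
    }

    // The total is computed on demand by visiting every stripe in turn.
    long size() {
        long total = 0;
        for (int s = 0; s < N_STRIPES; s++) {
            synchronized (locks[s]) {
                total += counts[s];
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        StripedCounterSketch c = new StripedCounterSketch();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 10_000; i++) {
                    c.increment(i);
                }
            });
            threads[t].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(c.size()); // 4 threads x 10000 increments = 40000
    }
}
```

The trade-off is that size() becomes more expensive and is not a single atomic snapshot while updates are in flight, which is usually acceptable when the total is read far less often than it is updated.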

Alternatives to exclusive locks

The third technique for mitigating the performance impact of contended locks is to give up exclusive locks and manage shared state with mechanisms that allow more concurrency, such as concurrent containers, read-write locks, immutable objects, and atomic variables.

java.util.concurrent.locks.ReadWriteLock implements a multiple-reader, single-writer lock: multiple readers can access the shared resource concurrently, but a writer must acquire the lock exclusively. For data structures where most operations are reads, ReadWriteLock provides better concurrency than an exclusive lock.
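
A minimal sketch of the read-write lock pattern, using the standard ReentrantReadWriteLock implementation. The class name and the map-based cache are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ReadMostlyCacheSketch {
    private final Map<String, String> map = new HashMap<>();
    private final ReadWriteLock rw = new ReentrantReadWriteLock();

    // Many readers may hold the read lock at the same time.
    String get(String key) {
        rw.readLock().lock();
        try {
            return map.get(key);
        } finally {
            rw.readLock().unlock();
        }
    }

    // A writer holds the write lock exclusively.
    void put(String key, String value) {
        rw.writeLock().lock();
        try {
            map.put(key, value);
        } finally {
            rw.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadMostlyCacheSketch cache = new ReadMostlyCacheSketch();
        cache.put("k", "v");
        System.out.println(cache.get("k"));
    }
}
```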

Atomic variables offer a way to avoid the lock contention caused by updates to a hot field, such as a counter, a sequence generator, or the head-node reference of a linked data structure.
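
As a small illustration of the sequence-generator case, the following sketch uses java.util.concurrent.atomic.AtomicLong in place of a synchronized counter. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SequenceGeneratorSketch {
    private final AtomicLong next = new AtomicLong(0);

    // getAndIncrement is a single atomic read-modify-write, so no lock is needed.
    long nextId() {
        return next.getAndIncrement();
    }

    public static void main(String[] args) throws InterruptedException {
        SequenceGeneratorSketch gen = new SequenceGeneratorSketch();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) {
                    gen.nextId();
                }
            });
            threads[t].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(gen.nextId()); // 4000 ids were handed out before this call
    }
}
```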

In Example 4, we use atomic operations to update each element of an array, avoiding exclusive locks. The execution time of the program is reduced from 23550 milliseconds to 842 milliseconds.

Example 4. Arrays updated with atomic operations

    import java.util.concurrent.atomic.AtomicLongArray;

    public class AtomicLock implements Runnable {
        private final long d[];
        private final AtomicLongArray a;
        private int a_size;

        public AtomicLock(int size) {
            a_size = size;
            d = new long[size];
            a = new AtomicLongArray(size);
        }

        public synchronized void set1(int idx, long val) {
            d[idx] = val;
        }

        public synchronized long get1(int idx) {
            long ret = d[idx];
            return ret;
        }

        // Note: addAndGet adds val to the element rather than overwriting it.
        public void set2(int idx, long val) {
            a.addAndGet(idx, val);
        }

        public long get2(int idx) {
            long ret = a.get(idx);
            return ret;
        }

        public void run() {
            for (int i = 0; i < a_size; i++) {
                // the slower operations
                set1(i, i);
                get1(i);
                // the quicker operations
                set2(i, i);
                get2(i);
            }
        }
    }

Results

    Using set1 and get1: execution time 23550 milliseconds
    Using set2 and get2: execution time 842 milliseconds

Using concurrent containers

Starting with Java 1.5, the java.util.concurrent package provides efficient, thread-safe concurrent containers. Concurrent containers guarantee thread safety themselves and are optimized for common operations under access by a large number of threads. They are well suited to multithreaded applications running on multicore platforms, offering high performance and scalability. The Amino project provides more efficient concurrent containers and algorithms.
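
A brief sketch of a concurrent container in use. The class name and the word-counting scenario are hypothetical, and the example assumes Java 8 or later, where ConcurrentHashMap provides merge and getOrDefault.

```java
import java.util.concurrent.ConcurrentHashMap;

final class WordCountSketch {
    private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    // merge performs the read-modify-write for one key atomically,
    // so callers need no external lock.
    void record(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    int count(String word) {
        return counts.getOrDefault(word, 0);
    }

    public static void main(String[] args) throws InterruptedException {
        WordCountSketch wc = new WordCountSketch();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) {
                    wc.record("lock");
                }
            });
            threads[t].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(wc.count("lock")); // 4 threads x 1000 records = 4000
    }
}
```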

Using immutable data and thread-local data

Immutable data never changes during its lifetime, so each thread can safely keep its own copy of it for fast reads.
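
A minimal sketch of an immutable value class, with hypothetical names. All fields are final and set once in the constructor, and a "modification" returns a new object, so instances can be shared or copied between threads without locking.

```java
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() {
        return x;
    }

    int y() {
        return y;
    }

    // Returns a new instance instead of changing this one.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.withX(5);
        System.out.println(p.x() + " " + q.x() + " " + q.y());
    }
}
```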

Thread-local data is accessed only by the thread that owns it, so it raises none of the problems of sharing data between threads. ThreadLocal can be used to improve many existing uses of shared data. For example, an object pool or wait queue shared by all threads can become a per-thread object pool or wait queue. Using a work-stealing scheduler instead of a traditional FIFO-queue scheduler is another example of using thread-local data.
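
A small sketch of replacing a shared scratch object with a per-thread one. The class name and the formatting task are hypothetical, and ThreadLocal.withInitial assumes Java 8 or later.

```java
final class PerThreadBufferSketch {
    // Each thread lazily gets its own StringBuilder, so no lock is needed.
    private static final ThreadLocal<StringBuilder> BUF =
            ThreadLocal.withInitial(StringBuilder::new);

    static String format(String name, int value) {
        StringBuilder sb = BUF.get();
        sb.setLength(0); // reuse the calling thread's private buffer
        return sb.append(name).append('=').append(value).toString();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> System.out.println(format("a", 1)));
        Thread t2 = new Thread(() -> System.out.println(format("b", 2)));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```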

Conclusion

Locks are an indispensable tool for developing multithreaded applications. Now that multicore platforms are mainstream, using locks properly is becoming a basic skill for developers. Although lock-free programming and transactional memory have come to the attention of software developers, using locks will remain the most important parallel programming skill for the foreseeable future. We hope that the approaches presented in this article help you use locks correctly.

Original address: http://www.ibm.com/developerworks/cn/java/j-lo-lock/
