Java Concurrency Programming: Low-Level Optimizations of synchronized (Biased Locks, Lightweight Locks)


Java Concurrency Programming series (in progress):

    • Java Concurrency Programming: Core Theory

    • Java Concurrency Programming: synchronized and Its Implementation Principles

    • Java Concurrency Programming: Low-Level Optimizations of synchronized (Lightweight Locks, Biased Locks)

I. Heavyweight locks

In the previous article we covered how synchronized is used and how it is implemented. We now know that synchronized is implemented through a monitor lock inside the object. The monitor lock, in turn, is built on the mutex lock provided by the underlying operating system. To switch between threads, the operating system has to transition from user mode to kernel mode, and this transition is expensive and takes a relatively long time, which is why synchronized is considered inefficient. A lock that relies on an operating system mutex is therefore what we call a "heavyweight lock". The core of the optimizations made to synchronized in the JDK is to reduce how often this heavyweight lock is used. Starting with JDK 1.6, "lightweight locks" and "biased locks" were introduced to reduce the cost of acquiring and releasing locks.

II. Lightweight locks

A lock can be in one of four states: no lock, biased lock, lightweight lock, and heavyweight lock. As contention increases, a lock can be upgraded from a biased lock to a lightweight lock and then to a heavyweight lock. The upgrade is one-way: a lock can only move from a lower state to a higher one and is not downgraded. In JDK 1.6, biased and lightweight locks are enabled by default, and biased locking can be disabled with -XX:-UseBiasedLocking. The lock state is stored in the object header (the Mark Word). Taking a 32-bit JDK as an example:

Lock state       | 25 bit                            | 4 bit                   | 1 bit (biased?) | 2 bit (lock flag)
Lightweight lock | Pointer to lock record in stack                                               | 00
Heavyweight lock | Pointer to mutex (heavyweight lock)                                           | 10
GC mark          | Empty                                                                         | 11
Biased lock      | Thread ID (23 bit), Epoch (2 bit) | Object generational age | 1               | 01
No lock          | Object hashCode (25 bit)          | Object generational age | 0               | 01

In the lightweight lock, heavyweight lock, and GC mark rows, a single 30-bit field spans the first three columns.
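The table can be read mechanically: the lowest two bits are the lock flag, and when the flag is 01 the next bit tells a biased object from an unlocked one. The following is a minimal sketch of that decoding, assuming the 32-bit layout shown above. The class and method names (MarkWordSketch, lockState) are hypothetical and are not part of any JDK API; real Mark Words cannot be read this way from ordinary Java code.

```java
/** Illustrative decoder for the 32-bit Mark Word layout in the table above (not a JDK API). */
final class MarkWordSketch {
    // The lowest 2 bits hold the lock flag; the next bit is the "biased lock" bit.
    private static final int LOCK_FLAG_MASK = 0b11;
    private static final int BIASED_BIT = 0b100;

    /** Returns the lock state encoded in the low three bits of a simulated mark word. */
    static String lockState(int markWord) {
        switch (markWord & LOCK_FLAG_MASK) {
            case 0b00:
                return "lightweight lock";
            case 0b10:
                return "heavyweight lock";
            case 0b11:
                return "GC mark";
            default: // flag 01: unlocked or biased, decided by the biased bit
                return (markWord & BIASED_BIT) != 0 ? "biased lock" : "no lock";
        }
    }

    public static void main(String[] args) {
        System.out.println(lockState(0b001)); // biased bit 0, flag 01
        System.out.println(lockState(0b101)); // biased bit 1, flag 01
        System.out.println(lockState(0b000)); // flag 00
        System.out.println(lockState(0b010)); // flag 10
    }
}
```

Only the low three bits matter for the state; the remaining bits carry the hash code, thread ID, or pointer, depending on that state.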

"Lightweight" is relative to the traditional lock implemented with an operating system mutex. It is worth emphasizing first that lightweight locks are not meant to replace heavyweight locks; their purpose is to reduce the cost of the traditional heavyweight lock when there is no multi-threaded contention. Before walking through how a lightweight lock works, note that it suits the scenario where threads execute a synchronized block alternately. If several threads access the same lock at the same time, the lightweight lock inflates into a heavyweight lock.

1. Locking process of the lightweight lock

(1) When the code enters a synchronized block and the lock object is in the unlocked state (lock flag "01", biased-lock bit "0"), the virtual machine first creates a space called a lock record in the current thread's stack frame. It is used to store a copy of the lock object's current Mark Word, officially called the Displaced Mark Word. At this point the thread stack and the object header are in the state shown in Figure 2.1.

(2) The Mark Word in the object header is copied into the lock record.

(3) After the copy succeeds, the virtual machine uses a CAS operation to try to update the object's Mark Word to a pointer to the lock record, and points the owner pointer in the lock record to the object's Mark Word. If the update succeeds, go to step (4); otherwise go to step (5).

(4) If the update succeeds, the thread owns the lock on the object, and the lock flag in the object's Mark Word is set to "00", meaning the object is in the lightweight-locked state. The thread stack and the object header are now in the state shown in Figure 2.2.

(5) If the update fails, the virtual machine first checks whether the object's Mark Word points to the current thread's stack frame. If it does, the current thread already owns the lock and can enter the synchronized block directly. Otherwise, multiple threads are competing for the lock, and the lightweight lock inflates into a heavyweight lock: the lock flag changes to "10", the Mark Word stores a pointer to the heavyweight lock (mutex), and threads that later wait for the lock enter the blocked state. The current thread, meanwhile, tries to acquire the lock by spinning, that is, by retrying in a loop so that it does not have to block.

Figure 2.1 State of the stack and the object before the lightweight lock CAS operation

Figure 2.2 State of the stack and the object after the lightweight lock CAS operation
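The locking steps can be modeled in plain Java with an AtomicReference standing in for the Mark Word. This is only a sketch under stated assumptions: LightweightLockSketch, LockRecord, Result, and UNLOCKED are hypothetical names invented for illustration, the "pointer into the stack frame" is replaced by an object reference, and none of this reflects how HotSpot is actually coded.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Simplified model of lightweight-lock acquisition (illustrative only). */
final class LightweightLockSketch {

    /** Stand-in for the lock record in the locking thread's stack frame. */
    static final class LockRecord {
        final Thread owner;
        Object displacedMarkWord; // copy of the object's original Mark Word

        LockRecord(Thread owner) {
            this.owner = owner;
        }
    }

    enum Result { ACQUIRED, ALREADY_OWNED, INFLATE }

    /** Stand-in for an unlocked Mark Word (hash code, age, flag bits 01). */
    static final Object UNLOCKED = new Object();

    /** Simulated Mark Word: either UNLOCKED or a reference to the owner's LockRecord. */
    final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED);

    Result tryLock(LockRecord record) {
        Object current = markWord.get();
        if (current == UNLOCKED) {
            // Steps (1)-(2): copy the current Mark Word into the lock record.
            record.displacedMarkWord = current;
            // Step (3): CAS the Mark Word so that it points at the lock record.
            if (markWord.compareAndSet(current, record)) {
                return Result.ACQUIRED; // step (4)
            }
            current = markWord.get(); // lost the race; re-read for step (5)
        }
        // Step (5): does the Mark Word already point at a record of this thread?
        if (current instanceof LockRecord && ((LockRecord) current).owner == record.owner) {
            return Result.ALREADY_OWNED;
        }
        return Result.INFLATE; // real contention: upgrade to a heavyweight lock
    }

    public static void main(String[] args) {
        LightweightLockSketch lock = new LightweightLockSketch();
        Thread me = Thread.currentThread();
        Thread other = new Thread(() -> { }); // never started; used only as an identity

        System.out.println(lock.tryLock(new LockRecord(me)));    // ACQUIRED
        System.out.println(lock.tryLock(new LockRecord(me)));    // ALREADY_OWNED
        System.out.println(lock.tryLock(new LockRecord(other))); // INFLATE
    }
}
```

The point of the model is that the uncontended path costs exactly one compareAndSet, with no call into the operating system.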

2. Unlocking process of the lightweight lock

(1) A CAS operation tries to replace the object's current Mark Word with the Displaced Mark Word that was copied into the thread's lock record.

(2) If the replacement succeeds, the whole synchronization process is complete.

(3) If the replacement fails, another thread has tried to acquire the lock (and the lock has been inflated), so the suspended threads must be woken up while the lock is released.
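The unlock path can be sketched the same way: one CAS that swings the Mark Word back, and a failure that signals inflation. The names below (LightweightUnlockSketch, UNLOCKED, INFLATED, inflate) are hypothetical and exist only for this illustration; inflation is simulated by a direct write rather than by a second thread.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Simplified model of lightweight-lock release (illustrative only). */
final class LightweightUnlockSketch {
    static final Object UNLOCKED = new Object(); // stand-in for the original Mark Word
    static final Object INFLATED = new Object(); // stand-in for a pointer to a heavyweight monitor

    final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED);

    /** Locks by pointing the Mark Word at lockRecord; returns the displaced Mark Word, or null on failure. */
    Object lock(Object lockRecord) {
        return markWord.compareAndSet(UNLOCKED, lockRecord) ? UNLOCKED : null;
    }

    /** Returns true on the fast path; false means the lock was inflated and waiters must be woken. */
    boolean unlock(Object lockRecord, Object displacedMarkWord) {
        // Step (1): CAS the Displaced Mark Word back into the object header.
        return markWord.compareAndSet(lockRecord, displacedMarkWord);
    }

    /** Simulates a competing thread inflating the lock while it is held. */
    void inflate() {
        markWord.set(INFLATED);
    }

    public static void main(String[] args) {
        LightweightUnlockSketch obj = new LightweightUnlockSketch();
        Object record = new Object(); // stand-in for this thread's lock record

        Object displaced = obj.lock(record);
        System.out.println(obj.unlock(record, displaced)); // true: step (2)

        displaced = obj.lock(record);
        obj.inflate(); // a competitor inflated the lock in the meantime
        System.out.println(obj.unlock(record, displaced)); // false: step (3)
    }
}
```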

III. Biased locks

Biased locking was introduced to avoid unnecessary lightweight-lock code paths when there is no multi-threaded contention. Acquiring and releasing a lightweight lock relies on several CAS atomic instructions, whereas a biased lock needs a CAS atomic instruction only once, when the thread ID is installed. (Because a biased lock must be revoked as soon as multiple threads compete, the cost of revoking it must be lower than the cost of the CAS instructions it saves.) As mentioned above, lightweight locks improve performance when threads execute a synchronized block alternately, while biased locks improve performance further when only one thread ever executes the synchronized block.

1. Acquiring a biased lock

(1) Check whether the biased-lock bit in the Mark Word is set to 1 and the lock flag is 01, which confirms that the object is in the biasable state.

(2) If it is biasable, test whether the thread ID points to the current thread. If so, go to step (5); otherwise go to step (3).

(3) If the thread ID does not point to the current thread, compete for the lock with a CAS operation. If the CAS succeeds, the thread ID in the Mark Word is set to the current thread ID and execution continues at step (5); if it fails, go to step (4).

(4) A failed CAS indicates contention. When the global safepoint is reached, the thread that holds the biased lock is suspended, the biased lock is upgraded to a lightweight lock, and the thread that was blocked at the safepoint then continues executing the synchronized code.

(5) Execute the synchronized code.
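The acquisition steps reduce to "compare the stored thread ID, and CAS only the first time". The following sketch models just the thread-ID field with an AtomicLong. BiasedLockSketch, Result, and ANONYMOUS are hypothetical names, thread IDs are assumed to be non-zero, and revocation at a safepoint is only reported, not performed.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Simplified model of biased-lock acquisition (illustrative only). */
final class BiasedLockSketch {
    /** Assumed sentinel: biasable, but no thread ID has been installed yet. */
    static final long ANONYMOUS = 0L;

    enum Result { ALREADY_BIASED, BIAS_ACQUIRED, REVOKE }

    /** Simulated thread-ID field of the Mark Word. */
    private final AtomicLong biasedThreadId = new AtomicLong(ANONYMOUS);

    Result enter(long threadId) {
        // Step (2): already biased toward this thread, so no atomic instruction is needed.
        if (biasedThreadId.get() == threadId) {
            return Result.ALREADY_BIASED;
        }
        // Step (3): try to install our thread ID with a single CAS.
        if (biasedThreadId.compareAndSet(ANONYMOUS, threadId)) {
            return Result.BIAS_ACQUIRED;
        }
        // Step (4): the CAS failed, so another thread owns the bias; it must be revoked.
        return Result.REVOKE;
    }

    public static void main(String[] args) {
        BiasedLockSketch lock = new BiasedLockSketch();
        System.out.println(lock.enter(1L)); // BIAS_ACQUIRED
        System.out.println(lock.enter(1L)); // ALREADY_BIASED
        System.out.println(lock.enter(2L)); // REVOKE
    }
}
```

Every entry by the owning thread after the first one takes the ALREADY_BIASED branch, which is a plain read and comparison; that is where the saving over the lightweight lock comes from.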

2. Releasing a biased lock

Revocation of a biased lock was mentioned in step (4) above. A thread holding a biased lock releases it only when another thread tries to compete for it; the holder never releases the biased lock on its own initiative. Revocation has to wait for a global safepoint (a point at which no bytecode is executing). The virtual machine first pauses the thread that holds the biased lock and checks whether the lock object is still locked. After revocation, the object reverts either to the unlocked state (flag "01") or to the lightweight-locked state (flag "00").

3. Conversion between heavyweight, lightweight, and biased locks

Figure 2.3 Conversion between the three lock types

The figure mainly summarizes the content above. If you have understood the preceding sections, it should be easy to read.

IV. Other optimizations

1. Adaptive spinning: From the lightweight-lock process we know that when a thread's CAS fails while acquiring a lightweight lock, it spins before falling back to the heavyweight lock. The problem is that spinning burns CPU: if the lock never becomes available, the thread keeps spinning and wastes CPU resources. The simplest fix is to cap the number of spins, for example loop 10 times and block if the lock still has not been acquired. The JDK takes a smarter approach, adaptive spinning: put simply, if a thread's spin succeeds, it spins more next time; if the spin fails, it spins less.
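The idea of adjusting the spin budget from past outcomes can be sketched with a small spin lock. This is an illustration only: the doubling and halving policy, the bounds, and the names (AdaptiveSpinSketch, trySpinLock, currentSpinLimit) are assumptions made for the example and are not the heuristic the JVM actually uses.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy spin lock whose spin budget grows after success and shrinks after failure. */
final class AdaptiveSpinSketch {
    private static final int MIN_SPINS = 1;
    private static final int MAX_SPINS = 1 << 10;

    private final AtomicBoolean locked = new AtomicBoolean(false);
    private volatile int spinLimit = 10; // starting budget, adjusted after each attempt

    /** Spins up to the current budget; returns false if the caller should fall back to blocking. */
    boolean trySpinLock() {
        int limit = spinLimit;
        for (int i = 0; i < limit; i++) {
            if (locked.compareAndSet(false, true)) {
                spinLimit = Math.min(limit * 2, MAX_SPINS); // success: spin longer next time
                return true;
            }
        }
        spinLimit = Math.max(limit / 2, MIN_SPINS); // failure: spin less next time
        return false;
    }

    void unlock() {
        locked.set(false);
    }

    int currentSpinLimit() {
        return spinLimit;
    }

    public static void main(String[] args) {
        AdaptiveSpinSketch lock = new AdaptiveSpinSketch();
        System.out.println(lock.trySpinLock() + " " + lock.currentSpinLimit()); // true 20
        System.out.println(lock.trySpinLock() + " " + lock.currentSpinLimit()); // false 10
        lock.unlock();
        System.out.println(lock.trySpinLock() + " " + lock.currentSpinLimit()); // true 20
    }
}
```

A fixed cap (the "loop 10 times" approach) corresponds to never updating spinLimit; the adaptive variant differs only in the two assignments after the loop outcome is known.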

2. Lock coarsening: The idea is easy to grasp: several adjacent lock and unlock operations are merged into one, so that a run of consecutive locks becomes a single lock covering a larger range. For example:

    package com.paddx.test.string;

    public class StringBufferTest {
        StringBuffer stringBuffer = new StringBuffer();

        public void append() {
            stringBuffer.append("a");
            stringBuffer.append("b");
            stringBuffer.append("c");
        }
    }

Each call to stringBuffer.append here has to lock and unlock. If the virtual machine detects a series of consecutive lock and unlock operations on the same object, it merges them into one lock and unlock over a larger range: it locks when the first append is entered and unlocks after the last append returns.
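Written out by hand, the effect of coarsening is roughly what the second method below does explicitly. This is a sketch for comparison only: LockCoarseningSketch and its method names are hypothetical, and the JIT performs the transformation on compiled code rather than on the source.

```java
/** Contrasts three separately locked calls with one explicitly coarsened lock region. */
final class LockCoarseningSketch {
    private final StringBuffer stringBuffer = new StringBuffer();

    /** As written in the source: each append locks and unlocks the buffer on its own. */
    void appendFine() {
        stringBuffer.append("a");
        stringBuffer.append("b");
        stringBuffer.append("c");
    }

    /** Roughly how the coarsened code behaves: one lock region around all three calls. */
    void appendCoarsened() {
        // StringBuffer's synchronized methods lock on the instance itself.
        synchronized (stringBuffer) {
            stringBuffer.append("a"); // re-entrant: the monitor is already held
            stringBuffer.append("b");
            stringBuffer.append("c");
        }
    }

    String contents() {
        return stringBuffer.toString();
    }

    public static void main(String[] args) {
        LockCoarseningSketch sketch = new LockCoarseningSketch();
        sketch.appendFine();
        sketch.appendCoarsened();
        System.out.println(sketch.contents()); // abcabc
    }
}
```

Both methods produce the same contents; the difference is only in how many times the monitor is acquired and released from the outside.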

3. Lock elimination: Lock elimination removes unnecessary lock operations. Using escape analysis, if the virtual machine can determine that data on the heap never escapes the current thread within a piece of code, it can treat that code as thread-safe and skip the locking. Consider the following program:

    package com.paddx.test.concurrent;

    public class SynchronizedTest02 {

        public static void main(String[] args) {
            SynchronizedTest02 test02 = new SynchronizedTest02();
            // warm-up
            for (int i = 0; i < 10000; i++) {
                i++;
            }
            long start = System.currentTimeMillis();
            for (int i = 0; i < 100000000; i++) {
                test02.append("abc", "def");
            }
            System.out.println("time=" + (System.currentTimeMillis() - start));
        }

        public void append(String str1, String str2) {
            StringBuffer sb = new StringBuffer();
            sb.append(str1).append(str2);
        }
    }

Although StringBuffer's append is a synchronized method, the StringBuffer in this program is a local variable that never escapes the method, so the code is inherently thread-safe and the lock can be eliminated. I ran the program locally to compare the results.

To minimize the influence of other factors, biased locking was disabled here (-XX:-UseBiasedLocking). The program shows that performance still improves considerably once the lock is eliminated.

Note: Results may differ between JDK versions. I used JDK 1.6.
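What "does not escape" means is easiest to see by contrast. The sketch below puts a non-escaping StringBuffer next to a shared one; LockEliminationSketch and its method names are hypothetical. The StringBuilder variant is shown only as a rough picture of what the code behaves like once the locks are elided, not as what the JIT literally generates. On HotSpot, the switches that govern this optimization are, to my knowledge, -XX:+DoEscapeAnalysis and -XX:+EliminateLocks.

```java
/** Contrasts a non-escaping StringBuffer (locks removable) with a shared one (locks required). */
final class LockEliminationSketch {
    private final StringBuffer shared = new StringBuffer();

    /** sb never leaves this method, so no other thread can ever contend for its lock. */
    static String concatLocal(String str1, String str2) {
        StringBuffer sb = new StringBuffer();
        return sb.append(str1).append(str2).toString();
    }

    /** Roughly what concatLocal behaves like after the locks are elided. */
    static String concatUnsynchronized(String str1, String str2) {
        StringBuilder sb = new StringBuilder(); // same append API, no synchronization
        return sb.append(str1).append(str2).toString();
    }

    /** Counter-example: 'shared' is reachable from other threads, so its locks must stay. */
    String appendShared(String str) {
        return shared.append(str).toString();
    }

    public static void main(String[] args) {
        System.out.println(concatLocal("abc", "def"));          // abcdef
        System.out.println(concatUnsynchronized("abc", "def")); // abcdef
        System.out.println(new LockEliminationSketch().appendShared("abc")); // abc
    }
}
```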

V. Summary

This article introduced the biased and lightweight locks that the JDK uses to optimize synchronized. These two locks are not free of drawbacks, however. Under heavy contention they not only fail to improve efficiency but actually reduce it, because of the lock-upgrade process; in that case biased locking should be disabled with -XX:-UseBiasedLocking. The following table compares the lock types:

Lock | Advantages | Disadvantages | Applicable scenarios
Biased lock | Locking and unlocking add no extra cost; there is only a nanosecond-level gap compared with running an unsynchronized method. | If threads contend for the lock, revoking the bias adds extra cost. | Only one thread ever accesses the synchronized block.
Lightweight lock | Competing threads do not block, which improves the program's responsiveness. | A thread that never obtains the lock keeps spinning and consumes CPU. | Response time is the priority; the synchronized block executes very quickly.
Heavyweight lock | Contending threads do not spin and do not consume CPU. | Threads block and response time is slow. | Throughput is the priority; the synchronized block takes longer to execute.

