"Die-knock Java Concurrency"-----In-depth analysis of the implementation principle of synchronized __java

Source: Internet
Author: User
Tags cas mutex stringbuffer

I remember that when I first started learning Java, synchronized was the answer whenever multithreading came up. Back then it seemed magical and powerful; we called it "sync", and it became our cure-all for every multithreading problem. But as we learned more, we found out that synchronized is a heavyweight lock which, compared with Lock, looks so clumsy that we came to regard it as inefficient and gradually abandoned it.
Admittedly, with the various optimizations that Java SE 1.6 made to synchronized, it no longer looks so heavy. Let's explore together the implementation mechanism of synchronized, how Java optimizes it, the lock optimization mechanisms, the lock storage structure, and the upgrade process.

Implementation principle

synchronized guarantees that at any moment only one thread can be inside a given method or code block (the critical section), and it also guarantees the memory visibility of shared variables.

Every object in Java can be used as a lock, which is the basis on which synchronized achieves synchronization:
1. Ordinary synchronized method: the lock is the current instance object
2. Static synchronized method: the lock is the Class object of the current class
3. Synchronized block: the lock is the object in the parentheses
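
The three cases in the list can be checked with Thread.holdsLock, which reports whether the calling thread currently owns a given object's monitor. This is a minimal sketch; the class name LockTargets and its method names are made up for illustration.

```java
class LockTargets {
    // Hypothetical guard object used only by the synchronized block below.
    private final Object guard = new Object();

    // 1. Ordinary synchronized method: the lock is the current instance.
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this);
    }

    // 2. Static synchronized method: the lock is the Class object.
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(LockTargets.class);
    }

    // 3. Synchronized block: the lock is the object in the parentheses,
    //    so `this` is not locked here unless the caller already holds it.
    boolean blockHoldsGuard() {
        synchronized (guard) {
            return Thread.holdsLock(guard) && !Thread.holdsLock(this);
        }
    }

    public static void main(String[] args) {
        LockTargets t = new LockTargets();
        System.out.println(t.instanceMethodHoldsThis());
        System.out.println(staticMethodHoldsClass());
        System.out.println(t.blockHoldsGuard());
    }
}
```

Each call returns true while the lock is held, and the lock is gone again as soon as the method or block exits.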

When a thread reaches a synchronized block, it must first acquire the lock before executing the synchronized code, and it must release the lock when it exits normally or throws an exception. How is this mechanism implemented? Let's look at a simple piece of code:

public class SynchronizedTest {
    // Synchronized method: the lock is the current instance.
    public synchronized void test1() {

    }

    public void test2() {
        // Synchronized block: the lock is the object in parentheses.
        synchronized (this) {

        }
    }
}

Use the javap tool to view the generated class file information and analyze how synchronized is implemented.

The javap output shows that the synchronized code block is implemented with the monitorenter and monitorexit instructions, while the synchronized method (as far as the bytecode shows, without looking at the JVM's underlying implementation) relies on the ACC_SYNCHRONIZED flag among the method's modifiers.
Synchronized code block: the monitorenter instruction is inserted at the beginning of the synchronized block and the monitorexit instruction at its end, and the JVM must ensure that every monitorenter has a matching monitorexit. Every object has a monitor associated with it, and while that monitor is held, the object is locked. When a thread executes monitorenter, it attempts to take ownership of the object's monitor, that is, to acquire the object's lock.
Synchronized method: a synchronized method is compiled to ordinary method call and return instructions such as invokevirtual and areturn; there is no special instruction at the bytecode level for a synchronized method. Instead, the synchronized flag in the access_flags field of the method's entry in the class file's method table is set to 1, indicating that the method is synchronized and that the lock object is either the object on which the method is invoked or the Klass that represents the method's class inside the JVM. (Excerpt from: http://www.cnblogs.com/javaminer/p/3889023.html)
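
The difference between the two forms is also visible through reflection, without reading bytecode: the synchronized modifier shows up in a method's modifiers only for the synchronized method, not for a method that merely contains a synchronized block. A small sketch, with SyncFlagDemo and isSynchronizedMethod as illustrative names:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class SyncFlagDemo {
    synchronized void test1() { }

    void test2() {
        synchronized (this) { }
    }

    // Returns true when the named no-argument method carries the synchronized modifier.
    static boolean isSynchronizedMethod(String name) {
        try {
            Method m = SyncFlagDemo.class.getDeclaredMethod(name);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("test1: " + isSynchronizedMethod("test1")); // prints "test1: true"
        System.out.println("test2: " + isSynchronizedMethod("test2")); // prints "test2: false"
    }
}
```

For test2 the locking is expressed by instructions inside the method body, so nothing appears in its modifiers.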

Let's continue the analysis, but before going further we need to understand two important concepts: the Java object header and the monitor.

Java object header and monitor

The Java object header and the monitor are the basis for implementing synchronized. Both concepts are introduced in detail below.

Java object header

The lock used by synchronized is stored in the Java object header. So what is the Java object header? The object header of the HotSpot virtual machine consists mainly of two parts: the Mark Word (mark field) and the Klass pointer (type pointer). The Klass pointer points to the object's class metadata, and the virtual machine uses it to determine which class the object is an instance of. The Mark Word stores the object's own run-time data and is the key to implementing lightweight and biased locks, so the following focuses on the Mark Word.
The Mark Word stores run-time data for the object itself, such as the hash code (HashCode), GC generational age, lock state flags, the lock held by a thread, the biased thread ID, the bias timestamp, and so on. A Java object header normally occupies two machine words (in a 32-bit virtual machine, one machine word is 4 bytes, that is, 32 bits). If the object is an array, three machine words are needed: the JVM can determine the size of an ordinary Java object from its metadata, but it cannot determine the size of an array from the array's metadata, so one extra word records the array length. The following figure shows the storage structure of the Java object header (32-bit virtual machine):

The object header is extra storage cost unrelated to the data the object itself defines. For the sake of space efficiency, the Mark Word is designed as a non-fixed data structure so that as much information as possible fits into a very small space, and it reuses its own storage depending on the object's state. In other words, the Mark Word changes as the program runs (32-bit virtual machine):
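
How the lock state is packed into the Mark Word can be illustrated with a couple of bit masks. The sketch below assumes the commonly documented 32-bit HotSpot layout, in which the lowest two bits are the lock flag and the third bit is the biased-lock flag, with the flag values that appear later in this article (01 unlocked or biased, 00 lightweight, 10 heavyweight). The names MarkWordBits and LockState are invented for illustration, and treating 11 as a GC mark is an assumption that is not stated in the text.

```java
class MarkWordBits {
    enum LockState { UNLOCKED, BIASED, LIGHTWEIGHT, HEAVYWEIGHT, GC_MARKED }

    // Decodes the lock state from the low three bits of a 32-bit mark word value.
    static LockState decode(int markWord) {
        int lockBits = markWord & 0b11;            // lowest two bits: lock flag
        boolean biased = (markWord & 0b100) != 0;  // third bit: biased-lock flag
        switch (lockBits) {
            case 0b01: return biased ? LockState.BIASED : LockState.UNLOCKED;
            case 0b00: return LockState.LIGHTWEIGHT;
            case 0b10: return LockState.HEAVYWEIGHT;
            default:   return LockState.GC_MARKED; // 0b11, assumed meaning
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(0b001)); // prints "UNLOCKED"
        System.out.println(decode(0b101)); // prints "BIASED"
    }
}
```

The remaining bits change meaning with the state (hash code and age, thread ID and epoch, or a pointer), which is what makes the structure non-fixed.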

That was a brief introduction to the Java object header; next we look at the monitor.

Monitor

What is a monitor? We can think of it as a synchronization tool or a synchronization mechanism, and it is usually described as an object.
Just as everything is an object, every Java object is a born monitor: every Java object has the potential to become a monitor, because in Java's design every object comes into the world carrying an invisible lock, called the intrinsic lock or monitor lock.
The monitor is a thread-private data structure: each thread has a list of available monitor records, and there is also a global list of available records. Every locked object is associated with a monitor (the LockWord in the Mark Word of the object header points to the start address of the monitor), and the Owner field in the monitor stores the unique identifier of the thread that owns the lock, indicating that the lock is held by that thread. Its structure is as follows:

Owner: initially NULL, meaning that no thread currently owns the monitor record. When a thread successfully acquires the lock, the thread's unique identifier is stored here, and it is set back to NULL when the lock is released;
EntryQ: associated with a system mutex (semaphore) that blocks all threads that fail in their attempt to lock the monitor record.
RcThis: the number of threads blocked or waiting on the monitor record.
Nest: the count used to implement reentrant locking.
HashCode: the hash code value copied from the object header (it may also contain the GC age).
Candidate: used to avoid unnecessary blocking or waiting for threads to be woken up. Only one thread can hold the lock at a time, so if the thread releasing the lock woke up every blocked or waiting thread, it would cause unnecessary context switches (from blocked to ready and back to blocked after losing the race for the lock) and serious performance degradation. Candidate has only two possible values: 0 means no thread needs to be woken; 1 means a successor thread should be woken to compete for the lock.
Excerpt from: The implementation principle and application of synchronized in Java
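
The roles of the Owner and Nest fields can be sketched in plain Java. This is only a toy model under stated assumptions: MonitorRecordSketch and its methods are hypothetical names, the other fields (EntryQ, RcThis, HashCode, Candidate) are left out, and the class uses synchronized internally just to keep the model itself thread-safe.

```java
class MonitorRecordSketch {
    private Thread owner; // Owner: null while no thread holds the monitor
    private int nest;     // Nest: re-entry count of the owning thread

    // Tries to take the monitor for the calling thread without blocking.
    synchronized boolean tryEnter() {
        Thread current = Thread.currentThread();
        if (owner == null) {
            owner = current;
            nest = 1;
            return true;
        }
        if (owner == current) {
            nest++; // reentrant acquisition by the same thread
            return true;
        }
        return false; // a real monitor would block the thread on EntryQ here
    }

    synchronized void exit() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException("not the owner");
        }
        if (--nest == 0) {
            owner = null; // Owner returns to null on the final release
        }
    }

    synchronized int nestCount() { return nest; }

    synchronized boolean isFree() { return owner == null; }

    // Demo helper: runs tryEnter() on a fresh thread and reports its result.
    static boolean tryEnterFromOtherThread(MonitorRecordSketch m) {
        boolean[] result = new boolean[1];
        Thread t = new Thread(() -> result[0] = m.tryEnter());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}
```

Entering twice from the same thread raises the nest count to 2, a second thread is refused in the meantime, and only the second exit() frees the record.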
We know that synchronized is a heavyweight lock and not very efficient, and this idea has long been fixed in our minds. However, JDK 1.6 optimized the implementation of synchronized in many ways so that it is no longer so heavy. So which optimizations does the JVM use?

Lock optimization

JDK 1.6 introduced a large number of optimizations to the lock implementation, such as spin locks, adaptive spin locks, lock elimination, lock coarsening, biased locks, and lightweight locks, to reduce the cost of lock operations.
A lock has four main states, in order: unlocked, biased lock, lightweight lock, and heavyweight lock. The lock is upgraded step by step as contention intensifies. Note that a lock can be upgraded but not downgraded; this strategy is intended to make acquiring and releasing locks more efficient.

Spin lock

Blocking and waking a thread requires the CPU to switch from user mode to kernel mode. Frequent blocking and waking is a heavy burden on the CPU and inevitably puts great pressure on the system's concurrent performance. At the same time, in many applications an object's lock stays locked only for a very short period, and it is not worth blocking and waking threads for such a short time. This is why spin locks were introduced.
What is a spin lock?
With a spin lock, the thread waits for a while instead of being suspended immediately, to see whether the thread holding the lock releases it soon. How does it wait? By executing a meaningless loop (spinning).
Spinning is not a substitute for blocking. Leaving aside its requirement on the number of processors (it needs multiple cores, although single-core processors hardly exist any more), spinning avoids the overhead of thread switching but occupies processor time. If the thread holding the lock releases it quickly, spinning works very well; otherwise the spinning thread burns processor resources without doing any meaningful work, hogging the processor while producing nothing, which is wasted performance. So the spin wait time (the number of spins) must have a limit: if the thread has still not acquired the lock after the limit is exceeded, it should be suspended.
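
The idea of spinning with an upper limit can be sketched in user-level Java with an AtomicBoolean. This is not how the JVM implements spinning inside synchronized; BoundedSpinLock and maxSpins are illustrative names, and Thread.onSpinWait() requires Java 9 or later.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class BoundedSpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final int maxSpins;

    BoundedSpinLock(int maxSpins) {
        this.maxSpins = maxSpins;
    }

    // Spins at most maxSpins times. Returns false if the lock is still held,
    // which is the point where a real JVM would suspend the thread instead.
    boolean tryLockWithSpin() {
        for (int i = 0; i < maxSpins; i++) {
            if (locked.compareAndSet(false, true)) {
                return true;
            }
            Thread.onSpinWait(); // hint to the runtime that this is a busy-wait loop
        }
        return false;
    }

    void unlock() {
        locked.set(false);
    }
}
```

A caller that gets false back has used up its spin budget and would fall back to blocking.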
Spin locks were introduced in JDK 1.4.2 and were off by default, but could be turned on with -XX:+UseSpinning; in JDK 1.6 they are on by default. The default number of spins is 10, which can be adjusted with the parameter -XX:PreBlockSpin.
Adjusting the spin count through -XX:PreBlockSpin is inconvenient, however. Suppose I set the parameter to 10, but many threads in the system release the lock just after you give up (one or two more spins would have obtained it): isn't that awkward? So JDK 1.6 introduced adaptive spin locks, making the virtual machine smarter and smarter.

Adaptive spin lock

JDK 1.6 introduced a smarter spin lock, the adaptive spin lock. Adaptive means that the number of spins is no longer fixed; it is determined by the previous spin time on the same lock and by the state of the lock's owner. How does it work? If a thread's spin succeeds, the next spin is allowed more iterations, because the virtual machine reasons that since spinning succeeded last time, it is likely to succeed again, so it allows the spin wait to last longer. Conversely, if spinning rarely succeeds for a particular lock, the number of spins for that lock is reduced in the future, or spinning is skipped altogether, so that processor resources are not wasted.
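
The feedback loop described here can be pictured with a very small model. The doubling and halving rule, the bounds, and the names AdaptiveSpinBudget, record, and currentSpins are all assumptions made for illustration; HotSpot's real heuristic is more involved and is not reproduced here. The class is not thread-safe and is meant to show the idea only.

```java
class AdaptiveSpinBudget {
    private static final int MIN_SPINS = 0;    // 0 means "skip spinning entirely"
    private static final int MAX_SPINS = 1000; // assumed upper bound
    private int spins = 10;                    // starts at the fixed default of 10

    // Call after each attempt on this lock, saying whether spinning obtained it.
    void record(boolean spinSucceeded) {
        if (spinSucceeded) {
            // Success suggests spinning will pay off again: allow more spins.
            spins = Math.min(MAX_SPINS, Math.max(1, spins * 2));
        } else {
            // Failure suggests spinning is wasted on this lock: allow fewer.
            spins = Math.max(MIN_SPINS, spins / 2);
        }
    }

    int currentSpins() {
        return spins;
    }
}
```

After one success the budget is 20; five failures in a row bring it down to 0, at which point spinning is skipped until a later success raises it again.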
With adaptive spin locks, as the program runs and performance monitoring information keeps improving, the virtual machine's predictions about the state of the program's locks become more and more accurate, and the virtual machine becomes smarter and smarter.

Lock elimination

To guarantee data integrity, we synchronize the relevant operations. In some cases, however, the JVM detects that there can be no contention for the shared data, and then it eliminates these synchronization locks. Lock elimination is based on data from escape analysis.
If there is no contention, why lock? Lock elimination therefore saves the time spent on meaningless lock requests. Whether a variable escapes has to be determined by the virtual machine through data flow analysis, but shouldn't it be obvious to us programmers? Would we add synchronization to a code block that we know has no data race? Sometimes, though, the program is not what we think it is. Even when we do not use locks explicitly, we use JDK built-in APIs such as StringBuffer, Vector, and Hashtable, and these carry implicit lock operations. Take StringBuffer's append() method or Vector's add() method, for example:

    public void vectorTest() {
        Vector<String> vector = new Vector<String>();
        // The loop bound is missing in the source; 10 is an assumed value.
        for (int i = 0; i < 10; i++) {
            vector.add(i + ""); // Vector.add() is synchronized internally
        }

        System.out.println(vector);
    }

When this code runs, the JVM can clearly detect that the variable vector does not escape the method vectorTest(), so the JVM can safely eliminate the lock operations inside vector.

Lock coarsening

We know that when using synchronization locks, the scope of the synchronized block should be kept as small as possible, synchronizing only over the actual scope of the shared data. This minimizes the number of operations that need to be synchronized, so that if there is lock contention, a thread waiting for the lock can get it as soon as possible.
In most cases this view is correct, and I have always held it. However, a series of consecutive lock and unlock operations can cause unnecessary performance loss, which is why the concept of lock coarsening was introduced.
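
To picture the effect of coarsening, the two methods below contrast a loop that locks on every call with a hand-written version that holds one lock around the whole loop. The JIT performs this transformation internally and does not rewrite source code, so this is only an illustration; CoarseningSketch and its method names are invented for the example.

```java
import java.util.Vector;

class CoarseningSketch {
    // Each add() acquires and releases the Vector's monitor on its own.
    static Vector<String> fineGrained(int n) {
        Vector<String> vector = new Vector<>();
        for (int i = 0; i < n; i++) {
            vector.add(i + "");
        }
        return vector;
    }

    // Hand-written equivalent of the coarsened form: one lock around the loop.
    // The inner add() calls only re-enter a monitor the thread already owns.
    static Vector<String> coarsened(int n) {
        Vector<String> vector = new Vector<>();
        synchronized (vector) {
            for (int i = 0; i < n; i++) {
                vector.add(i + "");
            }
        }
        return vector;
    }
}
```

Both methods produce the same contents; the difference is only in how many times the lock is taken and released.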
Lock coarsening is easy to understand: multiple consecutive lock and unlock operations are joined together and expanded into a single lock with a larger scope. Take the Vector example from the lock elimination section: the vector is locked on every add, so the JVM detects consecutive lock and unlock operations on the same object (vector) and merges them into one lock and unlock with a wider scope, moving the lock and unlock outside the for loop.

Lightweight locks

The main purpose of lightweight locks is to reduce, when there is no contention among multiple threads, the performance cost that traditional heavyweight locks incur by using operating system mutexes. When the biased lock feature is turned off, or when contention among multiple threads causes a biased lock to be upgraded to a lightweight lock, the JVM tries to acquire a lightweight lock as follows:
Acquiring the lock
1. Check whether the current object is in the unlocked state (hashcode, 0, 01). If so, the JVM first creates a space called the lock record in the current thread's stack frame, which stores a copy of the lock object's current Mark Word (officially the copy gets a Displaced prefix, that is, Displaced Mark Word); otherwise go to step (3);
2. The JVM uses a CAS operation to try to update the object's Mark Word to a pointer to the lock record. If this succeeds, the thread has won the lock: the lock flag bits are changed to 00 (indicating that the object is in the lightweight lock state) and the synchronized code is executed. If it fails, go to step (3);
3. Check whether the current object's Mark Word points to the current thread's stack frame. If so, the current thread already holds the lock on this object and the synchronized block is executed directly. Otherwise the lock object has been taken by another thread, and the lightweight lock must be inflated to a heavyweight lock, with the lock flag bits changed to 10; threads that arrive later enter the blocked state;

Releasing the lock
The release of a lightweight lock is also done through CAS operations. The main steps are:
1. Retrieve the data saved in the Displaced Mark Word when the lightweight lock was acquired;
2. Use a CAS operation to write the retrieved data back into the current object's Mark Word. If this succeeds, the lock is released; otherwise go to step (3);
3. If the CAS replacement fails, another thread is trying to acquire the lock, so the suspended thread must be woken up while the lock is released.

For lightweight locks, the performance gain rests on the assumption that "for the vast majority of locks, there is no contention during the whole life cycle". If this assumption is broken, the cost of the CAS operations is added on top of the cost of mutual exclusion, so under multi-thread contention lightweight locks are slower than heavyweight locks.
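
The acquire and release steps above can be modelled roughly with an AtomicReference standing in for the Mark Word. This is a simplified sketch under several assumptions: LightweightLockSketch, LockRecord, and UNLOCKED_MARK are invented names, re-entry by the owning thread is not modelled, and inflation to a heavyweight lock is reduced to returning null or false.

```java
import java.util.concurrent.atomic.AtomicReference;

class LightweightLockSketch {
    // Stand-in for the unlocked Mark Word (hash code, age, flag bits 01).
    static final Object UNLOCKED_MARK = new Object();

    // Stand-in for the lock record in the owning thread's stack frame.
    static final class LockRecord {
        Object displacedMark; // copy of the Mark Word taken before the CAS
    }

    private final AtomicReference<Object> markWord =
            new AtomicReference<>(UNLOCKED_MARK);

    // Returns the lock record on success, or null where a real JVM would
    // go on to inflate the lock.
    LockRecord tryLock() {
        LockRecord record = new LockRecord();
        record.displacedMark = UNLOCKED_MARK;                // save the displaced Mark Word
        if (markWord.compareAndSet(UNLOCKED_MARK, record)) { // CAS: Mark Word -> lock record
            return record;
        }
        return null; // contention: the lightweight path is abandoned
    }

    // Returns true when the CAS puts the displaced Mark Word back.
    boolean unlock(LockRecord record) {
        return markWord.compareAndSet(record, record.displacedMark);
    }

    boolean isLocked() {
        return markWord.get() != UNLOCKED_MARK;
    }
}
```

With no contention, locking and unlocking each cost one CAS, which is the case the lightweight lock is designed for.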

The following figure shows the acquisition and release process for lightweight locks.

Biased lock

The main purpose of biased locks is to minimize unnecessary lightweight-lock execution paths when there is no contention among multiple threads. As mentioned above, locking and unlocking a lightweight lock relies on several CAS atomic instructions. So how does a biased lock reduce unnecessary CAS operations? We can see this from the structure of the Mark Word: only the biased-lock bit, the lock flag bits, and the ThreadID need to be checked. The process is as follows:
Acquiring the lock
1. Check whether the Mark Word is in the biasable state, that is, whether the biased-lock bit is 1 and the lock flag bits are 01;
2. If it is biasable, test whether the thread ID is the current thread's ID. If so, go to step (5); otherwise go to step (3);
3. If the thread ID is not the current thread's ID, compete for the lock with a CAS operation. If the competition succeeds, the thread ID in the Mark Word is replaced with the current thread's ID; otherwise go to step (4);
4. If competing for the lock through CAS fails, there is contention among multiple threads. When the global safepoint is reached, the thread that holds the biased lock is suspended, the biased lock is upgraded to a lightweight lock, and then the thread blocked at the safepoint continues executing the synchronized block;
5. Execute the synchronized block

Releasing the lock
A biased lock is released only when there is contention: a thread never releases a biased lock on its own initiative; it waits for another thread to compete for it. Revoking a biased lock requires waiting for the global safepoint (a point in time at which no code is being executed). The steps are:
1. Suspend the thread that holds the biased lock and check whether the lock object is still in the locked state;
2. Revoke the bias and restore the object to the unlocked state (01) or the lightweight lock state;
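
The fast path of a biased lock, and the point at which the bias is lost, can be sketched as follows. This is a toy model under stated assumptions: BiasedLockSketch and its members are hypothetical names, the thread ID field is represented by an AtomicReference, and safepoints and the upgrade to a lightweight lock are reduced to setting a revoked flag.

```java
import java.util.concurrent.atomic.AtomicReference;

class BiasedLockSketch {
    // Stand-in for the thread ID in the Mark Word; null means "not yet biased".
    private final AtomicReference<Thread> biasedThread = new AtomicReference<>();
    private volatile boolean revoked; // set once the bias has been revoked

    // Returns true if the caller may enter on the biased path, false where a
    // real JVM would revoke the bias at a safepoint and upgrade the lock.
    boolean enterBiased() {
        if (revoked) {
            return false;
        }
        Thread current = Thread.currentThread();
        if (biasedThread.get() == current) {
            return true; // already biased to this thread: no CAS needed
        }
        if (biasedThread.compareAndSet(null, current)) {
            return true; // CAS installed this thread's ID
        }
        revoked = true;  // another thread owns the bias: contention
        return false;
    }

    // Demo helper: calls enterBiased() on a fresh thread and reports the result.
    static boolean enterFromOtherThread(BiasedLockSketch lock) {
        boolean[] result = new boolean[1];
        Thread t = new Thread(() -> result[0] = lock.enterBiased());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}
```

The owning thread pays one CAS on its first entry and none afterwards; the first competing thread ends the biased mode for everyone, matching the rule that the bias is given up only under contention.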

The following figure shows the process of acquiring and releasing a biased lock.

Heavyweight lock

A heavyweight lock is implemented through the monitor inside the object. The monitor in turn relies on the mutex lock of the underlying operating system, and switching between threads in the operating system requires a switch from user mode to kernel mode, which is very expensive.

References
1. Zhou Zhiming: "Deep Understanding of the Java Virtual Machine"
2. Fang Tengfei: "The Art of Java Concurrent Programming"
3. "The implementation principle and application of synchronized in Java"
