The happens-before relationship established by lock release and acquisition
Locks are the most important synchronization mechanism in Java concurrent programming. Besides making a critical section execute exclusively, a lock also lets the thread that releases the lock send a message to a thread that subsequently acquires the same lock.
Here is sample code for lock release and acquisition:
    class MonitorExample {
        int a = 0;

        public synchronized void writer() {  // 1
            a++;                             // 2
        }                                    // 3

        public synchronized void reader() {  // 4
            int i = a;                       // 5
            // ...
        }                                    // 6
    }
Assume that thread A executes the writer() method and thread B then executes the reader() method. According to the happens-before rules, this process contains the following happens-before relationships:
- According to the program order rule, 1 happens-before 2 and 2 happens-before 3; 4 happens-before 5 and 5 happens-before 6.
- According to the monitor lock rule, 3 happens-before 4.
- According to the transitivity of happens-before, 2 happens-before 5.
The above happens-before relationships can be represented graphically as follows:
In the figure, each arrow links two nodes and represents one happens-before relationship. The black arrows represent the program order rule, the orange arrow represents the monitor lock rule, and the blue arrow represents the happens-before guarantee obtained by combining these rules.
Here, thread A releases the lock and thread B subsequently acquires the same lock, and 2 happens-before 5. Therefore, all shared variables that were visible to thread A before it released the lock immediately become visible to thread B after thread B acquires the same lock.
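To make this scenario concrete, here is a minimal usage sketch (not part of the original example; the class name is illustrative) that runs writer() in thread A and, after A finishes, reader() in thread B on the same MonitorExample instance:

    public class MonitorExampleDemo {
        public static void main(String[] args) throws InterruptedException {
            MonitorExample example = new MonitorExample();
            Thread a = new Thread(example::writer, "A");
            Thread b = new Thread(example::reader, "B");
            a.start();
            a.join();   // thread A has released the lock by the time it terminates
            b.start();  // thread B then acquires the same lock in reader()
            b.join();
        }
    }

Because thread A's release of the monitor happens-before thread B's acquisition of it, reader() is guaranteed to see a == 1.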
Memory semantics for lock release and acquisition
When a thread releases a lock, JMM flushes the shared variables in that thread's local memory to main memory. Taking the MonitorExample program above as an example, the state of the shared data when thread A releases the lock is as follows:
When a thread acquires a lock, JMM invalidates that thread's local memory, so the critical section code protected by the monitor must read shared variables from main memory. The state when the lock is acquired is as follows:
Comparing the memory semantics of lock release and acquisition with the memory semantics of volatile writes and reads, we can see that lock release has the same memory semantics as a volatile write, and lock acquisition has the same memory semantics as a volatile read.
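This correspondence can be sketched by rewriting MonitorExample with a volatile flag instead of a monitor (a hedged illustration; the class below is not taken from any library source):

    class VolatileExample {
        int a = 0;
        volatile boolean flag = false;

        public void writer() {
            a = 1;          // ordinary write
            flag = true;    // volatile write: same memory semantics as releasing a lock
        }

        public void reader() {
            if (flag) {     // volatile read: same memory semantics as acquiring a lock
                int i = a;  // guaranteed to see a == 1
                // ...
            }
        }
    }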
The following is a summary of the memory semantics of lock release and lock acquisition:
- When thread A releases a lock, thread A is in essence sending a message (the modifications it made to shared variables) to the next thread that will acquire the lock.
- When thread B acquires a lock, thread B is in essence receiving the message (the modifications made to shared variables before the lock was released) sent by some earlier thread.
- When thread A releases the lock and thread B subsequently acquires it, the process is in essence thread A sending a message to thread B through main memory.
Implementation of lock memory semantics
This article uses the source code of ReentrantLock to analyze the concrete implementation mechanism of lock memory semantics.
Take a look at the following sample code:
    import java.util.concurrent.locks.ReentrantLock;

    class ReentrantLockExample {
        int a = 0;
        ReentrantLock lock = new ReentrantLock();

        public void writer() {
            lock.lock();        // acquire the lock
            try {
                a++;
            } finally {
                lock.unlock();  // release the lock
            }
        }

        public void reader() {
            lock.lock();        // acquire the lock
            try {
                int i = a;
                // ...
            } finally {
                lock.unlock();  // release the lock
            }
        }
    }
In ReentrantLock, the lock() method acquires the lock and the unlock() method releases it.
The implementation of ReentrantLock relies on the Java synchronizer framework AbstractQueuedSynchronizer (referred to in this article as AQS); the class diagram of ReentrantLock shows this (only the parts relevant to this article are drawn). AQS uses an integer volatile variable named state to maintain synchronization state, and as we will see shortly, this volatile variable is the key to the implementation of ReentrantLock's memory semantics.
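To see how a synchronizer delegates to AQS's volatile state, here is a minimal sketch of a non-reentrant mutex, modeled on the usage example in the AQS documentation (a simplified illustration, not ReentrantLock's actual code; it omits owner checks that a production lock would need):

    import java.util.concurrent.locks.AbstractQueuedSynchronizer;

    class Mutex {
        private static class Sync extends AbstractQueuedSynchronizer {
            @Override
            protected boolean tryAcquire(int acquires) {
                // CAS on the volatile state (0 = unlocked, 1 = locked)
                return compareAndSetState(0, 1);
            }

            @Override
            protected boolean tryRelease(int releases) {
                setState(0);  // volatile write of state
                return true;
            }
        }

        private final Sync sync = new Sync();

        public void lock()   { sync.acquire(1); }
        public void unlock() { sync.release(1); }
    }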
ReentrantLock is divided into fair locks and non-fair locks. We first analyze the fair lock.
When using a fair lock, the call trace of the lock() method is as follows:
- ReentrantLock: lock()
- FairSync: lock()
- AbstractQueuedSynchronizer: acquire(int arg)
- ReentrantLock: tryAcquire(int acquires)
Locking actually begins in the 4th step; here is the source code of that method:
    protected final boolean tryAcquire(int acquires) {
        final Thread current = Thread.currentThread();
        int c = getState();   // at the start of acquisition, first read the volatile variable state
        if (c == 0) {
            if (isFirst(current) && compareAndSetState(0, acquires)) {
                setExclusiveOwnerThread(current);
                return true;
            }
        } else if (current == getExclusiveOwnerThread()) {
            int nextc = c + acquires;
            if (nextc < 0)
                throw new Error("Maximum lock count exceeded");
            setState(nextc);
            return true;
        }
        return false;
    }
As the source code above shows, lock acquisition begins by reading the volatile variable state.
When using a fair lock, the call trace of the unlock() method is as follows:
- ReentrantLock: unlock()
- AbstractQueuedSynchronizer: release(int arg)
- Sync: tryRelease(int releases)
The lock is actually released in the 3rd step; here is the source code of that method:
    protected final boolean tryRelease(int releases) {
        int c = getState() - releases;
        if (Thread.currentThread() != getExclusiveOwnerThread())
            throw new IllegalMonitorStateException();
        boolean free = false;
        if (c == 0) {
            free = true;
            setExclusiveOwnerThread(null);
        }
        setState(c);          // at the end of the release, write the volatile variable state
        return free;
    }
From the source code above, we can see that the method writes the volatile variable state at the very end of releasing the lock.
So a fair lock writes the volatile variable state at the end of release and reads that same volatile variable first on acquisition. According to the volatile happens-before rule, the shared variables that were visible to the releasing thread before it wrote the volatile variable immediately become visible to the acquiring thread after it reads the same volatile variable.
Now we analyze the implementation of the memory semantics of the non-fair lock.
The release of a non-fair lock is exactly the same as that of a fair lock, so only the acquisition of a non-fair lock is analyzed here.
When using a non-fair lock, the call trace of the lock() method is as follows:
- ReentrantLock: lock()
- NonfairSync: lock()
- AbstractQueuedSynchronizer: compareAndSetState(int expect, int update)
Locking actually begins in the 3rd step; here is the source code of that method:
    protected final boolean compareAndSetState(int expect, int update) {
        return unsafe.compareAndSwapInt(this, stateOffset, expect, update);
    }
This method updates the state variable in a single atomic operation; in Java, a compareAndSet() method call is abbreviated CAS. The JDK documentation describes this method as follows: atomically sets the synchronization state to the given update value if the current state value equals the expected value. This operation has the memory semantics of both a volatile read and a volatile write.
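At the API level, the same semantics can be observed through java.util.concurrent.atomic.AtomicInteger, whose compareAndSet() is built on this mechanism. Below is a hedged sketch of the classic CAS retry loop (in practice, AtomicInteger.incrementAndGet() already does exactly this; the class here is only illustrative):

    import java.util.concurrent.atomic.AtomicInteger;

    public class CasCounter {
        private final AtomicInteger value = new AtomicInteger(0);

        public int increment() {
            for (;;) {
                int current = value.get();                 // volatile read
                int next = current + 1;
                if (value.compareAndSet(current, next)) {  // CAS: volatile read + write semantics
                    return next;                           // success: the update is published
                }
                // CAS failed: another thread changed value concurrently; retry
            }
        }
    }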
Here we analyze, from the perspective of the compiler and the processor, how CAS can have the memory semantics of both a volatile read and a volatile write.
As mentioned earlier, the compiler does not reorder a volatile read with any memory operation that follows it, and it does not reorder a volatile write with any memory operation that precedes it. Combining these two conditions, it follows that for CAS to have the memory semantics of both a volatile read and a volatile write, the compiler cannot reorder CAS with any memory operation before or after it.
Let's analyze how CAS can have the memory semantics of both a volatile read and a volatile write on common Intel x86 processors.
The following is the source code of the compareAndSwapInt() method of the sun.misc.Unsafe class:
    public final native boolean compareAndSwapInt(Object o, long offset, int expected, int x);
You can see that this is a native method call. This native method winds, in order, through the following C++ code in OpenJDK: unsafe.cpp, atomic.cpp, and atomic_windows_x86.inline.hpp. Its final implementation lives at the following location in OpenJDK: openjdk-7-fcs-src-b147-27jun2011\openjdk\hotspot\src\os_cpu\windows_x86\vm\atomic_windows_x86.inline.hpp (this corresponds to the Windows operating system on an x86 processor). The following is the fragment of that source code corresponding to Intel x86 processors:
    // Adding a lock prefix to an instruction on an MP machine.
    // VC++ doesn't like the lock prefix to be on a single line,
    // so we can't insert a label after the lock prefix.
    // By emitting a lock prefix, we can define a label after it.
    #define LOCK_IF_MP(mp) __asm cmp mp, 0  \
                           __asm je L0      \
                           __asm _emit 0xF0 \
                           __asm L0:

    inline jint Atomic::cmpxchg(jint exchange_value, volatile jint* dest, jint compare_value) {
      // alternative for InterlockedCompareExchange
      int mp = os::is_MP();
      __asm {
        mov edx, dest
        mov ecx, exchange_value
        mov eax, compare_value
        LOCK_IF_MP(mp)
        cmpxchg dword ptr [edx], ecx
      }
    }
As shown in the source code above, the program decides whether to add a lock prefix to the cmpxchg instruction based on the current processor type. If the program runs on a multiprocessor, the lock prefix is added (lock cmpxchg); if it runs on a uniprocessor, the lock prefix is omitted (a single processor maintains sequential consistency within itself and does not need the memory barrier effect provided by the lock prefix).
The Intel manual describes the lock prefix as follows:
- Ensures that a read-modify-write operation on memory is performed atomically. In Pentium and earlier processors, an instruction with a lock prefix locks the bus during its execution, so that other processors temporarily cannot access memory through the bus; obviously, this is very expensive. Starting with the Pentium 4, Intel Xeon, and P6 processors, Intel made a significant optimization over the original bus locking: if the memory area being accessed is already cached inside the processor while the lock-prefixed instruction executes (that is, the cache line containing the memory area is currently in the exclusive or modified state) and the area is fully contained within a single cache line, the processor executes the instruction directly. Because the cache line is locked during instruction execution, other processors cannot read or write the memory area the instruction accesses, which guarantees the atomicity of the instruction. This procedure is called cache locking; it significantly reduces the execution overhead of lock-prefixed instructions, but the bus is still locked when contention among multiple processors is high or when the memory address accessed by the instruction is misaligned.
- Forbids reordering this instruction with preceding and following read and write instructions.
- Flushes all the data in the write buffer to memory.
The memory barrier effects of items 2 and 3 above are sufficient to implement the memory semantics of both a volatile read and a volatile write.
Through the above analysis, we can now finally understand why the JDK documentation says that CAS has the memory semantics of both a volatile read and a volatile write.
Now let's summarize the memory semantics of fair locks and non-fair locks:
- When either a fair lock or a non-fair lock is released, the final step writes the volatile variable state.
- When a fair lock is acquired, the volatile variable is read first.
- When a non-fair lock is acquired, the volatile variable is first updated with CAS, which has the memory semantics of both a volatile read and a volatile write.
From this analysis of ReentrantLock, we can see that the memory semantics of lock release and acquisition can be implemented in at least the following two ways (a minimal spinlock sketch follows the list):
- Using the memory semantics of writes and reads of a volatile variable.
- Using the volatile-read and volatile-write memory semantics that come with CAS.
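As a hedged sketch of the second approach, here is a trivial (and deliberately naive) spinlock whose memory semantics come entirely from CAS and a volatile write; the class and field names are illustrative:

    import java.util.concurrent.atomic.AtomicBoolean;

    public class SpinLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);

        public void lock() {
            // CAS carries volatile read + write memory semantics
            while (!locked.compareAndSet(false, true)) {
                // busy-wait until the lock is free
            }
        }

        public void unlock() {
            locked.set(false);  // volatile write, like writing state in tryRelease()
        }
    }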
Implementation of the concurrent package
Because CAS in Java has the memory semantics of both a volatile read and a volatile write, communication between Java threads now has the following four ways (a sketch of the fourth follows the list):
- Thread A writes a volatile variable, then thread B reads that volatile variable.
- Thread A writes a volatile variable, then thread B updates that volatile variable with CAS.
- Thread A updates a volatile variable with CAS, then thread B updates that volatile variable with CAS.
- Thread A updates a volatile variable with CAS, then thread B reads that volatile variable.
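Here is a hedged sketch of the fourth way (thread A updates a volatile variable with CAS, then thread B reads it); the class and field names are illustrative:

    import java.util.concurrent.atomic.AtomicBoolean;

    public class CasHandoff {
        private static final AtomicBoolean flag = new AtomicBoolean(false);
        private static int data = 0;  // ordinary (non-volatile) shared variable

        public static void main(String[] args) {
            Thread a = new Thread(() -> {
                data = 42;                        // write the shared data
                flag.compareAndSet(false, true);  // CAS publishes it (volatile write semantics)
            }, "A");
            Thread b = new Thread(() -> {
                while (!flag.get()) { }           // volatile read
                System.out.println(data);         // guaranteed to print 42
            }, "B");
            a.start();
            b.start();
        }
    }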
CAS in Java uses the efficient machine-level atomic instructions provided by modern processors, which perform read-modify-write operations on memory atomically; this is the key to achieving synchronization on multiprocessors. (Essentially, a computing machine that supports atomic read-modify-write instructions is an asynchronous equivalent of a sequential Turing machine, so any modern multiprocessor supports some atomic instruction that performs an atomic read-modify-write operation on memory.) At the same time, reads and writes of volatile variables, together with CAS, can implement communication between threads. Taken together, these features form the cornerstone of the entire concurrent package. If we carefully analyze the source code of the concurrent package, we find the following generalized implementation pattern (a concrete sketch follows the list):
- First, declare the shared variables volatile;
- Then, use CAS's atomic conditional update to implement synchronization between threads;
- At the same time, use volatile reads/writes, together with the volatile read and write memory semantics of CAS, to implement communication between threads.
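As an illustration of this pattern, here is a hedged sketch of a Treiber-style non-blocking stack: the shared variable is the volatile reference inside an AtomicReference, and synchronization between threads is achieved by CAS on it (a simplified teaching example, not code from the concurrent package):

    import java.util.concurrent.atomic.AtomicReference;

    public class ConcurrentStack<E> {
        private static class Node<E> {
            final E item;
            Node<E> next;
            Node(E item) { this.item = item; }
        }

        private final AtomicReference<Node<E>> top = new AtomicReference<>();

        public void push(E item) {
            Node<E> newHead = new Node<>(item);
            Node<E> oldHead;
            do {
                oldHead = top.get();      // volatile read
                newHead.next = oldHead;
            } while (!top.compareAndSet(oldHead, newHead));  // CAS until it succeeds
        }

        public E pop() {
            Node<E> oldHead;
            Node<E> newHead;
            do {
                oldHead = top.get();      // volatile read
                if (oldHead == null) {
                    return null;          // stack is empty
                }
                newHead = oldHead.next;
            } while (!top.compareAndSet(oldHead, newHead));
            return oldHead.item;
        }
    }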
AQS, non-blocking data structures, and the atomic variable classes (the classes in the java.util.concurrent.atomic package) are the base classes of the concurrent package, and all of them are implemented with this pattern; the high-level classes in the concurrent package in turn depend on these base classes. Overall, the concurrent package is structured as follows: