Deep Understanding of the JMM (Java Memory Model) (5): Locks

Source: Internet
Author: User
Tags: CAS, volatile, flushes

Lock Release-Acquire Establishes a Happens-Before Relationship

Locks are the most important synchronization mechanism in Java concurrent programming. Besides making a critical section mutually exclusive, a lock also lets the thread that releases it send a message to a thread that subsequently acquires the same lock.

Here is the sample code for lock release-acquire:

    class MonitorExample {
        int a = 0;

        public synchronized void writer() {   // 1
            a++;                              // 2
        }                                     // 3

        public synchronized void reader() {   // 4
            int i = a;                        // 5
            // ...
        }                                     // 6
    }

Assume that thread A executes the writer() method and thread B then executes the reader() method. According to the happens-before rules, this process involves the following happens-before relationships:

    1. By the program order rule: 1 happens-before 2, 2 happens-before 3; 4 happens-before 5, 5 happens-before 6.
    2. By the monitor lock rule: 3 happens-before 4.
    3. By the transitivity of happens-before: 2 happens-before 5.

The above happens-before relationships are represented graphically as follows:

In the figure, each arrow between two nodes represents a happens-before relationship. Black arrows represent the program order rule, orange arrows represent the monitor lock rule, and blue arrows represent the happens-before guarantee obtained by combining these rules.

The figure shows thread A releasing the lock and thread B subsequently acquiring the same lock, with 2 happens-before 5. Therefore, all shared variables that were visible to thread A before it released the lock immediately become visible to thread B after B acquires the same lock.
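To make this visibility guarantee concrete, here is a minimal runnable sketch of the same idea (the class name and values are illustrative, not from the original example): thread A's unlock happens-before thread B's lock on the same monitor, so B is guaranteed to see A's write.

```java
// Illustrative demo: writer() runs in thread A; after A's synchronized
// method releases the monitor, reader() acquiring the same monitor is
// guaranteed to observe A's write.
class MonitorDemo {
    private int a = 0;

    public synchronized void writer() { a = 42; }

    public synchronized int reader() { return a; }

    public static void main(String[] args) throws InterruptedException {
        MonitorDemo m = new MonitorDemo();
        Thread tA = new Thread(m::writer);
        tA.start();
        tA.join();                      // A has released the monitor before B acquires it
        System.out.println(m.reader()); // prints 42
    }
}
```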

Memory semantics for lock release and acquisition

When a thread releases a lock, the JMM flushes the shared variables in that thread's local memory to main memory. Taking the MonitorExample program above as an example, after thread A releases the lock, the state of the shared data is as follows:

When a thread acquires a lock, the JMM marks that thread's local memory as invalid, forcing the critical-section code protected by the monitor to read shared variables from main memory. The following shows the state after the lock is acquired:

Comparing the memory semantics of lock release-acquisition with those of volatile write-read, we can see that releasing a lock has the same memory semantics as a volatile write, and acquiring a lock has the same memory semantics as a volatile read.

The following is a summary of the memory semantics of lock release and lock acquisition:

    • Thread A releasing a lock is, in essence, thread A sending a message (the modifications it made to shared variables) to whichever thread acquires the lock next.
    • Thread B acquiring a lock is, in essence, thread B receiving a message (the modifications made to shared variables before the lock was released) from an earlier thread.
    • Thread A releasing a lock followed by thread B acquiring it is, in essence, thread A sending a message to thread B through main memory.

Implementation of Lock Memory Semantics

This article will use the source code of ReentrantLock to analyze the concrete implementation mechanism of lock memory semantics.

Take a look at the following sample code:

    class ReentrantLockExample {
        int a = 0;
        ReentrantLock lock = new ReentrantLock();

        public void writer() {
            lock.lock();          // acquire the lock
            try {
                a++;
            } finally {
                lock.unlock();    // release the lock
            }
        }

        public void reader() {
            lock.lock();          // acquire the lock
            try {
                int i = a;
                // ...
            } finally {
                lock.unlock();    // release the lock
            }
        }
    }

In ReentrantLock, the lock() method acquires the lock and the unlock() method releases it.

The implementation of ReentrantLock relies on the Java synchronizer framework AbstractQueuedSynchronizer (referred to in this article as AQS). AQS uses an integer volatile variable named state to maintain synchronization state; as we will soon see, this volatile variable is the key to ReentrantLock's memory-semantics implementation. The following is the class diagram for ReentrantLock (only the parts relevant to this article are drawn):

ReentrantLock is divided into a fair lock and a non-fair lock. We first analyze the fair lock.

When using the fair lock, the call trace of the lock() method is as follows:

    1. ReentrantLock: lock()
    2. FairSync: lock()
    3. AbstractQueuedSynchronizer: acquire(int arg)
    4. FairSync: tryAcquire(int acquires)

Locking really begins in step 4. Here is the source code of that method:

    protected final boolean tryAcquire(int acquires) {
        final Thread current = Thread.currentThread();
        int c = getState();   // to begin acquiring the lock, first read the volatile variable state
        if (c == 0) {
            if (isFirst(current) &&
                compareAndSetState(0, acquires)) {
                setExclusiveOwnerThread(current);
                return true;
            }
        }
        else if (current == getExclusiveOwnerThread()) {
            int nextc = c + acquires;
            if (nextc < 0)
                throw new Error("Maximum lock count exceeded");
            setState(nextc);
            return true;
        }
        return false;
    }

From the source code above, we can see that the lock method first reads the volatile variable state.

When using the fair lock, the call trace of the unlock() method is as follows:

    1. ReentrantLock: unlock()
    2. AbstractQueuedSynchronizer: release(int arg)
    3. Sync: tryRelease(int releases)

The release really begins in step 3. Here is the source code of that method:

    protected final boolean tryRelease(int releases) {
        int c = getState() - releases;
        if (Thread.currentThread() != getExclusiveOwnerThread())
            throw new IllegalMonitorStateException();
        boolean free = false;
        if (c == 0) {
            free = true;
            setExclusiveOwnerThread(null);
        }
        setState(c);   // at the very end of releasing the lock, write the volatile variable state
        return free;
    }

From the source code above, we can see that the release writes the volatile variable state last.

The fair lock writes the volatile variable state at the end of releasing the lock and reads the same volatile variable first when acquiring it. By the volatile happens-before rule, the shared variables that were visible to the releasing thread before it wrote the volatile variable become visible to the acquiring thread immediately after it reads that same volatile variable.
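The rule just described can be sketched with a plain volatile field standing in for AQS's state (a simplified illustration with invented names, not the AQS code itself):

```java
// Simplified sketch: the volatile write in release() plays the role of
// unlock() writing `state` last; the volatile read in acquire() plays the
// role of lock() reading `state` first.
class VolatileHandoff {
    int data = 0;                      // ordinary shared variable
    volatile boolean released = false; // stand-in for AQS's volatile `state`

    void release() {                   // analogous to unlock()
        data = 99;                     // ordinary write before the volatile write
        released = true;               // volatile write publishes `data`
    }

    int acquire() {                    // analogous to lock()
        while (!released) { }          // volatile read; once it sees true,
        return data;                   // `data` is guaranteed visible (99)
    }

    public static void main(String[] args) {
        VolatileHandoff h = new VolatileHandoff();
        new Thread(h::release).start();
        System.out.println(h.acquire()); // prints 99
    }
}
```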

Now we analyze the implementation of the memory semantics of the non-fair lock.

Release of the non-fair lock is exactly the same as that of the fair lock, so only acquisition of the non-fair lock is analyzed here.

When using the non-fair lock, the call trace of the lock() method is as follows:

    1. ReentrantLock: lock()
    2. NonfairSync: lock()
    3. AbstractQueuedSynchronizer: compareAndSetState(int expect, int update)

Locking really begins in step 3. Here is the source code of that method:

    protected final boolean compareAndSetState(int expect, int update) {
        return unsafe.compareAndSwapInt(this, stateOffset, expect, update);
    }

This method updates the state variable as an atomic operation; in this article a Java compareAndSet() method call is abbreviated as CAS. The JDK documentation describes the method as follows: if the current state value equals the expected value, atomically set the synchronization state to the given update value. This operation has the memory semantics of both a volatile read and a volatile write.
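The contract described by the JDK documentation can also be observed through the public AtomicInteger API, which exposes the same compareAndSet operation (a small illustration of the semantics, not the AQS internals):

```java
import java.util.concurrent.atomic.AtomicInteger;

// compareAndSet succeeds only when the current value equals the expected
// value, and performs the update atomically.
class CasDemo {
    public static void main(String[] args) {
        AtomicInteger state = new AtomicInteger(0);
        boolean first  = state.compareAndSet(0, 1); // expected 0: succeeds
        boolean second = state.compareAndSet(0, 2); // state is now 1: fails
        System.out.println(first + " " + second + " " + state.get());
        // prints: true false 1
    }
}
```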

Here we analyze, from the perspective of the compiler and the processor, how CAS can have the memory semantics of both a volatile read and a volatile write.

As mentioned earlier, the compiler does not reorder a volatile read with any memory operation that follows it, and does not reorder a volatile write with any memory operation that precedes it. Combining these two conditions means that, to implement the memory semantics of both a volatile read and a volatile write, the compiler cannot reorder a CAS with any memory operation before or after it.

Next, let's analyze how CAS acquires the memory semantics of both a volatile read and a volatile write on common Intel x86 processors.

The following is the declaration of the compareAndSwapInt() method of the sun.misc.Unsafe class:

    public final native boolean compareAndSwapInt(Object o, long offset,
                                                  int expected,
                                                  int x);

You can see that this is a native method call. This native method in turn calls C++ code in OpenJDK: unsafe.cpp, atomic.cpp, and atomic_windows_x86.inline.hpp. Its final implementation is located in OpenJDK at openjdk-7-fcs-src-b147-27June2011/openjdk/hotspot/src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp (for the Windows operating system on an x86 processor). The following is a fragment of the source code corresponding to the Intel x86 processor:

    // Adding a lock prefix to an instruction on an MP machine.
    // VC++ doesn't like the lock prefix to stand alone,
    // so we can't insert a label after the lock prefix.
    // By emitting the lock prefix, we can define a label after it.
    #define LOCK_IF_MP(mp) __asm cmp mp, 0  \
                           __asm je L0      \
                           __asm _emit 0xF0 \
                           __asm L0:

    inline jint Atomic::cmpxchg(jint exchange_value, volatile jint* dest, jint compare_value) {
      // alternative for InterlockedCompareExchange
      int mp = os::is_MP();
      __asm {
        mov edx, dest
        mov ecx, exchange_value
        mov eax, compare_value
        LOCK_IF_MP(mp)
        cmpxchg dword ptr [edx], ecx
      }
    }

As the source code above shows, the program decides whether to add a lock prefix to the cmpxchg instruction according to the current processor type. If the program runs on a multiprocessor, it adds the lock prefix (lock cmpxchg); conversely, on a uniprocessor it omits the lock prefix (a single processor maintains sequential consistency within itself and does not need the memory-barrier effect the lock prefix provides).

The Intel manual describes the lock prefix as follows:

    1. It ensures that read-modify-write operations on memory execute atomically. In processors up to and including the Pentium, an instruction with a lock prefix locks the bus during execution, temporarily preventing other processors from accessing memory through the bus; obviously this is very expensive. Starting with the Pentium 4, Intel Xeon and P6 processors, Intel made an important optimization over the original bus lock: if the memory region being accessed is already cached inside the processor while the lock-prefixed instruction executes (that is, the cache line containing the region is in the exclusive or modified state) and the region fits entirely within a single cache line, the processor executes the instruction directly. Because the cache line remains locked during execution, other processors cannot read or write the memory region the instruction accesses, which guarantees the instruction's atomicity. This mechanism is called cache locking; it greatly reduces the execution overhead of lock-prefixed instructions, although the bus is still locked when contention among processors is high or when the memory address accessed by the instruction is misaligned.
    2. It forbids reordering this instruction with preceding and following read and write instructions.
    3. It flushes all data in the write buffer to memory.

Items 2 and 3 above provide a memory-barrier effect sufficient to implement the memory semantics of both a volatile read and a volatile write.

Through the above analysis, we can now finally understand why the JDK documentation says that CAS has the memory semantics of both a volatile read and a volatile write.

Now let's summarize the memory semantics of the fair lock and the non-fair lock:

    • When either a fair lock or a non-fair lock is released, the last step is a write to the volatile variable state.
    • When a fair lock is acquired, the volatile variable is read first.
    • When a non-fair lock is acquired, the volatile variable is first updated with CAS, which has the memory semantics of both a volatile read and a volatile write.

From this article's analysis of ReentrantLock, we can see that the memory semantics of lock release-acquisition can be implemented in at least two ways:

    1. Using the memory semantics of writes and reads of a volatile variable.
    2. Using the volatile read and volatile write memory semantics that come with CAS.

Implementation of the concurrent Package

Because CAS in Java has the memory semantics of both a volatile read and a volatile write, communication between Java threads now has the following four ways:

    1. Thread A writes a volatile variable, and thread B then reads that volatile variable.
    2. Thread A writes a volatile variable, and thread B then updates that volatile variable with CAS.
    3. Thread A updates a volatile variable with CAS, and thread B then updates that volatile variable with CAS.
    4. Thread A updates a volatile variable with CAS, and thread B then reads that volatile variable.
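Pattern 4 above can be sketched as follows (the class and field names are illustrative): thread A publishes an ordinary write by CAS-updating a volatile-backed flag, and thread B observes it with a volatile read.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Thread A: plain write to `payload`, then CAS on `flag` (volatile-write
// semantics). Thread B: volatile read of `flag`; once it sees 1, the
// earlier write to `payload` is guaranteed visible.
class CasHandoff {
    static final AtomicInteger flag = new AtomicInteger(0);
    static int payload = 0; // ordinary field published via the CAS on `flag`

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> {
            payload = 7;               // ordinary write
            flag.compareAndSet(0, 1);  // CAS: volatile read + volatile write
        });
        Thread b = new Thread(() -> {
            while (flag.get() == 0) { } // volatile read; spin until published
            System.out.println(payload); // prints 7
        });
        a.start(); b.start();
        a.join(); b.join();
    }
}
```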

CAS in Java uses the efficient machine-level atomic instructions provided by modern processors, which atomically perform read-modify-write operations on memory; this is the key to achieving synchronization on a multiprocessor (essentially, a computing machine that supports atomic read-modify-write instructions is an asynchronous equivalent of a sequential Turing machine, so any modern multiprocessor supports some atomic instruction that performs an atomic read-modify-write on memory). At the same time, reads and writes of volatile variables, together with CAS, can implement communication between threads. Combined, these features form the cornerstone of the entire concurrent package. If we carefully analyze the source code of the concurrent package, we find a generalized implementation pattern:

    1. First, declare the shared state variable volatile;
    2. Then, use CAS's atomic conditional update to implement synchronization between threads;
    3. At the same time, use volatile reads/writes, together with the volatile read and write memory semantics of CAS, to implement communication between threads.
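As a sketch of this pattern, here is a deliberately minimal spinlock (a hypothetical example for illustration, not a class from the concurrent package): the AtomicBoolean holds the volatile state, CAS performs the conditional update, and the volatile read/write provide the communication.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal spinlock following the pattern: volatile state + CAS update.
class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // CAS: atomically flip false -> true; it has volatile read and
        // volatile write semantics, so acquiring sees the releaser's writes.
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // spin until the holder releases
        }
    }

    public void unlock() {
        // Plain volatile write: like AQS writing `state` last on release.
        locked.set(false);
    }
}
```

Note that, as in ReentrantLock's tryRelease(), a plain volatile write suffices on the release side; the CAS is needed only on the acquire side, where the update must be conditional.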

AQS, non-blocking data structures, and the atomic variable classes (the classes in the java.util.concurrent.atomic package) are the base classes of the concurrent package; they are all implemented using this pattern, and the high-level classes in the concurrent package in turn depend on these base classes. Overall, the concurrent package is implemented as follows:




Reference documents
      1. Concurrent Programming in Java: Design Principles and Patterns
      2. JSR 133 (Java Memory Model) FAQ
      3. JSR-133: Java Memory Model and Thread Specification
      4. Java Concurrency in Practice
      5. Java Platform, Standard Edition 6 API Specification
      6. The JSR-133 Cookbook for Compiler Writers
      7. Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1
      8. The Art of Multiprocessor Programming

