Characteristics of volatile
When a shared variable is declared volatile, reads and writes of that variable become special. A good way to understand the characteristics of volatile is to treat each single read or write of a volatile variable as if those individual reads and writes were synchronized using the same monitor lock. The sample code below makes this concrete:
class VolatileFeaturesExample {
    volatile long vl = 0L;          // declare a 64-bit long variable as volatile

    public void set(long l) {
        vl = l;                     // write of a single volatile variable
    }

    public void getAndIncrement() {
        vl++;                       // compound (multiple) read/write of a volatile variable
    }

    public long get() {
        return vl;                  // read of a single volatile variable
    }
}
If multiple threads call the three methods of the program above, that program is semantically equivalent to the following one:
class VolatileFeaturesExample {
    long vl = 0L;                           // 64-bit long normal variable

    public synchronized void set(long l) {  // write of a single normal variable, synchronized on the same monitor
        vl = l;
    }

    public void getAndIncrement() {         // normal method invocation
        long temp = get();                  // call the synchronized read method
        temp += 1L;                         // normal write operation
        set(temp);                          // call the synchronized write method
    }

    public synchronized long get() {        // read of a single normal variable, synchronized on the same monitor
        return vl;
    }
}
As the example programs show, a single read or write of a volatile variable has the same effect as a read or write of a normal variable that is synchronized using the same monitor lock.
The happens-before rule of the monitor lock guarantees memory visibility between the thread that releases the monitor and the thread that acquires it, which means that a read of a volatile variable always sees the last write (by any thread) to that variable.
The semantics of the monitor lock make the execution of critical-section code atomic. This means that even for a 64-bit long or double variable, a read or write is atomic as long as the variable is volatile. Multiple volatile operations, or a compound operation such as volatile++, are not atomic as a whole.
In short, the volatile variable itself has the following characteristics:
- Visibility: a read of a volatile variable always sees the last write (by any thread) to that volatile variable.
- Atomicity: the read or write of any single volatile variable is atomic, but a compound operation such as volatile++ is not atomic.
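The atomicity limit can be observed with a small experiment. The following is a minimal sketch, not part of the original article: the class name `VolatileCounterDemo`, its fields, and the thread and iteration counts are all assumptions chosen for illustration. It increments a volatile long and a `java.util.concurrent.atomic.AtomicLong` from several threads. The atomic counter always reaches the expected total, while the volatile counter may end up lower because `++` is a separate read, add, and write.

```java
import java.util.concurrent.atomic.AtomicLong;

public class VolatileCounterDemo {
    // Hypothetical names for this sketch; none of them come from the original article.
    static volatile long volatileCount = 0L;
    static final AtomicLong atomicCount = new AtomicLong();

    static final int THREADS = 4;
    static final int INCREMENTS = 100_000;

    /** Runs both counters under contention and returns {volatile total, atomic total}. */
    static long[] run() throws InterruptedException {
        volatileCount = 0L;
        atomicCount.set(0L);
        Thread[] workers = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < INCREMENTS; n++) {
                    volatileCount++;               // read, add, write: three steps, not atomic
                    atomicCount.incrementAndGet(); // a single atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();                              // join makes the workers' writes visible here
        }
        return new long[] { volatileCount, atomicCount.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] totals = run();
        System.out.println("expected        = " + (long) THREADS * INCREMENTS);
        System.out.println("volatile long++ = " + totals[0] + " (may be lower: updates can be lost)");
        System.out.println("AtomicLong      = " + totals[1]);
    }
}
```

How many updates the volatile counter loses depends on the machine and the run, so the sketch prints the value rather than assuming one.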
The happens-before relationship established by volatile write-read
The above describes the characteristics of the volatile variable itself. For programmers, the effect of volatile on memory visibility between threads is more important than those characteristics and deserves more attention.
Starting with JSR-133, the write-read of volatile variables allows for communication between threads.
In terms of memory semantics, volatile has the same effect as a monitor lock: a volatile write has the same memory semantics as the release of a monitor, and a volatile read has the same memory semantics as the acquisition of a monitor.
Take a look at the sample code that uses the volatile variable below:
class VolatileExample {
    int a = 0;
    volatile boolean flag = false;

    public void writer() {
        a = 1;              // 1
        flag = true;        // 2
    }

    public void reader() {
        if (flag) {         // 3
            int i = a;      // 4
            ...
        }
    }
}
Assume that thread A executes the writer() method and thread B then executes the reader() method. According to the happens-before rules, the happens-before relationships established by this process fall into three categories:
- According to the program order rule, 1 happens before 2, and 3 happens before 4.
- According to the volatile rule, 2 happens before 3.
- According to the transitivity rule of happens-before, 1 happens before 4.
The graphical representation of the above happens before relationship is as follows:
In the figure, each arrow links two nodes and represents a happens-before relationship. The black arrows represent the program order rule, the orange arrow represents the volatile rule, and the blue arrow represents the happens-before guarantee obtained by combining these rules.
Here thread A writes a volatile variable and thread B then reads the same volatile variable. All shared variables that were visible to thread A before it wrote the volatile variable become visible to thread B immediately after thread B reads that volatile variable.
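The chain 1 → 2 → 3 → 4 can be exercised with two real threads. The following is a minimal sketch under stated assumptions: the class name `VolatileFlagDemo` and the `run()` helper are hypothetical, and the reader spins on the flag in a loop (instead of the single `if` in the article's example) so that step 3 is guaranteed to observe the write at step 2.

```java
public class VolatileFlagDemo {
    static int a = 0;                       // normal shared variable
    static volatile boolean flag = false;   // volatile flag

    /** Starts a reader and a writer thread; returns the value of a seen by the reader. */
    static int run() throws InterruptedException {
        a = 0;
        flag = false;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!flag) {                 // 3: volatile read, repeated until the write is seen
                Thread.onSpinWait();
            }
            seen[0] = a;                    // 4: guaranteed to see the write from step 1
        });
        Thread writer = new Thread(() -> {
            a = 1;                          // 1: normal write
            flag = true;                    // 2: volatile write
        });

        reader.start();
        writer.start();
        reader.join();                      // join makes seen[0] visible to the caller
        writer.join();
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader saw a = " + run());
    }
}
```

If `flag` were a normal field, nothing would order step 1 before step 4, and the spin loop itself might never terminate.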
Volatile write-Read memory semantics
The memory semantics for volatile writes are as follows:
- When a volatile variable is written, JMM flushes the shared variables in the local memory of the writing thread to main memory.
Take the VolatileExample program above as an example. Assume that thread A first executes the writer() method and thread B then executes the reader() method, and that flag and a are in their initial state in the local memory of both threads. The following figure shows the state of the shared variables after thread A performs the volatile write:
As shown, after thread A writes the flag variable, the values of the two shared variables that are updated by thread A in local memory A are flushed to main memory. At this point, the values of the shared variables in local memory A and main memory are consistent.
The memory semantics for volatile reads are as follows:
- When a volatile variable is read, JMM invalidates the local memory of the reading thread. The thread then reads the shared variables from main memory.
The following is the state of a shared variable after thread B reads the same volatile variable:
As shown, local memory B has been invalidated after reading the flag variable. At this point, thread B must read the shared variable from main memory. The read operation of thread B will cause the values of local memory B and shared variables in main memory to become consistent.
If we combine the two steps of volatile write and volatile read, then after reader thread B reads a volatile variable, the values of all shared variables that were visible to writer thread A before it wrote the volatile variable immediately become visible to thread B.
The following is a summary of the memory semantics of volatile writes and volatile reads:
- When thread A writes a volatile variable, it is in essence sending a message (about its modifications to shared variables) to whichever thread will next read that volatile variable.
- When thread B reads a volatile variable, it is in essence receiving the message (about modifications made to shared variables before the volatile write) sent by some earlier thread.
- When thread A writes a volatile variable and thread B then reads it, thread A is in essence sending a message to thread B through main memory.
Implementation of volatile memory semantics
Let's take a look at how JMM implements volatile write/read memory semantics.
In the previous article we mentioned that reordering is divided into compiler reordering and processor reordering. To implement volatile memory semantics, JMM restricts each of these two types of reordering. The following table lists the volatile reordering rules that JMM defines for the compiler:
Can reorder?      | Second operation  |               |
First operation   | Normal read/write | Volatile read | Volatile write
Normal read/write |                   |               | NO
Volatile read     | NO                | NO            | NO
Volatile write    |                   | NO            | NO
For example, the last cell in the third row (the normal read/write row) means: in program order, when the first operation is a read or write of a normal variable and the second operation is a volatile write, the compiler cannot reorder the two operations.
From the table above we can see:
- When the second operation is a volatile write, it cannot be reordered no matter what the first operation is. This rule ensures that operations before a volatile write are not reordered by the compiler to after the volatile write.
- When the first operation is a volatile read, it cannot be reordered no matter what the second operation is. This rule ensures that operations after a volatile read are not reordered by the compiler to before the volatile read.
- When the first operation is a volatile write and the second operation is a volatile read, they cannot be reordered.
To implement volatile memory semantics, the compiler inserts memory barriers into the instruction sequence when generating bytecode, in order to prohibit particular types of processor reordering. It is almost impossible for the compiler to find an optimal placement that minimizes the total number of barriers, so JMM takes a conservative approach. The JMM memory barrier insertion strategy under this conservative policy is:
- Insert a StoreStore barrier before each volatile write.
- Insert a StoreLoad barrier after each volatile write.
- Insert a LoadLoad barrier after each volatile read.
- Insert a LoadStore barrier after each volatile read.
The memory barrier insertion strategy described above is very conservative, but it guarantees correct volatile memory semantics on any processor platform and in any program.
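The placement of the four barriers can be made visible with the explicit fence methods on `java.lang.invoke.VarHandle` (Java 9 and later). This is only an illustrative sketch, and it makes several assumptions: the class `BarrierPlacementSketch` and its fields are hypothetical, the `signal` field is deliberately declared as a normal field so the fences can be written by hand, and the fence mapping is approximate. Java has no StoreLoad-only fence, so `fullFence()` (which is stronger) stands in for it, and `acquireFence()` stands in for LoadStore (it also implies LoadLoad). In real code, declaring the field volatile is the right tool, and a JVM is free to implement volatile differently.

```java
import java.lang.invoke.VarHandle;

public class BarrierPlacementSketch {
    static int data;     // normal variable
    static int signal;   // stands in for the volatile variable; declared normal on purpose

    static void write(int value) {
        data = value;                 // normal write
        VarHandle.storeStoreFence();  // StoreStore barrier before the "volatile" write
        signal = 1;                   // plays the role of the volatile write
        VarHandle.fullFence();        // StoreLoad barrier after it (full fence used as a stand-in)
    }

    static int read() {
        int s = signal;               // plays the role of the volatile read
        VarHandle.loadLoadFence();    // LoadLoad barrier after the "volatile" read
        VarHandle.acquireFence();     // LoadStore barrier after it (acquire fence used as a stand-in)
        return s == 1 ? data : -1;    // normal read; -1 means the signal was not set yet
    }

    public static void main(String[] args) {
        write(42);
        System.out.println(read());
    }
}
```

The sketch runs on a single thread, so it shows only where the barriers sit, not their cross-thread effect.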
The following figure shows the instruction sequence generated for a volatile write after the memory barriers are inserted under the conservative strategy:
The StoreStore barrier in the figure ensures that all normal writes preceding the volatile write are visible to any processor before the volatile write, because the StoreStore barrier guarantees that all preceding normal writes are flushed to main memory before the volatile write.
The interesting part here is the StoreLoad barrier after the volatile write. Its purpose is to prevent the volatile write from being reordered with a volatile read or write that may follow it. The compiler often cannot determine accurately whether a StoreLoad barrier is needed after a volatile write (for example, when a method returns immediately after the volatile write). To guarantee correct volatile memory semantics, JMM takes a conservative strategy here: insert a StoreLoad barrier either after each volatile write or before each volatile read. For overall execution efficiency, JMM chose to insert it after each volatile write, because the common usage pattern of volatile write-read memory semantics is that one writer thread writes a volatile variable and multiple reader threads read it. When reader threads greatly outnumber writer threads, placing the StoreLoad barrier after the volatile write yields a considerable efficiency gain. This shows a characteristic of how JMM is implemented: ensure correctness first, then pursue efficiency.
The following figure shows the instruction sequence generated for a volatile read after the memory barriers are inserted under the conservative strategy:
The LoadLoad barrier prevents the processor from reordering the volatile read above it with the normal reads below it. The LoadStore barrier prevents the processor from reordering the volatile read above it with the normal writes below it.
The memory barrier insertion strategy described above for volatile writes and volatile reads is very conservative. In actual execution, the compiler can omit unnecessary barriers as long as the volatile write-read memory semantics are not changed. The following sample code illustrates this:
class VolatileBarrierExample {
    int a;
    volatile int v1 = 1;
    volatile int v2 = 2;

    void readAndWrite() {
        int i = v1;     // first volatile read
        int j = v2;     // second volatile read
        a = i + j;      // normal write
        v1 = i + 1;     // first volatile write
        v2 = j * 2;     // second volatile write
    }

    ...                 // other methods
}
For the readAndWrite() method, the compiler can apply the following optimizations when generating bytecode:
Note that the final StoreLoad barrier cannot be omitted. The method returns immediately after the second volatile write, so the compiler may be unable to determine whether a volatile read or write will follow. To be safe, the compiler usually inserts a StoreLoad barrier here.
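The barrier placement for readAndWrite() can also be written out as comments on the code itself. The following is a sketch derived from the reordering table and the conservative insertion strategy described earlier, and it is one plausible result rather than a record of what any particular compiler emits. The class name `VolatileBarrierOptimizedSketch` is hypothetical, and the barriers appear only as comments because the JVM inserts them, not the source code.

```java
public class VolatileBarrierOptimizedSketch {
    int a;
    volatile int v1 = 1;
    volatile int v2 = 2;

    void readAndWrite() {
        int i = v1;     // first volatile read
                        //   LoadLoad barrier kept: a volatile read follows
                        //   LoadStore barrier omitted: the later normal write cannot pass the next volatile read
        int j = v2;     // second volatile read
                        //   LoadLoad barrier omitted: no normal read follows
                        //   LoadStore barrier kept: a normal write follows
        a = i + j;      // normal write
                        //   StoreStore barrier kept: a volatile write follows
        v1 = i + 1;     // first volatile write
                        //   StoreLoad barrier omitted: the next operation is a volatile write
                        //   StoreStore barrier kept: orders the two volatile writes
        v2 = j * 2;     // second volatile write
                        //   StoreLoad barrier kept: the method returns, later accesses are unknown
    }

    public static void main(String[] args) {
        VolatileBarrierOptimizedSketch s = new VolatileBarrierOptimizedSketch();
        s.readAndWrite();
        System.out.println(s.a + " " + s.v1 + " " + s.v2);
    }
}
```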
The optimization above applies to any processor platform. Because different processors have memory models of different strictness, barrier insertion can be optimized further according to the specific processor memory model. On x86, for example, all barriers except the final StoreLoad barrier are omitted.
On the x86 platform, the volatile read and write under the conservative strategy shown earlier can be optimized to:
As mentioned earlier, the x86 processor reorders only write-read operations. x86 does not reorder read-read, read-write, or write-write operations, so the memory barriers corresponding to these three types of operations are omitted on x86. On x86, JMM only needs to insert a StoreLoad barrier after the volatile write to implement the volatile write-read memory semantics correctly. This means that on x86 processors a volatile write is much more expensive than a volatile read, because executing the StoreLoad barrier is costly.
Why JSR-133 enhanced the memory semantics of volatile
In the old Java memory model prior to JSR-133, reordering between volatile variables was not allowed, but reordering between volatile variables and normal variables was. In the old memory model, the VolatileExample sample program could be reordered to execute with the following timing:
In the old memory model, when there is no data dependency between 1 and 2, they may be reordered (likewise 3 and 4). As a result, when reader thread B executes 4, it does not necessarily see the modification that writer thread A made to the shared variable at 1.
So in the old memory model, volatile write-read did not have the memory semantics of monitor release-acquire. To provide an inter-thread communication mechanism more lightweight than monitor locks, the JSR-133 expert group decided to enhance the memory semantics of volatile: strictly restrict the reordering of volatile variables with normal variables by the compiler and the processor, so that volatile write-read has the same memory semantics as monitor release-acquire. In terms of the compiler reordering rules and the processor memory barrier insertion strategy, any reordering between a volatile variable and a normal variable that could break the volatile memory semantics is prohibited by those rules and that strategy.
volatile only guarantees that the read or write of a single volatile variable is atomic, whereas the mutual exclusion of a monitor lock ensures that the execution of the entire critical section is atomic. In functionality, monitor locks are more powerful than volatile; in scalability and execution performance, volatile has the advantage. Readers who want to replace a monitor lock with volatile in a program should do so with caution.