The volatile keyword in Java

Source: Internet
Author: User

I. Related concepts of computer memory model

When a computer runs a program, every instruction is executed by the CPU, and executing an instruction usually involves reading and writing data. The program's working data lives in main memory (physical memory), and reading from or writing to main memory is much slower than executing instructions. If every data access had to go to main memory, instruction execution would slow down dramatically, which is why CPUs have caches.

That is, while the program runs, the data an operation needs is copied from main memory into the CPU cache. The CPU then reads and writes that data directly in its cache, and when the operation finishes, the cached data is flushed back to main memory. For example, consider the following statement:

i = i + 1;
When a thread executes this statement, it reads the value of i from main memory and copies it into the cache. The CPU then executes the instruction that adds 1 to i, writes the result to the cache, and finally flushes the latest value of i from the cache to main memory.

This code causes no problems in a single thread, but it can cause problems when several threads run it. On a multi-core CPU, each thread may run on a different core, so each thread works with its own cache. (On a single-core CPU the same problem can occur, because the threads are interleaved by thread scheduling.)

Suppose two threads execute this code at the same time and i is initially 0. We expect i to be 2 after both threads have run, but is that what happens? Consider the following scenario:

It is possible that both threads first read the value of i into the caches of their respective CPUs. Thread 1 then adds 1 and writes the latest value of i to memory. At this point the value of i in thread 2's cache is still 0, so after adding 1, thread 2 also writes 1 to memory.

The end result is 1, not 2. This is the well-known cache consistency problem. A variable accessed by multiple threads is commonly called a shared variable. That is, if a variable is cached in multiple CPUs (which typically happens in multithreaded programs), the cached copies may become inconsistent.
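The interleaving described above can be replayed deterministically in a single thread. The following is only a sketch of that scenario, not a real multithreaded race: the names mainMemory, cache1 and cache2 are illustrative stand-ins for main memory and the two CPU caches.

```java
// Single-threaded model of the lost-update interleaving.
// mainMemory, cache1 and cache2 are illustrative names, not real hardware state.
class LostUpdateSimulation {
    static int simulate() {
        int mainMemory = 0;        // shared variable i, initially 0

        int cache1 = mainMemory;   // thread 1 loads i into its cache (0)
        int cache2 = mainMemory;   // thread 2 loads i into its cache (0)

        cache1 = cache1 + 1;       // thread 1 computes i + 1 in its cache
        mainMemory = cache1;       // thread 1 writes 1 back to main memory

        cache2 = cache2 + 1;       // thread 2 still works on its stale copy (0)
        mainMemory = cache2;       // thread 2 also writes 1, overwriting thread 1's result

        return mainMemory;
    }

    public static void main(String[] args) {
        System.out.println("final i = " + simulate()); // prints "final i = 1"
    }
}
```

Both increments ran, yet one update was lost because each thread started from the same stale value.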

II. Three concepts in concurrent programming

1. Atomicity

Atomicity means that one or more operations either all execute, without being interrupted by anything, or do not execute at all.

2. Visibility

Visibility means that when multiple threads access the same variable and one thread modifies its value, the other threads can immediately see the modified value. Take a look at the following code:

// Thread 1 executes
int i = 0;
i = 10;

// Thread 2 executes
j = i;

Suppose thread 1 runs on CPU1 and thread 2 runs on CPU2. From the analysis above, when thread 1 executes i = 10, it first loads the initial value of i into CPU1's cache and then assigns 10 there. The value of i in CPU1's cache becomes 10, but it is not written to main memory immediately.
If thread 2 now executes j = i, it reads the value of i from main memory and loads it into CPU2's cache. The value of i in main memory is still 0 at this point, so j becomes 0 rather than 10. This is the visibility problem: after thread 1 modifies the variable i, thread 2 does not immediately see the modified value.

3. Ordering

Ordering means that the program executes in the order in which the code is written. For a simple example, look at the following code:

int i = 0;
boolean flag = false;
i = 1;          // statement 1
flag = true;    // statement 2

The code above defines an int variable and a boolean variable and then assigns a value to each. In code order, statement 1 comes before statement 2, but does the JVM guarantee that statement 1 executes before statement 2 when it actually runs this code? Not necessarily, because instruction reordering may occur here.
What is instruction reordering? In general, the processor may optimize the code it is given to make the program run more efficiently. It does not guarantee that the individual statements execute in the order in which they appear in the code, but it does guarantee that the final result of the program is the same as if the code had executed in order.
In the code above, for example, it makes no difference to the final result whether statement 1 or statement 2 executes first, so statement 2 may execute before statement 1.

Although reordering does not affect the results of program execution within a single thread, what about multithreading? Let's look at an example:

// Thread 1:
context = loadContext();    // statement 1
inited = true;              // statement 2

// Thread 2:
while (!inited) {
    sleep();
}
doSomethingWithConfig(context);

In the code above, statement 1 and statement 2 have no data dependency, so they may be reordered. If they are reordered and thread 1 executes statement 2 first, thread 2 assumes that initialization is complete, leaves the while loop and calls doSomethingWithConfig(context). The context has not been initialized at that point, so the program fails.
As this shows, instruction reordering does not affect execution within a single thread, but it does affect the correctness of concurrent execution.
In other words, a concurrent program executes correctly only if atomicity, visibility and ordering are all guaranteed. If any one of them is not guaranteed, the program may run incorrectly.

III. Java memory model

The Java memory model stipulates that all variables reside in main memory (similar to the physical memory mentioned earlier) and that each thread has its own working memory (similar to the cache mentioned earlier). A thread must perform all operations on a variable in its working memory and cannot operate on main memory directly, and no thread can access another thread's working memory. In other words, the Java memory model is also subject to the cache consistency and instruction reordering problems.

1. Atomicity

In Java, reads of and assignments to variables of primitive types (except long and double) are atomic: they cannot be interrupted and either execute completely or not at all. Java also provides atomic classes such as AtomicInteger and AtomicBoolean.
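To make the distinction concrete, here is a small sketch that annotates a few everyday statements with whether they are a single atomic step or a sequence of steps. The class and field names are illustrative only.

```java
// Which simple statements are atomic? x and y are illustrative int fields.
class AtomicityExamples {
    static int x;
    static int y;

    static void run() {
        x = 10;      // atomic: a single write of a constant to an int
        y = x;       // two steps: read x, then write y; each step is atomic, the pair is not
        x++;         // three steps: read x, add 1, write x; not atomic
        x = x + 1;   // the same three steps as x++; not atomic
    }

    public static void main(String[] args) {
        run();
        System.out.println("x = " + x + ", y = " + y); // prints "x = 12, y = 10"
    }
}
```

In a single thread the result is always the same. The compound statements only become a problem when another thread can read or write x between the individual steps.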

2. Visibility

For visibility, Java provides the volatile keyword.
When a shared variable is declared volatile, a modified value is guaranteed to be written to main memory immediately, and other threads that need to read it will read the new value from main memory.
An ordinary shared variable does not guarantee visibility: after it is modified, it is not determined when the new value is written to main memory, so another thread reading it may still see the old value.
Visibility can also be guaranteed with synchronized and Lock. Both ensure that only one thread at a time holds the lock and executes the synchronized code, and that changes to variables are flushed to main memory before the lock is released.

3. Ordering

The Java memory model allows the compiler and the processor to reorder instructions. Reordering does not affect the result of single-threaded execution, but it can affect the correctness of concurrent multithreaded execution.
In Java, the volatile keyword can guarantee a certain degree of ordering (the specific principle is described in the next section). Ordering can also be guaranteed with synchronized and Lock: they ensure that only one thread at a time executes the synchronized code, which is equivalent to having the threads execute that code one after another, so ordering follows naturally.
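As an illustration of the lock-based approach, the following sketch guards a plain (non-volatile) counter with a ReentrantLock. The LockedCounter class and its runDemo helper are hypothetical names chosen for this example. A synchronized block would serve the same purpose.

```java
import java.util.concurrent.locks.ReentrantLock;

// A plain int counter made safe by a lock (illustrative class, not from the article).
class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int count = 0;    // plain field, always accessed while holding the lock

    void increment() {
        lock.lock();
        try {
            count++;          // only one thread at a time executes this
        } finally {
            lock.unlock();    // the update is visible to the next thread that acquires the lock
        }
    }

    int get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }

    // Starts the given number of threads, each incrementing the same counter.
    static int runDemo(int threads, int incrementsPerThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // prints 4000
    }
}
```

Because every increment happens while the lock is held, no update is lost and each thread sees the changes made by the previous holder.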

IV. In-depth analysis of the volatile keyword

Much of the material above lays the groundwork for discussing the volatile keyword.

1. The two semantics of the volatile keyword
Once a shared variable (a member variable or a static member variable of a class) is declared volatile, it has two semantics:

    • Visibility across threads is guaranteed: when one thread modifies the variable's value, the new value is immediately visible to other threads;
    • Instruction reordering is prohibited.
Take a look at the following code:

// Thread 1
boolean stop = false;
while (!stop) {
    doSomething();
}

// Thread 2
stop = true;

This is a classic piece of code, and many people use this pattern to interrupt a thread. In most cases it does stop the thread, but in rare cases it fails to, and the loop never ends. Here is why. As explained earlier, each thread has its own working memory while it runs, so when thread 1 runs, it copies the value of the stop variable into its working memory. If thread 2 then changes the value of stop but moves on to other work before writing the change back to main memory, thread 1 never learns of the change and keeps looping.

Things are different once stop is declared volatile:
① the volatile keyword forces the modified value to be written to main memory immediately;
② when thread 2 modifies stop, the cache line holding stop in thread 1's working memory is invalidated (at the hardware level, the corresponding cache line in the CPU's L1 or L2 cache is invalidated);
③ because that cache line in thread 1's working memory is invalid, thread 1 reads stop from main memory the next time it needs the value.
So when thread 2 modifies stop (which involves two operations: modifying the value in thread 2's working memory and then writing the modified value to main memory), the cache line holding stop in thread 1's working memory is invalidated. When thread 1 next reads stop, it finds that its cache line is invalid, waits for the corresponding main memory address to be updated, and then reads the latest value from main memory.
Thread 1 therefore reads the latest, correct value.
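A minimal runnable version of the stop-flag pattern might look like the sketch below. The StoppableWorker class, the 50 ms pause and the 5 second join timeout are arbitrary choices for illustration, and the calling thread plays the role of thread 2.

```java
// Stop-flag pattern with a volatile field (illustrative sketch).
class StoppableWorker {
    private volatile boolean stop = false;  // volatile: the write in requestStop() becomes visible to the loop
    private long iterations = 0;            // only touched by the worker thread

    void runLoop() {
        while (!stop) {
            iterations++;                   // stand-in for doSomething()
        }
    }

    void requestStop() {
        stop = true;
    }

    // Returns true if the worker thread terminated after the stop request.
    static boolean runDemo() {
        StoppableWorker worker = new StoppableWorker();
        Thread thread1 = new Thread(worker::runLoop);
        thread1.setDaemon(true);            // do not keep the JVM alive if something goes wrong
        thread1.start();
        try {
            Thread.sleep(50);               // let the loop spin for a moment
            worker.requestStop();           // the calling thread acts as thread 2
            thread1.join(5000);             // wait up to 5 seconds for the loop to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !thread1.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runDemo());
    }
}
```

If the volatile modifier is removed, the loop is allowed to keep reading a stale value of stop, which is exactly the failure described above. Whether that actually happens depends on the JVM and the hardware.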

2. volatile guarantees visibility but not atomicity

volatile only guarantees that every read of the variable returns the most recent value; it does not guarantee the atomicity of operations on the variable.

To guarantee atomicity, use the synchronized keyword, a Lock, or an atomic variable (AtomicInteger and so on).
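The difference shows up as soon as several threads increment the same counter. The sketch below compares a volatile int with an AtomicInteger. The class name, thread count and iteration count are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// volatile int versus AtomicInteger under concurrent increments (illustrative sketch).
class IncrementComparison {
    static volatile int volatileCount = 0;                        // visible, but ++ is not atomic
    static final AtomicInteger atomicCount = new AtomicInteger(); // atomic read-modify-write

    static void runDemo(int threads, int incrementsPerThread) {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    volatileCount++;                // read, add, write: updates can be lost
                    atomicCount.incrementAndGet();  // one atomic step
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        runDemo(10, 1000);
        System.out.println("volatile: " + volatileCount + " (may be less than 10000)");
        System.out.println("atomic:   " + atomicCount.get()); // always 10000
    }
}
```

The atomic counter always reaches the full total, while the volatile counter can fall short, because two threads may read the same value before either writes its result back.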

3. volatile can guarantee ordering

The volatile keyword prevents instruction reordering, so volatile guarantees ordering to a certain degree.
Prohibiting instruction reordering has two meanings:
1) When the program reads or writes a volatile variable, all changes made by the operations before it must already have taken effect, and their results are visible to the operations after it;
2) During instruction optimization, statements before an access to a volatile variable cannot be moved after it, and statements after it cannot be moved before it.

Take a look at this example:

// x and y are non-volatile variables
// flag is a volatile variable

x = 2;          // statement 1
y = 0;          // statement 2
flag = true;    // statement 3
x = 4;          // statement 4
y = -1;         // statement 5

Because flag is a volatile variable, reordering will not move statement 3 before statement 1 or statement 2, nor move it after statement 4 or statement 5. Note, however, that the relative order of statement 1 and statement 2, and of statement 4 and statement 5, is not guaranteed.
The volatile keyword also guarantees that by the time statement 3 executes, statement 1 and statement 2 have completed, and that their results are visible to statement 3, statement 4 and statement 5.

Let's return to the earlier example:

// Thread 1:
context = loadContext();    // statement 1
inited = true;              // statement 2

// Thread 2:
while (!inited) {
    sleep();
}
doSomethingWithConfig(context);

As mentioned earlier, statement 2 may be executed before statement 1, in which case the context may not yet be initialized when thread 2 uses it, resulting in a program error.
If the inited variable is declared volatile, this problem cannot occur, because the context is guaranteed to be initialized by the time statement 2 executes.
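The corrected example can be written as a small self-contained program. This is a sketch under stated assumptions: loadContext() here is a hypothetical stand-in that returns a String, Thread.yield() replaces the article's sleep(), and the calling thread plays the role of thread 1.

```java
// Publishing a plain field through a volatile flag (illustrative sketch).
class ContextPublication {
    static String context;                   // plain field, written before the volatile flag
    static volatile boolean inited = false;  // the volatile write publishes the context

    // Hypothetical stand-in for the article's loadContext().
    static String loadContext() {
        return "configuration loaded";
    }

    // Returns what thread 2 observed in context after it saw inited == true.
    static String runDemo() {
        context = null;
        inited = false;
        String[] observed = new String[1];

        Thread thread2 = new Thread(() -> {
            while (!inited) {                // volatile read
                Thread.yield();
            }
            observed[0] = context;           // sees the value written by statement 1
        });
        thread2.start();

        context = loadContext();             // statement 1
        inited = true;                       // statement 2: cannot be reordered before statement 1

        try {
            thread2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints "configuration loaded"
    }
}
```

Only the flag needs to be volatile. The write to context is made visible together with the volatile write to inited, so thread 2 never observes an uninitialized context once it has seen the flag set.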

4. Principle and implementation mechanism of volatile

The preceding sections described some uses of the volatile keyword. Now let's look at how volatile guarantees visibility and prevents instruction reordering.
The following is an excerpt from the book In-depth Understanding of the Java Virtual Machine:
"Comparing the assembly code generated with and without the volatile keyword shows that adding volatile produces an extra lock-prefixed instruction."
The lock-prefixed instruction is in effect a memory barrier (also called a memory fence), which provides 3 functions:
1) It ensures that reordering cannot move instructions that follow the memory barrier in front of it, nor move instructions that precede it behind it; that is, by the time the memory barrier instruction executes, all operations before it have completed;
2) It forces modifications in the cache to be written to main memory immediately;
3) If the operation is a write, it invalidates the corresponding cache line in other CPUs.

V. Scenarios for using the volatile keyword

In general, volatile is appropriate when a variable's new value does not depend on its current value.

Scenario 1. Get method

private volatile int value = 0;

public void setValue(int value) {
    this.value = value;
}

public int getValue() {
    return value;
}

When the set method and the get method run in two different threads, the set method's modification of value does not depend on the current value of value, so adding the volatile keyword to the value variable is enough to ensure that the get method returns the latest value.

Scenario 2. Status flag

volatile boolean flag = false;

while (!flag) {
    doSomething();
}

public void setFlag() {
    flag = true;
}

volatile boolean inited = false;

// Thread 1:
context = loadContext();
inited = true;

// Thread 2:
while (!inited) {
    sleep();
}
doSomethingWithConfig(context);

Concurrency experts suggest staying away from volatile, and the advice has merit. Here is a summary:

    • volatile was introduced at a time when synchronized performed poorly. The efficiency of synchronized has since improved greatly, so volatile matters less than it used to.
    • In practice, when access is not extremely frequent, a non-volatile shared variable often appears to behave the same as a volatile one, although the Java memory model does not guarantee this.
    • volatile does not guarantee atomicity. This is not widely understood, so it is easy to make mistakes.
    • volatile can prohibit reordering.
So if we are sure we can use volatile correctly, it is a good choice when reordering must be prohibited; otherwise there is little need to use it.


Resources:

Http://www.cnblogs.com/dolphin0520/p/3920373.html

Http://www.cnblogs.com/mengheng/p/3495379.html

