Thread Safety (Part 1): Thoroughly Understanding the volatile Keyword

Last Update:2018-08-20 Source: Internet

Author: User

Tags static class visibility volatile


Many of you have probably heard of the volatile keyword, and may even have used it. Its literal meaning is easy to grasp, but using it well is not.
This article explains volatile from several angles so that you can understand it more thoroughly.

Why thread-safety problems arise in a computer

Since volatile is about thread safety, let's start with why a computer can run into thread-safety problems while processing data.
When a computer runs a program, every instruction is executed by the CPU, and executing an instruction involves reading and writing data. The program's working data lives in main memory (physical RAM), and that creates a problem: the CPU executes instructions far faster than data can be read from or written to memory. If every operation on data had to go through memory, instruction execution would slow down dramatically.
To deal with this, CPUs have a cache. While the program runs, the data an operation needs is copied from main memory into the CPU cache. The CPU then reads and writes the data directly in its cache, and when the operation is finished, the data in the cache is flushed back to main memory.
Here is a simple example. Suppose the CPU executes the code below:

t = t + 1;

The CPU first checks whether the cache holds a value for t. If it does, the CPU uses it directly. If not, the CPU reads t from main memory and keeps a copy in the cache for next time. The CPU then adds 1 to t, writes the result to the cache, and finally flushes the cached value to main memory.

This process causes no problems in a single-threaded program, but it can go wrong with multiple threads. On a multi-core CPU, each thread may run on a different core, so each thread works with its own cache. (The problem can also occur on a single-core CPU, where threads take turns through scheduling; this article assumes a multi-core CPU.) As a result, the two caches can hold different values for the same variable.
For example:
Two threads each read t, whose value is 0, and each stores that value in its own cache. Thread 1 adds 1 to t, so its t is 1, and writes it back to main memory. Thread 2's cached value is still 0, so after adding 1 its t is also 1, and it writes that value back to main memory.
Two increments were performed, but t ends up as 1 instead of 2. That is a thread-safety problem.
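
The interleaving above can be replayed by hand in a single thread. This is only an illustrative sketch: the class name LostUpdateSketch and the variables cache1 and cache2 are made up for the example, and plain local variables stand in for the per-core caches rather than modeling real hardware.

```java
/** Hand-simulated lost update: two cores each cache t, add 1, and write back. */
final class LostUpdateSketch {

    /** Returns the value of t left in "main memory" after both write-backs. */
    static int simulate() {
        int mainMemory = 0;       // the shared variable t in main memory

        int cache1 = mainMemory;  // thread 1 copies t into its cache (0)
        int cache2 = mainMemory;  // thread 2 copies t into its cache (0)

        cache1 = cache1 + 1;      // thread 1: t = t + 1 in its own cache (1)
        mainMemory = cache1;      // thread 1 flushes its result to main memory (1)

        cache2 = cache2 + 1;      // thread 2 still works on its stale 0 (1)
        mainMemory = cache2;      // thread 2 flushes and overwrites with 1

        return mainMemory;
    }

    public static void main(String[] args) {
        // Two increments were performed, but this prints 1.
        System.out.println("t after two increments: " + simulate());
    }
}
```
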

Thread-safety issues in Java

Different operating systems may handle the thread-safety problems described above in different ways; Windows and Linux, for example, may deal with them differently.
Java is a cross-platform language, so it provides its own mechanisms for thread safety, such as the volatile and synchronized keywords, and these mechanisms behave the same on every platform.
The Java memory model stipulates that all variables live in main memory (comparable to the physical memory mentioned earlier) and that each thread has its own working memory (comparable to the cache described earlier). A thread must perform all of its operations on a variable in working memory, not directly on main memory, and no thread can access another thread's working memory.
Because each thread in Java has its own working memory, which plays the role of the cache described above, thread-safety problems arise when multiple threads operate on a shared variable.

A quick word on the term shared variable: the t above is a shared variable, that is, a variable that multiple threads can access. Shared variables in Java include instance variables, static variables, and array elements. They are all stored in heap memory.

volatile keyword

We have covered a lot of ground without yet saying what the volatile keyword does. Here is how volatile helps with thread safety.

Visibility: what is visibility?

Visibility means that, in a multithreaded environment, when one thread modifies a shared variable, the other threads immediately know that it has been modified. When another thread then reads the variable, it reads it from main memory instead of from its own working memory.
Take the earlier example: when thread 1 adds 1 to t and writes the result back to main memory, thread 2 learns that the t in its own working memory is out of date, and when it performs its own add-1 operation it reads t from main memory. That way, both threads see consistent data.
If a variable is declared volatile, it has the property of visibility. This is one of the key roles of volatile.
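
A common use of this property is a stop flag that one thread sets and another thread polls. The following is a minimal sketch rather than code from this article; the class name VolatileFlagSketch, the field stopRequested, and the 5-second timeout are assumptions chosen for illustration.

```java
/** A worker spins on a volatile flag; the main thread's write makes it exit. */
final class VolatileFlagSketch {
    // volatile: a write by one thread is visible to later reads by other threads
    private static volatile boolean stopRequested;

    /** Starts a spinning worker, asks it to stop, and reports whether it exited. */
    static boolean workerStops() {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {   // re-reads the flag on every iteration
                Thread.yield();
            }
        });
        worker.setDaemon(true);        // do not keep the JVM alive if the worker hangs
        worker.start();

        stopRequested = true;          // visible to the worker because the flag is volatile
        try {
            worker.join(5_000);        // wait up to 5 seconds (arbitrary timeout)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStops());
    }
}
```

Without volatile on the flag, the worker would not be guaranteed to ever observe the update and could spin indefinitely.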

How volatile guarantees visibility

When a write to a volatile variable is compiled to machine code on x86, an extra instruction like the following appears after the write:

0x00bbacde: lock addl $0x0,(%esp);

Taken by itself, this instruction adds 0 to the value at the top of the stack, which changes nothing. What matters is the lock prefix in front of it.
When a processor handles an instruction with the lock prefix:
On earlier processors, lock locks the bus that carries the data so that no other processor can access it. This ensures that the processor executing the locked instruction has exclusive access to the memory area holding the data, undisturbed by other processors.
However, while the bus is locked the other processors are blocked, which hurts multiprocessor performance. To solve this, later processors do not lock the bus when they encounter the lock prefix. Instead, they check the memory area where the data resides. If the data is in the processor's internal cache, the processor locks that cache area, writes the cache back to main memory after the operation, and relies on the cache coherence protocol to keep the cached data in other processors consistent.

Cache coherence protocol

When discussing visibility, I said that if a thread modifies a shared variable, another thread that goes to read the variable will read it from main memory instead of from its own working memory. What actually happens is this:
Each processor constantly snoops the bus for the memory addresses that other processors are operating on. If it detects that another processor intends to modify a memory address that it also holds in its own internal cache, it marks its cached copy of that address as invalid. The next time the processor accesses that data, it finds that its cached copy is invalid and reads the data from main memory.
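
The snoop-and-invalidate idea can be illustrated with a toy model. This sketch is not how hardware works in detail: real coherence protocols track several states per cache line, and the names here (SnoopingCacheSketch, Cache, peer, snoop) are invented for the example. It only shows that a write by one processor causes the other processor's next read to miss and reload.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy write-invalidate model: a write on one cache invalidates the peer's copy. */
final class SnoopingCacheSketch {
    static final Map<String, Integer> mainMemory = new HashMap<>();

    /** One processor's private cache; it is notified of writes made by its peer. */
    static final class Cache {
        private final Map<String, Integer> lines = new HashMap<>();
        private Cache peer;

        int read(String address) {
            Integer cached = lines.get(address);
            if (cached == null) {                  // missing or invalidated: reload
                cached = mainMemory.get(address);
                lines.put(address, cached);
            }
            return cached;
        }

        void write(String address, int value) {
            lines.put(address, value);
            mainMemory.put(address, value);        // write back to main memory
            peer.snoop(address);                   // announce the write on the "bus"
        }

        void snoop(String address) {
            lines.remove(address);                 // invalidate our own copy
        }
    }

    /** Returns what the second processor reads after the first one has written. */
    static int demo() {
        mainMemory.put("t", 0);
        Cache cpu1 = new Cache();
        Cache cpu2 = new Cache();
        cpu1.peer = cpu2;
        cpu2.peer = cpu1;

        cpu1.read("t");                            // both processors cache t = 0
        cpu2.read("t");
        cpu1.write("t", cpu1.read("t") + 1);       // cpu1 writes 1, cpu2's copy is invalidated
        return cpu2.read("t");                     // cpu2 misses and reloads from main memory
    }

    public static void main(String[] args) {
        System.out.println("cpu2 reads t = " + demo());
    }
}
```
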

Ordering

The virtual machine does not necessarily execute code in the order in which we wrote it. Take the following two lines of code:

int a = 1;
int b = 2;

For these two lines, it makes no difference to the final values of a and b whether a = 1 or b = 2 executes first. So when the virtual machine compiles the code, it may reorder them.
Why reorder them at all?
Suppose, hypothetically, that executing int a = 1 takes 100 ms while executing int b = 2 takes only 1 ms. Since the order does not affect the final values of a and b, it makes sense to execute int b = 2 first.
Therefore, when the virtual machine compiles code, it may reorder statements whenever changing their order does not affect the final values of the variables.
For more on compilation optimizations, see another article I wrote:
The optimization strategy of the virtual machine in the running period of the code

Can reordering really affect the code?
Yes. Even when reordering has no effect on the values of the variables within a single thread, it can introduce a thread-safety problem. Consider the code below:

public class NoVisibility {
    private static boolean ready;
    private static int number;
    private static class Reader extends Thread {
        public void run() {
            while (!ready) {
                Thread.yield();
            }
            System.out.println(number);
        }
    }
    public static void main(String[] args) {
        new Reader().start();
        number = 42;
        ready = true;
    }
}

Will this code print 42? Without reordering, it will. But if number = 42 and ready = true are reordered so that ready = true runs first, the program may print 0 instead of 42, because the initial value of number is 0.
Reordering is therefore a possible cause of thread-safety problems.

If a variable is declared volatile, accesses to it are not reordered freely: the virtual machine guarantees that the code before a write to the variable takes effect before the write, and that the code after a read of the variable takes effect after the read.
For example, if ready above is declared volatile, then number = 42 is guaranteed to take effect before ready = true, so a reader that sees ready as true also sees 42. (The variable to mark volatile is ready rather than number, because ready is the last write in main and the variable the reader polls.)

Note, however, that the virtual machine only guarantees the position of the surrounding code relative to the volatile access. The statements before it may still be reordered among themselves, and the same goes for the statements after it.
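
Here is a minimal sketch of the NoVisibility example made safe. Note the assumption: it marks the flag ready as volatile, because that is the variable the reader polls, and the class name OrderedPublicationSketch and the helper method readerObserves are invented for illustration.

```java
/** The plain write to number is published through the volatile flag ready. */
final class OrderedPublicationSketch {
    private static int number;              // plain field
    private static volatile boolean ready;  // volatile flag that guards number

    /** Runs the writer and reader once and returns the number the reader saw. */
    static int readerObserves() {
        number = 0;
        ready = false;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {                // volatile read
                Thread.yield();
            }
            seen[0] = number;               // cannot move before the volatile read
        });
        reader.start();

        number = 42;                        // cannot move after the volatile write below
        ready = true;                       // volatile write publishes number

        try {
            reader.join();                  // join makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(readerObserves());   // prints 42
    }
}
```
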

Guaranteeing the ordering of code is thus the second role of the volatile keyword.
To summarize, declaring a variable volatile provides the following two properties that help with thread safety:

Visibility.
Ordering.

Does volatile really guarantee the thread safety of a variable?

From the explanation above, volatile looks very useful: it guarantees both the visibility of a variable and the ordering of the code around it.
So does it really guarantee that a variable is used correctly in a multithreaded environment?
No. The reason is that many operations in Java are not atomic.

Atomic operation

Atomic operation: one or more operations that either all execute to completion without being interrupted by anything, or do not execute at all.
That is, the processor must finish the whole group of operations without being interrupted part-way through by other operations, or else perform none of them.
As just mentioned, many operations in Java are not atomic. Take this code, for example:

int a = b + 1;

To run this code, the processor performs the following three operations:

Read the value of b from memory.
Compute b + 1.
Write the result to a in memory.

The processor does not necessarily perform these three operations back to back. After the first one, for example, it may switch to other work.

An example showing that volatile does not guarantee thread safety

Because such operations are not atomic, a variable declared volatile is still not guaranteed to be thread-safe.
Here is an example to illustrate. The code is as follows:

public class Test {
    public static volatile int t = 0;
    public static void main(String[] args) {
        Thread[] threads = new Thread[10];
        for (int i = 0; i < 10; i++) {
            // each thread adds 1 to t 1000 times
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int j = 0; j < 1000; j++) {
                        t = t + 1;
                    }
                }
            });
            threads[i].start();
        }
        // wait for all the accumulating threads to finish
        while (Thread.activeCount() > 1) {
            Thread.yield();
        }
        // print the value of t
        System.out.println(t);
    }
}

Will the program print 1000 * 10 = 10000? Not necessarily; the result is often smaller.
The problem lies in the statement t = t + 1. Let's analyze it.
For example:
Thread 1 reads the value of t, say t = 0. Thread 2 then reads the value of t, which is still 0.
Thread 1 then adds 1, so its t is 1, but the processor has not yet written t = 1 back to main memory. At this moment the processor switches to thread 2. Thread 2 has already read t, so it does not read it again and still holds 0. Thread 2 then adds 1, and its t is also 1.
This is a thread-safety problem: two threads each added 1 to t, yet t ends up as 1. The volatile keyword therefore does not by itself guarantee that a variable is thread-safe.
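
One possible way to make the counter correct is to guard the read-add-write sequence with a lock so that it executes as a unit. The sketch below uses synchronized, which this article does not cover, so treat it as an assumption-laden illustration rather than the article's method. The names SynchronizedCounterSketch, LOCK, and runCounters are made up, and Thread.join replaces the activeCount loop for simplicity.

```java
/** Ten threads each add 1 to t 1000 times, with the increment guarded by a lock. */
final class SynchronizedCounterSketch {
    private static final Object LOCK = new Object();
    private static int t;   // guarded by LOCK, so volatile is not needed here

    /** Runs all ten threads to completion and returns the final value of t. */
    static int runCounters() {
        synchronized (LOCK) {
            t = 0;
        }
        Thread[] threads = new Thread[10];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    synchronized (LOCK) {   // read, add, and write as one unit
                        t = t + 1;
                    }
                }
            });
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();              // wait for every thread to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        synchronized (LOCK) {
            return t;
        }
    }

    public static void main(String[] args) {
        System.out.println(runCounters());  // prints 10000
    }
}
```
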

When volatile can guarantee thread safety

Although the volatile keyword does not guarantee thread safety in general, it is sufficient in certain scenarios. Specifically, volatile can guarantee the thread safety of a variable when both of the following conditions hold:

The result of the operation does not depend on the variable's current value, or it is certain that only a single thread ever modifies the variable.
The variable does not take part in an invariant together with other state variables.
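
The first condition can be illustrated with a single-writer counter: only one thread ever modifies the volatile variable, and another thread merely reads it. This is a sketch under that assumption; the class name SingleWriterSketch and the field progress are hypothetical.

```java
/** One writer increments a volatile counter; a reader only observes it. */
final class SingleWriterSketch {
    private static volatile int progress;   // written by the writer thread only

    /** Returns the final count, or -1 if the reader ever saw the value decrease. */
    static int run() {
        progress = 0;
        final boolean[] wentBackwards = new boolean[1];

        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                progress = progress + 1;     // safe: no other thread writes progress
            }
        });
        Thread reader = new Thread(() -> {
            int last = 0;
            while (last < 1000) {
                int current = progress;      // volatile read
                if (current < last) {
                    wentBackwards[0] = true;
                }
                last = current;
                Thread.yield();
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return wentBackwards[0] ? -1 : progress;
    }

    public static void main(String[] args) {
        System.out.println(run());           // prints 1000
    }
}
```

If a second thread also incremented progress, the first condition would no longer hold and updates could be lost, as in the ten-thread example earlier.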

That wraps up the volatile keyword. If anything here is wrong, corrections are very welcome. The synchronized keyword will be covered in the next article.
Finish



