Concurrent programming: the Java memory model, the volatile keyword, and the happens-before rules (Java)

Source: Internet
Author: User
Tags: static, class, visibility, volatile

Preface

To be honest, this title sounds a bit ambitious. Each of these three topics could fill an article of its own, but they are so tightly related that I think they belong to a single body of knowledge and are best studied together, so that we can better understand the Java memory model, the volatile keyword, and the happens-before rules.

I will try to cover all three questions in one article and summarize at the end. The plan:

1. The hardware knowledge you need to review before talking about concurrency.
2. What the Java memory model is.
3. What the Java memory model defines.
4. The happens-before rules that the Java memory model leads to.
5. The volatile keyword that happens-before leads to.
6. A summary of all three.

1. The hardware knowledge you need to review before talking about concurrency

First, we need some hardware background, because the concurrency model of the Java virtual machine closely mirrors that of the physical machine; many of the seemingly strange design choices in the JVM are direct consequences of how the physical hardware is designed.

What is concurrency? Roughly speaking, multiple CPUs (or cores) executing at the same time. But note: the CPU alone is not enough. A CPU can only compute on data, so where does the data come from?

Answer: memory. The data comes from memory, and the results of the computation have to be stored back there as well. Some readers will point out that there are also registers and multi-level caches. Those caches are static RAM (SRAM); SRAM uses more transistors per bit, is expensive, and is hard to build in large capacities, so only a small amount of it is integrated into the CPU as cache. What we ordinarily call memory is dynamic RAM (DRAM). Intel CPUs traditionally accessed memory over the front-side bus through the north bridge, while AMD integrated the memory controller into the CPU itself, so the CPU talks to memory directly instead of going through a north bridge; in theory this speeds up transfers between the CPU and memory.

Either way, every CPU needs to read data from memory and has caches and registers. What is the cache for? The CPU is fast and memory cannot keep up, so a cache is inserted between the CPU and memory as a buffer: data that is about to be used is copied into the cache so operations can run at full speed, and when the computation finishes the results are synchronized from the cache back to memory. This saves the processor from waiting on slow memory reads and writes.
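
As a rough, hypothetical illustration of how much the cache matters (my own sketch, not a proper benchmark, and the class name is made up), the following program walks a large two-dimensional array row by row (cache friendly) and then column by column (cache hostile); on most machines the second pass is noticeably slower, though the exact numbers vary:

public class CacheDemo {
  public static void main(String[] args) {
    int n = 4096;                              // 4096 x 4096 ints, roughly 64 MB of heap
    int[][] matrix = new int[n][n];

    long start = System.nanoTime();
    for (int row = 0; row < n; row++)          // row-major walk: consecutive memory, cache friendly
      for (int col = 0; col < n; col++)
        matrix[row][col]++;
    System.out.println("row-major:    " + (System.nanoTime() - start) / 1_000_000 + " ms");

    start = System.nanoTime();
    for (int col = 0; col < n; col++)          // column-major walk: jumps between rows, frequent cache misses
      for (int row = 0; row < n; row++)
        matrix[row][col]++;
    System.out.println("column-major: " + (System.nanoTime() - start) / 1_000_000 + " ms");
  }
}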

But this introduces another problem: cache coherence. What does that mean?

With multiple processors, each processor has its own cache while they all share the same main memory. When several processors operate on the same region of main memory, their cached copies can become inconsistent. If that happens, whose cached data should be synchronized back to main memory?

Early CPUs solved cache inconsistency by asserting a LOCK# signal on the bus. Since the CPU communicates with other components through the bus, locking the bus blocks the other CPUs from accessing those components (such as memory), so only one CPU can use the variable's memory at a time.

Modern CPUs instead solve the coherence problem by requiring every CPU to follow a protocol whenever it reads or writes its cache: MSI, MESI, MOSI, Synapse, Firefly, Dragon, and so on. These are cache coherence protocols.

So now we need to introduce a term: the memory model.

What is the memory model?

A memory model can be understood as an abstraction of the process of reading and writing a particular memory or cache under a specific operational protocol. CPUs of different architectures have different memory models, and the Java virtual machine masks those differences with a memory model of its own: the Java memory model.

So what is the structure of the Java memory model?

Hold on. As far as the hardware memory model goes, we have covered the essentials: multiple CPUs accessing the same stick of memory can end up with inconsistent data, so a protocol is needed that the processors all follow to keep data consistent when they access memory. But there is one more issue to look at first: pipelined and out-of-order execution in the CPU.

Let's assume we now have a piece of code:

int a = 1;
int b = 2;
int c = a + b;

Can we reorder this code without changing the result? Yes: there is no problem swapping the first and second lines, since the third line depends on both and must stay last.

In fact, the CPU sometimes does swap instructions like this to optimize performance (while guaranteeing the same result); the jargon for it is reordering. Why does reordering improve performance?

This is a bit involved, so let's take it slowly.

We know that executing one instruction can be broken down into several steps; simplified, they are:
1. Instruction fetch (IF)
2. Instruction decode and register fetch (ID)
3. Execute or calculate the effective address (EX)
4. Memory access (MEM)
5. Write back (WB)

An assembly instruction is not executed in one shot. Inside the CPU it really does go through these stages, and each stage may involve different hardware: instruction fetch uses the program counter and memory, decode uses the instruction register and the register file, execute uses the ALU, and write-back needs the register file again.

Precisely because each step can be handled by different hardware, CPU engineers invented pipelining to execute instructions. What does that mean?

Think of a car wash. The customer issues a single "wash the car" command, but the shop splits it into separate steps: rinsing, foaming, scrubbing, drying, waxing, and so on. These steps can be done by different employees; there is no need for one employee to do everything while the rest stand around waiting. Each employee handles one task and hands the car on to the next, just like an assembly line in a factory.

The CPU does the same thing when it executes instructions.

Once a pipeline exists, you do not want it to stall: a stall in one stage drags down the efficiency of every stage behind it, and the performance loss is large.

So what do you do? Suppose the original order is 1 rinse, 2 foam, 3 scrub, 4 dry, 5 wax. If the water is momentarily unavailable, every step after the rinse is delayed. But we can swap the rinse with the foaming: foam the car first and fetch the water in the meantime; by the time the first car has been foamed, the water is back and the rinsing continues, and no work is affected. The order becomes:

1 foam, 2 rinse, 3 scrub, 4 dry, 5 wax.

The work is not affected at all, and the pipeline is never broken. Out-of-order execution inside the CPU works on the same principle; the ultimate goal is to squeeze out the CPU's performance.

Well, that is about all the hardware knowledge this article needs. To sum up, there are two main points:
1. The CPU's multi-level caches must follow a cache coherence protocol when accessing main memory; this process can be abstracted as a memory model.
2. For performance, the CPU pipelines instructions and may execute them out of order, as long as the result observed within a single CPU does not change.

So the next step is to talk about the Java memory model.

2. What is the Java memory model?

Recall what we said above about the memory model at the hardware level:

A memory model can be understood as an abstraction of the process of reading and writing a particular memory or cache under a specific operational protocol. CPUs of different architectures have different memory models.

Java, as a cross-platform language, has to mask the differences between those CPU memory models and construct a memory model of its own: the Java memory model. Its design ultimately derives from the hardware memory models.

Picture the hardware diagram again: the Java memory model looks almost the same. Each thread has its own working memory, similar to a CPU cache, and Java's main memory corresponds to the machine's RAM.

The Java memory model also abstracts the process of thread access to memory.

The JMM (Java memory model) stipulates that all variables are stored in main memory (this is important). "Variables" here means instance fields, static fields, and the elements that make up array objects, but not local variables or method parameters, because those are thread-private. They are never shared, so naturally there is no contention over them.

What is working memory? Each thread has its own working memory (this is also important). A thread's working memory holds copies of the main-memory variables that the thread uses, and all of the thread's operations on variables (reads and writes) must happen in working memory; a thread cannot read or write main-memory variables directly. Threads cannot access each other's working memory either; variable values are passed between threads only through main memory.
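
A tiny sketch (the class and field names are mine, not from the article) of which variables the JMM is talking about:

public class SharedVsLocal {
  static int sharedCounter;       // static field: lives in main memory and is shared between threads
  int instanceField;              // instance field: also shared if the object itself is shared

  void work(int parameter) {      // method parameter: thread-private
    int local = parameter + 1;    // local variable: thread-private, can never be contended
    sharedCounter += local;       // read-modify-write on shared state: a data race without synchronization
  }
}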

To sum up, the Java memory model defines two important things: 1. main memory and 2. working memory. Each thread's working memory is independent; a thread can only operate on data in its working memory and then flush the results into main memory. That is the basic way of working that the Java memory model defines for threads.

3. What does the Java memory model define?

In fact, the entire Java memory model is built around three properties, and these three properties are the foundation of all Java concurrency:

atomicity, visibility, and ordering.

Atomicity

What is atomicity? It is essentially the same notion as atomicity in transactions: an operation that cannot be interrupted and cannot be subdivided. Even with multiple threads executing together, once such an operation has started it cannot be disturbed by other threads.

We can roughly assume that access to the basic data types is atomic (with the caveat of long and double on a 32-bit virtual machine: the Java Virtual Machine specification does not require reads and writes of long and double to be atomic, although it strongly recommends it). In practice, most commercial virtual machines do implement them atomically.
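
One thing the specification does guarantee is that reads and writes of volatile long and double fields are always atomic, so if you must rely on that you can declare the field volatile. A minimal sketch (class and field names are illustrative):

public class VolatileLong {
  static volatile long lastTimestamp;     // volatile: the 64-bit write is atomic and visible, even on a 32-bit VM

  static void record() {
    lastTimestamp = System.nanoTime();    // a single atomic write, never torn into two 32-bit halves
  }

  static long read() {
    return lastTimestamp;                 // a single atomic read
  }
}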

If you need atomicity over a wider scope, the Java memory model provides the lock and unlock operations (two of its eight memory operations) to meet that need. These two operations are not exposed to programmers directly; instead they back the higher-level monitorenter and monitorexit bytecode instructions, which is exactly what the synchronized keyword turns into. So the operations inside a synchronized block are atomic with respect to other threads using the same lock.
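
A minimal sketch of that idea (my own example, not from the article): the synchronized block below compiles to monitorenter/monitorexit, which you can see with javap -c, and the read-modify-write inside it cannot be interleaved by another thread synchronizing on the same lock.

public class SyncCounter {
  private final Object lock = new Object();
  private int count;

  void increment() {
    synchronized (lock) {   // monitorenter
      count++;              // indivisible with respect to other threads synchronizing on `lock`
    }                       // monitorexit
  }

  int get() {
    synchronized (lock) {
      return count;
    }
  }
}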

Visibility

Visibility means that when one thread modifies the value of a shared variable, other threads can immediately see the change. The Java memory model achieves this by using main memory as the transfer medium: new values are synchronized back to main memory after a variable is modified, and the value is refreshed from main memory before the variable is read. This mechanism is the same for ordinary variables and volatile variables. The difference is that volatile's special rules guarantee that new values are synchronized to main memory immediately and that every read is refreshed from main memory, so volatile guarantees the visibility of a variable across threads, while ordinary variables do not.

Besides volatile, synchronized and final can also achieve visibility. The visibility of a synchronized block comes from the rule that before a thread may perform unlock on a variable, it must synchronize that variable back to main memory (the store and write operations).

Ordering

We discussed ordering in the hardware section above: the CPU adjusts the order of instructions, and the Java virtual machine likewise adjusts the order of bytecode. Within a single thread this adjustment is imperceptible; in a multithreaded program, however, it can produce unexpected errors, as shown below.
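
Here is a sketch of how reordering only becomes observable across threads (my own demo, and it is not guaranteed to reproduce on every machine or run): if neither thread's write is reordered with its following read, at least one of a and b must end up as 1, so seeing (0, 0) means the hardware or JIT reordered (or failed to publish) the writes.

public class ReorderDemo {
  static int x, y, a, b;

  public static void main(String[] args) throws InterruptedException {
    for (long run = 1; run <= 100_000; run++) {
      x = 0; y = 0; a = 0; b = 0;
      Thread t1 = new Thread(() -> { x = 1; a = y; });   // write x, then read y
      Thread t2 = new Thread(() -> { y = 1; b = x; });   // write y, then read x
      t1.start(); t2.start();
      t1.join();  t2.join();
      if (a == 0 && b == 0) {                            // impossible without reordering / store buffering
        System.out.println("Reordering observed on run " + run + ": a = 0, b = 0");
        return;
      }
    }
    System.out.println("No reordering observed this time (it is not guaranteed to appear).");
  }
}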

Java has two keywords that guarantee ordering between threads. The volatile keyword itself carries the semantics of forbidding reordering, while synchronized gets its ordering from the rule that a variable may be locked by only one thread at a time, which means two synchronized blocks guarded by the same lock can only be entered serially.

Those are the three basic properties of the JMM. You may have noticed that volatile guarantees visibility and ordering, while synchronized guarantees all three, so it could be called all-purpose; it is also easy to use. Even so, stay wary of its impact on performance.

4. What are the happens-before rules that the Java memory model leads to?

Speaking of ordering: we said ordering can be achieved with volatile and synchronized, but we cannot hang every piece of code on those two keywords. In fact, the Java language itself has rules about ordering and reordering that the virtual machine must not violate when it optimizes. They are listed here, with a small example after the list:
1. Program order rule: within a single thread, following program order, an operation written earlier happens-before an operation written later.
2. Volatile rule: a write to a volatile variable happens-before subsequent reads of it; this is what gives volatile variables their visibility.
3. Lock rule: an unlock happens-before every subsequent lock of the same monitor.
4. Transitivity: if A happens-before B and B happens-before C, then A happens-before C.
5. Thread start rule: a call to Thread.start() happens-before every action of the started thread.
6. Thread termination rule: every action of a thread happens-before the detection that the thread has ended.
7. Thread interruption rule: a call to interrupt() happens-before the interrupted thread detects the interruption.
8. Finalizer rule: the end of an object's constructor happens-before the start of its finalize() method.
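
A small sketch of rules 1, 5 and 6 working together (class and field names are mine): writes made before Thread.start() are visible to the started thread, and writes made by a thread are visible to whoever join()s it, with no volatile or lock needed.

public class StartJoinDemo {
  static int input;
  static int result;

  public static void main(String[] args) throws InterruptedException {
    input = 21;                                      // happens-before t.start() (program order + start rule)
    Thread t = new Thread(() -> result = input * 2);
    t.start();
    t.join();                                        // every action in t happens-before join() returns (termination rule + transitivity)
    System.out.println(result);                      // guaranteed to print 42
  }
}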

5. What is the volatile keyword that happens-before leads to?

We have mentioned the volatile keyword many times already, so clearly it is important; it may even appear in code a little more often than synchronized. So what exactly can this keyword do?

Volatile gives a variable visibility across threads, and ordering as well; what it cannot give is atomicity.

Let's just write a piece of code.

package cn.think.in.java.two;

/**
 * volatile does not guarantee atomicity; it only follows the happens-before rules
 * to guarantee ordering and visibility.
 */
public class MultitudeTest {

  static volatile int i = 0;

  static class PlusTask implements Runnable {

    @Override
    public void run() {
      for (int j = 0; j < 10000; j++) {
        // plusI();
        i++;
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread[] threads = new Thread[10];
    for (int j = 0; j < 10; j++) {
      threads[j] = new Thread(new PlusTask());
      threads[j].start();
    }

    for (int j = 0; j < 10; j++) {
      threads[j].join();
    }

    System.out.println(i);
  }

  static synchronized void plusI() {
    i++;
  }
}

We start 10 threads, each doing ++ on a shared int variable; note that ++ is not atomic. The main thread then joins the 10 threads and prints the value after they finish. You will find that, no matter how often you run it, the result is almost never the expected 100000 (10 threads, each incrementing 10000 times), because the increment is not atomic. How should we understand that?

i++ equals i = i + 1;

The virtual machine first reads the value of i and then adds 1 to it. volatile does guarantee that each thread reads the latest value, and at the moment a thread reads i the value really is the latest, but ten threads are reading it: they all read the latest value and then all add 1 at the same time, and none of that violates the definition of volatile. The final result is therefore wrong; you could say we simply used volatile for the wrong job.
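
Spelled out as a sketch (the class and method names are mine), the single statement i++ really is three separate steps, and the steps of different threads can interleave:

public class NotAtomic {
  static volatile int i = 0;

  static void incrementNotAtomic() {
    int tmp = i;     // 1. volatile read: sees the latest value at this instant
    tmp = tmp + 1;   // 2. local add: another thread may increment i right now
    i = tmp;         // 3. volatile write: can silently overwrite that concurrent increment (a lost update)
  }
}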

The test code also includes a synchronized method, plusI, and synchronization does guarantee atomicity: when the for loop calls plusI instead of doing i++ directly, the result comes out exactly right.
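
Another common fix, not used in the original code but worth knowing, is java.util.concurrent.atomic, which performs the read-modify-write as one atomic operation without a lock; a sketch:

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicPlusTask implements Runnable {
  static final AtomicInteger i = new AtomicInteger(0);

  @Override
  public void run() {
    for (int j = 0; j < 10000; j++) {
      i.incrementAndGet();   // an atomic i++; with 10 such threads the total is exactly 100000
    }
  }
}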

So, when do you use volatile?

Use volatile when the result of the operation does not depend on the current value of the variable, or when you can ensure that only a single thread ever modifies the variable.
Our program's result does depend on the current value of i. If the operation were changed to one that does not, say i = j, the final result would be the correct 9999.

For example, the following program is a proper use of volatile:

package cn.think.in.java.two;

/**
 * Java memory model:
 * Code within a single thread may be reordered/optimized.
 * Without volatile, the JIT in -server mode may optimize (hoist) the read in the loop below,
 * resulting in an infinite loop.
 */
public class JmmDemo {

  // static boolean ready;
  static volatile boolean ready;
  static int num;

  static class ReaderThread extends Thread {

    public void run() {
      while (!ready) {
      }
      System.out.println(num);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    new ReaderThread().start();
    Thread.sleep(1000);
    num = 32;
    ready = true;
    Thread.sleep(1000);
    Thread.yield();
  }
}

This program is interesting. We use the volatile variable to control the flow, and the final, correct output is 32. But note that if you remove the volatile keyword and launch the virtual machine with the -server flag, this program never ends, because the JIT optimizes the loop and the reader thread never sees the other thread's modification of the variable (the JIT eliminates reads it considers redundant). With volatile, there is no problem.

The code above shows that volatile cannot guarantee atomicity, but it does guarantee ordering and visibility. So how are those achieved?

How is visibility achieved? In the generated assembly, writes to a volatile variable are preceded by a lock-prefixed instruction. According to the Intel IA-32 manual, this causes the CPU to write the cache line back to memory, and that write-back invalidates the corresponding cache lines in other CPUs or cores, so they must fetch the data from memory again. That is what produces visibility; at bottom it still relies on CPU instructions.

How is ordering achieved? By the same lock instruction, which also acts as a memory barrier (most modern CPUs execute out of order to improve performance, which is what makes memory barriers necessary). Semantically, all writes before the barrier are committed to memory before it completes, and reads after the barrier can observe the results of writes that precede it. So for sensitive code, a barrier is inserted after the write and before the read, meaning later instructions cannot be reordered to a position before the barrier. When only one CPU accesses the memory no barrier is needed, but when two or more CPUs access the same memory and one of them is observing the other, a memory barrier is required to make the guarantee hold.

Therefore, do not use volatile indiscriminately: it prevents the JIT from optimizing the code and inserts memory-barrier instructions, which degrades performance.

6. Summary

First, the JMM is an abstraction over the hardware memory models (which arose because multi-level caches require a cache coherence protocol), shielding us from the differences between CPUs and operating systems.

The Java memory model describes the process of accessing memory under a specific protocol: that is, how a thread's working memory and main memory interact, and in what order.

The JMM builds its rules mainly around atomicity, visibility, and ordering.

Synchronized provides all three of these properties, while volatile provides only visibility and ordering; final can also provide visibility.

The happens-before rules specify which reorderings the virtual machine may not perform, including the lock rule and the read/write rules for volatile variables.

As for volatile, we have also seen that it does not guarantee atomicity, so use it with care. Its underlying implementation is a CPU lock-prefixed instruction: visibility comes from invalidating the caches of the other CPUs, and ordering from the memory barrier that instruction implies.

All in all, these three concepts are closely related and depend on each other, which is why I put them in a single article. That may lead to some omissions, but it should not stop us from grasping the whole picture. It is fair to say that the JMM is the foundation of all concurrent programming; without understanding it, efficient concurrency is impossible.

Of course, this article still does not go all the way down; we have not analyzed how the JVM implements this internally. It is very late today; when there is a chance, we will go into the JVM source and look at the underlying implementation.

Good luck ....
