Java multithreading guide for beginners (9): Why Data Synchronization?

Last Update:2018-12-04 Source: Internet

Author: User


For the purposes of this article, variables in Java fall into two groups: local variables and class variables (fields). A local variable is defined inside a method, for example inside the run method. Each thread gets its own copy of such variables, so they are never shared between threads and do not need to be synchronized. Class variables are defined in the class itself, and their scope is the entire class. Multiple threads can share such variables, so access to them must be synchronized.
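A minimal sketch of the first half of that claim, that local variables are private to each thread. The class name LocalVsShared and the method countLocally are hypothetical and are not part of the original article. Each thread counts with its own local variable and stores the result in its own array slot, so no synchronization is needed and every total comes out complete.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative, not from the article.
class LocalVsShared {

    // Starts threadCount threads; each one counts to `steps` using only a local variable.
    static int[] countLocally(int threadCount, int steps) throws InterruptedException {
        int[] totals = new int[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int slot = i;
            threads[i] = new Thread(() -> {
                int local = 0;            // lives on this thread's own stack
                for (int s = 0; s < steps; s++) {
                    local++;              // no other thread can see or change it
                }
                totals[slot] = local;     // each thread writes only its own slot
            });
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();   // join makes the results visible to the caller
        return totals;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints [1000, 1000, 1000, 1000]
        System.out.println(Arrays.toString(countLocally(4, 1000)));
    }
}
```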

Data synchronization means that only one thread at a time can access the synchronized class variables. Other threads can access them only after the current thread has finished. "Access" here means access that includes write operations. If every thread only reads the variables, data synchronization is generally not required.

What happens if access to a shared class variable is not synchronized? Consider the following code:

package test;

public class MyThread extends Thread
{
    // Shared by every thread; access is deliberately not synchronized.
    public static int n = 0;

    public void run()
    {
        int m = n;        // read the shared value into a local variable
        Thread.yield();   // hint to the scheduler to let other threads run
        m++;
        n = m;            // write back, possibly overwriting another thread's update
    }

    public static void main(String[] args) throws Exception
    {
        MyThread myThread = new MyThread();
        Thread[] threads = new Thread[100];
        for (int i = 0; i < threads.length; i++)
            threads[i] = new Thread(myThread);
        for (int i = 0; i < threads.length; i++)
            threads[i].start();
        for (int i = 0; i < threads.length; i++)
            threads[i].join();
        System.out.println("n = " + MyThread.n);
    }
}

One possible result of running the code above is:

n = 59

Many readers may find this result surprising. The program starts 100 threads, and each thread adds 1 to the static variable n. The main thread then calls join on every thread, so n is printed only after all 100 threads have finished. One would expect n = 100, yet the result is less than 100.

The culprit behind this result is what is often called "dirty data". The yield() call in the run method is what triggers it here. (Dirty data can also appear without the yield call, but far less visibly; it often shows up only when 100 is changed to a much larger number. The example calls yield to amplify the effect.) The yield method hints that the calling thread is willing to give up the CPU for the moment, so that the CPU has a chance to run other threads.

To see how the program produces dirty data, assume that only two threads are created: thread1 and thread2. Because the start method of thread1 is called first, its run method generally runs first. The first line of run (int m = n;) assigns the value of n to m. On the second line, the yield call, thread1 temporarily stops running. While thread1 is paused, thread2 (which has been in the ready state until now) gets the CPU and starts running. When thread2 executes its first line (int m = n;), n is still 0 because thread1 paused before writing anything back, so m in thread2 is also 0. Now m is 0 in both threads. After yield returns, each thread adds 1 starting from 0, so no matter which one finishes first, the final value of n is 1, even though thread1 and thread2 each assigned to n once. In order: thread1 reads n (m = 0) and yields; thread2 reads n (m = 0) and yields; thread1 writes n = 1; thread2 writes n = 1. One of the two increments is lost.
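The two-thread scenario can be reproduced deterministically. The sketch below is not the article's code: it replaces yield with a java.util.concurrent.CountDownLatch so that neither thread writes until both have read n. The class name LostUpdateDemo and the method runOnce are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; forces the interleaving described in the text.
class LostUpdateDemo {
    static int n = 0;

    static int runOnce() throws InterruptedException {
        n = 0;
        CountDownLatch bothHaveRead = new CountDownLatch(2);
        Runnable task = () -> {
            int m = n;                    // both threads read 0
            bothHaveRead.countDown();
            try {
                bothHaveRead.await();     // wait until the other thread has read as well
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            m++;
            n = m;                        // both threads write 1
        };
        Thread thread1 = new Thread(task, "thread1");
        Thread thread2 = new Thread(task, "thread2");
        thread1.start();
        thread2.start();
        thread1.join();
        thread2.join();
        return n;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("n = " + runOnce());   // prints n = 1
    }
}
```

Two increments were performed, but n ends at 1 because each thread computed its new value from the same stale read.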

Someone may ask: if the run method contains only n++, will it still produce dirty data? The answer is yes. But n++ is a single statement, so how can the CPU be handed to another thread in the middle of it? It only looks like a single step. Once the Java compiler has compiled it into intermediate code (also called bytecode), n++ is no longer a single instruction. Let's see what the following Java code compiles to.

Java source code

public void run()
{
    n++;
}

Compiled intermediate language code

001 public void run()
002 {
003 aload_0
004 dup
005 getfield
006 iconst_1
007 iadd
008 putfield
009 return
010 }

The run method contains only the statement n++, but after compilation there are 7 bytecode instructions. We do not need to know what every one of them does; just look at lines 005, 007, and 008. Line 005 is getfield. As its name suggests, it gets a value, and since there is only one field, n, it clearly reads the value of n. The iadd on line 007 is easy to guess: it adds 1 to the value just read. You have probably guessed what putfield on line 008 does as well: it writes the incremented value back to the class variable n. (This listing is what the compiler emits when n is an instance field. For a static n, the instructions are getstatic and putstatic, but the read, add, write-back pattern is the same.) At this point you may have another question: executing n++ only needs to add 1 to n, so why does it take so many steps? The answer involves the Java memory model.
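The read, add, and write-back steps can also be written out by hand in Java. The following is a rough sketch, with the hypothetical class name IncrementRace, in which several threads perform the three steps many times without yield and without synchronization. The thread and iteration counts are arbitrary assumptions. The final value varies from run to run and is typically below the expected total, though on a given machine it may occasionally match it.

```java
// Hypothetical demo class; spells out n++ as three separate steps.
class IncrementRace {
    static int n = 0;

    static int run(int threadCount, int incrementsPerThread) throws InterruptedException {
        n = 0;
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int k = 0; k < incrementsPerThread; k++) {
                    int tmp = n;      // step 1: read the field (getfield / getstatic)
                    tmp = tmp + 1;    // step 2: add 1 (iadd)
                    n = tmp;          // step 3: write the result back (putfield / putstatic)
                }
            });
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        return n;
    }

    public static void main(String[] args) throws InterruptedException {
        int expected = 4 * 100_000;
        int actual = run(4, 100_000);
        // The actual value is not predictable; it can never exceed the expected total.
        System.out.println("expected = " + expected + ", actual = " + actual);
    }
}
```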

The Java memory model distinguishes between main memory and working memory. Main memory holds all Java object instances: when we create an object with new, the object and its fields are stored in this area, and so is n in the MyThread class. Main memory is shared by all threads. Working memory can be thought of as the thread stack mentioned earlier. It holds the variables defined in the run method and in the methods that run calls, that is, the method-local variables. When a thread wants to modify a variable in main memory, it does not modify it directly. Instead, it copies the variable into its own working memory, modifies the copy, and then writes the new value back over the variable in main memory.

With the Java memory model in mind, it is not difficult to understand why n++ is not an atomic operation: it must go through the steps of copying, adding 1, and overwriting. This is similar to the process simulated in the MyThread class. As you can imagine, if thread1 is interrupted for some reason right after executing getfield, the result will be similar to that of the MyThread class. To solve this problem completely, access to n must be synchronized in some way, so that only one thread can operate on n at a time. This is also called performing an atomic operation on n.
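As a hedged illustration of what "synchronizing n" could look like, the sketch below wraps the read, yield, and write-back steps in a synchronized block, and separately uses java.util.concurrent.atomic.AtomicInteger. The article itself does not say which mechanism it has in mind, so both choices are assumptions, as are the names SynchronizedCounter, LOCK, runSynchronized, and runAtomic.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; two possible ways to make the update atomic.
class SynchronizedCounter {
    static int n = 0;
    static final Object LOCK = new Object();

    // Same steps as the article's run method, but guarded by one shared lock.
    static int runSynchronized(int threadCount) throws InterruptedException {
        n = 0;
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                synchronized (LOCK) {     // only one thread at a time may enter
                    int m = n;
                    Thread.yield();       // yielding does not release the lock
                    m++;
                    n = m;
                }
            });
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        return n;
    }

    // Alternative: let AtomicInteger perform the read-add-write as one atomic step.
    static int runAtomic(int threadCount) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(counter::incrementAndGet);
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("synchronized: n = " + runSynchronized(100));   // prints synchronized: n = 100
        System.out.println("atomic: n = " + runAtomic(100));               // prints atomic: n = 100
    }
}
```

With either approach no increment is lost, because no thread can read n while another thread is between its read and its write.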

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
