Java thread synchronization mechanism

Source: Internet
Author: User

We can run many programs on a computer, and each running program may contain multiple independent threads.
A thread is an independently running unit of execution with its own dedicated run stack. A thread may share resources with other threads, such as memory, files, and databases.
Conflicts may occur when multiple threads read and write the same shared resource at the same time. This is where the thread "synchronization" mechanism comes in: the threads must take turns in order, not all rush in together.
"Synchronization" is translated from synchronize. I do not know why this easily misunderstood word was chosen, but since everyone uses it, we have to go along with it.
The real meaning of thread synchronization is the opposite of what the word suggests. Thread synchronization actually means "queuing": several threads must queue up and operate on the shared resource one by one, not simultaneously.

Therefore, the first thing to remember about thread synchronization is that thread synchronization is thread queuing. Synchronization means queuing. The purpose of thread synchronization is to prevent threads from executing "synchronously", that is, at the same time. It really is a tongue twister.
The second point to keep in mind is the word "shared". Only read/write access to shared resources needs to be synchronized. If a resource is not shared, no synchronization is necessary.
The third point to keep in mind is that only "variable" resources require synchronized access. If a shared resource is fixed and unchanging, it is effectively a "constant", and threads can read it at the same time without synchronizing. Only when at least one thread modifies the shared resource do the threads need to synchronize with each other.
The fourth point to keep in mind is that the code through which multiple threads access a shared resource may be the same code or different code. Whether or not they execute the same code, as long as the threads' code accesses the same variable shared resource, those threads need to be synchronized.
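To make these four points concrete, here is a minimal, self-contained sketch (the class and method names are mine, not from the original): two threads perform a read-modify-write on the same counter variable, and a shared lock object makes them queue for it.

```java
public class CounterRace {
    private static final Object lock = new Object(); // one shared lock for all threads
    private static int counter = 0;                  // the shared, variable resource

    // Runs two threads that each add 100,000 to the counter under the lock.
    static int run() {
        counter = 0;
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                synchronized (lock) { // all threads queue on the same lock
                    counter++;        // read-modify-write on the shared variable
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run()); // always 200000 with the lock in place
    }
}
```

If the synchronized block is removed, the two increments can interleave and updates get lost, so the final total is often less than 200,000.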

For better understanding, here are several examples.
Two purchasers have the same job, which follows these steps:
(1) Go to the market, look for potential samples, and purchase them.
(2) Return to the company and write a report.
The two have identical work content. Both need to purchase samples; they may buy samples of the same kind, but they will never purchase the very same sample, and there is no shared resource between them. Therefore, they can each do their own job without interfering with the other.
The two purchasers are equivalent to two threads, and following the same work steps is equivalent to executing the same code.

Now add a step to the purchasers' workflow: each purchaser must plan their work according to the information posted on the company's "bulletin board".
The two purchasers may walk up to the bulletin board at the same time and read the information together. This causes no problem at all, because the bulletin board is read-only: neither purchaser modifies the information written on it.

Now add a role: an office administrator who also walks up to the bulletin board, to modify the information on it.
If the administrator reaches the bulletin board first and is in the middle of modifying its content, the two purchasers must wait; they can view the information only after the administrator finishes the modification.
If the two purchasers are already reading the bulletin board when the administrator arrives, the administrator must wait for them to finish recording the current information before writing the new information.
In both cases, access to the bulletin board by the administrator and the purchasers must be synchronized, because one of the threads (the administrator) modifies the shared resource (the bulletin board). Notice also that the administrator's workflow and the purchasers' workflow (the executed code) are completely different, yet because they access the same variable shared resource (the bulletin board), they still need to synchronize with each other.

Synchronization lock

We have discussed why thread synchronization is needed; next let us look at how to synchronize threads.
The basic idea of thread synchronization is easy to understand: we add a lock to the shared resource, and the lock has only one key. The thread that obtains the key has the right to access the shared resource.
Daily life offers a similar example. Some supermarkets provide automatic storage boxes. Each storage box has a lock and a key. A person can use any storage box whose key is present: put things in, lock the box, and take the key away. The box is then locked, and nobody else can access it. (Of course, real storage box keys can be copied, so do not put valuables in supermarket storage boxes; many supermarkets have switched to electronic combination locks for this reason.)
The thread synchronization lock model looks intuitive, but one serious question remains: where should this synchronization lock be placed?
On the shared resource, of course; quick readers will answer immediately.
Indeed, when possible, we should add the synchronization lock to the shared resource itself. Some well-developed shared resources, such as file systems and database systems, provide a complete built-in locking mechanism; they lock themselves, and we do not need to lock them.
However, in most cases, the shared resources we access in code are relatively simple objects, and there is no place inside these objects to put a lock.
A reader might suggest: why not add a new area inside every object to serve as a lock? In theory this design is feasible. The problem is that thread synchronization is not very common; if we reserved lock space inside every object for this low-probability case, it would waste a great deal of space. The game would not be worth the candle.
Therefore, the design philosophy of modern programming languages is to attach the synchronization lock to the code segment, specifically to the "code segment that accesses the shared resource". Remember: the synchronization lock is attached to the code segment.
Attaching the lock to the code segment neatly solves the space-waste problem above, but it increases the complexity of the model and the difficulty of understanding it.
Now let us take a closer look at "attach the synchronization lock to the code segment".
First, we have settled where the lock goes: not on the shared resource, but on the code segments that access it.
Second, we must decide which lock each code segment should get. This is the crux, and a point people often miss: different code segments that access the same shared resource must use the same synchronization lock. If they use different synchronization locks, no synchronization happens at all, and the locks are meaningless.
This means that the synchronization lock itself must also be a shared object between multiple threads.

The synchronized keyword in Java

For better understanding, here are several synchronization code examples.
The synchronization lock model is the same across languages, although the syntax differs somewhat. Here we take Java, currently the most popular language, as an example. In Java, the synchronized keyword is used to lock a code segment. The overall syntax is as follows:

synchronized (lock) {
// code segment that accesses the shared resource and needs to be synchronized
}

The synchronization lock must be a shared object.

... f1() {

Object lock1 = new Object(); // creates a synchronization lock

synchronized (lock1) {
// code segment
// accesses the shared resource resource1
// needs to be synchronized
}
}

The above code is meaningless, because the synchronization lock is created inside the function body. Every thread that calls this code creates its own new lock, so the threads end up using different locks, and no synchronization is achieved at all.
The synchronization code must be written as follows to make sense.

public static final Object lock1 = new Object();

... f1() {

synchronized (lock1) { // lock1 is a shared (public) synchronization lock
// code segment
// accesses the shared resource resource1
// needs to be synchronized
}
}

You do not have to declare the synchronization lock as static or public, but you must ensure that all related pieces of synchronized code use the same synchronization lock.
At this point you may wonder what this synchronization lock actually is, and why any object can be declared as one.
In Java, the concept of a synchronization lock is this: any object reference can be used as a synchronization lock. We can think of an object reference as the object's address in memory. Therefore, to make synchronized code segments use the same lock, we must make their synchronized keywords receive the same object reference, that is, the same memory address. This is why I used the final keyword when declaring lock1 in the code above: it guarantees that the reference held by lock1 never changes while the program runs.
Some curious readers may want to dig into the actual mechanism behind synchronized. The Java Virtual Machine Specification (searchable with keywords such as "JVM spec") explains the synchronized keyword in detail: synchronized compiles down to monitorenter and monitorexit instruction pairs. The monitor is the actual synchronization lock, and each object reference conceptually corresponds to a monitor.
These implementation details are not the key to understanding the synchronization lock model. Let's continue to look at several examples to deepen our understanding of the synchronization lock model.

public static final Object lock1 = new Object();

... f1() {

synchronized (lock1) { // lock1 is a shared synchronization lock
// code segment A
// accesses the shared resource resource1
// needs to be synchronized
}
}

... f2() {

synchronized (lock1) { // lock1 is a shared synchronization lock
// code segment B
// accesses the shared resource resource1
// needs to be synchronized
}
}

In the code above, code segments A and B are synchronized with each other, because they use the same synchronization lock, lock1.
If 10 threads execute code segment A at the same time while 20 threads execute code segment B, all 30 threads must synchronize with one another.
All 30 threads compete for the single synchronization lock lock1. At any moment, only one thread can own lock1, and only that one thread can execute code segment A or code segment B. The threads that lose the competition must pause and enter the lock's ready queue.
Each synchronization lock has several thread queues attached to it, including a ready queue and a waiting queue. For example, the ready queue of lock1 could be called the lock1 ready queue. Each queue may hold several paused threads.
Note that a thread that fails to win the synchronization lock enters the ready queue, not the waiting queue. Threads in the ready queue are ready to compete for the lock and ready to run at any time; threads in the waiting queue can only wait until a notification arrives, after which they are moved to the ready queue and can run.
When the thread that acquired the lock finishes executing its synchronized code segment, it releases the lock. The remaining threads in the lock's ready queue then compete for the lock again: the winner continues running, and the losers stay in the ready queue.
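The 30-thread scenario above can be checked directly. In this sketch (class, method, and counter names are mine), f1() stands for code segment A and f2() for code segment B; a counter tracks how many threads are inside a critical section at once, and the same lock guarantees it never exceeds one.

```java
public class SameLockDemo {
    private static final Object lock1 = new Object();
    private static int inside = 0;                         // threads currently inside a critical section
    private static volatile boolean overlapDetected = false;

    static void f1() { criticalSection(); }                // "code segment A"
    static void f2() { criticalSection(); }                // "code segment B"

    private static void criticalSection() {
        synchronized (lock1) {
            inside++;
            if (inside > 1) overlapDetected = true;        // never happens: lock1 serializes entry
            inside--;
        }
    }

    // Starts 10 threads calling f1 and 20 calling f2; all 30 compete for lock1.
    static boolean run() {
        Thread[] threads = new Thread[30];
        for (int i = 0; i < threads.length; i++) {
            final boolean useA = i < 10;
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    if (useA) f1(); else f2();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return overlapDetected;
    }

    public static void main(String[] args) {
        System.out.println(run()); // false: segments A and B never overlap
    }
}
```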
Thread synchronization is therefore a resource-consuming operation. We should try to limit the extent of synchronized code segments: the smaller the synchronized code segment, the better. We use the term "synchronization granularity" to describe the extent of a synchronized code segment.
Synchronization Granularity
In Java, we can also apply the synchronized keyword directly to a function definition.
For example:
... synchronized ... f1() {
// f1 code segment
}

This code is equivalent to:
... f1() {
synchronized (this) { // the synchronization lock is the object itself
// f1 code segment
}
}

The same principle applies to static functions.
For example:
... static synchronized ... f1() {
// f1 code segment
}

This code is equivalent to:
... static ... f1() {
synchronized (Class.forName(...)) { // the synchronization lock is the class definition itself
// f1 code segment
}
}
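These two equivalences can be verified with Thread.holdsLock(...), which reports whether the current thread holds a given object's monitor. A small sketch (the class name is illustrative):

```java
public class LockIdentityDemo {
    // synchronized instance method: the lock is `this`,
    // the same as wrapping the body in synchronized (this) { ... }
    synchronized boolean holdsInstanceLock() {
        return Thread.holdsLock(this);
    }

    // static synchronized method: the lock is the Class object itself,
    // the same as synchronized (LockIdentityDemo.class) { ... }
    static synchronized boolean holdsClassLock() {
        return Thread.holdsLock(LockIdentityDemo.class);
    }

    public static void main(String[] args) {
        System.out.println(new LockIdentityDemo().holdsInstanceLock()); // true
        System.out.println(holdsClassLock());                           // true
    }
}
```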

However, we should try to avoid the lazy shortcut of adding synchronized directly to the function definition, because we need to control synchronization granularity. The smaller the synchronized code segment, the better; the smaller the scope that synchronized controls, the better.
We should not only shorten synchronized code segments but also learn to split synchronization locks.
For example, consider the following code:

public static final Object lock1 = new Object();

... f1() {

synchronized (lock1) { // lock1 is a shared synchronization lock
// code segment A
// accesses the shared resource resource1
// needs to be synchronized
}
}

... f2() {

synchronized (lock1) { // lock1 is a shared synchronization lock
// code segment B
// accesses the shared resource resource1
// needs to be synchronized
}
}

... f3() {

synchronized (lock1) { // lock1 is a shared synchronization lock
// code segment C
// accesses the shared resource resource2
// needs to be synchronized
}
}

... f4() {

synchronized (lock1) { // lock1 is a shared synchronization lock
// code segment D
// accesses the shared resource resource2
// needs to be synchronized
}
}

The four synchronized code segments above all use the same synchronization lock, lock1, so every thread calling any of the four segments has to compete for that one lock.
On careful analysis, we find this is unnecessary.
Code segment A in f1() and code segment B in f2() access the shared resource resource1, while code segment C in f3() and code segment D in f4() access the shared resource resource2. There is no need for all of them to compete for the same lock lock1. We can add another synchronization lock, lock2, and modify the code of f3() and f4() as follows:
public static final Object lock2 = new Object();

... f3() {

synchronized (lock2) { // lock2 is a shared synchronization lock
// code segment C
// accesses the shared resource resource2
// needs to be synchronized
}
}

... f4() {

synchronized (lock2) { // lock2 is a shared synchronization lock
// code segment D
// accesses the shared resource resource2
// needs to be synchronized
}
}

Now f1() and f2() compete for lock1, while f3() and f4() compete for lock2. Splitting the two locks in this way greatly reduces the probability of lock contention and thus reduces system overhead.

Semaphores

The synchronization lock model is only the simplest synchronization model: at any moment, only one thread can run the synchronized code.
Sometimes we want to handle more complex synchronization patterns, such as the producer/consumer model or the reader/writer model. In such cases, the synchronization lock model is not enough, and we need a new model: the semaphore model.
The semaphore model works like this: while running, a thread can stop and wait for a notification on some semaphore, entering that semaphore's waiting queue; it resumes running only after the notification is sent.
In many languages, a synchronization lock is represented by a special kind of object, usually called a monitor.
Similarly, in many languages, semaphores have special object names, such as mutex and semaphore.
The semaphore model is considerably more complex than the synchronization lock model. In some systems, semaphores can even synchronize across processes, and some counting semaphores can control the number of concurrently running threads.
We do not need to consider models that complex; all the complex models derive from the basic one. As long as you master the most basic semaphore model, the "wait/notify" model, the complex models become easy to work out.
Let us take Java as the example again. The concepts of synchronization lock and semaphore in Java are rather blurred: there is no special object term for either, only two lock-related keywords, volatile and synchronized.
Although this vagueness makes the concepts less crisp, it also avoids the various misunderstandings brought by terms such as monitor, mutex, and semaphore. We need not get bogged down in terminology battles and can focus on understanding the actual principles.
In Java, any object reference can be used as a synchronization lock, and in exactly the same way, any object reference can also be used as a semaphore.
Calling an object's wait() method waits for a notification, and calling its notify() method sends one.
The specific usage is as follows.
(1) Waiting for a semaphore's notification:
public static final Object signal = new Object();

... f1() {
synchronized (signal) { // first acquire this semaphore, which is also a synchronization lock

// this code is reached only after the signal synchronization lock has been acquired
signal.wait(); // give up the semaphore; this thread enters the signal semaphore's waiting queue

// a pity: the hard-won semaphore is given up just like that!

// after the notification arrives, the thread moves from the waiting queue to the ready queue
// in the ready queue it is one step closer to the CPU and gets the chance to continue with the code below
// but it must still win the signal synchronization lock again before it can actually continue; such is fate
...
}
}

Note the meaning of signal.wait() in the code above. signal.wait() is easy to misread: it does not mean that signal starts waiting; it means that the current thread running this code starts waiting, by entering the waiting queue of the signal object.

(2) Sending a semaphore notification:
... f2() {
synchronized (signal) { // again, first acquire this semaphore, which is also a synchronization lock

// this code is reached only after the signal synchronization lock has been acquired
signal.notify(); // notify one thread in signal's waiting queue

// if some thread was waiting for this notification, that thread moves to the ready queue
// but this thread still holds the signal synchronization lock and keeps running
// although it kindly notified another thread,
// it is not so generous as to give up the lock right away;
// it continues executing the code below
...
}
}

Note the meaning of signal.notify(): it does not notify the signal object itself; it notifies the other threads waiting on the signal semaphore.

The above is the basic usage of an object's wait() and notify() methods.
In fact, wait() can also take a wait time: a thread sitting in a semaphore's waiting queue that has waited long enough gets tired of waiting and moves itself from the waiting queue to the ready queue.
There is also a notifyAll() method, which notifies all threads in the waiting queue.
These details do not affect the overall picture.
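The two fragments above can be assembled into one runnable program. In this sketch (class and variable names are mine), one thread waits on the shared signal object while another sets a flag and notifies; the guard loop around wait() is the standard defense against spurious wakeups and a notify() that fires before the waiter starts waiting.

```java
public class WaitNotifyDemo {
    private static final Object signal = new Object(); // serves as both semaphore and synchronization lock
    private static boolean ready = false;              // the condition the waiter checks
    private static String received = null;

    static String run() {
        Thread waiter = new Thread(() -> {
            synchronized (signal) {
                while (!ready) {           // re-check the condition after every wakeup
                    try {
                        signal.wait();     // releases the lock and enters signal's waiting queue
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                received = "notified";     // runs only after notify() and re-acquiring the lock
            }
        });
        waiter.start();
        synchronized (signal) {
            ready = true;
            signal.notify();               // moves the waiter from the waiting queue to the ready queue
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run()); // notified
    }
}
```

Note that the guard loop makes the program correct in every interleaving: if the notifier runs first, the waiter sees ready == true and never calls wait() at all.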

Green thread

A green thread is a concept defined in contrast to an operating system thread (native thread).
An operating system thread (native thread) means the threads in a program are mapped onto threads of the operating system, which controls their running and scheduling.
A green thread means the threads in a program are not actually mapped onto operating system threads but are scheduled by the language runtime itself.
The threads in the current version of Python are mapped onto operating system threads; the threads in the current version of Ruby are green threads and are not mapped onto operating system threads, which is one reason Ruby's threads run slowly.
Does that mean green threads are inherently slower than operating system threads? Of course not; in fact, the opposite may be true. Ruby is a special case, with a thread scheduler that is not very mature.
Green threads are currently a popular direction for thread implementations. For example, Stackless Python introduces an even more lightweight green thread concept and beats standard Python at concurrent thread programming, in both running speed and concurrent load.
Another, more famous example is Erlang, an open-source language developed by Ericsson.
Erlang carries the green thread concept to its logical conclusion. Erlang threads are not called threads but processes, which is easy to confuse with operating system processes; note the difference.
No synchronization is needed between Erlang processes, because all variables in Erlang are final: once assigned, a variable's value cannot change, so there is simply nothing to synchronize.
Another benefit of final variables is that objects cannot reference each other in cycles; associations between objects are one-way and tree-shaped, which makes memory garbage collection very efficient. This lets Erlang achieve soft real-time performance, no small feat for a garbage-collected language.
