While studying the JVM recently, I found that understanding the virtual machine's internals gave me a new understanding of Java multithreading: the small synchronized keyword turns out to hide a whole world underneath. So I decided to write up my Java multithreading notes as one article, starting from the most basic question of why we use multithreading at all, and going all the way down to the JVM's underlying lock implementations.
The purpose of multithreading
Why use multithreading? The reasons fall roughly into two aspects:
- On a multi-core CPU, the benefit is obvious: if a program runs only one thread, all the other cores sit idle and are wasted;
- Even ignoring multiple cores, multithreading is still meaningful on a single core: some operations, such as blocking IO, do not need the CPU at all, so while one thread waits for IO the CPU can run another thread, and switch back to the first thread once the IO completes.
Problems with multithreading
In fact, multithreading has essentially only one problem: the sharing of variables between threads.
Variables in Java can be divided into 3 categories (a minimal sketch follows the list):
- Class variables (static variables declared in a class)
- Instance variables (ordinary, non-static fields of a class)
- Local variables (variables declared inside a method)
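To make the three categories concrete, here is a minimal sketch (the class and field names are made up for illustration):

```java
public class Counter {
    static int classVariable = 0; // class variable: one copy shared by all instances
    int instanceVariable = 0;     // instance variable: one copy per object

    void increment() {
        int localVariable = 1;    // local variable: one copy per invocation, on the stack
        instanceVariable += localVariable;
    }
}
```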
(Figure: the JVM's memory-area partition diagram.)
From the definition of each region, we know:
- Class variables are stored in the "method area"
- Instance variables are stored in the "heap"
- Local variables are stored in the "virtual machine stack"
Both the "method area" and "heap" belong to the thread-shared data area, and the "virtual machine stack" belongs to the thread private data area.
As a result, local variables can never be shared between threads, while class variables and instance variables can. In fact, in Java, class variables and instance variables are the only channel through which threads can communicate with each other.
In other words, if a multithreaded program uses no class variables and no instance variables, it is necessarily thread-safe.
Take servlet development as an example. Normally we write a class that extends HttpServlet and override doPost() and doGet() to handle requests. Whatever code we put in those two methods, as long as it touches no class variables and no instance variables, it is thread-safe. If you add an instance variable to the servlet class, you are likely to introduce a thread-safety problem. The usual workaround is to replace the instance variable with a ThreadLocal variable: ThreadLocal turns the instance variable into thread-private state, giving each thread its own copy of the value.
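Here is a minimal sketch of the ThreadLocal workaround (the class and method names are made up, and the servlet plumbing is omitted so the example stays self-contained):

```java
public class RequestState {
    // Instead of a plain instance variable shared by all request threads,
    // each thread gets its own StringBuilder via ThreadLocal.
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    public static void append(String s) {
        BUFFER.get().append(s); // touches only the current thread's copy
    }

    public static String dump() {
        return BUFFER.get().toString();
    }
}
```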
Now we know that multithreading fundamentally has only one problem: the sharing of variables between threads. "Variables" here means class variables and instance variables, and everything that follows is about making the sharing of those two kinds of variables safe.
How to share variables safely
The only remaining problem is how multiple threads can share variables safely (from here on, "variables" means class variables and instance variables). One approach was already mentioned above: ThreadLocal, which does not really share anything; it gives each thread its own copy of the value.
For example, consider a very simple requirement: there is a class variable a = 0, and we start 5 threads, each of which executes a++. With ThreadLocal, each of the 5 threads ends up with its own a, and each one's final value is 1, which is obviously not what we expect.
So what if we drop ThreadLocal, declare an ordinary class variable a = 0, and let the 5 threads each execute a++? The result is still wrong, and worse, it is indeterminate: it may be any of 1, 2, 3, 4, 5. This situation is called a race condition (Race Condition). To understand race conditions, we first need to understand the Java memory model:
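Before turning to the memory model, here is a runnable sketch of that race (the class name and structure are mine, not from the original):

```java
public class RaceDemo {
    static int a = 0; // class variable shared by all 5 threads

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[5];
        for (int i = 0; i < 5; i++) {
            threads[i] = new Thread(() -> a++); // read, add, write: 3 separate steps
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // wait for every thread before reading the result
        }
        System.out.println(a); // usually 5, but may be any of 1..5
    }
}
```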
To understand the Java memory model, it helps to look at how computer hardware accesses memory. A CPU is several orders of magnitude faster than memory IO, so modern computers insert a layer of cache, running as close to processor speed as possible, as a buffer: data needed by an operation is first copied from memory into the cache, and the result is synchronized back to memory when the operation finishes.
Because the JVM must run across hardware platforms, it defines its own memory model, but since that model is ultimately mapped onto hardware, it looks almost identical to the hardware model:
Each Java thread has its own working memory. A thread cannot operate directly on variables in main memory; it must copy them from main memory into its working memory, operate on the copies there, and then synchronize the results back to main memory.
We can now explain why 5 threads executing a++ do not necessarily end with 5: a++ decomposes into 3 steps:
- Copy a from main memory into the thread's working memory
- Perform a = a + 1 on the copy in the working memory
- Synchronize the thread's working copy of a back to main memory
When 5 threads execute concurrently, it is entirely possible that all 5 perform the first step before any of them continues. Each thread's working copy of a then starts at 0, each computes a = a + 1 to get 1 in its working memory, and each synchronizes 1 back to main memory, so the final value is 1.
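Written out by hand, the decomposition looks like the sketch below (illustrative only; the real interleaving happens at the bytecode and hardware level):

```java
public class WorkingMemoryModel {
    static int a = 0;

    static void unsafeIncrement() {
        // a++ expanded into the three steps above; another thread may run
        // between any two of these lines, which is exactly the race described.
        int local = a;      // 1. copy a from main memory into working memory
        local = local + 1;  // 2. operate on the copy in working memory
        a = local;          // 3. synchronize the copy back to main memory
    }
}
```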
The way to avoid this is to ensure that, while multiple threads access a concurrently, a is used by only one thread at a time.
That is exactly what synchronization (synchronized) means: when multiple threads access shared data concurrently, the shared data is guaranteed to be used by only one thread at a time.
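As a minimal sketch of the fix for the racing counter (the LOCK field is my own naming, not something synchronized requires):

```java
public class SafeCounter {
    static int a = 0;
    static final Object LOCK = new Object();

    static void increment() {
        synchronized (LOCK) {
            a++; // the read-modify-write now runs as one indivisible unit
        }
    }
}
```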
The basic idea of synchronization
To ensure that shared data is used by only one thread at a time, there is a very simple implementation idea: keep a lock inside the shared data. When no thread is accessing the data, the lock is empty; when the first thread accesses it, the thread's identity is recorded in the lock and that thread is allowed to access the shared data. While the lock is held, any other thread that wants the shared data must wait for the lock to be released.
This idea has three key points:
- Keep a lock inside the shared data
- Record the owning thread's identity in the lock
- Other threads that access locked shared data must wait for the lock to be released
How the JVM implements synchronization
It is fair to say that all three kinds of locks in the JVM are based on the idea above; they differ only in how "heavyweight" the implementation is. The JVM has the following three kinds of locks (increasingly heavyweight from top to bottom):
- Biased lock
- Lightweight lock
- Heavyweight lock
The heavyweight lock is the original locking mechanism; biased locks and lightweight locks were added in JDK 1.6 and can each be switched on or off. With both enabled, when Java code uses the synchronized keyword the JVM first attempts a biased lock; if the biased lock cannot be used, it converts to a lightweight lock; and if the lightweight lock cannot be used either, it converts to a heavyweight lock. The exact conversion process is described below.
Understanding the 3 kinds of locks in depth requires knowing the object memory layout (the Mark Word in the object header) and the internal storage format of bytecode. But setting those implementation details aside, I think the principles of the three locks are very simple; you only need two general concepts:
Mark Word: every object in Java has a uniform layout when stored. Each object contains an object header, and the part of it called the Mark Word holds the object's lock information.
Lock record: each thread has its own virtual machine stack as it executes, and every method call corresponds to a stack frame on that stack. A lock record lives in the stack frame and holds the thread's lock information.
Initially the JVM had only the heavyweight lock; the other two were introduced in JDK 1.6.
We listed three key points in the basic idea of synchronization above, and said that the JVM's three kinds of locks are all based on that idea. The three locks are essentially the same in how they implement points 1 and 2:
- Keep a lock inside the shared data // Synchronization in Java is expressed with the synchronized keyword, which has three usages: a synchronized block, which names the lock object explicitly; a synchronized static method, which is equivalent to locking the class object; and a synchronized instance method, which is equivalent to locking the instance the method belongs to (see the sketch after this list). So what synchronized locks in Java is always an object, which means the lock must be kept inside the object, and the Mark Word in the object's memory layout can be regarded as this lock. The three kinds of locks differ in implementation details, but all of them use the Mark Word to store the lock.
- Record the owning thread's identity in the lock // A biased lock stores the owning thread's ID in the Mark Word; a lightweight lock stores, in the Mark Word, a pointer to a lock record in the owning thread's stack; a heavyweight lock stores, in the Mark Word, a pointer to a mutex (a mutex grants exclusive access to a shared resource to a single thread, which can be regarded as recording the thread's identity).
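The three usages look like this in code (a compile-ready sketch; the names are made up):

```java
public class SyncForms {
    private int n;

    // 1. Synchronized block: the lock object is named explicitly.
    void blockForm(Object lock) {
        synchronized (lock) { n++; }
    }

    // 2. Synchronized static method: equivalent to locking SyncForms.class.
    static synchronized void staticForm() { }

    // 3. Synchronized instance method: equivalent to locking this.
    synchronized void instanceForm() { n++; }
}
```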
What distinguishes the three kinds of locks is point 3 of the basic idea:
3. Other threads that access locked shared data must wait for the lock to be released
"Waiting for the lock to be released" is deliberately abstract; nothing dictates how the waiting happens. With a mutex, waiting means the thread blocks. A mutex guarantees concurrency safety in all cases, but it is expensive to use, and in real project code it is quite likely that a piece of locked code never actually runs concurrently, so every use of the mutex there burns performance for nothing. Can we first assume that the locked code will not run concurrently, and bring in the mutex only once concurrency is actually detected? Yes: both lightweight locks and biased locks are built on this assumption.
Lightweight lock
The core idea of the lightweight lock is: "assume the locked code will not run concurrently; if concurrency does occur, inflate to a heavyweight lock" (inflation means the lock only ever gets heavier; once upgraded, it is never downgraded).
Lightweight locks rely on an operation called CAS (compare-and-swap), implemented through instructions provided by the underlying hardware:
A CAS operation takes 3 operands: a memory location V, the expected old value A, and the new value B. When the CAS instruction executes, the processor updates V to the new value B if and only if the current value of V equals the expected value A; otherwise no update is performed. The whole procedure is a single atomic operation.
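The JDK exposes the same hardware CAS through classes such as AtomicInteger; a small demo:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(0);  // memory location V, value 0
        boolean first  = v.compareAndSet(0, 1);  // A=0 matches, so V becomes B=1
        boolean second = v.compareAndSet(0, 2);  // A=0 no longer matches, no update
        System.out.println(first + " " + second + " " + v.get()); // true false 1
    }
}
```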
Lightweight lock: locking
Suppose lightweight locks are enabled. When the first thread locks an object, it first creates a lock record space in its stack frame to store a copy of the object's current Mark Word. The virtual machine then uses a CAS operation to try to update the object's Mark Word to a pointer to that lock record. If the CAS succeeds, the thread has acquired the object's lock. If it fails, then between this thread copying the Mark Word and performing the CAS, another thread acquired the object's lock, and our initial assumption that "the locked code will not run concurrently" has failed. At this point the lightweight lock is not inflated to a heavyweight lock immediately; instead the thread spins, repeatedly retrying the CAS in the hope that the holding thread will soon release the lock. If after a certain number of spins the lock still has not been acquired, the lightweight lock is inflated to a heavyweight lock: the thread that acquired the lightweight lock still holds the lock, just as a heavyweight lock now, and the threads trying to acquire it go into a blocked wait.
Lightweight lock: unlocking
Unlocking a lightweight lock also uses CAS. If the object's Mark Word still points to the holding thread's lock record, the CAS succeeds, the original Mark Word copy is copied back from the lock record, and unlocking is complete. If the object's Mark Word no longer points to the holding thread's lock record, the lock was inflated to a heavyweight lock by other threads while this thread held it; in that case the thread releases the lock and also wakes (notifies) the waiting threads.
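As a deliberately simplified model of this protocol (my own sketch built on AtomicReference, not HotSpot's actual code): the Mark Word is modeled as a reference that is null when unlocked and points to the owner's lock record when locked.

```java
import java.util.concurrent.atomic.AtomicReference;

public class LightweightLockSketch {
    static final class LockRecord { /* would hold the displaced Mark Word copy */ }

    private final AtomicReference<LockRecord> markWord = new AtomicReference<>();

    /** Try to lock by CAS-ing the mark word to point at our lock record. */
    boolean tryLock(LockRecord myRecord, int maxSpins) {
        for (int i = 0; i < maxSpins; i++) {
            if (markWord.compareAndSet(null, myRecord)) {
                return true;     // CAS succeeded: we own the lock
            }
            Thread.onSpinWait(); // spin and retry the CAS
        }
        return false;            // caller would now inflate to a heavyweight lock
    }

    /** Unlock by CAS-ing the mark word back; fails if the lock was inflated. */
    boolean unlock(LockRecord myRecord) {
        return markWord.compareAndSet(myRecord, null);
    }
}
```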
Biased lock
From the lightweight lock's implementation we can see that although it does not support "concurrency" (concurrency inflates it to a heavyweight lock), a lightweight lock does let multiple threads acquire the same lock object serially. For example, thread A can take object o's lightweight lock first and then release it; thread B can then take o's lightweight lock and succeed, and the serial handoff can continue. This works because every acquisition is paired with a release. Now suppose a synchronized Java method is, at run time, only ever called by one thread from beginning to end: every call still releases the lock when it finishes, and the next call has to re-acquire it.
So we can make one more assumption: "assume the locked code is only ever called by one thread from beginning to end; if more than one thread is ever found calling it, it is not too late to inflate to a lightweight lock then." This assumption is the core idea of the biased lock.
Core implementation
The core implementation of the biased lock is simple. Suppose biased locks are enabled. When the first thread tries to acquire an object's lock, it also creates a lock record in its stack frame, but the lock record space does not need to be initialized (it will be used later). The thread then uses CAS directly to write its own thread ID into the object's Mark Word; if the CAS succeeds, it has acquired the biased lock. Once a thread has acquired a biased lock and executed the locked code block, it keeps holding the lock without ever actively releasing it. So each subsequent time the thread enters this lock's code block, it needs no additional synchronization at all.
When some other thread tries to acquire the lock, a revoke operation is required, which splits into cases:
- Check whether the thread holding the biased lock is still alive; if it is not, reset the biased lock to the unlocked state.
- If the thread holding the biased lock is still alive but does not actually hold the lock at the moment, also reset the biased lock to the unlocked state.
- If the thread holding the biased lock is still alive and actually holds the lock (it is inside the synchronized block), then the thread trying to acquire the lock waits for a global safepoint (Global Safepoint). At the safepoint, the thread trying to acquire the lock walks the stack of the thread that holds it, traverses all the lock records in its stack frames that are associated with the current lock object, rewrites their contents to what a lightweight lock's lock records would contain, and finally writes a pointer to the "oldest" lock record into the object's Mark Word. From then on, everything looks as if the biased lock had never existed and a lightweight lock had been in use all along.
Point 3 above is translated from the official documentation; none of the books and blog posts I read explain this part clearly.
Here is my own understanding:
A thread that already holds a biased lock performs no additional synchronization when it re-enters the lock's code block, but it still pushes an empty lock record onto its stack. So for a thread that has re-entered the same object's lock several times, the stack contains multiple lock records associated with the same object.
At run time the JVM also keeps a count of how many times the lock has been taken: each re-entry adds 1, each exit first subtracts 1, and only when the count drops to 0 is the real unlock performed. See the explanation of the monitorexit bytecode:
Note that a single thread can lock an object several times. The runtime system maintains a count of the times that the object is locked by the current thread, and only unlocks the object when the counter reaches zero.
When the count drops to 0, the corresponding lock record is necessarily the first one pushed, that is, the "oldest" one. That is why the revoke operation writes the pointer to the "oldest" lock record into the object's Mark Word: it ensures that the CAS performed later by the lightweight-lock unlock can succeed.
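A small demo of the reentrancy being counted (my own example): the same thread enters the same monitor twice, and the per-thread count goes 0 → 1 → 2 → 1 → 0.

```java
public class ReentrancyDemo {
    static synchronized void outer() {
        inner(); // re-enters the monitor of ReentrancyDemo.class: count 1 -> 2
    }

    static synchronized void inner() {
        System.out.println("same thread, lock held twice");
    }

    public static void main(String[] args) {
        outer(); // no deadlock: the owning thread may re-acquire its own lock
    }
}
```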
Biased lock optimizations
From the core implementation above we can see that as long as only one thread ever takes an object's lock, the biased lock really is very fast. But as soon as a second thread accesses it, the lock may have to inflate to a lightweight lock, and that inflation is very expensive.
This suggests an idea: before biasing an object's lock, it would be good to know in advance whether the object will be accessed by a single thread or by multiple threads. But how can we predict that for an object that has never been accessed? We know every object has a class, so we can use other objects of the same class (data type) to predict how this object is going to be accessed.
The JVM therefore manipulates the biased locks of all objects of a data type in bulk, at the granularity of the type:
- When biased-lock revocations for objects of a data type reach a certain threshold, a bulk rebias is triggered: all objects of that type have their bias reset to the initial state, so the next thread to access each object can re-bias it toward itself. If an object is currently holding its lock (inside a synchronized block), performing the revoke on it inflates it to a lightweight lock.
- When the number of bulk rebias operations for a data type reaches a further threshold, a bulk revocation is triggered: the biased locks of all objects of that type are inflated to lightweight locks, and biased locking is disabled by default for instances of that type created from then on.
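For reference, HotSpot exposed these mechanisms through tunable flags (available until biased locking was deprecated in JDK 15; the threshold values below are illustrative, not guaranteed defaults for your JDK, and MyApp is a placeholder):

```
java -XX:+UseBiasedLocking \
     -XX:BiasedLockingStartupDelay=0 \
     -XX:BiasedLockingBulkRebiasThreshold=20 \
     -XX:BiasedLockingBulkRevokeThreshold=40 \
     MyApp
```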
Summary
Setting the implementation details aside, Java multithreading is actually simple:
The main problem in Java multithreading is thread safety →
Thread-safety problems are caused by communication between threads; threads that do not communicate have no thread-safety problems →
Java threads can only communicate through class variables and instance variables, so solving thread safety means securing access to those variables →
Safe access to variables in Java is achieved through synchronization, and synchronization is implemented with locks →
The JVM has three kinds of locks that guarantee a variable is used by only one thread at a time: the biased lock is the fastest but only suits a single thread holding the lock from beginning to end; the lightweight lock is slower but suits threads acquiring the lock serially; the heavyweight lock is the slowest but supports truly concurrent acquisition. The JVM always starts with the fastest, the biased lock, assuming the optimistic case each time, and inflates to the next heavier lock whenever that assumption fails.