In "Talking about high concurrency (33): Understanding the Java memory model from a consistency perspective", we said that the hardware layer provides capabilities that meet certain consistency requirements, and that the Java memory model builds on those capabilities to define a set of syntax and rules. This lets Java developers ignore the underlying implementation and focus on concurrency logic. Now let's look at how the hardware layer provides these capabilities.
The hardware layer provides a family of memory barriers (also called memory fences, which is Intel's term) to deliver these consistency guarantees. On the x86 platform, the main ones are:
1. lfence, a load barrier (read barrier)
2. sfence, a store barrier (write barrier)
3. mfence, a full barrier that combines the capabilities of lfence and sfence
4. The lock prefix. lock is not a memory barrier itself, but it can perform a barrier-like function. It locks the CPU bus and cache, and can be understood as a lock at the CPU instruction level. It can prefix instructions such as ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG.
A memory barrier has two capabilities:
1. It prevents instructions on either side of the barrier from being reordered across it
2. It forces dirty data in the write buffer/cache to be written back to main memory, and invalidates the corresponding data in the cache
For a load barrier, inserting the read barrier before a read instruction invalidates the data in the cache, so the data is reloaded from main memory.
For a store barrier, inserting the write barrier after a write instruction causes the most recent data written to the cache to be written back to main memory.
The lock prefix provides a similar capability. It first locks the bus and cache, then executes the instruction that follows, and finally, once the lock is released, writes the dirty data in the cache back to main memory. While lock holds the bus, read and write requests from other CPUs are blocked until the lock is released.
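Java code cannot emit lfence, sfence, or mfence directly, but from Java 9 onward the static fence methods on java.lang.invoke.VarHandle express the same ordering constraints at the language level. The following is only a minimal sketch: the class FenceSketch and its fields are hypothetical names for illustration, and which machine instructions (if any) the JIT emits for each fence depends on the JVM and platform. On x86 the acquire and release fences typically need no instruction at all.

```java
import java.lang.invoke.VarHandle;

// Hypothetical demo class; names are illustrative, not from the article.
class FenceSketch {
    static int data;        // plain (non-volatile) field
    static boolean ready;   // plain (non-volatile) flag

    // Writer side: store the data first, then the flag.
    static void publish(int value) {
        data = value;
        // Earlier loads and stores may not be reordered with stores after this point.
        VarHandle.releaseFence();
        ready = true;
        // Full barrier: no load or store may be reordered across this point.
        // Comparable in role to mfence or a lock-prefixed instruction.
        VarHandle.fullFence();
    }

    // Reader side: returns the data if the flag was seen, otherwise -1.
    static int tryRead() {
        boolean seen = ready;
        // Loads before this point may not be reordered with later loads and stores.
        VarHandle.acquireFence();
        return seen ? data : -1;
    }
}
```

In everyday code a volatile field is the simpler way to get the same effect. Explicit fences are shown here only to make the barrier positions visible.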
The concept of a memory barrier is easy to understand. Different hardware implements memory barriers differently, and the Java memory model hides these platform differences: the JVM generates the appropriate machine code for each platform.
Some sources say that Java uses a barrier such as mfence for volatile, but my own tests show that on the x86 platform volatile is implemented with the lock prefix. The tests were run on JDK 6 and JDK 7.
Here is the assembly code generated for volatile.
A volatile write generates the assembly instruction lock addl $0x0,(%rsp). The lock prefix locks the bus and the corresponding address, so reads and writes from other CPUs must wait until the lock is released. When the write completes, the lock is released and the cache is flushed to main memory.
A volatile read is easy to understand: if the CPU finds that the cache line for the address is locked, it waits for the lock to be released, and the cache coherence protocol ensures that it reads the latest value.
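The visibility guarantee described above can be observed from Java with a small two-thread example. This is a sketch only: VolatileFlag, payload, and runOnce are hypothetical names. To inspect the generated assembly yourself, HotSpot accepts -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly, which requires the hsdis disassembler plugin to be installed. The exact instructions you see will depend on the JVM version and CPU.

```java
// Hypothetical demo class; names are illustrative, not from the article.
class VolatileFlag {
    static int payload;                 // plain field, published through the flag
    static volatile boolean ready;      // volatile flag

    // Starts a reader thread, publishes a value, and returns what the reader saw.
    static int runOnce() {
        payload = 0;
        ready = false;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            // Volatile read: spin until the writer's store becomes visible.
            while (!ready) {
                // busy-wait
            }
            // The write to payload happened before the volatile write to ready,
            // so it is guaranteed to be visible here.
            seen[0] = payload;
        });
        reader.start();

        payload = 42;   // ordinary write
        ready = true;   // volatile write: this is where the barrier is needed

        try {
            reader.join();  // join makes the reader's write to seen[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If ready were not volatile, the reader loop would be allowed to spin forever, because nothing would force the reader to observe the writer's update.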
Now take a look at the implementation of synchronized. A synchronized block compiles to the JVM bytecode instructions monitorenter and monitorexit, and the assembly instructions that are finally generated are
lock cmpxchg %r15,0x16(%r10) and lock cmpxchg %r10,(%r11)
cmpxchg is the assembly instruction for CAS (compare-and-swap). The sequence first locks the bus and cache with the lock prefix, then uses the cmpxchg CAS operation to set the synchronized flag bits in the object header. When the CAS completes, the lock is released and the cache is flushed to main memory.
So the underlying operation of synchronized is this: on acquisition, lock cmpxchg sets the lock flag bits in the object header to the "locked" state. On release, lock cmpxchg sets them back to the "released" state, and the write is immediately written back to main memory.
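The idea of flipping a lock flag with CAS can be imitated in plain Java using AtomicInteger.compareAndSet, which on x86 is typically compiled to a lock cmpxchg instruction. The sketch below is not how HotSpot implements monitors (real monitors use the object header, are reentrant, and park waiting threads). CasLockSketch and all of its members are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified, non-reentrant spin lock built on CAS.
class CasLockSketch {
    private static final int UNLOCKED = 0;
    private static final int LOCKED = 1;

    // Stands in for the lock flag bits in the object header.
    private final AtomicInteger flag = new AtomicInteger(UNLOCKED);
    private int counter;   // state protected by the lock

    // Acquire: CAS the flag from UNLOCKED to LOCKED, retrying on failure.
    void lock() {
        while (!flag.compareAndSet(UNLOCKED, LOCKED)) {
            Thread.yield();
        }
    }

    // Release: CAS the flag back from LOCKED to UNLOCKED.
    void unlock() {
        if (!flag.compareAndSet(LOCKED, UNLOCKED)) {
            throw new IllegalMonitorStateException("lock was not held");
        }
    }

    void increment() {
        lock();
        try {
            counter++;
        } finally {
            unlock();
        }
    }

    int get() {
        lock();
        try {
            return counter;
        } finally {
            unlock();
        }
    }

    // Runs several threads that each increment the counter, then returns the total.
    static int demo(int threads, int perThread) {
        CasLockSketch lock = new CasLockSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return lock.get();
    }
}
```

Because every increment happens between a successful acquiring CAS and a releasing CAS, no updates are lost. With a plain int and no lock, concurrent increments could overwrite each other.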
Resources:
Memory barriers/fences
LOCK vs Mfence
Memory barrier
Talk about high concurrency (35) Understanding memory barriers