1 SMP (Symmetric Multi-Processor)
In a symmetric multiprocessor architecture, the CPUs in a server work symmetrically: every CPU needs the same time to access any memory address. Its main feature is sharing: the CPUs, memory, I/O, and other resources of the system are all shared.
SMP guarantees memory consistency, but the shared resources are likely to become performance bottlenecks. As the number of CPUs increases, every CPU accesses the same memory, which can lead to memory access conflicts and wasted CPU resources. The common PC is an SMP system.
2 NUMA (Non-Uniform Memory Access)
With non-uniform memory access, the CPUs are divided into modules (nodes). Each module consists of multiple CPUs and has its own local memory, I/O slots, and so on, and the modules reach one another through an interconnect.
Accessing local memory is much faster than accessing remote memory (memory that belongs to other nodes in the system), which is where the name "non-uniform memory access" comes from. NUMA is a good solution to the scaling problem of SMP, but because the latency of remote memory access far exceeds that of local memory access, system performance still cannot increase linearly as the number of CPUs grows.
3 CLH Lock
CLH (Craig, Landin, and Hagersten) lock: a spin lock that guarantees freedom from starvation and provides first-come-first-served fairness.
The CLH lock is a scalable, high-performance, fair spin lock based on a linked list. A waiting thread spins only on a local variable: it keeps polling the state of its predecessor and stops spinning once it finds that the predecessor has released the lock.
When a thread needs to acquire the lock:
1. It creates a QNode and sets its locked field to true to indicate that it wants the lock.
2. It calls getAndSet on the tail field, which makes its own node the tail of the queue and returns a reference to its predecessor node, myPred.
3. It spins on the locked field of the predecessor node until the predecessor releases the lock.
4. When the thread needs to release the lock, it sets the locked field of its own node to false and reclaims the predecessor node for reuse.
As shown in the following figure, thread A wants the lock: the locked field of its myNode is true and tail points to thread A's node. Thread B then enqueues behind thread A, and tail points to thread B's node. Threads A and B each spin on the locked field of their myPred node, and as soon as that field becomes false the spinning thread acquires the lock. The locked field of thread A's myPred is false, so thread A acquires the lock at this point.
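The acquire and release steps above can be replayed without real threads. The following is a minimal sketch, not the lock itself: ClhTrace, enqueue, release, and run are hypothetical names introduced for illustration, and each "thread" is represented only by its node so that the queue state can be inspected after every step.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical single-threaded trace of the CLH queue. No thread really spins here;
// each "thread" is represented by its node so the queue states can be inspected.
class ClhTrace {
    static class QNode { volatile boolean locked; }

    // tail starts at a dummy node whose locked field is false
    final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());

    // Steps 1 and 2: mark the node as wanting the lock, swap it into tail,
    // and return the predecessor node the caller would spin on (step 3).
    QNode enqueue(QNode myNode) {
        myNode.locked = true;
        return tail.getAndSet(myNode);
    }

    // Step 4 (without node recycling): the owner clears its own flag.
    void release(QNode myNode) {
        myNode.locked = false;
    }

    // Replays "A enqueues, B enqueues, A releases" and records what each thread would see.
    static boolean[] run() {
        ClhTrace queue = new ClhTrace();
        QNode a = new QNode(), b = new QNode();
        QNode predOfA = queue.enqueue(a);            // predecessor is the dummy node
        QNode predOfB = queue.enqueue(b);            // predecessor is A's node
        boolean aEntersAtOnce = !predOfA.locked;     // the dummy node is unlocked
        boolean bSpinsOnA = (predOfB == a) && predOfB.locked;
        boolean tailIsB = queue.tail.get() == b;
        queue.release(a);
        boolean bEntersAfterRelease = !predOfB.locked;
        return new boolean[] { aEntersAtOnce, bSpinsOnA, tailIsB, bEntersAfterRelease };
    }
}
```

Every flag returned by run() corresponds to one statement in the description: A enters immediately, B waits on A's node while tail points to B, and B may enter once A has cleared its locked field.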
Code example
import java.util.concurrent.atomic.AtomicReference;

public class CLHLock {
    // Tail of the implicit queue; starts as a dummy node with locked == false.
    private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
    private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

    public void lock() {
        QNode qnode = myNode.get();
        qnode.locked = true;                 // this thread wants the lock
        QNode pred = tail.getAndSet(qnode);  // become the tail, get the predecessor
        myPred.set(pred);
        while (pred.locked) { }              // spin on the predecessor's locked field
    }

    public void unlock() {
        QNode qnode = myNode.get();
        qnode.locked = false;                // release; the successor stops spinning
        myNode.set(myPred.get());            // reuse the predecessor's node next time
    }

    static class QNode {
        volatile boolean locked = false;     // volatile so spinning threads see updates
    }
}
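To see the lock at work, here is a hedged usage sketch: several threads increment a plain int that only the lock protects. DemoClhLock and ClhCounterDemo are hypothetical names. The lock is repeated in compact form only to keep the example self-contained, and the spin loop calls Thread.yield(), which is an assumption of this demo (so it finishes quickly on machines with few cores) rather than part of the algorithm.

```java
import java.util.concurrent.atomic.AtomicReference;

// Compact CLH lock, repeated here only so that the demo is self-contained.
class DemoClhLock {
    static class QNode { volatile boolean locked; }

    private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
    private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

    void lock() {
        QNode node = myNode.get();
        node.locked = true;
        QNode pred = tail.getAndSet(node);
        myPred.set(pred);
        while (pred.locked) {
            Thread.yield();   // demo assumption: be polite to the scheduler while spinning
        }
    }

    void unlock() {
        QNode node = myNode.get();
        node.locked = false;
        myNode.set(myPred.get());
    }
}

class ClhCounterDemo {
    static int counter = 0;   // deliberately not atomic: only the lock protects it

    // Runs the given number of threads, each incrementing the counter under the lock.
    static int run(int threads, int iterations) {
        counter = 0;
        DemoClhLock lock = new DemoClhLock();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    lock.lock();
                    try { counter++; } finally { lock.unlock(); }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000));
    }
}
```

If mutual exclusion holds, no increment is lost and run returns threads multiplied by iterations.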
CLH Analysis
The advantage of the CLH queue lock is its low space complexity: with n threads and L locks, where each thread acquires at most one lock at a time, the required storage is O(L+n), because the n threads have n myNode instances and the L locks have L tails. A variant of CLH is used in the Java concurrency framework (the wait queue of AbstractQueuedSynchronizer). CLH is very effective on SMP architectures. On a NUMA architecture, however, each thread has its own local memory; if the predecessor node resides in remote memory, spinning on the predecessor's locked field degrades performance considerably. One solution for NUMA architectures is the MCS queue lock.
4 MCS Lock
The biggest difference between MCS and CLH is not whether the linked list is explicit or implicit, but where a thread spins: in CLH a thread spins on the locked field of its predecessor node, whereas in MCS it spins on the locked field of its own node. This is how MCS avoids the problem CLH has on NUMA architectures, where the locked field being polled may reside in remote memory.
The MCS queue lock works as follows:
a. Initially the queue has no nodes and tail = null.
b. Thread A wants to acquire the lock and puts itself at the tail of the queue. Because it is the first node, its locked field is false.
c. Threads B and C join the queue in turn, so A->next = B and B->next = C. B and C have not acquired the lock and are waiting, so their locked fields are true. The tail pointer points to thread C's node.
d. Thread A releases the lock: it follows its next pointer to thread B's node and sets B's locked field to false. This lets thread B acquire the lock.
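Steps a to d can be checked with a single-threaded trace. This is a sketch under stated assumptions: McsTrace, enqueue, release, and run are hypothetical names, each "thread" is represented by its node, and no spinning takes place, so the code only shows how tail, next, and locked change.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical single-threaded trace of steps a to d; threads are represented by nodes.
class McsTrace {
    static class QNode {
        volatile boolean locked;
        volatile QNode next;
    }

    final AtomicReference<QNode> tail = new AtomicReference<>(null);   // step a

    // Enqueue part of lock(): link behind the old tail and wait only if there was one.
    void enqueue(QNode node) {
        QNode pred = tail.getAndSet(node);
        if (pred != null) {
            node.locked = true;    // the thread would now spin on its OWN node
            pred.next = node;
        }
    }

    // Release: hand the lock to the successor, or empty the queue if there is none.
    void release(QNode node) {
        if (node.next == null) {
            tail.compareAndSet(node, null);   // single-threaded, so this cannot fail here
            return;
        }
        node.next.locked = false;
        node.next = null;
    }

    static boolean[] run() {
        McsTrace queue = new McsTrace();
        QNode a = new QNode(), b = new QNode(), c = new QNode();
        boolean emptyAtStart = queue.tail.get() == null;                        // step a
        queue.enqueue(a);                                                       // step b
        boolean aOwnsLock = !a.locked;
        queue.enqueue(b);                                                       // step c
        queue.enqueue(c);
        boolean linked = a.next == b && b.next == c && queue.tail.get() == c;
        boolean bAndCWait = b.locked && c.locked;
        queue.release(a);                                                       // step d
        boolean bOwnsLockCStillWaits = !b.locked && c.locked;
        return new boolean[] { emptyAtStart, aOwnsLock, linked, bAndCWait, bOwnsLockCStillWaits };
    }
}
```

Note that B and C each watch only their own locked field, which is the property that keeps the polled location local on NUMA machines.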
Code implementation
import java.util.concurrent.atomic.AtomicReference;

public class MCSLock {
    private final AtomicReference<QNode> tail = new AtomicReference<>(null);
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);

    public void lock() {
        QNode qnode = myNode.get();
        QNode pred = tail.getAndSet(qnode);   // enqueue at the tail
        if (pred != null) {                   // queue was not empty: must wait
            qnode.locked = true;
            pred.next = qnode;
            while (qnode.locked) { }          // wait until predecessor gives up the lock
        }
    }

    public void unlock() {
        QNode qnode = myNode.get();
        if (qnode.next == null) {
            if (tail.compareAndSet(qnode, null)) return;  // no successor
            while (qnode.next == null) { }    // wait until successor fills in the next field
        }
        qnode.next.locked = false;            // hand the lock to the successor
        qnode.next = null;
    }
    static class QNode {
        volatile boolean locked = false;
        volatile QNode next = null;
    }
}
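As a rough check of mutual exclusion, the sketch below counts how many threads are inside the critical section at once. DemoMcsLock and McsExclusionDemo are hypothetical names, the lock is repeated compactly so the example is self-contained, and the Thread.yield() calls in the wait loops are an assumption of the demo, not part of the algorithm.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Compact MCS lock, repeated here only so that the demo is self-contained.
class DemoMcsLock {
    static class QNode { volatile boolean locked; volatile QNode next; }

    private final AtomicReference<QNode> tail = new AtomicReference<>(null);
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);

    void lock() {
        QNode node = myNode.get();
        QNode pred = tail.getAndSet(node);
        if (pred != null) {
            node.locked = true;
            pred.next = node;
            while (node.locked) Thread.yield();        // spin on this thread's own node
        }
    }

    void unlock() {
        QNode node = myNode.get();
        if (node.next == null) {
            if (tail.compareAndSet(node, null)) return;
            while (node.next == null) Thread.yield();  // successor is still linking itself
        }
        node.next.locked = false;
        node.next = null;
    }
}

class McsExclusionDemo {
    // Returns {largest number of threads seen inside the critical section, total entries}.
    static int[] run(int threads, int iterations) {
        DemoMcsLock lock = new DemoMcsLock();
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxInside = new AtomicInteger();
        AtomicInteger entries = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    lock.lock();
                    try {
                        int now = inside.incrementAndGet();
                        maxInside.accumulateAndGet(now, Math::max);
                        entries.incrementAndGet();
                        inside.decrementAndGet();
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return new int[] { maxInside.get(), entries.get() };
    }
}
```

If the lock works, the first value is 1 (never more than one thread inside) and the second equals threads multiplied by iterations.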