1. Overview
Synchronization is a classic operating-system problem that arises wherever there is concurrent processing. Modern architectures exhibit three common forms of concurrency:
(1) multiple threads running on a single processor: multithreaded programming
(2) multiple threads running on multiple processors: parallel computing
(3) multiple threads running across distributed processors: distributed computing
The corresponding programming styles are likewise divided into three kinds:
shared-variable programming, distributed (message-based) programming, and parallel programming.
1.1. Essence of Concurrent Programming
A concurrent program usually consists of two or more processes that work together to complete a task. Communication between processes (or threads) is the root cause of synchronization problems: synchronization is required only when communication is required. Processes communicate in one of two ways: shared variables or message passing. With shared variables, one process writes a variable and other processes read it. With message passing, one process sends a message and another receives it. Whichever method is used, the processes must synchronize. There are two basic kinds of synchronization: mutual exclusion and condition synchronization. Mutual exclusion ensures that critical sections of code do not execute at the same time. Condition synchronization blocks a process until a given condition holds. For example, when a producer and a consumer communicate through shared memory, mutual exclusion ensures that the consumer does not access the memory while the producer is accessing it, and condition synchronization ensures that the consumer does not read data before the producer has written it.
The fundamental purpose of synchronization is to create a critical section or to wait for a specific condition. Common mechanisms include locks, semaphores, and monitors. The first two are implemented in the Linux kernel; monitors are generally implemented at user level (for example, Java builds its synchronization primitives on monitors).
Concurrency has two hardware sources: interrupts and multiple processors.
1.2. Hardware Architecture
Three computer architectures are in common use: (1) a single processor with memory, (2) shared-memory multiprocessors, and (3) distributed-memory systems, which include multicomputers and computer networks.
1.2.1. Single-Processor Architecture
1.2.2. Shared-Memory Multiprocessor
The processors and memory are connected through an interconnection network. A small-scale multiprocessor may contain up to 30 or so processors, and its interconnection network is implemented as a memory bus or a crossbar switch. Such a machine is usually called a UMA (uniform memory access) machine because every processor has the same access time to every memory location. UMA machines are also called symmetric multiprocessors (SMP).
1.2.3. Distributed-Memory Multiprocessor
In a distributed-memory multiprocessor, an interconnection network again provides communication, but each processor has its own private memory.
Processors in this structure communicate by message passing rather than by reading and writing shared memory, so there are no cache or memory consistency problems.
Multiple processors are exploited by three kinds of applications: multithreaded systems, distributed systems, and parallel computations.
2. Linux Kernel Synchronization
Concurrency in the Linux kernel has two hardware sources: interrupts and multiple CPUs. There are four basic synchronization methods: disabling interrupts (which affects only the local CPU), atomic operations, spin locks, and semaphores. Atomic operations are thin wrappers around the CPU's atomic instructions, so their atomicity is guaranteed by hardware. These hardware atomic instructions are the basis for implementing locks and semaphores.
In kernel mode, the CPU may be executing one of two kinds of kernel control path: interrupt handling or exception handling (which includes system calls). For interrupt handlers, a critical section can be implemented by disabling interrupts on a single CPU and with a spin lock on multiple CPUs. For exception handlers, a critical section can be implemented by disabling kernel preemption on a single CPU and with a semaphore on multiple CPUs.
2.1. Spin Lock
Spin locks mainly target multiple CPUs. A spin lock synchronizes by busy waiting (and therefore wastes CPU cycles) and is mainly used in interrupt handling. Kernel preemption is disabled inside a critical section protected by a spin lock. On a single CPU, a spin lock does nothing except disable (or re-enable) kernel preemption.
2.1.1. Hardware Support
On the x86 platform, the LOCK prefix can be added to the following instructions to make them execute atomically:
(1) bit test-and-modify instructions, such as BTS, BTR, and BTC;
(2) the exchange instruction XCHG, which in fact executes atomically even without the LOCK prefix when one operand is in memory;
(3) some single-operand arithmetic and logical instructions, such as INC, DEC, NOT, and NEG;
(4) some two-operand instructions, such as ADD, ADC, SUB, SBB, AND, OR, and XOR.
These atomic instructions are the basis for implementing spin locks and semaphores.
2.1.2. Implementation of Spin Locks
The following interfaces are provided:
Data structure: // include/asm-i386/spinlock.h
/* Spin lock data structure, 2.6.10 */
typedef struct {
    volatile unsigned int lock;
#ifdef CONFIG_DEBUG_SPINLOCK
    unsigned magic;
#endif
} spinlock_t;
/* Changed in 2.6.11 relative to 2.6.10 */
typedef struct {
    volatile unsigned int slock;
#ifdef CONFIG_DEBUG_SPINLOCK
    unsigned magic;
#endif
#ifdef CONFIG_PREEMPT
    unsigned int break_lock;
#endif
} spinlock_t;
Interface implementation: // include/linux/spinlock.h
#define spin_lock(lock) _spin_lock(lock)
#define spin_unlock(lock) _spin_unlock(lock)
/////////// spin_lock for a preemptible kernel ///////////
// kernel/spinlock.c
void __lockfunc _spin_lock(spinlock_t *lock)
{
    preempt_disable(); // disable kernel preemption
    if (unlikely(!_raw_spin_trylock(lock)))
        __preempt_spin_lock(lock);
}
// include/asm-i386/spinlock.h
// Returns 1 if the spin lock was acquired and 0 if it was not.
static inline int _raw_spin_trylock(spinlock_t *lock)
{
    char oldval;
    /* xchgb is an atomic instruction.
    ** The asm statement is equivalent to: oldval = 0; tmp = oldval; oldval = lock->lock; lock->lock = tmp;
    ** That is, it reads the old value of the lock field and sets the field to 0 (locked).
    */
    __asm__ __volatile__(
        "xchgb %b0, %1"
        : "=q" (oldval), "=m" (lock->lock)
        : "0" (0) : "memory");
    // If the old value was positive, the lock was free and the current kernel control path now holds it, so return 1; otherwise return 0.
    return oldval > 0;
}
// include/linux/preempt.h
#define preempt_disable() \
do { \
    inc_preempt_count(); \
    barrier(); \
} while (0)
// kernel/spinlock.c
// Called when the kernel control path on the current CPU fails to acquire the spin lock
static inline void __preempt_spin_lock(spinlock_t *lock)
{
    if (preempt_count() > 1) {
        _raw_spin_lock(lock);
        return;
    }
    do {
        // Decrement the preemption counter so that the kernel can be preempted while waiting for the spin lock
        preempt_enable();
        while (spin_is_locked(lock))
            cpu_relax();
        preempt_disable();
    } while (!_raw_spin_trylock(lock)); // keep retrying until the spin lock is acquired
}
/////////// spin_lock for a non-preemptible kernel ///////////
void __lockfunc _spin_lock(spinlock_t *lock)
{
    // preempt_disable() does nothing on a non-preemptible kernel
    preempt_disable();
    _raw_spin_lock(lock);
}
// include/asm-i386/spinlock.h
static inline void _raw_spin_lock(spinlock_t *lock)
{
#ifdef CONFIG_DEBUG_SPINLOCK
    if (unlikely(lock->magic != SPINLOCK_MAGIC)) {
        printk("eip: %p\n", __builtin_return_address(0));
        BUG();
    }
#endif
    __asm__ __volatile__(
        spin_lock_string
        : "=m" (lock->lock) : : "memory");
}
#define spin_lock_string \
    "\n1:\t" \
    "lock; decb %0\n\t"  /* atomically decrement the lock byte */ \
    "jns 3f\n"           /* result >= 0: lock acquired, jump to label 3 */ \
    "2:\t" \
    "rep; nop\n\t"       /* pause while spinning */ \
    "cmpb $0, %0\n\t"    /* compare the lock byte with 0 */ \
    "jle 2b\n\t"         /* still <= 0 (locked): keep spinning at label 2 */ \
    "jmp 1b\n"           /* > 0 (unlocked): go back to label 1 and retry */ \
    "3:\n\t"
/////////// Implementation of spin_unlock ///////////
void __lockfunc _spin_unlock(spinlock_t *lock)
{
    _raw_spin_unlock(lock);
    preempt_enable();
}
// include/asm-i386/spinlock.h
static inline void _raw_spin_unlock(spinlock_t *lock)
{
#ifdef CONFIG_DEBUG_SPINLOCK
    BUG_ON(lock->magic != SPINLOCK_MAGIC);
    BUG_ON(!spin_is_locked(lock));
#endif
    __asm__ __volatile__(
        spin_unlock_string
    );
}
#define spin_unlock_string \
    "movb $1, %0" \
    : "=m" (lock->lock) : : "memory"