The various tasks running in the Linux kernel all share the kernel address space, so they need synchronization and mutual exclusion. The synchronization/mutual-exclusion techniques supported by the Linux kernel include:
| Technology | Function | Scope of Action |
| --- | --- | --- |
| Per-CPU variables | Duplicate a piece of data for each CPU | All CPUs |
| Atomic operations | Atomic read-modify-write of a counter in a single instruction | All CPUs |
| Memory barriers | Prevent instructions from being reordered | Local CPU or all CPUs |
| Spin locks | Lock with busy waiting | All CPUs |
| Semaphores | Lock with blocking wait (sleep) | All CPUs |
| Sequential locks (seqlocks) | Lock based on an access counter | All CPUs |
| RCU | Lock-free access to shared data structures through pointers | All CPUs |
| Completions | Notify/wait for another task to finish | All CPUs |
| Local interrupt disabling | Disable interrupt handling on a single CPU (the local CPU) | Local CPU |
| Local softirq disabling | Forbid execution of deferrable functions on a single CPU (the local CPU) | Local CPU |
I. Per-CPU variables

First of all, it must be clear that the best synchronization/mutual-exclusion technique is to avoid the need for synchronization altogether: every synchronization technique has a performance cost.

A per-CPU variable is the simplest synchronization technique. It is essentially an array of data structures in which each element corresponds to one CPU.

With a per-CPU variable, each CPU is expected to access only the element associated with it, so per-CPU variables can be used only in special circumstances.
The elements of a per-CPU variable are laid out in main memory so that they map to different hardware cache lines. This ensures that concurrent accesses to a per-CPU variable do not cause cache-line snooping and invalidation, which would incur a high overhead.

Although per-CPU variables protect against concurrent accesses from different CPUs, they offer no protection against asynchronous accesses such as interrupt handlers and deferrable functions. Moreover, if kernel preemption is enabled, per-CPU variables are subject to race conditions; the kernel must therefore disable preemption while accessing a per-CPU variable.
Macros and functions for per-CPU variables (a usage sketch follows the list):

- DEFINE_PER_CPU(type, name): statically allocates a per-CPU variable called name of the given type
- per_cpu(name, cpu): selects the element of the per-CPU variable name corresponding to the given CPU
- __get_cpu_var(name): selects the element of the per-CPU variable name corresponding to the local CPU
- get_cpu_var(name): disables kernel preemption, then selects the element of the per-CPU variable name corresponding to the local CPU
- put_cpu_var(name): re-enables kernel preemption (the name argument is not used)
- alloc_percpu(type): dynamically allocates a per-CPU variable of the given type and returns its address
- free_percpu(pointer): frees a dynamically allocated per-CPU variable whose address is pointer
- per_cpu_ptr(pointer, cpu): returns the address of the element corresponding to the given CPU of the per-CPU variable whose address is pointer
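A minimal sketch of how these macros might be used, assuming a hypothetical per-CPU packet counter in a kernel module (the names pkt_count, count_packet and read_count are made up for illustration):

```c
#include <linux/percpu.h>

/* Statically allocated per-CPU counter (hypothetical name). */
static DEFINE_PER_CPU(unsigned long, pkt_count);

static void count_packet(void)
{
    /* get_cpu_var() disables preemption and yields the local CPU's element. */
    get_cpu_var(pkt_count)++;
    put_cpu_var(pkt_count);          /* re-enable kernel preemption */
}

static unsigned long read_count(int cpu)
{
    /* per_cpu() selects the element belonging to an arbitrary CPU. */
    return per_cpu(pkt_count, cpu);
}
```

Because each CPU increments only its own element, no lock is needed on the fast path; a reader that wants a total simply sums the elements of all CPUs.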
II. Atomic operations

A number of assembly-language instructions are of the "read-modify-write" type: they access a memory location twice, once to read the old value and once to write the new one. If two or more CPUs issue this kind of operation on the same location at the same time, the final result may be wrong (each CPU reads the same old value, modifies it, and writes it back; the last write wins, so two concurrent increments may add only 1 in total). The simplest way to avoid this problem is to make sure the operation is atomic at the chip level.

When writing C code, we cannot be sure that the compiler will emit atomic instructions. Linux therefore provides a special atomic_t type together with a set of functions and macros that operate on atomic_t values and are implemented as single, atomic assembly instructions.
Atomic operations in Linux:
- atomic_read(v): returns the value of *v
- atomic_set(v, i): sets the value of *v to i
- atomic_add(i, v): adds i to *v
- atomic_sub(i, v): subtracts i from *v
- atomic_sub_and_test(i, v): subtracts i from *v and checks whether the updated value is 0; returns 1 if it is
- atomic_inc(v): adds 1 to *v
- atomic_dec(v): subtracts 1 from *v
- atomic_dec_and_test(v): subtracts 1 from *v and checks whether the updated value is 0; returns 1 if it is
- atomic_inc_and_test(v): adds 1 to *v and checks whether the updated value is 0; returns 1 if it is
- atomic_add_negative(i, v): adds i to *v and checks whether the updated value is negative; returns 1 if it is
- atomic_inc_return(v): adds 1 to *v and returns the updated value
- atomic_dec_return(v): subtracts 1 from *v and returns the updated value
- atomic_add_return(i, v): adds i to *v and returns the updated value
- atomic_sub_return(i, v): subtracts i from *v and returns the updated value

There are also atomic operations that act on bit masks (a short usage sketch covering both groups follows the list below):
- test_bit(nr, addr): returns the value of bit nr of *addr
- set_bit(nr, addr): sets bit nr of *addr to 1
- clear_bit(nr, addr): sets bit nr of *addr to 0
- change_bit(nr, addr): inverts bit nr of *addr
- test_and_set_bit(nr, addr): sets bit nr of *addr to 1 and returns its old value
- test_and_clear_bit(nr, addr): sets bit nr of *addr to 0 and returns its old value
- test_and_change_bit(nr, addr): inverts bit nr of *addr and returns its old value
- atomic_clear_mask(mask, addr): clears all bits of *addr that are set in mask
- atomic_set_mask(mask, addr): sets all bits of *addr that are set in mask
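A minimal sketch, assuming a hypothetical reference counter and a status bitmap in a kernel module (the names refcnt, dev_flags, RESOURCE_BUSY and the functions are made up for illustration):

```c
#include <linux/atomic.h>
#include <linux/bitops.h>
#include <linux/printk.h>

static atomic_t refcnt = ATOMIC_INIT(1);   /* atomic reference counter */
static unsigned long dev_flags;            /* bitmap operated on atomically */

#define RESOURCE_BUSY 0                    /* bit number, hypothetical */

static void get_ref(void)
{
    atomic_inc(&refcnt);                   /* atomic increment, no lock needed */
}

static void put_ref(void)
{
    /* Decrement and test for zero in one indivisible step. */
    if (atomic_dec_and_test(&refcnt))
        pr_info("last reference dropped\n");
}

static int try_claim(void)
{
    /* Atomically set the bit and learn whether someone else already had it. */
    return !test_and_set_bit(RESOURCE_BUSY, &dev_flags);
}
```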
III. Optimization and memory barriers

If compiler optimization is enabled, instructions are not necessarily executed in the order in which they appear in the source code. Moreover, modern CPUs usually execute several instructions in parallel and may reorder memory accesses.

When it comes to synchronization, however, such reordering can cause trouble: if an instruction placed after a synchronization primitive is executed before the primitive itself, things can go wrong. In fact, all synchronization primitives act as optimization and memory barriers.

An optimization barrier tells the compiler that all memory values currently cached in CPU registers must be considered invalid after the barrier, so the compiler may not move memory accesses that appear after the barrier ahead of those that appear before it. In Linux the optimization barrier primitive is the barrier() macro. Note that it does not constrain the order in which the CPU actually executes the instructions (because of parallel execution, a later instruction may still complete first).

A memory barrier primitive ensures that the operations placed before the primitive are completed before the operations placed after it are started.

Linux provides several memory barrier primitives, all of which also act as optimization barriers. A read memory barrier orders only read operations, and a write memory barrier orders only write operations. The primitives are listed below, followed by a short usage sketch:
- mb(): memory barrier on both uniprocessor and multiprocessor architectures
- rmb(): read memory barrier on both uniprocessor and multiprocessor architectures
- wmb(): write memory barrier on both uniprocessor and multiprocessor architectures
- smp_mb(): memory barrier only on multiprocessor architectures
- smp_rmb(): read memory barrier only on multiprocessor architectures
- smp_wmb(): write memory barrier only on multiprocessor architectures
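A minimal sketch of the classic publish/consume pattern these barriers are used for, assuming hypothetical shared variables data and ready:

```c
#include <asm/barrier.h>

static int data;          /* payload, hypothetical */
static int ready;         /* flag telling readers the payload is valid */

/* Writer side: make the store to 'data' visible before the store to 'ready'. */
static void publish(int value)
{
    data = value;
    smp_wmb();            /* write barrier: order the two stores */
    ready = 1;
}

/* Reader side: if we see ready == 1, the read barrier guarantees we also
 * see the store to 'data' that preceded it on the writer's side. */
static int consume(int *out)
{
    if (!ready)
        return 0;
    smp_rmb();            /* read barrier: order the two loads */
    *out = data;
    return 1;
}
```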
IV. Spin locks

1. Spin locks

The spin lock is a widely used synchronization technique: before accessing a shared data structure or entering a critical region, the kernel must first acquire the associated lock. When the kernel wants to access a resource protected by a lock, it tries to acquire the lock; if nobody currently holds it, the lock is acquired and the resource can be accessed, otherwise the acquisition fails and the resource cannot be touched. The scheme is clearly cooperative: every task that needs the resource follows the rule of first obtaining permission, then using the resource, and finally releasing the lock.

Spin locks are a special kind of lock designed for multiprocessor environments. If the lock is currently held and cannot be acquired, the requesting task spins in a tight loop, repeatedly checking the lock until it is released (the current CPU keeps busy-waiting on the lock).

In general, kernel preemption is disabled inside a critical region protected by a spin lock. On a uniprocessor system the lock itself performs no locking; the spin lock primitives simply disable or enable kernel preemption. Note also that kernel preemption remains enabled while a task is busy-waiting for a spin lock, so the waiting task may be replaced by a higher-priority task.

Besides busy waiting, spin locks have another cost worth noting: because they mainly synchronize CPUs in an SMP system, every CPU operating on a spin lock must see the most recent value of the lock in memory, which puts pressure on the cache. For this reason spin locks are only appropriate for protecting short sections of code.
2. Spin lock data structure, macros and functions

A Linux spin lock is represented by the spinlock_t data structure, whose main field is:

- slock: holds the state of the spin lock; 1 means "unlocked", while 0 and negative values mean "locked".

The spin-lock related macros (all built on atomic operations) are listed below; a usage sketch follows the list:
- spin_lock_init(): initializes the spin lock to 1 (unlocked)
- spin_lock(): acquires the spin lock, looping until it becomes available if it cannot be obtained immediately
- spin_unlock(): releases the spin lock
- spin_unlock_wait(): waits until the spin lock is released
- spin_is_locked(): returns nonzero if the spin lock is held, 0 otherwise
- spin_trylock(): tries to acquire the spin lock without busy-waiting; returns nonzero if the lock was acquired, 0 otherwise

Besides these, there are versions usable in interrupt and softirq contexts (interrupt version: spin_lock_irq; version that also saves the interrupt state word: spin_lock_irqsave; softirq version: spin_lock_bh).
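A minimal sketch of how these primitives might protect a shared structure that is also touched from interrupt context (the names dev_lock, dev_events, record_event and dev_isr are made up for illustration):

```c
#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(dev_lock);   /* statically initialized; spin_lock_init()
                                       would be used for a dynamically allocated lock */
static unsigned long dev_events;    /* shared data the lock protects */

/* Process context: the data is also used by an interrupt handler,
 * so disable local interrupts while holding the lock. */
static void record_event(void)
{
    unsigned long flags;

    spin_lock_irqsave(&dev_lock, flags);
    dev_events++;
    spin_unlock_irqrestore(&dev_lock, flags);
}

/* Interrupt handler: a plain spin_lock() is enough here, because the
 * process-context path already disables local interrupts. */
static irqreturn_t dev_isr(int irq, void *dev_id)
{
    spin_lock(&dev_lock);
    dev_events++;
    spin_unlock(&dev_lock);
    return IRQ_HANDLED;
}
```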
3. Read-write spin locks

Read-write spin locks are designed to increase the amount of concurrency in the kernel. As long as no kernel control path is modifying a data structure, several kernel control paths may read it at the same time; a path that wants to write the structure must first obtain the write lock. In short: writes are exclusive, reads are shared.

A read-write spin lock is represented by the rwlock_t data structure; its lock field is a 32-bit field split into two parts:

- a 24-bit counter, stored in bits 0-23, that records (in two's complement form) the number of kernel control paths currently reading the protected data structure;
- an "unlocked" flag, stored in bit 24, which is set when no kernel control path is reading or writing and cleared otherwise.

Thus 0x01000000 means unlocked, 0x00000000 means locked for writing, 0x00ffffff means one reader, 0x00fffffe means two readers, and so on.
4. Read-write spin lock related functions
read_lock: acquires the lock for reading. It is similar to spin_lock() (it also disables kernel preemption), except that it allows concurrent readers. It atomically decrements the lock field: if the result is non-negative, the read lock is acquired; otherwise it atomically increments the lock field to undo the decrement, busy-waits until the field becomes positive, and then tries again.
read_unlock: releases a read lock. It atomically increments the lock field and then re-enables kernel preemption.

Note: if the kernel is built without preemption support, the operations that disable and re-enable kernel preemption amount to no-ops.

write_lock: acquires the lock for writing. Like spin_lock() and read_lock(), it also disables kernel preemption. It atomically subtracts 0x01000000 from the lock field: if the result is 0, the write lock is acquired; otherwise it atomically adds 0x01000000 back to undo the subtraction, busy-waits until the field becomes 0x01000000, and then tries again.
write_unlock: releases a write lock. It atomically adds 0x01000000 to the lock field and then re-enables kernel preemption.

As with ordinary spin locks, read-write spin locks also come in interrupt and softirq versions (interrupt version: read_lock_irq; version that also saves the interrupt state word: read_lock_irqsave; softirq version: read_lock_bh; with corresponding write_lock_irq, write_lock_irqsave and write_lock_bh variants).
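A minimal sketch of read-write spin lock usage, assuming a hypothetical table that is read often and updated rarely (the names tbl_lock, tbl_entries, lookup and update are made up for illustration):

```c
#include <linux/spinlock.h>

static DEFINE_RWLOCK(tbl_lock);    /* read-write spin lock */
static int tbl_entries[16];        /* shared table it protects, hypothetical */

/* Readers may run concurrently with one another. */
static int lookup(int idx)
{
    int val;

    read_lock(&tbl_lock);
    val = tbl_entries[idx];
    read_unlock(&tbl_lock);
    return val;
}

/* A writer excludes both readers and other writers. */
static void update(int idx, int val)
{
    write_lock(&tbl_lock);
    tbl_entries[idx] = val;
    write_unlock(&tbl_lock);
}
```

This matches the "writes are exclusive, reads are shared" rule described above: lookup() paths proceed in parallel, while update() waits until all readers have released the lock.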