Linux kernel synchronization: per-CPU variables, atomic operations, memory barriers, spin locks (repost)


Transferred from: http://blog.csdn.net/goodluckwhh/article/details/9005585

Copyright notice: this is an original article by the author; do not reproduce without the author's permission.

Contents

    1. Per-CPU variables
    2. Atomic operations
    3. Optimization and memory barriers
    4. Spin locks
      1. Spin locks
      2. Spin lock data structures, macros, and functions
      3. Read/write spin locks
      4. Read/write spin lock functions

The various "Tasks" in the Linux kernel can see the kernel address space, so they also need to be synchronized and mutually exclusive. The Linux kernel supports synchronous/mutex methods including:

Technology                       Description                                                           Scope
Per-CPU variables                Duplicate one copy of the data for each CPU                           All CPUs
Atomic operations                Atomic read-modify-write instructions on a counter                    All CPUs
Memory barriers                  Prevent instructions from being reordered                             Local CPU or all CPUs
Spin locks                       Lock with busy waiting                                                All CPUs
Semaphores                       Lock with blocking wait (sleep)                                       All CPUs
Sequential locks (seqlocks)      Locking based on an access counter                                    All CPUs
RCU                              Lock-free access to shared data structures through pointers           All CPUs
Completions                      Notify/wait until another task completes                              All CPUs
Disabling local interrupts       Disable interrupts on a single CPU (the local one)                    Local CPU
Disabling local soft interrupts  Disable execution of deferred functions on a single CPU (the local one)  Local CPU

1. Per-CPU variables

The first thing to understand is that the best synchronization/mutual-exclusion technique is not needing any: every synchronization technique carries a performance cost.
A per-CPU variable is the simplest synchronization mechanism. It is in fact an array of data structures, with one element for each CPU in the system.
With per-CPU variables, each CPU may only access its own element, so per-CPU variables can be used only in special cases.
Per-CPU variables are laid out in main memory so that each element maps to a different hardware cache line. This guarantees that concurrent accesses to a per-CPU variable do not cause cache-line snooping and invalidation, which would be expensive.
While per-CPU variables protect against concurrent accesses from different CPUs, they offer no protection against asynchronous accesses such as interrupts and deferred functions. Moreover, if kernel preemption is enabled, a race condition can arise on a per-CPU variable (a task could migrate to another CPU mid-access). The kernel should therefore disable preemption while accessing a per-CPU variable.
Macros and functions for per-CPU variables (a short usage sketch follows the list):
    • DEFINE_PER_CPU(type, name): statically allocates a per-CPU variable called name of type type
    • per_cpu(name, cpu): selects the element of the per-CPU variable name corresponding to the specified CPU
    • __get_cpu_var(name): selects the element of the per-CPU variable name corresponding to the local CPU
    • get_cpu_var(name): disables kernel preemption, then selects the element of the per-CPU variable name corresponding to the local CPU
    • put_cpu_var(name): re-enables kernel preemption (name is not used)
    • alloc_percpu(type): dynamically allocates a per-CPU variable of type type and returns its address
    • free_percpu(pointer): releases a dynamically allocated per-CPU variable; pointer is its address
    • per_cpu_ptr(pointer, cpu): returns the address of the element corresponding to cpu of the per-CPU variable whose address is pointer
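A minimal sketch of a per-CPU counter built from these macros. The variable packet_count and both functions are illustrative, and the sketch uses the get_cpu_var()/put_cpu_var() API described above (recent kernels prefer the this_cpu_* accessors):

    /* A per-CPU statistics counter. */
    #include <linux/percpu.h>
    #include <linux/cpumask.h>

    static DEFINE_PER_CPU(long, packet_count);

    static void count_packet(void)
    {
            /* get_cpu_var() disables preemption so the task cannot
             * migrate to another CPU while updating its element. */
            get_cpu_var(packet_count)++;
            put_cpu_var(packet_count);      /* re-enables preemption */
    }

    static long total_packets(void)
    {
            long sum = 0;
            int cpu;

            /* Reading other CPUs' elements is racy, which is
             * acceptable for statistics-style counters. */
            for_each_possible_cpu(cpu)
                    sum += per_cpu(packet_count, cpu);
            return sum;
    }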
2. Atomic operations

Many assembly instructions are of the "read-modify-write" type: the instruction accesses memory twice, once to read the old value and once to write the new one. If two or more CPUs issue such an operation on the same location at the same time, the final result can be wrong (each CPU reads the old value, modifies it, and writes it back, so the last write wins; if both CPUs were adding 1, the value ends up incremented only once). The simplest way to avoid this problem is to make the operation atomic at the chip level.
When writing C code we cannot guarantee that the compiler will emit atomic instructions. Linux therefore provides the special type atomic_t together with functions and macros that act on atomic_t and are implemented as single, atomic assembly instructions.
Atomic operations in Linux (a usage sketch follows the list):
    • atomic_read(v): returns the value of *v
    • atomic_set(v, i): sets the value of *v to i
    • atomic_add(i, v): adds i to *v
    • atomic_sub(i, v): subtracts i from *v
    • atomic_sub_and_test(i, v): subtracts i from *v and returns 1 if the new value is 0, 0 otherwise
    • atomic_inc(v): adds 1 to *v
    • atomic_dec(v): subtracts 1 from *v
    • atomic_dec_and_test(v): subtracts 1 from *v and returns 1 if the new value is 0, 0 otherwise
    • atomic_inc_and_test(v): adds 1 to *v and returns 1 if the new value is 0, 0 otherwise
    • atomic_add_negative(i, v): adds i to *v and returns 1 if the new value is negative, 0 otherwise
    • atomic_inc_return(v): adds 1 to *v and returns the new value
    • atomic_dec_return(v): subtracts 1 from *v and returns the new value
    • atomic_add_return(i, v): adds i to *v and returns the new value
    • atomic_sub_return(i, v): subtracts i from *v and returns the new value
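A minimal sketch of a reference counter built on atomic_t (the names refcount, get_ref, and put_ref are illustrative):

    /* A simple reference counter using the operations above. */
    #include <linux/atomic.h>
    #include <linux/printk.h>

    static atomic_t refcount = ATOMIC_INIT(1);

    static void get_ref(void)
    {
            atomic_inc(&refcount);
    }

    static void put_ref(void)
    {
            /* atomic_dec_and_test() returns 1 only for the caller that
             * brings the counter to 0, so exactly one path cleans up. */
            if (atomic_dec_and_test(&refcount))
                    pr_info("last reference dropped\n");
    }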
There are also atomic operations that act on bit masks (see the sketch after this list):
    • test_bit(nr, addr): returns the nr-th bit of *addr
    • set_bit(nr, addr): sets the nr-th bit of *addr to 1
    • clear_bit(nr, addr): clears the nr-th bit of *addr to 0
    • change_bit(nr, addr): inverts the nr-th bit of *addr
    • test_and_set_bit(nr, addr): sets the nr-th bit of *addr to 1 and returns its old value
    • test_and_clear_bit(nr, addr): clears the nr-th bit of *addr to 0 and returns its old value
    • test_and_change_bit(nr, addr): inverts the nr-th bit of *addr and returns its old value
    • atomic_clear_mask(mask, addr): clears all bits of *addr that are set in mask
    • atomic_set_mask(mask, addr): sets all bits of *addr that are set in mask
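A minimal sketch of a small ID allocator built on these bit operations (MAX_IDS, id_map, and both functions are illustrative):

    /* Allocate small integer IDs from a bitmap, lock-free. */
    #include <linux/bitops.h>

    #define MAX_IDS 64
    static unsigned long id_map[BITS_TO_LONGS(MAX_IDS)];

    /* Returns a free ID, or -1 if all are taken. */
    static int alloc_id(void)
    {
            int id;

            for (id = 0; id < MAX_IDS; id++) {
                    /* test_and_set_bit() returns the old value, so 0
                     * means this caller just claimed the bit atomically. */
                    if (!test_and_set_bit(id, id_map))
                            return id;
            }
            return -1;
    }

    static void free_id(int id)
    {
            clear_bit(id, id_map);
    }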
3. Optimization and memory barriers

When compiler optimization is enabled, the order in which instructions execute is not necessarily the order in which they appear in the code. In addition, modern CPUs usually execute several instructions in parallel and may reorder memory accesses.
Where synchronization is involved, however, such reordering is a problem: if an instruction placed after a synchronization primitive is executed before the primitive, things can go wrong. In fact, all synchronization primitives act as optimization and memory barriers.
The optimization barrier primitive tells the compiler that every memory value cached in a CPU register before the barrier is invalid after the barrier. As a result, the compiler cannot move read/write operations from after the barrier to before it, or vice versa. In Linux the optimization barrier is the barrier() macro. Note that this primitive says nothing about the order in which the CPU itself executes instructions (because of parallel execution, a later instruction may still finish first).
A memory barrier primitive guarantees that the operations placed before the primitive finish before the operations placed after it begin.
Linux provides several memory barrier primitives, each of which also acts as an optimization barrier (a usage sketch follows the list). Read memory barriers order only read operations; write memory barriers order only write operations.
    • mb(): acts as a memory barrier on both uniprocessor and multiprocessor systems
    • rmb(): acts as a read memory barrier on both uniprocessor and multiprocessor systems
    • wmb(): acts as a write memory barrier on both uniprocessor and multiprocessor systems
    • smp_mb(): acts as a memory barrier only on multiprocessor systems
    • smp_rmb(): acts as a read memory barrier only on multiprocessor systems
    • smp_wmb(): acts as a write memory barrier only on multiprocessor systems
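A minimal sketch of the classic publish/consume pattern these barriers support (the variables data and ready and both functions are illustrative):

    /* Producer/consumer ordering with smp_wmb()/smp_rmb(). */
    #include <asm/barrier.h>

    static int data;
    static int ready;

    /* Producer: publish the data, then set the flag. */
    static void producer(void)
    {
            data = 42;
            smp_wmb();      /* order the data store before the flag store */
            ready = 1;
    }

    /* Consumer: check the flag, then read the data. */
    static int consumer(void)
    {
            if (ready) {
                    smp_rmb();      /* order the flag load before the data load */
                    return data;
            }
            return -1;      /* not published yet */
    }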
4. Spin locks

1. Spin locks

The spin lock is a widely used synchronization technique: when the kernel wants to access a shared data structure or enter a critical section, it must first acquire a lock. When the kernel wants to access a resource protected by a lock, it tries to acquire the lock; if nobody currently holds it, access is granted, and if someone already holds it, access is denied. Locks are thus cooperative by nature: every task that needs a resource follows the discipline of first obtaining permission, then using the resource, then releasing it.
Spin locks are locks designed for multiprocessor environments. If the lock is currently held and cannot be acquired, the requesting task busy-waits for the lock to be released (that is, the current CPU spins, waiting for the release).
As a rule, kernel preemption is disabled inside a critical section protected by a spin lock. On a uniprocessor system the spin lock itself does not act as a lock; the spin-lock primitives merely disable or enable kernel preemption. Note also that kernel preemption remains enabled during the busy wait, so a task waiting for a spin lock can be replaced by a higher-priority task.
Besides the busy waiting, a spin lock has another cost worth noting: since spin locks are used mainly for synchronization between CPUs in an SMP system, every CPU operating on a spin lock must see its latest value in memory, so there is a cache-coherency cost. Spin locks should therefore protect only short code fragments.
2. Spin lock data structures, macros, and functions

A Linux spin lock is represented by the spinlock_t data structure, whose main field is:
    • slock: the state of the spin lock; 1 means "unlocked", while 0 and negative values mean "locked"
Spin lock macros (all based on atomic operations; a usage sketch follows below):
    • spin_lock_init(): initializes the spin lock to 1 (unlocked)
    • spin_lock(): acquires the spin lock; if it cannot be acquired, busy-waits until it can
    • spin_unlock(): releases the spin lock
    • spin_unlock_wait(): waits until the spin lock is released
    • spin_is_locked(): returns 1 if the spin lock is held, 0 otherwise
    • spin_trylock(): tries to acquire the spin lock and returns immediately instead of busy-waiting; returns nonzero if the lock was acquired, 0 otherwise
Besides these, there are variants for interrupt and softirq contexts (interrupt version: spin_lock_irq; version that saves the interrupt state word: spin_lock_irqsave; softirq version: spin_lock_bh).
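A minimal sketch of protecting a shared list with a spin lock (my_lock, my_list, struct item, and add_item are illustrative):

    /* A shared list touched by both process and interrupt context. */
    #include <linux/spinlock.h>
    #include <linux/list.h>

    static LIST_HEAD(my_list);
    static DEFINE_SPINLOCK(my_lock);

    struct item {
            struct list_head node;
            int value;
    };

    /* An interrupt handler may also take this lock, so the irqsave
     * variant is used to avoid deadlocking against a local IRQ that
     * arrives while the lock is held. */
    static void add_item(struct item *it)
    {
            unsigned long flags;

            spin_lock_irqsave(&my_lock, flags);     /* lock + disable local IRQs */
            list_add(&it->node, &my_list);
            spin_unlock_irqrestore(&my_lock, flags);
    }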
3. Read/write spin locks

Read/write spin locks exist to increase concurrency in the kernel. As long as no kernel control path is modifying a data structure, several kernel control paths may read it at the same time; a kernel control path that wants to write the data structure must acquire the write lock. In short: writes are exclusive, reads are shared.
A read/write spin lock is represented by the rwlock_t data structure. Its lock field is a 32-bit field divided into two parts:
    • A 24-bit counter (bits 0-23) of the kernel control paths currently reading the protected data structure.
    • An "unlocked" flag (bit 24), set when no kernel control path is reading or writing, and cleared otherwise.
Thus 0x01000000 means unlocked, 0x00000000 means write-locked, 0x00ffffff means one reader, 0x00fffffe means two readers, and so on.
4. Read/write spin lock functions

    • read_lock: acquires the spin lock for reading. It is similar to spin_lock() (it, too, disables kernel preemption), except that it allows concurrent readers. It atomically decrements the lock value; if the result is non-negative, the lock is acquired, otherwise it atomically increments the lock value to undo the decrement, busy-waits until the value becomes positive, and then retries.
    • read_unlock: releases a spin lock held for reading. It atomically increments the lock field and then re-enables kernel preemption.

Note: the kernel may be built without preemption support, in which case the disable/enable-preemption steps can be ignored.

    • write_lock: acquires the spin lock for writing. It is similar to spin_lock() and read_lock() (it also disables kernel preemption). It subtracts 0x01000000 from the lock field; if the result is 0, the write lock is acquired, otherwise it atomically adds 0x01000000 back to undo the subtraction, busy-waits until the value becomes 0x01000000, and then retries.
    • write_unlock: releases a spin lock held for writing. It atomically adds 0x01000000 to the lock field and then re-enables kernel preemption.
Like plain spin locks, read/write spin locks have interrupt and softirq variants (interrupt version: read_lock_irq; version that saves the interrupt state word: read_lock_irqsave; softirq version: read_lock_bh). A usage sketch follows.
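A minimal sketch of write-exclusive/read-shared access with a read/write spin lock (cfg_lock, cfg_value, and both functions are illustrative):

    /* Many readers, occasional writers. */
    #include <linux/spinlock.h>

    static DEFINE_RWLOCK(cfg_lock);
    static int cfg_value;

    /* Any number of readers may run this concurrently. */
    static int read_cfg(void)
    {
            int v;

            read_lock(&cfg_lock);
            v = cfg_value;
            read_unlock(&cfg_lock);
            return v;
    }

    /* A writer excludes both readers and other writers. */
    static void write_cfg(int v)
    {
            write_lock(&cfg_lock);
            cfg_value = v;
            write_unlock(&cfg_lock);
    }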
