Linux kernel design and implementation--kernel synchronization


Kernel synchronization
Introduction to synchronization
The concept of synchronization

Critical section: also called a critical region, this is a code segment that accesses and manipulates shared data.

Race condition: when two or more threads execute in the same critical section at the same time, the result is a race condition.

Synchronization, then, means preventing race conditions from arising in critical sections.

If a critical section were atomic (that is, it could not be interrupted until the whole operation completed), no race condition could arise. In practice, however, the code in a critical section is rarely that simple, so locking is introduced to maintain synchronization. Locks bring problems of their own.

Conditions for deadlock: there are one or more running threads and one or more resources, each thread is waiting for one of the resources, and all the resources are already held.

The threads wait for one another but never release the resources they already hold, so no thread can make progress and deadlock results.

Self-deadlock: if a running thread tries to acquire a lock it already holds, it has to wait for the lock to be released. Because it is busy waiting for that lock, it never gets the chance to release it, and the result is deadlock.

Starvation is the situation in which a thread cannot run for a long time because it cannot obtain the resources it needs.

Causes of concurrency

Interrupts: an interrupt can occur asynchronously at almost any time, so the code that is currently executing can be interrupted at any time.

Softirqs and tasklets: the kernel can raise or schedule a softirq or tasklet at almost any time, interrupting the code that is currently running.

Kernel preemption: because the kernel is preemptive, one task in the kernel can be preempted by another.

Sleeping and synchronization with user space: a process running in the kernel can sleep, which invokes the scheduler and causes a new process to be scheduled.

Symmetric multiprocessing: two or more processors can run kernel code at the same time.

Simple rules for avoiding deadlocks

The order in which locks are acquired is key.

When locks are nested, they must always be acquired in the same order; this prevents deadlocks of the deadly-embrace type. It is best to document the lock order so that others follow it.

Prevent starvation. Ask whether this code will always finish: if A never happens, will B wait forever?

Do not request the same lock twice.

The more complex the locking scheme, the more likely it is to cause deadlock. Keep the design simple.
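
The lock-ordering rule above can be sketched in user space with pthread mutexes. This is only an illustration, not kernel code: struct account, lock_pair(), unlock_pair() and transfer() are hypothetical names, and ordering by address is just one possible convention for a fixed global order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical account type, used only for illustration. */
struct account {
    pthread_mutex_t lock;
    long balance;
};

/* Always take the two locks in the same global order: lowest address first. */
static void lock_pair(struct account *x, struct account *y)
{
    if ((uintptr_t)x > (uintptr_t)y) {
        struct account *tmp = x;
        x = y;
        y = tmp;
    }
    pthread_mutex_lock(&x->lock);
    pthread_mutex_lock(&y->lock);
}

static void unlock_pair(struct account *x, struct account *y)
{
    pthread_mutex_unlock(&x->lock);
    pthread_mutex_unlock(&y->lock);
}

/* Move amount between two distinct accounts. */
void transfer(struct account *from, struct account *to, long amount)
{
    /* Locking the same non-recursive lock twice would self-deadlock. */
    assert(from != to);

    lock_pair(from, to);
    from->balance -= amount;
    to->balance += amount;
    unlock_pair(from, to);
}
```

Because transfer(&a, &b, ...) and transfer(&b, &a, ...) both end up taking the locks in the same order, two threads running them concurrently cannot each hold one lock while waiting for the other.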

Lock granularity

Lock granularity describes the size of the data that a lock protects. A coarse lock protects a large amount of data, such as the entire data structure of a subsystem. A fine-grained lock protects a small amount of data, such as a single element in a large data structure.

When adding locks, consider not only deadlock avoidance but also lock granularity.

Lock granularity has a large effect on the scalability of the system. When adding a lock, consider whether it will be contended frequently by multiple threads.

If a lock is likely to be highly contended, make it finer-grained.

When contention is high, finer-grained locks improve performance on multiprocessor systems.

Synchronization method
Atomic operation

An atomic operation is one that executes without being interrupted by any other code path. Kernel code can invoke atomic operations safely, knowing they will not be interrupted partway through.

Atomic operations fall into two groups: atomic integer operations and atomic bitwise operations.
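
As a rough user-space analogue of the two groups, the sketch below uses C11 <stdatomic.h> rather than the kernel's own atomic interfaces. The names counter_inc(), test_and_set_flag() and clear_flag() are made up for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Shared counter; a static atomic object starts at zero. */
static atomic_int counter;

/* Shared flag word, one flag per bit; also starts at zero. */
static atomic_ulong flags;

/* Integer operation: atomically add one and return the new value. */
int counter_inc(void)
{
    return atomic_fetch_add(&counter, 1) + 1;
}

/* Bitwise operation: atomically set bit nr, report whether it was already set. */
bool test_and_set_flag(unsigned int nr)
{
    unsigned long mask = 1UL << nr;

    return (atomic_fetch_or(&flags, mask) & mask) != 0;
}

/* Bitwise operation: atomically clear bit nr. */
void clear_flag(unsigned int nr)
{
    atomic_fetch_and(&flags, ~(1UL << nr));
}
```

Each call is a single indivisible read-modify-write, so two threads calling counter_inc() at the same time can never both observe the same old value.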

Spin lock (spinlock)

The defining feature of a spin lock is that while one thread holds the lock, any other thread that tries to acquire it loops (spins), repeatedly checking the lock until it becomes available again.

Because the waiting thread spins on the lock, it wastes processor time, so spin locks are best used for critical sections that complete very quickly.
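
The spinning behaviour can be sketched in user space with a C11 atomic_flag. This is a minimal illustration under that assumption, not the kernel's spinlock_t implementation (the kernel interface is spin_lock() and spin_unlock()); struct spin, SPIN_INIT, spin_try(), spin_acquire() and spin_release() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space spin lock; not the kernel's spinlock_t. */
struct spin {
    atomic_flag held;
};

#define SPIN_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns true if the lock was acquired. */
bool spin_try(struct spin *s)
{
    return !atomic_flag_test_and_set_explicit(&s->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
void spin_acquire(struct spin *s)
{
    while (!spin_try(s))
        ;   /* spin: this loop burns CPU time until the holder releases */
}

/* Release the lock so that one spinning thread can take it. */
void spin_release(struct spin *s)
{
    atomic_flag_clear_explicit(&s->held, memory_order_release);
}
```

Note that spin_try() fails even when the caller itself is the holder, so calling spin_acquire() twice in a row from the same thread would spin forever.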

There are two points to note when using a spin lock:

1. Spin locks are not recursive: a thread that requests a spin lock it already holds deadlocks itself.

2. If the lock is also used by an interrupt handler, a thread must disable interrupts on the current processor before acquiring it, so that the lock holder and the interrupt handler cannot deadlock. For example: the current thread acquires the spin lock; an interrupt arrives while the thread is in the critical section, and the interrupt handler tries to acquire the same lock. The handler spins waiting for the thread to release the lock, while the thread cannot run the rest of the critical section and release the lock until the handler returns.

Take particular care when using spin locks with bottom halves:

1. When a bottom half shares data with process context, the bottom half can preempt process-context code, so process context must disable bottom halves before taking the lock and re-enable them when unlocking.

2. When an interrupt handler (top half) shares data with a bottom half, the interrupt handler can preempt the bottom half, so the bottom half must disable interrupts before taking the lock and re-enable them when unlocking.

3. Two instances of the same tasklet never run at the same time, so data used only within one tasklet needs no protection.

4. When data is shared between different tasklets, take a lock before accessing it. There is no need to disable bottom halves, because one tasklet never preempts another on the same processor.

5. When softirqs, of the same or of different types, share data, protect it with a lock. There is no need to disable bottom halves, because one softirq never preempts another on the same processor.

Read-write spin lock

If the data protected by a critical section is both read and written, reads can safely proceed concurrently as long as no write is in progress.

When only writes need mutual exclusion, an ordinary spin lock is clearly a poor fit (it needlessly serializes readers). For this case the kernel provides another lock, the read-write spin lock. The read side is also called a shared spin lock, and the write side an exclusive spin lock.

A read-write spin lock is a finer-grained mechanism than a plain spin lock. It keeps the "spinning" behaviour, but at most one writer can hold the lock at a time, while any number of readers can hold it concurrently. Reading and writing can never happen at the same time.
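
The shared/exclusive rule can be sketched with a single atomic state word. This is a simplified user-space model written for this article, not the kernel's rwlock_t; struct rwspin and all function names are hypothetical, and the sketch makes no attempt at fairness between readers and writers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical user-space read-write spin lock.
 * state == 0: free; state > 0: number of readers; state == -1: one writer.
 */
struct rwspin {
    atomic_int state;
};

void rwspin_init(struct rwspin *l)
{
    atomic_init(&l->state, 0);
}

/* Try to join as a reader; fails only while a writer holds the lock. */
bool read_try(struct rwspin *l)
{
    int s = atomic_load(&l->state);

    while (s >= 0) {
        /* On failure, s is refreshed with the current state and we retry. */
        if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return true;
    }
    return false;
}

/* Try to become the writer; needs the lock completely free. */
bool write_try(struct rwspin *l)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

void read_acquire(struct rwspin *l)  { while (!read_try(l)) ; }
void write_acquire(struct rwspin *l) { while (!write_try(l)) ; }
void read_release(struct rwspin *l)  { atomic_fetch_sub(&l->state, 1); }
void write_release(struct rwspin *l) { atomic_store(&l->state, 0); }
```

Several readers can hold the lock together, a writer excludes everyone, and both sides still spin rather than sleep.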

Spin locks provide a fast and simple lock implementation. If the lock is held only briefly and the code does not sleep, a spin lock is the best choice. If the lock may be held for a long time, or the code may sleep while holding it, a semaphore is the better choice.

Semaphores

A semaphore in Linux is a sleeping lock. When a task tries to acquire a semaphore that is already held, the semaphore places the task on a wait queue and puts it to sleep, leaving the processor free to run other code. When the process holding the semaphore releases it, one of the tasks on the wait queue is woken and acquires the semaphore.

1) A task contending for a semaphore sleeps while waiting for the lock to become available, so semaphores suit locks that are held for a long time. They are a poor choice for locks held only briefly, because the overhead of sleeping, maintaining the wait queue, and waking up can easily exceed the total time the lock is held.

2) Because a thread sleeps when the lock is contended, a semaphore can be acquired only in process context; interrupt context cannot be scheduled.

3) You can sleep while holding a semaphore. If another process tries to acquire the same semaphore, there is no deadlock (that process simply goes to sleep, and the holder eventually continues running and releases it).

4) You cannot hold a spin lock while acquiring a semaphore, because you may sleep while waiting for the semaphore, and sleeping is not allowed while holding a spin lock.

5) A semaphore allows an arbitrary number of simultaneous holders, whereas a spin lock allows at most one task to hold it at a time.

This is because a semaphore has a count. A count of 5, for example, means that up to 5 threads can be in the critical section at the same time.

If the count is initialized to 1, the semaphore is a binary semaphore, or mutex. A semaphore whose initial count is greater than 1 is called a counting semaphore.

Drivers generally use binary (mutex) semaphores.

Semaphores support two atomic operations, the P and V primitives (also known as the down and up operations):

P (down): if the semaphore's value is greater than 0, decrement it and continue running. Otherwise, sleep until the value is greater than 0.

V (up): increment the value of the semaphore; if the incremented value is greater than 0, wake a waiting task.

The down operation has two versions: one that sleeps interruptibly, down_interruptible(), and one that sleeps uninterruptibly, down().
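
The P and V operations can be modelled in user space with a mutex and a condition variable. The sketch below is an assumption-laden illustration, not the kernel's struct semaphore: struct sema, sema_init_count(), sema_down(), sema_try_down() and sema_up() are names invented here, and the non-blocking variant is added only to make the count visible.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space counting semaphore. */
struct sema {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;
    int count;
};

void sema_init_count(struct sema *s, int count)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = count;
}

/* P / down: sleep until the count is positive, then decrement it. */
void sema_down(struct sema *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Non-blocking variant: returns false instead of sleeping. */
bool sema_try_down(struct sema *s)
{
    bool ok = false;

    pthread_mutex_lock(&s->lock);
    if (s->count > 0) {
        s->count--;
        ok = true;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}

/* V / up: increment the count and wake one sleeper, if there is one. */
void sema_up(struct sema *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}
```

Initialized with a count of 1 this behaves as a mutex; with a count of N it admits up to N holders at once.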

Read-write semaphores

Read-write semaphores relate to semaphores in much the same way that read-write spin locks relate to ordinary spin locks.

Read-write semaphores are binary semaphores, that is, their maximum count is 1. Adding a reader does not change the counter; adding a writer decrements it by one. A critical section protected by a read-write semaphore therefore has at most one writer at a time, but can have many readers.

Waiters on a read-write semaphore always sleep uninterruptibly, so there is only one version of each down operation.

Knowing when to use a spin lock and when to use a semaphore is important for writing good code, but in most cases the choice takes little thought: only a spin lock can be used in interrupt context, and only a semaphore can be held while a task sleeps.

Requirement                            Recommended locking method
Low-overhead locking                   Spin lock preferred
Short lock hold time                   Spin lock preferred
Long lock hold time                    Semaphore preferred
Locking from interrupt context         Spin lock required
Need to sleep while holding the lock   Semaphore required

Completion variables

If one task in the kernel needs to signal another that a specific event has occurred, a completion variable is a simple way to synchronize the two tasks. One task performs some work while the other waits on the completion variable.

When the first task finishes its work, it uses the completion variable to wake the waiting task. For example, when a child process executes or exits, the vfork() system call uses a completion variable to wake the parent process.
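
In the kernel the waiting side calls wait_for_completion() and the signalling side calls complete(). The idea can be sketched in user space with a mutex, a condition variable and a flag. The type and function names below (struct completion_sketch, wait_for_done(), signal_done()) are hypothetical, and unlike the kernel version this sketch keeps a simple boolean and wakes every waiter.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space analogue of a completion variable. */
struct completion_sketch {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool done;
};

void completion_sketch_init(struct completion_sketch *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Called by the waiting task: sleep until the event has been signalled. */
void wait_for_done(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Called by the working task once its work is finished. */
void signal_done(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}
```

Because the flag is checked under the lock, a waiter that arrives after signal_done() has already run returns immediately instead of sleeping forever.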

Seq lock (sequential lock)

This lock provides a very simple mechanism for reading and writing shared data.

Its implementation relies mainly on a sequence counter. When the data in question is written, a lock is obtained and the sequence value is incremented.

The sequence value is read before and after the data is read. If the two values are the same, the read was not interrupted by a write.

In addition, if the value read is even, no write is in progress (the initial value of the lock is 0, so taking the write lock makes the value odd, and releasing it makes the value even again).

Seq locks provide a lightweight, scalable lock when many readers share data with a few writers. Seq locks favour writers: as long as there is no other writer, the write lock can always be acquired. A pending writer makes readers loop (as described above) until no writer holds the lock.
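
The counter protocol described above can be sketched with C11 atomics. This is a simplified user-space model, not the kernel's seqlock_t: struct seq_pair, pair_write() and pair_read() are hypothetical names, and the sketch assumes writers are already serialized against each other (the kernel takes a spin lock on the write side for that).

```c
#include <stdatomic.h>

/* Hypothetical pair of values protected by a sequence counter. */
struct seq_pair {
    atomic_uint seq;   /* even: no write in progress; odd: write in progress */
    atomic_int  x;
    atomic_int  y;
};

/* Writer side; assumes only one writer runs at a time. */
void pair_write(struct seq_pair *p, int x, int y)
{
    unsigned int s = atomic_load_explicit(&p->seq, memory_order_relaxed);

    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);   /* now odd */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->x, x, memory_order_relaxed);
    atomic_store_explicit(&p->y, y, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);   /* even again */
}

/* Reader side: retry until the counter is even and unchanged across the read. */
void pair_read(struct seq_pair *p, int *x, int *y)
{
    unsigned int before, after;

    do {
        before = atomic_load_explicit(&p->seq, memory_order_acquire);
        *x = atomic_load_explicit(&p->x, memory_order_relaxed);
        *y = atomic_load_explicit(&p->y, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&p->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}
```

Readers never block the writer; they only pay the cost of re-reading when a write overlaps with their read.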

Disabling preemption

Because the kernel is preemptive, a process in the kernel can be stopped at any instant so that a higher-priority process can run. This means a task can begin running in the same critical region as a task that was preempted.

To prevent this, the kernel preemption code uses spin locks as markers of non-preemptible regions (on multiprocessor machines, spin locks prevent both true concurrency and kernel preemption). While a spin lock is held, the kernel is not preemptible.

In practice, some situations, such as per-processor data, do not need protection against true concurrency on multiprocessor machines but do need to prevent kernel preemption. They do not require a spin lock, but kernel preemption still has to be disabled.

To solve this, kernel preemption can be disabled with preempt_disable(). The call is nestable and may be made any number of times, but each call must be balanced by a matching preempt_enable(). When the final preempt_enable() is called, kernel preemption is enabled again.

Ordering and barriers

For a given piece of code, the compiler or the processor may reorder operations as an optimization, at compile time or at run time, so the code may execute in a slightly different order from the one we wrote.

Normally this causes no problems. Under concurrency, however, the values observed may differ from what was expected, as in the following code:

    #include <stdio.h>

    /*
     * Thread A and thread B share the variables a and b.
     * Initial values: a = 1, b = 2.
     */
    int a = 1, b = 2;

    /* Thread A writes a and b. */
    void thread_a(void)
    {
        a = 5;
        b = 4;
    }

    /* Thread B reads a and b. */
    void thread_b(void)
    {
        if (b == 4)
            printf("a = %d\n", a);
    }

Because of compiler or processor optimizations, the assignments in thread A may be reordered so that b is assigned before a.

So if thread B starts running after thread A has executed b = 4 but before it has executed a = 5, thread B prints the initial value of a, which is 1.

This is not what we expect. We expect a to be assigned before b, so thread B should either print nothing or, if it prints, print 5 for a.

For cases where the order of execution must be guaranteed, the kernel provides a set of barrier functions that prevent compiler and processor reordering.


Barrier                       Description
rmb()                         Prevents loads from being reordered across the barrier
read_barrier_depends()        Prevents data-dependent loads from being reordered across the barrier
wmb()                         Prevents stores from being reordered across the barrier
mb()                          Prevents loads and stores from being reordered across the barrier
smp_rmb()                     Provides rmb() on SMP and barrier() on UP
smp_read_barrier_depends()    Provides read_barrier_depends() on SMP and barrier() on UP
smp_wmb()                     Provides wmb() on SMP and barrier() on UP
smp_mb()                      Provides mb() on SMP and barrier() on UP
barrier()                     Prevents the compiler from optimizing loads or stores across the barrier

To make the example above work correctly, change thread A to use one of the functions in the table:

    /* Thread A writes a and b. */
    void thread_a(void)
    {
        a = 5;
        mb();
        /*
         * mb() ensures that every load and store issued before it
         * (here, a = 5) completes before the store to b (value 4)
         * that follows it. With the assignment to a ordered before
         * the assignment to b, thread B behaves as expected.
         */
        b = 4;
    }

For the guarantee to hold on every architecture, thread B also needs a matching read barrier, rmb(), between its load of b and its load of a.
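
The same ordering can be sketched in portable user-space C with C11 atomics in place of the kernel barrier functions. This is an analogue, not kernel code: writer() and reader() are hypothetical names, and the release/acquire pair plays the role of the write-side and read-side barriers.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int a = 1;          /* ordinary data, published through b */
static atomic_int b = 2;   /* flag that tells the reader a is ready */

/* Writer: the release store keeps a = 5 ordered before b = 4. */
void writer(void)
{
    a = 5;
    atomic_store_explicit(&b, 4, memory_order_release);
}

/* Reader: returns true and fills *out only once the writer's flag is seen. */
bool reader(int *out)
{
    /* The acquire load keeps the read of a ordered after the read of b. */
    if (atomic_load_explicit(&b, memory_order_acquire) != 4)
        return false;

    *out = a;   /* observes 5, never the initial value 1 */
    return true;
}
```

If the reader sees b equal to 4, the release/acquire pairing guarantees it also sees the earlier write to a.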





