Concurrency and race conditions

Source: Internet
Author: User

Linux drivers: concurrency and race conditions

Sources of races: multiple user-space processes are running, and they may access our code in surprising combinations. SMP systems can even execute our code on different processors at the same time. Kernel code is preemptible, so our driver code can lose the processor at any time.

Semaphore implementation:

A semaphore is used with a pair of functions, usually called P and V: P locks and V unlocks. When a semaphore is used for mutual exclusion, so that only a single execution context can hold it at any moment, it is also called a mutex. Almost all semaphores in the Linux kernel are used for mutual exclusion.

Kernel code must include <asm/semaphore.h> (newer kernels provide this header as <linux/semaphore.h>).

The relevant type is struct semaphore.

Declaration and initialization:

Create and initialize a semaphore directly with the following function:

void sema_init(struct semaphore *sem, int val);

val is the initial value of the semaphore.

Declare a mutex with one of these macros:

DECLARE_MUTEX(name);
DECLARE_MUTEX_LOCKED(name);

The first initializes the semaphore to 1. The second initializes it to 0, so the mutex starts out locked and must be explicitly unlocked before any thread is allowed to access it.

If the mutex must be initialized at run time (for example, when mutexes are allocated dynamically), use one of the following functions:

void init_MUTEX(struct semaphore *sem);
void init_MUTEX_LOCKED(struct semaphore *sem);

In the Linux world, the P function is called down, or some variant of that name. "Down" refers to the fact that the function decrements the semaphore value, which may put the caller to sleep. The caller then waits until the semaphore becomes available and is granted access to the protected resource. There are three versions of down:

void down(struct semaphore *sem);
int down_interruptible(struct semaphore *sem);
int down_trylock(struct semaphore *sem);

down waits as long as necessary. down_interruptible does the same, but allows a user-space process that is waiting on the semaphore to be interrupted by the user; if it is interrupted, it returns a nonzero value and the caller does not hold the semaphore. down_trylock never sleeps; if the semaphore is not available, it returns immediately with a nonzero value.

The V operation is up:

void up(struct semaphore *sem);

The return value of down_interruptible must always be checked. When it is nonzero, the driver typically returns -ERESTARTSYS, and the higher layers of the kernel then either restart the call or return an error to the user.
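The down/up semantics described above have a close user-space analogue in POSIX semaphores, which makes them easy to try outside the kernel. The following is a minimal sketch under that assumption, not kernel code: sem_wait stands in for down, sem_trywait for down_trylock, and sem_post for up. The function name mutex_demo is hypothetical.

```c
#include <errno.h>
#include <semaphore.h>

/*
 * User-space analogy only (POSIX semaphores, not the kernel API):
 *   sem_wait    ~ down          (blocks until the semaphore is available)
 *   sem_trywait ~ down_trylock  (never sleeps)
 *   sem_post    ~ up
 * mutex_demo is a hypothetical name used for illustration.
 * Returns 0 if every step behaved as expected, -1 otherwise.
 */
int mutex_demo(void)
{
    sem_t sem;
    int ok = 1;

    /* Initial value 1: the semaphore acts as a mutex. */
    if (sem_init(&sem, 0, 1) != 0)
        return -1;

    /* "P": take the mutex; this cannot block because the value is 1. */
    if (sem_wait(&sem) != 0)
        ok = 0;

    /* A second, non-blocking attempt must fail while the mutex is held. */
    if (sem_trywait(&sem) == 0 || errno != EAGAIN)
        ok = 0;

    /* "V": release; a non-blocking attempt now succeeds. */
    if (sem_post(&sem) != 0)
        ok = 0;
    if (sem_trywait(&sem) != 0)
        ok = 0;

    sem_post(&sem);
    sem_destroy(&sem);
    return ok ? 0 : -1;
}
```

One difference worth noting: the kernel's down_trylock reports failure with a nonzero return value, whereas sem_trywait returns -1 and sets errno to EAGAIN.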

Reader/writer semaphores:

A reader/writer semaphore (rwsem) is declared in <linux/rwsem.h>; the type is struct rw_semaphore.

An rwsem must be explicitly initialized at run time:

void init_rwsem(struct rw_semaphore *sem);

The interface for read-only access is:

void down_read(struct rw_semaphore *sem);
int down_read_trylock(struct rw_semaphore *sem);
void up_read(struct rw_semaphore *sem);

The interface for writers is:

void down_write(struct rw_semaphore *sem);
int down_write_trylock(struct rw_semaphore *sem);
void up_write(struct rw_semaphore *sem);
void downgrade_write(struct rw_semaphore *sem);

When a writer lock is needed for a quick change that is followed by a longer period of read-only access, downgrade_write can be called once the modification is complete to let other readers in.

Writers have priority: once a writer tries to enter the critical section, no readers are allowed in until all writers have finished their work. Rwsems are therefore best used when write access is rarely needed and writers hold the semaphore only briefly.


A common pattern in kernel programming is to initiate some activity outside the current thread and then wait for that activity to finish. The activity might be the creation of a new kernel thread or user-space process, a request to an existing process, or some kind of hardware action. One way to do this is with a semaphore:

struct semaphore sem;

init_MUTEX_LOCKED(&sem);
start_external_task(&sem);
down(&sem);

When the external task finishes its work, it calls up(&sem).

A semaphore is not the best tool for this situation, however. In normal use, code that tries to lock a semaphore finds it available almost all the time. If there is heavy contention for a semaphore, performance suffers and the locking scheme needs to be reviewed, so semaphores have been heavily optimized for the "available" case. When a semaphore is used as above to signal task completion, the thread calling down almost always has to wait, and performance suffers accordingly. A semaphore declared as an automatic variable and used this way is also subject to a hard-to-handle race: in some cases, the semaphore can vanish before the process calling up has finished with it.

These concerns led to the completion interface, a lightweight mechanism that allows one thread to tell another that a job is done. To use completions, the code must include <linux/completion.h>. A completion can be created statically with:

DECLARE_COMPLETION(my_completion);

or created and initialized dynamically:

struct completion my_completion;

init_completion(&my_completion);

Waiting for the completion is a call to:

void wait_for_completion(struct completion *c);

This function performs an uninterruptible wait. If the code calls wait_for_completion and nobody ever completes the task, the result is an unkillable process.

The completion event is signaled with one of the following functions:

void complete(struct completion *c);
void complete_all(struct completion *c);

complete wakes up only one waiting thread; complete_all wakes up all waiting threads.

A completion is usually a one-shot device: it is used once and then discarded. With care, however, a completion structure can be reused. If complete_all is not used, the structure can be reused as long as there is no ambiguity about which event is being signaled. If complete_all is used, the structure must be reinitialized before reuse; the following macro performs this reinitialization quickly:

INIT_COMPLETION(struct completion c);

A typical use of the completion mechanism is kernel thread termination at module exit. In this pattern, some of the driver's internal work is done by a kernel thread running a while (1) loop. When the module is about to be cleaned up, the exit function tells the thread to quit and then waits for the completion. For this purpose, the kernel provides a function for the thread to call:

void complete_and_exit(struct completion *c, long retval);
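To make the "start a task, then wait for it" pattern concrete, here is a user-space sketch of a completion built from a pthread mutex and condition variable. It is an illustration under stated assumptions, not the kernel's implementation: every name ending in _sketch, plus struct task, external_task, and run_task_and_wait, is hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct completion. */
struct completion_sketch {
    pthread_mutex_t lock;
    pthread_cond_t  wait;
    unsigned int    done;   /* completions signaled but not yet consumed */
};

static void init_completion_sketch(struct completion_sketch *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wait, NULL);
    c->done = 0;
}

/* Like wait_for_completion: block until a completion has been signaled. */
static void wait_for_completion_sketch(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->wait, &c->lock);
    c->done--;              /* consume one completion */
    pthread_mutex_unlock(&c->lock);
}

/* Like complete: record the event and wake one waiter. */
static void complete_sketch(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

struct task {
    struct completion_sketch done;
    int result;
};

/* The "external task": produce a result, then signal completion. */
static void *external_task(void *arg)
{
    struct task *t = arg;

    t->result = 42;
    complete_sketch(&t->done);
    return NULL;
}

int run_task_and_wait(void)
{
    struct task t = { .result = 0 };
    pthread_t tid;
    int result;

    init_completion_sketch(&t.done);
    if (pthread_create(&tid, NULL, external_task, &t) != 0)
        return -1;

    wait_for_completion_sketch(&t.done);
    result = t.result;

    /* t lives on this stack frame, so the worker must be finished
     * with it before this function returns. */
    pthread_join(tid, NULL);
    return result;
}
```

The join before returning addresses the automatic-variable race mentioned earlier: the structure must not disappear while the signaling thread may still be touching it.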

Spinlocks:

A spinlock can be used in code that cannot sleep. It is a mutual exclusion device that can have only two values: "locked" and "unlocked". The "test and set" operation must be performed atomically, so that only one thread can obtain the lock even if several are spinning at the same time. While a processor is spinning on a lock, it executes a busy loop and does no useful work.

Spinlock API:

The header is <linux/spinlock.h> and the type is spinlock_t. A spinlock can be initialized at compile time with:

spinlock_t my_lock = SPIN_LOCK_UNLOCKED;

or at run time with:

void spin_lock_init(spinlock_t *lock);

Before entering a critical section, the code must obtain the lock with:

void spin_lock(spinlock_t *lock);

To release the lock:

void spin_unlock(spinlock_t *lock);

Rules for using spinlocks:

Never sleep while holding a spinlock; doing so can hurt performance or deadlock the system. The core rule is that any code holding a spinlock must be atomic. It cannot sleep; in fact, it cannot give up the processor for any reason except to service interrupts (and sometimes not even then).

Kernel preemption is handled by the spinlock code itself. Any time kernel code holds a spinlock, preemption is disabled on the relevant processor. Even uniprocessor systems must disable preemption in this way to avoid races. That is why correct locking is required even if we never expect our code to run on a multiprocessor machine.

Many kernel functions can sleep, so pay attention to every function called while the lock is held.

Spinlocks must be held for the shortest time possible.

Spinlock functions:

There are actually four functions that can lock a spinlock:

void spin_lock(spinlock_t *lock);
void spin_lock_irqsave(spinlock_t *lock, unsigned long flags);
void spin_lock_irq(spinlock_t *lock);
void spin_lock_bh(spinlock_t *lock);

spin_lock_irqsave disables interrupts (on the local processor only) before taking the spinlock; the previous interrupt state is stored in flags. If we are certain that nothing else has already disabled interrupts on the local processor (in other words, that interrupts should be enabled when the spinlock is released), we can use spin_lock_irq and need not keep track of the flags. Finally, spin_lock_bh disables software interrupts before taking the lock but leaves hardware interrupts enabled.

A spinlock that can be taken by code running in (hardware or software) interrupt context must be acquired with a form of spin_lock that disables interrupts; any other form will sooner or later deadlock the system. If the lock is never accessed from a hardware interrupt handler but may be accessed from a software interrupt (for example, code that runs as a tasklet), spin_lock_bh safely avoids deadlocks while still allowing hardware interrupts to be serviced.

Releasing a spinlock: each lock function has a strictly corresponding unlock function:

void spin_unlock(spinlock_t *lock);
void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags);
void spin_unlock_irq(spinlock_t *lock);
void spin_unlock_bh(spinlock_t *lock);

Each spin_unlock variant undoes the work performed by the corresponding spin_lock function. The flags argument passed to spin_unlock_irqrestore must be the same variable that was passed to spin_lock_irqsave, and spin_lock_irqsave and spin_unlock_irqrestore must be called in the same function; otherwise, the code may break on some architectures.

The following spinlock operations are nonblocking:

int spin_trylock(spinlock_t *lock);
int spin_trylock_bh(spinlock_t *lock);

There is no try version that disables interrupts.
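The atomic "test and set" at the heart of a spinlock can be sketched in user space with C11's atomic_flag. This is a teaching sketch, not the kernel's spinlock_t: it has no notion of preemption or interrupts, and all names ending in _sketch, along with add_many and spin_demo, are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space spinlock; the kernel's spinlock_t differs. */
struct spin_sketch {
    atomic_flag locked;
};

static void spin_lock_sketch(struct spin_sketch *l)
{
    /* Atomic test-and-set: keep looping while the old value was "set". */
    while (atomic_flag_test_and_set_explicit(&l->locked,
                                             memory_order_acquire))
        ;   /* busy-wait: the CPU does no useful work here */
}

/* Nonzero on success, 0 if the lock was already held. */
static int spin_trylock_sketch(struct spin_sketch *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

static void spin_unlock_sketch(struct spin_sketch *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

#define ITERATIONS 100000

static struct spin_sketch counter_lock = { ATOMIC_FLAG_INIT };
static long counter;

static void *add_many(void *unused)
{
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++) {
        spin_lock_sketch(&counter_lock);
        counter++;      /* critical section: short, and never sleeps */
        spin_unlock_sketch(&counter_lock);
    }
    return NULL;
}

/* Two threads increment a shared counter; returns the final value. */
long spin_demo(void)
{
    pthread_t a, b;

    counter = 0;
    if (pthread_create(&a, NULL, add_many, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, add_many, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

Without the lock, the two threads' increments could interleave and lose updates; with it, the final count is exactly twice ITERATIONS.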

Reader/writer spinlocks:

The type is rwlock_t, defined in <linux/spinlock.h>.

There are two ways to declare and initialize one:

rwlock_t my_rwlock = RW_LOCK_UNLOCKED; /* static */

rwlock_t my_rwlock;
rwlock_init(&my_rwlock); /* dynamic */

Functions available to readers:

void read_lock(rwlock_t *lock);
void read_lock_irqsave(rwlock_t *lock, unsigned long flags);
void read_lock_irq(rwlock_t *lock);
void read_lock_bh(rwlock_t *lock);

void read_unlock(rwlock_t *lock);
void read_unlock_irqrestore(rwlock_t *lock, unsigned long flags);
void read_unlock_irq(rwlock_t *lock);
void read_unlock_bh(rwlock_t *lock);

There is no read_trylock.

Functions available to writers:

void write_lock(rwlock_t *lock);
void write_lock_irqsave(rwlock_t *lock, unsigned long flags);
void write_lock_irq(rwlock_t *lock);
void write_lock_bh(rwlock_t *lock);
int write_trylock(rwlock_t *lock);

void write_unlock(rwlock_t *lock);
void write_unlock_irqrestore(rwlock_t *lock, unsigned long flags);
void write_unlock_irq(rwlock_t *lock);
void write_unlock_bh(rwlock_t *lock);

Points to note when using locks:
The locking scheme must be designed in from the beginning;

A function that holds a lock must never call another function that tries to acquire the same lock, or the system hangs;

Sometimes a function must be written in two versions, one that takes the lock and one that assumes it is already held, and both must be clearly documented;

When multiple locks must be acquired, always acquire them in the same order;

Avoid needing multiple locks where possible. If both a local lock and a lock belonging to a more central part of the kernel are needed, take the local lock first; if both a semaphore and a spinlock are needed, take the semaphore first;

If we suspect that lock contention is hurting performance, the lockmeter tool can help; it is a patch that measures how much time the kernel spends in locks.
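The "always acquire locks in the same order" rule from the list above can be illustrated in user space. In this sketch, two threads move units between two accounts in opposite directions; because both always lock the lower-addressed account first, they cannot deadlock. Ordering by address is one common convention, chosen here as an assumption, and every name in the example is hypothetical.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define ROUNDS 100000

struct account {
    pthread_mutex_t lock;
    long balance;
};

static struct account acct_a = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static struct account acct_b = { PTHREAD_MUTEX_INITIALIZER, 1000 };

/* Move one unit; the two locks are always taken in address order. */
static void transfer_one(struct account *from, struct account *to)
{
    struct account *first =
        ((uintptr_t)from < (uintptr_t)to) ? from : to;
    struct account *second = (first == from) ? to : from;

    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
    from->balance--;
    to->balance++;
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
}

static void *a_to_b(void *unused)
{
    (void)unused;
    for (int i = 0; i < ROUNDS; i++)
        transfer_one(&acct_a, &acct_b);
    return NULL;
}

static void *b_to_a(void *unused)
{
    (void)unused;
    for (int i = 0; i < ROUNDS; i++)
        transfer_one(&acct_b, &acct_a);
    return NULL;
}

/* Runs both directions concurrently; returns the combined balance. */
long ordering_demo(void)
{
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, a_to_b, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, b_to_a, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return acct_a.balance + acct_b.balance;
}
```

If transfer_one instead locked from before to, one thread could hold acct_a while waiting for acct_b and the other the reverse, which is the classic deadlock the rule is meant to prevent.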

Alternatives to locking:

Circular buffers:
A generic circular buffer implementation is available in <linux/kfifo.h>.

Atomic variables:
The kernel provides an atomic integer type, atomic_t, defined in <asm/atomic.h>.

There are two ways to initialize one:

void atomic_set(atomic_t *v, int i); /* at run time */
atomic_t v = ATOMIC_INIT(0); /* at compile time */

int atomic_read(atomic_t *v);

Returns the current value of v.

void atomic_add(int i, atomic_t *v);

Adds i to v; there is no return value.

void atomic_sub(int i, atomic_t *v);

Subtracts i from v; there is no return value.

void atomic_inc(atomic_t *v);
void atomic_dec(atomic_t *v);

Increment or decrement v.

int atomic_inc_and_test(atomic_t *v);
int atomic_dec_and_test(atomic_t *v);
int atomic_sub_and_test(int i, atomic_t *v);

Perform the indicated operation and test the result; the return value is true if the atomic value is 0 after the operation.

Note that there is no atomic_add_and_test function.

int atomic_add_negative(int i, atomic_t *v);

Adds the integer i to v; the return value is true if the result is negative, false otherwise.

int atomic_add_return(int i, atomic_t *v);
int atomic_sub_return(int i, atomic_t *v);
int atomic_inc_return(atomic_t *v);
int atomic_dec_return(atomic_t *v);

Behave like atomic_add and friends, but also return the new value to the caller.

atomic_t data items must be accessed only through these functions; passing an atomic item to a function that expects an integer results in a compiler error.
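A typical use of the "and test" variants is reference counting: the thread that drops the count to zero is the one that frees the object. The sketch below shows the idea with C11 <stdatomic.h> in user space; it is an analogy for the atomic_t calls above rather than kernel code, and struct object and the object_* and refcount_demo names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object using C11 atomics. */
struct object {
    atomic_int refcount;
};

/* Roughly atomic_set(&v, 1): the creator holds the first reference. */
static void object_init(struct object *o)
{
    atomic_init(&o->refcount, 1);
}

/* Roughly atomic_inc(&v). */
static void object_get(struct object *o)
{
    atomic_fetch_add(&o->refcount, 1);
}

/*
 * Roughly atomic_dec_and_test(&v): returns true when the count reaches 0.
 * atomic_fetch_sub returns the value held before the subtraction, so the
 * last reference is the one that observes 1.
 */
static bool object_put(struct object *o)
{
    return atomic_fetch_sub(&o->refcount, 1) == 1;
}

/* Returns the final count, or -1 if a step misbehaved. */
int refcount_demo(void)
{
    struct object o;

    object_init(&o);        /* count is 1 */
    object_get(&o);         /* count is 2 */

    if (object_put(&o))     /* count is 1: not the last reference */
        return -1;
    if (!object_put(&o))    /* count is 0: the last reference is gone */
        return -1;

    return atomic_load(&o.refcount);
}
```

Because the decrement and the test are a single atomic step, two threads dropping their references at the same time cannot both conclude that they were the last holder.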

Bit operations:

Atomic bit operations are declared in <linux/bitops.h>.

Available operations:

void set_bit(nr, void *addr);

Sets bit number nr in the data item pointed to by addr.

void clear_bit(nr, void *addr);

Clears the specified bit.

void change_bit(nr, void *addr);

Toggles the specified bit.

test_bit(nr, void *addr);

This is the only bit operation that does not need to be atomic; it simply returns the current value of the bit.

int test_and_set_bit(nr, void *addr);
int test_and_clear_bit(nr, void *addr);
int test_and_change_bit(nr, void *addr);

Modify the bit atomically like the functions above, and also return its previous value.
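The "modify and return the previous value" behavior maps naturally onto C11's atomic fetch-and-modify operations. The following user-space sketch works on a single unsigned long word and assumes nr is smaller than the number of bits in that word; the names ending in _sketch and bit_demo are hypothetical and are not the kernel functions.

```c
#include <stdatomic.h>

/*
 * Hypothetical single-word versions of the bit operations.
 * nr must be smaller than the number of bits in an unsigned long.
 */
static int test_and_set_bit_sketch(unsigned int nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << nr;

    /* atomic_fetch_or returns the word as it was before the OR. */
    return (atomic_fetch_or(addr, mask) & mask) != 0;
}

static int test_and_clear_bit_sketch(unsigned int nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << nr;

    /* atomic_fetch_and returns the word as it was before the AND. */
    return (atomic_fetch_and(addr, ~mask) & mask) != 0;
}

static int test_bit_sketch(unsigned int nr, atomic_ulong *addr)
{
    return (int)((atomic_load(addr) >> nr) & 1UL);
}

/* Returns 1 if every step saw the expected previous value, else 0. */
int bit_demo(void)
{
    atomic_ulong flags = 0;

    int first   = test_and_set_bit_sketch(3, &flags);   /* bit was clear */
    int second  = test_and_set_bit_sketch(3, &flags);   /* bit was set   */
    int cleared = test_and_clear_bit_sketch(3, &flags); /* bit was set   */
    int now     = test_bit_sketch(3, &flags);           /* clear again   */

    return first == 0 && second == 1 && cleared == 1 && now == 0;
}
```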


The 2.6 kernel contains two new mechanisms that provide fast, lock-free access to shared resources. The first, the seqlock, can be used when the resource to be protected is small, simple, and frequently accessed, and when write access is rare but must be fast.

Essentially, a seqlock allows readers free access to the resource but requires them to check for collisions with writers and, when a collision occurs, to retry their access. Seqlocks generally cannot be used to protect data structures that contain pointers, because a reader may follow an invalid pointer while the writer is modifying the structure.

Seqlocks are defined in <linux/seqlock.h>; the type is seqlock_t.

There are two ways to initialize one:

seqlock_t lock1 = SEQLOCK_UNLOCKED;

seqlock_t lock2;
seqlock_init(&lock2);

Read access works by obtaining an (unsigned) integer sequence value on entry to the critical section. On exit, that value is compared with the current one; if they differ, the read access must be retried. Reader code therefore looks like this:

unsigned int seq;

do {
    seq = read_seqbegin(&the_lock);
    /* Do what you need to do */
} while (read_seqretry(&the_lock, seq));

If the seqlock is used in an interrupt handler, use the IRQ-safe versions instead:

unsigned int read_seqbegin_irqsave(seqlock_t *lock, unsigned long flags);
int read_seqretry_irqrestore(seqlock_t *lock, unsigned int seq, unsigned long flags);

A writer must obtain an exclusive lock to enter a critical section protected by a seqlock. To do so, call:

void write_seqlock(seqlock_t *lock);

The write lock is implemented with a spinlock, so all the usual spinlock constraints apply. To release the lock, call:

void write_sequnlock(seqlock_t *lock);

The usual spinlock variants are available:

void write_seqlock_irqsave(seqlock_t *lock, unsigned long flags);
void write_seqlock_irq(seqlock_t *lock);
void write_seqlock_bh(seqlock_t *lock);

void write_sequnlock_irqrestore(seqlock_t *lock, unsigned long flags);
void write_sequnlock_irq(seqlock_t *lock);
void write_sequnlock_bh(seqlock_t *lock);

There is also a write_tryseqlock, which returns nonzero if it was able to obtain the spinlock.
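The reader retry loop is easier to follow next to a working model. Below is a user-space sketch of the sequence-counter idea with a single writer, using C11 atomics with the default sequentially consistent ordering for simplicity. It is not the kernel's seqlock_t: there is no writer-side spinlock because only one writer exists, and struct seq_pair, pair_write, pair_read, and seqlock_demo are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define WRITES 100000

/* Hypothetical single-writer sequence-protected pair. */
struct seq_pair {
    atomic_uint seq;    /* odd while a write is in progress */
    atomic_int  x, y;   /* protected data; invariant: y == 2 * x */
};

static struct seq_pair shared;  /* static storage: starts as all zeros */

static void pair_write(struct seq_pair *p, int x, int y)
{
    atomic_fetch_add(&p->seq, 1);   /* sequence becomes odd */
    atomic_store(&p->x, x);
    atomic_store(&p->y, y);
    atomic_fetch_add(&p->seq, 1);   /* sequence becomes even again */
}

static void pair_read(struct seq_pair *p, int *x, int *y)
{
    unsigned int start;

    do {
        start = atomic_load(&p->seq);
        *x = atomic_load(&p->x);
        *y = atomic_load(&p->y);
        /* Retry if a write was in progress or happened meanwhile. */
    } while ((start & 1u) || atomic_load(&p->seq) != start);
}

static void *writer(void *unused)
{
    (void)unused;
    for (int i = 1; i <= WRITES; i++)
        pair_write(&shared, i, 2 * i);
    return NULL;
}

/* Returns the number of inconsistent snapshots the reader observed. */
int seqlock_demo(void)
{
    pthread_t tid;
    int x, y, torn = 0;

    if (pthread_create(&tid, NULL, writer, NULL) != 0)
        return -1;

    do {
        pair_read(&shared, &x, &y);
        if (y != 2 * x)
            torn++;
    } while (x != WRITES);

    pthread_join(tid, NULL);
    return torn;
}
```

The reader never blocks the writer; it only pays for a retry when a write overlaps its read, which is why the scheme suits data that is read often and written rarely.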


Read-copy-update (RCU) is well known but rarely used in drivers. It is optimized for situations where reads are common and writes are rare. The protected resources should be accessed through pointers, and references to those resources must be held only by atomic code. When the data structure needs to be changed, the writing thread makes a copy, modifies the copy, and then aims the relevant pointer at the new version. The old version can be freed once the kernel is sure that no references to it remain.

RCU is more complex than the mechanisms above; the header file <linux/rcupdate.h> is the recommended reference for understanding the interface.


This article is from the "No Front" blog, please be sure to keep this source
