Linux kernel locking mechanisms: atomic operations and spin locks

Last Update:2014-06-30 Source: Internet

Author: User

Go to: http://blog.sina.com.cn/s/blog_6d7fa49b01014q7p.html

Many people ask why the Linux kernel provides so many synchronization and locking mechanisms. The root cause is that multiple execution contexts in the operating system access shared resources concurrently, which leads to races between them. The sources of concurrency include contention between cores on the SMP systems we are familiar with and, on a single CPU, contention between interrupts and processes and preemption among processes.

As Figure 1 shows, our ideal for a program is always that things happen in order: process 1 completes its operation on the critical section, and only then does process 2 operate on it. Reality is often crueler. While process 1 is executing, process 2 may well cut in, so that both processes read and write the critical section at the same time. Concurrent reads are harmless, but concurrent writes are a serious problem, and the result is often not what we want.

Figure 1 A simple example
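To make the lost-update problem concrete, here is a minimal userspace sketch in C. It does not use real processes; it replays by hand the two orderings described above, so the outcome is deterministic. All names (shared_counter, run_serialized, run_interleaved) are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Shared resource that two hypothetical processes both update. */
static int shared_counter;

/* A plain increment is really three steps: load, add one, store. */
static int load_step(void)         { return shared_counter; }
static void store_step(int value)  { shared_counter = value; }

/* Ideal case: process 1 finishes its update before process 2 starts. */
int run_serialized(void)
{
    shared_counter = 0;
    int p1 = load_step();  store_step(p1 + 1);   /* process 1 */
    int p2 = load_step();  store_step(p2 + 1);   /* process 2 */
    return shared_counter;
}

/* Racy case: process 2 cuts in after process 1 has loaded but not yet stored. */
int run_interleaved(void)
{
    shared_counter = 0;
    int p1 = load_step();   /* process 1 reads 0 */
    int p2 = load_step();   /* process 2 also reads 0 */
    store_step(p1 + 1);     /* process 1 writes 1 */
    store_step(p2 + 1);     /* process 2 writes 1 again: one update is lost */
    return shared_counter;
}

/* Both updates survive only when the two sequences do not overlap. */
void demo_race(void)
{
    assert(run_serialized() == 2);
    assert(run_interleaved() == 1);
}
```

The locking mechanisms discussed in the rest of this article exist to rule out the second ordering.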

Therefore we need a solution. The Linux kernel provides the following locking mechanisms for use in different situations, alone or in combination: atomic operations, spin locks, memory barriers, read-write spin locks, sequential locks, semaphores, read-write semaphores, completions, the RCU mechanism, the BKL (big kernel lock), and so on. The author will discuss these mechanisms over five blog posts. The kernel source version referred to in this article is Linux 3.3.1. Let's start with atomic operations and spin locks.

1. Atomic operations

An atomic operation is one whose instruction is guaranteed to execute atomically, that is, without being interrupted partway through. The kernel offers atomic integer operations and atomic bit operations, defined in include/linux/types.h and arch/x86/include/asm/bitops.h respectively. To get to know something, we usually start with how it is used, so let's first look at some of the interface functions the kernel provides. The integer atomic operation functions are shown in Figure 1.1; each addition operation listed there has a corresponding subtraction operation in the kernel.

Figure 1.1 Integer atomic operation functions in the kernel
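As a rough illustration of what such an interface looks like, the following userspace sketch mirrors a few of the kernel's integer atomic functions using C11 <stdatomic.h>. It is an analogue under stated assumptions, not the kernel implementation: the uatomic_ names and the uatomic_t type are hypothetical stand-ins for the kernel's atomic_ functions and atomic_t, and only the argument order is meant to resemble the kernel's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's atomic_t. */
typedef struct { atomic_int counter; } uatomic_t;

void uatomic_set(uatomic_t *v, int i) { atomic_store(&v->counter, i); }
int  uatomic_read(uatomic_t *v)       { return atomic_load(&v->counter); }

/* Each addition has a matching subtraction. */
void uatomic_add(int i, uatomic_t *v) { atomic_fetch_add(&v->counter, i); }
void uatomic_sub(int i, uatomic_t *v) { atomic_fetch_sub(&v->counter, i); }
void uatomic_inc(uatomic_t *v)        { atomic_fetch_add(&v->counter, 1); }
void uatomic_dec(uatomic_t *v)        { atomic_fetch_sub(&v->counter, 1); }

/* Decrement and report whether the new value is zero. */
bool uatomic_dec_and_test(uatomic_t *v)
{
    return atomic_fetch_sub(&v->counter, 1) == 1;
}

/* Add and return the new value. */
int uatomic_add_return(int i, uatomic_t *v)
{
    return atomic_fetch_add(&v->counter, i) + i;
}

/* Small usage example. */
void demo_uatomic(void)
{
    uatomic_t v;
    uatomic_set(&v, 1);
    uatomic_inc(&v);                    /* 2 */
    uatomic_dec(&v);                    /* 1 */
    assert(uatomic_dec_and_test(&v));   /* reaches 0 */
}
```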

Figure 1.2 shows the main atomic bit operation functions available in the kernel. The kernel also provides a set of non-atomic bit operation functions corresponding to these, whose names carry two leading underscores (for example __set_bit()). Because they do not guarantee atomicity, they can run faster.

Figure 1.2 Atomic bit operation functions in the kernel
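The sketch below shows, in userspace C11, how atomic bit operations on a bitmap can be expressed, alongside one non-atomic counterpart. The u_ prefixed names and nonatomic_set_bit are hypothetical; they only imitate the shape of kernel functions such as set_bit() and test_and_set_bit() and are not the kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define U_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Locate the word and the mask for bit number nr in a bitmap. */
static atomic_ulong *word_of(atomic_ulong *addr, unsigned nr)
{
    assert(addr != NULL);
    return addr + nr / U_BITS_PER_LONG;
}
static unsigned long mask_of(unsigned nr)
{
    return 1UL << (nr % U_BITS_PER_LONG);
}

void u_set_bit(unsigned nr, atomic_ulong *addr)
{
    atomic_fetch_or(word_of(addr, nr), mask_of(nr));
}
void u_clear_bit(unsigned nr, atomic_ulong *addr)
{
    atomic_fetch_and(word_of(addr, nr), ~mask_of(nr));
}
void u_change_bit(unsigned nr, atomic_ulong *addr)
{
    atomic_fetch_xor(word_of(addr, nr), mask_of(nr));
}
bool u_test_bit(unsigned nr, atomic_ulong *addr)
{
    return (atomic_load(word_of(addr, nr)) & mask_of(nr)) != 0;
}
/* Set the bit and return its previous value, as one atomic step. */
bool u_test_and_set_bit(unsigned nr, atomic_ulong *addr)
{
    return (atomic_fetch_or(word_of(addr, nr), mask_of(nr)) & mask_of(nr)) != 0;
}
/* Clear the bit and return its previous value, as one atomic step. */
bool u_test_and_clear_bit(unsigned nr, atomic_ulong *addr)
{
    return (atomic_fetch_and(word_of(addr, nr), ~mask_of(nr)) & mask_of(nr)) != 0;
}

/* Non-atomic counterpart: a plain read-modify-write, cheaper but unsafe
 * if another context can touch the same word at the same time. */
void nonatomic_set_bit(unsigned nr, unsigned long *addr)
{
    addr[nr / U_BITS_PER_LONG] |= mask_of(nr);
}
```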

Figure 1.3 shows a concrete example of atomic operations. Note the part in bold: its purpose is to ensure that the device can be opened by only one process at a time. With the comments, it should not be hard to follow.

Figure 1.3 Atomic operation sample program
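A common way to get this "only one opener" behaviour is an atomic counter that starts at 1 and is decremented on open. The following is a userspace sketch of that idea with C11 atomics; it is an assumption about the general pattern, not a copy of the article's example, and the names device_available, demo_open_atomic and demo_release_atomic are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* 1 means the device is free, 0 means it is in use. */
static atomic_int device_available = 1;

/* Try to open the device; only the caller that moves 1 -> 0 succeeds. */
int demo_open_atomic(void)
{
    if (atomic_fetch_sub(&device_available, 1) != 1) {
        /* Someone else already holds the device: undo our decrement. */
        atomic_fetch_add(&device_available, 1);
        return -EBUSY;
    }
    return 0;
}

/* Give the device back so that the next open can succeed. */
void demo_release_atomic(void)
{
    /* Only the current holder may release, so the counter is not 1 here. */
    assert(atomic_load(&device_available) < 1);
    atomic_fetch_add(&device_available, 1);
}
```

Because the decrement and the test happen as a single atomic step, two callers racing on open cannot both observe the device as free.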

A few points about atomic operations that I consider important:

1. Atomic operations are implemented differently on different architectures, and the implementation is essentially in assembly.
2. The set of integer atomic functions above is for 32-bit values only; the kernel has a separate set of functions for 64-bit values.
3. For SMP systems the kernel also provides the local_t data type, which implements integer atomic operations that are atomic with respect to a single CPU. Its interface functions simply replace the atomic_ prefix with local_; for the definitions, see arch/x86/include/asm/local.h.

Next, look at the core of the implementation, shown in Figure 1.4. Due to space constraints, the intermediate assembly for SMP is omitted here; interested readers can consult the kernel source.

Figure 1.4 Implementation core of atomic operation

As can be seen, on an SMP system the core of the implementation is the lock prefix, while on a single-CPU system the prefix degrades to nothing: with only one CPU there is no other CPU that could interfere while the instruction executes, so the lock prefix is not needed on non-SMP systems. The SMP case is discussed below. First, let us understand lock on x86. lock is a prefix that can be combined with certain other instructions; it keeps the bus lock signal asserted until the instruction it is attached to has finished executing. This prevents useful data from being corrupted when the CPU works alongside other processors. It has no effect on interrupts, because interrupts can only be taken between instructions. In short, the lock prefix keeps control of the system bus until the whole instruction has executed.
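The idea of a prefix that is present on SMP and empty on a single CPU can be sketched as follows. This is a hedged illustration modelled on the description above, not the kernel source: DEMO_SMP, DEMO_LOCK_PREFIX, demo_atomic_t and demo_atomic_add are hypothetical names, the inline assembly assumes GCC or Clang on x86, and a compiler builtin is used as a fallback on other architectures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for an SMP build option. */
#define DEMO_SMP 1

#if DEMO_SMP
#define DEMO_LOCK_PREFIX "lock; "   /* SMP: lock the bus for this instruction */
#else
#define DEMO_LOCK_PREFIX ""         /* single CPU: the prefix disappears */
#endif

typedef struct { int counter; } demo_atomic_t;

/* Add i to v->counter with one (optionally lock-prefixed) instruction. */
void demo_atomic_add(int i, demo_atomic_t *v)
{
    assert(v != NULL);
#if defined(__x86_64__) || defined(__i386__)
    /* AT&T syntax: source operand first, destination second. */
    __asm__ __volatile__(DEMO_LOCK_PREFIX "addl %1,%0"
                         : "+m" (v->counter)
                         : "ir" (i));
#else
    /* Assumption: on non-x86 targets fall back to a compiler builtin. */
    __atomic_fetch_add(&v->counter, i, __ATOMIC_SEQ_CST);
#endif
}
```

With DEMO_SMP set to 0 the same function compiles to a plain addl, which matches the point that the prefix is only needed when other processors can touch the same memory.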

Once we know what the lock prefix does, it is obvious why it is used for atomic operations. Note, however, that the lock prefix applies only to the single instruction it precedes, on the CPU that executes it. It also has a cost. From the hardware point of view, the cores have to communicate with each other so that a modification made by one core becomes visible to the others, so excessive use of lock-prefixed instructions will inevitably reduce system performance.

That essentially concludes the discussion of atomic operations. To sum up, their advantage is simplicity, but their shortcoming is just as clear: they can only protect simple counting and bit operations, which is too little, as the interface functions themselves show.

2. Spin locks

Next, spin locks. The definition can be stated as follows: when a process attempts to take a lock that is already in the "locked" state, the process keeps "spinning", testing the state of the lock in a busy loop until it succeeds in acquiring the lock. The spin lock is defined in include/linux/spinlock_types.h, and its core structure and members are shown in Figure 2.1.

Figure 2.1 Spin lock core structure and members

Let us first look at the functions the spin lock provides. They are declared in include/linux/spinlock.h, and some of them are shown in Figure 2.2.

Figure 2.2 Partial interface functions provided by spin lock

As before, here is an example; this one protects the device_count variable, as shown in Figure 2.3. Again, pay attention to the part in bold. Studying this example closely will also help with sequential locks later: when you get there, you will see that this is in fact the core idea behind the sequential lock implementation.

Figure 2.3 Spin Lock sample program
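The pattern of protecting a counter such as device_count with a spin lock can be sketched in userspace C11 as below. This is only an analogue: the lock is a simple test-and-set loop on an atomic_flag rather than the kernel's spinlock_t, and the demo_ names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal test-and-set spin lock (userspace stand-in for spinlock_t). */
typedef struct { atomic_flag locked; } demo_spinlock_t;

static demo_spinlock_t device_lock = { ATOMIC_FLAG_INIT };
static int device_count;   /* protected by device_lock */

void demo_spin_lock(demo_spinlock_t *l)
{
    /* Spin until the flag was previously clear, i.e. until we set it. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;
}
bool demo_spin_trylock(demo_spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}
void demo_spin_unlock(demo_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Allow a single opener; the check and the increment happen under the lock. */
int demo_open_locked(void)
{
    demo_spin_lock(&device_lock);
    if (device_count) {
        demo_spin_unlock(&device_lock);
        return -EBUSY;
    }
    device_count++;
    demo_spin_unlock(&device_lock);
    return 0;
}

void demo_release_locked(void)
{
    demo_spin_lock(&device_lock);
    assert(device_count > 0);   /* only an opener may release */
    device_count--;
    demo_spin_unlock(&device_lock);
}
```

Note that the lock is released on every path out of demo_open_locked(), including the early -EBUSY return; forgetting that is one of the ways to produce the deadlocks discussed later.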

The spin lock example above should not be hard to understand. Let us now dive into the core source of locking and unlocking to see how they are implemented. First, on a single CPU the mechanism is really just disabling and enabling preemption. Figure 2.4 shows the source of the intermediate layers that a spin lock's lock call passes through on its way to the core; pay special attention to the part in bold. Deep down there is actually a "reference counter" concept. Reference counting is a memory-management technique that can be regarded as a form of garbage collection in C/C++; readers can look into the details themselves, and they are not covered here.

Figure 2.4 Core of the spin lock lock function on a single CPU

The above shows how the lock function is implemented in the kernel source. Unlocking follows the same kind of process, as shown in Figure 2.5.

Figure 2.5 Source code of the spin lock implementation on a single CPU

In summary, the single-CPU case is very simple: if the kernel is built with kernel preemption, taking the lock disables kernel preemption; otherwise the lock degenerates to a no-op.
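A toy model of this single-CPU behaviour is sketched below, assuming a nesting counter that must return to zero before preemption is allowed again. DEMO_PREEMPT, demo_preempt_count and the other demo_ names are hypothetical; the kernel's real preemption accounting is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a "kernel preemption enabled" build option. */
#define DEMO_PREEMPT 1

/* Nesting counter: preemption is allowed only while it is zero. */
static int demo_preempt_count;

#if DEMO_PREEMPT
void demo_preempt_disable(void) { demo_preempt_count++; }
void demo_preempt_enable(void)
{
    assert(demo_preempt_count > 0);   /* enable must pair with a disable */
    demo_preempt_count--;
}
#else
/* Without kernel preemption both calls compile to nothing. */
void demo_preempt_disable(void) { }
void demo_preempt_enable(void)  { }
#endif

bool demo_preemptible(void) { return demo_preempt_count == 0; }

/* On one CPU there is nothing to spin on, so the lock carries no state. */
typedef struct { int unused; } demo_up_spinlock_t;

void demo_up_spin_lock(demo_up_spinlock_t *l)   { (void)l; demo_preempt_disable(); }
void demo_up_spin_unlock(demo_up_spinlock_t *l) { (void)l; demo_preempt_enable(); }
```

The counter is what lets locks nest: preemption becomes possible again only after the outermost unlock.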

On SMP systems, the spin lock does some additional work beyond simply disabling or enabling preemption on the CPU. Searching the source shows that the core of the implementation is the two functions shown in Figure 2.6, which are written in AT&T assembly. They look complicated, but are actually simple once analyzed.

Figure 2.6 Source code of the spin lock implementation on SMP systems

Here we can see how it actually implements "spinning" between multiple processes. The instructions involved are xaddw, cmpb, movb and incb. xaddw first exchanges the values of the source and destination operands, then adds the two operands as words and stores the result in the destination operand. cmpb, movb and incb are simpler; the b (byte) suffix means that the instruction operates on bytes. Also note that the Linux kernel uses the AT&T assembly format, in which the source operand comes first and the destination second, the reverse of the Intel x86 format, in which the destination comes first. In the xaddw instruction in Figure 2.6, %1 is the destination operand, namely the lock->slock variable. Below we walk through the execution, where P1 and P2 are two different processes in the system, as illustrated in Figure 2.7.

Figure 2.7 The spin lock core execution process

By now the reader should understand how the spin lock spins. Note that xaddw actually performs three steps; to prevent it from being interrupted partway through, the LOCK_PREFIX macro is added. As we saw with atomic operations, the LOCK_PREFIX macro simply wraps the lock prefix, and of course applies only to SMP systems.

Of course, the source code shown above supports at most 256 processors; for systems with more than 256 processors the kernel has another set of functions, which interested readers can study. After analyzing the source, someone may ask: if P2 and P3 are both waiting for a spin lock, how does Linux ensure that they run in the right order? The answer is already in the source. Looking at slock, we can see that it guarantees that the processes waiting for the spin lock proceed in order. For example, in the analysis above we obtained slock=0x0201 for P2. If P1 has not yet released the lock and P3 also requests it, the kernel computes slock=0x0301 for P3. Continuing through the source, we see that P3 can only run after P2 has finished (at which point slock=0x0302): only when the condition is met (the low 8 bits of slock are incremented by 1, giving 0x0303) does P3 obtain the spin lock. This effectively guarantees that processes acquire the spin lock in the order in which they requested it. Because only the high 8 bits of slock are used to guarantee the order, this source supports at most 256 processors requesting a spin lock at the same time.
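The ordering guarantee can be illustrated with a small ticket-lock sketch in userspace C11. It assumes the layout described above (high 8 bits of a 16-bit slock hand out tickets, low 8 bits say whose turn it is) and starts from slock = 0, so the concrete values may differ from those in the walk-through. The demo_ names are hypothetical, and the xaddw and incb steps are imitated with C11 atomics rather than assembly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* High byte: next ticket to hand out. Low byte: ticket now being served. */
typedef struct { _Atomic uint16_t slock; } demo_ticket_lock_t;

/* Like xaddw with 0x0100: bump the high byte and return the old one. */
uint8_t demo_take_ticket(demo_ticket_lock_t *l)
{
    assert(l != NULL);
    uint16_t old = atomic_fetch_add(&l->slock, 0x0100);
    return (uint8_t)(old >> 8);
}

/* Like cmpb: the lock is ours when the low byte equals our ticket. */
bool demo_is_my_turn(demo_ticket_lock_t *l, uint8_t ticket)
{
    return (uint8_t)(atomic_load(&l->slock) & 0x00ffu) == ticket;
}

void demo_ticket_lock(demo_ticket_lock_t *l)
{
    uint8_t ticket = demo_take_ticket(l);
    while (!demo_is_my_turn(l, ticket))
        ;   /* spin until it is our turn */
}

/* Like incb: advance only the low byte, leaving the high byte untouched. */
void demo_ticket_unlock(demo_ticket_lock_t *l)
{
    uint16_t old = atomic_load(&l->slock);
    uint16_t next;
    do {
        next = (uint16_t)((old & 0xff00u) | ((old + 1u) & 0x00ffu));
    } while (!atomic_compare_exchange_weak(&l->slock, &old, next));
}
```

Waiters are served strictly in ticket order, and since a ticket is only 8 bits wide, at most 256 of them can be outstanding at once.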

Beyond the spin lock itself, we have to mention the relationship between spin locks and interrupts. First look at a double-request example: a process is executing in the critical section when an interrupt suddenly arrives and interrupts it, so the interrupt handler runs while the critical section is still held. If that interrupt handler also tries to acquire the same spin lock, there is a big problem. This is the so-called double request, shown in Figure 2.8.

Figure 2.8 Double-request example

Of course the kernel takes this into account, so the spin lock interface includes functions that disable interrupts: spin_lock_irq() and spin_unlock_irq(), used as in the code of Figure 2.9. However, these two functions come with a precondition: interrupts must be enabled before the lock is taken. If a process already had interrupts disabled before locking, then after unlocking with these functions interrupts would be enabled, which again causes problems. For this case the kernel provides the spin lock functions spin_lock_irqsave() and spin_unlock_irqrestore(), shown in Figure 2.10. The reader may have a doubt here: since flags is an output parameter of spin_lock_irqsave(), in theory it should be passed as &flags. The reason flags is passed without "&" is that this function is defined as a macro, nested through several layers and eventually reaching arch/x86/include/asm/irqflags.h; interested readers can search the source. In general, note that a function whose output parameter is passed without "&" is implemented as a macro.

Figure 2.9 Usage example of spin_lock_irq()

Figure 2.10 Usage example of spin_lock_irqsave()
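The difference between the two pairs of functions, and the reason flags needs no "&", can be modelled in userspace as follows. This is a simulation under stated assumptions: demo_irqs_enabled stands in for the CPU's interrupt-enable state, the lock itself is left out, and every demo_ name is hypothetical rather than a kernel symbol.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated CPU interrupt-enable state (not a real hardware flag). */
static bool demo_irqs_enabled = true;

void demo_local_irq_disable(void) { demo_irqs_enabled = false; }
void demo_local_irq_enable(void)  { demo_irqs_enabled = true; }

/* irq variant: unlock enables interrupts unconditionally. */
void demo_spin_lock_irq(void)   { demo_local_irq_disable(); /* then take the lock */ }
void demo_spin_unlock_irq(void) { /* release the lock, then */ demo_local_irq_enable(); }

/* irqsave variant: a macro, so it can assign to flags by name and the
 * caller writes flags, not &flags. */
#define demo_spin_lock_irqsave(flags)                   \
    do {                                                \
        (flags) = demo_irqs_enabled ? 1UL : 0UL;        \
        demo_local_irq_disable();                       \
    } while (0)

/* Restore whatever state was saved, enabled or disabled. */
#define demo_spin_unlock_irqrestore(flags)              \
    do {                                                \
        demo_irqs_enabled = ((flags) != 0UL);           \
    } while (0)

/* With interrupts already off, only the irqsave pair leaves them off. */
void demo_irq_variants(void)
{
    unsigned long flags;

    demo_irqs_enabled = false;
    demo_spin_lock_irq();
    demo_spin_unlock_irq();
    assert(demo_irqs_enabled);      /* previous "disabled" state was lost */

    demo_irqs_enabled = false;
    demo_spin_lock_irqsave(flags);
    demo_spin_unlock_irqrestore(flags);
    assert(!demo_irqs_enabled);     /* previous state was preserved */

    demo_irqs_enabled = true;       /* leave the model in its initial state */
}
```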

A few more things need to be explained about interrupt bottom halves. First, if data is shared between a bottom half and process context, note that a bottom half can preempt process context, so before entering the critical section the process must not only take the lock but also disable bottom-half execution. The kernel provides spin_lock_bh() and spin_unlock_bh(), which disable and enable bottom halves respectively. There are two further points about spin locks and interrupts: (1) If data is shared between an interrupt handler and a bottom half, interrupts must also be disabled while the data is locked, because an interrupt handler can likewise preempt a bottom half. (2) If data is shared by softirqs, it also needs to be protected by a lock, because softirqs can run simultaneously on different processors.

OK! That covers the various aspects of spin locks; here is a summary. First, as the implementation shows, spin locks are mainly meant for SMP systems and for kernels with preemption. On a single-CPU system without preemption, the spin lock degrades to a no-op. Second, a spin lock is a busy wait, so the critical section must be short. Third, a spin lock can cause deadlock, either through recursive acquisition of the same spin lock (the double request) or through a spin lock that is never released, eventually leaving the system unusable. Finally, if code holding a spin lock calls a function that may sleep, such as kmalloc(), it may "never wake up" (at that point "nobody" is responsible for waking it; the main reason is that even interrupts are turned off, so it never wakes until the system is restarted). This requires special attention.

That ends the discussion of spin locks. Let us step back and look at the whole picture, as Figure 2.11 shows. Viewed abstractly, the implementation mechanism of the spin lock is quite simple, and the series of interrupt-related functions it provides acts as a "seat belt" for the spin lock.

Figure 2.11 Stepping back: overview of the spin lock implementation mechanism

Due to length limits, this post ends here. The next one will be "Big talk Linux kernel lock mechanisms: memory barriers, read-write spin locks and sequential locks"; interested readers can continue with it. Given the limits of the author's knowledge, mistakes in this blog are inevitable; readers are welcome to point them out so that we can discuss and improve together.
