Lock mechanisms in the Linux kernel: RCU and the big kernel lock

Last Update:2014-06-30 Source: Internet

Author: User

Tags semaphore



The previous post analyzed completions and mutexes together with some classic problems. This post focuses on the RCU mechanism and then introduces the big kernel lock (BKL), which has since been removed from the kernel. It closes by summarizing this series of posts on the lock mechanisms in the Linux kernel and by putting forward some basic views on the locking mechanisms the kernel provides.

10. RCU mechanism

This section discusses another important locking mechanism: RCU. Conceptually, the name breaks down as follows. Read: a reader does not need to acquire any lock to access a critical section protected by RCU. Copy: when a writer accesses the critical section, the writer itself makes a copy of the protected data and modifies the copy. Update: at an appropriate time, RCU uses a callback function to redirect the pointer that referred to the original data so that it refers to the modified copy; a garbage collector inside the mechanism is responsible for invoking the callback. Taken together this is read-copy-update. The structure used by RCU is defined in Figure 10.1.

Figure 10.1 structure definition of RCU

As the figure shows, the structure is very simple. It contains only a next pointer, used to build a linked list, and a function pointer. The function pointer is the callback mentioned above: a user of the RCU mechanism registers it by attaching the structure to the list, and it is called at the appropriate time.
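Since the figure itself is not reproduced here, the following is a minimal user-space sketch of the two-field structure the text describes. The enclosing `struct config`, the `config_reclaim` callback and the `freed_count` counter are hypothetical names added for illustration; the kernel's real definition and registration path may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sketch of the structure described in the text: one list pointer, one callback. */
struct rcu_head {
    struct rcu_head *next;                 /* links pending callbacks together */
    void (*func)(struct rcu_head *head);   /* called at the appropriate time */
};

/* Hypothetical RCU-protected object; the head is embedded inside it. */
struct config {
    int value;
    struct rcu_head rcu;
};

static int freed_count;   /* demo-only counter so the effect can be observed */

/* Callback: recover the enclosing object from the embedded head and free it. */
static void config_reclaim(struct rcu_head *head)
{
    struct config *old =
        (struct config *)((char *)head - offsetof(struct config, rcu));
    free(old);
    freed_count++;
}

int rcu_head_demo(void)
{
    struct config *c = malloc(sizeof(*c));
    assert(c != NULL);
    c->value = 1;
    c->rcu.next = NULL;
    c->rcu.func = config_reclaim;   /* what registration would record */

    c->rcu.func(&c->rcu);           /* what RCU would do at the appropriate time */
    return freed_count;
}
```

Embedding the head in the protected object means the callback needs no extra argument: the address of the head is enough to find the object to reclaim.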

In RCU it is the writer, not the operating system, that makes the copy of the protected data; the original is later reclaimed by the registered callback function. So when is the "appropriate time"? It is when every CPU that referenced the shared data has finished its operation on it. In other words, once no CPU is still operating on the old data, that data can be reclaimed and the callback is called.
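The read, copy and update steps can be modeled in user space with C11 atomics. This is only a sketch under stated assumptions: `struct settings`, `read_timeout` and `update_timeout` are hypothetical names, the demo is single-threaded, and the plain `free()` at the end stands in for the callback that real RCU would run once no reader still uses the old copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct settings {          /* hypothetical shared data */
    int timeout;
    int retries;
};

static _Atomic(struct settings *) current_settings;

/* Read: take no lock, just load the pointer and use what it points to. */
static int read_timeout(void)
{
    struct settings *s =
        atomic_load_explicit(&current_settings, memory_order_acquire);
    return s->timeout;
}

/* Copy and update: change a private copy, then publish it with one store.
 * Returns the old version, which may only be freed once readers are done. */
static struct settings *update_timeout(int timeout)
{
    struct settings *old =
        atomic_load_explicit(&current_settings, memory_order_relaxed);
    struct settings *new = malloc(sizeof(*new));
    assert(new != NULL);
    *new = *old;                /* copy */
    new->timeout = timeout;     /* modify the copy only */
    atomic_store_explicit(&current_settings, new, memory_order_release);
    return old;
}

int rcu_update_demo(void)
{
    struct settings *first = malloc(sizeof(*first));
    assert(first != NULL);
    first->timeout = 30;
    first->retries = 3;
    atomic_store_explicit(&current_settings, first, memory_order_release);

    /* A reader that started before the update keeps its old snapshot. */
    struct settings *snapshot =
        atomic_load_explicit(&current_settings, memory_order_acquire);
    struct settings *old = update_timeout(60);

    int old_view = snapshot->timeout;   /* still 30 */
    int new_view = read_timeout();      /* a new reader sees 60 */

    free(old);   /* stands in for the callback run once no reader uses it */
    free(atomic_load_explicit(&current_settings, memory_order_relaxed));
    return old_view * 100 + new_view;
}
```

The old copy stays valid for readers that already hold a pointer to it, which is exactly why its reclamation has to be deferred.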

After so much discussion, the reader may ask: what is RCU good for? RCU lets reads and writes execute concurrently. Readers pay no synchronization overhead: they can read the critical section at any time and never interact with other readers. Writers are different. Any synchronization overhead among writers depends on the synchronization mechanism the writers use among themselves; RCU itself plays no direct part in it.

Writers can therefore run in parallel under RCU, but RCU does not maintain any lock for them, so writers that access the same shared data must coordinate among themselves with a lock of their own. In this way RCU allows multiple readers to access the protected critical section at the same time, and also allows multiple readers and multiple writers to access it at the same time. Note that whether several writers can really access the critical section in parallel depends on the synchronization mechanism used among the writers. Compared with the read-write locks discussed earlier, RCU is effectively an improved read-write lock, but it cannot replace them, because RCU mainly targets data accessed through pointers while read-write locks have no such restriction. In particular, when the writer's synchronization overhead is large, that is, when write operations take relatively long, the gain in reader performance cannot make up for the loss on the writer side.

Before looking at an example we need two concepts. A quiescent state is a process context switch on a CPU. A grace period (the "appropriate time" mentioned at the beginning of this section) is the time needed for every CPU to pass through a quiescent state, that is, for all readers in the system to complete their access to the shared critical section. As for context: while a process executes, the values in all CPU registers, the state of the process, and the contents of its stack are together called the context of the process. When the kernel switches to another process it must save this state, so that when the process is scheduled again it can resume from where it was switched out. With that, the two concepts should be reasonably clear.
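The relation between the two concepts can be sketched as a small model: remember how many context switches each CPU had performed when the grace period began, and declare the period over once every CPU has switched at least once since then. The CPU count and all names below are assumptions made for illustration; this is not how any particular kernel version tracks grace periods.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* assumed CPU count for the model */

/* Number of context switches each modeled CPU has performed so far. */
static unsigned long switches[NR_CPUS];

struct grace_period {
    unsigned long snapshot[NR_CPUS];   /* switch counts when the period began */
};

/* A context switch on a CPU is that CPU's quiescent state. */
static void context_switch(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    switches[cpu]++;
}

static void grace_period_start(struct grace_period *gp)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        gp->snapshot[cpu] = switches[cpu];
}

/* The period is over once every CPU has switched context at least once. */
static bool grace_period_elapsed(const struct grace_period *gp)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (switches[cpu] == gp->snapshot[cpu])
            return false;
    return true;
}

int grace_period_demo(void)
{
    struct grace_period gp;
    int result = 0;

    grace_period_start(&gp);
    context_switch(0);
    context_switch(1);
    context_switch(2);
    if (!grace_period_elapsed(&gp))   /* CPU 3 has not been quiescent yet */
        result += 1;
    context_switch(3);
    if (grace_period_elapsed(&gp))    /* every CPU has now passed through one */
        result += 2;
    return result;
}
```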

With these two concepts in place, consider the example shown in Figure 10.2.

Figure 10.2 RCU usage example

After the discussion above the reader should have a better understanding of RCU, and it is easy to see that using it comes with some constraints. First, access to the shared resource should be read-only most of the time and writes should be relatively rare, because a write under RCU necessarily consumes more system resources than under other locking mechanisms, which hurts efficiency. Second, no process context switch may occur while a reader holds rcu_read_lock() (the RCU read-lock function). The writer has to wait for readers to finish, and a context switch is what RCU counts as a quiescent state: if a reader slept inside its critical section, either the writer would stay blocked for as long as the reader sleeps, or the grace period would be considered over and the callback would reclaim the old data while the reader still refers to it, which inevitably causes errors. Finally, resources protected by RCU must be accessed through pointers, because almost all RCU operations act on pointer data.

The following discusses the functions provided by RCU and their implementation, shown in Figure 10.3 and Figure 10.4. They are implemented in the file include/linux/rcupdate.h.

Figure 10.3 function interface of RCU mechanism

Figure 10.3 shows the functions RCU provides to readers: the basic read lock and unlock functions and the synchronization function, of which the synchronization function, synchronize_rcu(), is the most important. The essence of the reader functions is simple: they disable preemption, so that no process context switch can occur inside the read-side critical section. The reason was given above: the writer must wait for readers to finish, so a reader that is switched out would leave the writer blocked and disturb the normal operation of the system. The functions RCU provides to writers are shown in Figure 10.4.
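The idea that the read-side lock amounts to disabling preemption can be illustrated with a counter. This is a user-space model, not kernel code: `model_rcu_read_lock`, `model_rcu_read_unlock` and `try_preempt` are hypothetical stand-ins, and the counter only imitates the effect the text attributes to the real functions.

```c
#include <assert.h>
#include <stdbool.h>

static int preempt_count;      /* greater than 0 means preemption is disabled */
static int context_switches;   /* how many switches the model has allowed */

/* Entering a read-side critical section disables preemption. */
static void model_rcu_read_lock(void)
{
    preempt_count++;
}

/* Leaving it re-enables preemption once the outermost section ends. */
static void model_rcu_read_unlock(void)
{
    assert(preempt_count > 0);
    preempt_count--;
}

/* The scheduler may switch tasks only while preemption is enabled. */
static bool try_preempt(void)
{
    if (preempt_count > 0)
        return false;
    context_switches++;
    return true;
}

int read_side_demo(void)
{
    int denied = 0;

    model_rcu_read_lock();
    model_rcu_read_lock();       /* read-side sections may nest */
    if (!try_preempt())
        denied++;                /* refused: still inside both sections */
    model_rcu_read_unlock();
    if (!try_preempt())
        denied++;                /* refused: outer section still open */
    model_rcu_read_unlock();
    try_preempt();               /* allowed: no reader section is open */

    return denied * 10 + context_switches;
}
```

Because no switch can happen inside the section, a context switch observed on a CPU is proof that the CPU is not in the middle of a read.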

Figure 10.4 function interface of RCU mechanism

The writer-side functions are mainly call_rcu() and call_rcu_bh(). call_rcu() does not block the writer, so it can be used in interrupt context and in softirqs. It attaches the function func to RCU's callback list and returns immediately. The synchronize_rcu() function mentioned among the reader functions also calls call_rcu() in its implementation. call_rcu_bh() is almost identical to call_rcu(); the only difference is that it treats the completion of a softirq as a quiescent state (introduced at the beginning of this section). A writer that uses it therefore requires readers to use rcu_read_lock_bh() and rcu_read_unlock_bh() correspondingly.
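The "attach and return immediately" behaviour can be sketched with a queue of pending callbacks that is drained later. All names here (`cb_head`, `model_call_rcu`, `run_callbacks_after_grace_period`) are hypothetical, and the grace period itself is not modeled: the drain function is simply called when the demo decides the period is over.

```c
#include <assert.h>
#include <stddef.h>

struct cb_head {                       /* same shape as the RCU structure in the text */
    struct cb_head *next;
    void (*func)(struct cb_head *head);
};

static struct cb_head *queue_head;
static struct cb_head **queue_tail = &queue_head;

/* Model of call_rcu(): append the callback and return at once, never blocking. */
static void model_call_rcu(struct cb_head *head, void (*func)(struct cb_head *))
{
    head->func = func;
    head->next = NULL;
    *queue_tail = head;
    queue_tail = &head->next;
}

/* Model of the work done after the grace period: run callbacks in queue order. */
static int run_callbacks_after_grace_period(void)
{
    int ran = 0;
    struct cb_head *head = queue_head;

    queue_head = NULL;
    queue_tail = &queue_head;
    while (head) {
        struct cb_head *next = head->next;   /* read before the callback runs */
        head->func(head);
        head = next;
        ran++;
    }
    return ran;
}

static int order[2];
static int order_len;
static struct cb_head first_cb, second_cb;

/* Demo callback: record which head was invoked. */
static void record(struct cb_head *head)
{
    assert(order_len < 2);
    order[order_len++] = (head == &first_cb) ? 1 : 2;
}

int call_rcu_demo(void)
{
    model_call_rcu(&first_cb, record);
    model_call_rcu(&second_cb, record);
    int before = order_len;                        /* nothing has run yet */
    int ran = run_callbacks_after_grace_period();  /* both callbacks run now */
    return before * 1000 + ran * 100 + order[0] * 10 + order[1];
}
```

Keeping a tail pointer makes registration a constant-time append, which suits a function that must be callable from interrupt context.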

Why is that? Based on the source code of call_rcu_bh(), here is my own view. A quiescent state represents a process context switch (as mentioned above), that is, the current process stops executing and the CPU switches smoothly to the next one. Treating softirq completion as a quiescent state ensures that softirqs in the system can run to completion, because call_rcu_bh() can be used in interrupt context. An interrupt can interrupt a running softirq, so when call_rcu_bh() is used in interrupt context it is necessary to make sure the softirq still completes.

As for why readers must correspondingly use rcu_read_lock_bh(): call_rcu_bh() does not block the writer and can be used in interrupt context and in softirqs, which means that interrupts and softirqs are not turned off at that point. So when the writer calls call_rcu_bh(), readers in the critical section protect themselves with rcu_read_lock_bh() and rcu_read_unlock_bh(), which in essence call local_bh_disable() and local_bh_enable(), the functions that disable and enable softirqs.

In addition, the comment on call_rcu_bh() in the Linux source code states that read-side critical sections in interrupt context are delimited by rcu_read_lock() and rcu_read_unlock(). The implementation of these two functions shows that they disable and enable kernel preemption, and the reason is self-evident: to keep the current process from being preempted by another process in the middle of its read. The same comment states that call_rcu_bh() is meant to be used when most read-side critical sections run in softirq context; this is easy to understand from what the function does and is mainly a matter of execution efficiency.

The callback mechanism of RCU is essentially maintained by two data structures, rcu_data and rcu_bh_data, on which the callback functions are hooked so that they form a linked list. Callbacks are executed in the order in which they were registered: first registered, first executed.

The following discusses RCU's list operations, which are defined in the file include/linux/rculist.h. RCU aims to protect not only ordinary pointers but also doubly linked lists, which is why the RCU list operations exist. Like the kernel's standard list functions, the functions provided by RCU cover both regular linked lists and hash lists. With an RCU-protected list, traversing the list and modifying or deleting elements must go through the RCU functions, while other operations can still use the standard list functions defined in include/linux/list.h. In fact most RCU list functions are implemented by calling the standard list functions directly, and only a few add code of their own for the RCU case.
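To make the difference from a standard list concrete, here is a user-space sketch of insertion, removal and lock-free traversal on a circular doubly linked list, written with C11 atomics. The `model_` functions are only in the spirit of the kernel's `list_add_rcu()` and `list_del_rcu()`; they are assumptions for illustration and not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
    _Atomic(struct node *) next;   /* readers follow only this pointer */
    struct node *prev;             /* maintained for writers */
    int key;
};

static void list_init(struct node *head)
{
    atomic_store_explicit(&head->next, head, memory_order_relaxed);
    head->prev = head;
}

/* Insert after head: fully initialise the new node first, then publish it
 * with a single release store so readers never see a half-built node. */
static void model_list_add_rcu(struct node *new, struct node *head)
{
    struct node *first = atomic_load_explicit(&head->next, memory_order_relaxed);
    atomic_store_explicit(&new->next, first, memory_order_relaxed);
    new->prev = head;
    atomic_store_explicit(&head->next, new, memory_order_release);
    first->prev = new;
}

/* Unlink a node but leave its next pointer intact, so a reader currently
 * standing on the node can still walk forward. */
static void model_list_del_rcu(struct node *node)
{
    struct node *next = atomic_load_explicit(&node->next, memory_order_relaxed);
    atomic_store_explicit(&node->prev->next, next, memory_order_release);
    next->prev = node->prev;
}

/* Lock-free traversal: add up the keys of all nodes. */
static int sum_keys(struct node *head)
{
    int sum = 0;
    for (struct node *n = atomic_load_explicit(&head->next, memory_order_acquire);
         n != head;
         n = atomic_load_explicit(&n->next, memory_order_acquire))
        sum += n->key;
    return sum;
}

int rcu_list_demo(void)
{
    static struct node head, a, b;

    list_init(&head);
    a.key = 1;
    b.key = 2;
    model_list_add_rcu(&a, &head);
    model_list_add_rcu(&b, &head);   /* list order is now b, a */
    int before = sum_keys(&head);
    model_list_del_rcu(&b);
    int after = sum_keys(&head);

    /* The removed node still points into the list. */
    assert(atomic_load_explicit(&b.next, memory_order_relaxed) == &a);
    return before * 10 + after;
}
```

The removed node must not be freed or reused until the grace period has passed, since a reader may still be standing on it.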

Some of the functions provided for RCU list operations are listed below. Only the basic ones are shown; for the rest, consult the kernel source code or related material. In Figures 10.5 and 10.6, each regular list function is followed by the corresponding hash list function.

Figure 10.5 RCU Linked list operation function

Figure 10.6 RCU Linked list operation function

11. BKL (big kernel lock)

Finally, a word on the big kernel lock, or BKL. Because it has already been removed from the kernel, it is not discussed in depth here; it is enough to know that the kernel once had such a lock. It exists in older kernel versions, where its use was already discouraged. It locks the entire kernel and ensures that no two processors run in kernel mode in parallel.

The big kernel lock was formally removed in Linux 2.6.39; before that it was defined in the file include/linux/smp_lock.h. It provides two main functions: lock_kernel() locks the whole kernel and unlock_kernel() unlocks it. One feature of the big kernel lock is that its lock depth is counted, which means lock_kernel() can still be called while the kernel is already locked. The unlock function unlock_kernel() must then be called the same number of times before the kernel is unlocked and other processors can enter it. In short, the whole arrangement amounts to a recursive lock.
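The depth counting can be sketched with a per-task counter. This is a single-threaded model under stated assumptions: `struct task`, the convention that a depth of -1 means "not held", and the `model_` function names are all illustrative, and a boolean stands in for the real lock.

```c
#include <assert.h>
#include <stdbool.h>

struct task {
    int lock_depth;   /* -1: lock not held; 0 or more: nesting level */
};

static bool kernel_locked;   /* stands in for the single global lock */

/* Only the outermost call takes the lock; nested calls just count. */
static void model_lock_kernel(struct task *t)
{
    if (t->lock_depth < 0) {
        assert(!kernel_locked);   /* a real lock would wait here instead */
        kernel_locked = true;
    }
    t->lock_depth++;
}

/* The lock is released only when the outermost call is undone. */
static void model_unlock_kernel(struct task *t)
{
    assert(t->lock_depth >= 0);
    if (--t->lock_depth < 0)
        kernel_locked = false;
}

int bkl_demo(void)
{
    struct task t = { .lock_depth = -1 };
    int checks = 0;

    model_lock_kernel(&t);
    model_lock_kernel(&t);       /* nested acquire by the same task */
    model_unlock_kernel(&t);
    if (kernel_locked)
        checks++;                /* still locked after one unlock */
    model_unlock_kernel(&t);
    if (!kernel_locked)
        checks++;                /* released after the matching unlock */
    return checks;
}
```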

The big kernel lock is essentially a spinlock, but it differs from an ordinary spinlock: a spinlock cannot be acquired recursively (doing so leads to deadlock), whereas the big kernel lock can. The big kernel lock protects the entire kernel, while a spinlock protects one very specific shared resource. Because the big kernel lock is generally held for a long time, it seriously hurts the performance and scalability of the system, which I think is one of the main reasons it was removed from the kernel. This concludes the introduction to the big kernel lock and, with it, the introduction to the lock mechanisms of the Linux kernel; a summary of everything covered follows.

Summary

Finally, here is a summary of everything discussed in this series: atomic operations, spinlocks, memory barriers, read/write spinlocks, sequential locks, semaphores, read/write semaphores, completions, mutexes, the RCU mechanism, and the BKL (big kernel lock).

From the discussion above, the following basic points can be drawn: ① Atomic operations apply to integer operations; spinlocks and semaphores are used more widely. ② When the critical section is small, choose a spinlock; otherwise choose a semaphore. ③ On choosing semaphores: a semaphore works at process level and is acquired in the kernel on behalf of a process, so it is generally used when the requesting process holds the resource for a long time. ④ Read-write spinlocks and read-write semaphores impose much more relaxed conditions than plain spinlocks and semaphores, as follows from their definitions. ⑤ The RCU mechanism is being used more and more widely. ⑥ Memory barrier functions are complicated to use, in most cases depend on the specific architecture, and are generally not recommended.

The references used while analyzing the Linux kernel lock mechanisms are listed at the end of the article for interested readers. Most of that material is still fairly shallow, though, so to really understand the kernel's lock mechanisms I suggest reading and analyzing the kernel source code.

This concludes the series of "big talk" blog posts on the lock mechanisms in the Linux kernel. Given the limits of the author's knowledge the series inevitably contains errors; readers are welcome to point them out so that we can discuss them and improve together.

When reprinting, please cite the source: http://blog.sina.com.cn/huangjiadong19880706

References

[1] Robert Love, "Linux Kernel Development" (Chinese edition: "Linux Kernel Design and Implementation"), Third Edition, China Machine Press, 2011.

[2] Wolfgang Mauerer, "Professional Linux Kernel Architecture" (Chinese edition: "Deep Linux Kernel Architecture"), Posts & Telecom Press, 2011.

[3] Song Baohua, "Linux Device Driver Development Details", Second Edition, Posts & Telecom Press, 2011.

[4] Li Yunhua, "The Original Product Kernel: Linux Kernel Source Code Guide", Posts & Telecom Press, 2009.

[5] http://blog.chinaunix.net/space.php?uid=25845340&do=blog&id=3011577

[6] http://www.ibm.com/developerworks/cn/linux/l-cn-spinlock/

[7] http://www.ibm.com/developerworks/cn/linux/l-rcu/

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
