Linux kernel RCU (Read-Copy-Update) lock brief analysis: prequel


If you use the top view of the Linux perf tool to hunt for hotspots, you will find that there are almost always several locks among the top 10 suspects!
In parallel, multi-processor programming, lock problems are unavoidable, since a lock is often the only way to protect shared data; the protected region is the critical section. We know that locks are expensive because a lock inevitably makes someone wait. But waiting is not the essence of the overhead. The essence is that most locks are built on "atomic operations", and a single atomic operation can have a large impact on the bus or on cache coherence. Take atomically adding 1 to a variable: do not assume it is simple. On some processor architectures it first locks the bus, which means that while the lock is held the other processors cannot access memory (or at least some region of it); it may also flush caches or trigger cache-coherence traffic. And that is not even the worst case: on some architectures a memory fence is involved, which drains the CPU pipeline and flushes caches, defeating almost every optimization the hardware designers built in. That, of course, is the price; the benefit is that your critical section is protected.
You have to pay a price to protect the critical section, and the price is steep if you pay it with a heavyweight lock. Does it have to be this way? Perhaps your data structure is poorly designed, or your code flow is. For example, when multiple threads read shared data while another writes it, can a ring buffer mitigate the contention? In fact, many drivers for shared peripherals such as network cards and hard disks play exactly this game: as long as the read pointer and the write pointer never overtake each other, the use of locks can be minimized. This is, of course, a very simple example.
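To make the "atomic plus 1" concrete, here is a minimal userspace sketch in C11 (not kernel code; the names are mine): on x86 the fetch-add below typically compiles to a lock-prefixed instruction, which is exactly the cache-line/bus-locking behavior described above, and the fence is the pipeline-draining operation mentioned.

```c
#include <stdatomic.h>

/* A shared counter; _Atomic makes increments indivisible. */
static atomic_int counter = 0;

/* Atomically add 1 and return the previous value. On x86 this
 * typically becomes "lock xadd", which locks the cache line (or,
 * on older hardware, the bus) for the duration of the operation. */
int atomic_inc_demo(void)
{
    return atomic_fetch_add(&counter, 1);
}

/* A full memory fence: orders all memory operations around it and,
 * on some architectures, stalls the pipeline -- the hidden cost
 * discussed above. */
void full_fence(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}
```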
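A minimal sketch of that idea, a single-producer/single-consumer ring buffer in C11 (names and sizes are my own, not taken from any real driver): as long as only the producer writes `head` and only the consumer writes `tail`, neither side needs a lock, precisely because the two pointers are never allowed to overtake each other.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8  /* must be a power of two */

struct ring {
    int buf[RING_SIZE];
    atomic_uint head;  /* written only by the producer */
    atomic_uint tail;  /* written only by the consumer */
};

/* Producer side: returns false when the ring is full. */
static bool ring_put(struct ring *r, int v)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return false;               /* writer would overtake the reader */
    r->buf[head & (RING_SIZE - 1)] = v;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_get(struct ring *r, int *v)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;               /* reader would overtake the writer */
    *v = r->buf[tail & (RING_SIZE - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```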

Designing the data structure and the code flow well is one approach, but it is not abstract enough; a better way is to design a more specialized lock. The read-write lock, an asymmetric lock, is an optimization for the read-mostly, write-rarely case: its preferential treatment of readers is that they need not wait at all, and may read directly as long as no writer is present, otherwise they wait. A writer, by contrast, must wait for all readers to finish. This kind of read-write lock can be implemented on top of another mechanism, the spin lock. One of my implementations is as follows:

typedef struct {
    spinlock_t *spinlock;
    atomic_t readers;
} rwlock_t;

static inline void rdlock(rwlock_t *lock)
{
    spinlock_t *lck = lock->spinlock;
    /* The first reader takes the spinlock on behalf of all readers.
     * Note: the window between the increment and spin_lock() is racy
     * under contention; this is a sketch, not production code. */
    if (atomic_inc_return(&lock->readers) == 1)
        spin_lock(lck);
}

static inline void rdunlock(rwlock_t *lock)
{
    spinlock_t *lck = lock->spinlock;
    /* The last reader out releases it. */
    if (atomic_dec_return(&lock->readers) == 0)
        spin_unlock(lck);
}

static inline void wrlock(rwlock_t *lock)
{
    spin_lock(lock->spinlock);
}

static inline void wrunlock(rwlock_t *lock)
{
    spin_unlock(lock->spinlock);
}

Not bad, is it? But the best solution is to abandon the lock entirely and go completely lock-free.
When I designed my forwarding table, in order to reduce lock overhead I gave each CPU its own local copy of the table; these copies are consistent with one another, all generated from the routing table. I thought this would avoid contention. However, the copies still face the update problem: how do you update them? My initial approach was to use an IPI (inter-processor interrupt): in the handler, stop the processing thread, then update the data, and finally resume the thread, so that locks could be avoided on the processing path. Sounds reasonable, doesn't it? But I find it complicated.
Take a closer look at the write side of the read-write lock: it bluntly performs the standard locking operation, while on the read side only the first reader actually takes the lock. Can the waiting caused by these locking operations be avoided? Look back at my original IPI scheme: stopping the threads prevents readers from reading inconsistent data; in effect, the readers actively yield to the writer's flow of execution, writer first. Now look at the writer in the read-write lock: when it finds readers present, they do not actively yield, and the writer can only wait passively. That waiting is tedious!
Can I combine the two approaches, my IPI scheme and the read-write lock?
How to combine them? By the reasoning so far, either the writer waits passively or the first reader makes the decision. But there is a third option: the writer proceeds on its own schedule, writing not the original data but a copy of it (a great copy-on-write), then hangs the copy on a list of unfinished transactions; once the system observes that all the readers are done, the data on the list replaces the original. This is a good combination, and it is the great RCU lock. The reader's cost is merely marking that someone is reading, and the writer no longer has to wait to hold a lock: it writes a copy directly, finishes writing, and leaves the rest to the system....
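A minimal userspace sketch of that write side (copy, modify, publish), using a C11 atomic pointer where the kernel would use rcu_assign_pointer/rcu_dereference; reclamation of the old copy, the hard part that real RCU solves, is deliberately elided, and all names here are mine:

```c
#include <stdatomic.h>
#include <stdlib.h>

struct config {
    int version;
    int value;
};

/* The shared pointer readers dereference; stands in for an
 * RCU-protected pointer in the kernel. */
static _Atomic(struct config *) cur_config;

/* Reader side: one atomic load -- no lock, no wait. */
static struct config *config_read(void)
{
    return atomic_load_explicit(&cur_config, memory_order_acquire);
}

/* Writer side: copy the old data, modify the copy, publish it.
 * Freeing the old copy must wait until all readers are done with
 * it; that deferred reclamation is exactly what real RCU provides
 * and is elided here. */
static struct config *config_update(int new_value)
{
    struct config *old = atomic_load(&cur_config);
    struct config *fresh = malloc(sizeof(*fresh));
    fresh->version = old ? old->version + 1 : 1;
    fresh->value = new_value;
    atomic_store_explicit(&cur_config, fresh, memory_order_release);
    return old;  /* caller must defer freeing until readers drain */
}
```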

Copyright notice: this is an original article by the blog author; please do not reproduce it without permission.

