On RCU, the Linux Read-Write Synchronization Mechanism

Source: Internet
Author: User

RCU is a read-write synchronization mechanism in Linux; at bottom it is just another kernel synchronization primitive. This article gives the author's understanding of RCU's concepts and implementation mechanism.

"Rcu probability"

Let's look at the definition of RCU in the kernel document:

RCU is a synchronization mechanism that was added to the Linux kernel during the 2.5 development effort that is optimized for read-mostly situations.

Translation: RCU is a synchronization mechanism introduced during the 2.5 kernel development cycle, optimized for read-mostly scenarios.

Read-mostly scenarios naturally bring reader-writer locks to mind. Indeed, RCU is, like the reader-writer lock, a mechanism for improving the performance of read-mostly code; its core idea is the "publish-subscribe" mechanism.

In fact, when we use locks to protect shared resources, we are guarding against just two situations:

1) While a reader is reading the data, a writer overwrites it, so the reader sees incomplete data.
2) While one writer is writing the data, another writer writes at the same time, so the data becomes corrupted (dirty).
To guard against these, we have built a variety of locking mechanisms, and for read-mostly workloads we even optimized the reader-writer lock so that, when no writer is present, multiple readers can hold the lock and read in parallel. Can we go further and protect the shared resource without readers locking at all? That is where the RCU mechanism comes in.

Its core idea: the shared data is accessed through a pointer. When a writer wants to update the data, it makes a copy and modifies the copy, so readers concurrently reading the old data are undisturbed. When the modification is complete, the writer redirects the old data pointer to the new data with a single pointer assignment. Finally, the old data must be freed, but only after every reader still using it has exited its critical section; in RCU, this waiting period is called the "grace period". The important concepts here are "copy on write", "pointer assignment", and the "grace period".

It is like publishing a magazine: readers reading the data are subscribers; the writer copying and modifying the data is like editing the next issue; updating the data with a pointer assignment is like publishing it; and the grace period is like waiting out the publication cycle. A vivid metaphor. Through this mechanism we free readers from locking, which gives the following properties:

1) Readers need no lock to read the data: the data is updated via pointer assignment, modern CPUs essentially guarantee that a pointer assignment is atomic, and the writer finishes modifying the data before assigning the pointer, so readers always see complete data and need no lock.
2) A writer must update the data via "copy on write" plus "pointer assignment", and may free the old data only after the readers that were using it have finished with it.
3) Writers still need a lock to synchronize with other writers, but since RCU targets read-mostly scenarios, this overhead is acceptable.

The kernel documentation spells out the typical steps of an RCU data update:

a. Remove pointers to a data structure, so that subsequent
   readers cannot gain a reference to it.

b. Wait for all previous readers to complete their RCU read-side
   critical sections.

c. At this point, there cannot be any readers who hold references
   to the data structure, so it now may safely be reclaimed
   (e.g., kfree()d).

Translation:

A. Remove the pointers to the data structure (usually a linked-list node) so that subsequent readers can no longer obtain a reference to it (e.g., by traversing the list)

B. Wait for the readers that obtained the data before its removal to exit their critical sections

C. At this point no reader is holding a reference to the data structure, so it can be safely reclaimed

For example, consider the following list:

   ____       ____       ____
-->|__a_|-->|__b_|-->|__c_|--> .....

Now suppose we need to reclaim node B. Then:

A. First remove node B from the list, so that no new reader can reach it. After removal the situation is as follows:

   ____       ____                      ____
-->|__a_|-->|__c_|--> .....         n-->|__b_|

where "n" means that the C node is being used by N readers, although C is not in the list, but still have readers hold pointers to C, so temporarily C's memory can not be recycled

B. Wait for the readers using node B to finish, i.e., to exit their critical sections; the situation is then as follows:

   ____       ____                      ____
-->|__a_|-->|__c_|--> .....         0-->|__b_|

"0" means that no readers are using the C node, so they can safely recycle

C. Destroy node B and reclaim its memory:

   ____       ____
-->|__a_|-->|__c_|--> .....

D. If you did not want to delete B but only to update its contents, you can now safely modify it, then link the updated B node back into the list in a single atomic step, giving:

   ____       ____       ____
-->|__a_|-->|__b_|-->|__c_|--> .....

Well, two key points are still unclear here:

1. How do we know whether some reader is still using node B?

2. How are we notified when a reader exits its critical section?

For these, the kernel provides an API, so keep reading.

"RCU's Core API"

The kernel documentation lists several core API functions as follows:

a. rcu_read_lock()
b. rcu_read_unlock()
c. synchronize_rcu() / call_rcu()
d. rcu_assign_pointer()
e. rcu_dereference()
These five APIs are the most basic; the other APIs can be built from combinations of these five. Let's explain them one by one:

a. void rcu_read_lock(void);
Used to inform the reclaimer that the current reader has entered its critical section. Blocking is not allowed inside a reader's critical section.

b. void rcu_read_unlock(void);
Used to inform the reclaimer that the current reader has exited its critical section.

c. void synchronize_rcu(void);
synchronize_rcu() waits for all readers that entered their critical sections before the synchronize_rcu() call (it does not care about readers that enter afterwards), blocking until they have all exited. When it returns, the old data can be safely freed.

The kernel documentation also gives an example:

            CPU 0                    CPU 1                    CPU 2
       -----------------    -------------------------    ---------------
   1.  rcu_read_lock()
   2.                       enters synchronize_rcu()
   3.                                                    rcu_read_lock()
   4.  rcu_read_unlock()
   5.                       exits synchronize_rcu()
   6.                                                    rcu_read_unlock()

Note that synchronize_rcu() on CPU 1 must wait for CPU 0's reader, which entered its critical section first, but not for CPU 2's reader, which entered after synchronize_rcu() was called.

d. typeof(p) rcu_assign_pointer(p, typeof(p) v);

This is implemented as a macro, and in fact it can only be a macro; think about why (hint: typeof...).
To quote the kernel documentation: "The updater uses this function to assign a new value to an RCU-protected pointer, in order to safely communicate the change in value from the updater to the reader. This function returns the new value, and also executes any memory-barrier instructions required for a given CPU architecture."
This function performs the "pointer assignment" step mentioned earlier, and it also issues whatever memory barriers the architecture requires; that is precisely why we use this macro instead of a plain assignment.

e. typeof(p) rcu_dereference(p);
Likewise a macro; the kernel documentation explains:
"The reader uses rcu_dereference() to fetch an RCU-protected pointer, which returns a value that may then be safely dereferenced. Note that rcu_dereference() does not actually dereference the pointer; instead, it protects the pointer for later dereferencing. It also executes any needed memory-barrier instructions for a given CPU architecture."

This passage is a bit dense, but plainly put: when you want to obtain an RCU-protected pointer, rcu_dereference() returns a reference that can then be safely dereferenced. "Dereference" is an interesting word here; comparing the meanings of "reference" and "dereference" is worth a moment.

Summary

The key to understanding the RCU mechanism is understanding "publish-subscribe". Think of buying an app in an app store: the user gets a complete, usable package, the finished product, while the app's development process stays invisible. When the author wants to update the software, it is revised offline and then pushed out as an update, that is, published. Similarly, when updating data, the RCU mechanism removes the data from the linked list (like taking an app off the shelf), waits for the readers still using it, a wait we call the "grace period" (like continuing to support the old version until a deadline), and once the revision is complete splices it back into the list (like putting the app back on the shelf). It is an ingenious design that takes some time to understand, but once understood, the concepts stick without any rote memorization.

