Linux Kernel Radix Tree (II)

Source: Internet
Author: User
Tags: serialization

1. Concurrency Technology

Because the page cache is global and processes access it constantly, its concurrency performance matters. Protecting the whole tree with a single lock causes heavy contention and cannot meet the speed requirement, so Linux uses RCU to synchronize concurrent access when the tree is traversed.
RCU (Read-Copy Update) guarantees that readers of the radix tree can proceed regardless of concurrent insert/delete operations, that is, without taking any lock. In the kernel code, a lookup reads a node with a call such as node = rcu_dereference(*slot); and an insert or delete updates a pointer with a call such as rcu_assign_pointer(node->slots[offset], slot);. The actual synchronization is left to RCU.

With RCU, lookups on the radix tree can run fully concurrently. RCU essentially requires that a pointer be switched atomically from the old version of a data structure to the new version, and that the old version be kept until the system has passed through a quiescent state. At that point the old version has no remaining users and can be freed safely.

In an RCU radix tree, modifications still have to be serialized against each other, but lookups no longer need to be serialized against modifications.
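
The publish/read pattern described above can be sketched in user space with C11 atomics. This is only an analogy, not kernel code: struct item, read_value() and publish_update() are hypothetical names, the acquire load stands in for rcu_dereference(), the atomic exchange stands in for rcu_assign_pointer(), and the grace period is not implemented at all (the caller is assumed to know when no reader can still see the old version).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical payload; stands in for a tree node or a page. */
struct item {
    int value;
};

/* The shared slot; zero-initialized, so it starts out empty (NULL). */
static _Atomic(struct item *) slot;

/* Reader side, in the spirit of rcu_dereference(*slot): the acquire load
 * ensures the item's fields are visible once the pointer is. */
static int read_value(int fallback)
{
    struct item *p = atomic_load_explicit(&slot, memory_order_acquire);
    return p ? p->value : fallback;
}

/* Writer side, in the spirit of rcu_assign_pointer(): build the new
 * version off to the side, then switch the pointer in one atomic step.
 * The old version is handed back through old_out; it may be freed only
 * after all readers that might still hold it are done (not modeled here).
 * Returns 0 on success, -1 if allocation fails (nothing is changed). */
static int publish_update(int new_value, struct item **old_out)
{
    struct item *fresh;

    assert(old_out != NULL);
    fresh = malloc(sizeof(*fresh));
    if (!fresh)
        return -1;
    fresh->value = new_value;
    *old_out = atomic_exchange_explicit(&slot, fresh, memory_order_acq_rel);
    return 0;
}
```

A reader never blocks in this sketch; it sees either the old item or the new one, never a half-built one. In the kernel, the "wait until it is safe to free" step is what the quiescent-state tracking provides.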

RCU makes lookups on the radix tree fully parallel, but modifications then become the bottleneck. This can be improved by splitting the single tree-wide lock into smaller locks; the obvious approach is to lock individual nodes instead of the whole tree.

Radix tree modifications can be divided into unidirectional and bidirectional operations. Unidirectional operations move in one direction only, from the root toward the leaves; they include insert, update, and set-tag. Bidirectional operations are more complex: after reaching the leaf they have to walk back up the tree; they include delete and clear-tag.
Ladder locking, also called lock coupling, is a technique commonly used in databases. It allows a tree with per-node locks to be traversed in one direction (traversing in both directions could deadlock). If every modifier walks the tree from top to bottom and holds the lock of the node it is working on, it moves down by first locking the child and only then releasing the lock on the parent. Concurrent operation becomes possible because, as soon as the root node is unlocked, another operation can start from the top. If the paths of two operations share no nodes, the later operation may even finish before the earlier one. The worst case is pipelining, which is still much better than fully serialized operation.
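
The lock hand-over described above can be sketched as follows. This is a minimal user-space illustration using POSIX mutexes, not the kernel patch: struct lnode, lnode_init(), ladder_store(), and the fan-out of 4 are all assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define FANOUT 4   /* hypothetical fan-out, chosen small for the example */
#define SHIFT  2   /* log2(FANOUT) */

struct lnode {
    pthread_mutex_t lock;
    struct lnode *child[FANOUT];  /* used on interior levels */
    long value[FANOUT];           /* used on the leaf level */
};

static void lnode_init(struct lnode *n)
{
    pthread_mutex_init(&n->lock, NULL);
    for (int i = 0; i < FANOUT; i++) {
        n->child[i] = NULL;
        n->value[i] = 0;
    }
}

/* Store value at index in a tree of the given height (height 1 means the
 * root is itself a leaf). At most two node locks are held at any moment.
 * Returns 0 on success, -1 if the path to the leaf does not exist. */
static int ladder_store(struct lnode *root, unsigned height,
                        unsigned long index, long value)
{
    struct lnode *node = root;

    assert(root != NULL && height >= 1);
    pthread_mutex_lock(&node->lock);
    while (height > 1) {
        unsigned s = (index >> ((height - 1) * SHIFT)) & (FANOUT - 1);
        struct lnode *next = node->child[s];

        if (!next) {
            pthread_mutex_unlock(&node->lock);
            return -1;
        }
        pthread_mutex_lock(&next->lock);    /* take the child first ...   */
        pthread_mutex_unlock(&node->lock);  /* ... then release the parent */
        node = next;
        height--;
    }
    node->value[index & (FANOUT - 1)] = value;
    pthread_mutex_unlock(&node->lock);
    return 0;
}
```

Because the parent is released as soon as the child is locked, a second writer can enter at the root while the first is still working further down, which is the pipelining behaviour mentioned above.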

The two bidirectional operations, clearing a tag and deleting an element, are described below:

8.1 Clearing a Tag

Clearing a tag in the radix tree means walking down the tree, locating the target entry, and clearing that entry's tag. After that, as long as a node has no tagged entries left, the walk continues upward and clears the corresponding tag in the parent. The termination condition is: the upward walk stops at the first node that, after the tag has been cleared, still has one or more tagged entries. So that the same termination point can be identified during the downward traversal, the condition is restated as: the upward walk ends at a node that has more tagged entries than the number of tags being cleared. Whenever such a node is encountered on the way down, it is taken as the termination point of the upward walk.
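
The upward walk and its termination condition can be sketched like this. It is a simplified model under stated assumptions, not the kernel implementation: struct tnode, tag_clear() and is_termination_point() are hypothetical names, there is a single tag, and each node keeps one bit per slot in a plain unsigned bitmap.

```c
#include <assert.h>
#include <stddef.h>

#define FANOUT 4   /* hypothetical fan-out */

struct tnode {
    struct tnode *parent;  /* NULL for the root */
    unsigned offset;       /* index of this node's slot in its parent */
    unsigned tagged;       /* bit i set: slot i (or its subtree) is tagged */
};

/* Clear the tag of slot `offset` in `leaf`, then walk upward clearing the
 * parent's bit for every node that is left without tagged entries.
 * Returns the number of nodes whose bitmap was changed. */
static int tag_clear(struct tnode *leaf, unsigned offset)
{
    struct tnode *node = leaf;
    int touched = 0;

    assert(leaf != NULL && offset < FANOUT);
    while (node) {
        if (!(node->tagged & (1u << offset)))
            break;                  /* nothing to clear at this level */
        node->tagged &= ~(1u << offset);
        touched++;
        if (node->tagged)
            break;                  /* other tagged entries remain: stop */
        offset = node->offset;      /* continue with our bit in the parent */
        node = node->parent;
    }
    return touched;
}

/* Downward-traversal view of the same condition: a node with more tagged
 * entries than the single one being cleared is where the walk will stop. */
static int is_termination_point(const struct tnode *node)
{
    unsigned t = node->tagged;
    return (t & (t - 1)) != 0;      /* true when at least two bits are set */
}
```

With per-node locking, knowing the termination point on the way down matters because nodes above it will not be touched by the upward walk.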

8.2 Deleting an Element

Deleting an element must remove the nodes that become unused and must also clear all tags of the deleted entry, so its termination condition has to cover both aspects. For the upward walk the condition is: stop when a node is reached that is not empty and has no tags left to clear.
The criterion for recognizing this point during the downward traversal is: a node with more than two children and, for each tag, more tagged entries than the number being cleared. This condition identifies the termination point of the upward walk.
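
The node-removal half of this walk can be sketched as below; tag clearing is deliberately left out to keep the example short. All names (struct dnode, delete_item()) and the fan-out are assumptions for illustration, and nodes are only unlinked, not freed, since freeing would have to wait for an RCU grace period.

```c
#include <assert.h>
#include <stddef.h>

#define FANOUT 4   /* hypothetical fan-out */

struct dnode {
    struct dnode *parent;   /* NULL for the root */
    unsigned offset;        /* index of this node's slot in its parent */
    unsigned count;         /* number of occupied slots */
    void *slots[FANOUT];    /* items on the leaf level, children above */
};

/* Remove the entry at `offset` in `leaf`, then walk upward unlinking every
 * node that has become empty. The walk stops at the first node that is
 * still non-empty. Returns the number of nodes that became empty. */
static int delete_item(struct dnode *leaf, unsigned offset)
{
    struct dnode *node = leaf;
    int emptied = 0;

    assert(leaf != NULL && offset < FANOUT);
    while (node && node->slots[offset]) {
        node->slots[offset] = NULL;
        node->count--;
        if (node->count > 0)
            break;                  /* non-empty node: upward walk ends */
        emptied++;
        offset = node->offset;      /* unlink this node from its parent */
        node = node->parent;
    }
    return emptied;
}
```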

8.3 API for Concurrent Operations: Looking Up a Slot

Lookups support non-blocking, parallel reads under RCU, so they must follow the RCU usage rules: take the RCU read lock and access the returned slot through rcu_dereference(). A write (or update) operation must store the new value into the slot with rcu_assign_pointer(). A lookup is used as follows:

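struct page **slot, *page;

rcu_read_lock();
slot = (struct page **)radix_tree_lookup_slot(&mapping->page_tree, index);
/* radix_tree_lookup_slot() returns NULL when there is no slot for index */
page = slot ? rcu_dereference(*slot) : NULL;
rcu_read_unlock();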

8.4 API for Concurrent Operations: Looking Up and Modifying a Slot

The Linux kernel's radix tree needs a patch to support concurrent modification. A lookup depends on only one piece of global state, the RCU quiescent state, whereas a concurrent modification has to track which locks are held. That lock state must live outside the individual operation, so a local context is instantiated to track the locks. Looking up and modifying a slot is done as follows:

struct page **slot;
DEFINE_RADIX_TREE_CONTEXT(ctx, &mapping->page_tree);

radix_tree_lock(&ctx);	/* lock the root node */
/* Pass ctx.tree instead of &mapping->page_tree as the root, so that the context is passed along */
slot = (struct page **)radix_tree_lookup_slot(ctx.tree, index);
rcu_assign_pointer(*slot, new_page);
radix_tree_unlock(&ctx);

The radix tree API function radix_tree_lookup_slot() contains the mechanism that moves the lock down from the top of the tree. The part of the code that moves the lock is:

void **radix_tree_lookup_slot(struct radix_tree *root, unsigned long index)
{
	...
	RADIX_TREE_CONTEXT(context, root);	/* provides the context and the actual root pointer */
	...
	do {
		...
		/* move the lock down from the top of the tree */
		radix_ladder_lock(context, node);
		...
	} while (height > 0);
	...
}

2. Other Points to Note

Indirect pointers and direct pointers

In practice, the objects these pointers refer to are aligned to more than one byte, so the lowest bit of a valid address is never 1. That bit can therefore carry extra information: a value of 0 means the pointer is a direct pointer that points straight at the item data, and a value of 1 means it is an indirect pointer that points to a node on the next level.

#define RADIX_TREE_INDIRECT_PTR 1

Set the lowest bit to 1:

static inline void *radix_tree_ptr_to_indirect(void *ptr)
{
	return (void *)((unsigned long)ptr | RADIX_TREE_INDIRECT_PTR);
}

Set the lowest bit to 0:

static inline void *radix_tree_indirect_to_ptr(void *ptr)
{
	return (void *)((unsigned long)ptr & ~RADIX_TREE_INDIRECT_PTR);
}

Test whether a pointer is an indirect pointer:

static inline int radix_tree_is_indirect_ptr(void *ptr)
{
	return (int)((unsigned long)ptr & RADIX_TREE_INDIRECT_PTR);
}
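
To see the low-bit trick work outside the kernel, the same three helpers can be mirrored in user space. This is a sketch under assumptions: the names to_indirect(), to_ptr() and is_indirect() are hypothetical, uintptr_t is used instead of unsigned long for portability, and the assert documents the precondition that the incoming address has its lowest bit clear.

```c
#include <assert.h>
#include <stdint.h>

#define INDIRECT_BIT ((uintptr_t)1)

/* Mark a pointer as indirect by setting its lowest bit. */
static void *to_indirect(void *ptr)
{
    assert(((uintptr_t)ptr & INDIRECT_BIT) == 0);  /* must be aligned */
    return (void *)((uintptr_t)ptr | INDIRECT_BIT);
}

/* Strip the mark again to recover the real address. */
static void *to_ptr(void *ptr)
{
    return (void *)((uintptr_t)ptr & ~INDIRECT_BIT);
}

/* Non-zero when the pointer carries the indirect mark. */
static int is_indirect(const void *ptr)
{
    return (int)((uintptr_t)ptr & INDIRECT_BIT);
}
```

A marked pointer must never be dereferenced directly; it always has to pass through to_ptr() first.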
