Original: http://wurong81.spaces.live.com/blog/cns!5eb4a630986c6ecc!393.entry
Radix tree used by Linux Kernel
1. Introduction
In the page cache of the Linux kernel, each data block of a file corresponds to at most one page cache entry. The kernel manages these entries through two data structures: a radix tree and a doubly linked list. A radix tree is a search tree; the Linux kernel uses it to locate a cache entry quickly from a file offset. As an example of radix tree address indexing, consider a radix tree with a fan-out of 4 (2^2) and a height of 4. It can quickly locate any 8-bit file offset, because each of the 4 levels consumes 2 bits of the offset. Each leaf node in the radix tree points to the cache entry for the corresponding offset in the file.
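The 8-bit example above can be sketched in userspace C. This is only a toy under stated assumptions, not the kernel's implementation: the names toy_node, toy_insert and toy_lookup are hypothetical, the height is fixed at 4, and there is no deletion or locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SHIFT  2                     /* 2 bits of the offset per level */
#define TOY_SIZE   (1u << TOY_SHIFT)     /* fan-out of 4 */
#define TOY_MASK   (TOY_SIZE - 1)
#define TOY_HEIGHT 4                     /* 4 levels * 2 bits = 8-bit offsets */

struct toy_node {
    void *slots[TOY_SIZE];   /* child nodes, or cached items at the bottom level */
};

/* Store item at an 8-bit offset, allocating interior nodes as needed.
 * Returns 0 on success, -1 if an allocation fails. */
static int toy_insert(struct toy_node *root, unsigned offset, void *item)
{
    struct toy_node *node = root;

    assert(offset < (1u << (TOY_SHIFT * TOY_HEIGHT)));
    for (int level = TOY_HEIGHT - 1; level > 0; level--) {
        unsigned idx = (offset >> (level * TOY_SHIFT)) & TOY_MASK;

        if (!node->slots[idx]) {
            node->slots[idx] = calloc(1, sizeof(struct toy_node));
            if (!node->slots[idx])
                return -1;
        }
        node = node->slots[idx];
    }
    node->slots[offset & TOY_MASK] = item;
    return 0;
}

/* Walk down from the root, consuming 2 bits of the offset per level. */
static void *toy_lookup(const struct toy_node *root, unsigned offset)
{
    const struct toy_node *node = root;

    assert(offset < (1u << (TOY_SHIFT * TOY_HEIGHT)));
    for (int level = TOY_HEIGHT - 1; level > 0; level--) {
        node = node->slots[(offset >> (level * TOY_SHIFT)) & TOY_MASK];
        if (!node)
            return NULL;     /* nothing is cached along this path */
    }
    return node->slots[offset & TOY_MASK];
}
```

For the offset 0xB4 (binary 10 11 01 00), the walk visits slots 2, 3, 1 and 0, one per level from the root down.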
In Linux, the radix tree can be described as follows:
The tree shown in this figure has a fan-out of 2^2 = 4 and a height of 3. Each node contains a height field, from which the maximum integer key that the subtree rooted at that node can hold is derived, to speed up searches. Deriving that maximum requires a global array like the following:
static unsigned long height_to_maxindex[RADIX_TREE_MAX_PATH + 1];
2. Important differences from basic Radix trees
(1) The key is an integer in binary form rather than a string, and comparing integers is much faster than comparing strings. Every leaf corresponds to a long integer value, namely an index into the address space.
(2) Because each branch is selected by a group of bits of the integer key, the edges of a path carry no content of their own. The sequence of child links alone describes a path, just as in a standard trie. With 2 bits per level, each group of bits read as a decimal value selects one of four child links: 0, 1, 2 or 3.
3. Key implementation points
(1) The fan-out varies with the kernel configuration. For example, lib/radix-tree.c contains the following definitions:
#ifdef __KERNEL__
#define RADIX_TREE_MAP_SHIFT (CONFIG_BASE_SMALL ? 4 : 6)
#else
#define RADIX_TREE_MAP_SHIFT 3 /* for more stressful testing */
#endif
#define RADIX_TREE_MAP_SIZE (1UL << RADIX_TREE_MAP_SHIFT)
#define RADIX_TREE_MAP_MASK (RADIX_TREE_MAP_SIZE-1)
In other words, in a kernel build, if CONFIG_BASE_SMALL is set, each node has 2^4 = 16 branches; otherwise it has 2^6 = 64 branches. RADIX_TREE_MAP_MASK has its low 4 or 6 bits set, so it extracts the index bits for one level. The maximum depth then follows directly:
#define RADIX_TREE_INDEX_BITS (8 /* CHAR_BIT */ * sizeof(unsigned long))
#define RADIX_TREE_MAX_PATH (DIV_ROUND_UP(RADIX_TREE_INDEX_BITS, \
                                          RADIX_TREE_MAP_SHIFT))
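To make these macros concrete, here is a userspace sketch that evaluates them and fills a height_to_maxindex table. It assumes RADIX_TREE_MAP_SHIFT is fixed at 6. DIV_ROUND_UP is redefined locally, and maxindex_for and init_maxindex are hypothetical helper names rather than the kernel's own functions.

```c
#include <assert.h>
#include <limits.h>

#define DIV_ROUND_UP(n, d)     (((n) + (d) - 1) / (d))
#define RADIX_TREE_MAP_SHIFT   6   /* assumption: CONFIG_BASE_SMALL is not set */
#define RADIX_TREE_MAP_SIZE    (1UL << RADIX_TREE_MAP_SHIFT)
#define RADIX_TREE_MAP_MASK    (RADIX_TREE_MAP_SIZE - 1)
#define RADIX_TREE_INDEX_BITS  (CHAR_BIT * sizeof(unsigned long))
#define RADIX_TREE_MAX_PATH    (DIV_ROUND_UP(RADIX_TREE_INDEX_BITS, \
                                             RADIX_TREE_MAP_SHIFT))

static unsigned long height_to_maxindex[RADIX_TREE_MAX_PATH + 1];

/* Largest index that a tree of the given height can hold. */
static unsigned long maxindex_for(unsigned int height)
{
    unsigned long width = (unsigned long)height * RADIX_TREE_MAP_SHIFT;

    if (width == 0)
        return 0;            /* height 0: only index 0 fits */
    if (width >= RADIX_TREE_INDEX_BITS)
        return ~0UL;         /* every bit of the key is covered */
    return (1UL << width) - 1;
}

/* Fill the table once, so later range checks are a single array read. */
static void init_maxindex(void)
{
    for (unsigned int h = 0; h <= RADIX_TREE_MAX_PATH; h++)
        height_to_maxindex[h] = maxindex_for(h);

    /* A tree of maximum height must cover the whole key space. */
    assert(height_to_maxindex[RADIX_TREE_MAX_PATH] == ~0UL);
}
```

With a 64-bit unsigned long, a shift of 6 gives DIV_ROUND_UP(64, 6) = 11 levels, while a shift of 4 would give 16 levels.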
(2) Because insert and delete must maintain the height field of each node, they are not just the split and merge operations of a basic radix tree; the height of the affected nodes must also be adjusted. This is done by calling radix_tree_extend and radix_tree_shrink respectively.
Here, radix_tree_extend has to adjust the tree only when the newly inserted integer is greater than the maximum integer that can be stored at the current height, that is, the largest value a full tree of that height can hold. If the new integer does not exceed that maximum, it is inserted at the appropriate level without adjusting the height of any other node.
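The height check described above can be sketched as follows. This is a simplified model, not the kernel's radix_tree_extend: it only computes the height the tree would need, assumes a fan-out of 64, and uses the hypothetical names toy_maxindex and toy_extend_height.

```c
#include <assert.h>
#include <limits.h>

#define TOY_MAP_SHIFT 6   /* assumption: 64 slots per node */

/* Largest index that fits in a full tree of the given height. */
static unsigned long toy_maxindex(unsigned int height)
{
    unsigned long width = (unsigned long)height * TOY_MAP_SHIFT;

    if (width >= CHAR_BIT * sizeof(unsigned long))
        return ~0UL;
    return (1UL << width) - 1;
}

/* Return the height the tree needs so that index fits.
 * If index already fits, the height is returned unchanged. */
static unsigned int toy_extend_height(unsigned int height, unsigned long index)
{
    while (index > toy_maxindex(height))
        height++;   /* a real tree would add one new root level per step */

    assert(index <= toy_maxindex(height));
    return height;
}
```

For instance, a tree of height 1 holds indices 0 to 63, so inserting index 64 needs height 2, while inserting index 63 leaves the height alone.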
(3) Concurrency considerations: Because the page cache is global and processes access it constantly, concurrent performance must be taken into account. The heavy contention caused by locking a single tree could not meet the speed requirements. Linux therefore uses RCU when traversing the tree, so that readers can proceed concurrently while staying synchronized with writers.
RCU stands for read-copy update. To be honest, I have not studied it in any depth. I only know that it is a mechanism that lets readers traverse the radix tree without taking any lock, even while insert/delete operations are running. In the kernel code, a lookup reads a node with a call like node = rcu_dereference(*slot), and an insert/delete operation updates a pointer with a call like rcu_assign_pointer(node->slots[offset], slot). The actual synchronization work is all handled by RCU.
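As a rough userspace analogy only, the publish and read halves of that pattern can be imitated with C11 atomics. This is not the kernel's RCU: toy_publish and toy_read are hypothetical names, a release store stands in for rcu_assign_pointer, an acquire load stands in for rcu_dereference, and the part of RCU that decides when an old node may safely be freed is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_SLOTS 4

struct toy_item {
    int value;
};

struct toy_node {
    _Atomic(struct toy_item *) slots[TOY_SLOTS];
};

/* Writer side, in the role of rcu_assign_pointer(): the release store makes
 * the item's contents visible before the pointer to it becomes visible. */
static void toy_publish(struct toy_node *node, unsigned offset,
                        struct toy_item *item)
{
    assert(offset < TOY_SLOTS);
    atomic_store_explicit(&node->slots[offset], item, memory_order_release);
}

/* Reader side, in the role of rcu_dereference(): no lock is taken, and the
 * acquire load pairs with the writer's release store. */
static struct toy_item *toy_read(struct toy_node *node, unsigned offset)
{
    assert(offset < TOY_SLOTS);
    return atomic_load_explicit(&node->slots[offset], memory_order_acquire);
}
```

A reader calling toy_read sees either the old pointer or the new one, never a half-initialized item, which is the property the lookup path relies on.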