This article introduces two nginx data structures, ngx_int_t and ngx_rbtree_t. With 71 source files in the ./src/core subdirectory, it is hard to know where to start. Browsing nginx.c, the file that contains the main function, shows that nginx relies heavily on its own wrapped data structures, and it is difficult to follow the operations in main without first working out what those structures are. So I picked a seemingly basic data structure and started from there. The ngx_core.h file pulls together all of the nginx data structures. It includes ngx_config.h first, and in ngx_config.h we find three types.
1. ngx_int_t, ngx_uint_t, and ngx_flag_t
The first unfamiliar data type seen in nginx.c is ngx_int_t, and its definition is found in ngx_config.h.

    typedef intptr_t        ngx_int_t;
    typedef uintptr_t       ngx_uint_t;
    typedef intptr_t        ngx_flag_t;
That gives three data types. intptr_t and uintptr_t were not covered in my introductory C course; I found their definitions in the C header file stdint.h.

    /* Types for `void *' pointers.  */
    #if __WORDSIZE == 64
    # ifndef __intptr_t_defined
    typedef long int                intptr_t;
    #  define __intptr_t_defined
    # endif
    typedef unsigned long int       uintptr_t;
    #else
    # ifndef __intptr_t_defined
    typedef int                     intptr_t;
    #  define __intptr_t_defined
    # endif
    typedef unsigned int            uintptr_t;
    #endif
Note the comment: these are types for void * pointers. The names look like "pointer to int", which is confusing: why would an integer need an integer-flavoured pointer type? In fact intptr_t and uintptr_t are not pointer types at all; they are signed and unsigned integer types wide enough to hold a pointer value. The macro at the top makes this clear. If the machine word is 64 bits, intptr_t is long int and uintptr_t is unsigned long int; with the 64-bit compiler on my machine, sizeof() reports 8 bytes (64 bits) for both. On platforms with a smaller word size, intptr_t is int and uintptr_t is unsigned int; int and unsigned int are 4 bytes under a 32-bit compiler and 2 bytes under a 16-bit compiler. So intptr_t/uintptr_t are integer types whose width follows the platform word length. The explanation I found in "deep analysis of Linux kernel source code" is that the kernel treats memory as one large array and a pointer as an index into it; kernel programmers use these special integers to hold memory addresses and operate on memory, which is more intuitive and less error-prone than using pointers. In nginx, they seem to be used simply as platform-sized int and unsigned int variables.
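To make the "integer wide enough for a pointer" idea concrete, here is a minimal sketch. The function names are hypothetical and do not come from nginx; the only property relied on is the standard one that a valid void pointer converted to uintptr_t and back compares equal to the original.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: stores a pointer in a uintptr_t and converts it
 * back. The C standard guarantees the result compares equal to p. */
void *
roundtrip_through_uintptr(void *p)
{
    uintptr_t bits = (uintptr_t) p;   /* pointer -> integer */

    return (void *) bits;             /* integer -> pointer */
}

/* Hypothetical helper: returns 1 when intptr_t is exactly as wide as a
 * pointer. This holds on common 32- and 64-bit platforms, but the
 * standard only requires "wide enough". */
int
intptr_matches_pointer_width(void)
{
    return sizeof(intptr_t) == sizeof(void *);
}
```

Calling roundtrip_through_uintptr(&x) for some local variable x should hand back a pointer that still dereferences to x.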
2. ngx_rbtree_t
2.1 What is a red/black tree
As someone who spent years coasting through ACM competitions before retiring from them, I am still sensitive to a famous data structure like the red-black tree. A red-black tree is a balanced binary search tree under a particular set of constraints. Anyone who has studied data structures knows the AVL tree, the earliest self-balancing binary tree in the textbooks: it strictly requires that the heights of the two subtrees of any node differ by at most 1, so the distances from the root to all leaf nodes are roughly the same (balanced).
A red-black tree does not pursue strict balance; it achieves approximate balance through five constraints:
① Every node is either red or black;
② The root is black;
③ Every leaf (the empty node at the edge of the tree) is black;
④ Both children of a red node are black;
⑤ Every simple path from a node down to its descendant leaves contains the same number of black nodes.
In an AVL tree, the ratio of the longest root-to-leaf distance to the shortest cannot exceed 2. The red-black constraints guarantee the same property: the longest possible path alternates red and black nodes while the shortest possible path is all black, so the longest path is at most twice the length of the shortest.
As an implementation of a balanced binary search tree, a red-black tree keeps its keys in sorted order (an in-order traversal visits them in ascending order). Like the AVL tree, it supports search, insertion, and deletion in O(log n) time.
Compared with the AVL tree, the coloring ensures that the rebalancing after each insertion or deletion changes the topology of the tree at only a constant number of nodes. In the worst case O(log n) nodes need to be recolored, but in the amortized sense (averaged over a sequence of operations) only O(1) nodes are involved. The price is that, without a strict balance constraint, the height difference between the left and right subtrees of a red-black tree can be greater than in an AVL tree.
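The five constraints can be checked mechanically. The following is a minimal sketch, not nginx code: the rb_node type and both function names are assumptions made for illustration, leaves are represented by NULL rather than by a sentinel, and key ordering is not verified.

```c
#include <assert.h>
#include <stddef.h>

enum { BLACK = 0, RED = 1 };

/* Hypothetical node type used only for this check. */
typedef struct rb_node {
    int              key;
    int              color;
    struct rb_node  *left;
    struct rb_node  *right;
} rb_node;

/* Returns the black height of the subtree (a NULL leaf counts as one
 * black node), or -1 if constraint 4 or 5 is violated below n. */
int
rb_black_height(const rb_node *n)
{
    int lh, rh;

    if (n == NULL) {
        return 1;
    }

    if (n->color == RED
        && ((n->left != NULL && n->left->color == RED)
            || (n->right != NULL && n->right->color == RED)))
    {
        return -1;                  /* red node with a red child */
    }

    lh = rb_black_height(n->left);
    rh = rb_black_height(n->right);

    if (lh < 0 || rh < 0 || lh != rh) {
        return -1;                  /* black counts differ */
    }

    return lh + (n->color == BLACK ? 1 : 0);
}

/* Returns 1 if the tree satisfies constraints 2, 4 and 5. */
int
rb_is_valid(const rb_node *root)
{
    if (root != NULL && root->color != BLACK) {
        return 0;                   /* the root must be black */
    }

    return rb_black_height(root) > 0;
}
```

A black root with two red children passes the check; painting only one of those children black breaks constraint ⑤ and the check fails.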
2.2 ngx_rbtree.h
This is a rare opportunity: we will use the nginx source code as material to gain a deeper understanding of how a red-black tree is implemented. First, the node structure:

    typedef ngx_uint_t  ngx_rbtree_key_t;
    typedef ngx_int_t   ngx_rbtree_key_int_t;

    typedef struct ngx_rbtree_node_s  ngx_rbtree_node_t;

    struct ngx_rbtree_node_s {
        ngx_rbtree_key_t       key;     // platform-dependent unsigned integer key
        ngx_rbtree_node_t     *left;    // left child node pointer
        ngx_rbtree_node_t     *right;   // right child node pointer
        ngx_rbtree_node_t     *parent;  // parent node pointer
        u_char                 color;   // node color
        u_char                 data;    // node data
    };
Then there is the structure definition of the red/black tree:
    typedef struct ngx_rbtree_s  ngx_rbtree_t;  // "_s" marks the struct, "_t" marks the type

    // Function pointer type for the red/black tree insertion function
    typedef void (*ngx_rbtree_insert_pt) (ngx_rbtree_node_t *root,
        ngx_rbtree_node_t *node, ngx_rbtree_node_t *sentinel);

    struct ngx_rbtree_s {
        ngx_rbtree_node_t     *root;      // root node pointer
        ngx_rbtree_node_t     *sentinel;  // sentinel node pointer
        ngx_rbtree_insert_pt   insert;    // insert function pointer
    };
Using a function pointer as a struct member gives the struct the feel of a class (with both member variables and member methods). This technique is quite common in the nginx source code. For functions, nginx has another, more magical tool, the macro:

    /* Assigning the insert function pointer here is what achieves polymorphism. */
    #define ngx_rbtree_init(tree, s, i)                                           \
        ngx_rbtree_sentinel_init(s);                                              \
        (tree)->root = s;                                                         \
        (tree)->sentinel = s;                                                     \
        (tree)->insert = i
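The combination of a function-pointer member and an init step that assigns it can be sketched in a few lines. Everything here (counter_t, counter_init, the two step functions) is a hypothetical illustration, not part of nginx; it only shows how two objects of the same struct type can behave differently depending on the function stored at initialization.

```c
#include <assert.h>

typedef struct counter_s counter_t;
typedef void (*counter_step_pt)(counter_t *c);

struct counter_s {
    int              value;
    counter_step_pt  step;      /* the "method", chosen at init time */
};

void
step_by_one(counter_t *c)
{
    c->value += 1;
}

void
step_by_ten(counter_t *c)
{
    c->value += 10;
}

/* Plays the role of an init macro: stores the behaviour in the object. */
void
counter_init(counter_t *c, counter_step_pt step)
{
    c->value = 0;
    c->step = step;
}
```

After counter_init(&a, step_by_one) and counter_init(&b, step_by_ten), the same call shape a.step(&a) and b.step(&b) advances the two counters by different amounts, which is the effect the insert pointer gives a tree.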
Macros are used here to get the effect of an inline function (when an implementation is simple enough, its body is expanded at the point of use, much as a simple member function is written directly inside a class body). This is puzzling, because C does have an inline keyword, and an inline function is even defined in this same header file. Before studying that inline function, there are several more macros to look at:

    #define ngx_rbt_red(node)               ((node)->color = 1)
    #define ngx_rbt_black(node)             ((node)->color = 0)
    #define ngx_rbt_is_red(node)            ((node)->color)
    #define ngx_rbt_is_black(node)          (!ngx_rbt_is_red(node))
    #define ngx_rbt_copy_color(n1, n2)      (n1->color = n2->color)

    /* a sentinel must be black */

    #define ngx_rbtree_sentinel_init(node)  ngx_rbt_black(node)
The names in the nginx source are self-explanatory, so there is no need to look anything up or hunt for comments. Setting color to 1 paints a node red and setting it to 0 paints it black; a node whose color is nonzero is red, and a node that is not red is black; copying a node's color copies the color value; and the sentinel node must be painted black.

    static ngx_inline ngx_rbtree_node_t *
    ngx_rbtree_min(ngx_rbtree_node_t *node, ngx_rbtree_node_t *sentinel)
    {
        while (node->left != sentinel) {
            node = node->left;
        }

        return node;
    }
ngx_inline is a macro whose actual value is the keyword inline. This inline function is easy to understand: it finds the node with the smallest key in the subtree rooted at the given node, by following left children down to the left edge of the subtree. It is reasonable to guess that the sentinel node is an empty node that marks the edge of the tree.
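A small self-contained sketch shows the same walk on a hand-built tree. The node_t type and tree_min name are stand-ins chosen for this example; the only idea borrowed from the text is that a shared sentinel node, rather than NULL, marks the edge of the tree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node. */
typedef struct node_s node_t;
struct node_s {
    int      key;
    node_t  *left;
    node_t  *right;
};

/* Follows left children until the next one is the sentinel, so the
 * returned node holds the smallest key in the subtree. */
node_t *
tree_min(node_t *node, node_t *sentinel)
{
    while (node->left != sentinel) {
        node = node->left;
    }

    return node;
}
```

For a tree with root 3 and children 1 and 5, all of whose empty child slots point at one sentinel, tree_min on the root returns the node holding 1, and tree_min on the node holding 5 returns that node itself.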
2.3 red/black tree node insertion
Next, let's dive into ngx_rbtree.c to see how nginx implements several key red/black tree methods.
    void
    ngx_rbtree_insert(ngx_rbtree_t *tree, ngx_rbtree_node_t *node)
    {
        // root:     pointer to the root node pointer (at first I wondered whether
        //           it was an array of root pointers, which was puzzling)
        // temp:     temporary node pointer
        // sentinel: the sentinel node pointer; my guess is that it marks the
        //           positions where there is no child
        ngx_rbtree_node_t  **root, *temp, *sentinel;

        /* a binary tree insert */

        root = (ngx_rbtree_node_t **) &tree->root;  // address of the root pointer
        sentinel = tree->sentinel;                  // the sentinel pointer

        if (*root == sentinel) {        // special case: the root is the sentinel, so the tree is empty
            node->parent = NULL;        // the new node becomes the root
            node->left = sentinel;      // its left child is the sentinel
            node->right = sentinel;     // its right child is also the sentinel
            ngx_rbt_black(node);        // the new root is black
            *root = node;               // make the new node the root

            return;                     // insertion finished
        }

        // The insert pointer was given a function address when the tree was
        // initialized; see the macro ngx_rbtree_init(tree, s, i), which paints
        // the sentinel black, points both root and sentinel at it, and assigns
        // the insert pointer. ngx_rbtree.c has two functions that match the
        // parameter list: insert value and insert timer value. We will see
        // below how they are implemented and how they differ.
        tree->insert(*root, node, sentinel);

        /* re-balance tree */

        // loop while the node is not the root and its parent is red
        while (node != *root && ngx_rbt_is_red(node->parent)) {

            // if the parent is a left child, get the parent's right sibling
            if (node->parent == node->parent->parent->left) {
                temp = node->parent->parent->right;

                if (ngx_rbt_is_red(temp)) {                 // the parent's right sibling is red
                    ngx_rbt_black(node->parent);            // paint the parent black
                    ngx_rbt_black(temp);                    // paint the parent's right sibling black
                    ngx_rbt_red(node->parent->parent);      // paint the grandparent red
                    node = node->parent->parent;            // the grandparent becomes the current node

                } else {                                    // the parent's right sibling is black
                    if (node == node->parent->right) {      // the node is a right child
                        node = node->parent;                // the parent becomes the current node
                        ngx_rbtree_left_rotate(root, sentinel, node);   // rotate it left
                    }

                    ngx_rbt_black(node->parent);            // paint the parent black
                    ngx_rbt_red(node->parent->parent);      // paint the grandparent red
                    ngx_rbtree_right_rotate(root, sentinel, node->parent->parent);  // rotate the grandparent right
                }

            } else {    // the parent is a right child: get the parent's left sibling
                temp = node->parent->parent->left;

                if (ngx_rbt_is_red(temp)) {                 // the parent's left sibling is red
                    ngx_rbt_black(node->parent);            // paint the parent black
                    ngx_rbt_black(temp);                    // paint the parent's left sibling black
                    ngx_rbt_red(node->parent->parent);      // paint the grandparent red
                    node = node->parent->parent;

                } else {                                    // the parent's left sibling is black
                    if (node == node->parent->left) {       // the node is a left child
                        node = node->parent;                // the parent becomes the current node
                        ngx_rbtree_right_rotate(root, sentinel, node);  // rotate it right
                    }

                    ngx_rbt_black(node->parent);            // paint the parent black
                    ngx_rbt_red(node->parent->parent);      // paint the grandparent red
                    ngx_rbtree_left_rotate(root, sentinel, node->parent->parent);   // rotate the grandparent left
                }
            }
        }

        ngx_rbt_black(*root);   // paint the root black
    }
Then there is the basic node insertion function corresponding to the ngx_rbtree_insert_pt pointer:
    void
    ngx_rbtree_insert_value(ngx_rbtree_node_t *temp, ngx_rbtree_node_t *node,
        ngx_rbtree_node_t *sentinel)
    {
        ngx_rbtree_node_t  **p;     // harmless, but the two-level pointer is confusing

        for ( ;; ) {    // unconditional loop, equivalent to while (1) but one character shorter

            p = (node->key < temp->key) ? &temp->left : &temp->right;

            if (*p == sentinel) {   // found the leaf position for the new node
                break;
            }

            temp = *p;
        }

        // The new node takes the place of the sentinel at that position: it
        // becomes a new red leaf, with the sentinel as both of its children.
        *p = node;
        node->parent = temp;
        node->left = sentinel;
        node->right = sentinel;
        ngx_rbt_red(node);
    }
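The two-level pointer p is easier to appreciate next to a version that does without it. In this sketch, node_t and both function names are hypothetical; insert_with_slot mirrors the shape of the function above (without colors), while insert_plain is my own comparison variant and has to test the direction a second time when it links the node in.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node_s node_t;
struct node_s {
    unsigned  key;
    node_t   *left, *right, *parent;
};

/* p points at the child slot (left or right) under examination, so the
 * final link is a single store through p. */
void
insert_with_slot(node_t *temp, node_t *node, node_t *sentinel)
{
    node_t **p;

    for ( ;; ) {
        p = (node->key < temp->key) ? &temp->left : &temp->right;
        if (*p == sentinel) {
            break;
        }
        temp = *p;
    }

    *p = node;
    node->parent = temp;
    node->left = node->right = sentinel;
}

/* Without the slot pointer, the direction is re-tested at the end. */
void
insert_plain(node_t *temp, node_t *node, node_t *sentinel)
{
    node_t *next;

    for ( ;; ) {
        next = (node->key < temp->key) ? temp->left : temp->right;
        if (next == sentinel) {
            break;
        }
        temp = next;
    }

    if (node->key < temp->key) {
        temp->left = node;
    } else {
        temp->right = node;
    }
    node->parent = temp;
    node->left = node->right = sentinel;
}
```

Both variants should build the same tree for the same insertion order; the slot pointer only removes the duplicated comparison.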
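The only difference between the ngx_rbtree_insert_timer_value function and the ngx_rbtree_insert_value function is in the comparison: the timer version subtracts the two keys and tests the sign of the difference instead of comparing them directly, so the ordering stays correct when a timer value overflows and wraps around.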
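The subtract-and-test-sign idea can be sketched on its own. The function name and the fixed 32-bit width are assumptions for illustration (nginx uses its own platform-sized key types), and the result is only meaningful when the two values are less than 2^31 apart.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when timer value a should sort before b, treating the
 * values as points on a 32-bit circle. The unsigned subtraction wraps
 * modulo 2^32; converting the difference to a signed type relies on the
 * usual two's-complement behaviour of mainstream compilers. */
int
timer_before(uint32_t a, uint32_t b)
{
    return (int32_t) (uint32_t) (a - b) < 0;
}
```

A value just below the 32-bit maximum counts as earlier than a small value that has already wrapped, although a plain < comparison would order them the other way round.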
Those are the functions involved in inserting a node. To be honest, I do not like functions this long; to really understand this one I will have to write it out myself. There are many branches and the logic is hard to follow, so a few diagrams help. First, if the tree is empty:
If the tree has only a root node:
If C>:
If C
    static ngx_inline void
    ngx_rbtree_right_rotate(ngx_rbtree_node_t **root, ngx_rbtree_node_t *sentinel,
        ngx_rbtree_node_t *node)
    {
        ngx_rbtree_node_t  *temp;

        temp = node->left;
        node->left = temp->right;   // node's left pointer now points to the right child of its old left child

        if (temp->right != sentinel) {      // if that right child is not the sentinel
            temp->right->parent = node;     // it is re-attached under the rotated node
        }

        temp->parent = node->parent;    // the left child is attached to the rotated node's parent

        if (node == *root) {            // the rotated node was the root
            *root = temp;               // the left child becomes the root

        } else if (node == node->parent->right) {   // the rotated node was a right child
            node->parent->right = temp;             // the left child takes the parent's right slot

        } else {                                    // otherwise it takes the parent's left slot
            node->parent->left = temp;
        }

        temp->right = node;     // the rotated node becomes the right child of its old left child
        node->parent = temp;
    }
Obviously, B will become the new root, and left C right:
If B
The other insertion cases are either mirror images of the ones above or occur in other subtrees of the tree; the actual process is the same. The LL case needs a right rotation, the RR case a left rotation, the LR case a left rotation followed by a right rotation, and the RL case a right rotation followed by a left rotation. Unlike in the AVL tree, the decision to rotate left or right when a node is inserted is based on the colors of one or two nearby nodes rather than on subtree heights; the rest of the process is no different.
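The text lists only the right rotation. The sketch below pairs it with a mirror-image left rotation, written by swapping left and right; node_t and both function names are stand-ins for this example and colors are left out. Rotating right and then rotating left at the node that moved up should restore the original shape.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node_s node_t;
struct node_s {
    int      key;
    node_t  *left, *right, *parent;
};

/* Right rotation: node's left child moves up into node's position. */
void
right_rotate(node_t **root, node_t *sentinel, node_t *node)
{
    node_t *temp = node->left;

    node->left = temp->right;
    if (temp->right != sentinel) {
        temp->right->parent = node;
    }

    temp->parent = node->parent;
    if (node == *root) {
        *root = temp;
    } else if (node == node->parent->right) {
        node->parent->right = temp;
    } else {
        node->parent->left = temp;
    }

    temp->right = node;
    node->parent = temp;
}

/* Left rotation, the mirror image: node's right child moves up. */
void
left_rotate(node_t **root, node_t *sentinel, node_t *node)
{
    node_t *temp = node->right;

    node->right = temp->left;
    if (temp->left != sentinel) {
        temp->left->parent = node;
    }

    temp->parent = node->parent;
    if (node == *root) {
        *root = temp;
    } else if (node == node->parent->left) {
        node->parent->left = temp;
    } else {
        node->parent->right = temp;
    }

    temp->left = node;
    node->parent = temp;
}
```

On a left-leaning chain 3, 2, 1, a right rotation at 3 makes 2 the root with 1 and 3 as its children, which is the LL case from the paragraph above.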
2.4 delete nodes of the red/black tree
It is said that the difference between the red-black tree and the AVL tree shows up mainly in node deletion, so let's take a look. What did I just say about long functions? The body of the delete function is even longer, more than 165 lines. I decided to study it in segments. Here is the first part:

    if (node->left == sentinel) {       // the left child is the sentinel, or both children are
        temp = node->right;             // take the right child; it will replace node
        subst = node;                   // subst is node itself

    } else if (node->right == sentinel) {   // the right child is the sentinel
        temp = node->left;                  // take the left child; it will replace node
        subst = node;                       // subst is node itself

    } else {                            // neither child is the sentinel
        subst = ngx_rbtree_min(node->right, sentinel);  // the smallest node of the right subtree

        if (subst->left != sentinel) {  // if that node's left child is not the sentinel
            temp = subst->left;         // take its left child

        } else {                        // otherwise take its right child
            temp = subst->right;
        }

        // it looks as though subst will be removed from its original
        // position and put in node's place
    }
Next let's take a look at what temp and subst will do:
    if (subst == *root) {       // subst is the root
        *root = temp;           // temp takes over as root
        ngx_rbt_black(temp);    // paint temp black

        /* DEBUG stuff */
        node->left = NULL;      // clear the deleted node
        node->right = NULL;
        node->parent = NULL;
        node->key = 0;

        return;
    }

    red = ngx_rbt_is_red(subst);    // remember whether subst is red

    if (subst == subst->parent->left) {     // subst is a left child
        subst->parent->left = temp;         // temp is attached in subst's position

    } else {                                // subst is a right child
        subst->parent->right = temp;        // temp is attached in subst's position
    }
Next section:
    if (subst == node) {        // subst is the node to be deleted

        temp->parent = subst->parent;   // temp replaces it directly; deletion done

    } else {                    // subst is not the node to be deleted

        if (subst->parent == node) {    // subst's parent is the node to be deleted
            temp->parent = subst;       // temp hangs under subst

        } else {                        // the node to be deleted is higher up than subst's parent
            temp->parent = subst->parent;   // temp hangs under subst's parent
        }

        // subst takes the position of the node to be deleted and copies
        // its links to the surrounding nodes
        subst->left = node->left;
        subst->right = node->right;
        subst->parent = node->parent;
        ngx_rbt_copy_color(subst, node);    // copy the color

        if (node == *root) {    // the node to be deleted is the root
            *root = subst;      // subst takes over as root

        } else {                // otherwise subst takes over its slot in the parent
            if (node == node->parent->left) {
                node->parent->left = subst;
            } else {
                node->parent->right = subst;
            }
        }

        if (subst->left != sentinel) {      // subst's left child is not the sentinel
            subst->left->parent = subst;    // it leaves node and hangs under subst
        }

        if (subst->right != sentinel) {     // subst's right child is not the sentinel
            subst->right->parent = subst;   // it leaves node and hangs under subst
        }
    }

    // clear the deleted node
    /* DEBUG stuff */
    node->left = NULL;
    node->right = NULL;
    node->parent = NULL;
    node->key = 0;

    // if subst was red, the red/black constraints still hold and the
    // deletion is complete
    if (red) {
        return;
    }
It seems that the node deletion process has been completed successfully, but if the subst is black, we need to fix the constraints of the red/black tree. The main character of the following code is the temp node that takes over the subst position:
    // loop while temp, the node that took over from subst, is not the root and is black
    while (temp != *root && ngx_rbt_is_black(temp)) {

        if (temp == temp->parent->left) {   // temp is a left child
            w = temp->parent->right;        // get its right sibling

            if (ngx_rbt_is_red(w)) {        // the right sibling is red
                ngx_rbt_black(w);           // paint the right sibling black
                ngx_rbt_red(temp->parent);  // paint temp's parent red
                ngx_rbtree_left_rotate(root, sentinel, temp->parent);   // rotate temp's parent left
                w = temp->parent->right;    // get temp's new right sibling
            }

            // both children of the right sibling are black
            if (ngx_rbt_is_black(w->left) && ngx_rbt_is_black(w->right)) {
                ngx_rbt_red(w);             // paint the right sibling red
                temp = temp->parent;        // temp's parent becomes the new temp

            } else {    // the children of the right sibling are not both black
                if (ngx_rbt_is_black(w->right)) {   // its right child is black
                    ngx_rbt_black(w->left);         // paint its left child black
                    ngx_rbt_red(w);                 // paint the right sibling red
                    ngx_rbtree_right_rotate(root, sentinel, w);     // rotate the right sibling right
                    w = temp->parent->right;        // get temp's new right sibling
                }

                ngx_rbt_copy_color(w, temp->parent);    // the sibling copies the color of temp's parent
                ngx_rbt_black(temp->parent);            // paint temp's parent black
                ngx_rbt_black(w->right);                // paint the sibling's right child black
                ngx_rbtree_left_rotate(root, sentinel, temp->parent);   // rotate temp's parent left
                temp = *root;                           // temp becomes the root
            }

        } else {    // temp is a right child: the mirror image
            w = temp->parent->left;

            if (ngx_rbt_is_red(w)) {
                ngx_rbt_black(w);
                ngx_rbt_red(temp->parent);
                ngx_rbtree_right_rotate(root, sentinel, temp->parent);
                w = temp->parent->left;
            }

            if (ngx_rbt_is_black(w->left) && ngx_rbt_is_black(w->right)) {
                ngx_rbt_red(w);
                temp = temp->parent;

            } else {
                if (ngx_rbt_is_black(w->left)) {
                    ngx_rbt_black(w->right);
                    ngx_rbt_red(w);
                    ngx_rbtree_left_rotate(root, sentinel, w);
                    w = temp->parent->left;
                }

                ngx_rbt_copy_color(w, temp->parent);
                ngx_rbt_black(temp->parent);
                ngx_rbt_black(w->left);
                ngx_rbtree_right_rotate(root, sentinel, temp->parent);
                temp = *root;
            }
        }
    }

    ngx_rbt_black(temp);    // paint the current temp black
This is as messy as node insertion, so let's sort it out.
First, delete the node while ignoring the red-black constraints:
① If the node is a leaf, that is, it has no children or all of its children are the sentinel, delete it directly;
② If the node has exactly one child, that child takes the place of the node to be deleted;
③ If the node has two children, a substitute has to be chosen from the edge of a subtree. There are two equivalent choices: the maximum node of the left subtree of the node to be deleted, or the minimum node of its right subtree. nginx chooses the latter. Conceptually, the substitute's key and value (data) take the place of those of the node to be deleted, and the substitute is then removed from its original position. In the source above this is done by relinking the substitute node into the deleted node's position rather than by copying fields.
The node to be deleted in cases ① and ② and the substitute in case ③ are both called subst in the source code, and that is the node discussed below.
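The three cases above, with colors ignored, can be sketched as a plain binary search tree deletion. This is a textbook-style version under stated assumptions, not the nginx approach: it uses NULL instead of a sentinel, has no parent pointers, copies the substitute's key instead of relinking nodes, and leaves node storage to the caller. The names node_t and bst_delete are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node_s node_t;
struct node_s {
    int      key;
    node_t  *left, *right;
};

/* Removes key from the subtree rooted at root and returns the new
 * subtree root. Nodes are not freed; the caller owns their storage. */
node_t *
bst_delete(node_t *root, int key)
{
    node_t *subst;

    if (root == NULL) {
        return NULL;                        /* key not present */
    }

    if (key < root->key) {
        root->left = bst_delete(root->left, key);

    } else if (key > root->key) {
        root->right = bst_delete(root->right, key);

    } else if (root->left == NULL) {
        return root->right;                 /* cases 1 and 2 */

    } else if (root->right == NULL) {
        return root->left;                  /* case 2 */

    } else {
        /* case 3: the minimum of the right subtree is the substitute */
        subst = root->right;
        while (subst->left != NULL) {
            subst = subst->left;
        }

        root->key = subst->key;             /* take over the key ... */
        root->right = bst_delete(root->right, subst->key);  /* ... and drop the substitute */
    }

    return root;
}
```

Deleting the root 5 of the tree with keys 5, 3, 8, 7, 9 leaves a root holding 7, the minimum of the old right subtree.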
The above is a pure topology change that ignores the balance of the red-black tree. Next we must consider whether the topology has to be adjusted to rebalance the tree, and whether node colors have to be adjusted so that the red-black constraints hold again. The key constraint is that every simple path from a node down to the leaves of its subtree contains the same number of black nodes. If subst is red, removing it requires no adjustment. If subst is black, every simple path that passed through subst now has one black node fewer, so an adjustment is needed.
The cases are mirrored for left and right, so we only discuss the situation where temp, the node that took over subst's position, is a left child. Call temp X and let W be X's right sibling. The color changes are confusing when read straight from the source, so here is a simplified summary:
① W is red: paint W black, paint X->parent red, rotate X->parent left, reset W to X's new right sibling, then continue with case ②, ③, or ④;
② W is black and both of W's children are black: paint W red and reset X to X->parent;
③ W is black, W's right child is black and its left child is red: paint W's left child black, paint W red, rotate W right, reset W to X's new right sibling, then continue with the steps of case ④;
④ W is black and W's right child is red: copy the color of X->parent to W, paint X->parent black, paint W's right child black, rotate X->parent left, and set temp to the root.
Finally, temp is painted black. Case ① performs one left rotation, case ② only recolors, case ③ performs one right rotation followed by the left rotation of case ④, and case ④ performs one left rotation. Case ① always hands over to another case. Cases ③ and ④ end the adjustment by setting temp to the root. Case ② moves X up one level: if the new X is red or is the root, the loop ends; otherwise it repeats. After case ①, X's parent is red, so a case ② that follows it ends the loop at once. Only case ② can repeat, and it performs no rotation, so a single deletion needs at most three rotations (case ① followed by case ③), while the number of recolorings can reach the tree height, about log2 n. This matches what was said before the source code analysis: the number of rotations per adjustment is a constant.
Next, I plan to rewrite the red-black tree step by step to learn more about this data structure. Before rewriting, note that all leaf nodes in the nginx red-black tree are the sentinel, which is an optimization for the code that adjusts the tree. Adding a layer of all-black leaf nodes also lets the subtrees above them contain red nodes. Although I have not proved it, I suspect this increases the number of rotations on deletion by a constant amount, and also raises the probability of an adjustment when a new node is inserted (a new node is more likely to land under a red node), which again adds rotations. Rotations compress the height of a red-black subtree and so improve lookup efficiency.
As I rewrite the red-black tree from a simple version to a refined one, I will adopt nginx's optimizations gradually, or add my own.
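One concrete benefit of using a sentinel for every leaf is that the adjustment code can read a child's color without first checking whether the child exists. The sketch below contrasts the two styles; node_t and both function names are hypothetical, and the color encoding (1 for red, 0 for black) follows the macros shown earlier.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node_s node_t;
struct node_s {
    unsigned char  color;       /* 1 = red, 0 = black */
    node_t        *left, *right;
};

/* With one shared black sentinel as every leaf, the children always
 * exist, so their colors can be read unconditionally. */
int
children_black_sentinel(const node_t *w)
{
    return !w->left->color && !w->right->color;
}

/* With NULL leaves, every color read needs a guard first. */
int
children_black_null(const node_t *w)
{
    return (w->left == NULL || !w->left->color)
           && (w->right == NULL || !w->right->color);
}
```

The test "both children of W are black" in the delete fix-up has the first shape, which is only safe because the sentinel is a real node that is always black.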
Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.