Definition and properties of red-black trees
A red-black tree is a binary search tree that satisfies the following properties:
1. Each node is either red or black.
2. The root node is black.
3. All leaf nodes are black. (Leaves are actually null pointers, denoted nil.) Leaf nodes carry no key; every key that can be searched for is stored in an internal node.
4. Both children of every red node are black. In other words, no path from a leaf to the root contains two consecutive red nodes.
5. Every path from a given node to any of its descendant leaves contains the same number of black nodes.
Black depth -- the number of black nodes on a path from a node x (not counting x itself) down to a leaf (counting the leaf) is called the black depth of x, written BD(x). Property 5 guarantees that this number is the same for every such path. The black depth of the root is the black depth of the whole red-black tree, and the black depth of a leaf is 0. For example, in the original illustration, BD(8) = 2 and BD(1) = 1.
Internal node -- a non-leaf node of a red-black tree.
External node -- a leaf node of a red-black tree.
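The definitions above can be made concrete with a small checker. The following is only a sketch under stated assumptions: the struct layout, the names is_black, black_depth and is_red_black, and the use of NULL for nil leaves are illustrative choices, not part of the original text.

```c
#include <assert.h>
#include <stddef.h>

enum color { RED, BLACK };

/* Assumed node layout; NULL stands for a nil leaf. */
struct node {
    int key;
    enum color color;
    struct node *left, *right, *parent;
};

/* A nil leaf (NULL) is black by property 3. */
static int is_black(const struct node *n) {
    return n == NULL || n->color == BLACK;
}

/* Returns BD(x): black nodes below x, not counting x, counting the leaf.
   Returns -1 if property 4 or property 5 fails somewhere below x. */
int black_depth(const struct node *x) {
    if (x == NULL)
        return 0;                               /* BD(leaf) = 0 */
    if (!is_black(x) && (!is_black(x->left) || !is_black(x->right)))
        return -1;                              /* property 4: red node with a red child */
    int l = black_depth(x->left);
    int r = black_depth(x->right);
    if (l < 0 || r < 0)
        return -1;
    l += is_black(x->left);                     /* count the child itself if it is black */
    r += is_black(x->right);
    return (l == r) ? l : -1;                   /* property 5: both sides must agree */
}

/* Checks properties 2, 4 and 5 for a whole tree. */
int is_red_black(const struct node *root) {
    return is_black(root) && black_depth(root) >= 0;
}
```

Because a nil leaf counts as one black node for its parent, a black node with two nil children has black depth 1, which agrees with the rule that the leaf itself has black depth 0.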
Theorems about red-black trees
1. The longest possible path from the root to a leaf is no more than twice as long as the shortest possible path.
Consider a tree in which, by property 5, every path from the root to a leaf contains 3 black nodes (counting the root and the leaf). The shortest possible path has no red nodes, so its length is 2. By property 4 (two red nodes cannot be adjacent) and properties 2 and 3 (the root and the leaves are black), a path with 3 black nodes can contain at most 2 red nodes, alternating with the black ones. So in a red-black tree of black depth 2 (the root itself is also black), the longest possible path has length 4 and the shortest has length 2. The same argument works for any black depth: a path with b + 1 black nodes has length at least b and at most 2b. This shows that a red-black tree is roughly balanced, although less strictly than an AVL tree, whose subtree heights may differ by at most 1.
2. The height h of a red-black tree is at most twice its black depth BD, i.e. h <= 2BD.
This follows from Theorem 1. BD is the length of the shortest possible path from the root to a leaf. The longest possible path, which bounds the height of the tree, alternates red and black nodes and has length 2BD. So h <= 2BD.
3. A red-black tree with n internal nodes (leaves not counted) has height h <= 2log2(n+1).
We first prove that a red-black tree with n internal nodes and black depth BD satisfies n >= 2^BD - 1, by induction on the height h of the tree. When h = 0 the tree is a single leaf, so BD = 0 and n = 0, and 0 >= 2^0 - 1 holds. Assume that n >= 2^BD - 1 holds for every tree of height h <= t, and consider a red-black tree of height t + 1. Let nl and nr be the numbers of internal nodes in the left and right subtrees of the root, and let BD' be the smaller of the black depths of the two subtree roots. Both subtrees have height <= t, so nl >= 2^BD' - 1 and nr >= 2^BD' - 1. Adding the two inequalities gives nl + nr >= 2^(BD'+1) - 2, and adding 1 to both sides (for the root) gives n >= 2^(BD'+1) - 1. A child contributes at most one black node to the root's black depth, so BD' + 1 >= BD, and therefore n >= 2^BD - 1. This completes the proof.
By Theorem 2, h <= 2BD, that is, BD >= h/2. Hence n >= 2^(h/2) - 1, and so h <= 2log2(n+1).
It follows that a search in a red-black tree follows a path of length at most 2log2(n+1), so lookup takes O(log n) time.
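For readers who prefer to see the chain of inequalities in one place, the argument can be written out as follows. This is only a restatement of the steps above, with the logarithm taken to base 2; the numeric example at the end is an added illustration.

```latex
\begin{align*}
n &\ge 2^{BD} - 1 && \text{(lemma proved by induction)} \\
BD &\ge \tfrac{h}{2} && \text{(Theorem 2: } h \le 2\,BD\text{)} \\
n + 1 &\ge 2^{BD} \ge 2^{h/2} \\
\log_2(n+1) &\ge \tfrac{h}{2} \\
h &\le 2\log_2(n+1)
\end{align*}
% Example: for n = 10^6 internal nodes, 2\log_2(10^6 + 1) \approx 39.9, so h \le 39.
```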
Operations on red-black trees
Because every red-black tree is also a binary search tree, lookup in a red-black tree is the same as lookup in an ordinary binary search tree. Insertion and deletion, however, can leave the tree in a state that violates the red-black properties. Restoring them requires a small number (O(log n)) of color changes, which are very fast in practice, and no more than three tree rotations (two for insertion). Although insertion and deletion are complicated, they still run in O(log n) time.
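Since lookup ignores colors entirely, it can be sketched as an ordinary binary search tree search. The struct layout and the name rb_find below are assumptions made for this illustration; NULL is used for nil leaves.

```c
#include <assert.h>
#include <stddef.h>

enum color { RED, BLACK };

/* Assumed node layout; NULL stands for a nil leaf. */
struct node {
    int key;
    enum color color;
    struct node *left, *right, *parent;
};

/* Returns the node holding key, or NULL if the key is not in the tree.
   Colors are never inspected: this is a plain binary search tree lookup. */
struct node *rb_find(struct node *root, int key) {
    while (root != NULL && root->key != key)
        root = (key < root->key) ? root->left : root->right;
    return root;
}
```

The loop follows a single root-to-leaf path, so by Theorem 3 it performs at most about 2log2(n+1) comparisons.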
Insert operation
We first insert the node as in an ordinary binary search tree and color it red. (Coloring it black would add an extra black node to one root-to-leaf path, which is difficult to repair.) A red node, however, may produce two consecutive red nodes, which is then repaired by color flips and tree rotations. What to do next depends on the colors of the neighboring nodes. As in a human family tree, we use the term uncle node for the sibling of a node's parent.
Let the newly inserted node be N, its parent P, its uncle U (the sibling of P), and its grandparent G (the parent of P). Each case is described below together with C sample code. The uncle and grandparent of a node can be found with the following functions:
struct node *grandparent(struct node *n) {
    return n->parent->parent;
}

struct node *uncle(struct node *n) {
    if (n->parent == grandparent(n)->left)
        return grandparent(n)->right;
    else
        return grandparent(n)->left;
}
Case 1. The red-black tree is currently empty, so the new node N becomes the root and has no parent.
This case is simple: we color N black to satisfy property 2. In every other case N stays red, for the reason given earlier.
void insert_case1(struct node *n) {
    if (n->parent == NULL)
        n->color = BLACK;
    else
        insert_case2(n);
}
Case 2. The parent P of the new node N is black.
In this case N stays red and nothing needs repair. Property 4 holds because N's parent is black, and property 5 holds because a red node adds no black node to any path.
void insert_case2(struct node *n) {
    if (n->parent->color == BLACK)
        return; /* tree is still valid */
    else
        insert_case3(n);
}
Note: in cases 3, 4 and 5 we may assume that the new node has a grandparent: its parent is red, and a red node cannot be the root, which must be black. The new node therefore always has an uncle, although in cases 4 and 5 the uncle may be a leaf.
Case 3. The parent P and the uncle U are both red.
Since the new node N is red, we repaint both P and U black and repaint the grandparent G red, which preserves property 5. The red node N now has a black parent. If G is the root, case 1 simply paints it black again and we are done. Otherwise the parent of G may itself be red, which would violate property 4. To handle this, we treat G as a newly inserted node and recursively run the whole procedure on it, starting from case 1.
void insert_case3(struct node *n) {
    if (uncle(n) != NULL && uncle(n)->color == RED) {
        n->parent->color = BLACK;
        uncle(n)->color = BLACK;
        grandparent(n)->color = RED;
        insert_case1(grandparent(n));
    } else {
        insert_case4(n);
    }
}
Note: in cases 4 and 5 we assume that the parent P is the left child of the grandparent G. If it is the right child, left and right are swapped in cases 4 and 5.
Case 4. The parent P is red, the uncle U is black or missing, the new node N is the right child of P, and P is the left child of G.
In this case we perform a left rotation on P, which exchanges the roles of the new node N and its parent P (the same left rotation as in an AVL tree). Some paths now pass through N or P where they did not before, but both nodes are red, so property 5 is not affected. Property 4 is still violated, so we continue with case 5, applied to the former parent P, which is now the left child of N.
void insert_case4(struct node *n) {
    if (n == n->parent->right && n->parent == grandparent(n)->left) {
        rotate_left(n->parent);
        n = n->left;
    } else if (n == n->parent->left && n->parent == grandparent(n)->right) {
        rotate_right(n->parent);
        n = n->right;
    }
    insert_case5(n);
}
Case 5. The parent P is red, the uncle U is black or missing, the new node N is the left child of P, and P is the left child of G.
In this case we perform a right rotation on the grandparent G. In the resulting tree, the former parent P is the parent of both the new node N and the former grandparent G. We know that G was black, because otherwise P could not have been red. We then swap the colors of P and G, and the resulting tree satisfies property 4. Property 5 also remains satisfied: all paths through any of these three nodes previously passed through G and now pass through P, and in each case that node is the only black one of the three.
void insert_case5(struct node *n) {
    n->parent->color = BLACK;
    grandparent(n)->color = RED;
    if (n == n->parent->left && n->parent == grandparent(n)->left) {
        rotate_right(grandparent(n));
    } else {
        /* Here n == n->parent->right && n->parent == grandparent(n)->right. */
        rotate_left(grandparent(n));
    }
}
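The insert cases call rotate_left and rotate_right, which the text never defines, and they are spread over five functions. The following self-contained sketch shows one possible way to put the pieces together. Several things in it are assumptions rather than the article's code: the rotations take an extra root pointer so they can update the root, the five cases are folded into one loop (insert_fixup) instead of mutually recursive functions, NULL is used for nil leaves, and duplicate keys are ignored.

```c
#include <assert.h>
#include <stdlib.h>

enum color { RED, BLACK };

struct node {
    int key;
    enum color color;
    struct node *left, *right, *parent;
};

/* NULL leaves count as black (property 3). */
static int is_red(const struct node *n) { return n != NULL && n->color == RED; }

/* Left rotation around x: x's right child y takes x's place. */
static void rotate_left(struct node **root, struct node *x) {
    struct node *y = x->right;
    x->right = y->left;
    if (y->left != NULL) y->left->parent = x;
    y->parent = x->parent;
    if (x->parent == NULL) *root = y;
    else if (x == x->parent->left) x->parent->left = y;
    else x->parent->right = y;
    y->left = x;
    x->parent = y;
}

/* Mirror image of rotate_left. */
static void rotate_right(struct node **root, struct node *x) {
    struct node *y = x->left;
    x->left = y->right;
    if (y->right != NULL) y->right->parent = x;
    y->parent = x->parent;
    if (x->parent == NULL) *root = y;
    else if (x == x->parent->left) x->parent->left = y;
    else x->parent->right = y;
    y->right = x;
    x->parent = y;
}

/* Iterative form of insert cases 1-5. */
static void insert_fixup(struct node **root, struct node *n) {
    while (is_red(n->parent)) {               /* a black parent is case 2: stop */
        struct node *p = n->parent;
        struct node *g = p->parent;           /* exists: a red node is never the root */
        struct node *u = (p == g->left) ? g->right : g->left;
        if (is_red(u)) {                      /* case 3: recolor, continue from G */
            p->color = BLACK;
            u->color = BLACK;
            g->color = RED;
            n = g;
        } else if (p == g->left) {
            if (n == p->right) {              /* case 4: rotate N above P */
                n = p;
                rotate_left(root, n);
                p = n->parent;
            }
            p->color = BLACK;                 /* case 5 */
            g->color = RED;
            rotate_right(root, g);
        } else {                              /* mirror images of cases 4 and 5 */
            if (n == p->left) {
                n = p;
                rotate_right(root, n);
                p = n->parent;
            }
            p->color = BLACK;
            g->color = RED;
            rotate_left(root, g);
        }
    }
    (*root)->color = BLACK;                   /* case 1: the root is always black */
}

/* Inserts key and returns its node (the existing one for a duplicate key),
   or NULL if memory allocation fails. */
struct node *rb_insert(struct node **root, int key) {
    struct node *parent = NULL, **link = root;
    while (*link != NULL) {
        parent = *link;
        if (key == parent->key) return parent;
        link = (key < parent->key) ? &parent->left : &parent->right;
    }
    struct node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->key = key;
    n->color = RED;                           /* new nodes start out red */
    n->left = n->right = NULL;
    n->parent = parent;
    *link = n;
    insert_fixup(root, n);
    return n;
}
```

A caller would start with `struct node *root = NULL;` and call `rb_insert(&root, key)` for each key. The loop form does the same work as the recursive case functions; it was chosen here only to keep the sketch short.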
Delete operation
If the node to be deleted has two sons, the problem can be reduced to deleting another node that has at most one son (for ease of expression, "son" here means a non-leaf child). As in an ordinary binary search tree, when deleting a node with two non-leaf sons we find either the largest element in its left subtree or the smallest element in its right subtree and copy its value into the node to be deleted. We then delete the node the value was copied from, which must have fewer than two non-leaf sons. Copying a value does not violate any red-black property, so the problem is reduced to deleting a node with at most one son. It does not matter whether that node is the one originally to be deleted or the one the value was copied from.
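The reduction described above can be sketched in a few lines of C. The names maximum_node and node_to_unlink are hypothetical, colors are left out because copying a key does not touch them, and NULL is used for leaves here, unlike the delete code later in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed minimal node layout; NULL stands for a leaf. */
struct node {
    int key;
    struct node *left, *right;
};

/* Rightmost (largest) node of the subtree rooted at n; n must not be NULL. */
struct node *maximum_node(struct node *n) {
    while (n->right != NULL)
        n = n->right;
    return n;
}

/* If n has two non-leaf sons, copy the largest key of its left subtree
   into n and return the node that key came from; otherwise return n.
   The returned node has at most one non-leaf son and is the one that
   actually has to be removed from the tree. */
struct node *node_to_unlink(struct node *n) {
    if (n->left != NULL && n->right != NULL) {
        struct node *pred = maximum_node(n->left);
        n->key = pred->key;
        return pred;
    }
    return n;
}
```

Using the smallest element of the right subtree instead would work equally well; the choice of the left subtree here is arbitrary.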
In the rest of this article we only need to discuss deleting a node with at most one son (if both of its sons are leaves, we arbitrarily treat one of them as its son). If we delete a red node, its father and son must both be black, so we can simply replace it with its black son without breaking properties 3 and 4. All paths through the deleted node just contain one red node fewer, so property 5 still holds. Another simple case is when the deleted node is black and its son is red. Just removing the black node and replacing it with its red son would break property 5 (and possibly property 4), but if we repaint the son black, all paths that passed through the deleted node now pass through its black son, and both properties are preserved.
The complicated case, which needs further discussion, is when the node to be deleted and its son are both black. We first replace the node to be deleted with its son. For convenience, call this son N and its brother (the other son of its new father) S. In what follows we still use P for N's father, SL for S's left son, and SR for S's right son. The following function finds the sibling of a node:
struct node *sibling(struct node *n) {
    if (n == n->parent->left)
        return n->parent->right;
    else
        return n->parent->left;
}
The steps outlined above can be carried out with the following code, where the function replace_node puts child into n's position in the tree. For convenience, the code in this section assumes that an empty leaf is represented by an actual node object rather than NULL (the code in the insert section works with either representation).
void delete_one_child(struct node *n) {
    /* Precondition: n has at most one non-null child. */
    struct node *child = is_leaf(n->right) ? n->left : n->right;

    replace_node(n, child);
    if (n->color == BLACK) {
        if (child->color == RED)
            child->color = BLACK;
        else
            delete_case1(child);
    }
    free(n);
}
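The helpers is_leaf and replace_node are used above but not shown. A minimal sketch of what they might look like follows. It rests on assumptions that are not in the original: leaves are node objects whose two child pointers are NULL, every real node has two non-NULL children, and replace_node takes an extra root pointer (the article's version takes only n and child) so that it can update the root.

```c
#include <assert.h>
#include <stddef.h>

enum color { RED, BLACK };

struct node {
    int key;
    enum color color;
    struct node *left, *right, *parent;
};

/* Assumption: a leaf is a node object with no children of its own. */
int is_leaf(const struct node *n) {
    return n->left == NULL && n->right == NULL;
}

/* Puts child into n's position in the tree. n itself is not freed or
   modified; *root is updated when n was the root. */
void replace_node(struct node **root, struct node *n, struct node *child) {
    child->parent = n->parent;
    if (n->parent == NULL)
        *root = child;
    else if (n == n->parent->left)
        n->parent->left = child;
    else
        n->parent->right = child;
}
```

Representing leaves as real node objects is what lets delete_one_child read child->color and pass child to delete_case1 even when child is a leaf.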
If N and its original father are both black, deleting that father leaves the paths through N with one black node fewer than the paths that do not pass through N. This violates property 5, so the tree must be rebalanced. There are several cases to consider:
Case 1. N is the new root.
In this case we are done. We removed one black node from every path and the new root is black, so all properties hold.
void delete_case1(struct node *n) {
    if (n->parent != NULL)
        delete_case2(n);
}
Note: in cases 2, 5 and 6 we assume that N is the left son of its father. If it is the right son, left and right are swapped in these cases.
Case 2. S is red.
In this case we perform a left rotation on N's father, which makes the red brother S the grandfather of N. We then swap the colors of N's father and grandfather. All paths still contain the same number of black nodes as before, but N now has a black brother and a red father, so we can proceed with case 4, 5 or 6. (N's new brother is black because it was a son of the red node S.)
void delete_case2(struct node *n) {
    struct node *s = sibling(n);

    if (s->color == RED) {
        n->parent->color = RED;
        s->color = BLACK;
        if (n == n->parent->left)
            rotate_left(n->parent);
        else
            rotate_right(n->parent);
    }
    delete_case3(n);
}
Case 3. N's father P, S, and S's sons are all black.
In this case we simply repaint S red. As a result, all paths through S, which are exactly the paths that do not pass through N, contain one black node fewer. Deleting N's original father already removed one black node from every path through N, so the two sides are balanced again. However, every path through P now has one black node fewer than the paths that do not pass through P, so property 5 is still violated. To repair this, we rebalance at P, starting again from case 1.
void delete_case3(struct node *n) {
    struct node *s = sibling(n);

    if ((n->parent->color == BLACK) &&
        (s->color == BLACK) &&
        (s->left->color == BLACK) &&
        (s->right->color == BLACK)) {
        s->color = RED;
        delete_case1(n->parent);
    } else {
        delete_case4(n);
    }
}
Case 4. S and S's sons are all black, but N's father is red.
In this case we simply swap the colors of N's brother and father. This does not change the number of black nodes on paths that do not pass through N, but it adds one black node to every path through N, which makes up for the black node deleted from those paths.
void delete_case4(struct node *n) {
    struct node *s = sibling(n);

    if ((n->parent->color == RED) &&
        (s->color == BLACK) &&
        (s->left->color == BLACK) &&
        (s->right->color == BLACK)) {
        s->color = RED;
        n->parent->color = BLACK;
    } else {
        delete_case5(n);
    }
}
Case 5. S is black, S's left son is red, S's right son is black, and N is the left son of its father.
In this case we perform a right rotation on S, so that S's left son becomes S's father and N's new brother. We then swap the colors of S and its new father. All paths still contain the same number of black nodes, but N now has a black brother whose right son is red, so we move on to case 6. Neither N nor its father is affected by this transformation.
void delete_case5(struct node *n) {
    struct node *s = sibling(n);

    /* This test is trivial due to case 2: even though case 2 changed the
       sibling to a sibling's child, that child cannot be red, since no red
       parent can have a red child. */
    if (s->color == BLACK) {
        /* The following statements just force the red node to be on the left
           of the left of the parent, or on the right of the right, so that
           case 6 rotates correctly. */
        if ((n == n->parent->left) &&
            (s->right->color == BLACK) &&
            (s->left->color == RED)) {
            /* This last test is trivial too, due to cases 2-4. */
            s->color = RED;
            s->left->color = BLACK;
            rotate_right(s);
        } else if ((n == n->parent->right) &&
                   (s->left->color == BLACK) &&
                   (s->right->color == RED)) {
            /* This last test is trivial too, due to cases 2-4. */
            s->color = RED;
            s->right->color = BLACK;
            rotate_left(s);
        }
    }
    delete_case6(n);
}
Case 6. S is black, S's right son is red, and N is the left son of its father.
In this case we perform a left rotation on N's father, so that S becomes the father of N's father and of S's right son. We then swap the colors of N's father and S, and paint S's right son black. The subtree still has the same color at its root, so property 4 is not violated. N, however, now has one additional black ancestor: either N's father has become black, or it was already black and S has been added as a black grandfather. Every path through N therefore passes through one more black node.
Meanwhile, for a path that does not pass through N there are two possibilities:
It passes through N's new brother. Then it must have passed through S and N's father both before and after the change, and these two nodes have just exchanged colors, so the path keeps the same number of black nodes.
It passes through N's new uncle, S's right son. Before the change it passed through S, S's father and S's right son; now it passes only through S, which has taken on the color of its former father, and S's right son, which has changed from red to black. The net effect is that the path passes through the same number of black nodes.
Either way, the number of black nodes on these paths does not change, so property 5 is restored. Nodes drawn white in the original diagrams may be either red or black, but each keeps the same color before and after the transformation.
void delete_case6(struct node *n) {
    struct node *s = sibling(n);

    s->color = n->parent->color;
    n->parent->color = BLACK;

    if (n == n->parent->left) {
        s->right->color = BLACK;
        rotate_left(n->parent);
    } else {
        s->left->color = BLACK;
        rotate_right(n->parent);
    }
}
Again, all the function calls use tail recursion, so the algorithm works in place. In addition, no recursive call is made after a rotation, so only a constant number of rotations (at most 3) is performed.
Advantages of red-black trees
A red-black tree supports search, insertion and deletion in O(log2 n) time. In addition, any imbalance is repaired with at most 3 rotations. An AVL tree gives no such guarantee: deleting from an AVL tree may require O(log n) rotations.
In practice, the standard libraries of many languages use red-black trees, for example TreeMap and TreeSet in Java and the ordered containers of common C++ STL implementations.