Analysis of the Red-Black Tree Data Structure

Source: Internet
Author: User

The red-black tree is a common data structure in computer science that keeps search, insertion and deletion at O(lg n) time complexity. Compared with simpler data structures, however, it is considerably harder to implement. Implementations of the red-black tree abound online, but few of them are written to explain how the tree works. This article presents a self-written red-black tree, together with its search, insert and delete algorithms, in order to analyze the implementation of the data structure more systematically.

Most data structures are implemented as abstract data types, and the template mechanism of C++ makes them independent of the element type, as the STL does. The goal of this article, however, is not to reimplement the STL's red-black tree but to learn the relevant algorithms and ideas by implementing one. Where it is useful, we still borrow from the STL's implementation.

First, basic concepts

Before implementing it, the basic meaning of the red-black tree must be clear. A red-black tree is essentially a binary search tree, so it satisfies the basic property of a binary search tree: the value of any node is greater than every value in its left subtree and less than every value in its right subtree.

Figure 1 Binary search tree

Organizing data as a binary search tree makes it quick to find an element. In the binary search tree of Figure 1, for example, finding the node with value 48 requires visiting only 4 nodes. In a balanced binary tree the cost of searching for any node is bounded by the height h of the tree, which is O(lg n). But if the binary search tree loses its balance (in the worst case all elements lie on one side), the search cost degrades to O(n), so balance is the key to search efficiency. To maintain balance, several kinds of trees have been devised. The AVL tree keeps the height difference between the left and right subtrees of every node at no more than 1, while the red-black tree uses node colors to keep the tree approximately balanced, so that no path from the root to a leaf is more than twice as long as any other. Compared with other balanced binary search trees, the red-black tree has advantages of its own.
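The effect of balance on search cost can be checked with a small experiment. The following is a minimal sketch of a plain, unbalanced binary search tree; the names BstNode, bst_insert, bst_visits, bst_height and bst_build are illustrative and are not part of the article's code. It uses the insertion sequence that the article quotes for Figure 2.

```cpp
#include <cassert>
#include <initializer_list>

// Plain binary search tree node without any balancing information.
struct BstNode {
    int value;
    BstNode* left;
    BstNode* right;
    explicit BstNode(int v) : value(v), left(nullptr), right(nullptr) {}
};

// Standard BST insertion; duplicates are ignored, nothing is rebalanced.
void bst_insert(BstNode*& root, int v) {
    if (!root) { root = new BstNode(v); return; }
    if (v < root->value) bst_insert(root->left, v);
    else if (v > root->value) bst_insert(root->right, v);
}

// Number of nodes visited while searching for v, or 0 if v is absent.
int bst_visits(const BstNode* root, int v) {
    int visited = 0;
    while (root) {
        ++visited;
        if (v == root->value) return visited;
        root = (v < root->value) ? root->left : root->right;
    }
    return 0;
}

// Height counted in nodes; an empty tree has height 0.
int bst_height(const BstNode* root) {
    if (!root) return 0;
    int l = bst_height(root->left);
    int r = bst_height(root->right);
    return 1 + (l > r ? l : r);
}

// Builds a tree by inserting the values in the given order.
BstNode* bst_build(std::initializer_list<int> values) {
    BstNode* root = nullptr;
    for (int v : values) bst_insert(root, v);
    return root;
}

// Releases every node of the tree.
void bst_free(BstNode* root) {
    if (!root) return;
    bst_free(root->left);
    bst_free(root->right);
    delete root;
}
```

Building a tree from 50, 35, 78, 27, 56, 90, 45, 40, 48 gives height 4, and bst_visits finds 48 after visiting 4 nodes (50, 35, 45, 48). Building a tree from 1 to 9 in ascending order gives a chain of height 9, which is the O(n) degradation described above.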

As the name implies, every node of a red-black tree carries a color, either red or black. Through constraints on these colors the red-black tree keeps the binary search tree balanced. A red-black tree must satisfy the following rules:

Rule 1: the root node must be black.

Rule 2: no path from the root to a leaf contains two consecutive red nodes.

Rule 3: every path from the root to a leaf contains the same number of black nodes.

Figure 2 shows a valid red-black tree. It keeps the basic property of a binary search tree while also satisfying the color rules, so the tree as a whole stays balanced. (The tree was built by inserting the sequence 50, 35, 78, 27, 56, 90, 45, 40, 48, which the reader can verify.)

Figure 2 A red-black tree
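The three rules can be checked mechanically. Below is a minimal sketch of such a check; RbNode, black_height and is_red_black are hypothetical names, and this is not the article's own validation code. Empty children are treated as black leaves, and only the color rules are checked, not the ordering of values.

```cpp
#include <cassert>

enum Color { RED, BLACK };

// Minimal node for the check: no parent pointer is needed here.
struct RbNode {
    int value;
    Color color;
    RbNode* left;
    RbNode* right;
};

// Returns the number of black nodes on every path from `node` down to an
// empty leaf, or -1 if rule 2 (no red-red) or rule 3 (equal counts) fails.
int black_height(const RbNode* node) {
    if (!node) return 0;  // empty leaf: no black node counted
    if (node->color == RED) {
        bool red_left = node->left && node->left->color == RED;
        bool red_right = node->right && node->right->color == RED;
        if (red_left || red_right) return -1;  // rule 2 violated
    }
    int l = black_height(node->left);
    int r = black_height(node->right);
    if (l < 0 || r < 0 || l != r) return -1;   // rule 3 violated
    return l + (node->color == BLACK ? 1 : 0);
}

// Checks rule 1 on the root, then rules 2 and 3 on the whole tree.
bool is_red_black(const RbNode* root) {
    if (!root) return true;                   // an empty tree is valid
    if (root->color != BLACK) return false;   // rule 1 violated
    return black_height(root) >= 0;
}
```

For a black root 50 with a red left child 35 (whose children 27 and 45 are black) and a black right child 78, is_red_black returns true and the black height is 2. Recoloring 27 to red breaks rule 3, and recoloring the root to red breaks rule 1.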

Second, data structure design

As in most data structure designs, we represent the nodes of the red-black tree with an abstract data type and use pointers to record the relationships between nodes.

The basic properties of a red-black tree node are: node color, left child pointer, right child pointer, parent pointer and node value.

Figure 3 Basic properties of a red-black tree node

To simplify the key algorithms of the red-black tree, the node also defines a few simple operations (all inline functions).

// Red-black tree node
template<class T>
class rb_tree_node
{
    typedef rb_tree_node_color node_color;  // the color enumeration defined below
    typedef rb_tree_node<T> node_type;
public:
    node_color color;    // Color
    node_type* parent;   // Parent node
    node_type* left;     // Left child node
    node_type* right;    // Right child node
    T value;             // Value
    rb_tree_node(T& v);  // Constructor
    inline node_type* brother();             // Get the sibling node
    inline bool on_left();                   // Whether this node is a left child
    inline bool on_right();                  // Whether this node is a right child
    inline void set_left(node_type* node);   // Set the left child node
    inline void set_right(node_type* node);  // Set the right child node
};

To represent the color of a red-black tree node, we define a simple enumeration type.

// Red-black tree node color
enum rb_tree_node_color
{
    red = false,
    black = true
};
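The article declares the inline helpers of the node but does not show their bodies. The following is one plausible sketch of how they could behave, written against a simplified node called sketch_node with an enumeration called rb_color (both names are hypothetical, and the author's actual implementation may differ). It assumes that set_left and set_right also update the child's parent pointer.

```cpp
#include <cassert>

enum rb_color { red = false, black = true };

template <class T>
struct sketch_node {
    rb_color color;
    sketch_node* parent;
    sketch_node* left;
    sketch_node* right;
    T value;

    // New nodes start out red and unlinked.
    explicit sketch_node(const T& v)
        : color(red), parent(nullptr), left(nullptr), right(nullptr), value(v) {}

    // True if this node is the left child of its parent.
    bool on_left() const { return parent && parent->left == this; }

    // True if this node is the right child of its parent.
    bool on_right() const { return parent && parent->right == this; }

    // The other child of the parent, or null if there is no parent.
    sketch_node* brother() const {
        if (!parent) return nullptr;
        return on_left() ? parent->right : parent->left;
    }

    // Attach `node` (possibly null) as left child and fix its parent link.
    void set_left(sketch_node* node) {
        left = node;
        if (node) node->parent = this;
    }

    // Attach `node` (possibly null) as right child and fix its parent link.
    void set_right(sketch_node* node) {
        right = node;
        if (node) node->parent = this;
    }
};
```

Keeping the parent link update inside set_left and set_right means the rotation code only has to state which subtree goes where, which reduces the chance of leaving a stale parent pointer behind.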

With the node in place, what remains is to implement the key algorithms of the red-black tree: construction, insertion, search, deletion and so on.

// Red-black tree
template<class T>
class rb_tree
{
public:
    typedef rb_tree_node<T> node_type;
    rb_tree();
    ~rb_tree();
    void clear();
    void insert(T v);                        // Add a node
    bool insert_unique(T v);                 // Add a node only if the value is not present
    node_type* find(T v);                    // Query a node
    bool remove(T v);                        // Delete a node
    inline node_type* maximum();             // Maximum value
    inline node_type* minimum();             // Minimum value
    inline node_type* next(node_type* node); // Next node
    inline node_type* prev(node_type* node); // Previous node
    void print();                            // Output
    int height();                            // Height
    unsigned count();                        // Node count
    bool validate();                         // Validation
    unsigned get_rotate_times();             // Number of rotations performed
private:
    node_type* root;        // Root of the tree
    unsigned rotate_times;  // Number of rotations
    unsigned node_count;    // Number of nodes
    void __clear(node_type* sub_root);  // Clear function
    void __insert(node_type*& sub_root, node_type* parent, node_type* node);  // Internal node insertion
    node_type* __find(node_type* sub_root, T v);      // Query
    inline node_type* __maximum(node_type* sub_root); // Maximum value
    inline node_type* __minimum(node_type* sub_root); // Minimum value
    void __rebalance(node_type* node);                // Rebalance after inserting a node
    void __fix(node_type* node, node_type* parent, bool direct);  // Rebalance after deleting a node
    void __rotate(node_type* node);        // Rotation that determines its own direction
    void __rotate_left(node_type* node);   // Left rotation
    void __rotate_right(node_type* node);  // Right rotation
    void __print(node_type* sub_root);     // Output
    int __height(node_type*& sub_root);    // Height
    bool __validate(node_type*& sub_root, int& count);  // Verify that the red-black tree is legal
};

Besides the root (root) and the node count (node_count), the red-black tree class also records in rotate_times the number of rotations performed during insertions and deletions. The core operations are insertion (insert), search (find), deletion (remove), decrement (prev), which finds the nearest node smaller than the current one, increment (next), which finds the nearest node larger than the current one, and the maximum (maximum) and minimum (minimum) operations. The validation operation (__validate) walks the tree recursively and checks the three basic color rules to verify that the tree is balanced.

Because insertion and deletion are the key operations of the red-black tree, the following sections focus on these two. The other operations can mostly be completed by simple recursion over the tree and are not repeated here.
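Of the operations that are not discussed further, the increment operation is the least obvious, because it relies on parent pointers rather than recursion. A minimal sketch is given below, assuming a simplified node type TNode; the names subtree_min and next_node are illustrative and not taken from the article's source.

```cpp
#include <cassert>

// Simplified node: only the links needed for traversal.
struct TNode {
    int value;
    TNode* parent;
    TNode* left;
    TNode* right;
};

// Leftmost node of the subtree rooted at n (null for an empty subtree).
TNode* subtree_min(TNode* n) {
    while (n && n->left) n = n->left;
    return n;
}

// In-order successor of n, or null if n holds the largest value.
TNode* next_node(TNode* n) {
    if (!n) return nullptr;
    // Case 1: the successor is the smallest node of the right subtree.
    if (n->right) return subtree_min(n->right);
    // Case 2: climb until we leave a left subtree; that ancestor is next.
    TNode* p = n->parent;
    while (p && n == p->right) {
        n = p;
        p = p->parent;
    }
    return p;
}
```

Starting from subtree_min(root) and calling next_node until it returns null visits the values in ascending order. The decrement operation is the mirror image, with left and right exchanged.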

Third, the red-black tree insert operation

Insertion resembles search: it recursively looks for the insertion point in binary-search fashion. The boundary condition must be handled here, however: an empty tree needs special treatment (this implementation does not use the STL's special technique for the root node). When the first node is inserted, the tree root simply records it and it is colored black; otherwise the insertion point is found recursively (the __insert operation).

A newly inserted node is red by default, because inserting a black node would change the number of black nodes on its root path (violating rule 3). Even so, two consecutive red nodes may appear. Therefore, when the red-black tree constraints no longer hold after an ordinary insertion (the tree is said to have lost balance), the tree must be adjusted according to its current state (the __rebalance operation). Like the AVL tree, which rebalances by rotation, the red-black tree is usually adjusted by rotation combined with recoloring of nodes.

An insertion can unbalance the red-black tree only through a color conflict between the inserted node and its parent (both red, violating rule 2).

Figure 4 Insert conflict

As shown in Figure 4, the tree was balanced before the insertion, so the grandparent node G must exist (rule 1: the root is black, so the red parent P cannot be the root) and must be black (rule 2: no consecutive red nodes). The color of the uncle node U is not determined, so the problem splits into two categories:

1, the uncle node is black (an empty node counts as black by default)

In this case the tree can be rebalanced by rotation and recoloring. Depending on the positions of the current node N and its parent P, there are four cases:

A, N is the left child of P, and P is the left child of G.

B, N is the right child of P, and P is the right child of G.

C, N is the left child of P, and P is the right child of G.

D, N is the right child of P, and P is the left child of G.

Cases A and B are collectively called outside insertions, and cases C and D inside insertions. They are grouped this way because the two cases of each group are resolved symmetrically, one being the mirror image of the other.
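The four cases can be told apart by two comparisons. The sketch below shows this classification on a bare node type; PNode, InsertCase, classify and is_outside are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Only the links matter for the classification.
struct PNode {
    PNode* parent;
    PNode* left;
    PNode* right;
};

enum InsertCase { CASE_A, CASE_B, CASE_C, CASE_D };

// Classifies node n relative to its parent P and grandparent G.
// Precondition: n has both a parent and a grandparent.
InsertCase classify(const PNode* n) {
    const PNode* p = n->parent;
    assert(p && p->parent);
    const PNode* g = p->parent;
    bool n_left = (p->left == n);  // N is the left child of P
    bool p_left = (g->left == p);  // P is the left child of G
    if (n_left && p_left) return CASE_A;    // left-left
    if (!n_left && !p_left) return CASE_B;  // right-right
    if (n_left) return CASE_C;              // N left, P right
    return CASE_D;                          // N right, P left
}

// Outside insertions need one rotation, inside insertions need two.
bool is_outside(InsertCase c) {
    return c == CASE_A || c == CASE_B;
}
```

An insertion is an outside insertion exactly when N and P are on the same side of their respective parents, which is why a single comparison of the two flags is enough to choose between one rotation and two.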

First consider case A: N is the left child of P, and P is the left child of G. Here a single right rotation, followed by setting P to black and G to red, completes the rebalancing.

Figure 5 Adjustment for a left outside insertion

The right rotation hooks P into the original position of G (if G is the root, this boundary condition must be handled), attaches the right subtree X of P as the left child of G, and then attaches G as the right child of P. Here the axis of rotation is taken to be the root of the resulting subtree (node P); that is, the axis is the node that becomes the root of the new subtree when the rotation ends. Note the difference from the STL: the STL's right rotation treats the root of the subtree before rotation (node G) as the axis, but this does not change the effect of the rotation.
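The three relinking steps of the right rotation can be written down directly. The following is a minimal sketch using the article's convention that the axis is the node that ends up as the subtree root; RNode and rotate_right_about are illustrative names, and colors are left out because the rotation itself does not touch them.

```cpp
#include <cassert>

struct RNode {
    int value;
    RNode* parent;
    RNode* left;
    RNode* right;
};

// Right rotation about p: p must be the left child of its parent g.
// `root` is the tree root and is updated if g was the root.
void rotate_right_about(RNode* p, RNode*& root) {
    RNode* g = p->parent;
    assert(g && g->left == p);
    RNode* gp = g->parent;

    // Step 1: hook p into the position that g occupied.
    p->parent = gp;
    if (!gp) root = p;                  // boundary condition: g was the root
    else if (gp->left == g) gp->left = p;
    else gp->right = p;

    // Step 2: the right subtree X of p becomes the left subtree of g.
    g->left = p->right;
    if (p->right) p->right->parent = g;

    // Step 3: g becomes the right child of p.
    p->right = g;
    g->parent = p;
}
```

The left rotation is obtained by exchanging left and right throughout. Setting g->left in step 2 even when X is empty matters: it clears the old link from G to P, which would otherwise create a cycle.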

By analogy, case B is resolved with a single left rotation, in the mirror image of case A.

Figure 6 Right outside insertion

Next consider case C: N is the left child of P, and P is the right child of G. This case is resolved by a right rotation followed by a left rotation (the axis of both rotations is N, not P), after which N is set to black and G to red to complete the rebalancing.

Figure 7 Left inside insertion

Note that because N is newly inserted, its left and right subtrees X and Y are empty. Even so, the rotation must set the new positions of X and Y correctly (if the corresponding branches of P and G are not set to empty, the tree structure is corrupted). In other, later operations the left and right subtrees of the node N being rotated may not be empty.

By analogy, case D is resolved by a left rotation followed by a right rotation, in the mirror image of case C.

Figure 8 Right inside insertion

2, the uncle node is red

When the uncle node is red, the methods above cannot be applied directly (if U were red in any of the preceding cases, then after the rotation the red G would have the red child U, which is a new conflict). Instead, we can exchange the color of G with that of P and U, making G red and P and U black, which resolves the current conflict.

Figure 9 Insertion when the uncle node is red

This alone is not enough, because if the parent of the grandparent G (denoted GP) is red, a conflict remains (G and GP are consecutive red nodes, violating rule 2). To resolve it, we backtrack twice from the insertion point N to the root.

The first pass handles every node that has two red children: the colors of the parent G and its children P and U are exchanged as in Figure 9, and the color conflict between GP and G is ignored for the moment. If the two children of the root are in this situation, the root is set back to black after the exchange.

The second pass deals specifically with consecutive red nodes. Thanks to the first pass, no red node on the path from the insertion point N to the root has a red sibling. So when GP and G are still in conflict, the sibling of GP (GU) must be black, which brings us back to the black-uncle case discussed above.

Figure 10 Eliminating continuous red nodes

Because the two conflicting red nodes may be in any position, the same four rotation cases arise. We treat the red node nearer the leaves (G) as the newly inserted node: in cases A and B the parent of G, GP, is the axis of rotation and becomes the root of the new subtree, while in cases C and D G itself is the axis and becomes the root of the new subtree.

In the second pass, each rotation moves G and GP closer to the root. Whichever of G and GP becomes the root of the new subtree, the backtracking from the inserted node N along its path to the root is not affected, and once the root of the new subtree is the root of the tree (its parent pointer is empty), backtracking can stop.

This rebalancing strategy fully resolves the balance problem caused by inserting a node into the red-black tree.
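For comparison, the sketch below implements insertion with the common single-pass, bottom-up rebalancing loop, which handles the red-uncle and black-uncle cases in one walk toward the root. It is not the two-pass procedure described above and is not the author's code; Node, lift, rebalance, insert_unique and check are illustrative names. The helper lift(n) rotates with n as the axis in the article's sense: a left child rises by a right rotation, a right child by a left rotation.

```cpp
#include <cassert>

enum Color { RED, BLACK };

struct Node {
    int value;
    Color color;
    Node *parent, *left, *right;
    explicit Node(int v)
        : value(v), color(RED), parent(nullptr), left(nullptr), right(nullptr) {}
};

bool is_red(const Node* n) { return n && n->color == RED; }

// Rotates n above its parent p (n is the axis and the new subtree root).
void lift(Node* n, Node*& root) {
    Node* p = n->parent;
    Node* g = p->parent;
    if (p->left == n) {              // right rotation
        p->left = n->right;
        if (n->right) n->right->parent = p;
        n->right = p;
    } else {                         // left rotation
        p->right = n->left;
        if (n->left) n->left->parent = p;
        n->left = p;
    }
    p->parent = n;
    n->parent = g;
    if (!g) root = n;
    else if (g->left == p) g->left = n;
    else g->right = n;
}

// Restores rules 1 and 2 after the red node n has been linked in.
void rebalance(Node* n, Node*& root) {
    while (is_red(n->parent)) {      // a red parent always has a black parent
        Node* p = n->parent;
        Node* g = p->parent;
        Node* u = (g->left == p) ? g->right : g->left;
        if (is_red(u)) {             // uncle red: recolor, continue from G
            p->color = u->color = BLACK;
            g->color = RED;
            n = g;
            continue;
        }
        bool n_left = (p->left == n), p_left = (g->left == p);
        if (n_left != p_left) {      // inside (C, D): first lift N above P
            lift(n, root);
            p = n;
        }
        lift(p, root);               // outside (A, B): lift P above G
        p->color = BLACK;
        g->color = RED;
        break;
    }
    root->color = BLACK;
}

// Inserts v if absent; returns false when v is already in the tree.
bool insert_unique(Node*& root, int v) {
    Node* parent = nullptr;
    Node** link = &root;
    while (*link) {
        parent = *link;
        if (v == parent->value) return false;
        link = (v < parent->value) ? &parent->left : &parent->right;
    }
    Node* n = new Node(v);
    n->parent = parent;
    *link = n;
    rebalance(n, root);
    return true;
}

// Black height, or -1 if rule 2, rule 3, a parent link or the local
// ordering between a node and its children is broken.
int check(const Node* n) {
    if (!n) return 0;
    if (n->color == RED && (is_red(n->left) || is_red(n->right))) return -1;
    if (n->left && (n->left->parent != n || n->left->value >= n->value)) return -1;
    if (n->right && (n->right->parent != n || n->right->value <= n->value)) return -1;
    int l = check(n->left), r = check(n->right);
    if (l < 0 || l != r) return -1;
    return l + (n->color == BLACK);
}
```

Inserting the article's sequence 50, 35, 78, 27, 56, 90, 45, 40, 48 with this sketch needs only recoloring and no rotation, so 50 stays at the root. Nodes are never freed here, to keep the sketch short.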

Fourth, the red-black tree delete operation

Deletion from a red-black tree is more complicated than insertion. Many references do not explain it clearly: the description of red-black tree deletion in the Tsinghua data structures textbook is confusing, and Mr. Hou's "STL Source Code Analysis" does not cover deletion at all. The delete operation here follows a reference implementation with appropriate changes (the reference implementation uses a sentinel node to represent empty nodes, whereas this one uses null pointers, so null-pointer dereferences have to be ruled out).

Since a red-black tree is a binary search tree, a node is deleted in the same way as in a binary search tree. The difficulty lies not in removing the node but in the adjustment that follows. Deletion is therefore divided into two steps: first determine which node is actually removed, then restore the balance of the tree.

Consider first the choice of the node to remove. If the node to be deleted has one child or none, it is removed and its child (or an empty node) takes its place. If it has two children, it cannot be removed directly. Instead, the minimum node of its right subtree (or the maximum node of its left subtree) is chosen as the node to remove; that node cannot have two children, since otherwise a smaller value would exist. Before the chosen node is removed, its data is copied into the node that was originally to be deleted. The choice is illustrated in Figure 11 and is exactly the same as for a binary search tree.

Figure 11 Selecting the node to delete

The node marked red in Figure 11 is the node that is actually removed (node y). The green node (yold) is the node that was originally to be deleted; because it has two children, y is removed in its place, after the value of y has been copied to yold. Note that the color of yold is not changed. In this way every deletion reduces to deleting a node with at most one child. Next consider rebalancing after y is removed. Once y is gone, its child N (possibly empty) becomes a child of y's parent P, so during rebalancing N is a child of P.
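The selection step can be sketched independently of colors. In the code below, DNode, select_delete_node and unlink_node are hypothetical names; the sketch chooses the minimum of the right subtree as y, copies its value to yold, and then unlinks y, returning the child N that takes its place.

```cpp
#include <cassert>

struct DNode {
    int value;
    DNode* parent;
    DNode* left;
    DNode* right;
};

// Returns the node y that is physically removed when `target` is deleted.
// If target has two children, the value of y is copied into target first;
// target keeps its own links (and, in a red-black tree, its own color).
DNode* select_delete_node(DNode* target) {
    if (!target->left || !target->right) return target;
    DNode* y = target->right;
    while (y->left) y = y->left;  // minimum of the right subtree
    target->value = y->value;
    return y;
}

// Unlinks y, which has at most one child, and returns the child n that
// replaces it under y's parent (null if y had no child).
DNode* unlink_node(DNode* y, DNode*& root) {
    assert(!y->left || !y->right);
    DNode* n = y->left ? y->left : y->right;
    DNode* p = y->parent;
    if (n) n->parent = p;
    if (!p) root = n;             // y was the root
    else if (p->left == y) p->left = n;
    else p->right = n;
    return n;
}
```

In a red-black tree the caller would remember the color of y and the parent P before unlinking, since these decide whether and where the balance adjustment starts. Freeing y is left to the caller.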

Now consider the balance adjustment, starting with the color of the removed node y. If y is red, removing it does not affect the balance of the red-black tree and no adjustment is needed. If y is black, every path through y loses one black node, so the tree loses balance and must be adjusted.

When y is black, consider the color of node N. If N is red, then, because N is the only child of y, setting N to black restores the number of black nodes on the paths that passed through y, and the adjustment is complete. If N is also black, the following four steps are needed.

Let P be the parent of N and W the sibling of N. Assume N is the left child of P; the case where N is the right child is handled symmetrically.

Step 1: If W is red, then the children of W (whether present or empty) and node P must be black (rule 2). Exchange the colors of W and P, perform a left rotation with W as the axis, and finally let W denote the new sibling of N (the root X of the original W's left subtree).

This converts the red-W case into the black-W case. If W was black (or empty) to begin with, go directly to step 2.

Figure 12 Node Deletion scenario 1

Step 2: Whether or not step 1 was applied, step 2 always deals with a black W. Now consider the colors of W's two children X and Y. If X and Y are both black (or empty; if W itself is empty, X and Y are regarded as empty too), set W to red and let N denote its parent P. If the new N is red, set it to black and the adjustment ends. Otherwise return to step 1 and repeat. Note that W and P must be recomputed after N changes.

Node N can be thought of as carrying a double black, since removing a black node reduced the number of black nodes on N's paths by 1. This step moves N upward, reducing its distance from the root; in the extreme case N becomes the root and the tree is balanced (the root is indifferent to one extra black). The balance may also be restored by the later transformations while N moves up.

Figure 13 Node Deletion Scenario 2

Step 3: If the children of W in step 2 are not both black, but the left is red (X red) and the right is black (Y black), set X to black and W to red, perform a right rotation with X as the axis, and finally let W denote the new sibling of N (the original X).

This converts the left-red, right-black case into the left-black, right-red case. If the right child of W was red to begin with (the left child may be black or red), go directly to step 4.

Figure 14 Node Deletion scenario 3

Step 4: This step handles the case where the right child Y of W is red; the left child X may be black or red. Set Y to black, exchange the colors of W and its parent P (W is black; P may be black or red), and then perform a left rotation with W as the axis. The adjustment algorithm ends here.

This step resolves the balance problem completely. In essence it uses the left rotation to restore the number of black nodes on the paths through N: although P and W exchange colors, both are ancestors of N after the rotation, so the number of black nodes on N's paths increases by 1. At the same time the left rotation reduces the number of black nodes on the paths through Y by 1; since Y is red, setting it to black restores that count.

Figure 15 Node Deletion scenario 4

Summing up, the balance adjustment for deleting a node from a red-black tree proceeds as shown below.

Figure 16 Node deletion adjustment process

With this adjustment strategy, the balance problem of red-black tree node deletion is fully resolved.
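The four steps can be condensed into one loop. The sketch below is a hedged illustration rather than the author's __fix: Node, lift, fix_after_delete and check are hypothetical names, empty nodes are null pointers, and the parent P is passed explicitly because N may be null. The mirror case is handled by naming W's children near and far relative to N instead of left and right.

```cpp
#include <cassert>

enum Color { RED, BLACK };

struct Node {
    int value;
    Color color;
    Node *parent, *left, *right;
};

// Empty nodes count as black.
bool is_black(const Node* n) { return !n || n->color == BLACK; }

// Rotates n above its parent p (n is the axis and the new subtree root).
void lift(Node* n, Node*& root) {
    Node* p = n->parent;
    Node* g = p->parent;
    if (p->left == n) {              // right rotation
        p->left = n->right;
        if (n->right) n->right->parent = p;
        n->right = p;
    } else {                         // left rotation
        p->right = n->left;
        if (n->left) n->left->parent = p;
        n->left = p;
    }
    p->parent = n;
    n->parent = g;
    if (!g) root = n;
    else if (g->left == p) g->left = n;
    else g->right = n;
}

// Call after a BLACK node was unlinked; n (possibly null) replaced it
// as a child of p. In a valid tree the sibling w is then never null.
void fix_after_delete(Node* n, Node* p, Node*& root) {
    while (n != root && is_black(n)) {
        bool n_left = (p->left == n);
        Node* w = n_left ? p->right : p->left;
        if (!is_black(w)) {                     // step 1: red sibling
            w->color = BLACK;
            p->color = RED;
            lift(w, root);
            w = n_left ? p->right : p->left;    // new, black sibling
        }
        Node* near = n_left ? w->left : w->right;
        Node* far = n_left ? w->right : w->left;
        if (is_black(near) && is_black(far)) {  // step 2: move N up
            w->color = RED;
            n = p;
            p = n->parent;
            continue;
        }
        if (is_black(far)) {                    // step 3: near red, far black
            near->color = BLACK;
            w->color = RED;
            lift(near, root);
            w = near;
            far = n_left ? w->right : w->left;
        }
        w->color = p->color;                    // step 4: far child is red
        p->color = BLACK;
        far->color = BLACK;
        lift(w, root);
        break;
    }
    if (n) n->color = BLACK;  // covers the red-N case and the step 2 exit
}

// Black height, or -1 if a color rule, parent link or local order fails.
int check(const Node* n) {
    if (!n) return 0;
    if (n->color == RED && (!is_black(n->left) || !is_black(n->right))) return -1;
    if (n->left && (n->left->parent != n || n->left->value >= n->value)) return -1;
    if (n->right && (n->right->parent != n || n->right->value <= n->value)) return -1;
    int l = check(n->left), r = check(n->right);
    if (l < 0 || l != r) return -1;
    return l + (n->color == BLACK);
}
```

As a small example, take a black root 50 whose black left child has just been removed and whose black right child 78 has a single red left child 56. Step 3 lifts 56 above 78 and step 4 lifts it above 50, leaving 56 as the root with the black children 50 and 78.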

Fifth, random testing

Testing the correctness of the data structure covers the following operations: insert, delete, query, traverse and validate. Insertion and deletion, described in detail above, are implemented by insert and remove. The query operation, implemented by find, is called indirectly by insertion and deletion. Traversal comes in forward order (implemented by minimum and next) and reverse order (implemented by maximum and prev). Validation, implemented by validate, checks that the tree is still a legal red-black tree (rules 1, 2 and 3) after each insertion and deletion. Other operations that report statistics of the tree, such as its height, node count and cumulative number of rotations, are easy to implement.

We use a random number generator to produce a batch of data to insert into the red-black tree, and then another random batch as arguments to the delete operation. The validity of the tree is checked after every insertion and deletion, and the nodes of the tree, along with the other statistics, are printed in forward order after the insertions and in reverse order after the deletions. The test code is as follows:

#include "rb_tree.h"
#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace std;

int main()
{
    // Seed from the clock so that every run exercises different data.
    srand((unsigned)time(NULL));
    int times = 10, len = 30;
    while (times--)
    {
        rb_tree<int> tree;
        // Insert random values and validate the tree after every insertion.
        for (int i = 0; i < len; i++)
        {
            int num = rand() % len;
            tree.insert_unique(num);
            if (!tree.validate()) cout << "lost balance on insert" << endl;
        }
        cout << "Forward order: ";
        for (rb_tree<int>::node_type* node = tree.minimum(); node; node = tree.next(node))
        {
            cout << node->value << " ";
        }
        cout << "\nrotations - black height - node count: " << tree.get_rotate_times()
             << " " << tree.height() << " " << tree.count() << endl;
        // Delete random values and validate the tree after every deletion.
        cout << "Deleted: ";
        for (int i = 0; i < len; i++)
        {
            int num = rand() % len;
            if (tree.remove(num)) cout << num << " ";
            if (!tree.validate()) cout << "lost balance on delete" << endl;
        }
        cout << endl;
        cout << "Reverse order: ";
        for (rb_tree<int>::node_type* node = tree.maximum(); node; node = tree.prev(node))
        {
            cout << node->value << " ";
        }
        cout << "\nrotations - black height - node count: " << tree.get_rotate_times()
             << " " << tree.height() << " " << tree.count() << endl;
        cout << "________________________________________________________________________________" << endl;
    }
    return 0;
}

After a large number of random test rounds, the stability of the red-black tree and the correctness of its balancing algorithms can be verified. Figure 17 shows part of the test output.

The source code of the red-black tree built in this article is available at https://github.com/fanzhidongyzby/RBTree.

Interested readers can download it and verify the results.

Figure 17 Test results

In summary, we now have a fuller understanding of the red-black tree data structure, in particular of the complex balancing algorithms for insertion and deletion, and the final test verifies the correctness of the core algorithms. I hope this detailed analysis also conveys the importance of data structures in computer science and that the article is helpful.
