In engineering practice it is often necessary to adapt an existing data structure, but rarely necessary to invent an entirely new one. Usually it is enough to store some additional information in a standard data structure, which can then support the operations the application requires. Augmenting a data structure is not always straightforward, however: the additional information must be updated and maintained by the structure's ordinary operations. This chapter discusses how to augment data structures built on red-black trees.
Dynamic order statistics
Chapter 2 introduced the concept of order statistics. Any order statistic of an unordered set of n elements can be found in O(n) time. This section describes how to modify a red-black tree so that any order statistic can be determined in O(lg n) time.
An order-statistic tree T is simply a red-black tree with additional information stored in each node. Besides the usual fields, node x contains a field size[x], which holds the number of nodes in the subtree rooted at x, counting x itself. Taking the size of the sentinel nil[T] to be 0, we have:
size[x] = size[left[x]] + size[right[x]] + 1
To retrieve the element with the i-th smallest key, call OS-Select(root[T], i):
OS-Select(x, i)
r = size[left[x]] + 1
if i = r
    then return x
else if i < r
    then return OS-Select(left[x], i)
else return OS-Select(right[x], i - r)
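As a rough illustration, the selection procedure can be sketched in Python. This is a minimal sketch under stated assumptions: it uses a plain, unbalanced binary search tree rather than a red-black tree, so the O(lg n) bound holds only when the tree happens to be balanced. The names Node, size, insert and os_select are hypothetical helpers introduced for this example, and None stands in for the sentinel.

```python
class Node:
    """BST node augmented with the size of its subtree (hypothetical helper)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1


def size(node):
    # An absent child plays the role of the sentinel, whose size is 0.
    return node.size if node is not None else 0


def insert(root, key):
    """Plain (unbalanced) BST insert that keeps the size fields correct."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    root.size = size(root.left) + size(root.right) + 1
    return root


def os_select(x, i):
    """Return the node with the i-th smallest key (1-based) in x's subtree."""
    r = size(x.left) + 1          # rank of x within its own subtree
    if i == r:
        return x
    if i < r:
        return os_select(x.left, i)
    return os_select(x.right, i - r)


root = None
for k in [26, 17, 41, 14, 21, 30, 47]:
    root = insert(root, k)
print(os_select(root, 3).key)     # prints 21, the third smallest key
```

The sketch assumes 1 <= i <= size of the tree; an out-of-range i would eventually dereference None.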
To determine the rank of an element x, call OS-Rank(T, x):
OS-Rank(T, x)
r = size[left[x]] + 1
y = x
while y != root[T]
    do if y = right[p[y]]
           then r = r + size[left[p[y]]] + 1
       y = p[y]
return r
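The rank computation walks from x up to the root, so it needs parent pointers. The following Python sketch shows one way this might look. It again assumes a plain unbalanced binary search tree with distinct keys instead of a red-black tree, and Node, size, insert and os_rank are hypothetical names chosen for the example.

```python
class Node:
    """BST node with a parent pointer and a subtree size (hypothetical helper)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None
        self.size = 1


def size(node):
    return node.size if node is not None else 0


def insert(root, key):
    """Iterative unbalanced BST insert; returns (root, new_node)."""
    new = Node(key)
    parent, cur = None, root
    while cur is not None:
        cur.size += 1             # the new node will end up in cur's subtree
        parent = cur
        cur = cur.left if key < cur.key else cur.right
    new.parent = parent
    if parent is None:
        return new, new           # the tree was empty
    if key < parent.key:
        parent.left = new
    else:
        parent.right = new
    return root, new


def os_rank(root, x):
    """Return the 1-based position of node x in an inorder walk of the tree."""
    r = size(x.left) + 1
    y = x
    while y is not root:
        if y is y.parent.right:
            # Everything in the parent's left subtree, plus the parent itself,
            # precedes y in the inorder walk.
            r += size(y.parent.left) + 1
        y = y.parent
    return r


root, nodes = None, {}
for k in [26, 17, 41, 14, 21, 30, 47]:
    root, nodes[k] = insert(root, k)
print(os_rank(root, nodes[30]))   # prints 5
```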
Maintaining subtree sizes:
Given the size field of each node, OS-Select and OS-Rank can quickly compute the required order statistics. However, unless the size fields can be maintained efficiently by the basic modifying operations on red-black trees, the augmentation does not achieve its purpose.
Insert node: while descending from the root to find the insertion position, add 1 to the size field of every node on the path.
Rotation: if rotations are needed after a node is inserted, only the size fields of the two nodes involved in each rotation change, and they can be updated in constant time.
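To see why a rotation costs only constant extra work, consider a left rotation sketched in Python. This is a simplified sketch: it omits parent pointers and colors and returns the new subtree root, leaving it to the caller to reattach it. Node, size and left_rotate are hypothetical names for this example.

```python
def size(node):
    return node.size if node is not None else 0


class Node:
    """BST node whose size is computed from its children (hypothetical helper)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def left_rotate(x):
    """Rotate x's right child y above x and return y, the new subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    y.size = x.size                               # y now roots the same nodes x did
    x.size = size(x.left) + size(x.right) + 1     # x lost y and y's right subtree
    return y


x = Node(10, Node(5), Node(20, Node(15), Node(25)))
y = left_rotate(x)
print(y.key, y.size, y.left.key, y.left.size)     # prints 20 5 10 3
```

Only the two nodes x and y have their size fields rewritten; every other subtree keeps its size.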
How to augment a data structure
Augmenting a data structure generally takes four steps:
1. Choose the underlying data structure.
2. Decide which information serves as the key and which is stored as additional fields. The choice depends on the purpose of the design.
3. Verify that the additional information can be maintained by the ordinary operations of the data structure. In general, if a field of a node depends only on the node itself and its children, updating that field for one node takes only constant time.
4. Design the new operations.
Take the order-statistic tree of the preceding section as an example. Instead of storing each node's rank, we store the number of nodes in its subtree, because the rank is difficult to maintain: inserting a new minimum element, for instance, changes the rank of every node in the tree. With subtree sizes we can still obtain any order statistic in O(lg n) time.
Interval Tree:
A closed interval is an ordered pair of real numbers [t1, t2], where t1 <= t2. Intervals are convenient for representing events that each occupy a continuous period of time. We represent an interval [t1, t2] as an object i with fields low[i] = t1 and high[i] = t2. Any two intervals i and j satisfy exactly one of the following three relationships:
1 i and j overlap, that is, low[i] <= high[j] and low[j] <= high[i]
2 i is to the left of j, that is, high[i] < low[j]
3 i is to the right of j, that is, high[j] < low[i]
We need to maintain a dynamic set of intervals and quickly find an element of the set that overlaps a given interval.
First, construct the interval tree following the four steps described in the previous section:
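The three relationships above can be checked with a few comparisons. The Python sketch below is only illustrative; Interval, overlaps and relation are hypothetical names introduced here, and intervals are treated as closed, so intervals that merely touch at an endpoint count as overlapping.

```python
from collections import namedtuple

# Closed interval [low, high]; assumes low <= high.
Interval = namedtuple("Interval", ["low", "high"])


def overlaps(i, j):
    """True when the closed intervals i and j share at least one point."""
    return i.low <= j.high and j.low <= i.high


def relation(i, j):
    """Classify i relative to j as exactly one of the three cases."""
    if overlaps(i, j):
        return "overlap"
    if i.high < j.low:
        return "left"
    return "right"                # the only remaining case: j.high < i.low


print(relation(Interval(1, 5), Interval(5, 8)))   # prints overlap
print(relation(Interval(1, 4), Interval(5, 8)))   # prints left
print(relation(Interval(9, 12), Interval(5, 8)))  # prints right
```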
1. Basic Data Structure
Choose a red-black tree in which each node x contains an interval field int[x], and whose key is the low endpoint of the interval, low[int[x]]. An inorder traversal of the tree therefore lists the intervals in order of their low endpoints.
2. Additional information
In addition to the interval itself, each node x contains a value max[x], the maximum value of any interval endpoint stored in the subtree rooted at x.
3. Information Maintenance
Verify that the max field can be maintained during ordinary insert and delete operations:
max[x] = max(high[int[x]], max[left[x]], max[right[x]]). Since max[x] depends only on x and its two children, updating the max field of a node takes only constant time.
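The recurrence for the max field can be sketched in Python as a small helper that recomputes a node's value from its own interval and its children. This is a sketch under stated assumptions: the tree is built by hand rather than by red-black insertion, None stands in for the sentinel (treated as having max equal to negative infinity), and Node, subtree_max and update_max are hypothetical names.

```python
NEG_INF = float("-inf")


def subtree_max(node):
    # An absent child contributes nothing to the maximum.
    return node.max if node is not None else NEG_INF


def update_max(x):
    """Recompute x.max from x's own interval and its children, in O(1)."""
    x.max = max(x.high, subtree_max(x.left), subtree_max(x.right))


class Node:
    """Interval-tree node keyed on low, with the max field (hypothetical helper)."""
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        update_max(self)


# Children are constructed before their parent, so each max is ready in time.
root = Node(16, 21,
            Node(8, 9, Node(5, 8), Node(15, 23)),
            Node(25, 30))
print(root.max, root.left.max)    # prints 30 23

# After attaching a new leaf, only the nodes on the path to the root change.
root.right.right = Node(26, 40)
update_max(root.right)
update_max(root)
print(root.max)                   # prints 40
```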
4. New operations
Interval-Search(T, i)
1. x = root[T]
2. while x != nil[T] and i does not overlap int[x]
3.     do if left[x] != nil[T] and max[left[x]] >= low[i]
4.            then x = left[x]
5.            else x = right[x]
6. return x
The key to the algorithm is the choice made in lines 3-5. Suppose node x does not overlap i. If x has a left subtree and the maximum endpoint in that subtree is at least low[i], the search moves to the left subtree. This choice is safe, because under this condition, if no interval in the left subtree overlaps i, then no interval in the right subtree does either. Since max[left[x]] >= low[i], the left subtree must contain a node y with high[y] >= low[i]. If y does not overlap i, then low[y] > high[i]. Because the tree is keyed on low endpoints, the low endpoint of every node in the right subtree of x is at least low[y] and therefore greater than high[i], so none of those intervals can overlap i. Conversely, when the search moves right, either the left subtree is empty or max[left[x]] < low[i], so every interval in the left subtree ends before i begins and cannot overlap i.
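The search loop can be sketched in Python as follows. This is a minimal sketch, not a full interval tree: it assumes a plain unbalanced binary search tree keyed on the low endpoint, supports insertion only (so max can only grow along the insertion path), and uses None for the sentinel. Node, insert and interval_search are hypothetical names for this example.

```python
class Node:
    """Node keyed on low, storing the max endpoint of its subtree (hypothetical)."""
    def __init__(self, low, high):
        self.low, self.high = low, high
        self.left = None
        self.right = None
        self.max = high


def insert(root, low, high):
    """Unbalanced BST insert keyed on low; updates max along the path."""
    if root is None:
        return Node(low, high)
    if low < root.low:
        root.left = insert(root.left, low, high)
    else:
        root.right = insert(root.right, low, high)
    root.max = max(root.max, high)    # valid because insertion never lowers max
    return root


def interval_search(root, low, high):
    """Return some node whose interval overlaps [low, high], or None."""
    x = root
    while x is not None and not (x.low <= high and low <= x.high):
        if x.left is not None and x.left.max >= low:
            x = x.left                # an overlap, if any exists, is on the left
        else:
            x = x.right               # nothing on the left can overlap
    return x


root = None
for lo, hi in [(16, 21), (8, 9), (25, 30), (5, 8), (15, 23),
               (17, 19), (26, 26), (0, 3), (6, 10), (19, 20)]:
    root = insert(root, lo, hi)

hit = interval_search(root, 22, 25)
print((hit.low, hit.high))                # prints (15, 23)
print(interval_search(root, 11, 14))      # prints None
```

The search returns one overlapping interval, not all of them, and follows a single root-to-leaf path, so its cost is proportional to the height of the tree.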