Summary:
Ordinary BST does not guarantee a good performance after multiple operations. So we need a BST, yes. The cost of each insertion and search is a logarithmic function of the large size of the tree. The 2-3-4 tree is a kind of BST that maintains balance.
2-3-4 Tree Introduction
To ensure that BST is balanced, we need to have some flexibility in the tree. 2-3-4 Tree Selection method is to make the node can hold more than one keyword. More specifically, we will use three nodes: 2-node, 3-node, 4-node. Can accommodate 1, 2, 3 keywords respectively. Keyword divides subsequent child nodes into intervals.
Nature : A balanced 2-3-4 search tree, all links to the empty tree are the same distance from the root.
explanation : The so-called empty tree is an external link, which is the two null link of the leaf node. So it can also be understood that all leaf nodes in the balanced 2-3-4 tree have the same depth.
Insert a new keyword in the 2-3-4 tree
In order to insert a new node into the 2-3-4 tree, we can perform a failed search first and then insert the node. But this new tree will be unbalanced. The main reason why the 2-3-4 tree is important is that in each case, we can insert and still maintain a perfectly balanced state. For example: When a new keyword is inserted into a 2-node, you can turn it into a 3-node. When you insert a new keyword into a 3-node, you can turn it into a 4-node. Beware when inserting a keyword into a 4-node, you can split it into two 2-nodes and pass the middle keyword up and into the parent node. If the parent node of the split 4-junction is also a 4-node, it can continue to split upward. An easier way to do this is to split any encountered 4-node in the search for the tree downward, thus guaranteeing that the search path will not end at a 4-node point. When we insert from top to bottom, all 4-node encounters are split, which guarantees that the starting point will never be a 4-node when entering the next level.
The key point is that these divisions are completely local--except for the split nodes and their parent nodes, the other parts don't need to be checked or modified, and are completely unaffected. This guarantees the balance of our books.
Another detail problem is that as long as the root becomes a 4-node, it is split into three 2-nodes in time. Splitting the roots is the only act that adds a layer of depth to the tree.
Below we give a 2-3-4 tree insert legend.
Nature : Searches in the 2-3-4 tree of n nodes, with a maximum of lgn+1 nodes.
explanation : According to the basic nature, we have roots to the leaf node depth are all equal. The insert routine has no effect on the depth of the tree, only the depth plus one when splitting the root. If all nodes are 2-node, it is easy to obtain the above results according to the properties of BST, and when there are 3-and 4-nodes, the height is only as low as possible.
Nature : Inserts a 2-3-4 tree of n nodes, and in the worst case, splits less than lgn+1 nodes. On average, the number of splits required may be less than 1.
explanation : The worst possible scenario is that all nodes on the path leading to the insertion point are 4-nodes and will all be split. However, this is generally not the case because there are not many 4-node nodes in the tree.
Discuss
For the 2-3-4 Tree search operation, the above description can give a worst-case scenario to achieve good performance algorithm. Although we can actually write such algorithms, the node control and fragmentation involved are too complex. These additional costs can cause these algorithms to be slower than standard BST. It should be taken into account that the main purpose of the balance is to provide the worst-case protection, and the cost of providing safeguards should of course be as low as possible.
So we're going to use the red black tree as a vector to implement the 2-3-4 tree. The next chapter will discuss the nature and implementation of the red and black trees in depth.
Data structure selected -2-3-4 trees (2-3-4 tree)