B tree, B-tree, B+ tree, and B* tree
Note: This article only introduces the definitions of these trees and compares them. It does not cover operations such as insertion, deletion, splitting, and merging. These will be introduced in later articles.
B tree
That is, the binary search tree:
1. every non-leaf node has at most two children (left and right);
2. every node stores one keyword;
3. the left pointer of a non-leaf node points to the subtree whose keywords are smaller than its keyword, and the right pointer points to the subtree whose keywords are larger than its keyword.
For example:
A B tree search starts from the root node. If the query keyword is the same as the node keyword, it is a hit.
Otherwise, if the query keyword is smaller than the node keyword, the search enters the left child; if it is larger, the search enters the right child.
If the pointer to that left or right child is null, the search reports that the keyword was not found.
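The search procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `Node`, `bst_search`, and the sample tree are assumptions made for the example and do not come from the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bst_search(root: Optional[Node], key: int) -> Optional[Node]:
    """Return the node holding `key`, or None if the key is absent."""
    node = root
    while node is not None:
        if key == node.key:
            return node  # hit
        # Smaller keys live in the left subtree, larger keys in the right.
        node = node.left if key < node.key else node.right
    return None  # reached a null child pointer: keyword not found


# A small hand-built tree (hypothetical sample data):
#         8
#        / \
#       3   10
#      / \
#     1   6
root = Node(8, Node(3, Node(1), Node(6)), Node(10))
found = bst_search(root, 6)
missing = bst_search(root, 7)
```

Here `found` is the node with keyword 6, while `missing` is `None` because the search runs into a null pointer below node 6.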
If the left and right subtrees of every non-leaf node in a B tree have almost the same number of nodes (the tree is balanced), B tree search performance approaches that of binary search.
It also has an advantage over binary search on contiguous memory: changing the B tree structure (inserting and deleting nodes) does not require moving large segments of memory data, and often takes only constant overhead.
However, after multiple insertions and deletions, a B tree may end up with different structures:
The tree on the right is also a B tree, but its search performance is linear. The same keyword set may lead to different tree indexes. So when using a B tree, one should also consider keeping the tree in the structure of the left figure and avoiding the structure of the right figure. This is the so-called "balance" problem.
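To make the effect of insertion order concrete, here is a minimal sketch that builds two trees from the same seven keywords. The plain, unbalanced `insert` helper, the `height` and `build` functions, and the sample key orders are assumptions for illustration only; the article itself does not define insertion.

```python
from typing import Iterable, Optional


class Node:
    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Plain insertion with no rebalancing; duplicate keys are ignored."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right
        else:
            return root


def height(root: Optional[Node]) -> int:
    """Number of nodes on the longest path from the root down to a leaf."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(keys: Iterable[int]) -> Optional[Node]:
    root = None
    for k in keys:
        root = insert(root, k)
    return root


balanced = build([4, 2, 6, 1, 3, 5, 7])    # middle keys first
degenerate = build([1, 2, 3, 4, 5, 6, 7])  # already sorted input
print(height(balanced), height(degenerate))
```

With this input the first tree has height 3 and the second has height 7: the sorted insertion order produces a chain, so a search in it visits up to all seven nodes.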
The B trees used in practice add a balancing algorithm on top of the original B tree, which gives the "balanced binary tree". How to keep the nodes of a B tree evenly distributed is the key to a balanced binary tree. The balancing algorithm is a policy for inserting and deleting nodes in a B tree.
B-tree
A B-tree is a multi-way search tree (not binary):
1) Each node x has the following attributes:
- node x holds n keywords;
- the keywords themselves are stored in non-decreasing order;
- a Boolean value indicates whether the node is a leaf node.
2) Each internal node x also contains n + 1 pointers to its children.
3) All leaf nodes have the same depth.
4) The number of keywords in each node has a lower and an upper bound, expressed by the minimum degree t >= 2 of the B-tree:
- every node except the root must have at least t - 1 keywords, so every internal node except the root has at least t children;
- every node contains at most 2t - 1 keywords, so an internal node has at most 2t children.
The case t = 2 is the simplest: each internal node has 2, 3, or 4 children, which gives a 2-3-4 tree.
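The bounds in rule 4 follow directly from the minimum degree t. A small sketch of the arithmetic, where the function name `btree_bounds` and the dictionary keys are made up for this example:

```python
from typing import Dict


def btree_bounds(t: int) -> Dict[str, int]:
    """Keyword and child limits of a non-root node for minimum degree t."""
    if t < 2:
        raise ValueError("the minimum degree t must be at least 2")
    return {
        "min_keys": t - 1,
        "max_keys": 2 * t - 1,
        "min_children": t,      # applies to internal nodes
        "max_children": 2 * t,  # applies to internal nodes
    }


print(btree_bounds(2))
```

For t = 2 this yields 1 to 3 keywords and 2 to 4 children per internal node, which matches the 2-3-4 tree mentioned above.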
For example (M = 3):
A B-tree search starts from the root node and performs a binary search on the (ordered) keyword sequence in the node. If it hits, the search ends. Otherwise, it enters the child node that covers the range containing the query keyword. This repeats until the corresponding child pointer is null or the node is already a leaf node.
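The search just described can be sketched as follows. The `BTreeNode` class, the `btree_search` function, and the sample tree are assumptions for illustration; the binary search inside a node uses `bisect_left` from Python's standard library.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BTreeNode:
    keys: List[int]  # keywords, kept sorted
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf

    @property
    def leaf(self) -> bool:
        return not self.children


def btree_search(node: BTreeNode, key: int) -> Optional[Tuple[BTreeNode, int]]:
    """Return (node, index) where `key` is stored, or None if it is absent."""
    while True:
        i = bisect_left(node.keys, key)  # binary search inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return node, i               # hit, possibly at a non-leaf node
        if node.leaf:
            return None                  # nowhere left to descend
        node = node.children[i]          # child covering the key's range


# Hypothetical two-level tree used only for this example.
root = BTreeNode(
    keys=[10, 20],
    children=[BTreeNode([3, 7]), BTreeNode([13, 17]), BTreeNode([25, 30])],
)
hit_in_root = btree_search(root, 20)
hit_in_leaf = btree_search(root, 17)
miss = btree_search(root, 15)
```

Searching for 20 ends at the root, searching for 17 ends in the middle leaf, and searching for 15 returns `None` after reaching a leaf.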
Features of the B-tree:
1. the set of keywords is distributed across the entire tree;
2. any keyword appears in only one node;
3. the search may end at a non-leaf node;
4. its search performance is equivalent to performing a binary search over the full set of keywords;
5. automatic level control.
B+ tree
The B+-tree is a kind of B-tree that arose to meet the needs of file systems.
The similarities and differences between an order-m B+ tree and an order-m B-tree are:
A non-terminal node contains only the smallest keyword of the root node of each child subtree, together with the pointer to that child.
Why is the B+-tree more suitable than the B-tree for operating-system file indexes and database indexes in practice?
1. The B+-tree has a lower disk read/write cost
The internal nodes of a B+-tree hold no pointers to the specific information of keywords, so its internal nodes are smaller than those of a B-tree. If all the keywords of one internal node are stored in the same disk block, the block can hold more keywords, and more of the keywords to be searched are read into memory at one time. The number of IO reads and writes is correspondingly reduced.
For example, assume that a disk block holds 16 bytes, a keyword takes 2 bytes, and a pointer to a keyword's specific information takes 2 bytes. An internal node of an order-9 B-tree (a node can contain up to eight keywords) requires two disk blocks, while an internal node of the B+ tree needs only one. When internal nodes have to be read into memory, the B-tree costs one more disk-block seek than the B+ tree (on a disk, this is the platter rotation time).
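The block counts in this example can be checked with a short calculation. The sketch below uses the sizes stated above and, like the example, ignores the space taken by child pointers; the constant and function names are assumptions made for illustration.

```python
import math

BLOCK_SIZE = 16     # bytes per disk block (from the example above)
KEY_SIZE = 2        # bytes per keyword
INFO_PTR_SIZE = 2   # bytes per pointer to a keyword's specific information
MAX_KEYS = 8        # keywords per node in an order-9 tree


def blocks_needed(entry_size: int, entries: int = MAX_KEYS,
                  block_size: int = BLOCK_SIZE) -> int:
    """Disk blocks needed to store `entries` entries of `entry_size` bytes."""
    return math.ceil(entries * entry_size / block_size)


# B-tree internal node: each keyword carries its information pointer.
btree_blocks = blocks_needed(KEY_SIZE + INFO_PTR_SIZE)
# B+ tree internal node: keywords only, no information pointers.
bplus_blocks = blocks_needed(KEY_SIZE)
print(btree_blocks, bplus_blocks)
```

The B-tree node needs 8 * 4 = 32 bytes (two blocks), while the B+ tree node needs 8 * 2 = 16 bytes (one block).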
2. The B+-tree has more stable query efficiency
Non-terminal nodes do not point to the file content itself; they are only an index of the keywords in the leaf nodes. Therefore, any keyword search must follow a path from the root node to a leaf node. The path length of all keyword queries is the same, so every data item has the same query efficiency.
3. The main reason database indexes use the B+ tree
The B-tree improves disk IO performance, but it does not solve the problem of inefficient element traversal. The B+ tree came into being to solve this problem. A B+ tree can traverse the whole tree just by traversing its leaf nodes. In addition, range-based queries are very frequent in databases, and the B-tree does not support such operations (or its efficiency is too low).
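A range query over the leaf level can be sketched as below, assuming the leaves are chained left to right with a `next` pointer. The `Leaf` class, the `range_scan` function, and the sample keys are assumptions for this example. A real B+ tree would first descend from the root to find the starting leaf; the sketch starts at the first leaf to stay short.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Leaf:
    keys: List[int]                # sorted keywords stored in this leaf
    next: Optional["Leaf"] = None  # assumed link to the next leaf on the right


def range_scan(first_leaf: Optional[Leaf], lo: int, hi: int) -> Iterator[int]:
    """Yield every keyword k with lo <= k <= hi by walking the leaf chain."""
    leaf = first_leaf
    # Skip leaves whose keywords all lie below the range.
    while leaf is not None and (not leaf.keys or leaf.keys[-1] < lo):
        leaf = leaf.next
    while leaf is not None:
        for k in leaf.keys[bisect_left(leaf.keys, lo):]:
            if k > hi:
                return             # past the end of the range: stop early
            yield k
        leaf = leaf.next


# Three chained leaves (hypothetical sample data).
c = Leaf([9, 11])
b = Leaf([5, 7], c)
a = Leaf([1, 3], b)
in_range = list(range_scan(a, 4, 9))
everything = list(range_scan(a, 1, 11))
```

`in_range` is `[5, 7, 9]`, and scanning the full key range visits every keyword in order without touching any internal node.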
B* tree
The B*-tree is a variant of the B+-tree. On top of the B+ tree (all leaf nodes contain the information of all keywords and pointers to the records containing these keywords), the B* tree adds pointers to siblings in its non-root, non-leaf nodes. The B* tree also requires each non-leaf node to hold at least (2/3)*M keywords, that is, the minimum block utilization is 2/3 (instead of the 1/2 of the B+ tree). A simple example is provided, as shown in the figure:
Splitting in a B+ tree: when a node is full, allocate a new node, copy 1/2 of the original node's data to the new node, and add a pointer to the new node in the parent node. A B+ tree split affects only the original node and the parent node, not the sibling nodes, so no pointer to the sibling is needed.
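A minimal sketch of the B+ tree split on a leaf, with a node modeled as a sorted Python list. The function name `bplus_leaf_insert` and the `capacity` parameter are assumptions for illustration, and updating the parent is left to the caller.

```python
from bisect import insort
from typing import List, Optional, Tuple


def bplus_leaf_insert(leaf: List[int], key: int,
                      capacity: int) -> Tuple[List[int], Optional[List[int]]]:
    """Insert `key`; return (node, new_node), where new_node is None if no split."""
    keys = list(leaf)
    insort(keys, key)              # keep the keywords sorted
    if len(keys) <= capacity:
        return keys, None          # still fits: no split
    mid = len(keys) // 2
    # The new node receives the upper half; the caller would then add a
    # pointer to it (keyed by new_node[0]) in the parent node.
    return keys[:mid], keys[mid:]


no_split = bplus_leaf_insert([1, 2], 3, capacity=4)
split = bplus_leaf_insert([1, 2, 3, 5], 4, capacity=4)
```

`no_split` is `([1, 2, 3], None)` and `split` is `([1, 2], [3, 4, 5])`. Only the original node and its parent are involved, and no sibling is touched.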
Splitting in a B* tree: when a node is full and its next sibling node is not full, move part of the data to the sibling node, insert the keyword into the original node, and finally update the sibling's keyword in the parent node (because the sibling's keyword range has changed). If the sibling node is also full, add a new node between the original node and the sibling, copy 1/3 of the data of each into the new node, and then add a pointer to the new node in the parent node.
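The two cases can be sketched with sorted lists standing in for nodes. This is a simplification under stated assumptions: instead of moving individual keywords, the sketch pools the keywords of the node, its right sibling, and the new keyword, then redistributes them evenly. The function name `bstar_insert` and the `capacity` parameter are hypothetical, and parent updates are omitted.

```python
from bisect import insort
from typing import List


def bstar_insert(node: List[int], sibling: List[int], key: int,
                 capacity: int) -> List[List[int]]:
    """Insert `key` into `node`; return the resulting nodes from left to right."""
    if len(node) < capacity:
        keys = list(node)
        insort(keys, key)
        return [keys, list(sibling)]        # room available: plain insert
    merged = sorted(node + sibling + [key])
    n = len(merged)
    if len(sibling) < capacity:
        # Sibling has room: shift keywords over, no new node is allocated.
        half = (n + 1) // 2
        return [merged[:half], merged[half:]]
    # Both nodes are full: two nodes become three, each about 2/3 full.
    a = (n + 2) // 3
    b = (n - a + 1) // 2
    return [merged[:a], merged[a:a + b], merged[a + b:]]


shifted = bstar_insert([1, 2, 3, 5, 6, 7], [8, 9], 4, capacity=6)
split = bstar_insert([1, 2, 3, 5, 6, 7], [8, 9, 10, 11, 12, 13], 4, capacity=6)
```

In the first call the sibling absorbs the overflow and the node count stays at two. In the second call both nodes are full, so a third node is created and each of the three ends up with at least 4 of 6 slots used.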
So the B* tree allocates new nodes with a lower probability than the B+ tree, and its space utilization is higher.