[Repost] Binary Search Tree, B-Tree, B+ Tree, B* Tree

Source: Internet
Author: User

Binary Search Tree

1. Every non-leaf node has at most two children (left and right);

2. Every node stores one key;

3. The left pointer of a non-leaf node points to the subtree whose keys are smaller than its own key, and the right pointer points to the subtree whose keys are larger;

[Figure: example binary search tree]

To search a binary search tree, start at the root. If the query key equals the node's key, the search hits. Otherwise, if the query key is smaller than the node's key, go to the left child; if it is larger, go to the right child. If the pointer to that left or right child is empty, report that the key was not found;
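The search procedure above can be sketched in a few lines of Python. This is only an illustration: the `Node` class and the `bst_search` name are made up for this example and are not part of the original article.

```python
class Node:
    """A binary search tree node: one key and two child pointers (hypothetical helper)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def bst_search(root, key):
    """Return the node holding `key`, or None if the key is not in the tree."""
    node = root
    while node is not None:
        if key == node.key:
            return node  # hit
        # Smaller keys live in the left subtree, larger keys in the right one.
        node = node.left if key < node.key else node.right
    return None  # ran into an empty child pointer: key not found


# A small hand-built tree used only for demonstration.
tree = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
```

Calling `bst_search(tree, 6)` walks 8, 3, 6 and returns the node holding 6, while `bst_search(tree, 7)` falls off the tree below 6 and returns `None`.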

If the left and right subtrees of every non-leaf node hold roughly the same number of nodes (that is, the tree is balanced), the search performance of a binary search tree approaches that of binary search. Its advantage over binary search on contiguous memory is that changing the tree structure (inserting and deleting nodes) does not require moving large blocks of data in memory, and usually costs only constant overhead;

[Figure: example of a balanced binary search tree]

However, after many insertions and deletions, a binary search tree may end up with a different structure:

[Figure: two binary search trees over the same keys, left and right]

The tree on the right is still a binary search tree, but its search performance has degraded to linear. The same key set can produce differently shaped tree indexes, so when using a binary search tree one should try to keep the shape of the left tree and avoid the shape of the right tree. This is the so-called "balance" problem;
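A small experiment makes the balance problem concrete. The sketch below inserts the same seven keys in two different orders and compares the resulting heights; the list-based node layout `[key, left, right]` and the function names are assumptions made for this example only.

```python
def bst_insert(node, key):
    """Insert `key` into a BST of [key, left, right] lists; return the subtree root."""
    if node is None:
        return [key, None, None]
    if key < node[0]:
        node[1] = bst_insert(node[1], key)
    elif key > node[0]:
        node[2] = bst_insert(node[2], key)
    return node  # duplicate keys are ignored


def height(node):
    """Number of levels in the tree (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node[1]), height(node[2]))


def build(keys):
    """Build a BST by inserting the keys in the given order, with no rebalancing."""
    root = None
    for k in keys:
        root = bst_insert(root, k)
    return root


balanced = build([4, 2, 6, 1, 3, 5, 7])    # middle keys first
degenerate = build([1, 2, 3, 4, 5, 6, 7])  # already sorted input
print(height(balanced), height(degenerate))
```

The first order gives a tree of height 3, the second a chain of height 7 in which every lookup is effectively a linear scan.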

In practice, a binary search tree is used together with a balancing algorithm added on top of the basic tree, which gives a "balanced binary tree". Keeping the nodes of the tree evenly distributed is the key to a balanced binary tree; a balancing algorithm is a strategy for inserting and deleting nodes in a binary search tree;

B-Tree

In 1970, R. Bayer and E. McCreight proposed a tree suited to external search: a balanced multi-way tree called the B-tree (also written B_tree). A B-tree of order M is a balanced M-way search tree. It is either an empty tree or a tree that satisfies the following properties:

1. The root node has at least two children;

2. The number of keys j in each non-root node satisfies ⌈M/2⌉ - 1 <= j <= M - 1;

3. Every internal node other than the root has exactly one more subtree than it has keys, so its number of subtrees k satisfies ⌈M/2⌉ <= k <= M;

4. All leaf nodes are on the same level.

Within each node the keys are sorted in ascending order, and when a node's children are non-leaf nodes, its k - 1 keys partition the key ranges covered by its k children. Because leaf nodes contain no keys in this definition, they can be regarded as external nodes that do not actually exist in the tree: the pointers to them are null, and the number of leaf nodes is exactly the total number of keys in the tree plus 1.

A B-tree node with n keys and n + 1 pointers has the general form (n, P0, K1, P1, K2, P2, ..., Kn, Pn), where each Ki is a key with K1 < K2 < ... < Kn, and Pi points to the subtree whose keys lie between Ki and Ki+1.
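To get a feel for the bounds in this definition, here is a tiny helper that evaluates them for a given order. The function name `btree_bounds` and the dictionary layout are invented for this illustration.

```python
def btree_bounds(order):
    """Key and subtree limits for a non-root internal node of a B-tree of the given order."""
    min_children = (order + 1) // 2  # integer form of ceil(order / 2)
    return {
        "min_keys": min_children - 1,
        "max_keys": order - 1,
        "min_children": min_children,
        "max_children": order,
    }


for order in (3, 4, 5):
    print(order, btree_bounds(order))
```

For order 3 this gives 1 to 2 keys and 2 to 3 subtrees per node; for order 5 it gives 2 to 4 keys and 3 to 5 subtrees.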

Put another way, a B-tree is a multi-way search tree (not a binary one) with the following properties:

1. By definition, any non-leaf node has at most M children, with M > 2;

2. The root node has between 2 and M children, i.e. [2, M];

3. Every non-leaf node other than the root has between ⌈M/2⌉ and M children, i.e. [⌈M/2⌉, M];

4. Every node other than the root stores at least ⌈M/2⌉ - 1 (M/2 rounded up, minus 1) and at most M - 1 keys; the root may hold as few as one key;

5. In a non-leaf node, number of keys = number of child pointers - 1;

6. Keys of a non-leaf node: K[1], K[2], ..., K[M-1], with K[i] < K[i+1];

7. Child pointers of a non-leaf node: P[1], P[2], ..., P[M], where P[1] points to the subtree whose keys are less than K[1], P[M] points to the subtree whose keys are greater than K[M-1], and every other P[i] points to the subtree whose keys lie in the open interval (K[i-1], K[i]);

8. All leaf nodes are on the same level;

[Figure: example B-tree with M = 3]

To search a B-tree, start at the root and run a binary search over the (ordered) key sequence inside the node. If the key is found, the search ends with a hit; otherwise, descend into the child whose range contains the query key. Repeat until the corresponding child pointer is empty or the node is already a leaf;
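The B-tree search just described might look like this in Python. The `BTreeNode` class is a simplified stand-in invented for this example (sorted keys plus a list of children, empty for a leaf); `bisect_left` from the standard library does the binary search inside each node.

```python
from bisect import bisect_left


class BTreeNode:
    """Illustrative B-tree node: sorted keys and len(keys) + 1 children (none for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def btree_search(node, key):
    """Return the node containing `key`, or None if the key is not in the tree."""
    while node is not None:
        i = bisect_left(node.keys, key)  # binary search inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return node                  # hit, possibly at a non-leaf node
        if not node.children:
            return None                  # reached a leaf without a hit
        node = node.children[i]          # child whose range contains the key
    return None


# A hand-built tree of order M = 3, used only for demonstration.
root = BTreeNode([20, 40], [
    BTreeNode([5, 10]),
    BTreeNode([30]),
    BTreeNode([50, 60]),
])
```

Here `btree_search(root, 40)` hits at the root itself, `btree_search(root, 30)` hits in the middle child, and `btree_search(root, 35)` returns `None`.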

B-tree features:

1. The key set is distributed across the whole tree;

2. Every key appears in exactly one node;

3. A search may end at a non-leaf node;

4. Search performance is equivalent to one binary search over the complete key set;

5. The height of the tree is controlled automatically;

Because every non-leaf node other than the root must have at least ⌈M/2⌉ children, each node is guaranteed a minimum utilization, which bounds the worst-case search cost in terms of M and N, where M is the maximum number of subtrees of a non-leaf node and N is the total number of keys;

So the performance of a B-tree is always equivalent to binary search (independent of the value of M), and the balancing problem of the binary search tree does not arise;
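The claim that performance does not depend on M can be checked with a short estimate. This is a standard back-of-envelope argument, not necessarily the author's exact expression. It assumes every non-root node has at least ⌈M/2⌉ children, that h counts the levels of nodes holding keys, and that each node is searched with binary search:

```latex
\begin{aligned}
h &\le \log_{\lceil M/2 \rceil} \frac{N+1}{2} + 1, \\
\text{comparisons} &\le h \cdot \lceil \log_2 M \rceil
  = O\!\left( \frac{\log_2 N}{\log_2 \lceil M/2 \rceil} \cdot \log_2 M \right)
  = O(\log_2 N).
\end{aligned}
```

The last step holds because the ratio of log2 M to log2 ⌈M/2⌉ is at most 2 for every M >= 3, so M only changes the constant factor.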

Because of the M/2 constraint, if a node is already full when a key is inserted, it must be split into two nodes of about M/2 keys each; when a key is deleted, two sibling nodes that have each dropped below M/2 must be merged;
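As a rough sketch of the split step, the function below divides the keys of an overfull node around the median. Promoting the median key to the parent is the usual textbook approach and is an assumption here, since the text only says the node is split into two halves; the name `split_keys` is likewise made up.

```python
def split_keys(keys):
    """Split the sorted keys of an overfull node around the median.

    Returns (left_keys, median, right_keys). In the usual scheme the median
    moves up into the parent and the two halves become separate nodes.
    """
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


# Order M = 5: a node holds at most 4 keys, so 5 keys is one too many.
left, median, right = split_keys([10, 20, 30, 40, 50])
print(left, median, right)
```

With M = 5 each half ends up with 2 keys, which is exactly the minimum ⌈M/2⌉ - 1 allowed for a non-root node.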

B+ Tree

The B+ tree is a variant of the B-tree and is also a multi-way search tree:

1. Its definition is basically the same as that of the B-tree, except that:

2. A non-leaf node has as many subtree pointers as it has keys;

3. The subtree pointer P[i] of a non-leaf node points to the subtree whose keys lie in the half-open interval [K[i], K[i+1]) (the B-tree uses open intervals);

4. A chain pointer is added to every leaf node;

5. All keys appear in the leaf nodes;

[Figure: example B+ tree with M = 3]

Searching a B+ tree is basically the same as searching a B-tree, except that a B+ tree only hits at a leaf node (a B-tree can hit at a non-leaf node); its performance is likewise equivalent to one binary search over the complete key set;
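A minimal sketch of B+ tree search, following the convention used in this article: a non-leaf node has as many keys as child pointers, and child i covers the keys in [K[i], K[i+1]). The `BPlusNode` class and the function name are hypothetical and exist only for this example.

```python
from bisect import bisect_left, bisect_right


class BPlusNode:
    """Illustrative B+ tree node; `children` is None for a leaf."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children


def bplus_search(node, key):
    """Return True if `key` is stored in a leaf of the tree rooted at `node`."""
    while node.children is not None:
        i = bisect_right(node.keys, key) - 1  # last i with keys[i] <= key
        if i < 0:
            return False                      # smaller than every key in the tree
        node = node.children[i]               # keep descending: no hit at non-leaf nodes
    i = bisect_left(node.keys, key)           # binary search inside the leaf
    return i < len(node.keys) and node.keys[i] == key


# A hand-built two-level tree with M = 3, used only for demonstration.
leaves = [BPlusNode([5, 8]), BPlusNode([10, 15]), BPlusNode([20, 26])]
root = BPlusNode([5, 10, 20], leaves)
```

Note that key 10 also appears in the root, but the search still continues down to the leaf before reporting a hit.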

B+ tree features:

1. All keys appear in the linked list of leaf nodes (a dense index), and the keys in that list are in sorted order;

2. A search can never hit at a non-leaf node;

3. The non-leaf nodes act as an index over the leaf nodes (a sparse index), and the leaf nodes act as the data layer that stores the (key) data;

4. It is better suited to file index systems;
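The leaf chain is what makes ordered range scans cheap. The sketch below walks the chain from the first leaf; a real implementation would first descend the index to find the starting leaf. The `Leaf` class and both function names are assumptions made for this example.

```python
class Leaf:
    """Illustrative B+ tree leaf: sorted keys plus a chain pointer to the next leaf."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None


def link(leaves):
    """Chain the leaves from left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]


def range_scan(first_leaf, lo, hi):
    """Collect every key k with lo <= k <= hi by following the leaf chain."""
    out = []
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return out  # keys are sorted, so nothing further can match
            if k >= lo:
                out.append(k)
        leaf = leaf.next
    return out


head = link([Leaf([5, 8]), Leaf([10, 15]), Leaf([20, 26])])
print(range_scan(head, 8, 20))
```

The scan never touches the non-leaf nodes, which is why the leaf level can be treated as the data layer.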

B* Tree

The B* tree is a variant of the B+ tree that adds pointers to siblings in the non-root, non-leaf nodes of the B+ tree;

The B* tree requires each non-leaf node to hold at least (2/3)*M keys, that is, the minimum block utilization is 2/3 (instead of the 1/2 of the B+ tree);

B+ tree split: when a node is full, allocate a new node, copy half of the original node's data into it, and finally add a pointer to the new node in the parent. A B+ tree split affects only the original node and its parent, not the sibling nodes, so no sibling pointers are needed;
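A rough sketch of this leaf split, under the assumption that the parent stores a copy of the first key of each child (the convention used for non-leaf nodes earlier in the article). The function name `bplus_split_leaf` is invented for the example.

```python
def bplus_split_leaf(keys):
    """Split a full leaf: keep the first half, copy the second half to a new leaf.

    Returns (old_leaf_keys, new_leaf_keys, separator). The separator is a copy
    of the new leaf's first key; it is added to the parent together with a
    pointer to the new leaf. Unlike a B-tree split, no key leaves the leaf level.
    """
    mid = len(keys) // 2
    new_leaf = keys[mid:]
    return keys[:mid], new_leaf, new_leaf[0]


old, new, separator = bplus_split_leaf([5, 8, 10, 15])
print(old, new, separator)
```

Only the original leaf and its parent change, which matches the observation that siblings are not involved.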

B* tree split: when a node is full, if its next sibling is not full, move part of the data to the sibling, insert the key into the original node, and then update the sibling's key in the parent (because the sibling's key range has changed). If the sibling is also full, add a new node between the original node and the sibling, copy 1/3 of the data from each of them into the new node, and finally add a pointer to the new node in the parent;

Therefore, a B* tree allocates new nodes less often than a B+ tree and achieves higher space utilization;
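The two B* cases can be sketched on plain key lists. This is a simplification: it redistributes keys evenly instead of moving just enough to make room, it only handles leaves, and it leaves the parent update to the caller. The function name `bstar_insert` and the `capacity` parameter are assumptions for this example.

```python
def bstar_insert(node, sibling, key, capacity):
    """Insert `key` into the full leaf `node`, whose right sibling is `sibling`.

    Returns the leaves that replace the pair: two leaves if the sibling had
    room, three if both were full. The first key of each returned leaf after
    the first is the separator the parent would need to store.
    """
    merged = sorted(node + [key] + sibling)
    if len(sibling) < capacity:
        # Sibling has room: shift the overflow into it, no new node is allocated.
        cut = (len(merged) + 1) // 2
        return [merged[:cut], merged[cut:]]
    # Both are full: turn two nodes into three, each roughly two-thirds full.
    n = len(merged)
    a, b = (n + 2) // 3, (2 * n + 2) // 3
    return [merged[:a], merged[a:b], merged[b:]]


print(bstar_insert([1, 2, 3, 4], [10, 11], 5, capacity=4))
print(bstar_insert([1, 2, 3, 4], [10, 11, 12, 13], 5, capacity=4))
```

In the first call no node is allocated; in the second, two full nodes of 4 keys become three nodes of 3 keys, so each is 75% full.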

Summary

Binary search tree: a binary tree in which each node stores one key; an equal key is a hit, a smaller key goes to the left child, and a larger key goes to the right child;

B-tree: a multi-way search tree; each node stores roughly M/2 to M keys, and non-leaf nodes store pointers to the child nodes that cover each key range; every key appears exactly once in the whole tree, and a search can hit at a non-leaf node;

B+ tree: adds linked-list pointers to the leaf nodes of the B-tree; all keys appear in the leaf nodes, the non-leaf nodes serve as an index over the leaves, and a search always hits at a leaf node;

B* tree: on top of the B+ tree, also adds linked-list pointers to the non-leaf nodes, and raises the minimum node utilization from 1/2 to 2/3;

Reposted from: http://blog.csdn.net/nashouat/article/details/8494946

