Binary Search Tree, B-Tree, B+ Tree, B* Tree


Binary Search Tree

A binary search tree has the following properties:

1. Every non-leaf node has at most two children (left and right);

2. Every node stores one key;

3. The left pointer of a non-leaf node points to the subtree whose keys are smaller than the node's key, and the right pointer points to the subtree whose keys are larger;


To search a binary search tree, start at the root node. If the query key equals the key of the current node, the search is a hit. Otherwise, if the query key is smaller than the node's key, go to the left child; if it is larger, go to the right child. If the required left or right child pointer is empty, report that the key cannot be found;
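The search procedure above can be sketched in a few lines of Python. This is a minimal illustration, not code from the original article; the names `Node` and `bst_search` and the sample tree are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bst_search(root: Optional[Node], key: int) -> Optional[Node]:
    """Return the node holding `key`, or None if the key is absent."""
    node = root
    while node is not None:
        if key == node.key:
            return node          # hit
        # Smaller keys live in the left subtree, larger keys in the right.
        node = node.left if key < node.key else node.right
    return None                  # ran into an empty child pointer


# A small hand-built tree (hypothetical data):
#         8
#        / \
#       3   10
#      / \
#     1   6
root = Node(8, Node(3, Node(1), Node(6)), Node(10))
found = bst_search(root, 6)
missing = bst_search(root, 7)
```

Here `found` is the node with key 6, and `missing` is `None` because the walk 8, 3, 6 ends at an empty right pointer.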

If the left and right subtrees of every non-leaf node in a binary search tree hold roughly the same number of nodes (the tree is balanced), its search performance approaches that of binary search. Compared with binary search over a contiguous block of memory, it has the advantage that changing the tree structure (inserting and deleting nodes) does not require moving large segments of data in memory and usually costs only constant overhead;


However, after many insertions and deletions, a binary search tree may end up with very different shapes. The same key set can produce a balanced tree, or a degenerate tree in which every node has only one child. The degenerate tree is still a binary search tree, but its search performance is linear. When using a binary search tree, one should therefore try to keep it close to the balanced shape and avoid the degenerate shape; this is the so-called "balance" problem;
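A short experiment makes the difference concrete. The sketch below uses plain, unbalanced insertion (the helper names `insert`, `height`, and `build` are assumptions for this example) and inserts the same seven keys in two different orders.

```python
from typing import Iterable, Optional


class Node:
    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Plain (unbalanced) BST insertion; duplicates are ignored."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                break
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                break
            node = node.right
        else:
            break
    return root


def height(node: Optional[Node]) -> int:
    """Number of levels in the tree (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build(keys: Iterable[int]) -> Optional[Node]:
    root = None
    for k in keys:
        root = insert(root, k)
    return root


balanced = build([4, 2, 6, 1, 3, 5, 7])     # middle keys first
degenerate = build([1, 2, 3, 4, 5, 6, 7])   # sorted order
print(height(balanced), height(degenerate))  # 3 7
```

Both trees hold the same keys, but the sorted insertion order yields a chain of right children, so a search may visit all seven nodes.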

Binary search trees used in practice add a balancing algorithm to the basic binary search tree; the result is a "balanced binary tree". How to keep the nodes of the tree evenly distributed is the key question for a balanced binary tree. The balancing algorithm is a strategy for inserting and deleting nodes in the tree;

B-Tree

A B-tree is a multiway search tree (not a binary one):

1. Every non-leaf node has at most M children, where M > 2;

2. The root node has between 2 and M children (unless it is itself a leaf);

3. Every non-leaf node other than the root has between ceil(M/2) and M children;

4. Every node stores at least ceil(M/2)-1 and at most M-1 keys (the root may store as few as 1);

5. In a non-leaf node, number of keys = number of child pointers - 1;

6. Keys of a non-leaf node: K[1], K[2], ..., K[M-1], with K[i] < K[i+1];

7. Pointers of a non-leaf node: P[1], P[2], ..., P[M], where P[1] points to the subtree whose keys are less than K[1], P[M] points to the subtree whose keys are greater than K[M-1], and every other P[i] points to the subtree whose keys lie in (K[i-1], K[i]);

8. All leaf nodes are on the same level;

(Figure: example B-tree with M = 3.)

To search a B-tree, start at the root node and run a binary search over the (ordered) key sequence inside the node. On a hit, the search ends; otherwise, move to the child node whose range contains the query key. Repeat until the corresponding child pointer is empty or the node is already a leaf;
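As a rough sketch of this procedure, the Python below stores each node's keys in a sorted list and uses the standard `bisect` module for the in-node binary search. The class `BTreeNode`, the function `btree_search`, and the sample tree are hypothetical names and data chosen for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BTreeNode:
    keys: List[int]                                              # sorted keys
    children: List["BTreeNode"] = field(default_factory=list)    # empty for a leaf


def btree_search(root: Optional[BTreeNode], key: int) -> Optional[Tuple[BTreeNode, int]]:
    """Return (node, index) with node.keys[index] == key, or None."""
    node = root
    while node is not None:
        i = bisect_left(node.keys, key)        # binary search inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return node, i                     # hit, possibly in a non-leaf node
        if not node.children:                  # leaf reached without a hit
            return None
        node = node.children[i]                # subtree whose range contains key
    return None


# Hand-built example with M = 3 (at most 2 keys and 3 children per node).
root = BTreeNode(
    keys=[10, 20],
    children=[BTreeNode([3, 7]), BTreeNode([15]), BTreeNode([25, 30])],
)
hit_in_root = btree_search(root, 20)
hit_in_leaf = btree_search(root, 15)
miss = btree_search(root, 8)
```

Searching for 20 ends at the root, searching for 15 descends one level, and searching for 8 ends at a leaf without a hit.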

B-Tree Features:

1. The keyword set is distributed throughout the tree;

2. Each key appears in exactly one node;

3. A search may end at a non-leaf node;

4. Search performance is equivalent to one binary search over the complete key set;

5. The number of levels is controlled automatically;

Because every non-leaf node other than the root must have at least M/2 children, each node is guaranteed a minimum utilization. This bounds the worst-case search cost, which depends on M, the maximum number of subtrees of a non-leaf node, and N, the total number of keys;

So the performance of a B-tree is always equivalent to binary search (independent of the value of M), and the balance problem of the binary search tree does not arise;
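As a hedged sketch rather than the original article's own formula, the standard worst-case bound can be derived from the constraints listed above. The symbols t = ceil(M/2) (minimum number of children of a non-root, non-leaf node) and h (number of levels) are introduced here for the derivation. The root contributes at least 1 key, level i >= 2 contains at least 2t^(i-2) nodes, and each of those holds at least t-1 keys:

```latex
N \;\ge\; 1 + (t-1)\sum_{i=2}^{h} 2\,t^{\,i-2} \;=\; 2\,t^{\,h-1} - 1
\quad\Longrightarrow\quad
h \;\le\; 1 + \log_{t}\frac{N+1}{2},
\qquad
\text{comparisons} \;=\; O\!\left(\log_2 M \cdot \log_{t} N\right) \;=\; O(\log_2 N).
```

The last step uses the fact that each node costs one binary search over at most M-1 keys, and that log2(M) / log2(t) is at most 2 for every M >= 3, so the total number of key comparisons stays within a constant factor of log2(N) whatever M is.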

Because of the M/2 limit, inserting into a node that is already full requires splitting it into two nodes that each hold about M/2 of the contents; when a deletion leaves two sibling nodes below M/2, they must be merged;
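The split step can be illustrated on the key list alone. The sketch below assumes the common median-promotion scheme, in which the middle key moves up to the parent; the source does not spell this detail out, and the names `split_keys` and `min_keys` are hypothetical.

```python
from typing import List, Tuple


def split_keys(keys: List[int]) -> Tuple[List[int], int, List[int]]:
    """Split a sorted, overfull key list into (left, median, right).

    `keys` holds M keys, one more than the M - 1 a node may store.
    The median moves up to the parent (an assumption of this sketch);
    the two halves become two separate nodes.
    """
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


def min_keys(m: int) -> int:
    """Minimum number of keys in a non-root node: ceil(m / 2) - 1."""
    return (m + 1) // 2 - 1


M = 5                                  # hypothetical order for the demo
overfull = [10, 20, 30, 40, 50]        # M keys after inserting into a full node
left, median, right = split_keys(overfull)
print(left, median, right)             # [10, 20] 30 [40, 50]
```

Both halves end up with at least `min_keys(M)` keys, so the two new nodes satisfy the minimum-occupancy rule. For a non-leaf node, the child pointers would be divided at the same position.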

B+ Tree

A B+ tree is a variant of the B-tree and is also a multiway search tree. Its definition is basically the same as that of the B-tree, except:

1. A non-leaf node has the same number of subtree pointers as keys;

2. The subtree pointer P[i] of a non-leaf node points to the subtree whose keys lie in [K[i], K[i+1]) (in the B-tree the interval is open);

3. A chain pointer is added to every leaf node;

4. All keys appear in the leaf nodes;

(Figure: example B+ tree with M = 3.)

Searching a B+ tree is basically the same as searching a B-tree. The difference is that a B+ tree can hit only at a leaf node (a B-tree can hit at a non-leaf node). Its performance is likewise equivalent to one binary search over the complete key set;
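A minimal sketch of this search, following the article's convention that a non-leaf node has as many pointers as keys and that child i covers [K[i], K[i+1]). The names `BPlusNode` and `bplus_search` and the sample data are assumptions for the example.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BPlusNode:
    keys: List[int]
    children: List["BPlusNode"] = field(default_factory=list)  # empty for a leaf
    next: Optional["BPlusNode"] = None                         # leaf chain pointer


def bplus_search(root: BPlusNode, key: int) -> bool:
    node = root
    while node.children:                       # non-leaf: index only, never a hit
        i = bisect_right(node.keys, key) - 1   # last i with keys[i] <= key
        if i < 0:
            return False                       # smaller than every key in the tree
        node = node.children[i]                # child i covers [keys[i], keys[i+1])
    i = bisect_left(node.keys, key)            # the hit is decided in the leaf
    return i < len(node.keys) and node.keys[i] == key


leaf1 = BPlusNode([5, 8])
leaf2 = BPlusNode([10, 15])
leaf3 = BPlusNode([20, 26])
leaf1.next, leaf2.next = leaf2, leaf3
root = BPlusNode(keys=[5, 10, 20], children=[leaf1, leaf2, leaf3])
```

Even though 10 appears in the root, `bplus_search(root, 10)` keeps descending and confirms the key only in `leaf2`.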

Features of the B+ tree:

1. All keys appear in the linked list of leaf nodes (a dense index), and the keys in the list are in sorted order;

2. A search can never hit at a non-leaf node;

3. The non-leaf nodes act as an index over the leaf nodes (a sparse index), and the leaf nodes act as the data layer that stores the keys;

4. Better suited to file index systems;
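One practical consequence of the ordered leaf list is that a range query only has to walk the chain. The sketch below models just the leaf layer; `Leaf` and `scan` are hypothetical names, and in a full tree the starting leaf would be located through the index nodes instead of being passed in.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Leaf:
    keys: List[int]
    next: Optional["Leaf"] = None   # chain pointer to the next leaf


def scan(first_leaf: Leaf, low: int, high: int) -> Iterator[int]:
    """Yield every key k with low <= k <= high by walking the leaf chain."""
    leaf: Optional[Leaf] = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return            # keys are ordered, nothing further can match
            if k >= low:
                yield k
        leaf = leaf.next


c = Leaf([20, 26])
b = Leaf([10, 15], next=c)
a = Leaf([5, 8], next=b)
print(list(scan(a, 8, 20)))   # [8, 10, 15, 20]
```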

B* Tree

A B* tree is a variant of the B+ tree that adds pointers to siblings in the non-root, non-leaf nodes of the B+ tree;

A B* tree requires each non-leaf node to hold at least (2/3)*M keys, so the minimum block utilization is 2/3 (instead of the 1/2 of the B+ tree);

B+ tree split: when a node is full, allocate a new node, copy half of the original node's data to the new node, and finally add a pointer to the new node in the parent. A B+ tree split affects only the original node and its parent, not the sibling nodes, so the split does not need a pointer to the sibling;
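The split can be sketched on a two-level tree represented as plain lists. This is an illustration under assumptions: `insert_with_split` is a hypothetical helper, `parent_keys[i]` is taken to be the smallest key of `leaves[i]` (matching the [K[i], K[i+1]) convention), and overflow of the parent itself is not handled.

```python
from bisect import insort
from typing import List


def insert_with_split(parent_keys: List[int], leaves: List[List[int]],
                      idx: int, key: int, max_keys: int) -> None:
    """Insert `key` into leaves[idx]; split the leaf if it overflows."""
    leaf = leaves[idx]
    insort(leaf, key)                       # keep the leaf sorted
    if len(leaf) > max_keys:
        mid = len(leaf) // 2
        new_leaf = leaf[mid:]               # upper half moves to the new node
        del leaf[mid:]
        leaves.insert(idx + 1, new_leaf)            # parent gains one pointer
        parent_keys.insert(idx + 1, new_leaf[0])    # and the matching key
    parent_keys[idx] = leaf[0]              # smallest key may have changed


parent_keys = [5, 20]
leaves = [[5, 8, 10], [20, 26]]
insert_with_split(parent_keys, leaves, 0, 9, max_keys=3)
print(parent_keys, leaves)   # [5, 9, 20] [[5, 8], [9, 10], [20, 26]]
```

Only the overflowing leaf and the parent change; the sibling `[20, 26]` is left untouched.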

B* tree split: when a node is full, if its next sibling is not full, move part of its data to the sibling, insert the key into the original node, and update the sibling's key in the parent (because the sibling's key range has changed). If the sibling is also full, add a new node between the original node and the sibling, copy 1/3 of the data from each of them to the new node, and finally add a pointer to the new node in the parent;

Therefore a B* tree allocates new nodes less often than a B+ tree and achieves higher space utilization.
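The two cases can be sketched on the leaf layer alone. This is a simplified illustration, not a full B* implementation: `bstar_insert` is a hypothetical helper, only one key is shifted to the sibling, the parent keys are omitted (they could be recomputed as the first key of each leaf), and a node without a next sibling falls back to an ordinary half split.

```python
from bisect import insort
from typing import List


def bstar_insert(leaves: List[List[int]], idx: int, key: int, max_keys: int) -> None:
    """Insert `key` into leaves[idx], B*-style, using the next sibling."""
    leaf = leaves[idx]
    insort(leaf, key)
    if len(leaf) <= max_keys:
        return
    has_sibling = idx + 1 < len(leaves)
    if has_sibling and len(leaves[idx + 1]) < max_keys:
        # Sibling has room: shift the largest key over instead of splitting.
        leaves[idx + 1].insert(0, leaf.pop())
        return
    if has_sibling:
        # Both full: pool the two nodes and redistribute into three.
        pooled = leaf + leaves[idx + 1]
        a = (len(pooled) + 2) // 3
        b = (len(pooled) - a + 1) // 2
        leaves[idx:idx + 2] = [pooled[:a], pooled[a:a + b], pooled[a + b:]]
        return
    # No next sibling: fall back to an ordinary half split.
    mid = len(leaf) // 2
    leaves[idx:idx + 1] = [leaf[:mid], leaf[mid:]]


# Case 1: the sibling has room, so no new node is allocated.
case1 = [[1, 2, 3, 4, 5, 6], [10, 11, 12]]
bstar_insert(case1, 0, 7, max_keys=6)
print(case1)   # [[1, 2, 3, 4, 5, 6], [7, 10, 11, 12]]

# Case 2: both nodes are full, so two nodes become three.
case2 = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
bstar_insert(case2, 0, 0, max_keys=6)
print(case2)   # [[0, 1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
```

In the second case each of the three resulting nodes holds at least 4 of 6 slots, which is the 2/3 minimum utilization described above.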

Summary

Binary search tree: a binary tree in which each node stores exactly one key; an equal key is a hit, a smaller key goes to the left child, and a larger key goes to the right child;

B-tree: a multiway search tree in which each non-root node stores ceil(M/2)-1 to M-1 keys, and non-leaf nodes store pointers to the child nodes that cover each key range; every key appears exactly once somewhere in the tree, and a search can hit at a non-leaf node;

B+ tree: adds a linked-list pointer to the leaf nodes of the B-tree; all keys appear in the leaf nodes, the non-leaf nodes serve as an index over the leaf nodes, and a search always hits at a leaf node;

B* tree: on the basis of the B+ tree, also adds linked-list pointers to the non-leaf nodes and raises the minimum node utilization from 1/2 to 2/3;

Original Address http://blog.csdn.net/manesking/archive/2007/02/09/1505979.aspx
