B Tree, B-Tree, B+ Tree, and B* Tree

Source: Internet
Author: User


Note: This article only introduces the definitions of these trees and compares them. It does not cover operations such as insertion, deletion, splitting, and merging in detail; these will be introduced in later articles.

B tree

Here "B tree" means the binary search tree:

1. every non-leaf node has at most two children (left and right);
2. every node stores one keyword;
3. the left pointer of a non-leaf node points to the subtree whose keywords are smaller than its own, and the right pointer points to the subtree whose keywords are larger.

For example:

A search in a binary search tree starts at the root node. If the query keyword equals the node's keyword, the search hits;
otherwise, if the query keyword is smaller than the node's keyword, the search moves to the left child, and if it is larger, to the right child.
If the corresponding left or right child pointer is null, the search reports that the keyword was not found.
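The search procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the `BSTNode` class, the `bst_search` function, and the example keys are assumptions made here and do not come from the original article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BSTNode:
    """One node of a binary search tree: a keyword plus two child pointers."""
    key: int
    left: Optional["BSTNode"] = None
    right: Optional["BSTNode"] = None


def bst_search(root: Optional[BSTNode], key: int) -> Optional[BSTNode]:
    """Return the node holding `key`, or None if the keyword is not in the tree."""
    node = root
    while node is not None:
        if key == node.key:
            return node  # hit
        # Smaller keywords live in the left subtree, larger ones in the right.
        node = node.left if key < node.key else node.right
    return None  # reached a null child pointer: not found


# Hypothetical example tree:
#         8
#        / \
#       3   10
#      / \
#     1   6
example_root = BSTNode(8, BSTNode(3, BSTNode(1), BSTNode(6)), BSTNode(10))
```

With this tree, `bst_search(example_root, 6)` returns the node whose key is 6, and `bst_search(example_root, 7)` returns `None` because the walk ends at a null right pointer below 6.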

If the left and right subtrees of every non-leaf node contain nearly the same number of nodes (that is, the tree is balanced), the search performance of a binary search tree approaches that of binary search.
It also has an advantage over binary search on a contiguous block of memory: changing the tree structure (inserting or deleting a node) does not require moving large segments of data in memory, and usually costs only constant overhead.

However, after many insertions and deletions, a binary search tree may end up with different structures:

The tree on the right is also a binary search tree, but its search performance is linear. The same set of keywords can produce different tree indexes, so when using a binary search tree you must also consider how to keep it in the shape of the left figure and avoid the shape of the right figure. This is the so-called "balance" problem.
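To make the point concrete, the following sketch inserts the same seven keywords in two different orders into a plain (unbalanced) binary search tree and compares the resulting heights. The dict-based node layout and the helper names `bst_insert`, `bst_height`, and `build` are assumptions chosen for brevity, and distinct keys are assumed.

```python
def bst_insert(root, key):
    """Insert `key` into a BST stored as nested dicts and return the root."""
    new_node = {"key": key, "left": None, "right": None}
    if root is None:
        return new_node
    node = root
    while True:
        side = "left" if key < node["key"] else "right"
        if node[side] is None:
            node[side] = new_node
            return root
        node = node[side]


def bst_height(root):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    if root is None:
        return 0
    return 1 + max(bst_height(root["left"]), bst_height(root["right"]))


def build(keys):
    """Build a BST by inserting the keys in the given order, with no rebalancing."""
    root = None
    for key in keys:
        root = bst_insert(root, key)
    return root


balanced = build([4, 2, 6, 1, 3, 5, 7])    # middle keys first
degenerate = build([1, 2, 3, 4, 5, 6, 7])  # already sorted input

print(bst_height(balanced), bst_height(degenerate))  # prints "3 7"
```

The sorted insertion order produces a chain of right children, so a search has to visit up to all seven nodes, while the other order needs at most three comparisons.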

The binary search trees used in practice add a balancing algorithm to the basic tree; the result is the "balanced binary tree". How to keep the nodes of the tree evenly distributed is the key to a balanced binary tree; the balancing algorithm is a strategy for inserting and deleting nodes in the tree.

B-tree

A B-tree is a multiway search tree (not binary):

1) each node x has the following attributes:
- node x holds n keywords;
- the keywords themselves are stored in non-decreasing order;
- a Boolean value indicates whether the node is a leaf node.
2) each internal node x also contains n + 1 pointers to its children.
3) all leaf nodes have the same depth.
4) the number of keywords a node may contain has a lower and an upper bound, expressed with the minimum degree t >= 2 of the B-tree:
- every node except the root must have at least t - 1 keywords, so every internal node other than the root has at least t children;
- every node contains at most 2t - 1 keywords, so an internal node has at most 2t children.

t = 2 is the simplest case: each internal node has 2, 3, or 4 children, which gives a 2-3-4 tree.
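The properties listed above can be turned into a small checker. The sketch below is a hedged illustration, not a full B-tree: the names `BTreeNode` and `check_btree` are hypothetical, the leaf flag is derived from an empty child list instead of being stored as a separate Boolean, and the checker only verifies the listed properties (it does not verify that each child's keywords fall between the parent's separating keywords).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    keys: List[int] = field(default_factory=list)
    children: List["BTreeNode"] = field(default_factory=list)

    @property
    def leaf(self) -> bool:
        # Assumption: a node without children is a leaf.
        return not self.children


def check_btree(root: BTreeNode, t: int) -> bool:
    """Check the listed properties for a tree with minimum degree t >= 2."""
    leaf_depths = set()

    def visit(node: BTreeNode, depth: int, is_root: bool) -> bool:
        n = len(node.keys)
        if n > 2 * t - 1:                 # upper bound on keywords
            return False
        if not is_root and n < t - 1:     # lower bound (root is exempt)
            return False
        if node.keys != sorted(node.keys):  # non-decreasing order
            return False
        if node.leaf:
            leaf_depths.add(depth)
            return True
        if len(node.children) != n + 1:   # n keywords -> n + 1 children
            return False
        return all(visit(child, depth + 1, False) for child in node.children)

    # Every leaf must sit at the same depth.
    return visit(root, 0, True) and len(leaf_depths) == 1


# Hypothetical 2-3-4 tree (t = 2): a root with two keywords and three leaves.
good = BTreeNode([10, 20], [BTreeNode([5]), BTreeNode([15]), BTreeNode([25, 30])])
# One leaf holds four keywords, which exceeds 2t - 1 = 3.
too_full = BTreeNode([10], [BTreeNode([5]), BTreeNode([15, 16, 17, 18])])
# Leaves at different depths.
uneven = BTreeNode([10], [BTreeNode([5]), BTreeNode([20], [BTreeNode([15]), BTreeNode([25])])])
```

Here `check_btree(good, 2)` is `True`, while `check_btree(too_full, 2)` and `check_btree(uneven, 2)` are both `False`.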

For example (M = 3):


A B-tree search starts at the root node and performs a binary search on the (ordered) keyword sequence in the node. If it hits, the search ends; otherwise it enters the child node whose range contains the query keyword. This repeats until the corresponding child pointer is null or the node is already a leaf.
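A minimal sketch of this search, assuming each node is a dict with a sorted `keys` list and a `children` list that is empty for leaves. The function name `btree_search` and the example keys are made up for illustration; the per-node binary search uses `bisect_left` from Python's standard library.

```python
from bisect import bisect_left


def btree_search(node, key):
    """Return (node, index) where `key` is stored, or None if it is absent."""
    while True:
        keys = node["keys"]
        i = bisect_left(keys, key)  # binary search inside the node
        if i < len(keys) and keys[i] == key:
            return node, i          # hit, possibly at a non-leaf node
        if not node["children"]:
            return None             # leaf reached without a hit
        # children[i] covers the keywords between keys[i - 1] and keys[i].
        node = node["children"][i]


# Hypothetical two-level tree.
leaf_a = {"keys": [8, 12], "children": []}
leaf_b = {"keys": [26, 30], "children": []}
leaf_c = {"keys": [65, 87], "children": []}
root = {"keys": [17, 35], "children": [leaf_a, leaf_b, leaf_c]}
```

Searching for 17 ends at the root without visiting a leaf, searching for 30 returns `(leaf_b, 1)`, and searching for 40 descends into `leaf_c` and returns `None`.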

Features of the B-tree:

1. the set of keywords is distributed across the entire tree;
2. any keyword appears in only one node;
3. a search may end at a non-leaf node;
4. its search performance is equivalent to a binary search over the full set of keywords;
5. the number of levels is controlled automatically.

B+ tree

The B+ tree is a variant of the B-tree that arose from the needs of file systems.

The similarities and differences between an order-m B+ tree and an order-m B-tree are:

A non-terminal node contains only the smallest keyword of each child's root node, together with the pointer to that child.

Why is the B+ tree better suited than the B-tree to operating-system file indexes and database indexes in practice?

1. The B+ tree has a lower disk read/write cost

The internal nodes of a B+ tree hold no pointers to the records associated with their keywords, so they are smaller than the internal nodes of a B-tree. If all the keywords of one internal node are stored in the same disk block, the block can hold more keywords, so more of the keywords to be searched are read into memory in a single read. The number of I/O reads and writes is correspondingly reduced.

For example, assume that a disk block holds 16 bytes, a keyword takes 2 bytes, and a pointer to a keyword's record takes 2 bytes. An internal node of an order-9 B-tree (a node holds at most 8 keywords) then needs two disk blocks, while an internal node of the B+ tree needs only one. When an internal node has to be read into memory, the B-tree costs one more disk-block seek than the B+ tree (on disk, this is the rotational latency of the platter).
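The arithmetic behind this example can be checked with a few lines of Python. The constant and function names are chosen here for illustration, and, like the example itself, the calculation ignores the space taken by child pointers.

```python
import math

BLOCK_BYTES = 16          # capacity of one disk block
KEY_BYTES = 2             # size of one keyword
RECORD_POINTER_BYTES = 2  # size of one pointer to a keyword's record
MAX_KEYS = 8              # an order-9 node holds at most 8 keywords


def blocks_needed(bytes_per_key: int) -> int:
    """Disk blocks needed to store one full internal node."""
    return math.ceil(MAX_KEYS * bytes_per_key / BLOCK_BYTES)


# B-tree internal nodes store a record pointer next to every keyword.
btree_blocks = blocks_needed(KEY_BYTES + RECORD_POINTER_BYTES)  # 32 bytes
# B+ tree internal nodes store keywords only.
bplus_blocks = blocks_needed(KEY_BYTES)                         # 16 bytes

print(btree_blocks, bplus_blocks)  # prints "2 1"
```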
2. The B+ tree has more stable query efficiency

Because non-terminal nodes do not ultimately point to file content and are only an index of the keywords in the leaf nodes, every keyword search must follow a path from the root node to a leaf node. All keyword queries have the same path length, so every piece of data is queried with the same efficiency.
3. The main reason databases use the B+ tree for indexing:

The B-tree improves disk I/O performance but does not solve the problem of inefficient element traversal. The B+ tree came into being to solve this problem: traversing its leaf nodes is enough to traverse the whole tree. In addition, range queries are very frequent in databases, and the B-tree does not support them (or supports them too inefficiently).
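A range query over the leaf level might look like the sketch below. It assumes the common B+ tree layout in which each leaf holds a pointer to the next leaf; the `Leaf` class, its `next` field, the `range_scan` function, and the sample rows are all hypothetical. For simplicity the scan starts at the leftmost leaf, whereas a real index would first descend from the root to the leaf containing the lower bound.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Leaf:
    keys: List[int]               # sorted keywords stored in this leaf
    values: List[str]             # one record (or record pointer) per keyword
    next: Optional["Leaf"] = None  # assumed link to the next leaf


def range_scan(first_leaf: Optional[Leaf], low: int, high: int) -> Iterator[Tuple[int, str]]:
    """Yield (key, value) pairs with low <= key <= high, in key order."""
    leaf = first_leaf
    while leaf is not None:
        for key, value in zip(leaf.keys, leaf.values):
            if key > high:
                return           # keys only grow from here on
            if key >= low:
                yield key, value
        leaf = leaf.next          # follow the leaf chain, no internal nodes


# Three hypothetical leaves chained left to right.
leaf3 = Leaf([12, 15], ["row12", "row15"])
leaf2 = Leaf([7, 9], ["row7", "row9"], leaf3)
leaf1 = Leaf([1, 4], ["row1", "row4"], leaf2)

print(list(range_scan(leaf1, 5, 12)))
# prints [(7, 'row7'), (9, 'row9'), (12, 'row12')]
```

The scan never revisits an internal node, which is what makes full traversals and range queries cheap once the starting leaf is known.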

B* tree

The B* tree is a variant of the B+ tree. On top of the B+ tree (in which the leaf nodes contain all the keywords and the pointers to the records containing them), the B* tree adds pointers to siblings in its non-root, non-leaf nodes. The B* tree also requires every non-leaf node to hold at least (2/3)*M keywords, so the minimum block utilization is 2/3 (instead of the B+ tree's 1/2). A simple example is shown in the figure:

B+ tree split: when a node is full, allocate a new node, copy 1/2 of the original node's data to the new node, and finally add a pointer to the new node in the parent node. A B+ tree split affects only the original node and the parent node, not the sibling nodes, so it needs no pointers to siblings.

B* tree split: when a node is full and its next sibling is not full, move part of the data to the sibling, insert the keyword into the original node, and finally update the sibling's keyword in the parent node (because the sibling's keyword range has changed). If the sibling is also full, add a new node between the original node and the sibling, copy 1/3 of the data of each to the new node, and finally add a pointer to the new node in the parent node.
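The two splitting strategies can be compared on plain sorted key lists. This is a simplified sketch under several assumptions: nodes are modelled as Python lists, the sibling is the next (right) sibling so all of its keys are larger, "part of the data" is interpreted as an even redistribution, and the parent-node updates are left out. The function names `bplus_split` and `bstar_make_room` are hypothetical.

```python
def bplus_split(keys):
    """B+ style: a full node is cut in half; the second half goes to a new node."""
    mid = len(keys) // 2
    return [keys[:mid], keys[mid:]]


def bstar_make_room(node, sibling, capacity):
    """B* style handling of a full `node` whose next sibling is `sibling`.

    Returns the resulting nodes in left-to-right order.
    """
    merged = node + sibling  # still sorted: sibling keys are all larger
    if len(sibling) < capacity:
        # Sibling has space: shift keys over, no new node is allocated.
        half = len(merged) // 2
        return [merged[:half], merged[half:]]
    # Both nodes are full: two nodes become three, each about 2/3 full.
    # The middle (new) node receives 1/3 of each original node.
    third = len(merged) // 3
    return [merged[:third], merged[third:2 * third], merged[2 * third:]]


CAPACITY = 6
full = [1, 2, 3, 4, 5, 6]

print(bplus_split(full))
# prints [[1, 2, 3], [4, 5, 6]]
print(bstar_make_room(full, [7, 8], CAPACITY))
# prints [[1, 2, 3, 4], [5, 6, 7, 8]]
print(bstar_make_room(full, [7, 8, 9, 10, 11, 12], CAPACITY))
# prints [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
```

After the B+ style split both nodes are half full, while the B* style either avoids allocating a node at all or leaves three nodes that each hold 4 of 6 possible keys.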

So the B* tree allocates new nodes with lower probability than the B+ tree, and its space utilization is higher.
