D & F Data Structure Series: B-tree and B+ tree

Source: Internet
Author: User

B-tree

Definition: a B-tree T is a rooted tree with the following properties:

1) Each node x has the following fields:

A) n[x], the number of keys currently stored in node x,

B) the n[x] keys themselves, stored in non-descending order, so that key_1[x] <= key_2[x] <= ... <= key_{n[x]}[x],

C) leaf[x], a Boolean value that is true if x is a leaf and false if x is an internal node.

2) Each internal node x also contains n[x] + 1 pointers c_1[x], c_2[x], ..., c_{n[x]+1}[x] to its children. Leaf nodes have no children, so their c_i fields are undefined.

3) The keys key_i[x] separate the ranges of keys stored in each subtree: if k_i is any key stored in the subtree rooted at c_i[x], then

k_1 <= key_1[x] <= k_2 <= key_2[x] <= ... <= key_{n[x]}[x] <= k_{n[x]+1}

4) Every leaf has the same depth, which is the height h of the tree.

5) There are lower and upper bounds on the number of keys a node can contain. These bounds are expressed in terms of a fixed integer t >= 2, called the minimum degree of the B-tree.

A) Every node other than the root must have at least t - 1 keys, so every internal node other than the root has at least t children. If the tree is not empty, the root must contain at least one key.

B) Every node can contain at most 2t - 1 keys, so an internal node can have at most 2t children. A node is said to be full if it contains exactly 2t - 1 keys.

The simplest B-tree occurs when t = 2. Every internal node then has two, three, or four children, which gives a 2-3-4 tree. In practice, however, a much larger value of t is usually used.
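The five properties above can be restated as executable checks. The following is a minimal Python sketch, not a reference implementation: the names `BTreeNode` and `check_btree`, the use of integer keys, and the in-memory child lists (rather than disk pointers) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BTreeNode:
    """One B-tree node (hypothetical in-memory layout)."""
    keys: List[int] = field(default_factory=list)               # key_1[x] .. key_{n[x]}[x]
    children: List["BTreeNode"] = field(default_factory=list)   # c_1[x] .. c_{n[x]+1}[x]; empty for a leaf

    @property
    def n(self) -> int:
        """n[x]: number of keys currently stored in the node."""
        return len(self.keys)

    @property
    def leaf(self) -> bool:
        """leaf[x]: true when the node has no children."""
        return not self.children


def check_btree(root: BTreeNode, t: int) -> int:
    """Return the tree height if properties 1-5 hold; raise AssertionError otherwise."""
    assert t >= 2, "minimum degree must be at least 2"
    if root.n == 0 and root.leaf:
        return 0  # empty tree: nothing to check

    def visit(x: BTreeNode, lo: Optional[int], hi: Optional[int], is_root: bool) -> int:
        # Property 5: key-count bounds (the root only needs one key).
        min_keys = 1 if is_root else t - 1
        assert min_keys <= x.n <= 2 * t - 1, "key count out of bounds"
        # Property 1B: keys in non-descending order.
        assert all(a <= b for a, b in zip(x.keys, x.keys[1:])), "keys not sorted"
        # Property 3: keys lie between the separators inherited from ancestors.
        assert all((lo is None or lo <= k) and (hi is None or k <= hi) for k in x.keys)
        if x.leaf:
            return 0
        # Property 2: an internal node has n[x] + 1 children.
        assert len(x.children) == x.n + 1, "wrong number of children"
        bounds = [lo] + x.keys + [hi]
        heights = {visit(c, bounds[i], bounds[i + 1], False)
                   for i, c in enumerate(x.children)}
        # Property 4: all leaves have the same depth.
        assert len(heights) == 1, "leaves at different depths"
        return heights.pop() + 1

    return visit(root, None, None, True)


# A small 2-3-4 tree (t = 2): one root with two keys and three leaves.
example = BTreeNode([10, 20], [BTreeNode([1, 5]), BTreeNode([12, 15, 18]), BTreeNode([25])])
print(check_btree(example, 2))  # prints 1, the height of this example tree
```

With t = 2 each leaf here holds between one and three keys, so the example satisfies the 2-3-4 tree bounds; adding a fourth key to any leaf would make the check fail.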

Purpose:

A B-tree is a balanced multiway search tree designed to work well on disks or other secondary storage devices.

Why should a data structure designed for disks differ from one designed for random-access main memory?

B-trees generalize binary search trees in a natural way. If an internal node x of a B-tree contains n[x] keys, then x has n[x] + 1 children. The keys in node x serve as dividing points that split the range of keys handled by x into n[x] + 1 subranges, each handled by one child of x. When searching for a key in a B-tree, we compare it with the n[x] keys stored in node x and make an (n[x] + 1)-way decision. The structure of leaf nodes differs from that of internal nodes.
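The (n[x] + 1)-way decision can be sketched as follows. This is an illustrative Python version under stated assumptions: the names `Node` and `btree_search` are hypothetical, keys are integers, children are held in memory, and the position within a node is found with a binary search (`bisect_left`) rather than a linear scan of the keys.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical in-memory B-tree node: sorted keys plus n + 1 children (none for a leaf)."""
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def btree_search(x: Node, k: int) -> Optional[Tuple[Node, int]]:
    """Return (node, index) such that node.keys[index] == k, or None if k is absent."""
    while True:
        # i is the first position with keys[i] >= k; it ranges over 0..n,
        # which is exactly the (n + 1)-way decision described above.
        i = bisect_left(x.keys, k)
        if i < len(x.keys) and x.keys[i] == k:
            return x, i
        if not x.children:        # reached a leaf without finding k
            return None
        x = x.children[i]         # in a disk-based tree, the child page would be read here


root = Node([10, 20], [Node([1, 5]), Node([12, 15, 18]), Node([25])])
print(btree_search(root, 15) is not None)  # prints True
print(btree_search(root, 13))              # prints None
```

Each loop iteration descends one level, so the number of nodes visited is at most the height of the tree plus one.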

In a typical B-tree application, the amount of data to be handled is so large that it cannot all be loaded into main memory at once. The B-tree algorithms copy selected pages from disk into main memory as needed and write the pages that have changed back to disk. (The notion of a "page" comes from computer organization and operating systems.) Because the B-tree algorithms need only a constant number of pages in main memory at any time, the size of main memory does not limit the size of the B-trees that can be handled. A B-tree node is usually as large as a whole disk page.
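To make the "one node per page" idea concrete, the sketch below estimates the largest minimum degree t for which a full node still fits in a page. The function name `max_min_degree`, the fixed-size keys and child pointers, and the optional header are assumptions for illustration; real storage engines have their own layouts and usually store extra per-key data.

```python
def max_min_degree(page_size: int, key_size: int, pointer_size: int, header_size: int = 0) -> int:
    """Largest t such that a full node (2t - 1 keys, 2t child pointers) fits in one page.

    Solves header + (2t - 1) * key + 2t * pointer <= page for t.
    All sizes are in bytes and are hypothetical inputs.
    """
    t = (page_size - header_size + key_size) // (2 * (key_size + pointer_size))
    if t < 2:
        raise ValueError("page too small for a B-tree node with t >= 2")
    return t


# Example with assumed sizes: 4096-byte page, 8-byte keys, 8-byte pointers, 16-byte header.
print(max_min_degree(4096, 8, 8, 16))  # prints 127
```

With these assumed sizes a node can have up to 2t = 254 children.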

 

For a large B-tree stored on a disk, the branching factor is usually between 50 and 2000, depending on the size of a key relative to the size of a page. A large branching factor greatly reduces both the height of the tree and the number of disk accesses required to find any key. Consider a B-tree with a branching factor of 1001 and a height of 2: it can store more than one billion keys. Because the root node can be kept permanently in main memory, finding a key in this tree requires at most two disk accesses!
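The "more than one billion keys" figure can be checked with a few lines of arithmetic. The sketch below assumes every node has exactly `branching_factor` children (for internal nodes) and `branching_factor - 1` keys; the function name `max_keys` is hypothetical.

```python
def max_keys(branching_factor: int, height: int) -> int:
    """Keys in a tree where every node holds branching_factor - 1 keys
    and every internal node has branching_factor children."""
    keys_per_node = branching_factor - 1
    # Depth d contains branching_factor ** d nodes, for d = 0 .. height.
    nodes = sum(branching_factor ** depth for depth in range(height + 1))
    return keys_per_node * nodes


# 1 root + 1001 nodes at depth 1 + 1,002,001 leaves at depth 2, each with 1000 keys.
print(max_keys(1001, 2))  # prints 1003003000
```

Only the two levels below the root need to be read from disk, which is where the bound of two disk accesses comes from.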
