Data structure-B tree

Source: Internet
Author: User

1. Preface:

The main dynamic search trees are: the binary search tree (Binary Search Tree), the balanced binary search tree (Balanced Binary Search Tree), the red-black tree (Red-black Tree), and the B-tree/B+-tree/B*-tree (B~tree). The first three are typical binary search tree structures, whose search time complexity O(log2N) depends on the depth of the tree, so reducing the depth of the tree naturally improves search efficiency.

However, we face a practical problem: when index queries are implemented over large-scale data storage, the number of elements a tree node can store is limited (if a node held a very large number of elements, searching within the node would degenerate into a linear search). With a binary search tree, the resulting depth of the tree causes too many disk I/O reads and writes, which makes queries inefficient (why this happens is explained in section 2, External Memory - disk). So how can the depth of the tree be reduced (without, of course, reducing the amount of data queried)? A basic idea is to use a multiway tree structure (because the number of elements in a node is limited, the number of subtrees of a node is naturally limited as well).

In this way, we arrive at a new search tree structure: the multiway search tree. Inspired by the balanced binary tree, it is natural to think of a balanced multiway search tree, which is the first topic of this article: the B~tree (the B-tree structure).

The B-tree (the name "B-tree" simply means the B tree) was first proposed by Rudolf Bayer and Edward M. McCreight (1970) in their paper "Organization and Maintenance of Large Ordered Indices". Wikipedia, http://en.wikipedia.org/wiki/B-tree, describes the origin of the B-tree and links to related open-source implementations.

Before introducing the B~tree, it helps to review some hardware background, which explains why an external memory data structure such as the B~tree is needed.

2. External Memory - disk

Computer storage devices are generally divided into two types: internal memory (main memory) and external memory. Main memory is fast to access, but it is small, expensive, and cannot hold data for long periods (the data disappears when the power is off).

The disk, as external memory, is a direct-access storage device (DASD). Its access time varies little from one location to another, any group of characters can be accessed directly, and it offers large capacity and higher speed than other external memory devices.

2.1 Structure of the disk

A disk is a flat circular platter (similar to a phonograph record). The platter carries many concentric circles called tracks, and the data is recorded on these tracks. A disk can consist of a single platter or of a disk pack made up of several platters, each with two surfaces. In the 6-platter disk pack shown in Figure 11.3, the outer surfaces of the top and bottom platters do not store data, so a total of 10 surfaces can hold information.

When the disk drive performs a read or write, the platters are mounted on a spindle and rotate around it at high speed. When a track passes under the read/write head (also simply called the head), the data can be read or written.

Disks are generally divided into fixed-head disks and movable-head disks. On a fixed-head disk, every track has its own head, which is fixed in place and is dedicated to reading and writing the data on that track.

On a movable-head disk (see the figure), the heads can move. Each surface has only one head (the heads are double-sided, so both the front and back surfaces of a platter can be read and written). A head can move from one track to another on its surface. All heads are mounted on the same arm, so the heads on all surfaces move together. The tracks with the same radius on all surfaces form a cylindrical face, which we call a cylinder. Therefore, the number of cylinders equals the number of tracks on one surface.

2.2 Disk read/write principle and efficiency

Data on the disk must be uniquely identified by a three-dimensional address: the cylinder number, the surface number, and the block number (the disk block on the track).

The following 3 steps are required to read or write a specified piece of data on a disk:

(1) First, the arm moves the heads to the required cylinder according to the cylinder number. This step is called seeking or positioning.

(2) In the 6-platter disk pack shown in Figure 11.3, all the heads are now positioned on the 10 tracks of the 10 surfaces (the heads are double-sided). The track on the required surface is then selected according to the surface number.

(3) Once the surface is determined, the platter rotates until the track segment with the specified block number passes under the head.

After these three steps, the storage location of the specified data has been found, and the read or write operation can begin.

Accessing a specific piece of information takes time that consists of 3 parts:

Seek time Ts: the time required to complete step (1) above. This part is the most expensive and can reach about 0.1 s at most.

Rotational latency Tl: the time required to complete step (3) above. The platters rotate around the spindle very fast, generally at 7200 rpm (a hard disk performance indicator; ordinary household hard disks generally run at 5400 rpm (notebooks) or 7200 rpm). One revolution at 7200 rpm therefore takes about 0.0083 s.

Transfer time Tt: the time for the data to be transmitted over the system bus to memory. Transferring one byte typically takes about 0.02 us = 2*10^(-8) s.

The disk reads data with the disk block as the basic unit. All data in the same block can be read in one operation. The cost of disk I/O is dominated by the seek time Ts. Therefore, related information should be kept in the same disk block and on the same track where possible, or at least on the same cylinder or on adjacent cylinders, so that reading and writing it requires as little back-and-forth head movement as possible and avoids spending too much seek time Ts.
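The three time components can be combined into a rough per-access estimate. The following is a minimal sketch in C, not a model of any real drive: the function names are hypothetical, and, like the text above, it charges one full revolution as the rotational wait.

```c
#include <assert.h>

/* Time for one full revolution, in seconds, at the given spindle speed. */
double rotation_time_s(double rpm)
{
    assert(rpm > 0.0);
    return 60.0 / rpm;
}

/* Rough access time in seconds: seek + rotational wait + transfer.
   ts_s is the seek time Ts, per_byte_s the transfer time per byte,
   nbytes the amount of data read. All names are illustrative. */
double access_time_s(double ts_s, double rpm, double per_byte_s, long nbytes)
{
    assert(nbytes >= 0);
    return ts_s + rotation_time_s(rpm) + per_byte_s * (double)nbytes;
}
```

With the figures quoted above (Ts = 0.1 s in the worst case, 7200 rpm, 2*10^(-8) s per byte) and an assumed 4096-byte block, the estimate is roughly 0.1 + 0.0083 + 0.00008 s, so the seek time dominates and the transfer time is almost negligible.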

Therefore, in large-scale data storage, a large amount of data is stored on external disks, and reading or writing a block of that data first requires locating the block on the disk. Finding data on disk efficiently requires a reasonable and efficient external memory data structure. That structure is the B-tree, the focus of the following section, together with its variants, the B+-tree and the B*-tree.

3. B-tree

Before the detailed explanation, one point bears repeating: "B-tree" simply means the B tree. The original English name is "B-tree", and many people in China translate it literally, reading the hyphen as a minus sign ("B-minus tree"). This is a poor literal translation that easily misleads: one might think that the "B-minus tree" is one kind of tree and the "B tree" is another. In fact, "B-tree" refers to the B tree and nothing else.

The B-tree is a balanced multiway search tree designed for disks and other storage devices (as shown below, a B-tree node has multiple branches, in contrast to a binary tree). It is similar to the red-black tree described earlier in this blog, but it does a better job of reducing disk I/O operations. Many database systems use the B-tree or one of its variants, such as the B*-tree introduced later, to store information.

The biggest difference between a B-tree and a red-black tree is that a B-tree node can have many children, from a few to several thousand. Why, then, say that B-trees are similar to red-black trees? Because, like a red-black tree, a B-tree with n nodes has height O(lg n), although its height may be much smaller than that of a red-black tree because its branching factor is larger. Therefore, the B-tree can implement dynamic set operations such as insert and delete in O(log n) time.

The figure shows a B-tree whose keys are the English consonants. Suppose we search this tree for the letter R. An internal node x that contains n[x] keys has n[x]+1 children. All leaf nodes are at the same depth, and the shaded nodes are the ones examined while searching for the letter R:

As the figure makes clear, an internal node x that contains n[x] keys has n[x]+1 children. For example, the internal node with the 2 keys D H has 3 children, and the internal node with the 3 keys Q T X has 4 children.

A B-tree is also called a balanced multiway search tree. A B-tree of order m (an m-way tree) has the following properties:

    1. Each node in the tree has at most m children (m>=2);
    2. Apart from the root node and the leaf nodes, every node has at least ceil(m/2) children (where ceil(x) is the ceiling function);
    3. If the root node is not a leaf node, it has at least 2 children (special case: a root with no children, i.e. the root is a leaf and the whole tree consists of the root alone);
    4. All leaf nodes appear on the same level, and the leaf nodes contain no key information (they can be regarded as external nodes, or as the nodes where a search fails; in fact these nodes do not exist, and the pointers to them are null);
    5. Each non-terminal node contains n keys in the form (n, P0, K1, P1, K2, P2, ......, Kn, Pn), where:
      a) Ki (i=1...n) are the keys, sorted in ascending order so that K(i-1) < Ki.
      b) Pi is a pointer to the root of a subtree, and every key in the subtree pointed to by P(i-1) is less than Ki but greater than K(i-1).
      c) The number of keys n must satisfy: ceil(m/2)-1 <= n <= m-1.
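The bounds in property 5 can be checked mechanically. The following C sketch uses hypothetical helper names and integer keys for brevity; it only illustrates conditions 5a and 5c for a non-root node and is not taken from any particular B-tree implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* ceil(m/2) for integers m >= 2, without floating point. */
int ceil_half(int m)
{
    assert(m >= 2);
    return (m + 1) / 2;
}

/* Property 5c: a non-root node of an order-m B-tree may hold n keys
   only if ceil(m/2) - 1 <= n <= m - 1. */
bool key_count_ok(int m, int n)
{
    return n >= ceil_half(m) - 1 && n <= m - 1;
}

/* Property 5a: keys[0..n-1] must be in strictly ascending order. */
bool keys_sorted(const int *keys, int n)
{
    for (int i = 1; i < n; ++i)
        if (keys[i - 1] >= keys[i])
            return false;
    return true;
}
```

For an order-3 tree, for instance, `key_count_ok(3, 1)` and `key_count_ok(3, 2)` hold while `key_count_ok(3, 3)` does not, so each non-root node carries 1 or 2 keys.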

Property 5 deserves a further remark: the number of keys a B-tree node can contain (such as D H and Q T X above) has an upper bound and a lower bound. Both bounds can be expressed with a fixed integer t (t>=2), called the minimum degree of the B-tree (the Chinese edition of Introduction to Algorithms translates this simply as "degree").

    • Each non-root node must contain at least t-1 keys. Each non-root internal node has at least t children. If the tree is non-empty, the root node contains at least one key;
    • Each node can contain at most 2t-1 keys. Therefore, an internal node can have at most 2t children. If a node has exactly 2t-1 keys, we say that the node is full (the B*-tree, a common variant of the B-tree described later, requires each internal node to be at least 2/3 full rather than at least half full as the B-tree requires here);
    • When t=2 (the smallest value t can take), the B-tree is at its simplest. (Many people mistakenly think that a B-tree is a binary search tree, but a binary search tree is a binary search tree and a B-tree is a B-tree. The most accurate definition of a B-tree is: a balanced multiway search tree with minimum degree t (t>=2).) In this case each internal node has 2, 3, or 4 children, i.e. the tree is a 2-3-4 tree. In practice, however, a much larger value of t is usually used.
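To see why a larger t matters, the bounds above can be turned into numbers. The C sketch below uses hypothetical function names. The height bound relies on a standard result that this article does not state: a B-tree of height h (root at height 0) with minimum degree t holds at least 2*t^h - 1 keys.

```c
#include <assert.h>

/* Fewest keys in a non-root node with minimum degree t. */
int min_keys(int t) { assert(t >= 2); return t - 1; }

/* Most keys in any node with minimum degree t. */
int max_keys(int t) { assert(t >= 2); return 2 * t - 1; }

/* Most children of an internal node: one more than its keys. */
int max_children(int t) { return max_keys(t) + 1; }

/* Largest height a B-tree with n >= 1 keys can have, i.e. the
   largest h such that 2*t^h - 1 <= n. */
int max_height(long long n, int t)
{
    assert(n >= 1 && t >= 2);
    int h = 0;
    long long power = t;            /* holds t^(h+1) */
    while (2 * power - 1 <= n) {    /* is height h+1 still possible? */
        ++h;
        power *= t;
    }
    return h;
}
```

With t=2 the helpers give 1 to 3 keys and up to 4 children per node, which is the 2-3-4 tree. For one million keys the worst-case height is 18 when t=2, but only 1 when t=1001, which is the reason large values of t are preferred on disk.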

Each node in the B-tree can hold a large number of keys and branches, depending on the actual situation (a node cannot, of course, exceed the size of a disk block; depending on the disk drive, the block size is generally around 1k~4k). The depth of the tree is therefore reduced, which means that finding an element requires reading only a few nodes from the external disk into memory, so the data is located quickly.
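How many keys fit in one block depends on the node layout. As a rough sketch in C, assuming a node stores nothing but n fixed-size keys and n+1 fixed-size child pointers (no header and no per-key data address, both of which a real layout would add), the capacity can be estimated as follows. The function name and the sizes used are illustrative assumptions.

```c
#include <assert.h>

/* Largest n such that n*key_bytes + (n+1)*ptr_bytes <= block_bytes.
   Assumes the node holds only n keys and n+1 child pointers. */
int keys_per_block(int block_bytes, int key_bytes, int ptr_bytes)
{
    assert(key_bytes > 0 && ptr_bytes > 0 && block_bytes >= ptr_bytes);
    return (block_bytes - ptr_bytes) / (key_bytes + ptr_bytes);
}
```

Under these assumptions, with 8-byte keys and 8-byte pointers, a 4096-byte block holds 255 keys and a 1024-byte block holds 63 keys.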

For simplicity, a small amount of data is used here to build a 3-way tree; in real applications a B-tree node holds many keys. In the figure above, take the root node as an example: 17 represents the name of a file on the disk, the red square indicates where the contents of file 17 are stored on the hard disk, and P1 is a pointer to the left subtree of 17.

Its structure can be simply defined as:

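#define max_file_num 2            /* assumed: keys per node, as in the example */
typedef long File_hard_addr;      /* assumed: type of an on-disk address */

typedef struct btnode {
    /* Number of files (keys) stored in this node */
    int file_num;
    /* File names (keys) */
    char *file_name[max_file_num];
    /* Pointers to child nodes */
    struct btnode *Btptr[max_file_num + 1];
    /* Locations where the files are stored on the hard disk */
    File_hard_addr Offset[max_file_num];
} btnode;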

Suppose each disk block holds exactly one node of the B-tree (containing exactly 2 file names). Then one btnode represents one disk block, and each subtree pointer is the address of another disk block.

Now let's simulate the process of finding file 29:

(1) Following the root node pointer, locate disk block 1, which holds the root of the file directory, and load its contents into memory. [Disk I/O operation 1]

(2) Memory now holds two file names, 17 and 35, and three pointers that store the addresses of other disk pages. Comparing, we find 17<29<35, so we take the pointer P2.

(3) Following the pointer P2, locate disk block 3 and load its contents into memory. [Disk I/O operation 2]

(4) Memory now holds two file names, 26 and 30, and three pointers that store the addresses of other disk pages. Comparing, we find 26<29<30, so we take the pointer P2.

(5) Following the pointer P2, locate disk block 8 and load its contents into memory. [Disk I/O operation 3]

(6) Memory now holds two file names, 28 and 29. We find file 29 and obtain the disk address where the file is stored.

Analyzing this procedure, we see that it takes 3 disk I/O operations and 3 in-memory lookups. The file names within a node form an ordered table, so the in-memory lookup can use binary search to improve efficiency. The 3 disk I/O operations are the decisive factor in the overall efficiency of a B-tree search.
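The walkthrough above can be reproduced with a small simulation. The following is a minimal sketch in C under stated assumptions: keys are plain integers rather than file names, every node visit counts as one disk read, and only the three blocks on the path to 29 are modeled (the other subtrees from the figure are left as NULL). The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KEYS 2  /* assumed: two keys per node, as in the example */

typedef struct Node {
    int block;                         /* disk block number of this node */
    int nkeys;                         /* number of keys in use */
    int keys[MAX_KEYS];                /* keys in ascending order */
    struct Node *child[MAX_KEYS + 1];  /* NULL where not modeled */
} Node;

/* Only the path used in the walkthrough is built. */
Node block8 = { 8, 2, {28, 29}, {NULL, NULL, NULL} };
Node block3 = { 3, 2, {26, 30}, {NULL, &block8, NULL} };
Node block1 = { 1, 2, {17, 35}, {NULL, &block3, NULL} };

/* Returns the block number holding key, or -1 if it is not found.
   *disk_reads is incremented once per node loaded. */
int btree_search(const Node *node, int key, int *disk_reads)
{
    assert(disk_reads != NULL);
    while (node != NULL) {
        ++*disk_reads;                 /* simulate loading the block */
        int lo = 0, hi = node->nkeys;  /* binary search inside the node */
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (node->keys[mid] < key)
                lo = mid + 1;
            else
                hi = mid;
        }
        if (lo < node->nkeys && node->keys[lo] == key)
            return node->block;
        node = node->child[lo];        /* descend into subtree lo */
    }
    return -1;
}
```

Calling `btree_search(&block1, 29, &reads)` with `reads` initialized to 0 returns 8 and leaves `reads` at 3, matching the three disk I/O operations counted in the walkthrough.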

By comparison, if a balanced binary tree were used as the on-disk storage structure for this search, it would take at least 4 and up to 5 disk I/O operations. Moreover, the more files there are, the fewer disk I/O operations the B-tree needs relative to the balanced binary tree, and the greater its efficiency advantage.
