**B~Tree**

**1. Preface:**

**The main dynamic search trees are the binary search tree, the balanced binary search tree, the red-black tree, and the B-tree/B+-tree/B*-tree family (B~tree). The first three are typical binary search tree structures whose search time complexity** *O*(log2*N*) **depends on the depth of the tree, so reducing the depth naturally improves search efficiency. This raises a practical problem. In large-scale data storage, where index queries must be served, the number of elements a tree node can hold is limited (if a node held a very large number of elements, the lookup inside the node would degrade into a linear search). A binary search tree therefore becomes too deep, disk I/O happens too often, and queries become inefficient (the reason is explained in the section on external memory and disks). How, then, can the depth of the tree be reduced without reducing the amount of data queried? A basic idea is to adopt a multi-way tree structure (because the number of elements in a node is limited, the number of subtrees of a node is limited as well).**

**This leads to a new search tree structure: the multi-way search tree. Inspired by the balanced binary tree, it is natural to think of a balanced multi-way search tree, which is the subject of this article: the B~tree (B-tree structure). The B-tree was first proposed by** Rudolf Bayer and Edward M. McCreight (1970) in the paper "Organization and Maintenance of Large Ordered Indices". See Wikipedia: Http://en.wikipedia.org/wiki/B-tree, which also describes the origin of the B-tree name and links to related open source implementations.

Before introducing the B~tree, it helps to cover some hardware background, which explains why an external memory data structure like the B~tree is needed.

**2. External memory-disk**

Computer storage devices are generally divided into two types: internal memory (main memory) and external memory. Internal memory is fast to access, but it is small, expensive, and cannot hold data long-term (the data disappears without power).

External memory such as the disk is a direct-access storage device (DASD). Its access time varies little from one access to another, any group of characters can be accessed directly, and it offers large capacity and higher speed than other external memory devices.

**2.1 Structure of the disk**

The disk is a flat disc (similar to a phonograph record). Its surface carries many concentric circles called tracks, and the data is recorded on these tracks. A disk can be a single platter or a pack of several platters, each with two surfaces. In a 6-platter pack, for example, the top and bottom outer surfaces hold no data, so a total of 10 surfaces can store information.

When the disk drive performs reads and writes, the platters are mounted on a spindle and rotate around it at high speed. Data can be read or written when the track passes under the read/write head (also called the head).

Disks are generally divided into fixed-head disks and movable-head disks. On a fixed-head disk, each track has its own head, which is fixed in place and dedicated to reading and writing the data on that track.

The heads of a movable-head disk can move. Each platter has a single head assembly (it is double-sided, so both surfaces of the platter can be read and written), which can move from one track to another on its surface. All heads are mounted on the same arm, so the heads on all platters move together. The tracks with the same radius on every surface together form a cylindrical surface, which we call a cylinder. The number of cylinders therefore equals the number of tracks on one surface.

**2.2 Disk read/write principle and efficiency**

Data on the disk is uniquely identified by a three-dimensional address: the cylinder number, the surface number, and the block number (the disk block on the track).

Reading or writing a specified piece of data on the disk takes the following 3 steps:

(1) First, the arm moves the heads to the required cylinder according to the cylinder number. This is called positioning or seeking.

(2) In the 6-platter pack, all heads are now positioned over the 10 tracks of the 10 surfaces (the heads are double-sided). The track on the specified surface is selected according to the surface number.

(3) Once the surface is selected, the platter rotates until the track segment with the specified block number comes under the head.

After these three steps, the storage location of the specified data has been found, and the read/write operation can begin.

Accessing a specific piece of information takes time made up of 3 parts:

Seek time Ts: the time required to complete step (1) above. This is the most expensive part and can reach about 0.1s at most.

Wait time (latency time) Tl: the time required to complete step (3) above. The platters rotate around the spindle very fast, typically at 7200 rpm (rotation speed is a hard disk performance indicator; common speeds are 5400 rpm for notebooks and 7200 rpm otherwise), so one full rotation takes about 0.0083s.

Transmission time Tt: the time for the data to be transferred over the system bus to memory. Transferring one byte typically takes about 0.02us = 2*10^(-8) s.

Disk reads use the disk block as the basic unit. All the data in one block can be read at once. The cost of disk I/O is dominated by the seek time Ts. Related information should therefore be kept in the same disk block and on the same track, or at least on the same or adjacent cylinders, so that reads and writes need as little back-and-forth head movement as possible and avoid excessive seek time Ts.
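As a rough worked example of the three components above, the following C sketch simply adds them up. The function names are illustrative assumptions, and so is the 4096-byte block size used in the note below; the 0.1 s seek time, the 7200 rpm speed, and the 0.02 us per byte are the figures quoted in the text.

```c
/* Time for one full platter rotation, in seconds, at the given speed. */
double rotation_time_s(double rpm)
{
    return 60.0 / rpm;
}

/* Total access time Ts + Tl + Tt for one block (all times in seconds).
 * per_byte_s is the transfer time of a single byte. */
double access_time_s(double seek_s, double latency_s,
                     double block_bytes, double per_byte_s)
{
    return seek_s + latency_s + block_bytes * per_byte_s;
}
```

With a worst-case seek of 0.1 s, one full rotation at 7200 rpm, and an assumed 4096-byte block, the transfer part is only about 0.00008 s, which is why the seek time dominates the total.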

In large-scale data storage, therefore, most of the data lives on external disks, and reading or writing a block of that data first requires locating the block on the disk. Finding data on disk efficiently calls for a well-designed external memory data structure. That structure is the B-tree highlighted below, together with its variants: **the B+-tree structure and the B*-tree structure.**

**3. B-tree**

A B-tree is also called a balanced multi-way search tree. The properties of a B-tree of order m (an m-way tree) are as follows:

(where ceil(x) is the function that rounds x up to the nearest integer)

1) Each node in the tree has at most m children;

2) Except for the root node and the leaf nodes, every node has at least ceil(m/2) children;

3) If the root node is not a leaf node, it has at least 2 children (special case: a root with no children, that is, the root is itself a leaf and the whole tree consists of the root alone);

4) All leaf nodes appear on the same level, and the leaf nodes contain no keyword information (they can be regarded as external nodes or as nodes where a search fails; in fact these nodes do not exist, and the pointers to them are null);

5) Each non-terminal node contains n keywords and is laid out as (n, P0, K1, P1, K2, P2, ..., Kn, Pn), where:

a) Ki (i = 1...n) are the keywords, sorted in ascending order: K(i-1) < Ki.

b) Pi is a pointer to the root of a subtree; every keyword in the subtree that P(i-1) points to is less than Ki and greater than K(i-1).

c) The number of keywords n must satisfy: ceil(m/2)-1 <= n <= m-1.

Each node in a B-tree can hold a large number of keywords and branches, as the situation requires (though never more than fits in one disk block, whose size depends on the disk drive and is generally around 1k~4k), so the depth of the tree is reduced. Finding an element then takes only a few node reads from the external disk into memory, and the data is reached quickly.
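To make the effect of a larger fan-out concrete, here is a small C sketch that computes the fewest levels a completely full tree needs in order to hold a given number of keys. The function name and the fan-out of 256 used in the note below are assumptions chosen for illustration, not values from the text.

```c
/* Fewest levels needed by a completely full tree in which every node has
 * `fanout` children and `fanout - 1` keys, so that it holds n_keys keys.
 * fanout = 2 corresponds to a perfectly balanced binary search tree. */
int min_levels(unsigned long long n_keys, unsigned fanout)
{
    int levels = 0;
    unsigned long long capacity = 0;        /* keys held by the levels so far */
    unsigned long long nodes_on_level = 1;  /* node count on the next level */

    while (capacity < n_keys) {
        capacity += nodes_on_level * (fanout - 1);
        nodes_on_level *= fanout;
        levels++;
    }
    return levels;
}
```

For one million keys, a binary tree (fan-out 2) needs at least 20 levels, while a tree with an assumed fan-out of 256 needs only 3. If each level costs one disk read, that is the difference between roughly 20 reads and 3.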

For simplicity, a small amount of data is used here to build a B-tree of order 3 (the original figure is not reproduced). Take the root node as an example: 17 represents a file name on the disk, the red square in the figure marks where the contents of file 17 are stored on the hard disk, and P1 is the pointer to the left subtree of 17.

Its structure can be simply defined as:

typedef struct Btnode {

/* Number of files (keys) stored in this node */

int file_num;

/* File names (keys) */

char * file_name[max_file_num];

/* Pointers to child nodes */

struct Btnode * Btptr[max_file_num+1];

/* Locations where the files are stored on the hard disk */

File_hard_addr Offset[max_file_num];

} Btnode;

Suppose each disk block stores exactly one B-tree node (exactly 2 file names). Then a Btnode represents one disk block, and each subtree pointer holds the address of another disk block.

The following simulates the process of finding file 29:

(1) Using the root node pointer, locate disk block 1, the root of the file directory, and load its contents into memory. [1st disk I/O operation]

(2) Memory now holds two file names, 17 and 35, and three values storing the addresses of other disk pages. Since 17 < 29 < 35, we follow pointer P2.

(3) Following P2, we locate disk block 3 and load its contents into memory. [2nd disk I/O operation]

(4) Memory now holds two file names, 26 and 30, and three values storing the addresses of other disk pages. Since 26 < 29 < 30, we follow pointer P2.

(5) Following P2, we locate disk block 8 and load its contents into memory. [3rd disk I/O operation]

(6) Memory now holds two file names, 28 and 29. We find file 29 and obtain the disk address where the file is stored.

Analyzing this procedure shows that 3 disk I/O operations and 3 in-memory lookups are required. For the file name lookup in memory, the names form an ordered table, so binary search can be used to improve efficiency. The 3 disk I/O operations are the determining factor in the overall lookup efficiency of the B-tree.
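The lookup walked through above can be sketched in C as follows. This is a minimal in-memory illustration, not the article's implementation: the `Node` type, the function name, and the use of integer keys in place of file names are all assumptions, and each visited node stands in for one disk block read.

```c
#include <stddef.h>

#define ORDER3_MAX_KEYS 2  /* order-3 tree: at most 2 keys per node */

typedef struct Node {
    int n;                                    /* number of keys in use */
    int keys[ORDER3_MAX_KEYS];                /* keys in ascending order */
    struct Node *child[ORDER3_MAX_KEYS + 1];  /* child[i]: keys below keys[i] */
} Node;

/* Returns the node that contains key, or NULL if the key is absent.
 * *reads is set to the number of nodes visited, i.e. the number of
 * simulated disk block reads. */
const Node *btree_search(const Node *root, int key, int *reads)
{
    const Node *cur = root;

    *reads = 0;
    while (cur != NULL) {
        int i = 0;

        ++*reads;                             /* one "disk I/O" per node */
        while (i < cur->n && key > cur->keys[i])
            i++;                              /* linear scan inside the node */
        if (i < cur->n && key == cur->keys[i])
            return cur;
        cur = cur->child[i];                  /* descend into the subtree */
    }
    return NULL;
}
```

Building only the three nodes named in the walkthrough (block 1 with 17 and 35, block 3 with 26 and 30, block 8 with 28 and 29) and searching for 29 visits exactly three nodes, matching the three disk I/O operations counted above. The inner scan is linear for brevity; with many keys per node it would be replaced by binary search, as the text suggests.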

Of course, if a balanced binary tree were used as the disk storage structure, the same lookup would take at least 4 and at most 5 disk I/O operations. The more files there are, the fewer disk I/O operations the B-tree needs compared with the balanced binary tree, and the greater its efficiency advantage.

The above only introduces the B-tree structure itself. The insertion and deletion of tree nodes, along with the implementation of the related algorithms and code, will be illustrated with examples in further study.

The above briefly showed how the B-tree structure is used to access data on an external disk. The following example describes the basic B-tree operations, insert and delete, in detail.

The following is an example of a B-tree of order 5:

It satisfies the conditions above: apart from the root node and the leaf nodes, every node has at least ceil(5/2) = 3 children (at least 2 keywords) and at most 5 children (at most 4 keywords). The keywords are uppercase letters, in ascending alphabetical order.

The nodes are defined as follows:

typedef struct {

int Count; /* Number of key elements in the current node */

ItemType Key[4]; /* Array storing the key elements */

long branch[5]; /* Pseudo-pointer array (record numbers), which makes merge and split cases easy to determine */

} NodeType;

**Insert operation**: To insert an element, first search for it in the B-tree. If it does not exist, the search ends at a leaf node, and the new element is inserted into that leaf. If the leaf has enough space, the elements larger than the new keyword are shifted right to make room. If the leaf is full, the node is split: half of its key elements go to a new adjacent right node, and the middle key element moves up into the parent node (if the parent is full too, it must be split in turn). Whenever key elements in a node shift right, the associated pointers shift right as well. If the insertion reaches a full root node, the root is split and its middle key element moves up into a new root, so the height of the tree increases by one level.

Let's step through an example. Insert the following letters into an empty B-tree of order 5: C N G A H E K Q M F W L T Z D P R X Y S. Order 5 means that a node has at most 5 children and 4 keywords, and that every node other than the root has at least 2 keywords. At first the node has enough space, so the first 4 letters go into the same node.

When we try to insert H, the node has no space left, so it is split into 2 nodes and the middle element G moves up into a new root node. In the implementation, A and C stay in the current node, and H and N are placed in the new right neighbor node.
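The split performed in this step can be sketched in C as below. It is a simplified illustration for a full leaf of an order-5 tree only: the function name and signature are assumptions, keys are single characters as in the example, and the child pointers that a real split of an internal node must also move are left out.

```c
#define ORDER5_MAX_KEYS 4  /* order-5 tree: at most 4 keys per node */

/* Inserts key into a full, sorted leaf and splits the resulting 5 keys:
 * the two smallest stay in `left`, the middle one is returned through
 * `median` (it moves up to the parent), the two largest go to `right`. */
void split_full_leaf(const char full[ORDER5_MAX_KEYS], char key,
                     char left[2], char *median, char right[2])
{
    char tmp[ORDER5_MAX_KEYS + 1];
    int i = 0, j;

    /* Copy the keys that are smaller than the new key. */
    while (i < ORDER5_MAX_KEYS && full[i] < key) {
        tmp[i] = full[i];
        i++;
    }
    tmp[i] = key;                        /* place the new key */
    for (j = i; j < ORDER5_MAX_KEYS; j++)
        tmp[j + 1] = full[j];            /* larger keys shift right */

    left[0] = tmp[0];
    left[1] = tmp[1];
    *median = tmp[2];
    right[0] = tmp[3];
    right[1] = tmp[4];
}
```

Calling it with the full node A C G N and the new key H gives A C on the left, G as the element that moves up, and H N on the right, which is the split described above.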

Inserting E, K, and Q requires no split.

Inserting M requires a split. Note that M happens to be the middle keyword, so it moves up into the parent node.

Inserting F, W, L, and T requires no split.

When Z is inserted, the rightmost leaf is full and must be split, and the middle element T moves up into the parent node. Note that moving the middle element up keeps the tree balanced, and each of the two nodes produced by the split holds 2 key elements.

Inserting D splits the leftmost leaf. D happens to be the middle element, so it moves up into the parent node. The letters P, R, X, and Y are then inserted without any split.

Finally, when S is inserted, the node containing N, P, Q, and R must be split and the middle element Q moves up into the parent node. The parent is already full, however, so it is split as well, and its middle element M moves up into a newly formed root node. Note that the third pointer of the old parent node now belongs to the node containing D and G. That completes the insert example. The delete operation is described next; it has more cases to consider than the insert operation.

**Delete operation:** First search for the element to delete in the B-tree. If it exists, it is deleted from its node. Before deleting it, check whether the element has left and right child nodes. If it does, move a neighboring element (such as its successor) up from the child node to take its place in the parent, and then handle the situation that the move leaves behind. If it does not, simply delete it directly.

After an element has been deleted or moved, if the number of elements in a node falls below ceil(m/2)-1, check whether one of its adjacent sibling nodes is plump (holds more than ceil(m/2)-1 elements). If so, borrow an element by way of the parent node to restore the condition. If no adjacent sibling can spare an element, that is, if lending one would leave the sibling with fewer than ceil(m/2)-1 elements, the node is merged with an adjacent sibling into a single node to restore the condition. The following examples show the details.
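The choice between doing nothing, borrowing, and merging can be written down as a small decision function. The following C sketch is only an illustration of the rule just stated for non-root nodes; the enum, the function names, and the convention of passing -1 for a missing sibling are assumptions.

```c
typedef enum { FIX_NONE, FIX_BORROW, FIX_MERGE } Fix;

/* Minimum number of keys in a non-root node of an order-m B-tree:
 * ceil(m/2) - 1, computed with integer arithmetic. */
int min_keys(int m)
{
    return (m + 1) / 2 - 1;
}

/* keys_left: keys remaining in the node after the deletion.
 * left_sib, right_sib: key counts of the adjacent siblings,
 * or -1 when that sibling does not exist. */
Fix fix_after_delete(int m, int keys_left, int left_sib, int right_sib)
{
    int min = min_keys(m);

    if (keys_left >= min)
        return FIX_NONE;    /* node still satisfies the minimum */
    if (left_sib > min || right_sib > min)
        return FIX_BORROW;  /* a plump sibling can spare an element */
    return FIX_MERGE;       /* siblings are at the minimum: merge */
}
```

For an order-5 tree the minimum is 2 keys. A leaf left with 1 key next to a sibling holding 3 keys yields `FIX_BORROW`, while a leaf left with 1 key whose siblings both hold 2 keys yields `FIX_MERGE`.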

As an example, take the B-tree of order 5 built by the insertions above and delete H, T, R, and E in turn.

First delete H. We first find H. It is in a leaf node, and that leaf holds 3 elements, more than the minimum of ceil(m/2)-1 = 2, so the operation is simple: move K into the old position of H and L into the old position of K (that is, the elements after the deleted one shift forward).

Next, delete T. T is found not in a leaf node but in an internal node, so we find its successor W (the next element in ascending alphabetical order), move W up into the position of T, and then delete W from the child node that contained it. That child node still holds more than 2 elements after the deletion, so no merge is needed.

Next, delete R. R is in a leaf node, but that node holds only 2 elements, and deleting R leaves just 1, fewer than the minimum of ceil(5/2)-1 = 2. If one of its adjacent siblings is plump (holds more than ceil(5/2)-1 = 2 elements), an element can be borrowed by way of the parent: an element of the parent moves down into the deficient node, and the last or first element of the plump sibling moves up into the parent. In this example the right sibling is plump (3 elements, more than 2), so the parent element W first moves down into the leaf: S shifts forward into the old position of R, and W takes the old position of S. Then X moves up from the right sibling into the parent, X is deleted from that sibling, and the elements after it shift forward.

The last step is to delete E, which causes more trouble. The node holding E has exactly the minimum number of elements (ceil(5/2)-1 = 2), and so do its adjacent siblings, so borrowing cannot make up for the deleted element. The node must therefore be merged with an adjacent sibling: first move the separating element of the parent (the one that lies between the two nodes being merged) down into the child, then merge the two nodes into one. In this example we first move D from the parent down into the node that held E and now holds only F, and then merge the node containing D and F with the adjacent sibling containing A and C into a single node.

You might think the deletion ends here, but in this particular case the parent node now contains only one element, G, which is below the minimum and not acceptable. If a sibling of the problem node were plump, an element could be borrowed by way of the parent. Suppose the right sibling (containing Q and X) had a spare element, that is, more than the minimum of 2. We would then move M down into the node with too few elements and move Q up into the position of M; the left subtree of Q (the node containing N and P) would become the right subtree of M, attached to the right pointer of M. In this example, however, there is nothing to borrow. The node can only be merged with its sibling into a single node, and the only element of the root, M, moves down into that merged node, so the height of the tree is reduced by one level.

To discuss deletion in more detail, here is another example.

Here is a different B-tree of order 5. Let's try to delete C.

The element D from the right child of C moves up to take the position of C, but after that move the child node is left with only one element.

The node containing E cannot borrow by way of the parent, because its adjacent sibling is only just at the minimum (2 elements). A merge is the only option, so the left sibling node (the one containing A) and the node containing E are merged into a single node.

This leaves a node that contains only the element F. Its adjacent sibling is plump (3 elements, more than the minimum of 2), so an element can be borrowed by way of the parent. J moves down from the parent into the deficient node (shifting the elements of that node as needed), then the first (or last) element of the adjacent sibling moves up into the parent, and the remaining elements of the sibling shift forward (or back). Note that the node containing K and L, which was previously attached to the left of M, is now attached to the right of J. Every node now satisfies the structural properties of a B-tree.

For the relevant code, see the references at the end.

**4. B+-tree**

**B+-tree**: a variant of the B-tree that arose from the needs of file systems.

The differences between a **B+-tree** of order m and a B-tree of order m are:

1. A node with n subtrees contains n keywords (in a B-tree, a node with n subtrees has n-1 keywords);

2. The leaf nodes contain all the keywords, along with pointers to the records that hold those keywords, and the leaf nodes themselves are linked together in ascending keyword order. (B-tree leaf nodes do not contain all the information needed for a lookup.)

3. All non-terminal nodes can be regarded as the index part; each contains only the largest (or smallest) keyword of each of its subtrees. (B-tree non-terminal nodes also contain actual information that is being looked up.)
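One practical consequence of the linked leaves in point 2 is that a range of keywords can be scanned by walking the leaf chain, without going back to the index nodes. The following C sketch illustrates this under simplifying assumptions: the `Leaf` type, its capacity of 3 keys, and the function name are hypothetical, and record pointers are omitted.

```c
#include <stddef.h>

#define LEAF_KEYS 3  /* assumed leaf capacity, for illustration only */

typedef struct Leaf {
    int n;                 /* number of keys in use */
    int keys[LEAF_KEYS];   /* keys in ascending order */
    struct Leaf *next;     /* next leaf in ascending key order, or NULL */
} Leaf;

/* Counts the keys k with lo <= k <= hi by following the leaf chain
 * from `first`. The scan stops at the first key greater than hi,
 * which is valid because keys are sorted across the whole chain. */
int count_in_range(const Leaf *first, int lo, int hi)
{
    int count = 0;
    const Leaf *leaf;

    for (leaf = first; leaf != NULL; leaf = leaf->next) {
        int i;

        for (i = 0; i < leaf->n; i++) {
            if (leaf->keys[i] > hi)
                return count;
            if (leaf->keys[i] >= lo)
                count++;
        }
    }
    return count;
}
```

In a full B+-tree the scan would start from the leaf found by a root-to-leaf search for `lo` rather than from the first leaf; starting at the head of the chain keeps the sketch short.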

A) Why is the B+-tree better suited than the B-tree to operating system file indexes and database indexes in practice?

**1) B+-tree disk reads and writes cost less**

The internal nodes of a **B+-tree** hold no pointers to the specific information of the keywords, so they are smaller than B-tree internal nodes. If the keywords of one internal node are stored together in a single disk block, that block can hold more keywords, so more of the keywords being searched are brought into memory with each read, and the number of I/O reads and writes drops accordingly.

For example, suppose a disk block holds 16 bytes, a keyword takes 2 bytes, and a pointer to the specific information of a keyword takes 2 bytes. An internal node of a B-tree of order 9 (a node with at most 8 keywords) then needs 2 disk blocks, while a **B+-tree** internal node needs only 1. When an internal node has to be read into memory, the B-tree spends one more block lookup than the **B+-tree** (on a disk, the time of one platter rotation).
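The arithmetic in this example can be checked with a few lines of C. The function name is an assumption, and, like the example itself, the sketch ignores the space taken by child pointers and only counts keywords and their information pointers.

```c
/* Number of disk blocks needed to store `keys` keywords, each taking
 * key_bytes plus info_ptr_bytes (0 for a B+-tree internal node, which
 * stores no pointers to record information). Rounds up to whole blocks. */
int blocks_needed(int keys, int key_bytes, int info_ptr_bytes, int block_bytes)
{
    int node_bytes = keys * (key_bytes + info_ptr_bytes);

    return (node_bytes + block_bytes - 1) / block_bytes;
}
```

With 8 keywords and 16-byte blocks, `blocks_needed(8, 2, 2, 16)` gives 2 for the B-tree node (32 bytes), and `blocks_needed(8, 2, 0, 16)` gives 1 for the B+-tree internal node (16 bytes).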

**2) B+-tree query efficiency is more stable**

Non-terminal nodes do not point to file contents; they only index the keywords stored in the leaf nodes. Every keyword search must therefore follow a path from the root node to a leaf node. All keyword queries have the same path length, so the query cost is the same for every piece of data.

b) Application of the **B+-tree**: VSAM (Virtual Storage Access Method) files (source: the paper "The Ubiquitous B-Tree", D. Comer, 1979)

**A detailed introduction to the B+-tree, with examples, will be given in a future write-up, to be written ...**

**5. B*-tree**

The B*-tree is a variant of the **B+-tree** that adds pointers to siblings in the non-root, non-leaf nodes of the **B+-tree**. The B*-tree requires each non-leaf node to hold at least (2/3)*m keywords, that is, the minimum block utilization is 2/3 (instead of the 1/2 of the B+-tree). The two split strategies compare as follows:

B+-tree: When a node is full, a new node is allocated, 1/2 of the data in the original node is copied to the new node, and a pointer to the new node is added to the parent node. A B+-tree split affects only the original node and the parent node, not the sibling nodes, so no pointers to siblings are needed.

B*-tree: When a node is full and its next sibling is not full, part of the data is moved to the sibling, the keyword is inserted into the original node, and finally the keyword for the sibling in the parent node is updated (because the keyword range of the sibling has changed). If the sibling is full too, a new node is added between the original node and the sibling, each of the two copies 1/3 of its data to the new node, and finally a pointer to the new node is added to the parent node.

The B*-tree therefore allocates new nodes less often than the B+-tree and achieves higher space utilization.

**6. Summary**

The B-tree, B+-tree, and B*-tree can be summarized as follows:

B-tree: ordered arrays + balanced multi-way tree;

B+-tree: linked list of ordered arrays + balanced multi-way tree;

B*-tree: a plump B+-tree.

The B~tree family of data structures plays a very important role in file systems for large-scale data storage. The information kept in a node differs depending on the data being stored. Based on my own understanding, I drew a B+-tree that uses the employee number as the keyword and looks up the simple record with employee number 38 (the original figure is not reproduced). (Assume that each physical block holds 3 index entries. The basic unit of disk I/O is the block, disk access is time-consuming, and the B+-tree effectively reduces the number of disk accesses.)

The index structures in databases such as MySQL, DB2, and Oracle are worth studying in depth, and a lot of B-tree-related open source code can be found on the web.

References (downloads can be found via Google) and related URLs:

1. Organization and Maintenance of Large Ordered Indices

2. The Ubiquitous B-Tree

3. Http://en.wikipedia.org/wiki/Btree (lists some open source addresses)

4. http://cis.stvincent.edu/html/tutorials/swd/btree/btree.html (includes C++ source code)

5. http://slady.net/java/bt/view.php (if you understand the B-tree structure, this address lets you try the search, insert, and delete operations online)
