B-trees and B+ trees: a detailed introduction to data lookup and database indexing

Source: Internet
Author: User

1. B-Tree definition

A B-tree is a balanced multiway search tree that is widely used in file systems.

Definition: a B-tree of order m is either an empty tree or an m-way tree that satisfies the following properties:
(1) Every node has at most m subtrees;
(2) If the root node is not a leaf node, it has at least two subtrees;

(3) Every non-terminal node other than the root has at least ceil(m/2) subtrees;
(4) Every non-terminal node contains the following information:

(n, A0, K1, A1, K2, A2, ..., Kn, An)

where Ki (i = 1, 2, ..., n) are keys, with Ki < Ki+1;

Ai (i = 0, 1, ..., n) are pointers to the roots of subtrees. Every key in the subtree pointed to by Ai-1 is less than Ki (i = 1, 2, ..., n), and every key in the subtree pointed to by An is greater than Kn;

n is the number of keys in the node.
(5) All leaf nodes appear on the same level and carry no information (they can be regarded as external nodes or failed-lookup nodes; in fact these nodes do not exist, and the pointers to them are null).

That is, all leaf nodes have the same depth, equal to the height of the tree.

(Figure: a B-tree of order 4 with a depth of 4.)

Lookup in a B-tree resembles lookup in a binary search tree. The difference is that each node of a B-tree is an ordered table of several keys. On reaching a node, search its ordered table first: if the key is found, the lookup succeeds; otherwise follow the corresponding pointer into the subtree and continue. If a leaf node is reached, the tree contains no such key.

The process of finding key 47 in the B-tree of the figure above is as follows:

(1) Start from the root: follow the root pointer to node *a. Node *a holds a single key, 35, and 47 > 35, so if 47 exists it must be in the subtree pointed to by A1.

(2) Follow that pointer to node *c, which holds two keys (43 and 78). Since 43 < 47 < 78, if 47 exists it must be in the subtree pointed to by A1 of *c.

(3) Similarly, follow that pointer to node *g, where key 47 is found, and the lookup succeeds.

2. Search algorithm

The code is as follows:

#include <stddef.h>

typedef int KeyType;
typedef struct Record Record;    /* record type, defined elsewhere */
#define M 5                      /* order of the B-tree, tentatively set to 5 */

typedef struct Node {
    int keynum;                  /* number of keys in the node, i.e. the node size */
    struct Node *parent;         /* pointer to the parent node */
    KeyType key[M + 1];          /* key vector; slot 0 is not used */
    struct Node *ptr[M + 1];     /* subtree pointer vector */
    Record *recptr[M + 1];       /* record pointer vector; slot 0 is not used */
} NodeType;                      /* B-tree node type */

typedef struct {
    NodeType *pt;                /* pointer to the node found */
    int i;                       /* position of the key in the node, in [1..M] */
    int tag;                     /* 1: lookup succeeded, 0: lookup failed */
} Result;                        /* result type of a B-tree lookup */

/* The original does not show search(); a sequential scan is assumed here.
   Returns the largest i in [0..keynum] with key[i] <= kx (0 if kx < key[1]). */
static int Search(const NodeType *p, KeyType kx)
{
    int i = 0;
    while (i < p->keynum && p->key[i + 1] <= kx) i++;
    return i;
}

/* Look up key kx in the order-M B-tree t and return (pt, i, tag).
   On success tag = 1 and key i of node pt equals kx. Otherwise tag = 0 and
   kx should be inserted between keys i and i+1 of the node pointed to by pt. */
Result SearchBTree(NodeType *t, KeyType kx)
{
    NodeType *p = t, *q = NULL;  /* p: node being examined, q: parent of p */
    int found = 0, i = 0;
    Result r;

    while (p && !found) {
        i = Search(p, kx);       /* search within p->key[1..keynum] */
        if (i > 0 && p->key[i] == kx) found = 1;   /* key found */
        else { q = p; p = p->ptr[i]; }
    }
    r.pt = found ? p : q;        /* on failure, report where kx would be inserted */
    r.i = i;
    r.tag = found;
    return r;
}

Analysis of the B-tree lookup algorithm

As the lookup algorithm shows, a lookup in a B-tree consists of two basic operations:

(1) finding a node in the B-tree;

(2) finding the key within that node.

Since a B-tree is typically stored on disk, the first operation is performed on disk and the second in memory: once the node pointed to by p has been located on disk, its contents are read into memory, and a sequential or binary search then looks for a key equal to K. Clearly, a lookup on disk takes far more time than a lookup in memory.

Therefore the number of disk lookups, that is, the level of the B-tree at which the key is found, is the primary factor that determines the efficiency of a B-tree lookup.

So what is the worst-case depth of a B-tree of order m that contains n keys? The analysis is similar to that for balanced binary trees. First consider the minimum number of nodes on each level of a B-tree of order m.

By the definition of a B-tree, a B-tree that contains n keys has n+1 leaves, all on level j+1.

(1) The first level is the root, with at least one node. The root has at least two children, so the second level has at least two nodes.

(2) Every node other than the root and the leaves has at least ceil(m/2) children, so the third level has at least 2*ceil(m/2) nodes, the fourth level has at least 2*ceil(m/2)^2 nodes, and so on.

(3) Level j+1 therefore has at least 2*ceil(m/2)^(j-1) nodes. The nodes on level j+1 are the leaf nodes, and there are n+1 of them, so:

$$n + 1 \ge 2 \left\lceil m/2 \right\rceil^{j-1}, \quad \text{that is,} \quad j \le \log_{\lceil m/2 \rceil}\left(\frac{n+1}{2}\right) + 1$$

That is, in a B-tree with n keys, the number of nodes on the path from the root to the node holding the key does not exceed:

$$\log_{\lceil m/2 \rceil}\left(\frac{n+1}{2}\right) + 1$$
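As a rough numerical check of this bound, the following C sketch finds the largest j that satisfies n + 1 >= 2 * ceil(m/2)^(j-1) by repeated multiplication. The function name btree_max_levels is hypothetical and is not part of the original code.

```c
/* Largest number of node levels j (the empty leaf level excluded) that an
   order-m B-tree with n keys can have, from n + 1 >= 2 * ceil(m/2)^(j-1).
   Returns 0 for an empty tree and -1 if m < 3. */
int btree_max_levels(unsigned long long n, unsigned int m)
{
    unsigned long long half, leaves;
    int j = 1;

    if (m < 3) return -1;
    if (n == 0) return 0;
    half = (m + 1ULL) / 2;               /* ceil(m/2) */
    leaves = 2;                          /* minimum leaf count when j = 1 */
    while (leaves <= (n + 1) / half) {   /* same as leaves * half <= n + 1 */
        leaves *= half;
        j++;
    }
    return j;
}
```

Under these assumptions, a tree of order 101 holding one million keys is at most 4 levels deep, and a tree of order 3 holding 7 keys is at most 3 levels deep.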

3. B-tree insertion

A B-tree is also built starting from an empty tree, by inserting keys one by one. However, because the number of keys in a B-tree node must be at least ceil(m/2)-1, inserting a key does not add a leaf node to the tree. Instead, the key is first added to a non-terminal node at the lowest level. If the number of keys in that node does not exceed m-1, the insertion is complete; otherwise the node must be "split".

Figure (a) shows a B-tree of order 3 (the figure omits the F nodes, that is, the leaf nodes). Assume that the keys 30, 26 and 85 are to be inserted in that order.

(1) First, a lookup determines where to insert. Searching from the root *a shows that 30 should be inserted into node *d. Since the number of keys in *d does not exceed 2 (that is, m-1), the first insertion is complete, as shown in figure (b).

(2) Similarly, a lookup shows that key 26 should also be inserted into *d. Because the number of keys in *d would then exceed 2, *d must be split into two nodes: key 26 and the pointers before and after it remain in *d, and key 37 and the pointers before and after it are stored in the newly created node *d'. At the same time, key 30 and a pointer to node *d' are inserted into the parent node *b. Since the number of keys in *b does not exceed 2, the insertion is complete, as shown in figures (c) and (d).

(3) Figures (e) to (g) show the tree after 85 is inserted.
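The split step in the example above can be sketched in isolation. The following C function works on a plain sorted key array rather than on the node structure, and the name split_keys and its parameters are hypothetical. It assumes the array holds m keys (one more than an order-m node may keep) and splits it around the key at position ceil(m/2).

```c
#include <stddef.h>

/* Split an overfull sorted key array of length m around its median.
   Keys before the median stay in left[], keys after it go to right[],
   and the median is returned so it can be inserted into the parent. */
int split_keys(const int *keys, size_t m,
               int *left, size_t *nleft,
               int *right, size_t *nright)
{
    size_t s = (m + 1) / 2;          /* 1-based median position: ceil(m/2) */
    size_t k;

    *nleft = s - 1;
    *nright = m - s;
    for (k = 0; k < *nleft; k++)  left[k] = keys[k];
    for (k = 0; k < *nright; k++) right[k] = keys[s + k];
    return keys[s - 1];
}
```

For the order-3 example, splitting the keys 26, 30, 37 leaves 26 in the old node, puts 37 in the new node, and returns 30 for the parent.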

Insert algorithm:

The code is as follows:

/* Insert, Split and NewRoot are not shown in the original; these signatures are assumptions. */
void Insert(NodeType *q, int i, KeyType x, NodeType *ap);
void Split(NodeType *q, int s, NodeType **ap);
void NewRoot(NodeType **t, NodeType *q, KeyType x, NodeType *ap);

/* Insert key kx between key[i] and key[i+1] of node *q in the order-M B-tree *t. If a node
   becomes too large, split along the parent chain so that *t remains an order-M B-tree. */
int InsertBTree(NodeType **t, KeyType kx, NodeType *q, int i)
{
    KeyType x = kx; NodeType *ap = NULL; int finished = 0, s;
    while (q && !finished) {
        Insert(q, i, x, ap);                /* put x and ap into q->key[i+1] and q->ptr[i+1] */
        if (q->keynum < M) finished = 1;    /* insertion complete */
        else {                              /* split node *q */
            /* move q->key[s+1..M], q->ptr[s..M] and q->recptr[s+1..M] into the new node *ap */
            s = (M + 1) / 2; Split(q, s, &ap); x = q->key[s];
            q = q->parent;
            if (q) i = Search(q, kx);       /* find the insertion position in the parent node *q */
        }
    }
    if (!finished)                          /* *t is an empty tree, or the root has been split into *q and *ap */
        NewRoot(t, q, x, ap);               /* new root *t holding (t, x, ap), with the old *t and ap as subtrees */
    return 1;
}

4. B-Tree deletion

Conversely, to delete a key from a B-tree, first find the node that holds the key and remove the key from it. If that node is at the lowest non-terminal level and it held at least ceil(m/2) keys, the deletion is complete; otherwise nodes must be "merged". If the key to delete is Ki in a node that is not at the lowest non-terminal level, replace Ki with the smallest key Y in the subtree pointed to by Ai, and then delete Y from the node that held it. For example, to delete 45 from the B-tree of Figure 4.1 (a) below, replace 45 with 50 from node *f, and then delete 50 from *f.

Figure 4.1 (a)

So we only need to discuss deleting a key from a node at the lowest non-terminal level. There are three possible cases:

(1) The node that holds the key has at least ceil(m/2) keys. Simply delete the key Ki and the corresponding pointer Ai from the node; the rest of the tree is unchanged. For example, deleting key 12 from the B-tree shown in Figure 4.1 (a) gives the B-tree shown in Figure 4.2 (a):

Figure 4.2 (a)

(2) The node that holds the key has exactly ceil(m/2)-1 keys, and its adjacent right (or left) sibling has more than ceil(m/2)-1 keys. Move the smallest (or largest) key of the sibling up into the parent node, and move the key in the parent that is immediately smaller (or larger) than the moved-up key down into the node from which the key was deleted.

[For example] Delete 50 from Figure 4.2 (a): move 61 from the right sibling up into node *e, and move 53 from *e down into *f. Both *f and *g then have at least ceil(m/2)-1 keys, and the number of keys in the parent node is unchanged, as shown in Figure 4.2 (b).

Figure 4.2 (b)

(3) The node that holds the key and its adjacent siblings all have exactly ceil(m/2)-1 keys. Suppose the node has a right sibling whose address is given by pointer Ai in the parent node. After the key is deleted, the keys and pointers remaining in the node, together with key Ki from the parent node, are merged into the sibling pointed to by Ai (if there is no right sibling, merge into the left sibling).

[For example] Delete 53 from the B-tree shown in Figure 4.2 (b): remove node *f and merge what remains in *f (a null pointer) together with key 61 from the parent *e into the right sibling *g. The resulting tree is shown in Figure 4.2 (c).

Figure 4.2 (c)

If this leaves the parent node with fewer than ceil(m/2)-1 keys, the same procedure is applied to the parent, and so on up the tree.

[For example] After key 37 is deleted from the B-tree in Figure 4.2 (c), what remains in the parent node *b (pointer c) must be merged together with key 45 from its parent *a into the right sibling *e. The resulting B-tree is shown in Figure 4.2 (d).

Figure 4.2 (d)
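The choice among the three deletion cases depends only on key counts, which the following C sketch makes explicit. The enum and function names are hypothetical; n_node is the number of keys in the node before the removal and n_sibling is the number of keys in the adjacent sibling that would lend a key.

```c
/* The three cases for removing a key from a node at the lowest
   non-terminal level of an order-m B-tree. */
enum delete_case { DELETE_SIMPLE = 1, DELETE_BORROW = 2, DELETE_MERGE = 3 };

enum delete_case classify_delete(int m, int n_node, int n_sibling)
{
    int min_keys = (m + 1) / 2 - 1;                  /* ceil(m/2) - 1 */

    if (n_node > min_keys)    return DELETE_SIMPLE;  /* case (1): just remove the key */
    if (n_sibling > min_keys) return DELETE_BORROW;  /* case (2): rotate a key through the parent */
    return DELETE_MERGE;                             /* case (3): merge with the sibling */
}
```

In an order-3 tree the minimum is one key per node, so a node with two keys falls under case (1), a one-key node with a two-key sibling under case (2), and a one-key node with a one-key sibling under case (3).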

B-trees are mainly used in file systems

When a large database file is stored on a hard disk, the goal is to reduce the number of disk accesses. The balanced multiway search tree, the B-tree, was proposed for this purpose, and performance analysis shows that its retrieval efficiency is quite high. To improve on it further, many B-tree variants have been proposed.

B+ tree

A B+ tree is a variant of the B-tree that grew out of the needs of file systems. A B+ tree of order m differs from a B-tree of order m as follows:
(1) A node with n subtrees contains n keys;
(2) The leaf nodes contain all the keys, along with pointers to the records that hold those keys, and the leaf nodes themselves are linked in ascending key order;
(3) All non-terminal nodes can be regarded as the index part; each holds only the largest (or smallest) key of each of its subtrees.
The figure shows a B+ tree of order 3:

A B+ tree usually has two head pointers, one pointing to the root node and the other to the leaf node with the smallest key. A B+ tree therefore supports two kinds of lookup: a sequential lookup that starts from the smallest key, and a random lookup that starts from the root node.
Random lookup, insertion and deletion on a B+ tree are basically similar to those on a B-tree. The one difference in lookup is that when a key in a non-terminal node equals the given value, the search does not stop but continues down to a leaf node. Therefore, in a B+ tree, every lookup follows a path from the root to a leaf node, whether or not it succeeds.
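The sequential lookup that starts from the smallest leaf can be sketched as a walk along the leaf chain. The Leaf structure, the LEAF_CAP constant and the function leaf_range_scan below are hypothetical illustrations, not part of the original code; record pointers are omitted to keep the sketch short.

```c
#include <stddef.h>

#define LEAF_CAP 4                      /* hypothetical leaf capacity */

typedef struct Leaf {
    int nkeys;                          /* number of keys in use */
    int key[LEAF_CAP];                  /* keys in ascending order */
    struct Leaf *next;                  /* next leaf in key order, or NULL */
} Leaf;

/* Walk the leaf chain from the smallest leaf and copy every key in
   [lo, hi] into out[] (at most cap keys). Returns the number copied. */
size_t leaf_range_scan(const Leaf *first, int lo, int hi, int *out, size_t cap)
{
    size_t n = 0;
    const Leaf *p;
    int i;

    for (p = first; p != NULL; p = p->next) {
        for (i = 0; i < p->nkeys; i++) {
            if (p->key[i] > hi) return n;           /* keys are sorted, so we are done */
            if (p->key[i] >= lo && n < cap) out[n++] = p->key[i];
        }
    }
    return n;
}
```

In a full implementation the scan would normally start at the leaf found by a random lookup of lo from the root, instead of at the first leaf; the walk along the next pointers is the same.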

Application of B+ trees in databases

1. The role of indexes in the database

In the use of a database system, data query is the most frequently used operation.

The most basic query algorithm is, of course, sequential (linear) search: traverse the table and compare each row's value with the search key, which has time complexity O(n). An O(n) algorithm can still perform well on small tables and lightly loaded databases, but as the data grows its performance degrades quickly.

Fortunately, computer science provides many better search algorithms, such as binary search and binary tree search. A little analysis shows that each search algorithm applies only to a specific data structure: binary search requires the data to be sorted, and binary tree search applies only to binary search trees. The data itself, however, cannot be organized to satisfy every such data structure at once (for example, it is theoretically impossible to keep a table sorted on two columns at the same time). So in addition to the data, the database system maintains data structures that satisfy particular search algorithms and that reference (point to) the data in some way, so that advanced search algorithms can be run on them. Such a data structure is an index.
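To make the dependence on data organization concrete, here is a minimal binary search in C. It only works because the array is sorted, which is exactly the kind of precondition an index exists to provide. The function name binary_search is an illustration, not taken from the original.

```c
#include <stddef.h>

/* Binary search over a sorted int array.
   Returns the index of key, or -1 if key is not present. */
long binary_search(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;              /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] == key) return (long)mid;
        if (a[mid] < key) lo = mid + 1;
        else hi = mid;
    }
    return -1;
}
```

Each step halves the range, so a lookup takes O(log n) comparisons instead of the O(n) of a sequential scan.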

An index is a structure that sorts the values of one or more columns of a database table. Instead of searching all rows of the table, an index holds pointers to the data values stored in the specified columns, arranged in the specified sort order, which helps retrieve information more quickly. In general, create an index on a table only if the data in the indexed columns is queried frequently. Indexes consume disk space and slow down data updates, but in most cases the gain in retrieval speed far outweighs these drawbacks.

2. B+ tree application in database indexing

At present, most database systems and file systems use a B-tree or its variant, the B+ tree, as the index structure.

1) Application in database indexes

In database indexes, B+ trees are organized in the following ways:

① Organization of leaf nodes. There are three typical cases. (a) The search key of the B+ tree is the primary key of the data file and the index is dense: the leaf nodes hold one key-pointer pair for every record of the data file, and the data file may or may not be sorted by primary key. (b) The data file is sorted by primary key and the B+ tree is a sparse index: the leaf nodes hold one key-pointer pair for each block of the data file. (c) The data file is sorted by an attribute that is not a key, and that attribute is the search key of the B+ tree: the leaf nodes hold one key-pointer pair for each value K of the attribute that appears in the data file, and the pointer points to the first of the records whose sort-key value is K.

② Organization of non-leaf nodes. The non-leaf nodes of a B+ tree form a multilevel sparse index over the leaf nodes. Each non-leaf node has at least ceil(m/2) pointers and at most m pointers.

2) Insertion and deletion in a B+ tree index

① When new data is inserted into the database, the corresponding index key must also be inserted into the database index, which means inserting a new key into the B+ tree. This follows the B-tree insertion algorithm described above.

② When data is deleted from the database, the corresponding index key must also be removed from the database index, which means deleting the key from the B+ tree. This follows the B-tree deletion algorithm described above.

Why use a B-tree (B+ tree)

Variants of the binary search tree, such as the red-black tree, can also be used to implement indexes, but file systems and database systems generally use B-/B+ trees as the index structure.

In general, the index itself is also very large and cannot be held entirely in memory, so it is usually stored on disk as an index file. Index lookups therefore incur disk I/O, which costs several orders of magnitude more than memory access. The most important measure of a data structure as an index is thus the asymptotic complexity of the number of disk I/O operations during a lookup. In other words, the index should be organized so as to minimize the number of disk I/O accesses during a lookup. The choice of B-/B+ trees is also related to how disk access works.

The principle of locality and disk read-ahead

Because of the nature of the storage medium, disk access is much slower than main memory access; with the cost of mechanical movement added, disk access speed is often only a few hundredths of that of main memory. To improve efficiency, disk I/O must be minimized. For this reason the disk does not read strictly on demand but reads ahead every time: even if only one byte is needed, the disk starts from that position and sequentially reads a certain length of data into memory. The rationale is the well-known principle of locality in computer science:

When a piece of data is used, the data near it is usually used soon afterwards.

The data required during program execution is usually clustered together.

Because sequential disk reads are very efficient (no seek time is needed and only a little rotational time), read-ahead can improve I/O efficiency for programs that exhibit locality.

The read-ahead length is typically an integer multiple of the page size. A page is the logical block in which a computer manages memory: hardware and operating systems often divide main memory and disk storage into contiguous blocks of equal size, each called a page (in many operating systems the page size is typically 4 KB), and main memory and disk exchange data in units of pages. When the data a program wants to read is not in main memory, a page fault is raised; the system sends a read request to the disk, the disk finds the starting position of the data and reads one or more consecutive pages into memory, the fault handler returns, and the program continues to run.

From the analysis of B-/B+ trees above, a lookup needs to access at most h nodes:

$$h = \log_{\lceil m/2 \rceil}\left(\frac{n+1}{2}\right) + 1$$

Database systems cleverly exploit disk read-ahead by making the size of a node equal to one page, so that each node can be fully loaded with a single I/O. To achieve this, the following technique is used in actual B-tree implementations:

Each time a new node is created, a full page of space is requested directly. This ensures that a node is physically stored within one page, and since storage allocation is page-aligned, reading a node takes only one I/O.
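The effect of page alignment can be illustrated with simple offset arithmetic. The sketch below assumes a 4 KB page and a file layout in which node number k starts at byte offset k * PAGE_SIZE; both the layout and the function names are assumptions made for illustration.

```c
#define PAGE_SIZE 4096ULL               /* assumed page size in bytes */

/* Byte offset of node number node_id when every node occupies one page. */
unsigned long long node_offset(unsigned long long node_id)
{
    return node_id * PAGE_SIZE;
}

/* Number of pages covered by the byte range [offset, offset + len). */
unsigned long long pages_touched(unsigned long long offset, unsigned long long len)
{
    if (len == 0) return 0;
    return (offset + len - 1) / PAGE_SIZE - offset / PAGE_SIZE + 1;
}
```

A page-sized node that starts on a page boundary lies within a single page, while the same node shifted by even 100 bytes straddles two pages and would need two page reads.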

A lookup in a B-tree needs at most h-1 I/Os (the root node stays resident in memory), and the asymptotic complexity is O(h) = O(log_m n). In practice m is very large, usually more than 100, so h is very small (usually no more than 3).
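A back-of-the-envelope calculation shows why h stays so small. The C sketch below estimates the fan-out m of a node that fills one page and then the number of levels needed to reach n keys when nodes are full. The 8-byte key and 4-byte pointer used in the example are assumptions, as are the function names.

```c
/* Rough fan-out of a node that fills one page: each entry is one key plus
   one child pointer. All sizes are in bytes. Returns 0 on invalid input. */
unsigned int node_fanout(unsigned int page_size, unsigned int key_size,
                         unsigned int ptr_size)
{
    if (key_size + ptr_size == 0) return 0;
    return page_size / (key_size + ptr_size);
}

/* Smallest h with m^h >= n, found by repeated multiplication.
   Returns -1 if m < 2. */
int levels_needed(unsigned long long n, unsigned int m)
{
    unsigned long long reach = 1;       /* keys reachable with h levels */
    int h = 0;

    if (m < 2) return -1;
    while (reach < n) {
        h++;
        if (reach > n / m) break;       /* reach * m would already cover n */
        reach *= m;
    }
    return h;
}
```

With a 4096-byte page, 8-byte keys and 4-byte pointers the fan-out is 341, and three levels are enough to cover ten million keys, which agrees with the estimate that h is usually no more than 3.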

In summary, using a B-tree as the index structure is very efficient.

With a red-black tree, by contrast, h is obviously much larger. Because logically adjacent nodes (parent and child) may be physically far apart, locality cannot be exploited, so although the I/O asymptotic complexity of a red-black tree is also O(h), its efficiency is significantly worse than that of a B-tree.

MySQL's B-tree index (technically, a B+ tree)

MySQL has four main types of indexes: B-tree indexes, hash indexes, full-text indexes and R-tree indexes. We mainly analyze the B-tree index.

The B-tree index is the most frequently used index type in MySQL; every storage engine except Archive supports it. The Archive engine did not support indexes at all until MySQL 5.1, and since then it supports only an index on a single AUTO_INCREMENT column.

Not only in MySQL but also in many other database management systems, the B-tree index is the primary index type, mainly because its storage structure performs very well for data retrieval.

In general, the physical files of B-tree indexes in MySQL are mostly stored in a balanced-tree structure: all the data actually needed is stored in the leaf nodes of the tree, and the shortest path to every leaf node has exactly the same length, which is why they are all called B-tree indexes. Of course, different databases (or different MySQL storage engines) may modify the storage structure slightly when storing their B-tree indexes. For example, the storage structure actually used by the InnoDB storage engine for its B-tree indexes is a B+ tree, a small modification of the B-tree data structure: in addition to the index key information stored on each leaf node, each leaf node also stores a pointer to the next adjacent leaf node (a sequential access pointer). This is mainly to speed up retrieval across multiple adjacent leaf nodes.

The following mainly discusses how the MyISAM and InnoDB storage engines implement indexes:

1. MyISAM index implementation:

1) Primary key index:

The MyISAM engine uses a B+ tree as the index structure, and the data field of a leaf node holds the address of the data record. The following figure is a schematic diagram of a MyISAM primary key index:

(Figure MYISAM1)

The table has three columns in total. Assuming Col1 is the primary key, Figure MYISAM1 shows the primary index (primary key) of a MyISAM table. As you can see, the MyISAM index file holds only the addresses of the data records.

2) Secondary index (secondary key)

In MyISAM, the primary index and the secondary index are structurally identical; the only difference is that the primary index requires keys to be unique, while the keys of a secondary index may repeat. If we create a secondary index on Col2, the structure of this index is shown in the following figure:

It is also a B+ tree, and the data field holds the address of the data record. Therefore, index retrieval in MyISAM first searches the index with the B+ tree search algorithm; if the specified key exists, it takes the value of its data field and then reads the corresponding data record using that value as the address.
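The two steps of a MyISAM-style lookup (search the index, then follow the stored address) can be sketched as follows. To keep the example short, a binary search over a sorted array stands in for the B+ tree descent, and the data file is modeled as an in-memory array whose element index plays the role of the record address. The types, names and sample rows are all hypothetical.

```c
#include <stddef.h>

typedef struct { int col1; int col2; const char *col3; } Row;   /* hypothetical table row */
typedef struct { int key; size_t addr; } IndexEntry;            /* key -> record address */

/* Nonclustered lookup: find key in the index, then use the stored address
   to fetch the row from the separate data file (here an array). */
const Row *nonclustered_lookup(const IndexEntry *index, size_t n,
                               const Row *data_file, int key)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {                   /* binary search stands in for the B+ tree descent */
        size_t mid = lo + (hi - lo) / 2;
        if (index[mid].key == key) return &data_file[index[mid].addr];
        if (index[mid].key < key) lo = mid + 1;
        else hi = mid;
    }
    return NULL;                        /* key not in the index */
}
```

Note that the rows in the data file need not be in key order; only the index is sorted, which is what makes the index nonclustered.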

The MyISAM index is also called "nonclustered", a name that distinguishes it from InnoDB's clustered index.

2. InnoDB Index Implementation

Although InnoDB also uses a B+ tree as the index structure, its concrete implementation is quite different from MyISAM's.

1) Primary key index:

In MyISAM, index files and data files are separate, and the index file holds only the addresses of the data records. In InnoDB, the table data file is itself an index structure organized as a B+ tree, and the leaf nodes of this tree hold the complete data records. The key of this index is the primary key of the table, so the InnoDB table data file is itself the primary index.

(Figure InnoDB primary key index)

The figure above is a schematic diagram of the InnoDB primary index (which is also the data file); you can see that the leaf nodes contain the complete data records. This kind of index is called a clustered index. Because the InnoDB data file is itself clustered by primary key, InnoDB requires every table to have a primary key (MyISAM does not). If none is specified explicitly, MySQL automatically selects a column that uniquely identifies the data records as the primary key; if no such column exists, MySQL automatically generates a hidden field for the InnoDB table to serve as the primary key. This field is 6 bytes long and is of a long integer type.

2) Secondary index of InnoDB

All InnoDB secondary indexes use the primary key as the data field. For example, the following figure shows a secondary index defined on Col3:

An InnoDB table is built on its clustered index, so InnoDB provides very fast primary key lookups. However, its secondary indexes (indexes that are not the primary key) also contain the primary key columns, so if the primary key is large, all other indexes will be large too. If you want to define many indexes on a table, keep the primary key as small as possible. InnoDB does not compress indexes.

Here the ASCII codes of the characters are used as the comparison criterion. This clustered index implementation makes searching by primary key very efficient, but a secondary index search requires two index lookups: first search the secondary index to obtain the primary key, then use the primary key to retrieve the record from the primary index.
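The two-pass lookup can be sketched as follows. Sorted arrays and binary search again stand in for the two B+ trees, and the types, names and sample data are hypothetical. The secondary index is ordered with strcmp, which compares bytes and therefore matches ASCII order for ASCII strings.

```c
#include <stddef.h>
#include <string.h>

typedef struct { int pk; const char *name; } ClusteredRow;    /* primary index leaf: the full row */
typedef struct { const char *name; int pk; } SecondaryEntry;  /* secondary index leaf: key -> primary key */

/* Pass 2: search the clustered (primary) index, sorted by pk. */
static const ClusteredRow *find_by_pk(const ClusteredRow *rows, size_t n, int pk)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rows[mid].pk == pk) return &rows[mid];
        if (rows[mid].pk < pk) lo = mid + 1;
        else hi = mid;
    }
    return NULL;
}

/* Pass 1: search the secondary index, sorted by name, to obtain the
   primary key; then run pass 2 with that primary key. */
const ClusteredRow *find_by_name(const SecondaryEntry *sec, size_t nsec,
                                 const ClusteredRow *rows, size_t nrows,
                                 const char *name)
{
    size_t lo = 0, hi = nsec;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strcmp(sec[mid].name, name);
        if (c == 0) return find_by_pk(rows, nrows, sec[mid].pk);
        if (c < 0) lo = mid + 1;
        else hi = mid;
    }
    return NULL;
}
```

Because every secondary entry carries a copy of the primary key, a wider primary key enlarges every secondary index, which is the size effect described above.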

Understanding how different storage engines implement indexes is very helpful for using and optimizing indexes correctly. For example, once you know InnoDB's index implementation, it is easy to see why overly long fields are not recommended as primary keys: every secondary index references the primary index, so an overly long primary key makes the secondary indexes too large. Likewise, a non-monotonic field is a poor choice of primary key in InnoDB: the InnoDB data file is itself a B+ tree, and a non-monotonic primary key forces the data file to be split and rebalanced frequently to maintain the B+ tree properties as new records are inserted, which is very inefficient. Using an auto-increment field as the primary key is a good choice.

Differences between InnoDB indexes and MyISAM indexes:

The first difference is in the primary index: the InnoDB data file is itself the index file, whereas MyISAM keeps the index and the data separate.

The second difference is in secondary indexes: the data field of an InnoDB secondary index stores the primary key value of the corresponding record instead of its address, whereas a MyISAM secondary index differs little from its primary index.
