Why use B + Tree Structure -- Implementation of MySQL index structure (1)

Source: Internet
Author: User
Tags mysql index

Why use B + Tree Structure -- Implementation of MySQL index structure (1)

Application of B + tree in Database

{

Why is B + tree used? The reason is:

1. Files are large and cannot be stored in the memory. Therefore, they must be stored on disks.

2. The structure of the index should minimize the number of disk I/O accesses during the search process (why to use B-/+ Tree is also related to the disk access principle .)

3. Local principle and disk pre-read. The pre-read length is generally an integral multiple of the page (in many operating systems, the page size is usually 4 k)

4. the database system cleverly utilizes the disk pre-read principle to set the size of a node to equal to one page, so that each node can be fully loaded only once I/O, (because the node has two arrays, the address is consecutive ). The structure of the red and black trees is much deeper than h. Because logically close nodes (Parent and Child) may be physically far away, locality cannot be exploited.

Differences between the structure of InnoDB and MyISAM

1. The primary key index of InnoDB. The MyISAM index file is separated from the data file, and the index file only stores the data record address. In InnoDB, the table data file itself is an index structure organized by B + Tree. The leaf node data field of this Tree stores complete data records. The key of this index is the primary key of the data table. Therefore, the data file of the InnoDB table is the primary index. Therefore, a primary key is required. If no definition is displayed, an implicit field is automatically generated as the primary key, this field is 6 bytes in length and its type is long integer. the Secondary Index (Secondary Index, which is also a non-primary key Index) of InnoDB will also contain primary key columns. For example, if the name is used to create an Index, the internal node will contain the name, the leaf node will contain the value of the primary key corresponding to the name. If the primary key definition is large, other indexes will also be large. myISAM engine uses B + Tree as the index structure. The data domain of the index file leaf node stores the data record address, pointing to the corresponding value in the data file, each node only has the value of this index column.

4. MyISAM primary index and Secondary index (Secondary key) have no difference in structure, but the primary index requires that the key be unique and the Secondary index can be repeated,

(Because the MyISAM secondary index stores the data record address on the leaf node, which is the same as the primary key index, InnoDB compared with B + can use the secondary index

You can quickly find all the data without having to traverse one primary key index. Therefore, this method is applicable to OLAP)

Differences between InnoDB indexes and MyISAM indexes:

First, the difference between the primary index and the InnoDB data file itself is the index file. The indexes and data of MyISAM are separated.

Second, the difference between secondary indexes: the secondary index data domain of InnoDB stores the value of the primary key of the corresponding record instead of the address. The secondary index of MyISAM is not much different from the primary index.

}

1. Role of indexes in Databases

During the use of the database system, data query is the most frequently used data operation.

The most basic query algorithm is linear search, which traverses the table and matches the row-by-row match to determine whether the value is equal to the keyword to be searched. the time complexity is O (n ). However, tables with low time complexity of O (n) algorithms and databases with low load can also have good performance. However, when the data increases, the time complexity of the O (n) algorithm is obviously bad, and the performance will soon decline.

Fortunately, the development of computer science provides many better search algorithms, such as binary search and binary tree search. If you perform a slight analysis, you will find that each search algorithm can only be applied to a specific data structure. For example, binary search requires that the retrieved data be ordered, while binary search can only be applied to binary search trees, however, the organizational structure of the data itself cannot fully satisfy all kinds of data structures (for example, theoretically it is impossible to organize both columns in order at the same time, the database system also maintains data structures that meet specific search algorithms. These data structures reference (point to) data in some way, so that advanced search algorithms can be implemented on these data structures. This data structure is an index.

An index is a structure that sorts the values of one or more columns in a database table. Compared to searching all rows in a table, indexes use pointers to the data values stored in the specified column in the table, and sort these pointers in the specified order to help you get information faster. Generally, an index must be created on a table only when data in the index column is frequently queried. Indexes occupy disk space and affect the data update speed. However, in most cases, the data retrieval speed advantage brought by indexes greatly exceeds its shortcomings.

2. Application of B + tree in database index

Currently, most database systems and file systems use B-Tree or its variant B + Tree as the index structure.

1)Application in database index

In the application of database indexes, the B + tree is organized as follows:

① Organization of leaf nodes. The search key of the B + tree is the primary key of the data file, and the index is dense. That is to say, in the leaf node, a key and pointer pair is set for the first record of the data file. The data file can be sorted by the primary key or not by the primary key. The data files are sorted by the primary key, the B + tree is a sparse index. In the leaf node, a key and pointer pair are set for each block of the data file. data files are not sorted by key properties, this attribute is the search key of the B + tree. In the leaf node, each attribute K in the data file has a key and pointer pair, the pointer executes the first of the records whose sorting key value is K.

② Organization of non-leaf nodes. The non-leaf nodes in the B + tree form a multi-level sparse index on the leaf nodes. Each non-leaf node has at least ceil (m/2) pointers and at most m pointers.

2)Insert and delete B + tree indexes

① When inserting new data into the database, you also need to insert the corresponding index key value to the database index. Then you need to insert a new key value to the B + tree. That is, the B-Tree Insertion Algorithm we mentioned above.

② When deleting data from the database, you also need to delete the corresponding index key value from the database index. You need to delete the key value from the B + tree. B-tree deletion algorithm


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.