PHP tutorial for several days did not bring you a knowledge of PHP, today make up Ah! This article is about: MySQL database indexing principle, hope to bring help to everyone!
The first part mainly discusses the mathematical basis of MySQL database index from the data structure and algorithm theory level.
The second part discusses the topics such as clustered index, nonclustered index, and overlay index, which is based on the schema of the index in INNODB data storage engine in MySQL database.
The third part discusses the strategy of high performance using indexes in MySQL.
I. Data structure and algorithm theory
INNODB Storage Engine Implementation index data structure is B + Tree, the following describes several data structures, step by step explain why to use B + Tree
1.1
The B + Tree index is constructed similar to a binary tree and quickly finds data based on key values. But the B + tree does not represent two forks, but rather represents balance. Note: The B + Tree index can find only the page where the data row is located. The database then reads the page into memory, then finds it in memory, and then the data is finally found.
The following is a binary lookup method: Arranging records in order (increment or decrement), in the search process using a jumping way to find, for example: 5, 10, 19, 21, 31, 37, 42, 48, 50, 52 This 10 number,:
You can find 48 with three times of search speed. If it is a sequential lookup, it will take 8 times. For the above 10 numbers, the average number of lookups for sequential lookups is 5.5 times, and the binary lookup method is 2.9 times, in the worst case, the number of sequential lookups is 10, and the number of binary lookups is 4. Binary find slots in page directory in InnoDB are stored in the order of the primary key, and the query for each specific record is searched by two points in page directory.
1.2
Binary search Tree
The number represents the key value for each node, and the key value of the Zuozi is always less than the key value followed by the binary lookup tree, and the key value of the right subtree is always greater than the key value followed. The key values are obtained through the middle order traversal: 2, 3, 5, 6, 7, 8.
The binary lookup tree has an average number of lookups of 2.3 times. But binary search trees can be arbitrarily constructed, such as constructs
But this is similar to the order lookup, so it refers to the idea of a balanced binary tree, the AVL tree.
1.3
Definition: Conform to the definition of binary search tree, and then must meet the height of the left and right two subtrees of any node with a maximum difference of 1.
Balanced binary Tree Although the search speed is very fast but the cost of maintaining a balanced binary tree is very large, it usually takes 1 or more left and right spins to get the balance of the tree after insertion or renewal.
1.4
Features of B + trees:
All records are in the leaf node, and are sequentially stored, each leaf node (page units) is a logical continuous storage, is a two-way circular linked list.
B + Tree Inserts you must ensure that the records in the post-insertion leaf node are still sorted, so there are three things to consider when inserting:
B + Tree Index in the database has a feature is its high fan out, so in the database, B + Tree height is generally 2-3 layers, that is, to find a key value of the row records, up to 2-3 io, and the general disk can do at least 100 io per second, 2-3 times means the query time is only 0.02-0.03 seconds.
Two, clustered index, non-clustered index
The difference between a clustered index and a nonclustered index is whether the page node holds a whole row of records
2.1 Clustered Index
The InnoDB Storage Engine table is an indexed organization table in which the data in the table is stored in the primary key order. The clustered index constructs a B + tree according to the primary key of each table, and the leaf node holds the row record data for the entire table, so that the leaf node of the clustered index becomes the data page. This attribute of the clustered index determines that the data in the indexed organization table is also part of the index. As well as the B + tree data structure, each data page is linked through a doubly linked list.
The actual data can only be sorted by a B + tree, so each table can have only one clustered index. In many cases, the query optimizer is very inclined to take a clustered index because the clustered index allows us to find the data directly on the leaf nodes of the index. In addition, because the logical order of the data is defined, the clustered index can be quickly accessed against the range-worthy query. The query optimizer can quickly discover that a range of data needs to be scanned. Note that the records in each page are also maintained by a doubly linked list.
2.2 Nonclustered indexes
Also called a secondary index, the page level does not contain all the data for the row. In addition to the key values, the page node contains a bookmark in the index at each page level that tells the InnoDB storage engine where to find the row data that corresponds to the index. Because the InnoDB Storage engine table is an indexed organization table, the secondary index bookmark for the InnoDB storage engine is the clustered index key for the corresponding row data. Is the relationship between a clustered index and a secondary index:
When looking for data through a secondary index, the InnoDB storage engine traverses the secondary index and obtains a primary key to the primary key index through a leaf-level pointer, and then finds a complete row record through the primary key index. For example: To find data in a secondary index tree with a height of 3, you need to traverse the secondary index 3 times to find the specified primary key, and if the clustered index tree has the same height of 3, you also need to do three lookups on the clustered index to find the page where the full row data resides. Therefore, 6 logical IO is required to access the final data page.
PHP Tutorial: MySQL Database Indexing principle