A Brief Introduction to MySQL's B+ Tree Index and Index Optimization
MySQL's MyISAM and InnoDB engines both use B+ tree indexes by default (displayed as "BTREE" when you inspect an index, e.g. with SHOW INDEX). This article discusses two questions:
- Why do mainstream databases such as MySQL choose the B+ tree as their index structure?
- How can the common MySQL index optimization ideas be understood in terms of that index structure?
Why can't indexes be loaded entirely into memory?
The choice of index structure rests on the following fact: when the data volume is large, an index cannot be loaded entirely into memory.
Why not? Suppose the index is organized as a tree structure. A rough estimate:
- Assume 10 million (10^7) data rows with a unique index whose entries occupy about 20 B each: the leaf nodes then occupy about 200 MB, and the entire tree only slightly more.
- Assume each data row occupies about 200 B: the data then occupies about 2 GB in total.
Now suppose the index is kept entirely in memory. That means every 2 GB of data stored on the physical disk costs about 200 MB of memory, an index-to-data ratio of about 1/10 (worked out below). Is 1/10 a large ratio? Physical disks are much cheaper than memory. Take a server with 16 GB of memory and a 1 TB disk as an example: filling the 1 TB disk would require at least 100 GB of memory for the indexes, far more than 16 GB.
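Putting the assumed numbers together (10^7 rows, roughly 20 B per index entry, 200 B per data row; the 20 B entry size is inferred from the ~200 MB leaf total above):

```latex
\frac{\text{index}}{\text{data}} \approx \frac{10^7 \times 20\,\text{B}}{10^7 \times 200\,\text{B}}
= \frac{200\,\text{MB}}{2\,\text{GB}} = \frac{1}{10},
\qquad
1\,\text{TB} \times \tfrac{1}{10} = 100\,\text{GB} \gg 16\,\text{GB}
```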
Considering that a table may have multiple indexes, composite indexes, and smaller data rows, the actual ratio is usually greater than 1/10, and in some cases it can reach 1/3. Under an index-based storage architecture, the index-to-data ratio is too high for indexes to be loaded entirely into memory.
Problems with other structures
Because the index cannot fit in memory, it must rely on disk (or SSD) storage. Memory reads and writes are tens of thousands of times faster than disk (the exact factor depends on the implementation), so the core question becomes: how do we reduce the number of disk reads and writes?
First, set the paging mechanism aside and suppose every read or write goes directly to disk. Then:
- Linear structures: O(n) reads/writes on average
- Binary search tree (BST): O(log2(n)) reads/writes on average; if the tree is unbalanced, O(n) in the worst case
- Self-balancing binary search tree (AVL): a BST plus a self-balancing algorithm, O(log2(n)) reads/writes in the worst case
- Red-black tree (RBT): another self-balancing search tree, O(log2(n)) reads/writes in the worst case
BST, AVL, and RBT optimize the number of reads and writes from O(n) down to O(log2(n)); AVL and RBT add self-balancing on top of BST, bounding the worst case at O(log2(n)).
With an auto-incrementing primary key, the keys themselves are ordered, and the number of reads and writes in a tree structure is bounded by the tree height: the lower the tree, the fewer the reads and writes, while self-balancing keeps the structure stable. For further optimization, we can introduce the B-tree and the B+ tree.
What problems does the B-tree solve?
Many articles call the B-tree the "B minus tree", probably a misreading of its English name "B-Tree" (worse, some call it a binary tree or binary search tree). The confusion is strongest when it is mentioned alongside the B+ tree: if there is a B+ (plus) tree, surely "B-Tree" must be a B- (minus) tree? In fact, the English name of the B+ tree is "B+-Tree".
Setting its maintenance operations aside, the B-tree is just an "m-way search tree" (m being the maximum number of subtrees per node), with time complexity O(log(m)(n)). What makes the B-tree special is that it is designed with efficient, simple maintenance operations that keep its depth between roughly log(ceil(m/2))(n) and log(m)(n), which greatly reduces the tree height.
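In symbols (n keys, maximum fan-out m, minimum fan-out ceil(m/2) guaranteed by the maintenance operations), the depth stays within roughly:

```latex
\log_{m}(n) \;\lesssim\; \mathrm{depth}(n) \;\lesssim\; \log_{\lceil m/2 \rceil}(n)
```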
To emphasize once more:
Do not obsess over time complexity. Unlike with purely in-memory algorithms, the number of disk I/Os is the bigger factor here. Readers can verify that the B-tree has the same time complexity as AVL; but because the B-tree has far fewer levels, it needs far fewer disk I/Os and therefore outperforms binary trees such as AVL in practice.
As in a binary search tree, each node stores multiple keys and subtrees, and the subtrees and keys are arranged in order.
The operating system's paging mechanism both extends storage and accelerates disk reads and writes. A page is usually 4 KB (equal to the size of a disk data block; see any analysis of inodes and blocks). The OS loads content from disk into memory one page at a time (amortizing the seek cost), and when a page is modified, the page is eventually written back to disk. Given these properties, we can make each node approximately the size of one page (which makes m very large), so that loading one page covers exactly one node, after which the next subtree can be selected. Viewed this way, AVL (or RBT) is equivalent to a B-tree of 1 key + 2 subtrees: logically adjacent nodes are physically far apart, so when a 4 KB page is read, most of the page holds data the lookup does not need.
If a key and a subtree pointer each occupy 4 B, a B-tree node occupies at most m * (4 + 4) = 8m B. With a 4 KB page, m = 4 * 1024 / 8 = 512, giving a 512-way B-tree in which every node has at least 512 / 2 = 256 subtrees. For 10 million rows, this works out (see below) to a depth of about 3, versus about 24 for a binary tree such as AVL, roughly 8 times shallower. Astonishing! A B-tree index really is that shallow!
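The worked comparison, under the assumptions above (4 B keys, 4 B pointers, 4 KB pages, 10^7 rows):

```latex
m = \frac{4 \times 1024}{4 + 4} = 512, \qquad
\log_{512/2}(10^7) \approx 2.9 \approx 3
\quad\text{vs.}\quad
\log_{2}(10^7) \approx 23.25 \approx 24
```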
Moreover, the B-tree is very friendly to the principle of locality. When keys are small (such as the 4 B auto-increment key above), the CPU cache can accelerate lookups even further, on top of the read-ahead provided by paging. Delightful~
What problems does the B+ tree solve?
Remaining problems with the B-tree
However, to serve as a database index, the B-tree still has some problems:
- Data rows cannot be located
- Range queries cannot be handled
Problem 1
A data-table record contains multiple fields; locating the primary key alone is not enough, the data row must be located too. There are three solutions:
- Store the data rows corresponding to the key (possibly multiple rows) directly in the node.
- Store data rows separately, and add a field to the node that records the location of the data row corresponding to the key.
- Change the comparison logic of keys and subtrees so that each subtree holds values greater than or equal to the previous key and less than the next key; every lookup then ends at a leaf node, and leaf nodes store the data rows themselves or their locations.
Solution 1 stores data rows directly in the node, which reduces the number of subtrees a page can hold; m shrinks and the tree height grows.
Solution 2 adds a field to each node entry; with a 4 B pointer, the new m = 4 * 1024 / 12 = 341.33 ≈ 341, and the maximum depth is log(341/2)(10^7) = 3.14 ≈ 4.
Solution 3 leaves m and the depth unchanged, but the lookup time complexity becomes a stable O(log(m)(n)), since every lookup descends to a leaf.
Solutions 2 and 3 are both worth considering.
Problem 2
Range queries are very frequent in real business, but the B-tree can only locate a single index position (which may correspond to multiple rows), making range queries hard to handle. There are two candidate solutions requiring only minor changes:
- Change nothing: at query time, locate the left and right boundaries first, then traverse (by DFS or BFS) the nodes between them.
- Building on solution 3 of Problem 1: since all data rows are stored in leaf nodes, the leaf nodes of the B-tree are themselves ordered, so a pointer can be added to each leaf node pointing to the next leaf node in primary-key order. At query time, locate the left boundary, then the right boundary, then traverse linearly from the left boundary to the right boundary.
At first glance solution 1 looks better than solution 2: the time complexity and constant factors are the same, and solution 1 requires no changes. But do not forget the principle of locality: regardless of whether nodes store data rows or data-row locations, solution 2 can still use pages and the cache to pre-read the next node's content, while solution 1 suffers from nodes that are logically adjacent but physically scattered.
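To make this concrete, here is a minimal sketch in MySQL terms (table and column names are made up). InnoDB clusters rows in a B+ tree on the primary key, so a range scan locates the left boundary once and then follows the leaf chain:

```sql
-- Hypothetical table; InnoDB stores rows in a B+ tree keyed on the primary key
CREATE TABLE orders (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
  amount DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Locate id = 100 in the tree once, then walk linked leaves up to id = 200
SELECT id, amount FROM orders WHERE id BETWEEN 100 AND 200;

-- EXPLAIN should report access type "range" on the PRIMARY key
EXPLAIN SELECT id, amount FROM orders WHERE id BETWEEN 100 AND 200;
```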
Deriving the B+ tree
To sum up, solution 2 of Problem 1 plus solution 1 of Problem 2 can be combined into one design (the index based on the B-tree), and solution 3 of Problem 1 plus solution 2 of Problem 2 can be combined into another (the index based on the B+ tree). In practice, some databases and file systems adopt the B-tree and others the B+ tree.
For reasons this monkey does not yet fully understand, mainstream databases, including MySQL, chose the B+ tree. That is, they made the major changes described above:
- Change the organization logic of keys and subtrees so that index lookups always land on leaf nodes.
- Link the leaf nodes in order, to make range queries convenient.
Insertion, deletion, and lookup in the B-tree and B+ tree
For the B-tree's insertion and deletion process, refer to section 6, "Insertion and deletion operations of the B-tree", of the article "From B-tree, B+ tree, and B* tree to R-tree". Insertion and deletion in the B+ tree are similar and are not repeated here.
MySQL index optimization
Given the properties of the B+ tree, the common MySQL index optimization ideas are easy to understand.
For now, we ignore the differences between storage engines.
Prefer an auto-increment key as the primary key
As analyzed above, with a 4 B auto-increment key as the index key, m can reach 512 and the height is only 3. The auto-increment key has two advantages:
First, an auto-increment key is usually an integer type such as int, so the key is compact, m can be very large, and the index takes little space. As an extreme counter-example, with a 50 B varchar key (length byte included), m = 4 * 1024 / 54 = 75.85 ≈ 76, and the maximum depth is log(76/2)(10^7) = 4.43 ≈ 5. On top of that, cache misses and string comparisons become much more costly. And as the key grows from 4 B to 50 B, the space occupied by the whole index tree grows alarmingly too (if secondary indexes use the primary key to locate data rows, the space growth is even worse).
Second, auto-increment means that inserts of new rows always land at the rightmost end of the index tree, so node splits are rare. In the ideal case the index tree stays close to "full": a fuller tree is both lower and merges nodes less frequently on deletion.
Optimization experience:
This monkey once used a varchar(100) column holding a containerId as the primary key; after 3 or 4 days the database was full, and the DBA expressed her contempt for me by email... After adding an auto-increment column as the primary key and turning containerId into a unique secondary index, the time and space savings were remarkable.
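A sketch of that schema change, with hypothetical table and column names:

```sql
-- Before: a 100-byte business key as the primary key,
-- so every index node carries long string keys
CREATE TABLE container_old (
  container_id VARCHAR(100) NOT NULL,
  detail       VARCHAR(255),
  PRIMARY KEY (container_id)
) ENGINE=InnoDB;

-- After: a compact auto-increment primary key; the business key
-- is demoted to a UNIQUE secondary index
CREATE TABLE container (
  id           INT UNSIGNED NOT NULL AUTO_INCREMENT,
  container_id VARCHAR(100) NOT NULL,
  detail       VARCHAR(255),
  PRIMARY KEY (id),
  UNIQUE KEY uk_container_id (container_id)
) ENGINE=InnoDB;
```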
Leftmost prefix matching
An index can be as simple as a single column (a) or as complex as multiple columns (a, b, c, d), i.e. a composite index. In a composite index, the key is also composed of multiple columns. An index can match keys by equality; once a range condition is encountered (>, <, between, or a left-anchored like), the columns after it cannot use the index and degrade to linear filtering. The order of columns therefore determines how many columns the index can match.
For example, with an index (a, b, c, d) and the query condition a = 1 and b = 2 and c > 3 and d = 4, the lookup matches a, b, and c in turn at each node but cannot match d. This is the leftmost prefix matching principle.
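A minimal sketch of the example (hypothetical table; EXPLAIN lets you verify which columns the index serves):

```sql
-- Hypothetical table with the composite index from the example
CREATE TABLE t (
  a INT, b INT, c INT, d INT,
  INDEX idx_abcd (a, b, c, d)
) ENGINE=InnoDB;

-- a and b match by equality and c by range; the index scan stops at c,
-- so d is checked row by row instead of through the index
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c > 3 AND d = 4;
```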
The order of = and in conditions is optimized automatically
There is no need to worry about the order of conditions such as = and in: MySQL automatically reorders them to match as many index columns as possible.
With an index (a, b, c, d), query conditions such as c > 3 and b = 2 and a = 1 and d < 4, or a = 1 and c > 3 and b = 2 and d < 4, are automatically rewritten by MySQL as a = 1 and b = 2 and c > 3 and d < 4, matching a, b, and c in turn.
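Continuing the hypothetical table t from above, both forms below should produce the same plan:

```sql
-- Written "out of order": the optimizer reorders the predicates...
EXPLAIN SELECT * FROM t WHERE c > 3 AND b = 2 AND a = 1 AND d < 4;

-- ...into the canonical order, so both statements use idx_abcd identically
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c > 3 AND d < 4;
```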
Index columns must not participate in calculations
Query conditions in which an index column participates in a calculation are unfriendly to the index (the index may not be usable at all), for example from_unixtime(create_time) = '2017-05-29'.
The reason is simple: how would the engine find the matching key within a node? A linear scan would have to recompute the expression for every key, which is far too costly; a binary search would have to understand the ordering implied by from_unixtime.
So index columns must not participate in calculations. The condition from_unixtime(create_time) = '2017-05-29' above should be written as create_time = unix_timestamp('2017-05-29').
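Side by side, on a hypothetical table with an indexed integer Unix-timestamp column:

```sql
-- Hypothetical table with an indexed Unix-timestamp column
CREATE TABLE events (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
  create_time INT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_create_time (create_time)
) ENGINE=InnoDB;

-- Bad: the function wraps the index column, so the B+ tree order is unusable
SELECT * FROM events WHERE from_unixtime(create_time) = '2017-05-29';

-- Good: the computation moves to the constant side; idx_create_time is usable
SELECT * FROM events WHERE create_time = unix_timestamp('2017-05-29');
```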
Extend an existing index rather than creating a new one
If index (a) already exists and you now want index (a, b), try to change index (a) into index (a, b) instead of adding a new index.
The cost of building a brand-new index is easy to imagine, whereas changing index (a) to index (a, b) lets MySQL upgrade the existing index directly.
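A sketch of the change (assuming t already has an index idx_a on column a; index names are made up):

```sql
-- Instead of adding a second index alongside idx_a,
-- replace idx_a (a) with the extended idx_ab (a, b)
ALTER TABLE t DROP INDEX idx_a, ADD INDEX idx_ab (a, b);
```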
No need to create indexes with prefix-inclusion relationships
If index (a, b) already exists, there is no need to create index (a), since (a) is a prefix of (a, b); however, index (b) may still be worth creating if needed.
Choose highly selective columns as indexes
This is easy to understand. Using gender as an index, for example, can only split 10 million rows of data into two parts (say 5 million male and 5 million female), so the index is nearly useless.
Selectivity is defined as count(distinct <col>) / count(*), the proportion of distinct values in a column; the larger, the better. The selectivity of a unique key is 1, while status or gender columns in large data sets have selectivity close to 0.
It is hard to give a universal threshold for this value. Generally, a column used in joins is expected to have selectivity above 0.1, i.e. each lookup scans about 10 rows on average.
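The formula can be evaluated directly; table and column names here are hypothetical:

```sql
-- Selectivity of a candidate index column: fraction of distinct values (0..1]
SELECT COUNT(DISTINCT gender) / COUNT(*) AS selectivity FROM person;
```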