MySQL's high performance index

Last Update:2018-05-09 Source: Internet

Author: User

Tags create index mysql query


When the amount of data in a database reaches a certain magnitude, a full table scan for every query becomes very inefficient, so a common optimization is to create the necessary indexes. This raises several questions:

What is an index?
How is an index implemented?
What is the difference between the clustered and nonclustered indexes that we often talk about?
How do I create and use an index?

I. Index introduction

MySQL's official definition of an index is: an index is a data structure that helps MySQL retrieve data efficiently. In short, an index is a data structure.

1. The structure of several trees

a. B+ tree: a balanced multiway search tree designed for disks and other storage devices. All records are stored in the leaf nodes in key order, and the leaf nodes are linked to one another by pointers.
b. Binary tree: the rule of the tree is that a parent node is greater than its left child node and less than its right child node (that is, a binary search tree).
c. Balanced binary tree: first of all a binary tree, but the heights of the left and right subtrees of any node differ by no more than 1.
d. B-tree: a balanced multiway search tree that also requires every leaf node to be the same distance from the root node.

So what is the difference between a B-tree and a B+ tree?

A B+ tree leaf node can contain a pointer to the next leaf node
In a B+ tree, non-leaf nodes hold only copies of the key values; the key values together with the records are stored in the leaf nodes
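The two properties above can be sketched in a few lines of Python. This is a toy model rather than InnoDB's actual page format: the `Leaf` class, `build_leaves`, and `range_scan` are names invented for illustration, and a single list of separator keys stands in for all the non-leaf levels.

```python
from bisect import bisect_left

class Leaf:
    """A leaf page: sorted keys, their records, and a pointer to the next leaf."""
    def __init__(self, keys, records):
        self.keys = keys
        self.records = records
        self.next = None

def build_leaves(rows, page_size):
    """Pack (key, record) rows into linked leaf pages in key order."""
    rows = sorted(rows, key=lambda kr: kr[0])
    leaves = []
    for i in range(0, len(rows), page_size):
        chunk = rows[i:i + page_size]
        leaves.append(Leaf([k for k, _ in chunk], [r for _, r in chunk]))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right  # leaf-to-leaf pointer
    return leaves

def range_scan(leaves, low, high):
    """Return the records with low <= key <= high."""
    if not leaves:
        return []
    # Non-leaf level: only copies of keys (here, the first key of each leaf).
    separators = [leaf.keys[0] for leaf in leaves]
    # Descend once to a leaf that can contain `low` ...
    leaf = leaves[max(bisect_left(separators, low) - 1, 0)]
    result = []
    # ... then follow the leaf pointers instead of going back to the root.
    while leaf is not None:
        for key, record in zip(leaf.keys, leaf.records):
            if key > high:
                return result
            if key >= low:
                result.append(record)
        leaf = leaf.next
    return result
```

Because the records live only in the leaves and the leaves are chained, a range query needs one descent followed by a sequential walk.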

2. The B+ tree in the InnoDB engine

MySQL's InnoDB engine uses a B+ tree in which only the leaf nodes store the row data. This has the following benefits:

Because non-leaf nodes store only keys, each node holds more entries and has a higher fan-out (which can be understood as the number of child nodes under each node), so the height of the tree is low. The height of the tree determines the number of disk I/Os, which affects database performance; in general, the number of I/Os for a lookup equals the height of the tree
For composite indexes, the B+ tree orders entries by the index columns from left to right, so random I/O can be converted to sequential I/O to improve I/O efficiency, and the sorting needs of ORDER BY and GROUP BY can be served by the index order
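The relationship between fan-out and height is simple arithmetic. The sketch below assumes an idealized tree in which every node holds exactly `fanout` entries; `tree_height` is a hypothetical helper, and the figures it produces are illustrative rather than measurements of InnoDB.

```python
def tree_height(num_records, fanout):
    """Smallest number of levels such that fanout ** levels >= num_records."""
    if num_records < 1 or fanout < 2:
        raise ValueError("need num_records >= 1 and fanout >= 2")
    levels, capacity = 1, fanout
    while capacity < num_records:
        capacity *= fanout
        levels += 1
    return levels

# Compare a binary tree (fan-out 2) with wider nodes for ten million records.
for fanout in (2, 100, 1000):
    print(f"fan-out {fanout}: {tree_height(10_000_000, fanout)} levels")
```

Under these assumptions a binary tree needs 24 levels for ten million records, while a fan-out of 1000 needs only 3, which is why wide nodes mean far fewer disk reads per lookup.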

3. Hash index

Compared with a B-tree, a hash index does not need to traverse from the root node to a leaf node: it locates the position in a single step, so lookups are more efficient. But the shortcomings are obvious:

It can only satisfy "=", "IN" and "<=>" queries; range queries cannot use it

Because lookups go through the hash value, only exact matches are possible. Hash values follow no pattern and do not preserve the order of the original keys, so range queries cannot be served.

It cannot be used for sorting

The reason is the same as above.

Partial index matching is not supported

The hash value is calculated from all of the indexed columns together; if one or more of them is missing, the hash value cannot be calculated.

Hash collision
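The limitations listed above can be seen in a toy model. The `HashIndex` class below is an illustration built on a Python dict, not MySQL's implementation; the column names in the usage are made up.

```python
class HashIndex:
    """Toy hash index over a fixed tuple of columns."""

    def __init__(self, rows, columns):
        self.columns = tuple(columns)
        self.buckets = {}
        for row in rows:
            key = tuple(row[c] for c in self.columns)
            # The bucket is chosen from the hash of ALL indexed columns.
            self.buckets.setdefault(hash(key), []).append(row)

    def lookup(self, **values):
        """Equality lookup; every indexed column must be supplied."""
        if set(values) != set(self.columns):
            raise ValueError("a hash lookup needs all indexed columns")
        key = tuple(values[c] for c in self.columns)
        # Different keys may collide in one bucket, so re-check each row.
        return [row for row in self.buckets.get(hash(key), [])
                if all(row[c] == values[c] for c in self.columns)]


people = [{"first": "Ann", "last": "Lee"}, {"first": "Bob", "last": "Ray"}]
index = HashIndex(people, ("first", "last"))
print(index.lookup(first="Ann", last="Lee"))
```

An exact match is a single bucket probe, but a lookup on `first` alone raises an error because no hash can be computed, and a range or sort would have to visit every bucket since bucket order is unrelated to key order.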

4. Clustered index vs. nonclustered index

A. Clustered index

In InnoDB the data file itself is the index file. The leaf nodes of the B+ tree hold the row data itself, keyed by the primary key, and the non-leaf nodes store <key, address> pairs, where address is the address of a node in the next level down.

[Figure: clustered index structure diagram]

B. Nonclustered index

In a nonclustered index, the data on the leaf nodes is the primary key (that is, the key of the clustered index, which is why the clustered index key should not be too long). Why store the primary key rather than the address of the record? The reason is simple: the address of a record is not guaranteed to stay the same, but its primary key is.

[Figure: nonclustered index structure diagram]

From the structure of the nonclustered index, we can see the lookup process in this scenario:

First, the nonclustered index is used to locate the corresponding leaf node and find the primary key.
Then, using that primary key, the clustered index is searched to locate the corresponding leaf node and get the row data.
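The two steps above can be sketched as follows. This is a simplified model under stated assumptions: a dict stands in for the clustered B+ tree, a sorted list of (name, primary key) pairs stands in for the nonclustered index, and the table contents are invented.

```python
from bisect import bisect_left

# Clustered index: primary key -> full row.
clustered = {
    1: {"id": 1, "name": "carol", "age": 31},
    2: {"id": 2, "name": "alice", "age": 25},
    3: {"id": 3, "name": "bob", "age": 40},
}

# Nonclustered index on `name`: leaf entries hold (name, primary key),
# not the address of the row.
name_index = sorted((row["name"], pk) for pk, row in clustered.items())

def find_by_name(name):
    """Two-step lookup: nonclustered index -> primary key -> clustered index."""
    rows = []
    i = bisect_left(name_index, (name,))
    while i < len(name_index) and name_index[i][0] == name:
        pk = name_index[i][1]        # step 1: the leaf entry yields the primary key
        rows.append(clustered[pk])   # step 2: fetch the row through the clustered index
        i += 1
    return rows

print(find_by_name("bob"))
```

Because the nonclustered index stores only primary keys, moving a row inside the clustered index does not invalidate it.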

5. Advantages of the Index

Avoid a full table scan (when no index can be used, rows can only be matched one by one; when an index is used, the row can be located through the B-tree)
Using an index can help the server avoid sorting or temporary tables (the pointers between leaf nodes effectively support range queries; in addition, the leaf nodes themselves are sorted by key)
Indexes convert random I/O into sequential I/O

Although indexes greatly improve query speed, they also slow down updates to tables, such as INSERT, UPDATE, and DELETE, because when updating a table MySQL must write not only the data but also the index.
Indexes are stored in index files that consume disk space. In general this is not too serious, but if you create multiple composite indexes on a large table, the index files grow quickly.

4. Precautions

Avoid NULL values in indexed columns where possible (MySQL can index NULLs, but nullable columns are harder to optimize)
Use short indexes
Sorting on indexed columns

A MySQL query generally uses only one index per table, so if an index is already used for the WHERE clause, the columns in ORDER BY will not use an index. So do not sort when the database's default ordering is acceptable, and try not to sort on multiple columns; if you must, create a composite index on those columns.

LIKE statement operations

Using LIKE is generally discouraged; if it must be used, how it is used matters. LIKE "%aaa%" cannot use the index, while LIKE "aaa%" can.
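Why the position of the wildcard matters can be illustrated with a sorted list standing in for the index. This is a sketch of the principle only; `like_prefix` and `like_infix` are hypothetical helpers, not MySQL functions.

```python
from bisect import bisect_left

def like_prefix(sorted_keys, prefix):
    """LIKE 'prefix%': binary-search to the first match, then read its neighbours."""
    i = bisect_left(sorted_keys, prefix)
    matches = []
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        matches.append(sorted_keys[i])
        i += 1
    return matches

def like_infix(sorted_keys, fragment):
    """LIKE '%fragment%': the ordering does not help, so every key is examined."""
    return [key for key in sorted_keys if fragment in key]

keys = sorted(["aaab", "aab", "baaa", "caa", "aaa"])
print(like_prefix(keys, "aaa"))
print(like_infix(keys, "aaa"))
```

All keys sharing a prefix sit next to each other in sorted order, so the prefix search touches only the matching run; keys containing the fragment in the middle are scattered, which forces a full pass.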

Do not perform calculations on columns

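SELECT * FROM users WHERE YEAR(adddate) < 2007;

The function has to be evaluated for every row, so an index on adddate cannot be used for this condition.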
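The difference can be modelled with a sorted list of dates standing in for an index on adddate. The sample dates are invented, and the range form corresponds to rewriting the condition as adddate < '2007-01-01', which is an assumption about the intended fix rather than something stated in the original.

```python
from bisect import bisect_left
from datetime import date

# Stand-in for an index on adddate: values kept in sorted order.
adddate_index = sorted([date(2005, 3, 1), date(2006, 12, 31),
                        date(2007, 1, 1), date(2009, 7, 4)])

def by_year_function(index, year):
    """WHERE YEAR(adddate) < year: the function runs on every entry."""
    return [d for d in index if d.year < year]

def by_range(index, year):
    """WHERE adddate < 'year-01-01': one binary search finds the cut-off."""
    return index[:bisect_left(index, date(year, 1, 1))]

print(by_year_function(adddate_index, 2007) == by_range(adddate_index, 2007))
```

Both forms return the same rows, but only the second compares the stored column value directly and can therefore exploit the index order.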

Try not to use NOT IN and <> operations

5. SQL usage policy

A. Use one SQL statement instead of several

It is generally recommended to use one SQL statement instead of multiple SQL queries. Of course, if the statement executes inefficiently, or if an operation such as DELETE would lock the table, it can be split into several statements to avoid blocking other SQL.

B. Decompose join queries

Perform the join in the application where possible, and execute small, simple SQL statements. This has the following advantages:

Decomposed SQL is simple, which makes it easier to benefit from MySQL's cache
Running decomposed SQL reduces lock contention
Better scalability and maintainability (the SQL is simple)
Join queries in MySQL use the nested-loop join algorithm, whereas the application can join the data with a HashMap, which is more efficient
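An application-side join of this kind might look like the sketch below, where a Python dict plays the role of the HashMap. The two lists stand for the results of two simple single-table queries; the `users` and `orders` shapes are hypothetical.

```python
from collections import defaultdict

def hash_join(users, orders):
    """Join orders to users on user_id with a dict built in one pass."""
    orders_by_user = defaultdict(list)
    for order in orders:                       # build phase: one pass over orders
        orders_by_user[order["user_id"]].append(order)
    return [(user["name"], order["item"])      # probe phase: one lookup per user
            for user in users
            for order in orders_by_user.get(user["id"], [])]

users = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
orders = [{"user_id": 1, "item": "pen"},
          {"user_id": 1, "item": "ink"},
          {"user_id": 3, "item": "cup"}]
print(hash_join(users, orders))
```

Each side is read once, so the work grows with the sum of the two result sizes rather than their product, as it would in a naive nested loop without an index.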

C. Count

COUNT(*) counts the number of rows
COUNT(column name) counts the number of rows in which that column is NOT NULL
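The difference between the two forms can be mimicked in Python, with `None` standing in for SQL NULL. The sample rows and the helper names are made up for illustration.

```python
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},            # NULL in SQL terms
    {"id": 3, "email": "c@example.com"},
]

def count_star(rows):
    """COUNT(*): every row counts."""
    return len(rows)

def count_column(rows, column):
    """COUNT(column): only rows where the column is not NULL count."""
    return sum(1 for row in rows if row[column] is not None)

print(count_star(rows), count_column(rows, "email"))
```

Here the two counts differ by exactly the number of rows whose email is NULL.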

D. Limit

LIMIT offset, size is used for paging queries: MySQL reads offset + size rows and returns only the last size rows.
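The cost of a large offset can be sketched with a list standing in for the ordered result set. The `limit` helper is a hypothetical model that also reports how many rows had to be produced; it is not how MySQL is implemented internally.

```python
def limit(rows, offset, size):
    """LIMIT offset, size: produce offset + size rows, keep only the last `size`."""
    examined = rows[:offset + size]    # rows that must be read
    return examined[offset:], len(examined)

rows = list(range(1, 5001))            # pretend these are 5000 matching rows
page, examined = limit(rows, 1000, 20)
print(len(page), examined)
```

The page contains 20 rows, but 1020 rows were read to build it, and the wasted work grows with the offset.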

For example, LIMIT 1000, 20 finds the 1020 rows that satisfy the conditions and then returns only the last 20, so try to avoid paging queries with large offsets.

E. Union

Put the WHERE, ORDER BY, and LIMIT restrictions into each subquery to improve efficiency. In addition, unless duplicates must be removed, use UNION ALL wherever possible, because plain UNION adds DISTINCT to the temporary table that holds the subquery results and performs a uniqueness check on it, which is inefficient.

6. Useful MySQL queries

A. View the total index size (in GB)

SELECT CONCAT(ROUND(SUM(index_length)/(1024*1024*1024), 6), ' GB') AS 'Total Index Size'
FROM information_schema.TABLES
WHERE table_schema LIKE 'databaseName';

B. View the table space used by data (in GB)

SELECT CONCAT(ROUND(SUM(data_length)/(1024*1024*1024), 6), ' GB') AS 'Total Data Size'
FROM information_schema.TABLES
WHERE table_schema LIKE 'databaseName';

C. View information for all tables in a database

SELECT CONCAT(table_schema, '.', table_name) AS 'Table Name',
       table_rows AS 'Number of Rows',
       CONCAT(ROUND(data_length/(1024*1024*1024), 6), ' G') AS 'Data Size',
       CONCAT(ROUND(index_length/(1024*1024*1024), 6), ' G') AS 'Index Size',
       CONCAT(ROUND((data_length+index_length)/(1024*1024*1024), 6), ' G') AS 'Total'
FROM information_schema.TABLES
WHERE table_schema LIKE 'databaseName';
