High-Performance MySQL Evolution (7): Correct Use of Indexes

Last Update:2018-05-28 Source: Internet

Author: User

Tags percona

Database engines use indexes to speed up queries and apply many optimization strategies to them. Index usage nevertheless involves many details. If you ignore them, carefully built indexes may go unused and can even hurt performance. The following principles should be observed when using indexes.

1. Keep Columns Simple

A "simple column" means that a column used in a query condition is not combined with operators or passed to a function. Otherwise, the index on that column cannot be used.

The following are the two most common mistakes.
(1) SELECT word FROM dictionary WHERE id + 1 = 999;

The expression id + 1 should not be used. Write the condition as id = 998 instead.

(2) SELECT word FROM dictionary WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(id) <= 10;

The function TO_DAYS(id) should not be applied to the column.
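The effect of wrapping a column in an expression can be observed in a query plan. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained), and the table contents are hypothetical. SQLite's EXPLAIN QUERY PLAN plays the role of MySQL's EXPLAIN here. The two planners differ in detail, but the principle shown should carry over.

```python
import sqlite3

def plan(conn, sql):
    """Return the plan steps SQLite reports for a query (the 'detail' column)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dictionary (id INTEGER PRIMARY KEY, word TEXT)")
# Hypothetical sample data, only there to make the queries return something.
conn.executemany(
    "INSERT INTO dictionary (id, word) VALUES (?, ?)",
    [(i, "word%d" % i) for i in range(1, 2001)],
)

# Column wrapped in an expression: the planner cannot use the key lookup.
bad_plan = plan(conn, "SELECT word FROM dictionary WHERE id + 1 = 999")
# Same condition with the column left "simple": a direct key lookup.
good_plan = plan(conn, "SELECT word FROM dictionary WHERE id = 998")

print(bad_plan)   # expected to describe a full table scan
print(good_plan)  # expected to describe a search on the primary key
```

Both queries return the same row. Only the access path differs.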

2. Choose the Prefix Length of a "Prefix Index" Correctly

In some cases, you need to index a long string. Indexing the whole value is slow and inefficient because the index occupies a large amount of space. Instead, you can use only the first N characters of the field as the index value. This strategy is called a "prefix index". For example, to index BLOB/TEXT fields in MySQL you must use a prefix index, because MySQL cannot index the full length of these types.

Using a prefix index raises the question of selectivity. Consider a table whose long field holds the values listed after this paragraph. If the whole field is used as the index value, the index is large and inefficient, but each lookup ends with exactly one matching record. In this case, the selectivity is 1, which is the highest possible. If you use only the first four characters as the prefix, the index loses its value: every record has the same prefix (ABCD), so any lookup matches all index entries and, no matter how it is optimized, ends up scanning the whole table. The choice of prefix length is therefore very important.

ABCDEFGHIJKDDD8

ABCDEFFSKDJKJKD7

ABCDKJH65654654K

ABCDEFGHIJKKJKJG

ABCDEFGHIJKFFFFEJ

You can determine the prefix length with the following method, illustrated with an example borrowed from another book. In this example, city is the field to be indexed.

2.1 First, determine the selectivity of the full column:

SELECT COUNT(DISTINCT city)/COUNT(*) FROM city_demo;

Assume the result is 0.0312.

2.2 Calculate the selectivity of several candidate prefix lengths:

SELECT COUNT(DISTINCT LEFT(city, 3))/COUNT(*) AS sel3,
COUNT(DISTINCT LEFT(city, 4))/COUNT(*) AS sel4,
COUNT(DISTINCT LEFT(city, 5))/COUNT(*) AS sel5,
COUNT(DISTINCT LEFT(city, 6))/COUNT(*) AS sel6,
COUNT(DISTINCT LEFT(city, 7))/COUNT(*) AS sel7
FROM city_demo;

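+--------+--------+--------+--------+--------+
| sel3   | sel4   | sel5   | sel6   | sel7   |
+--------+--------+--------+--------+--------+
| 0.0239 | 0.0293 | 0.0305 | 0.0309 | 0.0310 |
+--------+--------+--------+--------+--------+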

2.3 Select the length whose selectivity is closest to that of the full column. From the comparison above, 7 should be selected as the prefix length.

2.4 Then create the prefix index:

ALTER TABLE sakila.city_demo ADD KEY (city(7));

A final note on the disadvantage of prefix indexes: MySQL cannot use them for ORDER BY or GROUP BY, nor as covering indexes (discussed later).
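The procedure in steps 2.1 to 2.3 can be sketched in a few lines of Python, using the five sample values listed earlier in this section. This is a rough illustration, not a tuning tool: the function names and the 0.95 threshold are assumptions introduced here, not something the original method prescribes.

```python
def prefix_selectivity(values, length):
    """Distinct prefixes of the given length divided by the number of rows."""
    return len({v[:length] for v in values}) / len(values)

def shortest_good_prefix(values, ratio=0.95):
    """Shortest prefix length whose selectivity reaches `ratio` times the
    selectivity of the full values. `ratio` is a hypothetical threshold."""
    full = len(set(values)) / len(values)
    longest = max(len(v) for v in values)
    for length in range(1, longest + 1):
        if prefix_selectivity(values, length) >= ratio * full:
            return length
    return longest  # not reached for ratio <= 1; kept as a safe fallback

# The sample field values from this section.
rows = [
    "ABCDEFGHIJKDDD8",
    "ABCDEFFSKDJKJKD7",
    "ABCDKJH65654654K",
    "ABCDEFGHIJKKJKJG",
    "ABCDEFGHIJKFFFFEJ",
]

print(prefix_selectivity(rows, 4))   # every row starts with ABCD
print(prefix_selectivity(rows, 11))  # three rows still share a prefix
print(shortest_good_prefix(rows))    # first length that separates all rows
```

On a real table the same ratios would come from the COUNT(DISTINCT LEFT(...)) queries shown above rather than from an in-memory list.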

3. Order of Fields in Composite Indexes

A B-Tree composite index is matched in the order in which its fields are stored. If the leading fields filter out more records, fewer records remain for the later conditions, and the lookup is more efficient. The composite index below serves as a simple illustration.

NAME

AREA

If you place NAME first, you may only need to compare 10 records when AREA is evaluated. If you place AREA first, there may be 1000 records left to compare when NAME is evaluated. We can therefore infer that NAME should be the first column of the index. Because this rule follows from how a B-Tree matches keys in order, it does not apply to hash indexes.

In the small example above, the selectivity of each field determines the order of the index fields. Expressed in SQL:

SELECT COUNT(DISTINCT name)/COUNT(*) AS name_selectivity,
COUNT(DISTINCT area)/COUNT(*) AS area_selectivity,
COUNT(*)
FROM student\G

*************************** 1. row ***************************
name_selectivity: 0.0373
area_selectivity: 0.0001
COUNT(*): 16049

Conclusion: place the fields with the higher selectivity first in the composite index.
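As a small illustration of that conclusion, the sketch below computes per-column selectivity and sorts the columns accordingly. It runs against an in-memory SQLite table through Python's sqlite3 module (a stand-in for MySQL, assumed here for self-containment), and the generated student rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, area TEXT)")
# Hypothetical data: 50 distinct names and 4 distinct areas over 200 rows.
rows = [("name%d" % (i % 50), "area%d" % (i % 4)) for i in range(200)]
conn.executemany("INSERT INTO student VALUES (?, ?)", rows)

def selectivity(conn, column):
    """Distinct values of `column` divided by the row count."""
    # `column` comes from the fixed tuple below, never from user input.
    sql = "SELECT COUNT(DISTINCT %s) * 1.0 / COUNT(*) FROM student" % column
    return conn.execute(sql).fetchone()[0]

scores = {col: selectivity(conn, col) for col in ("name", "area")}
# Most selective column first, as the conclusion above suggests.
index_order = sorted(scores, key=scores.get, reverse=True)

print(scores)
print(index_order)
```

Selectivity is only a starting heuristic. The actual queries run against the table also influence the best column order.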

4. Clustered Indexes

The first thing to note is that a "clustered index" is not a separate index type but a way of storing data. The implementation details depend closely on the database engine. The basic idea is to "store the index and the data rows in one data structure, so that rows with adjacent key values are stored together". If a query can be answered entirely from the index, the data rows do not need to be fetched separately.

In Oracle, you can specify which index is the clustered index. In current MySQL versions, no storage engine lets you designate an arbitrary index as the clustered index. InnoDB uses the primary key as the clustered index by default. If no primary key is defined, it uses a "non-null and unique" index instead. If no such index exists either, InnoDB generates a hidden primary key to serve as the clustered index.

In this storage mode, data rows with adjacent key values are stored together. We therefore recommend inserting rows in increasing key order.

A random primary key such as a UUID causes a large amount of random I/O, and insertion becomes very inefficient.

5. Covering Indexes

A covering index is an index whose fields include all the fields a query needs, so the query can be answered from the index alone.

For example, suppose the student table has the composite index (name, area).

Then the query SELECT name, area FROM student WHERE name = 'Eric' is covered by the index. From the execution plan perspective, if the Extra column of an EXPLAIN result contains "Using index", the query is covered by an index.

Generally, when you use an index to look up a record, the engine first finds the index node and then fetches the record through the record pointer stored in that node. This amounts to two lookups, and fetching the record itself is often relatively slow. Because a covering index already contains the field values to be returned, it avoids the access to the data records and greatly improves query efficiency.
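The difference between a covered and a non-covered query can be seen in a query plan. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption for self-containment). SQLite reports "COVERING INDEX" where MySQL's EXPLAIN would show "Using index" in the Extra column. The age column and the index name are hypothetical additions for the illustration.

```python
import sqlite3

def plan(conn, sql):
    """Return the plan steps SQLite reports for a query (the 'detail' column)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, area TEXT, age INTEGER)"
)
conn.execute("CREATE INDEX idx_name_area ON student (name, area)")

# Every selected column is in the index: no access to the table rows needed.
covered = plan(conn, "SELECT name, area FROM student WHERE name = 'Eric'")
# `age` is not in the index: each match requires a second lookup in the table.
not_covered = plan(conn, "SELECT name, area, age FROM student WHERE name = 'Eric'")

print(covered)
print(not_covered)
```

Both queries use the index to find matching rows. Only the first can return its result without touching the table.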

Note that a covering index relies on the index storing the actual values of the indexed fields. Hash, spatial, and full-text indexes do not store these values, so they cannot serve as covering indexes.

In addition to the points above, pay attention to the MySQL version. Later MySQL versions provide better support for this feature, so try to use the latest version.

6. Index-Order Scans and Sorting

MySQL can produce sorted results in two ways: by scanning an index in order (type is index in EXPLAIN) or by performing a sort operation.

Combining an index-order scan with a covering index can speed up sorting considerably. If the index does not contain all the columns to be queried, each record must be fetched through its record pointer, which may generate a large amount of random I/O and reduce query efficiency. When designing indexes, you must therefore consider both the query conditions and the sort operations.

To make indexes serve sorting well, note the following points when writing SQL statements:

(1) Use a covering index.

(2) The column order of the index matches that of ORDER BY, and all columns are sorted in the same direction.

(3) When multiple tables are joined, all ORDER BY columns belong to the first table.

(4) ORDER BY, like WHERE, must follow the "leftmost prefix" principle.

(5) If a column in the WHERE or JOIN clause is fixed to a constant, it can fill the gap in rule (4).

For example, with INDEX (Name, Age, ID), the query ... WHERE name = 'Eric' ORDER BY age, id can still be sorted by an index-order scan.
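Rules (4) and (5) can be checked against a query plan. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption for self-containment). SQLite reports a "TEMP B-TREE" step when it has to sort explicitly, roughly corresponding to "Using filesort" in MySQL's EXPLAIN. The table and index names are hypothetical.

```python
import sqlite3

def plan(conn, sql):
    """Return the plan steps SQLite reports for a query (the 'detail' column)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def needs_sort(conn, sql):
    """True if the plan contains an explicit sort step."""
    return any("TEMP B-TREE" in step for step in plan(conn, sql))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, id INTEGER)")
conn.execute("CREATE INDEX idx_name_age_id ON people (name, age, id)")

# name is fixed to a constant, so (age, id) continues the index order.
follows_index = needs_sort(
    conn, "SELECT name, age, id FROM people WHERE name = 'Eric' ORDER BY age, id"
)
# Skipping age breaks the leftmost-prefix rule, so a separate sort is needed.
skips_column = needs_sort(
    conn, "SELECT name, age, id FROM people WHERE name = 'Eric' ORDER BY id"
)

print(follows_index)
print(skips_column)
```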

7. Lightweight Indexing

MySQL allows you to create multiple indexes on the same column. It has to maintain each duplicate index separately, and the query optimizer also spends time considering them, which can reduce overall performance.

Therefore, we should try to keep indexes "light".

· Duplicate Indexes

Definition: indexes of the same type created on the same columns in the same order.

For example, the following statement creates three indexes on ID. Because the PRIMARY KEY already indexes ID uniquely, the UNIQUE and INDEX definitions duplicate it.

CREATE TABLE test (ID INT NOT NULL PRIMARY KEY, UNIQUE(ID), INDEX(ID));

· Redundant Indexes

Here, "redundant" means that multiple indexes of the same type overlap in function.

For example, if an index (A, B) exists and an index (A) is then created, (A) is redundant because it is a leftmost prefix of (A, B). An index (B), however, is not redundant.

Redundant indexes generally appear when an index is extended. For example, an index (A) exists and someone creates a new index (A, B). At this point, (A) becomes redundant and should be dropped. Alternatively, the existing index (A) should be extended to (A, B) instead of creating a new one.

Sometimes a redundant index is still worth keeping. For example, extending an existing index with a very long string field enlarges the index and slows down the queries that relied on the original, smaller index. In such cases, keeping the original index alongside the extended one helps performance.

You can use the data in INFORMATION_SCHEMA to determine whether an index is duplicate or redundant, or use a specialized tool such as Percona Toolkit's pt-duplicate-key-checker (http://www.percona.com/doc/percona-toolkit/2.1/pt-duplicate-key-checker.html).
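The leftmost-prefix check behind this kind of detection can be sketched as follows. It is a simplified illustration, not a replacement for pt-duplicate-key-checker: the function name and the input format are assumptions, and the sketch ignores index type and uniqueness, both of which matter before actually dropping anything. In practice the column lists might be collected from INFORMATION_SCHEMA.STATISTICS.

```python
def find_redundant(indexes):
    """Find indexes whose columns are a leftmost prefix of another index.

    `indexes` maps an index name to its ordered column names.
    Returns (redundant_index, covering_index) pairs. Two identical
    definitions are reported once, with the later name as the redundant one.
    """
    pairs = []
    names = sorted(indexes)
    for a in names:
        for b in names:
            if a == b:
                continue
            cols_a, cols_b = tuple(indexes[a]), tuple(indexes[b])
            is_prefix = cols_b[:len(cols_a)] == cols_a
            if is_prefix and (len(cols_a) < len(cols_b) or a > b):
                pairs.append((a, b))
    return pairs

# The example from the text: (A) is redundant next to (A, B); (B) is not.
example = {
    "idx_a": ("A",),
    "idx_a_b": ("A", "B"),
    "idx_b": ("B",),
}
redundant = find_redundant(example)
print(redundant)
```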

· Unused Indexes

As in the preceding two cases, a large number of unused indexes in the system also hurts efficiency. You can use the following two methods to determine which indexes are not used.

(1) In Percona Server or MariaDB, enable the userstat server variable, let the server run normally for a period of time, and then query the INFORMATION_SCHEMA.INDEX_STATISTICS table.

(2) Use a specialized tool such as Percona Toolkit's pt-index-usage. Besides identifying unused indexes, this tool also lets you inspect query execution plans (http://www.percona.com/doc/percona-toolkit/2.1/pt-index-usage.html).

Note that an unused index is not necessarily a useless one. For example, an index that enforces a uniqueness constraint may never be used by a query, yet it prevents duplicate data. Be careful when handling such indexes.

This article is an English version of an article originally written in Chinese on aliyun.com.