MySQL Index Performance Analysis Overview

Last Update:2017-12-07 Source: Internet

Author: User

Tags mysql index

The previous article, MySQL Index Design Summary, outlined several factors that affect index design, including filter factors, the width and size of index slices, and matching columns and filter columns. The latter part of that article introduced the design process and routine for the ideal three-star index from the book Database Index Design and Optimization. So far we have learned how to design an index for a single table, but we have not yet looked at how to estimate the time a query takes with a given index.

In this article, we introduce two methods from the book for analyzing index performance: the Basic Question method (BQ) and the Quick Upper-Bound Estimate (QUBE). They help us quickly analyze and estimate index performance and catch problems early.

Basic Question Method

When analyzing an existing SELECT query, even if time is short, we should at least evaluate it with the Basic Question method. The assessment is very simple: does an existing or planned index contain all the columns used in the WHERE clause? In other words, is there a half-width (semi-fat) index for the current query?

In the previous article, we introduced wide and narrow indexes. For a query whose only WHERE column is username, the narrow index (username) is what the book calls a half-width index: it contains all the columns in the WHERE clause. This index earns only one star for the query. It avoids the random I/O caused by unnecessary table lookups, but if its performance still does not meet requirements, you can add age to turn it into the wide index (username, age) and avoid the cost of table lookups altogether. For simple queries, the index (username, age) is actually a three-star index, but for queries that contain ORDER BY or are more complex, (username, age) may be only a two-star index.

At this point, if the index still does not meet performance requirements, consider redesigning it with the method described in the previous article, MySQL Index Design Summary.

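Although the Basic Question method can quickly resolve some index-related problems, it does not guarantee sufficient performance. For example, when the table has the index (city, username, age) and the predicate is WHERE username="draveness" AND age="21", the Basic Question method does not give the correct result: the index contains every column in the WHERE clause, so the check passes, yet the leading column city has no predicate, so username and age cannot serve as matching columns to narrow the index scan.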
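
The pitfall above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper names passes_basic_question and matching_prefix are hypothetical, only equality predicates are considered, and no real optimizer behaviour is modelled.

```python
def passes_basic_question(index_columns, where_columns):
    """Basic Question: does the index contain every column used in WHERE?"""
    return set(where_columns) <= set(index_columns)


def matching_prefix(index_columns, equality_columns):
    """Leading index columns that have an equality predicate.

    Simplification: matching stops at the first index column
    that has no equality predicate in the WHERE clause.
    """
    prefix = []
    for column in index_columns:
        if column not in equality_columns:
            break
        prefix.append(column)
    return prefix


# WHERE username = ... AND age = ...
where_columns = ["username", "age"]

candidates = [
    ("username",),
    ("username", "age"),
    ("city", "username", "age"),
]

for index in candidates:
    print(index,
          "BQ passed:", passes_basic_question(index, where_columns),
          "matching columns:", matching_prefix(index, where_columns))
```

The index (city, username, age) passes the Basic Question check but has an empty matching prefix, which is why a positive answer alone says little about how thin the scanned index slice will be.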

Quick Upper-Bound Estimate

The Basic Question method is very simple and lets us evaluate a query in the shortest possible time, but it does not accurately expose index-related performance problems. The Quick Upper-Bound Estimate is a more accurate and more complex method whose purpose is to expose slow access paths during program development. Its output is the local response time (LRT):

The local response time is the time a query takes on the database server. It excludes network latency and the communication time between tiers in a multi-tier environment, and includes only the time spent executing the query.

Response time

The local response time is the sum of service time and queuing time. A query may have to wait for the CPU and the disks on the database server, or wait to acquire a lock because other transactions are reading or writing the same data. The main component of the response time, however, is the disk service time.

In its calculation, QUBE ignores every queuing time except disk queuing time, which simplifies the whole evaluation. Disk service time mainly consists of synchronous reads, synchronous writes, and asynchronous reads.

After excluding the components above, we are left with a very simple estimation process. The estimate takes only three inputs, which are the main factors that affect the query: the number of random accesses (TR), the number of sequential accesses (TS), and the number of fetches (F):

LRT = TR × 10 ms + TS × 0.01 ms + F × 0.1 ms

Fetches do not need to be considered when comparing the effect of different indexes on the same query: whichever index is used, the query returns the same result set, so the fetch cost is identical.
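
As a minimal sketch, the estimate can be written as a small Python function. The unit costs (10 ms per random access, 0.01 ms per sequential access, 0.1 ms per fetch) are the QUBE constants that reproduce the 6075.99 ms result in the example later in this article; treat them as estimation constants rather than measurements of any particular disk. The function name and the two access paths are hypothetical.

```python
# Unit costs in milliseconds (QUBE estimation constants, not measurements).
RANDOM_ACCESS_MS = 10.0
SEQUENTIAL_ACCESS_MS = 0.01
FETCH_MS = 0.1


def local_response_time_ms(random_accesses, sequential_accesses, fetches):
    """Quick upper-bound estimate of the local response time, in ms."""
    return (random_accesses * RANDOM_ACCESS_MS
            + sequential_accesses * SEQUENTIAL_ACCESS_MS
            + fetches * FETCH_MS)


# Two made-up access paths for the same query, so both return 100 rows.
fetches = 100
path_a = local_response_time_ms(101, 99, fetches)  # with table lookups
path_b = local_response_time_ms(1, 99, fetches)    # index-only access

# The fetch term cancels out when the two paths are compared.
difference = path_a - path_b
difference_without_fetches = (local_response_time_ms(101, 99, 0)
                              - local_response_time_ms(1, 99, 0))

print(round(difference, 2), round(difference_without_fetches, 2))
```

Both differences come out the same, which is the point of the paragraph above: when ranking candidate indexes for one query, only the random and sequential access counts matter.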

Access

When MySQL reads an index row or a table row, one access occurs. In a full table scan or an index scan, the first row read is a random access. A random access requires the disk to seek and rotate, so it is expensive. All subsequent rows are read sequentially, and a sequential access costs only one thousandth as much as a random access.

If a large number of index rows or table rows are read sequentially, a few additional random accesses may occur in practice, but they are not important for estimating the overall query and are still counted as sequential accesses when calculating the local response time.

Example

Here is a simple example of how to calculate the local response time of a query that uses an index. Assume we have a users table with 10 million rows.

In addition to the primary key index, the users table has several secondary indexes: (username, city), (username, age), and (username). Suppose we run a query that filters on username and city and also needs the age column.

The two query conditions have filter factors of 0.05% and 12%, and the query can directly use the existing secondary index (username, city). Starting from the total number of rows in the table and the filter factors, we can estimate the execution time of this SQL statement.

The query first hits the (username, city) index and scans the qualifying index slice, which contains 10,000,000 * 0.05% * 12% = 600 index rows: 1 random access and 599 sequential accesses. Because the index does not contain all the columns the query needs, each index row triggers a random table access to read the remaining column, age.

These table lookups generate 600 random accesses, and retrieving the result set takes 600 fetches. In total, the query performs 601 random accesses, 599 sequential accesses, and 600 fetches. According to the formula in the previous section, the query takes 601 × 10 ms + 599 × 0.01 ms + 600 × 0.1 ms = 6075.99 ms, or roughly 6 s, which is unacceptable for most applications.
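
The arithmetic of this example can be retraced step by step in Python. The sketch below only restates the numbers given in the text; the variable names are illustrative, and the unit costs of 10 ms, 0.01 ms, and 0.1 ms are the ones implied by the 6075.99 ms result.

```python
TOTAL_ROWS = 10_000_000
FF_USERNAME = 0.0005  # filter factor 0.05%
FF_CITY = 0.12        # filter factor 12%

# Rows in the index slice that satisfy both predicates.
matching_rows = round(TOTAL_ROWS * FF_USERNAME * FF_CITY)

# Index slice scan on (username, city): first row random, the rest sequential.
index_random = 1
index_sequential = matching_rows - 1

# One random table access per index row to read the age column.
table_random = matching_rows

# Every qualifying row is fetched once.
fetches = matching_rows

random_accesses = index_random + table_random
lrt_ms = random_accesses * 10 + index_sequential * 0.01 + fetches * 0.1

print(random_accesses, index_sequential, fetches, round(lrt_ms, 2))
```

Almost all of the estimated time comes from the random_accesses term, which is dominated by the table lookups.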

Throughout the query, the 600 random accesses caused by table lookups are the main contributor to this very slow query. To solve the problem, we only need to add a (username, city, age) index, or append the age column to the existing (username, city) index, to avoid those 600 random accesses.
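
To see the effect of that change, the two access paths can be compared with the same estimate. This is a sketch under the article's assumptions (600 qualifying rows, unit costs of 10 ms, 0.01 ms, and 0.1 ms); qube_ms is a hypothetical helper name.

```python
def qube_ms(random_accesses, sequential_accesses, fetches):
    """Local response time estimate in milliseconds."""
    return random_accesses * 10 + sequential_accesses * 0.01 + fetches * 0.1


rows = 600

# Index (username, city): slice scan plus one table lookup per row.
with_lookups = qube_ms(1 + rows, rows - 1, rows)

# Index (username, city, age): slice scan only, no table access needed.
index_only = qube_ms(1, rows - 1, rows)

print(round(with_lookups, 2), round(index_only, 2))
print("saved ms:", round(with_lookups - index_only, 2))
```

Under these assumptions the estimate falls from about 6 s to well under 100 ms, and the whole saving is the 600 avoided random accesses.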

For this query, the (username, city, age) index is actually a three-star index. For more on index design, see the previous article, MySQL Index Design Summary. If you have enough time, the book Database Index Design and Optimization is still strongly recommended.

Summary

This is the shortest article I have written this year. I originally wanted to introduce in detail the estimation methods for index performance analysis from Database Index Design and Optimization, but on reflection that part of the book covers too much and has too many examples to describe fully in a single article. So I selected only a few of them for a brief introduction, which is also why this article is called an overview.

If you have questions about the article, you can leave them in the comments.

Reference

Database Index Design and Optimization

Original link: MySQL Index performance Analysis Overview · Faith-Oriented programming

Follow:draveness GitHub

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
