A slow query incident caused by a misjudgment of the MySQL query optimizer in production

Source: Internet
Author: User
Tags: mysql query optimization

Preface:

After a flood of slow-query and request-timeout alarms, our metrics pointed to abnormal MySQL requests. Running show processlist in the MySQL CLI showed many slow queries. This SQL statement had never been slow before; the problem appeared only once the data volume grew. The feeds table holds about 100 million rows, but feed data is mostly recent and hot, so it stays in memory, and the slowdown was not caused by frequent disk I/O from an undersized innodb_buffer_pool_size. Further analysis of the execution plan with explain revealed the cause: the MySQL query optimizer had chosen the index that it considered efficient, and it was wrong.

The MySQL query optimizer is reliable in most cases. However, when several indexes could serve your SQL statement, pay attention, because the final choice is often surprising. MySQL generally uses only one index per table in a given query (index merge aside), so which one will it use? When the data volume is small, the optimizer tends to pass over the primary key index and prefer an ordinary or unique secondary index. Once the data reaches a certain scale and the query contains IN conditions, the optimizer may choose the primary key instead!

Remember one thing: MySQL query optimization is based on estimated retrieval cost, not on measured execution time. The optimizer calculates the cost from the current state of the data, rather than by executing the SQL statement once.

Therefore, the MySQL optimizer does not pick the best plan every time, because it cannot estimate the cost precisely. To know the true cost of each index, you would have to actually run the query with it. The cost analysis is only an estimate, and since it is an estimate, it can be wrong.
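To make the gap between estimated and actual cost concrete, here is a minimal sketch of a toy cost model. It is not MySQL's real cost formula: the two cost functions, the function names, and every number in it are assumptions chosen for illustration. It shows one plausible way a cost-based choice can go wrong for an ORDER BY ... LIMIT query: if the optimizer believes matching rows are dense, walking the primary key in order and stopping early looks cheap, while in reality the matches are rare and the scan reads far more rows.

```python
# Toy cost model: illustrative only, NOT MySQL's real cost formulas.
import math


def cost_secondary_index(matching_rows: int) -> float:
    """Read every matching row through the secondary index, then sort them."""
    read_cost = matching_rows * 1.0
    sort_cost = matching_rows * math.log2(matching_rows) if matching_rows > 1 else 0.0
    return read_cost + sort_cost


def cost_ordered_pk_scan(limit: int, match_fraction: float) -> float:
    """Walk the primary key in ORDER BY order and stop after `limit` matches.

    If a fraction `match_fraction` of the rows satisfies the WHERE clause,
    about limit / match_fraction rows are read before the LIMIT is filled.
    """
    return limit / match_fraction


def choose_plan(matching_rows: int, limit: int, match_fraction: float) -> str:
    """Pick the plan with the lower estimated cost."""
    secondary = cost_secondary_index(matching_rows)
    primary = cost_ordered_pk_scan(limit, match_fraction)
    return "feed_user_target" if secondary <= primary else "PRIMARY"


# Hypothetical estimate: matches look dense, so the ordered scan seems cheap.
estimated_choice = choose_plan(matching_rows=5000, limit=11, match_fraction=0.01)

# Hypothetical reality: matches are rare, so the ordered scan reads far more rows.
actual_pk_cost = cost_ordered_pk_scan(limit=11, match_fraction=0.000001)
actual_secondary_cost = cost_secondary_index(matching_rows=5000)

print(estimated_choice)
print(actual_pk_cost > actual_secondary_cost)
```

The point of the sketch is only that the decision is made from estimates; when an estimate such as match_fraction is far from reality, the plan with the lowest estimated cost can be the slowest one to execute.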

The table in question is the feed table, which stores the feed stream. Feed tables are accessed frequently and hold a large amount of data. However, the structure of this table is very simple, and so are its indexes. There are two in total: the primary key index and one unique index (feed_user_target in the execution plans below).

As shown in the following figure, the table holds on the order of a hundred million rows. Because a cache layer in front of it absorbs enough of the load, the table has not yet been split with database or table sharding.

The problem: while the table held fewer than 100 million rows, the MySQL optimizer chose the unique index. Once it exceeded 100 million rows, the optimizer switched to the primary key index, and the query became far too slow.

This is the normal case:

mysql> explain SELECT * FROM `feed` WHERE user_id IN (116537309,116709093,116709377) AND cid IN (1001,1005,1054,1092,1093,1095) AND id <= 128384713 ORDER BY id DESC LIMIT 0, 11 \G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: feed
   partitions: NULL
         type: range
possible_keys: PRIMARY,feed_user_target
          key: feed_user_target
      key_len: 6
          ref: NULL
         rows: 18
     filtered: 50.00
        Extra: Using where; Using index; Using filesort
1 row in set, 1 warning (0.00 sec)

For the same SQL statement, the optimizer's index choice changed after the data volume grew significantly:

mysql> explain SELECT * FROM `feed` WHERE user_id IN (116537309,116709093,116709377) AND cid IN (1001,1005,1054,1092,1093,1095) AND id <= 128384713 ORDER BY id DESC LIMIT 0, 11 \G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: feed
         type: range
possible_keys: PRIMARY,feed_user_target
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 11873197
        Extra: Using where
1 row in set (0.00 sec)

The solution is to use force index, which forces the query optimizer to use the given index. I work in a Python development environment, and common Python ORMs provide ways to pass force index, ignore index, and use index hints.

explain  SELECT * FROM `feed` force index (feed_user_target) WHERE user_id IN (116537309,116709093,116709377) ...
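For code that builds SQL by hand instead of going through an ORM, the hint can be added when the statement is assembled. The following is a minimal sketch, not the author's actual code: the function name build_feed_query and its parameters are hypothetical, and the %s placeholders assume a DB-API driver with the "format" parameter style, such as PyMySQL or mysqlclient. The index name cannot be sent as a bound parameter, so the sketch validates it before interpolating it.

```python
from typing import List, Optional, Sequence, Tuple


def build_feed_query(
    user_ids: Sequence[int],
    cids: Sequence[int],
    max_id: int,
    limit: int = 11,
    force_index: Optional[str] = None,
) -> Tuple[str, List[int]]:
    """Return (sql, params) for the feed query, optionally with FORCE INDEX."""
    if not user_ids or not cids:
        raise ValueError("user_ids and cids must be non-empty")
    # Index names cannot be bound as parameters, so validate before interpolating.
    if force_index is not None and not force_index.replace("_", "").isalnum():
        raise ValueError(f"suspicious index name: {force_index!r}")

    hint = f" FORCE INDEX ({force_index})" if force_index else ""
    user_marks = ", ".join(["%s"] * len(user_ids))
    cid_marks = ", ".join(["%s"] * len(cids))
    sql = (
        f"SELECT * FROM `feed`{hint} "
        f"WHERE user_id IN ({user_marks}) AND cid IN ({cid_marks}) "
        "AND id <= %s ORDER BY id DESC LIMIT %s"
    )
    params = [*user_ids, *cids, max_id, limit]
    return sql, params


sql, params = build_feed_query(
    user_ids=[116537309, 116709093, 116709377],
    cids=[1001, 1005, 1054, 1092, 1093, 1095],
    max_id=128384713,
    force_index="feed_user_target",
)
print(sql)
```

The returned pair would then be passed to the driver, for example cursor.execute(sql, params). Keeping the hint optional makes it easy to compare the plan with and without it.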

So how can we prevent this kind of problem as the data grows? And why does the MySQL optimizer choose an inefficient index?

I asked DBAs at several large companies about this, and their answers matched our own conclusion: you can only discover the problem afterwards through slow queries, and then fix the index choice by specifying force index in the SQL statement. This type of problem can also be caught in the early stages of a system launch, when business developers usually work with DBAs on SQL review. Later on, developers skip the review to save time or because they assume nothing can go wrong, and that is when a MySQL query incident like this one occurs.
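Part of this after-the-fact detection could be automated by running explain on critical queries periodically and comparing the result with what is expected. The sketch below is one possible approach, not something the DBAs recommended: the function plan_warnings, the expected_key argument, and the max_rows threshold of 10000 are all hypothetical. It works on a single explain row represented as a dict, using the key and rows values from the two plans shown earlier.

```python
from typing import Dict, List, Union

PlanRow = Dict[str, Union[str, int, None]]


def plan_warnings(row: PlanRow, expected_key: str, max_rows: int) -> List[str]:
    """Return human-readable warnings for one EXPLAIN row (empty list if fine)."""
    warnings: List[str] = []
    key = row.get("key")
    if key != expected_key:
        warnings.append(f"optimizer chose key {key!r}, expected {expected_key!r}")
    rows = int(row.get("rows") or 0)
    if rows > max_rows:
        warnings.append(f"estimated rows {rows} exceeds threshold {max_rows}")
    return warnings


# Values taken from the two EXPLAIN outputs in this article.
good_plan: PlanRow = {"type": "range", "key": "feed_user_target", "rows": 18}
bad_plan: PlanRow = {"type": "range", "key": "PRIMARY", "rows": 11873197}

for name, plan in (("good", good_plan), ("bad", bad_plan)):
    print(name, plan_warnings(plan, expected_key="feed_user_target", max_rows=10000))
```

In practice the dict would come from executing explain through a cursor that returns rows as dictionaries, and the warnings would feed an alert, so that a plan flip is noticed before it turns into request timeouts.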

I do not yet have a thorough understanding of the MySQL optimizer's index selection rules, and I will take the time to study them.
