A specific case analysis of optimizing MySQL query operations, mysql query Case Analysis

Source: Internet
Author: User

A specific case analysis of optimizing MySQL query operations, mysql query Case Analysis

Problem description

A user reports that the execution time of an SQL statement is too slow to be accepted. The SQL statement looks simple (the table name and field name are modified in the description ):
SELECT count (*) FROM a JOIN B ON. 'S '= B.'s 'where. 'L'> '2017-03-30 00:55:00 'AND. 'L' <'2017-03-30 01:00:00 ';

The required fields are indexed. The table structure is as follows:

CREATE TABLE `a` (`L` timestamp NOT NULL DEFAULT '2000-01-01 00:00:00',`I` varchar(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,`A` varchar(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,`S` varchar(64) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,`F` tinyint(4) DEFAULT NULL,`V` varchar(256) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT '',`N` varchar(64) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,KEY `IX_L` (`L`),KEY `IX_I` (`I`),KEY `IX_S` (`S`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;CREATE TABLE `b` (`R` timestamp NOT NULL DEFAULT '2000-01-01 00:00:00',`V` varchar(32) DEFAULT NULL,`U` varchar(32) DEFAULT NULL,`C` varchar(16) DEFAULT NULL,`S` varchar(64) DEFAULT NULL,`I` varchar(64) DEFAULT NULL,`E` bigint(32) DEFAULT NULL,`ES` varchar(128) DEFAULT NULL,KEY `IX_R` (`R`),KEY `IX_C` (`C`),KEY `IX_S` (`S`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;

From the statement perspective, this query plan is natural. We should first use a as the driving table and use the two indexes a. L and B. S successively. In fact, the explain result is:

+----+-------------+-------+-------+---------------+------+---------+----------+---------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-------+-------+---------------+------+---------+----------+---------+-------------+| 1 | SIMPLE | b | index | IX_S | IX_S | 195 | NULL | 1038165 | Using index || 1 | SIMPLE | a | ref | IX_L,IX_S | IX_S | 195 | test.b.S | 1 | Using where |+----+-------------+-------+-------+---------------+------+---------+----------+---------+-------------+


According to the explain results, B is used as the driving table for the query.

In the previous article, we introduced that the jion sequence for MySQL is the method to select the minimum cost after analyzing the costs of various join orders respectively.

This join involves only two tables and is naturally independent of optimizer_search_depth. The question is, why is the join order we expected not selected?

MySQL Tips: MySQL provides the straight_join syntax to forcibly set the connection sequence.

explain SELECT count(*) FROM a straight_join b ON a.`S` = b.`S` WHERE a.`L` > '2014-03-30 00:55:00' AND a.`L` < '2014-03-30 01:00:00' ;+----+-------------+-------+-------+---------------+------+---------+------+---------+---------------------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+-------+-------+---------------+------+---------+------+---------+---------------------------------------------+| 1 | SIMPLE | a | range | IX_L,IX_S | IX_L | 4 | NULL | 63 | Using where || 1 | SIMPLE | b | index | IX_S | IX_S | 195 | NULL | 1038165 | Using where; Using index; Using join buffer |+----+-------------+-------+-------+---------------+------+---------+------+---------+---------------------------------------------+

MySQL Tips: In the explain results, the join query costs can be estimated by sequential concatenation rows.

? The join order is correct. The simple analysis query cost is: normal join is 1038165*1, and straight_join is 63*1038165. It seems that MySQL is correct. But there must be something wrong!

Exception Detected

Return to our original ideas. We expect Table a as the driving table because we think that table B can use IX_S indexes, but in fact it is actually used in staight_join, but this result is different from what we expected.

We know that the filtering of indexes determines whether an index will be selected in a query. is B. S poor?

MySQL Tips: show index from tbname the Cardinality value in the returned results can indicate the filtering of an index.

The show index results are too many and can be retrieved from the information_schema table.

mysql> select * from information_schema.STATISTICS where table_name='b' and index_name='IX_S'\G*************************** 1. row ***************************TABLE_CATALOG: defTABLE_SCHEMA: testTABLE_NAME: bNON_UNIQUE: 1INDEX_SCHEMA: testINDEX_NAME: IX_SSEQ_IN_INDEX: 1COLUMN_NAME: SCOLLATION: ACARDINALITY: 1038165SUB_PART: NULLPACKED: NULLNULLABLE: YESINDEX_TYPE: BTREECOMMENT:INDEX_COMMENT:

The CARDINALITY: 1038165 for this index is large. What is the estimated row of this table.

show table status like 'b'\G*************************** 1. row ***************************Name: bEngine: InnoDBVersion: 10Row_format: CompactRows: 1038165Avg_row_length: 114Data_length: 119160832Max_data_length: 0Index_length: 109953024Data_free: 5242880Auto_increment: NULLCreate_time: 2014-05-23 00:24:25Update_time: NULLCheck_time: NULLCollation: utf8_general_ciChecksum: NULLCreate_options:Comment:1 row in set (0.00 sec)

From Rows: 1038165, we can see that the distinction between IX_S indexes is considered very good, and it is already similar to the unique index.

MySQL Tips: The Rows displayed in the show table status result indicates the current row number of the table. This is an exact value for the MyISAM table, but an estimate for InnoDB.

Although it is an estimate value, the optimizer is guided by this. That is to say, the data in an explain above does not meet the expectation at all: the rows in the second row in the result of staight_join.

Stage conclusion

We found that the logic of the entire error is as follows: the execution plan of a-driven table, due to index B. S rows is estimated to be 1038165, which leads the optimizer to think that the cost is greater than that of the drive table B. In fact, the distinction of this index is 1. (Of course, the colleagues who are familiar with the explain results find that the type field in the second row is different from the Extra field)

That is to say, when each row obtained by straight_join goes to B for query, it goes through the full table scan. In MySQL, type conversion is the most common case. For example, a string field contains full numbers, but the input is not in the string format during query.

In this case, both are strings. Therefore, it is related to character sets.

Return to the two table structures and find that the declaration difference of the S field is COLLATE utf8_bin -- this is the root cause of this case: the S value obtained in Table a is utf8_bin, And the optimizer considers the type to be different, index B cannot be used directly. IX_S filter.

As for why indexes are still used, this is because overwriting indexes creates a "misunderstanding ".

MySQL Tips: if all the query results can be obtained from an index, the traversal index will be used instead of the traversal data.

As verification,

mysql> explain SELECT * FROM a straight_JOIN b ON binary a.`S` = b.`S` WHERE a.`L` > '2014-03-30 00:55:00' AND a.`L` < '2014-03-30 01:00:00' ;+—-+————-+——-+——-+—————+——+———+——+———+————————————————+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+—-+————-+——-+——-+—————+——+———+——+———+————————————————+| 1 | SIMPLE | a | range | IX_L | IX_L | 4 | NULL | 63 | Using where || 1 | SIMPLE | b | ALL | IX_S | NULL | NULL | NULL | 1038165 | Range checked for each record (index map: 0x4) |+—-+————-+——-+——-+—————+——+———+——+———+————————————————+

Because the result is select * and the index cannot be overwritten, the key in the second row is displayed as NULL)


Of course, the most direct idea is to modify the S field definitions of the two tables and change them to the same one. This method can avoid modifying the Business Code, but the DDL cost is slightly higher. Here we provide two Optimization Methods for SQL statements.

1、select count(*) from b join (select s from a WHERE a.`L` > '2014-03-30 00:55:00' AND a.`L` < '2014-03-30 01:00:00') ta on b.S=ta.s;

This method is intuitive. Pay attention to the order of B. S and ta. S.

2、SELECT count(*) FROM a JOIN b ON binary a.`S` = b.`S` WHERE a.`L` > '2014-03-30 00:55:00' AND a.`L` < '2014-03-30 01:00:00' ;

According to the previous analysis, B. S is defined as utf8_bin.

MySQL Tips: In the character set naming rules of MySQL, the difference between XXX_bin and XXX is Case sensitivity.

Here we add binary limitation for all a.s, first convert it to lowercase, that is, convert the temporary result set to utf8_bin, and then use the index directly when B. s is matched.

In fact, the two rewriting methods are essentially the same, but the difference is that writing 1 is implicit conversion. In theory, writing 2 is faster.


The join fields are designed to be of the same type as possible.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.