[Turn] A user SQL slow query analysis, cause and optimization

Source: Internet
Author: User
Tags mysql in string format

Source: http://blog.rds.aliyun.com/2014/05/23/%E4%B8%80%E4%B8%AA%E7%94%A8%E6%88%B7sql%E6%85%A2%E6%9F%A5%E8%AF%A2%E5 %88%86%e6%9e%90%ef%bc%8c%e5%8e%9f%e5%9b%a0%e5%8f%8a%e4%bc%98%e5%8c%96/

Problem description

A user reflects that the first line of an SQL statement executes at a slower time than is acceptable. The SQL statement looks simple (the table name and field name are modified in this article):
SELECT Count (*) from a JOIN B on a. ' s ' = B. ' s ' WHERE A. ' l ' > ' 2014-03-30 00:55:00 ' and A. ' L ' < ' 2014-03-30 01:00:00 ' ;

and the fields that the query requires are indexed, and the table is structured as follows:
CREATE table ' a ' (
' L ' timestamp not NULL DEFAULT ' 2000-01-01 00:00:00 ',
' I ' varchar (32 ) CHARACTER set UTF8 COLLATE utf8_bin default NULL,
' A ' varchar (+) CHARACTER SET UTF8 COLLATE utf8_bin default null, ' S ' varchar (CHARACTER) SET UTF8 COLLATE utf8_bin default null,
' F ' tinyint (4) default null,
' V ' varchar (CH) Aracter set UTF8 COLLATE utf8_bin default ',
' N ' varchar (+) CHARACTER SET UTF8 COLLATE utf8_bin DEFAULT NULL,
KEY ' ix_l ' (' L '),
Key ' ix_i ' (' I '),
Key ' ix_s ' (' S ')
) Engine=innodb DEFAULT Charset=utf8;

CREATE TABLE ' B ' (
' R ' timestamp not null default ' 2000-01-01 00:00:00 ',
' V ' varchar (+) default NULL,
' U ' var char (+) default null,
' C ' varchar (+) default null,
' S ' varchar (+) default null,
' I ' varchar (+) default NULL,
' E ' bigint (+) default null,
' ES ' varchar (+) default NULL,
Key ' Ix_r ' (' R '),
Key ' Ix_c ' (' C '),
Key ' Ix_ S ' (' s ')
) Engine=innodb DEFAULT Charset=utf8;

From the statement, this query plan is very natural, it should be used first as a driver table, the use of A.L and B.S these two indexes. And actually the result of explain is:
+----+-------------+-------+-------+---------------+------+---------+----------+---------+-------------+
| ID | Select_type | Table | Type | Possible_keys | Key | Key_len | Ref | Rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+----------+---------+-------------+
| 1 | Simple | B | Index | ix_s | ix_s | 195 | NULL | 1038165 | Using Index |
| 1 | Simple | A | Ref | ix_l,ix_s | ix_s | 195 | TEST.B.S | 1 | Using where |
+----+-------------+-------+-------+---------------+------+---------+----------+---------+-------------+

Analysis

From the explain results, the query uses B as the driver table.

In the previous article we introduced that MySQL chooses the jion order is the least expensive method after analyzing the cost of the various join sequences separately.

This join involves only two tables, and nature has nothing to do with optimizer_search_depth. So our question is, why is the join order we were expecting not to be selected?

MySQL Tips:mysql provides straight_join syntax, which forces the connection sequence to be set.

Explain SELECT count (*) from a straight_join B on a. ' s ' = B. ' s ' WHERE A. ' l ' > ' 2014-03-30 00:55:00 ' and A. ' L ' < ' 201 4-03-30 01:00:00 ';

+----+-------------+-------+-------+---------------+------+---------+------+---------+------------------------- --------------------+

| ID | Select_type | Table | Type | Possible_keys | Key | Key_len | Ref | Rows | Extra |

+----+-------------+-------+-------+---------------+------+---------+------+---------+------------------------- --------------------+

| 1 | Simple | A | Range | ix_l,ix_s | ix_l | 4 | NULL | 63 | Using where |

| 1 | Simple | B | Index | ix_s | ix_s | 195 | NULL | 1038165 | Using where; Using index; Using Join Buffer |

+----+-------------+-------+-------+---------------+------+---------+------+---------+------------------------- --------------------+

MySQL tips:explain results, the query cost of join can be estimated by sequentially connecting rows.

The join sequence is a simple analysis of the query cost: Common join IS 1038165*1, Straight_join is 63*1038165. There seems to be nothing wrong with MySQL. But there must be something wrong!

Found exception

Back to our original idea. We expect table A as the driver table because we think that table B can use the ix_s index, and actually staight_join when it does, but the result is different from what we expected.

We know that the filtering of indexes is the important factor that determines whether an index will be selected in the query, then B. is the filtration of s not good?

MySQL tips:show index from Tbname returns the value of cardinality in the result to indicate the filtering of an index.

Show index has too many results and can be taken from the Information_schema table.

Mysql> SELECT * from INFORMATION_SCHEMA. STATISTICS where table_name= ' B ' and index_name= ' ix_s ' \g
1. Row ***************************
Table_catalog:def
Table_schema:test
Table_name:b
Non_unique:1
Index_schema:test
index_name:ix_s
Seq_in_index:1
Column_name:s
Collation:a
cardinality:1038165
Sub_part:null
Packed:null
Nullable:yes
Index_type:btree
COMMENT:
Index_comment:

Can this index of cardinality:1038165, already very big. What is the estimated line of the table?

Show table status like ' B ' \g
1. Row ***************************
Name:b
Engine:innodb
Version:10
Row_format:compact
rows:1038165
avg_row_length:114
data_length:119160832
max_data_length:0
index_length:109953024
data_free:5242880
Auto_increment:null
Create_time:2014-05-23 00:24:25
Update_time:null
Check_time:null
Collation:utf8_general_ci
Checksum:null
Create_options:
Comment:
1 row in Set (0.00 sec)

As seen from rows:1038165, the ix_s of this index is considered very good and is already similar to a unique index.

MySQL Tips: The rows you see in the Show Table status results are used to represent the current number of rows in the table. This is an exact value for the MyISAM table, but it is an estimate for InnoDB.

Although it is an estimate, the optimizer is guided by this, meaning that the data in one of the above explain is completely out of the expectation: rows in the second row of the staight_join result.

Stage conclusion

We find the logic of the entire error to be this: the execution plan with a for the driver table, because index B. A rows estimate of s of 1038165 causes the optimizer to assume that the cost is greater than B for the driver table. In fact, the index has a 1 degree of differentiation. (Of course, the Explan results are more familiar to the students will find that the second row of the Type field and the Extra field is weird)

That is, Straight_join gets every row to go to B in the query, all go to the full table scan. The most common occurrence of this condition in MySQL is the type conversion. For example, a string field, although it contains a full number, but the query is not passed in the string format.

In this case, two of them are strings. Therefore, the character set is related.

Returning to the two table structure, it is found that the declaration of the S field is different from the COLLATE Utf8_bin-this is the root cause of this case: the S value of a table is Utf8_bin, the optimizer considers that the type differs and cannot be used directly on index B. ix_s filter.

As for why the index is also used, this is because the overlay index brings "misunderstanding".

MySQL Tips: If all the results of a query can be fully obtained from an index, the traversal of the index is preferred instead of traversing the data.

As a validation,

Mysql> explain SELECT * from a straight_join b on binary a. ' s ' = B. ' s ' WHERE A. ' l ' > ' 2014-03-30 00:55:00 ' and A. ' L ' < ' 2014-03-30 01:00:00 ';

+--+ ————-+ ——-+ ——-+ ————— +--+ ——— +--+ ——— + ———————————————— +

| ID | Select_type | Table | Type | Possible_keys | Key | Key_len | Ref | Rows | Extra |

+--+ ————-+ ——-+ ——-+ ————— +--+ ——— +--+ ——— + ———————————————— +

| 1 | Simple | A | Range | ix_l | ix_l | 4 | NULL | 63 | Using where |

| 1 | Simple | B | All | ix_s | NULL | NULL | NULL | 1038165 | Range checked for each record (index map:0x4) |

+--+ ————-+ ——-+ ——-+ ————— +--+ ——— +--+ ——— + ———————————————— +

Because the result is a select *, the overwrite index cannot be used, so the key on the second row is displayed as null. (The author tears: If early out this results can be more convenient to check up)

Optimization

The most straightforward idea, of course, is to change the definition of the S field of two tables to the same. This approach avoids modifying the business code, but the cost of the DDL is slightly larger. There are two optimizations for SQL statements.

1, select COUNT (*) from the B join (select S from a WHERE A. ' l ' > ' 2014-03-30 00:55:00 ' and A. ' L ' < ' 2014-03-30 01:00:00 ' ) Ta on B.s=ta.s;

This is more intuitive and requires attention to the last B. The order of S and TA.S

2. SELECT Count (*) from a JOIN b on binary a. ' s ' = B. ' s ' WHERE A. ' l ' > ' 2014-03-30 00:55:00 ' and A. ' L ' < ' 2014-03-30 01:00:00 ';

From the previous analysis know is because of B. S is defined as Utf8_bin.

MySQL Tips:mysql in the character set naming rules, the difference between Xxx_bin and XXX is case sensitive.

Here we will a.s all add binary qualification, first to lowercase, that is, to convert the temporary result set to Utf8_bin, then use B. The index can be used directly when the s is matched.

In fact, the essence of the two rewriting methods is the same, the difference is that 1 is implicit conversion. In theory, 2 is faster.

Summary

The fields that do join are designed to be exactly the same as the type.

» Reprint Please indicate source: Alirds official technical Support Blog» "A user SQL slow query analysis, cause and optimization"

[Turn] A user SQL slow query analysis, cause and optimization

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.