In-depth explanation of explain for MySQL Query Optimization

In-depth explanation of explain for MySQL Query Optimization _ MySQL

Last Update:2018-04-12 Source: Internet

Author: User

Tags mysql query optimization

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

In-depth analysis of explain for MySQL Query optimization MySQLexplain

BitsCN.com

When analyzing query performance, it is also useful to consider the EXPLAIN keyword. The EXPLAIN keyword is generally placed before the SELECT query statement to describe how MySQL performs the query operation, and the number of rows to be executed in the result set returned by MySQL. Explain helps us analyze the select statement and let us know why the query efficiency is low, so as to improve our query and make the query optimizer better work.

I. How does the MySQL Query optimizer work?
The MySQL Query Optimizer has several goals, but the primary goal is to use the index as much as possible and use the strictest index to eliminate as many data rows as possible. The final goal is to submit a SELECT statement to search for data rows, rather than exclude data rows. The reason the optimizer tries to exclude data rows is that the faster it can exclude data rows, the faster it can find data rows that match conditions. If you can perform the strictest test first, the query can be executed faster.
Each output row of EXPLAIN provides information about a table, and each row includes the following columns:

Item	Description
Id	The serial number of the Query in the execution plan selected by MySQL Query Optimizer. Indicates the order in which the select clause or operation table is executed in the query. the higher the id value, the higher the priority. The execution sequence is from top to bottom.

Select_type	Description
SIMPLE	Simple select queries, without union and subqueries
PRIMARY	Select query on the outermost layer
UNION	The second or subsequent select query in UNION does not depend on the result set of the external query.
DEPENDENT UNION	The second or subsequent select query in the UNION statement depends on the result set of the external query.
SUBQUERY	The first select query in a subquery does not depend on the result set of an external query.
DEPENDENT SUBQUERY	The first select query in the subquery depends on the result set of the external query.
DERIVED	It is used when a subquery exists in the from clause. MySQL recursively executes these subqueries and places the results in the temporary table.
UNCACHEABLE SUBQUERY	Subqueries whose result sets cannot be cached must be re-evaluated for each row of the external layer query.
UNCACHEABLE UNION	The second or subsequent select queries in UNION are non-cacheable subqueries.

Item	Description
Table	Table referenced by the output row

Type indicates the type used by the connection, which is sorted by the best to the worst type.	Description
System	The table has only one row (= system table ). This is a special case of the const connection type.
Const	Const is used to compare the primary key with a common value. When the queried table has only one row, System is used.
Eq_ref	Const is used to compare the primary key with a common value. When the queried table has only one row, System is used.
Ref	A single row cannot be selected based on the keyword. Multiple rows that meet the condition may be found. It is called ref because the index must be compared with a reference value. This reference value is either a constant or a result value from a multi-table query in a table.
Ref_or_null	Like ref, but MySQL must find a null entry in the first search result and perform a second search.
Index_merge	This indicates that the index merge optimization is used.
Unique_subquery	This type is used IN some IN queries, rather than regular ref: value IN (SELECT primary_key FROM single_table WHERE some_expr)
Index_subquery	This type is used IN some IN queries, similar to unique_subquery, but the query is a non-unique index: value IN (SELECT key_column FROM single_table WHERE some_expr)
Range	Only retrieve rows in a given range and use an index to select rows. The key column shows the index used. When the =, <>,>, >=, <, <=, is null, <=>, BETWEEN, or IN operator IS used to compare the keyword columns with constants, range can be used.
Index	A full table scan only scans the table in the index order instead of rows. The main advantage is that sorting is avoided, but the overhead is still very large.
All	The worst case is full table scan from start to end.

Item	Description
Possible_keys	Lists the indexes that MySQL can use in this table to facilitate queries. If it is null, no index is available.

Item	Description
Key	MySQL actually selects the index from possible_key. If it is NULL, no index is used. In rare cases, MYSQL selects an optimized index. In this case, you can use index (indexname) in the SELECT statement to force an INDEX or use ignore index (indexname) to force MYSQL to IGNORE the INDEX.

Item	Description
Key_len	The length of the index used. The shorter the length, the better.

Item	Description
Ref	Which column of the index is used

Item	Description
Rows	MYSQL considers that the number of rows that must be checked to return the requested data

Item	Description
Rows	MYSQL considers that the number of rows that must be checked to return the requested data

The following two items in extra indicate that MYSQL cannot use indexes at all, and the efficiency is greatly affected. This should be optimized as much as possible.

Extra item	Description
Using filesort	MySQL uses an external index to sort the results, instead of reading the relevant content from the table in the index order. It may be sorted in memory or disk. Sorting operations that cannot be completed by using indexes in MySQL are called "file sorting"
Using temporary	Indicates that MySQL uses a temporary table for sorting query results. It is common in sorting order by and grouping query group.

The following is an example to illustrate the usage of the explain statement.
First, let's look at a table:

Create table if not exists 'article' ('id' int (10) unsigned not null AUTO_INCREMENT,
'Author _ id' int (10) unsigned not null,
'Category _ id' int (10) unsigned not null,
'View' int (10) unsigned not null,
'Comments' int (10) unsigned not null,
'Title' varbinary (255) not null,
'Content' text not null,
Primary key ('id ')
);

Insert several more data records:

Insert into 'article'
('Author _ id', 'Category _ id', 'view', 'comments', 'title', 'Content') VALUES
(1, 1, 1, 1, '1', '1 '),
(2, 2, 2, 2, '2', '2 '),
(1, 1, 3, 3, '3', '3 ');

Requirements:
Query the article_id with the most views when category_id is 1 and comments is greater than 1.
Try it out first:

EXPLAIN
SELECT author_id
FROM 'article'
WHERE category_id = 1 AND comments> 1
Order by views DESC
LIMIT 1/G

See some output results:

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: article
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 3
Extra: Using where; Using filesort
1 row in set (0.00 sec)

Obviously, type is ALL, that is, the worst case. Using filesort also appeared in Extra, which is also the worst case. Optimization is required.

Well, the simplest solution is to add indexes. Okay. let's give it a try. The where clause uses three fields: category_id, comments, and views. Therefore, it is easiest to use a joint index.

Alter table 'article' add index x ('Category _ id', 'comments', 'view ');

The results have improved, but they are still bad:

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: article
Type: range
Possible_keys: x
Key: x
Key_len: 8
Ref: NULL
Rows: 1
Extra: Using where; Using filesort
1 row in set (0.00 sec)

Type is changed to range, which is tolerable. However, Using filesort in extra is still unacceptable. But we have already created an index. why is it useless? This is because according to the working principle of the BTree index, category_id is sorted first. if the same category_id is encountered, comments are sorted again. if the same comments is encountered, views are sorted again. When the comments field is in the middle of the joint index, because the comments> 1 condition is a range value (so-called range), MySQL cannot use the index to search for the following views, that is, the index after the range type query field is invalid.
We need to discard comments and delete the old index:

Drop index x ON article;

Then create a new index:

Alter table 'article' add index y ('Category _ id', 'view ');

Then run the query:

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: article
Type: ref
Possible_keys: y
Key: y
Key_len: 4
Ref: const
Rows: 1
Extra: Using where
1 row in set (0.00 sec)

As you can see, the type is changed to ref, and the Using filesort in Extra disappears, and the result is very satisfactory.
Let's look at a multi-table query example.
First, define three tables class and room.

Create table if not exists 'class '(
'Id' int (10) unsigned not null AUTO_INCREMENT,
'Card 'int (10) unsigned not null,
Primary key ('id ')
);
Create table if not exists 'book '(
'Bookid' int (10) unsigned not null AUTO_INCREMENT,
'Card 'int (10) unsigned not null,
Primary key ('bookid ')
);
Create table if not exists 'phone '(
'Phoneid' int (10) unsigned not null AUTO_INCREMENT,
'Card 'int (10) unsigned not null,
Primary key ('phoneid ')
) Engine = innodb;

Then insert a large amount of data separately. Php script for data insertion:

$ Link = mysql_connect ("localhost", "root", "870516 ");
Mysql_select_db ("test", $ link );
For ($ I = 0; I I <10000; $ I ++)
{
$ J = rand (1, 20 );
$ SQL = "insert into class (card) values ({$ j })";
Mysql_query ($ SQL );
}
For ($ I = 0; I I <10000; $ I ++)
{
$ J = rand (1, 20 );
$ SQL = "insert into book (card) values ({$ j })";
Mysql_query ($ SQL );
}
For ($ I = 0; I I <10000; $ I ++)
{
$ J = rand (1, 20 );
$ SQL = "insert into phone (card) values ({$ j })";
Mysql_query ($ SQL );
}
Mysql_query ("COMMIT ");
?>

Then let's look at a left join query:

Explain select * from class left join book on class. card = book. card/G

The analysis result is:

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

Obviously, we need to optimize the second ALL.
Try to create an index:

Alter table 'book' add index y ('card ');

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ref
Possible_keys: y
Key: y
Key_len: 4
Ref: test. class. card
Rows: 1000
Extra:
2 rows in set (0.00 sec)

We can see that the type in the second row is changed to ref, and rows is also changed to 1741*18, which is obviously optimized. This is determined by the left join feature. The left join condition is used to determine how to search rows from the right table. there must be rows on the LEFT. Therefore, the right side is our key point and indexes must be created.
Delete the old index:

Drop index y ON book;

Create a new index.

Alter table 'class' add index x ('card ');

Result

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

Almost unchanged.
Then let's look at a right connection query:

Explain select * from class right join book on class. card = book. card;

The analysis result is:

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ref
Possible_keys: x
Key: x
Key_len: 4
Ref: test. book. card
Rows: 1000
Extra:
2 rows in set (0.00 sec)

The optimization is obvious. This is because the right join condition is used to determine how to search rows from the left table, and there must be rows on the RIGHT. Therefore, the left is our key point and indexes must be created.
Delete the old index:

Drop index x ON class;

Create a new index.

Alter table 'book' add index y ('card ');

Result

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

Almost unchanged.

Finally, let's take a look at the inner join situation:

Explain select * from class inner join book on class. card = book. card;

Result:

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ref
Possible_keys: x
Key: x
Key_len: 4
Ref: test. book. card
Rows: 1000
Extra:
2 rows in set (0.00 sec)

Delete the old index:

Drop index y ON book;

Result

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

Create a new index.

Alter table 'class' add index x ('card ');

Result

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

In conclusion, inner join and left join are similar, and the right table needs to be optimized. Right join needs to optimize the left table.

Let's take a look at the three-table query example.

Add a new index:

Alter table 'phone' add index z ('card ');
Alter table 'book' add index y ('card ');

Explain select * from class left join book on class. card = book. card left join phone on book. card = phone. card;

* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ref
Possible_keys: y
Key: y
Key_len: 4
Ref: test. class. card
Rows: 1000
Extra:
* *************************** 3. row ***************************
Id: 1
Select_type: SIMPLE
Table: phone
Type: ref
Possible_keys: z
Key: z
Key_len: 4
Ref: test. book. card
Rows: 260
Extra: Using index
3 rows in set (0.00 sec)

The type of the last two rows is ref and the overall rows optimization is good.

The explain syntax in MySql can help us rewrite the query and optimize the table structure and index settings to maximize query efficiency. Of course, when there is a large amount of data, the indexing and maintenance cost is also high. it usually takes a long time and a large amount of space. if you create an index on different column combinations, the space overhead is higher.Therefore, it is best to set indexes in fields that require frequent queries.

BitsCN.com

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More