In-depth explanation of explain for MySQL Query Optimization

Source: Internet
Author: User
Tags mysql query optimization

When analyzing query performance, it is also useful to consider the EXPLAIN keyword. The EXPLAIN keyword is generally placed before the SELECT query statement to describe how MySQL performs the query operation, and the number of rows to be executed in the result set returned by MySQL. Explain helps us analyze the select statement and let us know why the query efficiency is low, so as to improve our query and make the query optimizer better work.

I. How does the MySQL query optimizer work?
The MySQL query optimizer has several goals, but the primary goal is to use the index as much as possible and use the strictest index to eliminate as many data rows as possible. The final goal is to submit a SELECT statement to search for data rows, rather than exclude data rows. The reason the optimizer tries to exclude data rows is that the faster it can exclude data rows, the faster it can find data rows that match conditions. If you can perform the strictest test first, the query can be executed faster.
Each output row of EXPLAIN provides information about a table, and each row includes the following columns:

Item Description
Id The serial number of the Query in the execution plan selected by MySQL Query Optimizer. Indicates the order in which the select clause or operation table is executed in the query. The higher the id value, the higher the priority. The execution sequence is from top to bottom.

Select_type Description
SIMPLE Simple select queries, without union and subqueries
PRIMARY Select query on the outermost layer
UNION The second or subsequent select query in UNION does not depend on the result set of the external query.
DEPENDENT UNION The second or subsequent select query in the UNION statement depends on the result set of the external query.
SUBQUERY The first select query in a subquery does not depend on the result set of an external query.
DEPENDENT SUBQUERY The first select query in the subquery depends on the result set of the external query.
DERIVED It is used when a subquery exists in the from clause. MySQL recursively executes these subqueries and places the results in the temporary table.
UNCACHEABLE SUBQUERY Subqueries whose result sets cannot be cached must be re-evaluated for each row of the external layer query.
UNCACHEABLE UNION The second or subsequent select queries in UNION are non-cacheable subqueries.

Item Description
Table Table referenced by the output row

Type indicates the type used by the connection, which is sorted by the best to the worst type. Description
System The table has only one row (= system table ). This is a special case of the const connection type.
Const Const is used to compare the primary key with a common value. When the queried table has only one row, System is used.
Eq_ref Const is used to compare the primary key with a common value. When the queried table has only one row, System is used.
Ref A single row cannot be selected based on the keyword. Multiple rows that meet the condition may be found. It is called ref because the index must be compared with a reference value. This reference value is either a constant or a result value from a multi-Table query in a table.
Ref_or_null Like ref, but MySQL must find a null entry in the first search result and perform a second search.
Index_merge This indicates that the index merge optimization is used.
Unique_subquery This type is used IN some IN queries, rather than regular ref: value IN (SELECT primary_key FROM single_table WHERE some_expr)
Index_subquery This type is used IN some IN queries, similar to unique_subquery, but the query is a non-unique index: value IN (SELECT key_column FROM single_table WHERE some_expr)
Range Only retrieve rows in a given range and use an index to select rows. The key column shows the index used. When the =, <>,>, >=, <, <=, is null, <=>, BETWEEN, or IN operator IS used to compare the keyword columns with constants, range can be used.
Index A full table scan only scans the table in the index order instead of rows. The main advantage is that sorting is avoided, but the overhead is still very large.
All The worst case is full table scan from start to end.
Item Description
Possible_keys Lists the indexes that MySQL can use in this table to facilitate queries. If it is null, no index is available.

Item Description
Key MySQL actually selects the index from possible_key. If it is NULL, no index is used. In rare cases, MYSQL selects an optimized index. In this case, you can use index (indexname) in the SELECT statement to force an INDEX or use ignore index (indexname) to force MYSQL to IGNORE the INDEX.

Item Description
Key_len The length of the index used. The shorter the length, the better.

Item Description
Ref Which column of the index is used

Item Description
Rows MYSQL considers that the number of rows that must be checked to return the requested data

Item Description
Rows MYSQL considers that the number of rows that must be checked to return the requested data

The following two items in extra indicate that MYSQL cannot use indexes at all, and the efficiency is greatly affected. This should be optimized as much as possible.

Extra item Description
Using filesort MySQL uses an external index to sort the results, instead of reading the relevant content from the table in the index order. It may be sorted in memory or disk. Sorting operations that cannot be completed by using indexes in MySQL are called "File Sorting"
Using temporary Indicates that MySQL uses a temporary table for sorting query results. It is common in sorting order by and grouping query group.

The following is an example to illustrate the usage of the explain statement.
First, let's look at a table:

Copy codeThe Code is as follows: create table if not exists 'Article' ('id' int (10) unsigned not null AUTO_INCREMENT,
'Author _ id' int (10) unsigned not null,
'Category _ id' int (10) unsigned not null,
'View' int (10) unsigned not null,
'Comments' int (10) unsigned not null,
'Title' varbinary (255) not null,
'Content' text not null,
Primary key ('id ')
);

Insert several more data records:Copy codeThe Code is as follows: insert into 'Article'
('Author _ id', 'category _ id', 'view', 'comments', 'title', 'content') VALUES
(1, 1, 1, 1, '1', '1 '),
(2, 2, 2, 2, '2', '2 '),
(1, 1, 3, 3, '3', '3 ');

Requirements:
Query the article_id with the most views when category_id is 1 and comments is greater than 1.
Try it out first:Copy codeThe Code is as follows: EXPLAIN
SELECT author_id
FROM 'Article'
WHERE category_id = 1 AND comments> 1
Order by views DESC
LIMIT 1 \ G

See some output results:Copy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: article
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 3
Extra: Using where; Using filesort
1 row in set (0.00 sec)

Obviously, type is ALL, that is, the worst case. Using filesort also appeared in Extra, which is also the worst case. Optimization is required.

Well, the simplest solution is to add indexes. Okay. Let's give it a try. The where clause uses three fields: category_id, comments, and views. Therefore, it is easiest to use a joint index.

Copy codeThe Code is as follows: alter table 'Article' add index x ('category _ id', 'comments', 'view ');

The results have improved, but they are still bad:Copy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: article
Type: range
Possible_keys: x
Key: x
Key_len: 8
Ref: NULL
Rows: 1
Extra: Using where; Using filesort
1 row in set (0.00 sec)

Type is changed to range, which is tolerable. However, Using filesort in extra is still unacceptable. But we have already created an index. Why is it useless? This is because according to the working principle of the BTree index, category_id is sorted first. If the same category_id is encountered, comments are sorted again. If the same comments is encountered, views are sorted again. When the comments field is in the middle of the joint index, because the comments> 1 condition is a range value (so-called range), MySQL cannot use the index to search for the following views, that is, the index after the range type query field is invalid.
We need to discard comments and delete the old index:Copy codeThe Code is as follows: drop index x ON article;

Then create a new index:Copy codeThe Code is as follows: alter table 'Article' add index y ('category _ id', 'view ');

Then run the query:Copy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: article
Type: ref
Possible_keys: y
Key: y
Key_len: 4
Ref: const
Rows: 1
Extra: Using where
1 row in set (0.00 sec)

As you can see, the type is changed to ref, and the Using filesort in Extra disappears, and the result is very satisfactory.
Let's look at a multi-Table query example.
First, define three tables class and room.Copy codeThe Code is as follows: create table if not exists 'class '(
'Id' int (10) unsigned not null AUTO_INCREMENT,
'Card 'int (10) unsigned not null,
Primary key ('id ')
);
Create table if not exists 'book '(
'Bookid' int (10) unsigned not null AUTO_INCREMENT,
'Card 'int (10) unsigned not null,
Primary key ('bookid ')
);
Create table if not exists 'phone '(
'Phoneid' int (10) unsigned not null AUTO_INCREMENT,
'Card 'int (10) unsigned not null,
Primary key ('phoneid ')
) Engine = innodb;

Then insert a large amount of data separately. Php script for data insertion:Copy codeThe Code is as follows: <? Php
$ Link = mysql_connect ("localhost", "root", "870516 ");
Mysql_select_db ("test", $ link );
For ($ I = 0; I I <10000; $ I ++)
{
$ J = rand (1, 20 );
$ SQL = "insert into class (card) values ({$ j })";
Mysql_query ($ SQL );
}
For ($ I = 0; I I <10000; $ I ++)
{
$ J = rand (1, 20 );
$ SQL = "insert into book (card) values ({$ j })";
Mysql_query ($ SQL );
}
For ($ I = 0; I I <10000; $ I ++)
{
$ J = rand (1, 20 );
$ SQL = "insert into phone (card) values ({$ j })";
Mysql_query ($ SQL );
}
Mysql_query ("COMMIT ");
?>

Then let's look at a left join query:Copy codeThe Code is as follows: explain select * from class left join book on class. card = book. card \ G

The analysis result is:Copy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

Obviously, we need to optimize the second ALL.
Try to create an index:Copy codeThe Code is as follows: alter table 'book' add index y ('Card ');

Copy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ref
Possible_keys: y
Key: y
Key_len: 4
Ref: test. class. card
Rows: 1000
Extra:
2 rows in set (0.00 sec)

We can see that the type in the second row is changed to ref, and rows is also changed to 1741*18, which is obviously optimized. This is determined by the left join feature. The left join condition is used to determine how to search rows from the right table. There must be rows on the LEFT. Therefore, the right side is our key point and indexes must be created.
Delete the old index:Copy codeThe Code is as follows: drop index y ON book;

Create a new index.Copy codeThe Code is as follows: alter table 'class' add index x ('Card ');

ResultCopy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

Almost unchanged.
Then let's look at a right connection query:Copy codeThe Code is as follows: explain select * from class right join book on class. card = book. card;

The analysis result is:Copy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ref
Possible_keys: x
Key: x
Key_len: 4
Ref: test. book. card
Rows: 1000
Extra:
2 rows in set (0.00 sec)

The optimization is obvious. This is because the right join condition is used to determine how to search rows from the left table, and there must be rows on the RIGHT. Therefore, the left is our key point and indexes must be created.
Delete the old index:Copy codeThe Code is as follows: drop index x ON class;

Create a new index.Copy codeThe Code is as follows: alter table 'book' add index y ('Card ');

ResultCopy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

Almost unchanged.

Finally, let's take a look at the inner join situation:

Copy codeThe Code is as follows: explain select * from class inner join book on class. card = book. card;

Result:Copy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ref
Possible_keys: x
Key: x
Key_len: 4
Ref: test. book. card
Rows: 1000
Extra:
2 rows in set (0.00 sec)

Delete the old index:Copy codeThe Code is as follows: drop index y ON book;

ResultCopy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

Create a new index.Copy codeThe Code is as follows: alter table 'class' add index x ('Card ');

ResultCopy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
2 rows in set (0.00 sec)

In conclusion, inner join and left join are similar, and the right table needs to be optimized. Right join needs to optimize the left table.

Let's take a look at the three-Table query example.

Add a new index:

Copy codeThe Code is as follows: alter table 'phone' add index z ('Card ');
Alter table 'book' add index y ('Card ');

Copy codeThe Code is as follows: explain select * from class left join book on class. card = book. card left join phone on book. card = phone. card;

Copy codeThe Code is as follows: **************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: class
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20000
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: book
Type: ref
Possible_keys: y
Key: y
Key_len: 4
Ref: test. class. card
Rows: 1000
Extra:
* *************************** 3. row ***************************
Id: 1
Select_type: SIMPLE
Table: phone
Type: ref
Possible_keys: z
Key: z
Key_len: 4
Ref: test. book. card
Rows: 260
Extra: Using index
3 rows in set (0.00 sec)

The type of the last two rows is ref and the overall rows optimization is good.

The explain syntax in MySql can help us rewrite the query and optimize the table structure and index settings to maximize query efficiency. Of course, when there is a large amount of data, the indexing and maintenance cost is also high. It usually takes a long time and a large amount of space. If you create an index on different column combinations, the space overhead is higher.Therefore, it is best to set indexes in fields that require frequent queries.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.