When analyzing query performance, it is also useful to consider the explain keyword. The Explain keyword is typically placed in front of a SELECT query statement to describe how MySQL performs a query operation and the number of rows that MySQL needs to perform to successfully return the result set. Explain can help us analyze the SELECT statement, let us know why the query is inefficient, and thus improve our query so that the query optimizer can work better.
How does the MySQL query optimizer work
The MySQL query optimizer has several goals, but the main goal is to use the index as much as possible and use the strictest indexes to eliminate as many rows of data as possible. The ultimate goal is to submit a SELECT statement to find rows of data instead of excluding rows of data. The reason the optimizer tries to exclude rows of data is that the faster it excludes rows of data, the faster the data rows that match the criteria are found. If you can do the most rigorous testing first, the query can execute faster.
Each output row of the EXPLAIN provides information about a table, and each row includes the following columns:
Item |
Description |
Id |
MySQL query Optimizer The serial number of the query in the selected execution plan. Represents the order in which a SELECT clause or action table is executed in a query, and the higher the priority of the ID value, the more it executes first. IDs are the same, and the execution order is from top to bottom. |
Select_type Query Type |
Description |
Simple |
Simple select query with no union and subqueries |
PRIMARY |
The outermost select query |
UNION |
The second or subsequent select query in the UNION, which does not depend on the result set of the external query |
DEPENDENT UNION |
The second or subsequent select query in the UNION, dependent on the result set of the external query |
Subquery |
The first select query in a subquery that does not depend on the result set of an external query |
DEPENDENT subquery |
The first select query in a subquery, dependent on the result set of an external query |
DERIVED |
Used in cases where subqueries are in the FROM clause. MySQL executes these subqueries recursively, putting the results in a temporary table. |
Uncacheable subquery |
The result set cannot be cached subqueries and must be reassessed for each row of the outer query. |
Uncacheable UNION |
The second or subsequent select query in the UNION, belonging to a subquery that is not cached |
Item |
Description |
Table |
The table referenced by the output line |
Type important items that show the types of connections used, sorted by best to worst type |
Description |
System |
Table has only one row (= system table). This is a special case of the const connection type. |
Const |
Const is used to compare PRIMARY keys with constant values. System is used when the query's table has only one row. |
Eq_ref |
Const is used to compare PRIMARY keys with constant values. System is used when the query's table has only one row. |
Ref |
A connection cannot select a single row based on a keyword, and may find multiple rows that meet the criteria. It is called ref because the index is compared to a reference value. The reference value is either a constant or a result value of a multiple-table query from a list. |
Ref_or_null |
Like ref, but MySQL must find the null entry in the results of the initial lookup, and then make two lookups. |
Index_merge |
Explains that index merge optimization is in use. |
Unique_subquery |
This type is used in some in queries instead of the regular ref:value in (SELECT primary_key from single_table WHERE some_expr) |
Index_subquery |
This type is used in some in queries, similar to Unique_subquery, but the query is a non unique index: value in (SELECT key_column from single_table WHERE some_ Expr |
Range |
Retrieves only the rows of a given range, using an index to select rows. The key column shows which index is used. When using the =, <>, >, >=, <, <=, is NULL, <=>, BETWEEN, or in operators, you can use range when you compare key columns with constants. |
Index |
Full table scans, which are done in index order instead of rows when scanning tables. The main advantage is to avoid sorting, but the overhead is still very large. |
All |
Worst case scenario, full table scan from beginning to end. |
Item |
Description |
Possible_keys |
Indicates which indexes MySQL can use in this table to help with queries. If NULL, there is no index available. |
Item |
Description |
Key |
MySQL actually chooses the index to use from Possible_key. If NULL, the index is not used. In rare cases, MYSQL chooses to optimize an index that is insufficient. In this case, the use index (indexname) can be used in a SELECT statement to force an index or to force MYSQL to ignore the index with IGNORE index (indexname) |
Item |
Description |
Key_len |
The length of the index used. Without loss of accuracy, the shorter the length the better. |
Item |
Description |
Ref |
Shows which column of the index is used |
Item |
Description |
Rows |
The number of rows that MYSQL considers must be checked to return the requested data |
Item |
Description |
Rows |
The number of rows that MYSQL considers must be checked to return the requested data |
The following 2 entries in extra mean that MYSQL cannot use indexes at all, and efficiency can be significantly affected. This should be optimized to the extent possible.
Extra items |
Description |
Using Filesort |
Indicates that MySQL will use an external index for the result instead of reading the contents in indexed order. May be sorted on memory or on disk. The sort operation in MySQL that cannot take advantage of index completion is called "File sort" |
Using Temporary |
Indicates that MySQL uses temporary tables when sorting query results. Common in sort order by and grouped query group by. |
Here is an example to illustrate the use of the explain.
First come to a table:
Copy Code code as follows:
CREATE TABLE IF not EXISTS ' article ' (' id ' int (a) unsigned not NULL auto_increment,
' author_id ' int (a) unsigned not NULL,
' category_id ' int (a) unsigned not NULL,
' Views ' int (a) unsigned not NULL,
' Comments ' int (a) unsigned not NULL,
' title ' varbinary (255) not NULL,
' Content ' text not NULL,
PRIMARY KEY (' id ')
);
Insert a few more data:
Copy Code code as follows:
INSERT into ' article '
(' author_id ', ' category_id ', ' views ', ' comments ', ' title ', ' content ') VALUES
(1, 1, 1, 1, ' 1 ', ' 1 '),
(2, 2, 2, 2, ' 2 ', ' 2 '),
(1, 1, 3, 3, ' 3 ', ' 3 ');
Demand:
Query category_id is 1 and comments is greater than 1, the views most article_id.
Check it out first:
Copy Code code as follows:
EXPLAIN
SELECT author_id
From ' article '
WHERE category_id = 1 and comments > 1
Order by Views DESC
LIMIT 1\g
Look at some of the output results:
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:article
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
Rows:3
Extra:using where; Using Filesort
1 row in Set (0.00 sec)
Obviously, type is all, which is the worst case. A Using Filesort is also present in the Extra, which is also the worst case. Optimization is a must.
Well, the simplest solution would be to add an index. OK, let's have a try. The Category_id,comments,views three fields are used after the query's condition, where it is. Then a federated index is the easiest.
Copy Code code as follows:
ALTER TABLE ' article ' ADD INDEX x (' category_id ', ' comments ', ' views ');
The result was a certain improvement, but it was still bad:
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:article
Type:range
Possible_keys:x
Key:x
Key_len:8
Ref:null
Rows:1
Extra:using where; Using Filesort
1 row in Set (0.00 sec)
Type becomes range, which can be tolerated. However, using Filesort in extra is still unacceptable. But we have indexed, why not? This is because according to the working principle of the Btree index, sort the category_id first, and then sort the comments if you encounter the same category_id, and then sort the views if you encounter the same comments. When the comments field is in the middle position in the federated index, because the comments > 1 condition is a range value (called range), MySQL cannot use the index to retrieve the following views section, which means the index after the range type query field is invalid.
Then we need to discard comments and delete the old index:
Copy Code code as follows:
Then create a new index:
Copy Code code as follows:
ALTER TABLE ' article ' ADD INDEX y (' category_id ', ' views ');
Then run the query again:
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:article
Type:ref
Possible_keys:y
Key:y
Key_len:4
Ref:const
Rows:1
Extra:using where
1 row in Set (0.00 sec)
As you can see, the type becomes the Using filesort in Ref,extra and disappears, and the result is ideal.
Let's look at an example of a multiple-table query.
First, define 3 table class and room.
Copy Code code as follows:
CREATE TABLE IF not EXISTS ' class ' (
' ID ' int (a) unsigned not NULL auto_increment,
' Card ' int (a) unsigned not NULL,
PRIMARY KEY (' id ')
);
CREATE TABLE IF not EXISTS ' book ' (
' BookID ' int (a) unsigned not NULL auto_increment,
' Card ' int (a) unsigned not NULL,
PRIMARY KEY (' BookID ')
);
CREATE TABLE IF not EXISTS ' phone ' (
' Phoneid ' int (a) unsigned not NULL auto_increment,
' Card ' int (a) unsigned not NULL,
PRIMARY KEY (' Phoneid ')
) engine = InnoDB;
Then insert a large amount of data separately. PHP script to insert data:
Copy Code code as follows:
<?php
$link = mysql_connect ("localhost", "root", "870516");
mysql_select_db ("Test", $link);
For ($i =0 $i <10000; $i + +)
{
$j = rand (1,20);
$sql = "INSERT into class (card) values ({$j})";
mysql_query ($sql);
}
For ($i =0 $i <10000; $i + +)
{
$j = rand (1,20);
$sql = "INSERT into book (card) VALUES ({$j})";
mysql_query ($sql);
}
For ($i =0 $i <10000; $i + +)
{
$j = rand (1,20);
$sql = "INSERT into the phone (card) values ({$j})";
mysql_query ($sql);
}
mysql_query ("COMMIT");
?>
Then look at a left-join query:
Copy Code code as follows:
Explain select * FROM class LEFT join book on class.card = Book.card\g
The results of the analysis are:
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:class
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:book
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2 rows in Set (0.00 sec)
Obviously the second all is for us to optimize.
Try setting up an index:
Copy Code code as follows:
ALTER TABLE ' book ' ADD INDEX y (' card ');
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:class
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:book
Type:ref
Possible_keys:y
Key:y
Key_len:4
Ref:test.class.card
rows:1000
Extra:
2 rows in Set (0.00 sec)
You can see that the type of the second row becomes ref,rows and 1741*18, and the optimization is more obvious. This is determined by the left connection attribute. The left JOIN condition is used to determine how to search for rows from the right table, there must be a left-hand side, so the right side is our key point and must be indexed.
To delete an old index:
Copy Code code as follows:
Create a new index.
Copy Code code as follows:
ALTER TABLE ' class ' ADD INDEX x (' card ');
Results
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:class
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:book
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2 rows in Set (0.00 sec)
Basically no change.
Then look at a right connection query:
Copy Code code as follows:
Explain select * FROM class right join book on class.card = Book.card;
The results of the analysis are:
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:book
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:class
Type:ref
Possible_keys:x
Key:x
Key_len:4
Ref:test.book.card
rows:1000
Extra:
2 rows in Set (0.00 sec)
Optimization is more obvious. This is because the right JOIN condition is used to determine how to search for rows from the left table, which must be on the right-hand side, so the left side is our key point and must be indexed.
To delete an old index:
Copy Code code as follows:
Create a new index.
Copy Code code as follows:
ALTER TABLE ' book ' ADD INDEX y (' card ');
Results
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:class
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:book
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2 rows in Set (0.00 sec)
Basically no change.
Finally, let's look at the inner join:
Copy Code code as follows:
Explain SELECT * FROM class INNER join book on class.card = Book.card;
Results:
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:book
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:class
Type:ref
Possible_keys:x
Key:x
Key_len:4
Ref:test.book.card
rows:1000
Extra:
2 rows in Set (0.00 sec)
To delete an old index:
Copy Code code as follows:
Results
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:class
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:book
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2 rows in Set (0.00 sec)
Create a new index.
Copy Code code as follows:
ALTER TABLE ' class ' ADD INDEX x (' card ');
Results
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:class
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:book
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2 rows in Set (0.00 sec)
To sum up, inner join and left join, all need to optimize the right table. And right join needs to optimize the left table.
Let's take another look at the three-table query example
To add a new index:
Copy Code code as follows:
ALTER TABLE ' phone ' ADD INDEX z (' card ');
ALTER TABLE ' book ' ADD INDEX y (' card ');
Copy Code code as follows:
Explain select * FROM class LEFT join book on Class.card=book.card left join phone on book.card = Phone.card;
Copy Code code as follows:
1. Row ***************************
Id:1
Select_type:simple
Table:class
Type:all
Possible_keys:null
Key:null
Key_len:null
Ref:null
rows:20000
Extra:
2. Row ***************************
Id:1
Select_type:simple
Table:book
Type:ref
Possible_keys:y
Key:y
Key_len:4
Ref:test.class.card
rows:1000
Extra:
3. Row ***************************
Id:1
Select_type:simple
Table:phone
Type:ref
Possible_keys:z
Key:z
Key_len:4
Ref:test.book.card
rows:260
Extra:using Index
3 Rows in Set (0.00 sec)
The type in the 2 rows is ref and the total rows optimization is good and works fine.
The explain syntax in MYSQL can help us rewrite queries, optimize the structure of tables, and index settings, thus maximizing query efficiency. Of course, in large-scale data, the cost of the establishment and maintenance of the index is also very high, often requires a long time and a larger space, if the index on different combinations of columns, the cost of space will be greater.
Therefore, the index is best set in fields that require frequent queries.