[MySQL Notes] SQL optimization tool-explain the output format of the explain command
If you have experience using MySQL, you may encounter slow SQL queries in your project. In some scenarios, you can easily locate the problem (for example, when a single table operation has slow SQL queries, check the SQL statement carefully and it is usually easy to locate the index problem). In some complex business scenarios (such as multi-table joint query of dozens of fields and perform group or sort operations ), manual check of SQL statements is usually difficult to find the root cause of SQL bottlenecks. At this time, the explain Command provided by MySQL came in handy.
This note mainly describes the explain output results and provides the optimization ideas for SQL based on the explain output.
1. EXPLAIN syntax and usage
For more information about the syntax of the explain command, see the official documentation.
This command is mainly used to output the SQL Execution Plan of MySQL optimizer, that is, MySQL will explain how to process the input SQL (whether to use an index or which index to use, the order in which multiple tables are joined and the associated fields ).
The explain prompt can help you to understand which fields should be indexed, or confirm whether optimizer will join multiple tables in a reasonable order during SQL Execution. For example, if there is an SQL statement like this:
Select t1.id, t2.link, t3.detail from t1, t2, t3 where t1.id <100 and t1.id = t2.base _ id and t3.link _ sign = t2.sign;
When this statement is executed, optimizer may not necessarily join the three tables in the order listed from, and the join order of the tables may affect the SQL Performance.
In this scenario, if you want optimizer to perform join operations in the order listed by the from statement, there are two methods:
1) Add the STRAIGHT_JOIN keyword after the select keyword to prompt optimizer to join in the order listed in the from table. For the specific syntax, see the SELECT Document.
2) Adjust the position of the joined fields of each table in the SQL where condition before and after the equal sign
The following content describes how to determine the execution sequence of optimizer on each table during multi-table join through the explain output, and how to adjust the SQL statement to influence the execution plan of optimizer.
2. EXPLAIN output format description
The explain command returns a line of information for each table in SQL to explain how optimizer operates the table, the order of the tables listed in the output is also the processing order of each table when MySQL actually executes the SQL statement.
MySQL uses
Nested-loopThe algorithm processes all join operations. The algorithm principle is described here. It is helpful to understand join operations. We recommend that you understand them.
The explain statement contains the following fields for each row of records output from each table:
The descriptions are as follows.
1) id
This field identifies the select statement id. If there is only one select statement in SQL (even for multi-table join queries), this value is 1, otherwise it will increase sequentially. If SQL is the result of union, the value is NULL.
2) select_type
This field describes the type of the select statement. The possible values are as follows (from the official website ):
Simple is the most common type, indicating that SQL only contains one select statement; derived indicates the data table represented by this row (derived table) in fact, it is the output result of the subquery contained in the from clause. Other types are easy to understand. Read the official documentation and do not go into details here.
3) table
This field indicates the table from which the dataset is represented by each line output by explain. Its value is usually the specific table name. When the dataset is the result of union, its value may
Yes <unionM, N>. When the dataset comes from the derived table, its value may be <derivedN>. The M or N mentioned here are the values of the id field.
4) type
This field shows how tables are joined. The value of this field is complex. For details, see the documentation on the official website. Only the most common values are listed here.
A. system/const
Const indicates the dataset represented by the preceding "table" field. Only one row of records can hit the query conditions of the current execution plan, for example, where
When the clause compares the primary key or unique index of a table with the constant, the type field corresponding to the execution plan is const.
System is only a special case of the const value. It indicates that the data set to be operated in this step has only one row of records.
They can only appear in the type field value of a single table query SQL.
B. eq_ref
This value indicates that the associated fields in the dataset in this step are index fields and
Only one recordMeets the association conditions of the dataset for which the operation is planned in the previous step.
This
Is the optimal join type that may be obtained when multi-table join queries are performed.(Because it usually indicates that the associated field is the primary key or unique index of the table to be operated in this execution plan ).
C. ref
This value indicates that the joined field in the dataset that executes the plan operation in this step is the index field
Not only one recordMeets the association conditions of the dataset for which the operation is planned in the previous step.
More than one record that meets the association condition indicates that the association field is not a primary key or unique index. When the number of records that meet the association condition is small, this join_type = 'ref 'scenario is reasonable, but it is obviously not as efficient as join_type = 'eq _ ref.
D. ref_or_null
This join type is similar to the ref type, but it indicates that MySQL will perform additional searches for fields containing NULL values. For example, the join type of the following SQL statement is ref_or_null:
SELECT * FROM ref_table WHERE key_column = expr OR key_column is null;
E. index_merge
This value indicates that MySQL performs index merge optimization on the execution plan of this step. The SQL statements that trigger index merge usually include the 'or' operation. Common examples are as follows:
SELECT * FROM tbl_name WHERE key1 = 10 OR key2 = 20;
SELECT * FROM tbl_name WHERE (key1 = 10 OR key2 = 20) AND non_key = 30;
SELECT * FROM t1, t2 WHERE (t1.key1 IN (1, 2) OR t1.key2 LIKE 'value % ') AND t2.key1 = t1.some _ col;
SELECT * FROM t1, t2 WHERE t1.key1 = 1 AND (t2.key1 = t1.some _ col OR t2.key2 = t1.some _ col2 );
F. range
This value indicates the execution plan of this step.
There are more than one record that only operates a single table and meets the query conditions.May appear in SQL statements with in or between operations.
It is only applicable to single table operation scenarios. It is also different from the previously mentioned join_type scenario where the value of join_type is 'ref '.Because the ref may appear in the single-table/Multi-table join operation scenario.
G. ALL
This value indicates that the full table scan is performed on the dataset in this step, which is a scenario that must be optimized. Generally, you can create a proper index for some fields to avoid full table scanning.
H. index
This value indicates that MySQL scans the index tree when executing this step, while ALL scans the entire table.
It may appear in two scenarios:
Case1.The field to be returned by the select statement in this step is the leftmost prefix matching set of the index field of the dataset. If table A has created A joint index (f1, f2) containing two fields, select f1 from A where f2 = 'yyy' may trigger MySQL to scan the index tree, in this case, the Extra field of the Execution Plan contains "Using index" to indicate that it is scanning the index tree, because f1 is (f1, f2) the final prefix matches one element in the Set, while select f2 from A where f2 = 'yyy' scans the entire table.
Case2.In this step, we plan to perform a full table scan in the order of indexes to search for qualified data. In this case, the Extra field of the execution plan does not contain "Using index". This full table scan is also a scenario that must be optimized.
5) possible_keys
The value of this field may be used as an index by MySQL. If the value is NULL, no field will be used as an index, so the query efficiency is not high. In this case, you need to optimize the index structure of the data table.
6) key
The value of this field is the index actually used by MySQL.
It is worth noting that:The value of this field may not be a candidate index field listed by possible_keys. For example, the field to be returned by the current query SQL is the leftmost prefix matching field of an index field in the data table, however, if the SQL where condition does not use the index field of the data table, possible_keys may be NULL, and the key field value may be the data table index field that can cover the field to be queried, at this time, MySQL will scan the index tree, although inefficient, but it is faster than scanning the entire table. This scenario is exactly the case1 mentioned in the previous explanation of join_type = 'index.
In addition, you can use "force index" or "use index" in the select statement to force MySQL to use the candidate index fields listed in possible_keys.
7) key_len
The value of this field indicates the length of the key field. When MySQL uses a joint index field as the index used for SQL Execution, the key_len field can imply to what extent MySQL uses the Union Index (the leftmost prefix match field for how long. If the key field value is NULL, The key_len field value is also NULL.
8) ref
The value of this field indicates which column or constants in the data table will be used for comparison with the index specified by the key field.
9) rows
The value of this field indicates the number of rows that must be scanned when MySQL executes the query corresponding to this plan.
This value is of reference significance for SQL optimization. Generally, the smaller the value, the higher the query efficiency.
10) Extra
The value of this field contains other additional information when MySQL executes the query. This field may have many values. For details, refer to the instructions on the official website.
In addition to explain, MySQL also supportsExplain extendedCommand to analyze the optimizer execution plan. The latter has one more filtered field in the output result, and the output extra information can be analyzed using the show warnings statement.
3. How to optimize the SQL statement based on explain output
If you understand the meaning behind each field in the explain output result, optimizing SQL Performance accordingly will become efficient and reliable.
The concept of trouble shoot inefficient SQL using explain in engineering practice is related to the engineer's experience and ability. Here we recommend several technical documents to encourage others.
1) a PDF on the official Oreilly Website: Explain Demystified
2) A sharing document on slideshare: Mysql Explain Explained, which explains MySQL explain output step by step and the typical ideas for optimizing SQL accordingly, which is worth reading.
3) an article on the official technical blog of Meituan: MySQL index principle and slow query optimization. This article introduces the principles of B + Tree, it also provides several engineering scenarios using explain to optimize SQL, which is worth further reading.