MySQL execution Plan explain detailed 2015-08-10 13:56:27
Category: MySQL
The explain command is the primary way to see how the query optimizer decides to execute a query.
This feature has limitations and does not always tell the truth, but its output is the best information that can be obtained, and it is worth taking the time to understand, because you can learn how the query executes.
Call explain
Adding Explain,mysql before select sets a tag on the query that, when executing the query plan, causes it to return information about each step in the execution plan, rather than executing it.
It returns one or more rows of information, showing each part of the execution plan and the order of execution.
This is a simple explain effect:
Each table in the query has only one row in the output, and if the query is a join of two tables, there will be two rows in the output.
The alias form is counted as a table, so if you join a table with yourself, there will be two rows in the output.
The meaning of "table" here is quite wide, can be a subquery, a union result and so on.
And there are two variants of explain.
EXPLAIN extended will tell the server to "reverse compile" the execution plan as a SELECT statement.
You can see this generated statement immediately after running show warnings, which comes directly from the execution plan, not the original SQL statement, which has become a data structure.
In most scenarios, it is not the same as the original statement, and you can detect how the query even Citi translates the statements.
EXPLAIN extended is available in MySQL 5.0 and above, and a filtered column is added in 5.1.
EXPLAIN partitions will display the partition that the query will access if the query is based on a partitioned table.
It will be available in MySQL version 5.1 and above.
Explain restrictions:
· Explain will never tell you how triggers, stored procedures, or UDFs affect queries
· Stored procedures are not supported, although you can manually extract queries and explain them individually
· It will not tell you the specific optimizations that MySQL has made in the execution plan
· It does not display all information about the execution plan of the query
· It does not differentiate between things with the same name, for example, it uses "Filesort" for both memory arrangement and temporary files, and displays "Using temporary" for both on-disk and in-memory temporary tables
· Can be misleading, for example, it will display a full index scan for a query with a very small limit (MySQL 5.1 explain will show more accurate information about the number of rows checked, but earlier versions did not consider limit)
overriding non-select queries
MySQL explain can only interpret select queries and does not interpret stored program calls and INSERT, UPDATE, DELETE, or other statements. However, you can rewrite some non-select queries to take advantage of explain. To achieve this, simply convert the statement to an equivalent select that accesses all the same columns, and any volume column must be in the select list, the associated clause, or the WHERE clause.
If you want to rewrite the following update statement so that it can take advantage of explain
- UPDATE Sakila.actor
- INNER JOIN sakila.film_actor USING (actor_id)
- SET actor.last_update=film_actor.last_update;
The following explain statement is not equivalent to the above update because it does not require the server to get the Last_update column from any one of the tables
This difference is very important. For example, the output shows that MySQL will use the overwrite index, but when retrieving and updating the last_updated column, the overwrite index cannot be used, and the following rewriting method is closer to the original statement:
Rewriting queries like this is not very scientific, but it's good enough to help understand how queries are done.
(MySQL 5.6 will allow interpretation of non-select queries)
When displaying a query plan, it is important to understand that there is no "equivalent" read query for a write query. A select query only needs to find a copy of the data and return it. Any query that modifies the data must find and modify all of its copies on all indexes, which is often much more expensive than a seemingly equivalent select query.
columns in the explain
The meaning of each column in the explain results is shown in the next section.
The rows in the output appear in the order in which MySQL actually executes the query part, which is not always consistent with the original SQL.
"id column"
This column always contains a number that identifies the row to which the select belongs, and if there is no subquery or union in the statement, there will be only a unique select, so each row will display a 1 in the column, otherwise the inner SELECT statement will generally be numbered sequentially, corresponding to its position in the original statement.
MySQL will select query atmosphere simple and complex types, complex types can be divided into three categories: simple subqueries, so-called derived tables (subqueries in the FROM clause), and union queries.
The following is a simple subquery:
The subquery and union in the FROM clause adds more complexity to the ID column.
The following is a basic subquery in the FROM clause:
As you know, this query executes with an anonymous temporary table, and MySQL internally references the temporary table in the outer query through an alias (Der), and you can see the ref column in more complex queries.
Finally, here is a union query:
Note that the third additional row, the result of the Union is always placed in an anonymous temporary table, after MySQL reads the result out of the temporary table, the temporary table does not appear in the original SQL, so its ID is listed as null.
Compared to the previous example (in the FROM clause that demonstrates the subquery), the temporary table generated from the query appears in the result in the last row instead of the first row.
So far these are very straightforward, but the mix of these three types of statements makes the output very complex, as we'll see later.
"select_type column"
This column shows whether the corresponding row is a simple or complex select (which, if the latter, is one of the three complex types). The simple value means that the query does not include the subquery and union, and if the query has any responsible sub-parts, the outermost part is marked as primary, and the other parts are marked as follows:
Subquery
Select (in other words, not in the FROM clause) that is contained in a subquery in the select list is marked as subquery
DERIVED
The derived value is used to denote that the select,mysql contained in a subquery in the FROM clause is executed recursively and the result is placed in a temporary table. The server internally calls its "derived table" because the temporary table is derived from the subquery.
UNION
The second and subsequent select in the Union is marked as Unoin, and the first select is marked as if it were executed as a partial out query. This is why the first select in the union in the previous example is shown as primary. If the union is contained by a subquery in the FROM clause, then its first select is marked as derived.
UNION RESULT
The select that is used to retrieve the results from an anonymous temporary table from union is marked as union result.
In addition to these values, subquery and union can also be marked as dependent and uncacheable.
Dependent means that select relies on the data found in the outer query.
Uncacheable means that some features in select prevent the results from being cached in a item_cache.
(Item_cache is not documented, it is not the same as the query cache, although it can be negated by some of the same types of artifacts, such as the rand () function.) )
"table column"
This column shows which table the corresponding row is accessing and, in general, it is fairly straightforward: which table it is, or the table's name (if an alias is defined in SQL).
You can see from the top of this column that the MySQL Association Optimizer selects the Association order for the query, for example, you can look at the order in which MySQL chooses the association order in the following query different from the one specified in the statement:
The MySQL execution plan is always the left depth priority tree, and if you drop the plan, you can read the leaf nodes sequentially, which correspond directly to the rows in explain, and the previous query plan looks like this:
derived tables and unions
Table columns can become much more complex when there are subqueries or unions in the FROM clause, and there is really no "table" to refer to because the anonymous temporary table created by MySQL only exists during query execution.
When there is a subquery in the FROM clause, the table column is in the form of <derivedN>, where n is the ID of the subquery. This is always "forward reference"-in other words, n points to the following line in the explain output.
When there is a union, the table column of union result contains a list of IDs that participate in the Union. This is always "backward reference", since union result appears after all the participating rows in the union, and if more than 20 id,table Vallecano in the list are truncated to prevent too long, it is not possible to see all the values at this time. Fortunately, you can still guess which lines are included, because you can see the ID of the first row, and everything that appears between this line and union result is somehow contained.
An example of a complex select type
Here is a meaningless query, which we use as a compact example of some complex select type
The limit clause is only for convenience, in case you intend not to implement it in explain manner to see the results.
Here are some of the results of explain:
We deliberately let each query part access different tables so that we can figure out where the problem is, but it's still hard to solve, starting with the above:
Line 1th forward refers to der_1, which is marked as <DERIVED3>, and in the original SQL is line 2nd, to see which rows in the output refer to the SELECT statement in <derived3>, looking down.
Line 2nd, whose ID is 3 because it is part of the 3rd select in the query, is classified as the derived type because it is nested inside the subquery in the FROM clause and is in line 4th in the original SQL.
The 3rd row has an ID of 2, and the 3rd line in the original SQL, it is reasonable to note that it is behind a line with a higher ID, implying that it is executed later. It is classified as dependent subquery, which means that the result depends on the outer query (i.e., a correlated subquery). The outer query in this example starts at line 2nd and retrieves the select of the data from the der_1.
Line 4th is classified as union, which means that it is the 2nd or subsequent select in the Union, whose table is <derived6>, which means that the data is retrieved from the subquery of the clause from and attached to the temporary table of Union. As before, to find the explain line that displays the query plan for this subquery, look down.
Line 5th is the der_2 subquery in line 8th of the original SQL, explain called it <derived6>.
Line 6th is an ordinary subquery in the select list of <derived6>, which has an ID of 7, which is very important ...
...... Because it is larger than 5, and 5 is the ID of line 7th. Why is it important? Because it shows the boundaries of the <derived6> subquery. When explain outputs a row of select type derived, a "nested range" is started. If the ID of the subsequent row is smaller (in this case, 5 is less than 6), it means that the nested range has been closed. This lets us know that line 7th is part of the select list that retrieves data from <derived6>-for example, the first part of the select list in line 4th (the 7th line in the original SQL). This example is fairly easy to understand and does not need to know the meaning and rules of nested scopes, but it is sometimes not so easy. About this line in the output another thing to note is that since there are user variables, it is listed as uncacheable subquery.
The last line of union result, which represents the stage in which rows are read from a temporary table in union. You can start with this line backwards, and if you like, it returns the row results with IDs 1 and 4, which refer to <derived3> and <derived6> respectively
As you can see, the combination of these complex select types makes the output of the explain very difficult to understand, and it is easier to understand the rules, but it still takes a lot of time.
Reading the output of explain often needs to jump in the list, for example, to see the 1th line of output, just staring at, it is impossible to know that it is part of the Union, only to see the last 1 lines you will understand.
the Type column
"
The MySQL user manual says this column shows the "Association type", but we think it's more accurate to say that the type of access--in other words, MySQL decides how to find rows in a table. The following are the most important methods of access, from worst to best:
All:
This is called a full-table scan, meaning that MySQL has to scan the entire table, from beginning to end, to find the rows that are needed. (There is an exception, such as using limit in the query, or "Using Distinct/not exists" in the extra column.)
Index
This is the same as a full-table scan, except that MySQL scans the table in index order instead of rows, its main advantage is to avoid sorting, the biggest disadvantage is to bear the cost of reading the entire table in indexed order. This usually means that if the rows are accessed in random order, the overhead will be very large.
If you see "Using index" in the extra column, it means that MySQL is using index overrides, which only scans the indexed data instead of each row in the index order, which is much less expensive than a full table scan in indexed order.
Range
A range scan is a restricted index scan that starts with a certain point in the index and returns a row that matches that value, which is a bit better than a full index scan because it doesn't have to traverse all the indexes, and the obvious scan is a query with between or a > in the WHERE clause.
When MySQL uses an index to find a range of values, such as in () and or lists, it is also shown as a range scan, however, these two are quite different types of access and have significant performance differences.
The cost of such a scan is equivalent to the index type.
Ref
This is an index access (sometimes called an index lookup) that returns all rows that match a single value, however, it may find more than one qualifying row, so it is a mixture of lookups and scans, which can only occur if the non-unique prefix of a non-unique index or a uniqueness index is used. It is called ref because the index is compared to a reference value. This reference value is either a constant or a result value from the previous table in a multi-sheet query.
Ref_or_null is a variant above ref, which means that MySQL must perform a second lookup in the results of the initial lookup to find the null entry.
Eq_ref:
With this index lookup, MySQL knows that it will return at most one qualifying record, which can be seen when MySQL uses a primary key or a unique index lookup, which compares them to a reference value. MySQL has done an excellent job of optimizing this type of access because it knows that there is no need to estimate the range of matching rows or to continue looking after matching rows have been found.
Const,system:
When MySQL can optimize a part of a query and convert it to a constant, he will use these types of access, for example, if you select the primary key of this row by placing the primary key of a row in a WHERE clause, MySQL can convert the query into a constant. You can then effectively remove the table from the join execution.
Null:
This access means that MySQL can break down query statements during the optimization phase, or even need to access tables or indexes at the execution stage. For example, selecting the minimum value from an index column can be done by looking up the index separately, without having to access the table at execution time.
"Possible_keys column"
This column shows which indexes the query can use, based on the columns that the query accesses and the comparison operators that are used. This list is created early in the optimization process, so some of the listed indexes may not be useful for subsequent optimization processes.
"key Column"
This column shows which index MySQL decides to use to optimize access to the table. If the index does not appear in the Possible_keys column, then MySQL chooses it for another reason-for example, it may choose an overwrite index, even if there is no WHERE clause.
In other words, Possible_keys reveals which indexes can help you find efficiently, and key shows which index is optimized to minimize the cost of the query. Here is an example:
"key_len column"
This column shows the number of bytes that MySQL uses in the index, and if MySQL is using only some of the columns in the index, then you can use this value to figure out which columns to use, and keep in mind that MySQL 5.5 and previous versions can only work with the leftmost prefix of the index, for example, film_ The primary key for the actor is two smallint columns, and each smallint column is two bytes, so each entry in the index is 4 bytes, and here's an example of a query:
Based on the Key_len column in the results, you can infer that the query uses a unique first column--actor_id column to perform an index lookup, and when we calculate the usage of the column, be sure to consider the character set page of the character columns.
To view the execution plan:
The average length in this query is 13 bytes, which is the total length of column A and column B, the column A is 3 characters, each of the UTF8 is a maximum of 3 bytes, and column B is a 4-byte integer.
MySQL does not always show how much an index really uses, for example, if you perform a like query on a prefix pattern match, it will show that the full width of the column is being used.
The Key_len column shows the maximum possible length in the indexed field, not the actual number of bytes used in the table data, and in the previous example MySQL always shows 13 bytes, even if a column happens to contain only one character length. In other words, Key_len is computed by looking up the definition of a table, not the data in the table.
"ref column"
This column shows the columns or constants used by the previous table to look up values in the index of the key column record, below is an example that shows the association condition and alias Combination, and note that the Ref column reflects how the film table is aliased in the query text:
"Rows column"
This column is the number of rows that MySQL estimates to read in order to find the desired row. This number is the number of loops in the inline Loop Association plan, which means that it is not the number of rows that MySQL thinks it will eventually read from the table, but the average of the rows that MySQL must read in order to find the rows that meet the criteria at each point in the query. (This standard includes the conditions given in SQL and the top of a table from the join order.) )
Depending on the statistics of the table and the selection of the index, this estimate may be very imprecise, and in mysql5.0 and earlier versions, it also does not reflect the limit clause, for example, the following query does not really check 1057 rows.
By multiplying the values of all rows columns, you can roughly estimate the number of rows that the entire query will check, for example, the following query will check about 2600 rows.
Keep in mind that this number is the number of rows that MySQL thinks it wants to check, not the number of rows in the result set, but also realize that there are many optimizations, such as associative buffers and caches, that cannot affect the display of rows, and that MySQL may not have to really read all the rows it estimates. It also does not know any information about the operating system or the hardware cache.
"Extra column"
This column contains additional information that is not suitable for display in other columns. Most of the values that can appear here are recorded in the MySQL user manual.
The most important values that are common are as follows.
"Using Index"
This value indicates that MySQL will use the overwrite index to avoid accessing the table. Do not confuse the overlay index with the index access type.
"Using where"
This means that the MySQL server will filter after the storage engine retrieves rows, and many of the where conditions involve columns in the index, and when (and if) it reads the index, it can be inspected by the storage engine, so not all queries with a WHERE clause will display a "Using where". Sometimes the presence of a "Using where" is a hint: A query can benefit from a different index.
"Using Temporary"
This means that MySQL uses a temporary table when sorting the results of the query.
"Using Filesort"
This means that MySQL sorts the results by using an external index instead of reading rows from the table in the index order. MySQL has two kinds of file sorting algorithms, both of which can be done in memory or on disk, explain will not tell you which sort of file MySQL will use, and will not tell you whether the sort will be done in memory or on disk.
"Range checked for each record (index Map:n)"
This means that there is no good index, the new index will be recalculated on each line of the join, N is the bitmap that is displayed in the Possible_keys column, and is redundant.
output in a tree-shaped format
MySQL users often prefer to format the output of explain as a tree to show the execution plan more accurately. In fact, the way explain looks at the execution plan is a bit clumsy, the tree structure is not suitable for tabular output, and when there are a large number of values in the extra column, the disadvantage is even more obvious, as is the case with union, which is not the same as other types of joins that MySQL can do. It is not very suitable for explain.
If you have a good understanding of the rules and characteristics of explain, it is possible to use a tree-shaped execution plan. But this is a bit boring, it is best left to the automated tool processing, Percona Toolkit contains pt-visual-explain, it is such a tool.
improvements in MySQL 5.6
mysql5.6 will contain an important improvement to explain: the ability to interpret queries like update, INSERT, and so on, although you can convert DML statements to equivalent "select" Queries and explain, the results do not fully reflect how statements are executed. Thus this is still very helpful. When developing a pt-upgrade that uses similar Percona toolkit, we have discovered more than once that the optimizer does not execute as we expect in the code path when the query is converted to select. Thus explain a query without the need to convert to select is very helpful for us to understand exactly what happens during execution.
mysql5.6 will also include a series of modifications to the query optimization and execution engine, allowing anonymous temporary tables to be materialized as late as possible, rather than always creating and populating them when using partial queries to this temporary table, which allows MySQL to directly interpret query statements for the tape query without actually executing the subquery first.
Finally, mysql5.6 will improve the relevant parts of the optimizer by adding optimization tracking capabilities to the server, allowing users to view the choices that the optimizer sits on, as well as the input (for example, the cardinality of the index) and the reason for the decision. This is very helpful, not only for understanding the execution plan of the server selection, but also for why this plan is chosen.
MySQL Execution Plan explain detailed