MySQL "Using temporary; Using filesort": optimizing an INNER JOIN

Problem

The SHOW FULL PROCESSLIST statement makes it easy to find the problematic SQL statement, shown below:

SELECT post.*
FROM post
INNER JOIN post_tag ON post.id = post_tag.post_id
WHERE post.status = 1 AND post_tag.tag_id = 123
ORDER BY post.created DESC
LIMIT 100
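
On a busy server the SHOW FULL PROCESSLIST output can be long. As a rough sketch, the same information can be filtered through the information_schema.PROCESSLIST table. The 5-second threshold is an arbitrary assumption for illustration, and recent MySQL 8.0 releases deprecate this table in favor of performance_schema.processlist, so check which one your version provides.

```sql
-- List statements that have been running for a while, slowest first.
-- The 5-second threshold is an arbitrary choice, not a recommendation.
SELECT id, user, db, time, state, info
FROM information_schema.PROCESSLIST
WHERE command <> 'Sleep'   -- skip idle connections
  AND time >= 5            -- seconds spent in the current state
ORDER BY time DESC;
```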

Note: Because post and tag have a many-to-many relationship, an association table, post_tag, sits between them.
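
The article does not show the table definitions. The following is only a guess at a minimal schema that matches the query and the index names that appear in the EXPLAIN output in this article (status_created, tag_id, post_id). The column types, the title column, and the assumption that status_created covers (status, created) in that order are all hypothetical.

```sql
-- Hypothetical schema; only the table names, the index names and the
-- columns used in the query come from the article.
CREATE TABLE post (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  status  TINYINT      NOT NULL,
  created DATETIME     NOT NULL,
  title   VARCHAR(255) NOT NULL,            -- assumed payload column
  PRIMARY KEY (id),
  KEY status_created (status, created)      -- assumed column order
) ENGINE=InnoDB;

-- Association table for the many-to-many relationship.
CREATE TABLE post_tag (
  post_id INT UNSIGNED NOT NULL,
  tag_id  INT UNSIGNED NOT NULL,
  KEY post_id (post_id),
  KEY tag_id  (tag_id)
) ENGINE=InnoDB;
```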

Running EXPLAIN on the statement gives the following execution plan (output abbreviated to save space):

+----------+---------+-------+-----------------------------+
| table    | key     | rows  | Extra                       |
+----------+---------+-------+-----------------------------+
| post_tag | tag_id  | 71220 | Using where; Using filesort |
| post     | PRIMARY |     1 | Using where                 |
+----------+---------+-------+-----------------------------+

The optimized SQL statement is shown below. The only change is that the join type becomes STRAIGHT_JOIN:

SELECT post.*
FROM post
STRAIGHT_JOIN post_tag ON post.id = post_tag.post_id
WHERE post.status = 1 AND post_tag.tag_id = 123
ORDER BY post.created DESC
LIMIT 100

Running EXPLAIN again gives the following execution plan (output abbreviated to save space):

+----------+----------------+--------+-------------+
| table    | key            | rows   | Extra       |
+----------+----------------+--------+-------------+
| post     | status_created | 119340 | Using where |
| post_tag | post_id        |      1 | Using where |
+----------+----------------+--------+-------------+

Comparing the two EXPLAIN results, the optimized statement has a larger "rows" estimate but no longer shows "Using filesort". The rows estimate does not account for LIMIT: when rows are read in index order, the scan can typically stop once 100 matching rows have been found. Overall, performance improves.

Reminder: Pay attention to the order in which each table appears in the two EXPLAIN results. It will be explained later.

Explanation

For the first SQL statement, why does the MySQL optimizer choose a time-consuming execution plan? For the second, why does performance improve once the join type is changed to STRAIGHT_JOIN?

It all starts with how MySQL processes multi-table joins. First, the optimizer has to decide which table is the driving table, that is, the table the join starts from. Its rule of thumb is simple and crude: the table that yields the smaller result set becomes the driving table. The optimizer's actual behavior is of course much more complicated; for details, refer to: how to select the index and JOIN sequence of the MySQL optimizer.

Note: In the EXPLAIN result, the table that appears in the first row is the driving table.

Continuing with the example of joining post to post_tag, the MySQL optimizer has the following two options:

  • Use post as the driving table and use status_created index to filter. The result set contains 119340 rows.
  • Use post_tag as the driving table and use the tag_id index to filter the table. The result set contains 71220 rows.
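
One way to see roughly where such row estimates come from is to run EXPLAIN on each filter in isolation. This is only a sketch: the optimizer may pick a different index for the single-table query than it does inside the join, and the "rows" value is an estimate based on index statistics rather than an exact count.

```sql
-- Estimated rows if post drives the join (filter on status).
EXPLAIN SELECT id FROM post WHERE status = 1;

-- Estimated rows if post_tag drives the join (filter on tag_id).
EXPLAIN SELECT post_id FROM post_tag WHERE tag_id = 123;
```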

The result set filtered from post_tag is clearly smaller, so the MySQL optimizer picks post_tag as the driving table. Unfortunately, the query still has to be sorted by the created column of the post table. In other words, the sort column is not in the driving table, so "Using filesort", and sometimes even "Using temporary", becomes unavoidable.

Once the cause is understood, the optimization is straightforward. The sort column should be in the driving table wherever possible, so post has to be the driving table, and STRAIGHT_JOIN is used to force that join order.
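
Besides writing STRAIGHT_JOIN as the join operator, MySQL offers two other ways to pin the join order. They are sketched here as alternatives rather than as the article's method. The SELECT STRAIGHT_JOIN modifier joins tables in the order they are listed in FROM, and the JOIN_ORDER optimizer hint is available in MySQL 8.0 and later.

```sql
-- Option 1: SELECT-level modifier; tables are joined in FROM-clause order.
SELECT STRAIGHT_JOIN post.*
FROM post
INNER JOIN post_tag ON post.id = post_tag.post_id
WHERE post.status = 1 AND post_tag.tag_id = 123
ORDER BY post.created DESC
LIMIT 100;

-- Option 2 (MySQL 8.0+): optimizer hint naming the desired join order.
SELECT /*+ JOIN_ORDER(post, post_tag) */ post.*
FROM post
INNER JOIN post_tag ON post.id = post_tag.post_id
WHERE post.status = 1 AND post_tag.tag_id = 123
ORDER BY post.created DESC
LIMIT 100;
```

Any forced join order stays in place even if the data later changes in a way that makes the other order cheaper, so it is worth re-running EXPLAIN on such queries periodically.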

In some special cases, however, the sort column does not have to be in the driving table. For example, suppose the driving table's result set contains only one row, and the index used to join the next table includes the sort column in addition to the join column. After the join, the rows read from that index are already in the required order.
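
A sketch of that special case follows. Everything in it is hypothetical and not part of the article: a tag table with a UNIQUE name column, a created column on post_tag, and a composite index tag_created (tag_id, created). Because the lookup by unique name returns at most one row, the sort can be served by the index on the second table. Whether "Using filesort" actually disappears should be confirmed with EXPLAIN on your own data and MySQL version.

```sql
-- Hypothetical objects: tag(id, name UNIQUE), post_tag.created,
-- and KEY tag_created (tag_id, created) on post_tag.
EXPLAIN
SELECT post_tag.post_id
FROM tag
INNER JOIN post_tag ON post_tag.tag_id = tag.id
WHERE tag.name = 'mysql'            -- unique lookup: at most one driving row
ORDER BY post_tag.created DESC      -- sort column lives in the joined table's index
LIMIT 100;
```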

One more point while on this topic: a query that used to run well may suddenly become slow after a while. A likely major cause is that the data distribution has changed, so the MySQL optimizer now picks a different driving table and the index that avoided the sort is no longer used. It is worth re-checking execution plans from time to time with this in mind.
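
A minimal routine for such a check, assuming the post and post_tag tables from this article, is to refresh the index statistics, look at the cardinality the optimizer sees, and compare the current plan with the one you expect:

```sql
-- Refresh the index statistics the optimizer bases its estimates on.
ANALYZE TABLE post, post_tag;

-- Inspect the estimated cardinality of each index.
SHOW INDEX FROM post;
SHOW INDEX FROM post_tag;

-- Re-check which table drives the join and whether a filesort appears.
EXPLAIN
SELECT post.*
FROM post
INNER JOIN post_tag ON post.id = post_tag.post_id
WHERE post.status = 1 AND post_tag.tag_id = 123
ORDER BY post.created DESC
LIMIT 100;
```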
