[Slow query optimization] Use MySQL subqueries with caution, especially when the execution plan shows DEPENDENT SUBQUERY.

Source: Internet
Author: User

Case study time: Preface:
  1. In parts 1 and 2 of this slow query series I repeatedly stressed the importance of EXPLAIN, but sometimes the EXPLAIN output alone does not tell you how to optimize. In those cases you need other background knowledge, and sometimes an understanding of how MySQL is implemented, for example how it executes and optimizes subqueries.
  2. When the select_type column of the execution plan shows "DEPENDENT SUBQUERY", it is time to be on your guard!

-- Why are MySQL subqueries sometimes slow? --

Introduction: Why is this subquery so slow?

The following is a slow query taken from production. Its execution time online was extreme. Why?

SELECT gid, COUNT(id) AS count
FROM shop_goods g1
WHERE status = 0 AND gid IN (
    SELECT gid FROM shop_goods g2 WHERE sid IN (152.1666, 1466114, 1466110, 1466102, 1466071, 1453929)
)
GROUP BY gid;

Its execution plan is as follows. Note the keyword "DEPENDENT SUBQUERY":

    id  select_type         table   type            possible_keys                  key      key_len  ref       rows  Extra
------  ------------------  ------  --------------  -----------------------------  -------  -------  ------  ------  -----------
     1  PRIMARY             g1      index           (NULL)                         idx_gid  5        (NULL)  850672  Using where
     2  DEPENDENT SUBQUERY  g2      index_subquery  id_shop_goods,idx_sid,idx_gid  idx_gid  5        func         1  Using where

Basic knowledge: What does DEPENDENT SUBQUERY mean?

Official definitions:

SUBQUERY: the first SELECT in a subquery;

DEPENDENT SUBQUERY: the first SELECT in a subquery, dependent on the outer query.

In other words, the way the subquery reads g2 depends on the outer query on g1.

What does that mean? It means execution happens in two steps:

Step 1: MySQL first runs the outer query, select gid, count(id) from shop_goods where status = 0 group by gid; to get a large result set t1. Its size is on the order of rows = 850672.

Step 2: each record in the big result set t1 is combined with the subquery SQL to form a new query: select gid from shop_goods where sid in (15... blabla .. 29) and gid = %t1.gid%. In effect, the subquery has to be executed about 850,000 times. Even though both queries use indexes, it is no surprise that the statement is slow.

As a result, the efficiency of the subquery is bounded by the number of rows in the outer query, and it is better to split the statement into two independent queries and run them one after the other.
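
The two steps above can be imitated with a toy model in plain Python. This is only a sketch: the rows, the `wanted_sids` set and both function names are made up for illustration, and a real MySQL server probes an index rather than rescanning a list. What it shows is the difference in how often the inner query runs.

```python
# Toy model of the two execution strategies. The rows are made-up sample
# data, not the article's shop_goods table.
shop_goods = [
    # (id, gid, sid, status)
    (1, 10, 100, 0),
    (2, 10, 101, 0),
    (3, 20, 100, 0),
    (4, 30, 102, 0),
    (5, 40, 103, 1),
    (6, 50, 104, 0),
]
wanted_sids = {100, 101}


def dependent_subquery(rows, sids):
    """Re-run the inner lookup once per outer row, like DEPENDENT SUBQUERY."""
    counts, inner_runs = {}, 0
    for _id, gid, _sid, status in rows:      # outer scan over g1
        if status != 0:
            continue
        inner_runs += 1                       # one inner query per outer row
        if any(g == gid and s in sids for _, g, s, _ in rows):
            counts[gid] = counts.get(gid, 0) + 1
    return counts, inner_runs


def materialize_first(rows, sids):
    """Run the inner query once, then probe its result from the outer scan."""
    inner_runs = 1
    gids = {g for _, g, s, _ in rows if s in sids}   # inner query, run once
    counts = {}
    for _id, gid, _sid, status in rows:
        if status == 0 and gid in gids:
            counts[gid] = counts.get(gid, 0) + 1
    return counts, inner_runs


slow_counts, slow_runs = dependent_subquery(shop_goods, wanted_sids)
fast_counts, fast_runs = materialize_first(shop_goods, wanted_sids)
print(slow_counts == fast_counts, slow_runs, fast_runs)
```

Both functions return the same counts, but the first runs the inner lookup once per qualifying outer row, while the second runs it once in total. With 850,000 outer rows that difference dominates the running time.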


Optimization strategy 1:

If you don't want to split the statement into two independent queries, you can also join against a temporary (derived) table, as shown below:

SELECT g1.gid, COUNT(1)
FROM shop_goods g1, (SELECT gid FROM shop_goods WHERE sid IN (152.1666, 1466114, 1466110, 1466102, 1466071, 1453929)) g2
WHERE g1.status = 0 AND g1.gid = g2.gid
GROUP BY g1.gid;

This returns the same result in milliseconds.

Its execution plan is:

    id  select_type  table       type   possible_keys          key            key_len  ref       rows  Extra
------  -----------  ----------  -----  ---------------------  -------------  -------  ------  ------  -------------------------------
     1  PRIMARY      <derived2>  ALL    (NULL)                 (NULL)         (NULL)   (NULL)      30  Using temporary; Using filesort
     1  PRIMARY      g1          ref    idx_gid                idx_gid        5        g2.gid       1  Using where
     2  DERIVED      shop_goods  range  id_shop_goods,idx_sid  id_shop_goods  5        (NULL)      30  Using where; Using index

The official definition of DERIVED is:

DERIVED: used for subqueries in the FROM clause. MySQL executes these subqueries recursively and places the results in a temporary table.
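
One thing worth checking before relying on the derived-table join is whether the derived table can contain the same gid more than once, because a join multiplies rows where IN only tests membership. The sketch below uses Python's `sqlite3` module as a stand-in for MySQL, with made-up rows and sid values, to compare the IN form with the join form with and without DISTINCT.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shop_goods (id INTEGER PRIMARY KEY, gid INT, sid INT, status INT)"
)
conn.executemany(
    "INSERT INTO shop_goods (gid, sid, status) VALUES (?, ?, ?)",
    [(10, 100, 0), (10, 101, 0), (20, 100, 0), (30, 102, 0)],
)

in_query = """
    SELECT gid, COUNT(id) FROM shop_goods g1
    WHERE status = 0
      AND gid IN (SELECT gid FROM shop_goods WHERE sid IN (100, 101))
    GROUP BY gid ORDER BY gid
"""
# {distinct} is filled in below with either "" or "DISTINCT".
join_query = """
    SELECT g1.gid, COUNT(1)
    FROM shop_goods g1,
         (SELECT {distinct} gid FROM shop_goods WHERE sid IN (100, 101)) g2
    WHERE g1.status = 0 AND g1.gid = g2.gid
    GROUP BY g1.gid ORDER BY g1.gid
"""

in_rows = conn.execute(in_query).fetchall()
plain_join_rows = conn.execute(join_query.format(distinct="")).fetchall()
distinct_join_rows = conn.execute(join_query.format(distinct="DISTINCT")).fetchall()

print(in_rows)             # IN subquery
print(plain_join_rows)     # join against the derived table as-is
print(distinct_join_rows)  # join against a de-duplicated derived table
```

In this sample gid 10 appears under two of the selected sids, so the plain join counts it twice over, while the DISTINCT variant matches the IN query. If gid is already unique in the derived table, the plain join gives the same counts.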


A DBA's view: the weakness of MySQL subqueries

hidba explains (reference 3):

MySQL rewrites a subquery when it processes it.

Normally we expect the subquery to be evaluated first, from the inside out, and its result then used to drive the lookup on the outer table.

For example:

SELECT * FROM test WHERE tid IN (SELECT fk_tid FROM sub_test WHERE gid = 10)

We would normally assume the statement is executed in this order:

First, look up sub_test by gid to obtain the fk_tid values (2, 3, 4, 5, 6),

then look up test with tid = 2, 3, 4, 5, 6 to get the query result.

However, MySQL actually processes it as:

SELECT * FROM test WHERE EXISTS (
    SELECT * FROM sub_test WHERE gid = 10 AND sub_test.fk_tid = test.tid
)

MySQL scans every row in test and passes each one into the subquery to be matched against sub_test. The subquery is not executed first, so if the test table is large, performance suffers.
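
The rewrite described above changes the execution order, not the answer. As a small check, the sketch below runs both forms against made-up test and sub_test rows, with Python's `sqlite3` module standing in for MySQL. The `name` and `id` columns are assumptions added only to give the tables some shape.

```python
import sqlite3

# SQLite stands in for MySQL; table contents are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (tid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sub_test (id INTEGER PRIMARY KEY, fk_tid INT, gid INT);
    INSERT INTO test (tid, name)
        VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e'),(6,'f'),(7,'g');
    INSERT INTO sub_test (fk_tid, gid)
        VALUES (2,10),(3,10),(4,10),(5,10),(6,10),(7,11);
""")

# The form the developer writes.
in_form = """
    SELECT tid FROM test
    WHERE tid IN (SELECT fk_tid FROM sub_test WHERE gid = 10)
    ORDER BY tid
"""
# The correlated form the article says MySQL turns it into.
exists_form = """
    SELECT tid FROM test
    WHERE EXISTS (
        SELECT * FROM sub_test
        WHERE gid = 10 AND sub_test.fk_tid = test.tid
    )
    ORDER BY tid
"""

in_tids = [row[0] for row in conn.execute(in_form)]
exists_tids = [row[0] for row in conn.execute(exists_form)]
print(in_tids, exists_tids)
```

Both queries return the same tid values. The cost difference comes from the correlated form being evaluated once for every row of test.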


Reference: High Performance MySQL

Section 4.4.1, "Correlated Subqueries", under section 4.4, "Limitations of the MySQL Query Optimizer", of High Performance MySQL has a similar discussion:

MySQL sometimes optimizes subqueries very badly, especially IN() subqueries in the WHERE clause ......

For example, find all films in the sakila.film table of the sakila database whose cast includes the actress Penelope Guiness (actor_id = 1). You can write it as follows:

mysql> SELECT * FROM sakila.film
    -> WHERE film_id IN (
    -> SELECT film_id FROM sakila.film_actor WHERE actor_id = 1);

mysql> EXPLAIN SELECT * FROM sakila.film ...;

+----+--------------------+------------+--------+------------------------+
| id | select_type        | table      | type   | possible_keys          |
+----+--------------------+------------+--------+------------------------+
| 1  | PRIMARY            | film       | ALL    | NULL                   |
| 2  | DEPENDENT SUBQUERY | film_actor | eq_ref | PRIMARY,idx_fk_film_id |
+----+--------------------+------------+--------+------------------------+

According to the EXPLAIN output, MySQL does a full table scan of film and executes the subquery for every row it finds. This performs badly. Fortunately, it is easy to rewrite the query as a join:

mysql> SELECT film.* FROM sakila.film
    -> INNER JOIN sakila.film_actor USING (film_id)
    -> WHERE actor_id = 1;

Another method is to execute the subquery as a separate query, using GROUP_CONCAT(), and build the IN() list manually. This is sometimes faster than a join. (Note: on our database one could try:

SELECT goods_id, GROUP_CONCAT(CAST(id AS CHAR))
FROM bee_shop_goods
WHERE shop_id IN (152.1666, 1466114, 1466110, 1466102, 1466071, 1453929)
GROUP BY goods_id;)

MySQL has been widely criticized for this kind of subquery execution plan.
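
The "separate query plus manual IN() list" idea can also be carried out in application code. The sketch below is one possible shape of it, not the book's method: it builds the list in the client with bound parameters instead of GROUP_CONCAT(), uses Python's `sqlite3` module as a stand-in for MySQL, and the tiny film and film_actor tables and the helper name `films_for_actor` are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; film/film_actor are tiny made-up stand-ins
# for the sakila tables, with only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_actor (
        actor_id INT, film_id INT, PRIMARY KEY (actor_id, film_id)
    );
    INSERT INTO film VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D');
    INSERT INTO film_actor VALUES (1,1),(1,3),(2,2),(2,3);
""")


def films_for_actor(conn, actor_id):
    """Run the former subquery on its own, then feed its result into IN()."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT film_id FROM film_actor WHERE actor_id = ?", (actor_id,)
        )
    ]
    if not ids:
        # An empty IN () is a syntax error in MySQL, so skip the second query.
        return []
    placeholders = ", ".join("?" for _ in ids)   # one bound parameter per id
    sql = (
        "SELECT film_id, title FROM film "
        f"WHERE film_id IN ({placeholders}) ORDER BY film_id"
    )
    return conn.execute(sql, ids).fetchall()


print(films_for_actor(conn, 1))
print(films_for_actor(conn, 99))
```

The second query sees only a constant list, so there is nothing for the optimizer to turn into a correlated lookup. The trade-off is an extra round trip, and a very long id list may need to be sent in batches.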


When are subqueries good?

MySQL does not always optimize subqueries badly. Sometimes it optimizes them very well. The following is an example:

mysql> EXPLAIN SELECT film_id, language_id FROM sakila.film
    -> WHERE NOT EXISTS (
    -> SELECT * FROM sakila.film_actor
    -> WHERE film_actor.film_id = film.film_id
    -> )\G

...... (Note: for details, see High Performance MySQL)

Indeed, subqueries are not always optimized poorly. Analyze each case on its own merits, and don't forget to EXPLAIN.
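
To experiment with the NOT EXISTS query, it can be run next to a LEFT JOIN ... IS NULL formulation of the same question ("films with no actors"). This is only a sketch on made-up rows, using Python's `sqlite3` module as a stand-in for MySQL. It checks that the two forms agree on the result and does not reproduce the EXPLAIN output elided above.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made-up sample data with only
# the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, language_id INT);
    CREATE TABLE film_actor (
        actor_id INT, film_id INT, PRIMARY KEY (actor_id, film_id)
    );
    INSERT INTO film VALUES (1,1),(2,1),(3,2),(4,1);
    INSERT INTO film_actor VALUES (1,1),(1,3),(2,3);
""")

# Correlated NOT EXISTS: keep films for which no film_actor row exists.
not_exists_rows = conn.execute("""
    SELECT film_id, language_id FROM film
    WHERE NOT EXISTS (
        SELECT * FROM film_actor WHERE film_actor.film_id = film.film_id
    )
    ORDER BY film_id
""").fetchall()

# Same question as an outer join: unmatched films have NULL on the right.
left_join_rows = conn.execute("""
    SELECT film.film_id, film.language_id FROM film
    LEFT JOIN film_actor ON film_actor.film_id = film.film_id
    WHERE film_actor.film_id IS NULL
    ORDER BY film.film_id
""").fetchall()

print(not_exists_rows, left_join_rows)
```

Which form is faster on a real MySQL server depends on the version and the data, so compare the EXPLAIN output of both before choosing.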


Reference resources:

1, Wudongxu, mysql subquery (IN) implementation;

2, 2012, iteye, MySQL subqueries are slow;

3, 2011, hidba, the weakness of mysql subqueries, and a mysql subquery problem encountered in a production database;

Slow query series:

[Slow query optimization] When creating an index, pay attention to column selectivity and range queries, and to the column order of a composite index.

[Slow query optimization] Pay attention to which table drives a join. If you don't know which join order is better, let MySQL decide on its own.


-Over-
