When you use the Explain tool to view the execution plan of an SQL statement, you should be aware of the Select_type field when "DEPENDENT subquery" appears. You have fallen into the MySQL sub-query slow "pit" ... Let's look at a concrete example
There is such a query statement:
select Gid,count (ID) as COUNT from shop_goods G1 WHERE status =0 and GID in (SELECT gid from Shop_goods G2 WH ERE SID in (1519066,1466114,1466110,1466102,1466071,1453929))GROUP by GID;
With explain, the keyword "DEPENDENT subquery" appears, meaning that the first select of a subquery relies on external queries;
subquery: The first select in a subquery; DEPENDENT subquery: The first select in a subquery, depends on the outside query .
In other words, the subquery's query to G2 depends on the outer G1 query . It means two steps:
The first step, MySQL, is based on the select Gid,count (ID) from the shop_goods where status=0 GROUP by GID; A large result set T1 with a data volume of rows=850 672 up;
In the second step, each record in the above large result set T1 will consist of a new query statement with the subquery SQL: Select GID from Shop_goods where Sid in (15...blabla. ) and gid=%t1.gid%. is equal to saying that the subquery will be executed 850,000 times ... Even though these two-step queries are indexed, they are not slow;
As a result, the execution efficiency of a subquery is constrained by the number of records in the outer query, which is not as good as splitting into two separate query sequences .
The general optimization strategy for such statements is to split into two query statements that you do not want to split into two independent queries, or you can query with a temporary table join:
If you do not want to split into two independent queries, you can also query with a temporary table-linked table , as shown in the following optimized SQL:
SELECT g1.gid,count (1) From Shop_goods G1, (select gid from Shop_goods WHERE SID in (1519066,1466114,1466110,1466102,1466071,1453929)) G2 where G1.status=0 and G1.gid=g2.gid GROUP by G1.gid;
With explain look, this time there is a new keyword "DERIVED", meaning is used in the FROM clause in the case of subqueries. MySQL executes these subqueries recursively, putting the results in a temporary table, and then doing the join operation;
the official meaning of DERIVED is: used in cases where there is a subquery in the FROM clause. MySQL executes these subqueries recursively, putting the results in a temporary table.
Section 4.4 of the high-performance MySQL "MySQL query optimizer limitations (limitations of the MySQL Optimizer)", section 4.4.1, "Associating subqueries (correlated subqueries)" There is a similar discussion: when MySQL processes subqueries, it overwrites subqueries. Normally, we want to finish the query by the inner-to-outer, and then use the subquery to drive the outer query table.
For example: SELECT * from Test where the TID in (select Fk_tid from Sub_test where gid=10); Normally we would think that the SQL is executed in the following order: Sub_test table with the GID obtained F K_tid (2,3,4,5,6) record, and then to test, bring in tid=2,3,4,5,6, get query data.
But the actual MySQL is handled in the following ways:
SELECT * FROM test where exists (select * from Sub_test where gid=10 and Sub_test.fk_tid=test.tid); /c5>
MySQL will scan all the data in test, each data will be sent to the subquery associated with the sub_test, the subquery will not be executed first, so if the test table is large, then performance will be problematic.
MySQL sub-query slow problem