MySQL's one-size-fits-all "nested loop" execution is not optimal for every type of query. Fortunately, there are only
a few kinds of queries that the MySQL query optimizer handles poorly, and we can often make MySQL work efficiently by rewriting them.
Let's first look at the limitations of the MySQL optimizer:
1. Correlated Subqueries
MySQL's subquery implementation is quite poor. The worst offenders are subqueries inside IN() in the WHERE clause.
For example, suppose we want to find every film in the Sakila database in which the actress Penelope Guiness (actor_id = 1) appeared.
Naturally, we would write the subquery like this:
SELECT * FROM sakila.film WHERE film_id IN (SELECT film_id FROM sakila.film_actor WHERE actor_id = 1);
It is tempting to think that MySQL executes this query from the inside out, first finding the film_id values that match
the condition in the subquery. You might therefore expect the query to be executed like this:
-- SELECT GROUP_CONCAT(film_id) FROM sakila.film_actor WHERE actor_id = 1;
-- Result: 1,23,25,106,140,166,277,361,438,499,506,509,605,635,749,832,939,970,980
SELECT * FROM sakila.film
WHERE film_id IN (1,23,25,106,140,166,277,361,438,499,506,509,605,635,749,832,939,970,980);
Unfortunately, the opposite is true. MySQL tries to "help" the subquery by pushing the correlation condition from the
outer table into it, on the assumption that this lets the subquery find rows more efficiently. MySQL rewrites the query like this:
SELECT * FROM sakila.film WHERE EXISTS (SELECT * FROM sakila.film_actor WHERE actor_id = 1 AND film_actor.film_id = film.film_id);
The subquery now depends on data from the outer table, so it cannot be executed first.
MySQL performs a full scan of the film table and executes the subquery once for each row. When the outer table is very small
this causes no problem, but when the outer table is large the performance is very poor. Fortunately,
the query is easy to rewrite as a join:
mysql> SELECT film.* FROM sakila.film
    -> INNER JOIN sakila.film_actor USING (film_id)
    -> WHERE actor_id = 1;
Another good optimization is to build the IN() list manually with GROUP_CONCAT(). This is sometimes even faster than
the join. In summary, although IN() subqueries perform poorly in many cases, EXISTS() or other equivalent subqueries
sometimes work well.
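The forms discussed in this section (IN() subquery, EXISTS, join, and a manually built IN() list) can be checked against each other on a toy dataset. The following is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in for MySQL and a hypothetical four-film miniature of the Sakila tables. It shows that the queries return the same rows; it says nothing about how MySQL executes them.

```python
import sqlite3

# Hypothetical miniature of sakila.film and sakila.film_actor (not the real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                             PRIMARY KEY (actor_id, film_id));
    INSERT INTO film VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO film_actor VALUES (1, 1), (1, 3), (2, 1), (2, 2);
""")

queries = {
    # The "natural" IN() subquery.
    "in": """SELECT film_id FROM film
             WHERE film_id IN (SELECT film_id FROM film_actor WHERE actor_id = 1)
             ORDER BY film_id""",
    # The correlated EXISTS form.
    "exists": """SELECT film_id FROM film
                 WHERE EXISTS (SELECT * FROM film_actor
                               WHERE actor_id = 1
                                 AND film_actor.film_id = film.film_id)
                 ORDER BY film_id""",
    # The join rewrite.
    "join": """SELECT film.film_id FROM film
               INNER JOIN film_actor USING (film_id)
               WHERE actor_id = 1
               ORDER BY film.film_id""",
}

# Manual IN() list: fetch the ids first, then inline them into a second query.
# The ids are integers read from our own table, so string formatting is safe here.
id_list = conn.execute(
    "SELECT GROUP_CONCAT(film_id) FROM film_actor WHERE actor_id = 1"
).fetchone()[0]
queries["manual_list"] = (
    f"SELECT film_id FROM film WHERE film_id IN ({id_list}) ORDER BY film_id"
)

results = {name: [row[0] for row in conn.execute(sql)]
           for name, sql in queries.items()}
for name, film_ids in results.items():
    print(name, film_ids)
```

Because all four forms are interchangeable in terms of results, the choice between them can be made purely on measured performance.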
Correlated subqueries do not always perform badly.
Subquery vs. Join
-- Correlated subquery
mysql> EXPLAIN SELECT film_id, language_id FROM sakila.film WHERE NOT EXISTS (SELECT * FROM sakila.film_actor WHERE film_actor.film_id = film.film_id)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: film
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 951
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: film_actor
         type: ref
possible_keys: idx_fk_film_id
          key: idx_fk_film_id
      key_len: 2
          ref: film.film_id
         rows: 2
        Extra: Using where; Using index
-- Join
mysql> EXPLAIN SELECT film.film_id, film.language_id FROM sakila.film
    -> LEFT OUTER JOIN sakila.film_actor USING (film_id)
    -> WHERE film_actor.film_id IS NULL\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: film
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 951
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: film_actor
         type: ref
possible_keys: idx_fk_film_id
          key: idx_fk_film_id
      key_len: 2
          ref: sakila.film.film_id
         rows: 2
        Extra: Using where; Using index; Not exists
As you can see, the two execution plans are almost identical. There are a few subtle differences:
1. The select_type for the film_actor table is DEPENDENT SUBQUERY in one plan and SIMPLE in the other. This makes no difference to the underlying storage engine interface.
2. In the second query, the film table has no "Using where" in the Extra column, but this is not important. The USING clause in the second query does the same job as a WHERE clause.
3. In the second plan, the Extra column for film_actor contains "Not exists". This is the early-termination algorithm mentioned earlier: MySQL uses the not-exists optimization
to avoid reading more than one row per film from the index on film_actor. This is equivalent to writing NOT EXISTS directly. In both plans, scanning stops as soon as one matching row
is found.
The test results are:
Query                  Queries per second (QPS)
NOT EXISTS subquery
LEFT OUTER JOIN        425
This shows that the subquery is slightly slower.
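That the two queries compared above really are interchangeable can be verified on a small dataset. This is a hedged sketch only: it assumes Python's built-in sqlite3 module rather than MySQL, and a hypothetical four-film table in which films 3 and 4 have no actors. It checks that both forms return the same rows and does not reproduce the timings.

```python
import sqlite3

# Hypothetical miniature of the Sakila tables: films 3 and 4 have no actors.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, language_id INTEGER);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                             PRIMARY KEY (actor_id, film_id));
    CREATE INDEX idx_fk_film_id ON film_actor (film_id);
    INSERT INTO film VALUES (1, 1), (2, 1), (3, 1), (4, 1);
    INSERT INTO film_actor VALUES (1, 1), (2, 1), (1, 2);
""")

# Correlated subquery: keep a film only if no film_actor row refers to it.
not_exists_sql = """
    SELECT film_id, language_id FROM film
    WHERE NOT EXISTS (SELECT * FROM film_actor
                      WHERE film_actor.film_id = film.film_id)
    ORDER BY film_id"""

# Anti-join: unmatched films get NULLs on the film_actor side of the outer join.
left_join_sql = """
    SELECT film.film_id, film.language_id FROM film
    LEFT OUTER JOIN film_actor USING (film_id)
    WHERE film_actor.film_id IS NULL
    ORDER BY film.film_id"""

not_exists_rows = conn.execute(not_exists_sql).fetchall()
left_join_rows = conn.execute(left_join_sql).fetchall()
print(not_exists_rows)
print(left_join_rows)
```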
Another example:
Every case is different, however, and sometimes the subquery is faster. One such case is when you only need the rows of one table that have a match in another table.
That sounds like a perfect description of a join, but it is not always one. Take the following join, which is meant to return every film that has at least one actor.
Because many films have several actors, it returns duplicate rows:
mysql> SELECT film.film_id FROM sakila.film
    -> INNER JOIN sakila.film_actor USING (film_id);
We need to use DISTINCT or GROUP BY to remove the duplicate rows:
mysql> SELECT DISTINCT film.film_id FROM sakila.film
    -> INNER JOIN sakila.film_actor USING (film_id);
But looking at this query again, what is it really trying to express? At the very least, this way of writing it obscures the intent of the SQL.
EXISTS expresses the logic of "has a matching row" directly. It needs neither DISTINCT nor GROUP BY, and it never produces duplicate rows.
As we know, once DISTINCT or GROUP BY is used, query execution usually requires a temporary table.
mysql> SELECT film_id FROM sakila.film
    -> WHERE EXISTS (SELECT * FROM sakila.film_actor
    -> WHERE film.film_id = film_actor.film_id);
The test results are:
Query              Queries per second (QPS)
INNER JOIN         185
EXISTS subquery    325
This shows that here the subquery is noticeably faster than the join.
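The duplicate-row behaviour described in this example is easy to reproduce. The sketch below is illustrative only: it assumes Python's built-in sqlite3 module instead of MySQL and a hypothetical three-film dataset in which film 1 has two actors, film 2 has one, and film 3 has none.

```python
import sqlite3

# Hypothetical miniature of the Sakila tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                             PRIMARY KEY (actor_id, film_id));
    INSERT INTO film VALUES (1), (2), (3);
    INSERT INTO film_actor VALUES (1, 1), (2, 1), (1, 2);
""")

def film_ids(sql):
    """Run a query and return its first column as a plain list."""
    return [row[0] for row in conn.execute(sql)]

# Plain join: one output row per (film, actor) pair, so film 1 appears twice.
join_ids = film_ids("""
    SELECT film.film_id FROM film
    INNER JOIN film_actor USING (film_id)
    ORDER BY film.film_id""")

# DISTINCT removes the duplicates after the join has produced them.
distinct_ids = film_ids("""
    SELECT DISTINCT film.film_id FROM film
    INNER JOIN film_actor USING (film_id)
    ORDER BY film.film_id""")

# EXISTS asks "is there a match?" once per film, so no duplicates can arise.
exists_ids = film_ids("""
    SELECT film_id FROM film
    WHERE EXISTS (SELECT * FROM film_actor
                  WHERE film.film_id = film_actor.film_id)
    ORDER BY film_id""")

print(join_ids, distinct_ids, exists_ids)
```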
The detailed examples above are meant to make two points:
First, do not accept "absolute truths" about subqueries (such as "never use subqueries").
Second, test your assumptions: use benchmarks to verify the execution plan and response time of a subquery.
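A tiny benchmark harness is often enough to test such an assumption. The sketch below is a rough illustration, not the method used for the figures in this article: it assumes Python's built-in sqlite3 module, a hypothetical generated dataset, and a made-up helper named queries_per_second. The numbers it prints depend on the machine and on SQLite, and they say nothing about MySQL; against a real server, the same loop would be pointed at a MySQL connection instead.

```python
import sqlite3
import time

# Hypothetical dataset: 200 films, three actors for every even-numbered film.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                             PRIMARY KEY (actor_id, film_id));
    CREATE INDEX idx_fk_film_id ON film_actor (film_id);
""")
conn.executemany("INSERT INTO film VALUES (?)", [(i,) for i in range(1, 201)])
conn.executemany("INSERT INTO film_actor VALUES (?, ?)",
                 [(a, f) for f in range(2, 201, 2) for a in (1, 2, 3)])

candidates = {
    "INNER JOIN + DISTINCT": """
        SELECT DISTINCT film.film_id FROM film
        INNER JOIN film_actor USING (film_id)
        ORDER BY film.film_id""",
    "EXISTS subquery": """
        SELECT film_id FROM film
        WHERE EXISTS (SELECT * FROM film_actor
                      WHERE film.film_id = film_actor.film_id)
        ORDER BY film_id""",
}

def queries_per_second(sql, runs=200):
    """Execute sql `runs` times and return the observed throughput."""
    start = time.perf_counter()
    for _ in range(runs):
        conn.execute(sql).fetchall()
    return runs / (time.perf_counter() - start)

# First confirm the candidates agree on the result, then time each of them.
rows = {name: conn.execute(sql).fetchall() for name, sql in candidates.items()}
qps = {name: queries_per_second(sql) for name, sql in candidates.items()}
for name in candidates:
    print(f"{name:25s} {qps[name]:10.0f} QPS")
```

Checking that the candidates return identical rows before timing them guards against benchmarking two queries that are not actually equivalent.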
MySQL Query performance optimization Four