MySQL Query Performance Optimization, Part Four

Last Update:2015-08-04 Source: Internet

Author: User


MySQL's universal "nested loop" join strategy is not optimal for every type of query. Fortunately, there are only
a few kinds of queries that the MySQL query optimizer handles poorly, and we can often make MySQL work efficiently by rewriting them.
Let's first take a look at the limitations of the MySQL optimizer:

1. Correlated Subqueries

MySQL's subquery implementation is poor, particularly in MySQL 5.5 and earlier. The worst offenders are subqueries that use IN() in the WHERE clause.
For example, suppose we want to find all films in the Sakila database in which the actress Penelope Guiness (actor_id = 1) appears.
Naturally, we would write the subquery like this:

   SELECT * FROM sakila.film
   WHERE film_id IN (SELECT film_id FROM sakila.film_actor WHERE actor_id = 1);

It is tempting to think that MySQL executes this query from the inside out, first finding the film_id values that match
the condition in the subquery. You would then expect the query to be executed like this:

-- SELECT GROUP_CONCAT(film_id) FROM sakila.film_actor WHERE actor_id = 1;
-- Result: 1,23,25,106,140,166,277,361,438,499,506,509,605,635,749,832,939,970,980
SELECT * FROM sakila.film
WHERE film_id IN (1,23,25,106,140,166,277,361,438,499,506,509,605,635,749,832,939,970,980);

Unfortunately, the opposite is true. MySQL pushes the correlation with the outer table down into the subquery, on the assumption
that this lets the subquery find rows more efficiently. MySQL rewrites the query like this:

SELECT * FROM sakila.film
WHERE EXISTS (SELECT * FROM sakila.film_actor WHERE actor_id = 1 AND film_actor.film_id = film.film_id);

The subquery now depends on data from the outer table, so it cannot be executed first.
MySQL performs a full scan of the film table and then runs the subquery once for each row. When the outer table is very small
this causes no problem, but performance is very poor when the outer table is large. Fortunately,
the query is easy to rewrite as a join:

mysql> SELECT film.* FROM sakila.film
    -> INNER JOIN sakila.film_actor USING (film_id)
    -> WHERE actor_id = 1;

Another good optimization is to generate the IN() list manually with GROUP_CONCAT(). This is sometimes even faster
than the join. In summary, although IN() subqueries perform poorly in many cases, EXISTS() or other equivalent subqueries
sometimes work well.
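The rewrites discussed above (IN() subquery, correlated EXISTS, join, and a manually built IN() list) are meant to be logically equivalent. The following is a minimal sketch that checks this on a tiny made-up data set. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the rows are hypothetical rather than real Sakila data, so it says nothing about MySQL performance. It also builds the IN() list in application code with bound parameters instead of GROUP_CONCAT().

```python
import sqlite3

# Hypothetical miniature of sakila.film / sakila.film_actor (made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                             PRIMARY KEY (actor_id, film_id));
    INSERT INTO film VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO film_actor VALUES (1, 1), (1, 3), (2, 1), (2, 2), (3, 4);
""")


def film_ids(sql, params=()):
    """Run a query and return the sorted film_id values it yields."""
    return sorted(row[0] for row in conn.execute(sql, params))


# 1. The IN() subquery as originally written.
in_subquery = film_ids(
    "SELECT film_id FROM film WHERE film_id IN "
    "(SELECT film_id FROM film_actor WHERE actor_id = 1)")

# 2. The correlated EXISTS form.
exists_subquery = film_ids(
    "SELECT film_id FROM film WHERE EXISTS "
    "(SELECT * FROM film_actor WHERE actor_id = 1 "
    " AND film_actor.film_id = film.film_id)")

# 3. The join rewrite.
join_rewrite = film_ids(
    "SELECT film.film_id FROM film "
    "INNER JOIN film_actor USING (film_id) WHERE actor_id = 1")

# 4. Manual IN() list: run the inner query first, then bind its result.
id_list = film_ids("SELECT film_id FROM film_actor WHERE actor_id = 1")
placeholders = ",".join("?" * len(id_list))
manual_in_list = film_ids(
    f"SELECT film_id FROM film WHERE film_id IN ({placeholders})", id_list)

print(in_subquery, exists_subquery, join_rewrite, manual_in_list)
```

All four lists should be identical. Which form is fastest on a real MySQL server still has to be measured there.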

Correlated subquery performance is not always bad.

Correlated Subquery vs. Join

-- Correlated subquery

mysql> EXPLAIN SELECT film_id, language_id FROM sakila.film
    -> WHERE NOT EXISTS (SELECT * FROM sakila.film_actor
    -> WHERE film_actor.film_id = film.film_id)\G

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: film
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 951
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: film_actor
         type: ref
possible_keys: idx_fk_film_id
          key: idx_fk_film_id
      key_len: 2
          ref: film.film_id
         rows: 2
        Extra: Using where; Using index

-- Join

mysql> EXPLAIN SELECT film.film_id, film.language_id FROM sakila.film
    -> LEFT OUTER JOIN sakila.film_actor USING (film_id)
    -> WHERE film_actor.film_id IS NULL\G

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: film
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 951
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: film_actor
         type: ref
possible_keys: idx_fk_film_id
          key: idx_fk_film_id
      key_len: 2
          ref: sakila.film.film_id
         rows: 2
        Extra: Using where; Using index; Not exists

As you can see, the two execution plans are almost the same. There are a few subtle differences:
1. The select_type for table film_actor is DEPENDENT SUBQUERY in one plan and SIMPLE in the other. This only reflects the syntax of the query and makes no difference to the underlying storage engine interface.

2. In the second query, the film table has no "Using where" in the Extra column, but this is not important. The USING clause of the second query does the same work as a WHERE clause.

3. In the second plan, the Extra column for film_actor shows "Not exists". This is the early-termination algorithm mentioned earlier: MySQL uses the "Not exists" optimization
to avoid reading any extra rows from the index of table film_actor. This is equivalent to writing NOT EXISTS directly: as soon as one matching row is found,
MySQL stops scanning for the current film.

The test results are:
Query                  Queries per second (QPS)
NOT EXISTS subquery
LEFT OUTER JOIN        425
This shows that the subquery is slightly slower.
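The NOT EXISTS subquery and the LEFT OUTER JOIN ... IS NULL form compared above are two ways of asking for "films with no actors". The following is a minimal sketch that confirms they return the same rows. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and uses a hypothetical four-film data set, so the execution plans and timings of a real MySQL server are not reproduced here.

```python
import sqlite3

# Hypothetical miniature of the Sakila tables: films 2 and 4 have no actors.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, language_id INTEGER);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                             PRIMARY KEY (actor_id, film_id));
    CREATE INDEX idx_fk_film_id ON film_actor (film_id);
    INSERT INTO film VALUES (1, 1), (2, 1), (3, 2), (4, 1);
    INSERT INTO film_actor VALUES (1, 1), (2, 1), (1, 3);
""")

# Correlated subquery: keep a film only if no film_actor row matches it.
not_exists_rows = conn.execute("""
    SELECT film_id, language_id FROM film
    WHERE NOT EXISTS (SELECT * FROM film_actor
                      WHERE film_actor.film_id = film.film_id)
    ORDER BY film_id
""").fetchall()

# Exclusion join: unmatched films get NULLs from the outer join,
# and the WHERE clause keeps only those.
exclusion_join_rows = conn.execute("""
    SELECT film.film_id, film.language_id FROM film
    LEFT OUTER JOIN film_actor USING (film_id)
    WHERE film_actor.film_id IS NULL
    ORDER BY film.film_id
""").fetchall()

print(not_exists_rows)
print(exclusion_join_rows)
```

Because both forms describe the same result, the choice between them can be made purely on measured performance.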

Another example:
Every case is different, however, and sometimes the subquery is faster. For example, a subquery can work well when you only want the rows of one table that have a match in another table.
That sounds like an ideal case for a join, but it is not always the same thing. Consider the following join, which is meant to return every film that has an actor.
Because a film can have many actors, it returns duplicate records:

mysql> SELECT film.film_id FROM sakila.film
    -> INNER JOIN sakila.film_actor USING (film_id);

We need to use DISTINCT or GROUP BY to remove the duplicate records:

mysql> SELECT DISTINCT film.film_id FROM sakila.film
    -> INNER JOIN sakila.film_actor USING (film_id);

But looking back at this query, what is it really meant to express? At the very least, the SQL does not make the intent obvious.
EXISTS expresses the logic of "has a matching actor" directly. It needs neither DISTINCT nor GROUP BY, and it produces no duplicate rows.
Keep in mind that once DISTINCT or GROUP BY is used, a temporary table is usually required during query execution. Here is the same query written as a subquery:

mysql> SELECT film_id FROM sakila.film
    -> WHERE EXISTS (SELECT * FROM sakila.film_actor
    -> WHERE film.film_id = film_actor.film_id);

The test results are:
Query                  Queries per second (QPS)
INNER JOIN             185
EXISTS subquery        325
This shows that the subquery is much faster in this case.
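The duplicate-row behavior described above is easy to reproduce. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a hypothetical data set in which film 1 has two actors. It shows the plain join returning a duplicate, while the DISTINCT join and the EXISTS subquery agree.

```python
import sqlite3

# Hypothetical data: film 1 has two actors, film 3 has one, film 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                             PRIMARY KEY (actor_id, film_id));
    INSERT INTO film VALUES (1), (2), (3);
    INSERT INTO film_actor VALUES (1, 1), (2, 1), (1, 3);
""")


def film_ids(sql):
    """Run a query and return its film_id values in ascending order."""
    return sorted(row[0] for row in conn.execute(sql))


# Plain join: one output row per (film, actor) pair, so film 1 appears twice.
plain_join = film_ids(
    "SELECT film.film_id FROM film "
    "INNER JOIN film_actor USING (film_id)")

# DISTINCT removes the duplicates after the join has produced them.
distinct_join = film_ids(
    "SELECT DISTINCT film.film_id FROM film "
    "INNER JOIN film_actor USING (film_id)")

# EXISTS never produces the duplicates in the first place.
exists_subquery = film_ids(
    "SELECT film_id FROM film WHERE EXISTS "
    "(SELECT * FROM film_actor WHERE film.film_id = film_actor.film_id)")

print(plain_join, distinct_join, exists_subquery)
```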

The detailed examples above are meant to make two points:
First, there is no need to accept "absolute truths" about subqueries (such as "never use subqueries").
Second, use tests to verify your assumptions about how a subquery is executed and how long it takes.
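One simple way to test such assumptions is to run each candidate query repeatedly and count how many executions complete per second. The following is a minimal sketch of that idea. The helper queries_per_second and the generated data are hypothetical, and Python's built-in sqlite3 module stands in for MySQL, so the numbers it prints say nothing about MySQL itself. To test a real server, the same loop would be run against a MySQL connection and real data.

```python
import sqlite3
import time


def queries_per_second(conn, sql, seconds=0.2):
    """Run sql repeatedly for roughly `seconds`; return executions per second."""
    runs = 0
    start = time.perf_counter()
    while time.perf_counter() - start < seconds:
        conn.execute(sql).fetchall()  # fetch everything so the query really finishes
        runs += 1
    return runs / (time.perf_counter() - start)


# Hypothetical data: 200 films, every odd-numbered film has 5 actors.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER,
                             PRIMARY KEY (actor_id, film_id));
    CREATE INDEX idx_fk_film_id ON film_actor (film_id);
""")
conn.executemany("INSERT INTO film VALUES (?)", [(i,) for i in range(1, 201)])
conn.executemany("INSERT INTO film_actor VALUES (?, ?)",
                 [(a, f) for f in range(1, 201, 2) for a in range(1, 6)])

candidates = {
    "INNER JOIN + DISTINCT":
        "SELECT DISTINCT film.film_id FROM film "
        "INNER JOIN film_actor USING (film_id)",
    "EXISTS subquery":
        "SELECT film_id FROM film WHERE EXISTS "
        "(SELECT * FROM film_actor WHERE film.film_id = film_actor.film_id)",
}

# Measure every candidate under the same conditions and report the rates.
results = {name: queries_per_second(conn, sql)
           for name, sql in candidates.items()}
for name, qps in results.items():
    print(f"{name}: {qps:.0f} queries/s")
```

Before comparing rates, it is worth confirming that the candidates return the same rows. A faster query that gives a different answer is not an optimization.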


This article is an English version of an article originally published in Chinese on aliyun.com.
