MySQL Optimization Principle: Small Table Drives Large Table, and the Proper Use of IN and EXISTS

Last Update:2018-06-11 Source: Internet

Author: User


Suppose we have the following two for loops:
for ($i = 0; $i < 10000; $i++)
{
    for ($j = 0; $j < 50; $j++)
    {

    }
}

for ($i = 0; $i < 50; $i++)
{
    for ($j = 0; $j < 10000; $j++)
    {

    }
}

Looking at the two for loops above, the total number of iterations is the same. For a MySQL database, however, the two are not equivalent: we should prefer the second loop, that is, let the small table drive the large table.
The most expensive thing for a database is establishing and releasing connections from the program. The first loop sets up 10,000 connections; the second sets up only 50. If only two connections are needed, and each one queries millions of rows and then leaves, the work is done in two round trips. Creating millions of connections instead, requesting and releasing them over and over, is more than the system can bear.
This is where the comparison between in and exists comes from.
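The counting argument above can be sketched in a few lines of Python. This is only an illustration: `count_setups` is a hypothetical helper, and it assumes one setup (one connection, in the text's terms) per outer iteration, which is the simplification the text makes rather than a description of how MySQL actually executes a join.

```python
def count_setups(outer_rows, inner_rows):
    """Return (setups, comparisons) for a nested loop.

    Assumption for illustration: the inner loop is set up once per
    outer row, the way the text counts one connection per outer pass.
    """
    setups = 0
    comparisons = 0
    for _ in range(outer_rows):
        setups += 1              # one setup per outer row
        for _ in range(inner_rows):
            comparisons += 1     # work done inside the inner loop
    return setups, comparisons


large_first = count_setups(10000, 50)   # large set drives small set
small_first = count_setups(50, 10000)   # small set drives large set
print(large_first)  # (10000, 500000)
print(small_first)  # (50, 500000)
```

Both orders perform the same 500,000 comparisons, but the second needs only 50 setups instead of 10,000.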

Small table drives large table: the small data set drives the large data set.

Here, assume that table A is the employee table and table B is the department table.
Assume there are only three departments: Sales, Technical, and Administration. The query below then retrieves all employees who belong to these three departments.

 select * from A where id in (select id from B);

This is equivalent to:
for select id from B, first loop over table B. For example, Huawei has 100 departments but 150,000 to 200,000 employees, far more employees than departments, so this step reads the small table (the department table).
for select * from A where A.id = B.id, then loop over table A. This checks whether A.id is in table B, that is, it fetches the rows that match each ID obtained from the department table.
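The loop order described for in can be imitated in plain Python. This is a rough sketch of the text's mental model, not of MySQL's real execution plan; `select_in`, the dictionary standing in for an index on A.id, and the sample rows are all made up for illustration.

```python
def select_in(table_a, table_b):
    """Imitate: select * from A where id in (select id from B)."""
    # Hypothetical stand-in for an index on A.id: id -> list of rows.
    index_a = {}
    for row in table_a:
        index_a.setdefault(row["id"], []).append(row)

    result = []
    seen = set()
    for b_row in table_b:                 # outer loop: the subquery on B
        b_id = b_row["id"]
        if b_id in seen:                  # in() treats duplicates as one value
            continue
        seen.add(b_id)
        result.extend(index_a.get(b_id, []))  # inner probe into A
    return result


# Illustrative data only.
table_a = [{"id": i, "name": f"a{i}"} for i in range(1, 7)]
table_b = [{"id": 2}, {"id": 4}, {"id": 9}]

in_ids = [row["id"] for row in select_in(table_a, table_b)]
print(in_ids)  # [2, 4]
```

The outer loop runs once per row of B, so the smaller B is, the fewer probes into A are needed.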

When the data set of table B is smaller than that of table A, in is better than exists.
Conversely:

 select * from A where exists (select 1 from B where B.id = A.id); // The select 1 here is not mandatory; select 'X' or 'A', 'B', 'C' also works, as long as it is a constant.

This is equivalent to:
for select * from A, first loop over table A.
for select * from B where B.id = A.id, then loop over table B.
So exists checks, for each row of table A, whether a match exists in (select 1 from B where B.id = A.id); the subquery yields a Boolean TRUE or FALSE. Simply put, when the data set of table A is smaller than that of table B, exists is better than in. Note that the id columns of both table A and table B should be indexed.
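The exists loop order can be imitated the same way. Again this is a sketch of the text's mental model under stated assumptions: `select_exists` is a hypothetical helper, a Python set stands in for the index on B.id, and the rows are invented.

```python
def select_exists(table_a, table_b):
    """Imitate: select * from A where exists
    (select 1 from B where B.id = A.id)."""
    # Hypothetical stand-in for an index on B.id.
    index_b = {row["id"] for row in table_b}

    result = []
    for a_row in table_a:              # outer loop: every row of A
        if a_row["id"] in index_b:     # inner probe yields True or False
            result.append(a_row)       # keep the row only when True
    return result


# Illustrative data only.
table_a = [{"id": i, "name": f"a{i}"} for i in range(1, 7)]
table_b = [{"id": 2}, {"id": 4}, {"id": 9}]

exists_ids = [row["id"] for row in select_exists(table_a, table_b)]
print(exists_ids)  # [2, 4]
```

Here the outer loop runs once per row of A, so this form benefits when A is the smaller table and B.id is indexed.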

Syntax: EXISTS
select ... from table where exists (subquery)
Understanding: each row of the main query is passed into the subquery for conditional validation, and the validation result (TRUE or FALSE) decides whether that row is kept in the result.

Supplement:
1: exists (subquery) only returns TRUE or FALSE, so the select * in the subquery can also be select 1 or select 'X'. The official documentation states that the select list is ignored during actual execution, so there is no difference.
2: The actual execution of an exists subquery may have been optimized and need not be the row-by-row comparison we imagine. If you are concerned about efficiency, verify it in practice.
3: exists subqueries can often be replaced with conditional expressions, other subqueries, or joins. Which is optimal requires analysis of the specific problem.
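Point 1 is easy to check on a toy data set. The sketch below uses Python's built-in sqlite3 module purely so that it runs without a server; SQLite is a stand-in here, not MySQL, and the tables, rows, and the `exists_ids` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table A (id integer primary key, name text);
    create table B (id integer primary key, name text);
    insert into A values (1, 'a1'), (2, 'a2'), (3, 'a3');
    insert into B values (2, 'b2'), (3, 'b3'), (4, 'b4');
""")


def exists_ids(select_list):
    """Run the exists query with the given subquery select list."""
    sql = (
        "select id from A where exists "
        f"(select {select_list} from B where B.id = A.id) order by id"
    )
    return [row[0] for row in conn.execute(sql)]


# Try the three select lists mentioned in the text.
results = {s: exists_ids(s) for s in ("*", "1", "'X'")}
for select_list, ids in results.items():
    print(select_list, ids)
```

All three variants return the same ids, which matches the statement that only the existence of a row matters, not what the subquery selects.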

If the two tables in the query are about the same size, there is little difference between in and exists.

Extended examples to consolidate:

If one of the two tables is small and the other is large, use exists when the subquery table is large, and use in when the subquery table is small:
Example: Table A (small table), table B (large table)

 select * from A where cc in (select cc from B); // inefficient, uses the index on the cc column of table A
 select * from A where exists (select cc from B where cc = A.cc); // efficient, uses the index on the cc column of table B

Conversely:

 select * from B where cc in (select cc from A); // efficient, uses the index on the cc column of table B
 select * from B where exists (select cc from A where cc = B.cc); // inefficient, uses the index on the cc column of table A

not in and not exists: if the query uses not in, a full table scan is performed on the outer table and no index is used, whereas the subquery of not exists can still use the index on the table. So regardless of which table is larger, not exists is faster than not in.
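For reference, the two negative forms can be written side by side as follows. The sketch again uses Python's built-in sqlite3 module as a runnable stand-in for MySQL, so it shows only that the two forms return the same rows, not how fast either one is. The tables, the index names, and the data are made up, and the cc columns are declared not null on purpose: the two forms are only guaranteed to agree when no NULL values are involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table A (cc integer not null);
    create table B (cc integer not null);
    create index idx_a_cc on A (cc);
    create index idx_b_cc on B (cc);
    insert into A values (1), (2), (3), (4);
    insert into B values (2), (4), (6);
""")

# Form 1: not in with an uncorrelated subquery.
not_in = [row[0] for row in conn.execute(
    "select cc from A where cc not in (select cc from B) order by cc"
)]

# Form 2: not exists with a correlated subquery.
not_exists = [row[0] for row in conn.execute(
    "select cc from A where not exists "
    "(select 1 from B where B.cc = A.cc) order by cc"
)]

print(not_in)      # [1, 3]
print(not_exists)  # [1, 3]
```

To compare the actual cost of the two forms on a real MySQL server, inspect the execution plan there rather than relying on this toy run.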


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
