MySQL Optimization Principle: Small Tables Drive Large Tables, and the Reasonable Use of IN and EXISTS

Source: Internet
Author: User

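Consider the following two nested for loops:

// Loop ①: the outer loop runs 10,000 times
for ($i = 0; $i < 10000; $i++)
{
    for ($j = 0; $j < 50; $j++)
    {

    }
}

// Loop ②: the outer loop runs 50 times
for ($i = 0; $i < 50; $i++)
{
    for ($j = 0; $j < 10000; $j++)
    {

    }
}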

Looking at the two for loops above, the total number of iterations is the same. For a MySQL database, however, the two are not equivalent: we should prefer loop ②, that is, let the small table drive the large table.
What hurts a database most is the program repeatedly opening and releasing connections. The first loop establishes 10,000 connections; the second establishes only 50. Suppose only two connections are made, and each one queries a data set of millions of rows and then finishes: that is just two connections, rather than millions of connections being requested and released over and over, which the system cannot withstand.
This is where the comparison between IN and EXISTS comes from.
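The point about connection counts can be made concrete with a small sketch. It is written in Python purely for illustration, and the function name count_work is hypothetical: each pass of the outer loop stands for one connection (one round of driving the other table), and each pass of the inner loop stands for one row examined.

```python
def count_work(outer_rows, inner_rows):
    """Return (outer passes, total inner iterations) for a nested loop."""
    outer_passes = 0
    inner_iterations = 0
    for _ in range(outer_rows):
        outer_passes += 1            # one "connection" per outer row
        for _ in range(inner_rows):
            inner_iterations += 1    # one row examined
    return outer_passes, inner_iterations


large_drives_small = count_work(10000, 50)   # corresponds to loop 1
small_drives_large = count_work(50, 10000)   # corresponds to loop 2
print(large_drives_small)  # (10000, 500000)
print(small_drives_large)  # (50, 500000)
```

Both variants examine the same 500,000 rows in total; only the number of outer passes differs, which is the cost the text argues should be kept small.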

Small table drives large table: the small data set drives the large data set.

Assume that table A is the employee table and table B is the department table.
Suppose there are only three departments: Sales, Technical, and Administration. The query below then means: find all employees who belong to these three departments.

select * from A where id in (select id from B);

This is equivalent to:
for select id from B
for select * from A where A.id = B.id
The first loop reads the ids from table B. For example, Huawei has 100 departments but at least 150,000 to 200,000 employees, far more employees than departments, so this step reads the small table (the department table). The second loop then fetches, for each id obtained from the department table, the rows of A whose id matches.

When the data set of table B is smaller than the data set of table A, IN is better than EXISTS.
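The two-step reading of the IN query can be sketched as follows. This is a plain Python emulation, not what MySQL literally executes: the sample rows, the dictionary index_a (a stand-in for an index on A.id), and the function name select_in are all assumptions made for illustration, and id is taken to be the department id in both tables.

```python
from collections import defaultdict

# Hypothetical sample rows; "id" stands for the department id in both tables.
table_a = [  # employees (the large table in practice)
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 2, "name": "Cid"},
    {"id": 9, "name": "Dee"},   # department 9 does not exist in B
]
table_b = [  # departments (the small table, ids assumed unique)
    {"id": 1, "name": "Sales"},
    {"id": 2, "name": "Technical"},
    {"id": 3, "name": "Administration"},
]

# Stand-in for an index on A.id: department id -> employee rows.
index_a = defaultdict(list)
for row in table_a:
    index_a[row["id"]].append(row)


def select_in(b_rows, a_index):
    """Emulate: select * from A where id in (select id from B)."""
    result = []
    for b in b_rows:                              # outer loop: select id from B
        result.extend(a_index.get(b["id"], []))   # inner lookup: A.id = B.id
    return result


in_names = [row["name"] for row in select_in(table_b, index_a)]
print(in_names)  # ['Ann', 'Bob', 'Cid']
```

The outer loop runs once per department, so its cost follows the size of B, the small table.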
Conversely:

select * from A where exists (select 1 from B where B.id = A.id); -- The select 1 here is not mandatory: it can be written as select 'X', or 'A', 'B', 'C'; any constant will do.

This is equivalent to:
for select * from A, the first loop runs over table A
for select * from B where B.id = A.id, then table B is probed for each row of A.
So EXISTS checks, for each row of table A, whether (select 1 from B where B.id = A.id) finds a match; the subquery yields a Boolean value, TRUE or FALSE. Put simply, EXISTS is better than IN when the data set of table A is smaller than the data set of table B. Note that the id columns of both table A and table B should be indexed.
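The EXISTS reading can be sketched the same way. Again this is only a Python emulation under stated assumptions: the sample data, the set index_b (a stand-in for an index on B.id), and the function name select_exists are hypothetical. The probe counter makes visible that the work is driven by the number of rows in A.

```python
# Hypothetical sample data: A is the small table here, B the large one.
table_a = [{"id": 2}, {"id": 5}, {"id": 7}]
table_b = [{"id": n} for n in range(1, 1001) if n != 7]   # ids 1..1000 without 7

# Stand-in for an index on B.id.
index_b = {row["id"] for row in table_b}


def select_exists(a_rows, b_index):
    """Emulate: select * from A where exists (select 1 from B where B.id = A.id)."""
    kept, probes = [], 0
    for a in a_rows:                 # outer loop: select * from A
        probes += 1                  # one index probe into B per row of A
        if a["id"] in b_index:       # the subquery yields TRUE or FALSE
            kept.append(a)
    return kept, probes


rows, probes = select_exists(table_a, index_b)
print([r["id"] for r in rows], probes)  # [2, 5] 3
```

Only three probes are needed although B holds 999 rows, which is why the text recommends EXISTS when A is the smaller table and B.id is indexed.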

Syntax: EXISTS
select ... from table where exists (subquery)
Understanding: each row of the main query is passed to the subquery for conditional validation, and the validation result (TRUE or FALSE) decides whether that row of the main query is kept.

Supplement:
1: exists (subquery) only returns TRUE or FALSE, so the select * in the subquery can just as well be select 1 or select 'X'. The official statement is that the select list is ignored during actual execution, so there is no difference.
2: The actual execution of an EXISTS subquery may have been optimized rather than follow the row-by-row comparison described here. If you are concerned about efficiency, test the real query.
3: EXISTS subqueries can often be replaced with conditional expressions, other subqueries, or joins. Which is optimal requires analysis of the specific problem.
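Supplement 1 can be checked with a short experiment. The sketch below uses SQLite through Python's standard sqlite3 module as a stand-in, because it runs without a server; it is an assumption that this carries over to a given MySQL setup, and it shows only that the results are equal, not how MySQL plans the query. The tables and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2), (3), (4);
""")

# Same EXISTS query with three different select lists in the subquery.
template = (
    "SELECT id FROM A "
    "WHERE EXISTS (SELECT {cols} FROM B WHERE B.id = A.id) "
    "ORDER BY id"
)
results = {
    cols: [row[0] for row in conn.execute(template.format(cols=cols))]
    for cols in ("*", "1", "'X'")
}
print(results)  # every variant yields [2, 3]
```

For supplement 2, MySQL's EXPLAIN statement can be placed in front of the real query to see how it is actually executed.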

If the two tables in the query are about the same size, IN and EXISTS make little difference.


Extended examples for consolidation:

If one of the two tables is small and the other is large, use EXISTS when the subquery table is large and IN when the subquery table is small.
Example: table A (small table), table B (large table)

select * from A where cc in (select cc from B); -- inefficient: uses the index on the cc column of table A
select * from A where exists (select cc from B where cc = A.cc); -- efficient: uses the index on the cc column of table B

Conversely:

select * from B where cc in (select cc from A); -- efficient: uses the index on the cc column of table B
select * from B where exists (select cc from A where cc = B.cc); -- inefficient: uses the index on the cc column of table A
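The rule of thumb above can be written down as a tiny helper. This is a sketch in Python for illustration only: the names choose_form and build_query and the row counts are hypothetical, and the rule encoded here is the article's heuristic, not how MySQL's optimizer decides.

```python
def choose_form(outer_rows, subquery_rows):
    """Rule of thumb from the text: small subquery table -> IN, large -> EXISTS."""
    if subquery_rows < outer_rows:
        return "IN"
    if subquery_rows > outer_rows:
        return "EXISTS"
    return "either"   # similar sizes: little difference


def build_query(outer, sub, column, outer_rows, subquery_rows):
    """Build the query text in the suggested form.

    Identifiers are interpolated directly, so this is for trusted,
    hard-coded table and column names only.
    """
    if choose_form(outer_rows, subquery_rows) == "EXISTS":
        return (f"select * from {outer} where exists "
                f"(select 1 from {sub} where {sub}.{column} = {outer}.{column})")
    return f"select * from {outer} where {column} in (select {column} from {sub})"


# A is small (100 rows), B is large (1,000,000 rows): assumed sizes.
query_small_outer = build_query("A", "B", "cc", 100, 1_000_000)
query_large_outer = build_query("B", "A", "cc", 1_000_000, 100)
print(query_small_outer)
print(query_large_outer)
```

With A as the outer table the helper produces the EXISTS form, and with B as the outer table it produces the IN form, matching the two efficient variants in the example.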

NOT IN and NOT EXISTS: if the query uses NOT IN, a full table scan is performed and the index is not used, whereas the subquery of NOT EXISTS can still use the index on the table. So regardless of which table is larger, NOT EXISTS is faster than NOT IN.
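Before swapping one form for the other, it helps to confirm that both return the same rows. The sketch below does this with SQLite through Python's sqlite3 module as a stand-in for MySQL; the tables and values are hypothetical. The cc columns are declared NOT NULL on purpose, because the two forms give different results when the subquery can return NULL. The sketch says nothing about speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (cc INTEGER NOT NULL);
    CREATE TABLE B (cc INTEGER NOT NULL);
    CREATE INDEX idx_b_cc ON B (cc);
    INSERT INTO A VALUES (1), (2), (3), (4);
    INSERT INTO B VALUES (2), (4), (6);
""")

# Rows of A whose cc value does not appear in B, written both ways.
not_in = conn.execute(
    "SELECT cc FROM A WHERE cc NOT IN (SELECT cc FROM B) ORDER BY cc"
).fetchall()
not_exists = conn.execute(
    "SELECT cc FROM A WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.cc = A.cc) ORDER BY cc"
).fetchall()
print(not_in, not_exists)  # [(1,), (3,)] [(1,), (3,)]
```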
