Analysis of the use of exists and in MySQL, analysis of mysqlexistsin

Source: Internet
Author: User

Analysis of the use of exists and in MySQL, analysis of mysqlexistsin

Exists External table query by loop one by one. Each query will view the exists conditional statement. When the conditional statement in exists can return the row of the record (no matter how many rows of the record can be returned ), if the condition is true, the record from the current loop is returned. If the statement in exists cannot return the record row, the record from the current loop is discarded, the exists condition is like a bool condition. If a result set can be returned, it is true. If a result set cannot be returned, it is false.

As follows:

Select * from user where exists (select 1 );

The records in the user table are retrieved one by one. Because select 1 in the sub-condition can always return record rows, all records in the user table will be added to the result set, so they are associated with select * from user; is the same

Also as follows

Select * from user where exists (select * from user where userId = 0 );

When loop is performed on the user table, the check Condition Statement (select * from user where userId = 0) is always returned because userId is never 0, if the condition is always false, all records in the user table will be discarded.

Not exists is opposite to exists, that is, when the exists condition returns a result set, the records returned by the loop will be discarded. Otherwise, the records returned by the loop will be added to the result set.

In general, if Table A has n records, the exists query extracts the n records one by one and then judges the n times of the exists condition.

 

 

The in query is equivalent to the superposition of multiple or conditions, which is easy to understand, for example, the following query

Select * from user where userId in (1, 2, 3 );

Equivalent

Select * from user where userId = 1 or userId = 2 or userId = 3;

Not in is opposite to in, as shown below:

Select * from user where userId not in (1, 2, 3 );

Equivalent

Select * from user where userId! = 1 and userId! = 2 and userId! = 3;

In general, an in query is to first retrieve all records of the subquery conditions. Assume that the result set is B and there are m records in total. Then, the results set of the subquery conditions is decomposed into m records, perform m queries again

 

It is worth mentioning that the subcondition returned by the in query must have only one field, for example

Select * from user where userId in (select id from B );

But not

Select * from user where userId in (select id, age from B );

Exists does not have this restriction.

Modify the statements here. For some databases such as mysql, The in query subcondition can return multiple fields. The syntax format is as follows:

Select * from user where (userId, age) in (select id, age from B );

The following describes the performance of exists and in.

Consider the following SQL statement

1: select * from A where exists (select * from B where B. id = A. id );

2: select * from A where A. id in (select id from B );

 

Query 1. The following pseudo code can be converted for easy understanding.

For ($ I = 0; $ I <count (A); $ I ++ ){

$ A = get_record (A, $ I); # obtain records one by one from Table

If (B. id = $ a [id]) # if the sub-condition is true

$ Result [] = $;

}

Return $ result;

This is probably the meaning. In fact, we can see that query 1 mainly uses the index of Table B, and how table A has little impact on the query efficiency.

 

Assume that all IDs in Table B are 1, 2, 3, and query 2 can be converted

Select * from A where A. id = 1 or A. id = 2 or A. id = 3;

This is easy to understand. Here we mainly use the index A, and how table B has little impact on queries.

 

Next, let's look at not exists and not in.

1. select * from A where not exists (select * from B where B. id = A. id );

2. select * from A where A. id not in (select id from B );

Check that query 1 is still the same as above. The index of B is used.

For query 2, it can be converted into the following statement:

Select * from A where A. id! = 1 and A. id! = 2 and A. id! = 3;

We can know that not in is a range query! = The Range Query cannot use any index. It means that every record in Table A must be traversed once in Table B to check whether this record exists in table B.

Therefore, not exists is more efficient than not in.

 

The in statement in mysql uses a hash connection between the external table and the internal table, while the exists statement uses a loop on the External table. Each loop then queries the internal table. We always think that exists is more efficient than in statements. This statement is actually inaccurate. This is to differentiate the environment.
 

If the two tables to be queried are of the same size, there is little difference between in and exists.. If one of the two tables is small and the other is A large table, use exists for A large subquery table and in for A small subquery table, for example, Table A (small table) and Table B (large table) 1: select * from A where cc in (select cc from B) is inefficient and uses the index of the cc column in table; select * from A where exists (select cc from B where cc =. cc) high efficiency, using the cc column index on table B. 2 On the contrary: select * from B where cc in (select cc from A) is highly efficient and uses the index of the cc column on table B; select * from B where exists (select cc from A where cc = B. cc) low efficiency, using the index of the cc column in table. Not in and not exists if the query statement uses not in, the internal and external tables are scanned for the whole table, and no index is used. However, the not extsts subquery can still use the table index. Therefore, whether the table is large, not exists is faster than not in.. Difference between in and = select name from student where name in ('zhang ', 'wang', 'lil', 'zhao '); the result is the same as that of select name from student where name = 'zhang' or name = 'lil' or name = 'wang' or name = 'zhao.
To sum up, I just memorized the SQL optimization method, but ignored the cause. In fact, the best optimization effect is not achieved. I recently encountered an SQL optimization problem. I used to use exists, but the query speed took over 70 seconds. I accidentally came back and read this article I reproduced, the in query speed is changed to more than 2 seconds. You need to know the truth, but also the truth.
Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.