Analysis on the use of exists and in MySQL

Source: Internet
Author: User

Exists External table query by loop one by one. Each query will view the exists conditional statement. When the conditional statement in exists can return the row of the record (no matter how many rows of the record can be returned ), if the condition is true, the record from the current loop is returned. If the Condition Statement in exists cannot return the record row, the record from the current loop is discarded, the exists condition is like a bool condition. If a result set can be returned, it is true. If a result set cannot be returned, it is false.

As follows:

Select * from user where exists (select 1 );

The records in the User table are retrieved one by one. Because select 1 in the sub-condition can always return record rows, all records in the User table will be added to the result set, so they are associated with select * from user; is the same

Also as follows

Select * from user where exists (select * from user where userid = 0 );

When loop is performed on the user table, the check Condition Statement (select * from user where userid = 0) is always returned because userid is never 0, if the condition is always false, all records in the User table will be discarded.

Not exists is opposite to exists, that is, when the exists condition returns a result set, the records returned by the loop will be discarded. Otherwise, the records returned by the loop will be added to the result set.

In general, if Table A has n records, the exists query extracts the N records one by one and then judges the n times of the exists condition.

 

 

The in query is equivalent to the superposition of multiple or conditions, which is easy to understand, for example, the following query

Select * from user where userid in (1, 2, 3 );

Equivalent

Select * from user where userid = 1 or userid = 2 or userid = 3;

Not in is opposite to in, as shown below:

Select * from user where userid not in (1, 2, 3 );

Equivalent

Select * from user where userid! = 1 and userid! = 2 and userid! = 3;

In general, an in query is to first retrieve all records of the subquery conditions. Assume that the result set is B and there are m records in total. Then, the results set of the subquery conditions is decomposed into M records, perform M queries again

 

It is worth mentioning that the subcondition returned by the in query must have only one field, for example

Select * from user where userid in (select ID from B );

But not

Select * from user where userid in (select ID, age from B );

Exists does not have this restriction.

 

The following describes the performance of exists and in.

Consider the following SQL statement

1: Select * from a where exists (select * from B where B. ID = A. ID );

2: Select * from a where a. ID in (select ID from B );

 

Query 1. The following pseudo code can be converted for easy understanding.

For ($ I = 0; $ I <count (a); $ I ++ ){

$ A = get_record (A, $ I); # obtain records one by one from Table

If (B. ID = $ A [ID]) # If the sub-condition is true

$ Result [] = $;

}

Return $ result;

This is probably the meaning. In fact, we can see that query 1 mainly uses the index of Table B, and how table A has little impact on the query efficiency.

 

Assume that all IDs in Table B are 1, 2, 3, and query 2 can be converted

Select * from a where a. ID = 1 or a. ID = 2 or a. ID = 3; (this is equivalent in most databases, but not in MySQL)

This is easy to understand. Here we mainly use the index A, and how table B has little impact on queries.

MySQL's in (n data records) will be first sorted by the database, and then the binary search will be used to find the value that matches the data in m tables in the data of n data records. Aside from the index, its efficiency is at the logn level.

The efficiency of ID = 1 or ID = 2 or ID = 3 is high.

 

 

Next, let's look at not exists and not in.

1. Select * from a where not exists (select * from B where B. ID = A. ID );

2. Select * from a where a. id not in (select ID from B );

Check that query 1 is still the same as above. The index of B is used.

For query 2, it can be converted into the following statement:

Select * from a where a. ID! = 1 and A. ID! = 2 and A. ID! = 3;

We can know that not in is a range query! = The Range Query cannot use any index. It means that every record in Table A must be traversed once in Table B to check whether this record exists in table B.

Therefore, not exists is more efficient than not in.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.