Transfer from http://sunxiaqw.blog.163.com/blog/static/990654382013430105130443/
exists on the appearance with loop-by-article query, each query will look at the exists conditional statement, when the conditional statement in exists can return the record row (regardless of the number of record lines, as long as can return), the condition is true, return the current loop to this record, Conversely, if the conditional statement in the exists cannot return a record line, the current loop to the record is discarded, the exists condition is like a bool condition, true if the result set can be returned, false if the result set cannot be returned
As follows:
SELECT * from user where exists (select 1);
The record of the user table is removed one by one, because the Select 1 in the sub-condition can always return the record row, then all records of the user table will be added to the result set, so with the select * from user;
Also as follows
SELECT * from user where exists (SELECT * from user where userId = 0);
You can know that when you loop the user table, check the conditional statement (SELECT * from user where UserID = 0), because the userId is never 0, so the conditional statement will always return an empty set, the condition is always false, then all the records of the user table will be discarded
Not exists is the opposite of exists, that is, when the exists condition has a result set returned, the record to which the loop will be discarded, or the record to which the loop is to be added to the result set
In general, if a table has n records, then the exists query is to take these n records out one by one and then judge N Times exists condition
The in query is equivalent to multiple or conditional overlays, which is better understood, such as the following query
SELECT * from user where userId in (1, 2, 3);
is equivalent to
SELECT * FROM user where UserID = 1 or UserID = 2 or UserID = 3;
Instead of in and in, the following
SELECT * from the user where userId not in (1, 2, 3);
is equivalent to
SELECT * FROM user where userId! = 1 and UserID! = 2 and UserID! = 3;
In general, in query is the subquery condition of the records are all detected, assuming that the result set is B, a total of M records, and then in the sub-query condition of the result set decomposition into m, and then the M query
It is worth mentioning that the in query sub-condition Returns the result must have only one field, for example
SELECT * FROM user-where userId in (select-ID from B);
and cannot be
SELECT * from the user where userId in (select ID, age from B);
and exists there's no such limit.
Below, consider the performance of exists and in
Consider the following SQL statement
1:select * from A where exists (SELECT * from B where b.id = a.id);
2:select * from A where a.id in (select ID from B);
Query 1. You can convert the following pseudo-code to make it easier to understand
for ($i = 0; $i < count (A); $i + +) {
$a = Get_record (A, $i); #从A表逐条获取记录
if (b.id = $a [id]) #如果子条件成立
$result [] = $a;
}
return $result;
This is probably the meaning, in fact, we can see that the query 1 is mainly used in the index of B table, a table how the efficiency of the query should not be small
Assuming all the IDs of table B are all-in-all, query 2 can be converted to
SELECT * from A where a.id = 1 or a.id = 2 or a.id = 3;
This is a good understanding, this is mainly used in the index of a, B table how to affect the query is not small
See not exists and not in
1. Select * from A where is not EXISTS (SELECT * from B where b.id = a.id);
2. Select * from A where a.id not in (select ID from B);
Look at query 1, or the same as above, using the index of B
For Query 2, you can convert to the following statement
SELECT * FROM A where a.id! = 1 and a.ID! = 2 and a.ID! = 3;
You can know that not in is a range query, this! = Range query can not use any index, equal to say every record of table A, should be traversed in B tables, to see if there is a record in B table
Therefore, not exists is more efficient than not
In the MySQL in statement is the appearance and the inner table as a hash connection, and the EXISTS statement is the external loop loop, each loop loop and then query the internal table. It is not accurate to say that exists is more efficient than the in statement. This is to distinguish the environment.
If the two table size of the query is equal, then the in and exists are not very different . if one of the two tables is smaller and one is a large table, then the subquery table is large with exists, and the subquery table is small in:Example: Table A (small table), table B (large table) 1:select * from A where CC in (select CC from B)Low Efficiency, using the index of the CC column on table A; select * from A where exists (select cc from B where cc=a.cc)High Efficiency, the index of the CC column on table B is used. Opposite 2:select * from B where CC in (select CC from A)High Efficiency, using the index of the CC column on table B; select * from B where exists (select cc from A where cc=b.cc)Low Efficiency, the index of the CC column on table A is used. Not-in and not-exists if the query statement uses not-in to perform a full-table scan of the outer surface, the index is not used, and the index on the table is still used by not Extsts's subquery.
so no matter how big the table is, using not exists is faster than not.。 The difference between in and = select name from student where name in (' Zhang ', ' Wang ', ' Li ', ' Zhao '); The result is the same as the select name from student where Name= ' Zhang ' or name= ' li ' or name= ' Wang ' or name= ' Zhao '.
A brief analysis of the use of exists and in in MySQL (very good writing)