SQLDifference between exists and in
Two tables, user table tdefuser (userid, address, phone) and consumption table taccconsume (userid, time, amount), need to query the user records that consume more than 5000.
Use exists:
Select * From tdefuser
Where exists (select 1 from taccconsume wheretdefuser. userid = taccconsume. userid and taccconsume. amount> 5000)
In:
Select * From tdefuser
Where userid in (select userid from taccconsume wheretaccconsume. amount> 5000)
Generally, exists is more efficient than in.
The subquery after exists () is called a subquery and does not return the value of the list. only returns a true or false result (this is why the subquery is "select 1", and it is exactly the same with "select6". Of course, you can also select the field, but it is obviously less efficient)
The running mode is to run the primary query first and then query its corresponding results in the subquery. If it is ture, the output is not displayed. then, query in the subquery based on each row in the primary query.
The subquery after in () returns the result set. In other words, the execution order is different from that of exists. the subquery first generates a result set, and then the primary query goes to the result set to find the list of fields that meet the requirements. output that meets the requirements, otherwise no output.
For example, the user table tdefuser (userid, address, phone), the consumption table taccconsume (userid, time, amount) data is as follows:
The clustered index of the consumption table is userid, time
Data (note that due to clustered indexes, the actual storage is in the following order)
1 2006-1-1 200
1 2006-1-2 300
1 2006-1-2 500
1 2006-1-3 2000
1 2006-1-3 2000
1 2006-1-4 400
1 2006-1-5 500
2 2006-1-1 200
2 2006-1-2 300
2 2006-1-2 500
2 2006-1-3 2000
2 2006-1-3 6000
2 2006-1-4 400
2 2006-1-5 8000
3 2006-1-1 7000
3 2006-1-2 30000
3 2006-1-2 50000
3 2006-1-3 20000
Statement:
Select * From tdefuser
Where exists (select 1 from taccconsume wheretdefuser. userid = taccconsume. userid and taccconsume. amount> 5000)
For userid = 1, false is returned only when all records are found, which is similar to the efficiency of the second statement.
For userid = 2, if the record-1-3 is found, true is returned, which is more efficient than the first statement.
For userid = 3, the first record returns true, which is more efficient than the second statement.
Statement
Select * From tdefuser
Where userid in (select userid from taccconsume wheretaccconsume. amount> 5000)
Returns an empty record set.
2
2
3
3
3
3
Judge again
Statement
Select * From tdefuser
Where userid in (select userid from taccconsume where userid = tdefuser. userid andamount> 5000)
For userid = 1, all records need to be found, empty record set is returned, comparison judgment
For userid = 2, all records need to be found and the record set is returned.
2
2
, Comparison and judgment
For userid = 3, all records need to be found and the record set is returned.
3
3
3
3
, Comparison and judgment
If there is no clustered index in the table, the number of items to be searched by each userid of exists is different, but they are all <= the number of items to be scanned by the third statement, extreme (for example, 5000 of all items are at the end) similar to the efficiency of the third statement, it is generally faster than the second statement. Therefore, "general" exists is more efficient than in.
All, any, someDifference
1. All
The entire condition is true only when all data meets the conditions.
Select *
From
Where 5> All (select ID from a) -- all values must be greater than 5
2. Any
If any piece of data meets the conditions, the entire condition is true.
Select *
From
Where 3> Any (select ID from a) -- if the result contains a condition whose value is less than 3
Go
3. Some and any indicate the same limits.