Comparing EXISTS and IN, NOT EXISTS and NOT IN

Source: Internet
Author: User

EXISTS

EXISTS returns TRUE or FALSE, depending on whether the result set of the subquery contains at least one row. Unlike other predicates (such as IN) and logical expressions, EXISTS cannot return UNKNOWN: a subquery either returns rows or it does not. If the subquery's filter evaluates to UNKNOWN for a particular row, that row is not included in the subquery's result set, because a filter treats UNKNOWN like FALSE. In other words, when the subquery has a filter, EXISTS returns TRUE only if the filter evaluates to TRUE for at least one row.
First, look at an example of EXISTS. The following query returns the customers from Spain in the Customers table that have placed orders in the Orders table.
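The two-valued behavior of EXISTS described above can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example runs anywhere), and the one-column table t is hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient stand-in engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")   # hypothetical table
conn.execute("INSERT INTO t VALUES (NULL)")

# The filter x = 1 evaluates to UNKNOWN for the NULL row, so the subquery
# returns no rows and EXISTS yields false (0), never NULL.
exists_result = conn.execute(
    "SELECT EXISTS (SELECT * FROM t WHERE x = 1)"
).fetchone()[0]

# For comparison, IN over the same column yields UNKNOWN (NULL, i.e. None).
in_result = conn.execute("SELECT 1 IN (SELECT x FROM t)").fetchone()[0]

print(exists_result, in_result)
```

Here exists_result is a plain false value, while in_result is NULL, which is the difference the paragraph above describes.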
SELECT CustomerID, CompanyName
FROM dbo.Customers AS C
WHERE Country = N'Spain'
  AND EXISTS
    (SELECT * FROM dbo.Orders AS O
     WHERE O.CustomerID = C.CustomerID);
/*
CustomerID CompanyName
---------- ----------------------------------------
BOLID      Bólido Comidas preparadas
GALED      Galería del gastrónomo
GODOS      Godos Cocina Típica
ROMEY      Romero y tomillo

(4 row(s) affected)
*/
Tip: Using * in the subquery here is safe, even though SELECT * is generally discouraged. The query optimizer ignores the SELECT list of the subquery, because EXISTS cares only about whether rows exist, not about any of their attributes.

The execution plan first scans the Customers table and filters it on the Spain condition. For each qualifying customer, the plan checks whether the Orders table contains that customer's CustomerID by seeking the index on the Orders.CustomerID column. The index on CustomerID is very helpful for this query, because it gives direct access to the rows in Orders with the specified CustomerID, without a full table scan.

EXISTS vs. IN

Many people want to know whether a query with an EXISTS predicate is more efficient than a logically equivalent query with an IN predicate. For example, the preceding query can also be written with an IN predicate and a self-contained subquery:
SELECT CustomerID, CompanyName
FROM dbo.Customers AS C
WHERE Country = N'Spain'
  AND CustomerID IN (SELECT CustomerID FROM dbo.Orders);
In earlier versions of SQL Server, the execution plans for the two queries differed, and EXISTS was typically more efficient because of its inherent short-circuiting. In SQL Server 2005, however, when the two queries are logically equivalent, as they are here, the query optimizer usually generates the same execution plan for both.
If you keep three-valued logic (TRUE, FALSE, UNKNOWN) in mind, you may realize that IN and EXISTS are not the same. Unlike EXISTS, IN produces UNKNOWN when the input list contains a NULL and no other value matches. For example, a IN (b, c, NULL) is UNKNOWN when a equals neither b nor c. In a filter, however, UNKNOWN is treated like FALSE, so the query results are the same with either the IN predicate or the EXISTS predicate. The query optimizer recognizes this and generates the same execution plan.
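As a rough illustration of the three-valued logic just described, the sketch below evaluates IN against a list containing NULL. It again assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, and the helper evaluate is a hypothetical name introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def evaluate(expr):
    """Return the value of a scalar SQL expression (None stands for UNKNOWN)."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

# A value that is in the list: the NULL does not matter, the result is TRUE.
match = evaluate("'a' IN ('a', 'b', NULL)")

# A value that is not in the list: because of the NULL, the result is UNKNOWN.
no_match = evaluate("'c' IN ('a', 'b', NULL)")

# In a WHERE clause, UNKNOWN behaves like FALSE: the row is filtered out.
rows = conn.execute(
    "SELECT 'kept' WHERE 'c' IN ('a', 'b', NULL)"
).fetchall()

print(match, no_match, rows)
```

Because the filter discards UNKNOWN and FALSE alike, a positive IN and the corresponding EXISTS return the same rows even when the list contains a NULL.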

NOT EXISTS vs. NOT IN

The logical difference between IN and EXISTS does matter once the predicates are negated: NOT EXISTS and NOT IN behave differently when the input list of NOT IN may contain a NULL.
For example, suppose you want to return all customers from Spain who have not placed any orders. The following query does this with the NOT EXISTS predicate:
SELECT CustomerID, CompanyName
FROM dbo.Customers AS C
WHERE Country = N'Spain'
  AND NOT EXISTS
    (SELECT * FROM dbo.Orders AS O
     WHERE O.CustomerID = C.CustomerID);
/*
CustomerID CompanyName
---------- ----------------------------------------
FISSA      FISSA Fabrica Inter. Salchichas S.A.

(1 row(s) affected)
*/
Even if the CustomerID column in the Orders table contains a NULL, the result of this query is not affected. You get every customer from Spain for whom SQL Server cannot find any order in the Orders table.

The execution plan scans the Customers table and filters the customers from Spain. For each qualifying customer, the plan performs a seek in the index on Orders.CustomerID. The Top operator appears in the plan because it is only necessary to determine whether at least one order exists for the customer; this is the short-circuiting capability of EXISTS in action. It is especially efficient when the Orders.CustomerID column has a high density (that is, many duplicates). The seek is performed only once per customer, regardless of how many orders that customer has, and only one row is scanned at the leaf level (the bottom level of the index) to look for a match, rather than all matching rows.
The following query uses the NOT IN predicate and returns the same output. It appears to have the same meaning, but as you will see, it does not.
SELECT CustomerID, CompanyName
FROM dbo.Customers AS C
WHERE Country = N'Spain'
  AND CustomerID NOT IN (SELECT CustomerID FROM dbo.Orders);
If you examine the execution plan, you will find that it differs from the one generated for the NOT EXISTS query.

The plan starts with some additional operations. Compared with the previous (NOT EXISTS) plan, it also has to look for NULL values in the CustomerID column. Why are the two execution plans different? Why does SQL Server pay special attention to NULLs in the Orders.CustomerID column?
The difference between the two execution plans does not affect the query results here, because no row in the Orders table has a NULL CustomerID. However, because the CustomerID column allows NULLs, the query optimizer must take that case into account. Now add a row with a NULL CustomerID to the Orders table:
INSERT INTO dbo.Orders DEFAULT VALUES;
Rerun the two preceding queries. The NOT EXISTS query returns the same result as before, while the NOT IN query now returns an empty set.
In fact, as long as the Orders.CustomerID column contains a NULL, the NOT IN query always returns an empty set. The reason is that the predicate val IN (val1, val2, ..., NULL) can only return TRUE or UNKNOWN. Therefore val NOT IN (val1, val2, ..., NULL) can only return NOT TRUE (that is, FALSE) or NOT UNKNOWN (that is, UNKNOWN), and neither of them is TRUE.
For example, suppose the set of customer IDs returned by the subquery is (a, b, NULL). Customer a appears in the set, so a IN (a, b, NULL) is TRUE and a NOT IN (a, b, NULL) is NOT TRUE, that is, FALSE; customer a is not returned by the query. Customer c does not appear in the set, but because of the NULL, c IN (a, b, NULL) is UNKNOWN, and c NOT IN (a, b, NULL) is NOT UNKNOWN, which is still UNKNOWN. Customer c is therefore not returned by the outer query either and does not appear in the customer list.
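The reasoning above can be reproduced with a minimal sketch. It assumes SQLite (via Python's sqlite3 module) rather than SQL Server, and uses cut-down, hypothetical Customers and Orders tables holding the customers A, B, and C from the example, with one order row whose CustomerID is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of the tables used in the text.
    CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
    INSERT INTO Customers VALUES ('A'), ('B'), ('C');
    INSERT INTO Orders (CustomerID) VALUES ('A'), ('B'), (NULL);
""")

# NOT EXISTS: the NULL order row never matches any customer, so it is ignored.
not_exists = conn.execute("""
    SELECT CustomerID FROM Customers AS C
    WHERE NOT EXISTS (SELECT * FROM Orders AS O
                      WHERE O.CustomerID = C.CustomerID)
""").fetchall()

# NOT IN: the NULL in the subquery result makes the predicate FALSE or
# UNKNOWN for every customer, so no row qualifies.
not_in = conn.execute("""
    SELECT CustomerID FROM Customers
    WHERE CustomerID NOT IN (SELECT CustomerID FROM Orders)
""").fetchall()

print(not_exists, not_in)
```

NOT EXISTS still finds customer C, the only one without orders, while NOT IN returns nothing at all.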
Whether or not a customer appears in the set, as long as the set contains a NULL, the customer is not returned by the query. The point to remember is that when a column may contain NULLs, NOT EXISTS and NOT IN are not logically equivalent. This explains the difference between the two execution plans and the potential difference in results.
If you want the NOT IN query to be logically equivalent to the NOT EXISTS query, either declare the column as NOT NULL (if appropriate) or add a filter to the subquery that explicitly excludes NULLs:
SELECT CustomerID, CompanyName
FROM dbo.Customers AS C
WHERE Country = N'Spain'
  AND CustomerID NOT IN (SELECT CustomerID FROM dbo.Orders
                         WHERE CustomerID IS NOT NULL);
/*
CustomerID CompanyName
---------- ----------------------------------------
FISSA      FISSA Fabrica Inter. Salchichas S.A.

(1 row(s) affected)
*/
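To check that excluding NULLs in the subquery makes NOT IN agree with NOT EXISTS again, here is a small sketch under the same assumptions as before: SQLite through Python's sqlite3 module instead of SQL Server, and simplified, hypothetical Customers and Orders tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of the tables used in the text.
    CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
    INSERT INTO Customers VALUES ('A'), ('B'), ('C');
    INSERT INTO Orders (CustomerID) VALUES ('A'), ('B'), (NULL);
""")

# NOT IN with the NULLs filtered out of the subquery result.
not_in_filtered = conn.execute("""
    SELECT CustomerID FROM Customers
    WHERE CustomerID NOT IN (SELECT CustomerID FROM Orders
                             WHERE CustomerID IS NOT NULL)
""").fetchall()

# NOT EXISTS, for comparison.
not_exists = conn.execute("""
    SELECT CustomerID FROM Customers AS C
    WHERE NOT EXISTS (SELECT * FROM Orders AS O
                      WHERE O.CustomerID = C.CustomerID)
""").fetchall()

print(not_in_filtered, not_exists)
```

With the IS NOT NULL filter in place, both queries return only customer C.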

 
