Development essay -- not in vs NOT EXISTS

Source: Internet
Author: User

Source: http://blog.csdn.net/dba_huangzj/article/details/31374037 reproduced Please reference

I have seen a discussion on in/exists in the forum. I am too lazy to find the original post. Here I will introduce the recent study summary:

Not in and not eixts may have certain risks when querying columns that are allowed to be null. IN particular, not in. If a subquery contains at least one NULL value, unexpected results will appear. The following is a demonstration.

 

IF OBJECT_ID('ShipmentItems', 'U') IS NOT NULL     DROP TABLE dbo.ShipmentItems; GO CREATE TABLE dbo.ShipmentItems     (       ShipmentBarcode VARCHAR(30) NOT NULL ,       Description VARCHAR(100) NULL ,       Barcode VARCHAR(30) NOT NULL     ); GO INSERT  INTO dbo.ShipmentItems         ( ShipmentBarcode ,           Barcode ,           Description         )         SELECT  '123456' ,                 '1010203' ,                 'Some cool widget'         UNION ALL         SELECT  '123654' ,                 '1010203' ,                 'Some cool widget'         UNION ALL         SELECT  '123654' ,                 '1010204' ,                 'Some cool stuff for some gadget'; GO -- retrieve all the items from shipment 123654 -- that are not shipped in shipment 123456 SELECT  Barcode FROM    dbo.ShipmentItems WHERE   ShipmentBarcode = '123654'         AND Barcode NOT IN ( SELECT Barcode                              FROM   dbo.ShipmentItems                              WHERE  ShipmentBarcode = '123456' ); /* Barcode ------------------------------ 1010204*/


 

We can see the expected results. Let's take a look at modifying the table structure and allow the column to be null:

ALTER TABLE dbo.ShipmentItems ALTER COLUMN Barcode VARCHAR(30) NULL; INSERT  INTO dbo.ShipmentItems         ( ShipmentBarcode ,           Barcode ,           Description         )         SELECT  '123456' ,                 NULL ,                 'Users manual for some gadget'; GO SELECT  Barcode FROM    dbo.ShipmentItems WHERE   ShipmentBarcode = '123654'         AND Barcode NOT IN ( SELECT Barcode                              FROM   dbo.ShipmentItems                              WHERE  ShipmentBarcode = '123456' ); /* Barcode ------------------------------*/


Many people think this is a bug. Sometimes data can be found, but sometimes data cannot be found. But it is actually NOT a bug. When the not in Clause returns at least one NULL value, the query returns NULL. The following statement can better illustrate this idea:

SELECT  CASE WHEN 1 NOT IN ( 2, 3 ) THEN 'True'              ELSE 'Unknown or False'         END ,         CASE WHEN 1 NOT IN ( 2, 3, NULL ) THEN 'True'              ELSE 'Unknown or False'         END; /* ---- ---------------- True Unknown or False */


IN fact, because the essence of IN is the OR operation, so:


SELECT  CASE WHEN 1 IN ( 1, 2, NULL ) THEN 'True'              ELSE 'Unknown or False'         END ;

In, 1 in 1, that is, TRUE, so true is returned. The logic of this statement is actually:

SELECT  CASE WHEN ( 1 = 1 )                   OR ( 1 = 2 )                   OR ( 1 = NULL ) THEN 'True'              ELSE 'Unknown or False'         END ; 


When not in is used, the following statement is used:

SELECT  CASE WHEN 1 NOT IN ( 1, 2, NULL ) THEN 'True'              ELSE 'Unknown or False'         END ;


Will be converted:

SELECT  CASE WHEN NOT ( ( 1 = 1 )                         OR ( 1 = 2 )                         OR ( 1 = NULL )                       ) THEN 'True'              ELSE 'Unknown or False' END ;


According to the concept of discrete mathematics, it can be converted:

SELECT  CASE WHEN ( ( 1 <> 1 )                     AND ( 1 <> 2 )                     AND ( 1 <> NULL )                   ) THEN 'True'              ELSE 'Unknown or False'         END ;


The predicate has the short-circuit feature, that is, in the AND condition, if one condition is false, the entire condition is false, AND 1 <> 1 is false, therefore, the latter part does not need to be determined, and the else part is returned directly. Even if it is 1 <> null, according to the characteristics of the set theory, the comparison between NULL and actual data always returns unknown, so it is also false. If you have to use not in, make sure that the subquery will never return NULL. Or you need to remove NULL with additional processing, for example:

SELECT  Barcode FROM    dbo.ShipmentItems WHERE   ShipmentBarcode = '123654'   AND Barcode NOT IN ( SELECT Barcode                        FROM   dbo.ShipmentItems                        WHERE  ShipmentBarcode = '123456'                          AND Barcode IS NOT NULL ) ;


Another method is to rewrite the statement and replace it with not exists:

SELECT  i.Barcode FROM    dbo.ShipmentItems AS i WHERE   i.ShipmentBarcode = '123654'         AND NOT EXISTS ( SELECT *                          FROM   dbo.ShipmentItems AS i1                          WHERE  i1.ShipmentBarcode = '123456'                                 AND i1.Barcode = i.Barcode ); /* Barcode ------------------------------ 1010204*/


IN addition, based on the SARG requirements, it is generally NOT recommended to use reverse scanning such as not in/not exists to avoid affecting performance. There is also a choice to use IN/EXISTS key point, is the problem of multi-column match, IN the T-SQL, multi-column match to use EXISTS at the same time, while single column match can use EXISTS/IN. You may use other methods to implement IN Multi-column matching, but I usually choose EXISTS to match multiple columns.


Original article from: CSDN blog: Huang zhaoji's blog

Related Article

E-Commerce Solutions

Leverage the same tools powering the Alibaba Ecosystem

Learn more >

Apsara Conference 2019

The Rise of Data Intelligence, September 25th - 27th, Hangzhou, China

Learn more >

Alibaba Cloud Free Trial

Learn and experience the power of Alibaba Cloud with a free trial worth $300-1200 USD

Learn more >

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.