開發隨筆——NOT IN vs NOT EXISTS

來源:互聯網
上載者:User

原文出處: http://blog.csdn.net/dba_huangzj/article/details/31374037  轉載請引用

之前在論壇中見到一個針對in/exists的討論,原帖懶得找了,這裡介紹一下最近的學習小結:

NOT IN和NOT EIXTS在對允許為null的列查詢時會有一定的風險。特別是NOT IN,如果子查詢包含了最少一個NULL,會出現非預期的結果。下面做一個示範。

 

IF OBJECT_ID('ShipmentItems', 'U') IS NOT NULL     DROP TABLE dbo.ShipmentItems; GO CREATE TABLE dbo.ShipmentItems     (       ShipmentBarcode VARCHAR(30) NOT NULL ,       Description VARCHAR(100) NULL ,       Barcode VARCHAR(30) NOT NULL     ); GO INSERT  INTO dbo.ShipmentItems         ( ShipmentBarcode ,           Barcode ,           Description         )         SELECT  '123456' ,                 '1010203' ,                 'Some cool widget'         UNION ALL         SELECT  '123654' ,                 '1010203' ,                 'Some cool widget'         UNION ALL         SELECT  '123654' ,                 '1010204' ,                 'Some cool stuff for some gadget'; GO -- retrieve all the items from shipment 123654 -- that are not shipped in shipment 123456 SELECT  Barcode FROM    dbo.ShipmentItems WHERE   ShipmentBarcode = '123654'         AND Barcode NOT IN ( SELECT Barcode                              FROM   dbo.ShipmentItems                              WHERE  ShipmentBarcode = '123456' ); /* Barcode ------------------------------ 1010204*/


 

可以看出得到了期待結果。下面看看修改表結構,允許列為null的情況:

ALTER TABLE dbo.ShipmentItems ALTER COLUMN Barcode VARCHAR(30) NULL; INSERT  INTO dbo.ShipmentItems         ( ShipmentBarcode ,           Barcode ,           Description         )         SELECT  '123456' ,                 NULL ,                 'Users manual for some gadget'; GO SELECT  Barcode FROM    dbo.ShipmentItems WHERE   ShipmentBarcode = '123654'         AND Barcode NOT IN ( SELECT Barcode                              FROM   dbo.ShipmentItems                              WHERE  ShipmentBarcode = '123456' ); /* Barcode ------------------------------*/


很多人會覺得這是一個bug,有時候能查出資料,有時候卻不能。但是實際上不是bug,當NOT IN子句返回最少一個NULL時,查詢會返回空,下面的語句能更好地說明這個想法:

SELECT  CASE WHEN 1 NOT IN ( 2, 3 ) THEN 'True'              ELSE 'Unknown or False'         END ,         CASE WHEN 1 NOT IN ( 2, 3, NULL ) THEN 'True'              ELSE 'Unknown or False'         END; /* ---- ---------------- True Unknown or False */


實際上,由於IN的本質是OR操作,所以:


SELECT  CASE WHEN 1 IN ( 1, 2, NULL ) THEN 'True'              ELSE 'Unknown or False'         END ;

中,1 in 1,也就是為TRUE,所以返回true,這個語句的邏輯實際上是:

SELECT  CASE WHEN ( 1 = 1 )                   OR ( 1 = 2 )                   OR ( 1 = NULL ) THEN 'True'              ELSE 'Unknown or False'         END ; 


當使用NOT IN 時,如下面的語句:

SELECT  CASE WHEN 1 NOT IN ( 1, 2, NULL ) THEN 'True'              ELSE 'Unknown or False'         END ;


會轉變成:

SELECT  CASE WHEN NOT ( ( 1 = 1 )                         OR ( 1 = 2 )                         OR ( 1 = NULL )                       ) THEN 'True'              ELSE 'Unknown or False' END ;


根據離散數學的概念,可以轉換為:

SELECT  CASE WHEN ( ( 1 <> 1 )                     AND ( 1 <> 2 )                     AND ( 1 <> NULL )                   ) THEN 'True'              ELSE 'Unknown or False'         END ;


謂詞有短路特性,即在AND條件中,只要有一個條件為false,整個條件都為false,而1<>1是為false,所以後面的也不需要判斷了,直接返回else部分。即使是1<>null,根據集合論的特性,NULL和實際資料的對比總是返回unknown,所以也是為false。如果你非要用NOT IN ,請確保子查詢永遠不會有NULL返回。或者需要額外處理去除NULL,比如:

SELECT  Barcode FROM    dbo.ShipmentItems WHERE   ShipmentBarcode = '123654'   AND Barcode NOT IN ( SELECT Barcode                        FROM   dbo.ShipmentItems                        WHERE  ShipmentBarcode = '123456'                          AND Barcode IS NOT NULL ) ;


還有一種方法就是改寫語句,用NOT EXISTS來等價替換:

SELECT  i.Barcode FROM    dbo.ShipmentItems AS i WHERE   i.ShipmentBarcode = '123654'         AND NOT EXISTS ( SELECT *                          FROM   dbo.ShipmentItems AS i1                          WHERE  i1.ShipmentBarcode = '123456'                                 AND i1.Barcode = i.Barcode ); /* Barcode ------------------------------ 1010204*/


另外,基於SARG要求,一般不建議用NOT IN/NOT EXISTS這種反向掃描,避免影響效能。還有一個選擇使用IN/EXISTS的要點,就是多列匹配的問題,在T-SQL中,多列同時匹配要用EXISTS,而單列匹配可以用EXISTS/IN。可能可以用其他寫法來實現IN的多列匹配,但是一般我個人會選擇使用EXISTS來匹配多列。


原文出自:CSDN部落格:黃釗吉的部落格

相關文章

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.