It mainly speaks of the difference in performance between except and not.
Copy Code code as follows:
CREATE TABLE tb1 (ID int)
CREATE TABLE TB2 (ID int)
BEGIN TRAN
DECLARE @i INT = 500
While @i > 0
Begin
INSERT into DBO.TB1
VALUES (@i--V-int
)
SET @i = @i-1
End
When I tested it, TB1 was 1000,TB2 500.
Copy Code code as follows:
DBCC Freesystemcache (' All ', ' default ');
SET STATISTICS IO on
SET STATISTICS time on
SELECT * FROM TB1 EXCEPT select * from TB2;
SELECT * FROM TB1 WHERE ID isn't in (select ID from TB2);
SET STATISTICS IO off
SET STATISTICS Time off
Execution Plan:
Copy Code code as follows:
SELECT * FROM TB1 EXCEPT select * from TB2;
|--merge Join (right Anti Semi join, Merge: ([master1].[ DBO]. [TB2]. [ID]) = ([Master1]. [dbo]. [TB1]. [ID]), residual: ([Master1]. [dbo]. [TB1]. [ID] = [Master1]. [dbo]. [TB2]. [ID]))
|--sort (DISTINCT ORDER BY: ([master1].[ DBO]. [TB2]. [ID] ASC))
| |--table Scan (OBJECT: ([master1].[ DBO]. [TB2])
|--sort (DISTINCT ORDER BY: ([master1].[ DBO]. [TB1]. [ID] ASC))
|--table Scan (OBJECT: ([master1].[ DBO]. [TB1])
Copy Code code as follows:
SELECT * FROM TB1 WHERE ID isn't in (select ID from TB2);
|--hash Match (right Anti Semi Join, Hash: ([master1].[ DBO]. [TB2]. [ID]) = ([Master1]. [dbo]. [TB1]. [ID]), residual: ([Master1]. [dbo]. [TB1]. [Id]=[master1]. [dbo]. [TB2]. [ID]))
|--table Scan (OBJECT: ([master1].[ DBO]. [TB2])
|--nested Loops (left Anti Semi Join)
|--nested Loops (left Anti Semi Join, Where: ([master1].[ DBO]. [TB1]. [ID] is NULL)
| |--table Scan (OBJECT: ([master1].[ DBO]. [TB1])
| |--top (Top EXPRESSION: ((1)))
| |--table Scan (OBJECT: ([master1].[ DBO]. [TB2])
|--row Count Spool
|--table Scan (OBJECT: ([master1].[ DBO]. [TB2]), Where: ([Master1]. [dbo]. [TB2]. [ID] is NULL)
SQL Server Execution Time:
CPU time = 0 milliseconds, elapsed time = 0 milliseconds.
(500 rows affected)
Table ' TB1 '. Scan count 1, logical read 2 times, physical read 0 times, read 0 times, LOB logic read 0 times, lob physical read 0 times, lob read 0 times.
Table ' TB2 '. Scan count 1, logical read 1 times, physical read 0 times, read 0 times, LOB logic read 0 times, lob physical read 0 times, lob read 0 times.
(6 rows affected)
(1 rows affected)
SQL Server Execution Time:
CPU time = 0 milliseconds, elapsed time = 528 milliseconds.
(500 rows affected)
Table ' worktable '. Scan count 0, logical read 0 times, physical read 0 times, read 0 times, LOB logic read 0 times, lob physical read 0 times, lob read 0 times.
Table ' TB2 '. Scan count 3, logical read 1002 times, physical read 0 times, read 0 times, LOB logic read 0 times, lob physical read 0 times, lob read 0 times.
Table ' TB1 '. Scan count 1, logical read 2 times, physical read 0 times, read 0 times, LOB logic read 0 times, lob physical read 0 times, lob read 0 times.
(10 rows affected)
(1 rows affected)
SQL Server Execution Time:
CPU time = 16 milliseconds, elapsed time = 498 milliseconds.
SQL Server Execution Time:
CPU time = 0 milliseconds, elapsed time = 0 milliseconds.
Conclusion: With more data and less data, the except performance is better than that in the case of less data, but EXECPT is superior to not in the case of more data.
Looking at the execution plan to learn how to index on TB1 and TB2, the execution plan for except can be optimized.
If you are interested you can look at the implementation plan for not exists. Suggestions:
Don't be superstitious about test results, because all of the performance is closely related to the execution plan. Implementation plans and statistics are inextricably linked.
So excessive superstition test results may have a performance impact on the production library that does not achieve the expected performance effect.