In today's article, I would like to discuss the INTERSECT set operation in SQL Server. The INTERSECT set operation intersects two record sets and returns only the records whose values are the same in both sets.
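As a minimal sketch of that definition, the query below intersects two small inline value lists. The values and the derived-table aliases `a`, `b`, and `v` are made up for this illustration and are not part of the article's test tables. It should return the values 2 and 3, each only once, because INTERSECT returns distinct rows.

```sql
-- Illustrative values only; a, b and v are hypothetical names for this sketch.
SELECT v FROM (VALUES (1), (2), (2), (3)) AS a(v)
INTERSECT
SELECT v FROM (VALUES (2), (3), (4)) AS b(v)
GO
```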
INTERSECT and INNER JOIN
You will find that it is almost the same as an INNER JOIN between two tables, but today I will show you some important differences between them. Let's start by creating two simple tables as input.
-- Create the 1st table
CREATE TABLE T1
(
    Col1 INT,
    Col2 INT,
    Col3 INT
)
GO

-- Create the 2nd table
CREATE TABLE T2
(
    Col1 INT,
    Col2 INT
)
GO

-- Create a unique Clustered Index on both tables
CREATE UNIQUE CLUSTERED INDEX idx_ci ON T1 (Col1)
CREATE UNIQUE CLUSTERED INDEX idx_ci ON T2 (Col1)
GO

-- Insert some records into both tables
INSERT INTO T1 VALUES (1, 1, 1), (2, 2, 2), (NULL, 3, 3)
INSERT INTO T2 VALUES (2, 2), (NULL, 3)
GO
As you can see from the T-SQL code, I have also created a unique clustered index on both tables and inserted some test records. Now let's intersect the two tables:
SELECT Col1, Col2 FROM T1
INTERSECT
SELECT Col1, Col2 FROM T2
GO
SQL Server returns 2 records: the record with the column value 2 and the record with the column value NULL. This is the first big difference from the INNER JOIN: INTERSECT treats NULL values as equal, so a record with a NULL value that appears in both tables is returned. When you perform an INNER JOIN on column Col1 between the two tables, the record with the NULL value is not returned:
SELECT T1.Col1, T1.Col2 FROM T1
INNER JOIN T2 ON T2.Col1 = T1.Col1
GO
The two result sets show the difference: INTERSECT returns the records (2, 2) and (NULL, 3), while the INNER JOIN returns only the record (2, 2).
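The different NULL handling can be seen in isolation, without the test tables. The following is a small sketch that assumes the default ANSI_NULLS ON setting; the column aliases are made up for the illustration. The first query uses the equality comparison that a join predicate uses, the second uses the set comparison of INTERSECT.

```sql
-- Equality comparison: NULL = NULL evaluates to UNKNOWN (with ANSI_NULLS ON),
-- so the ELSE branch is taken and the result is 'no match'.
SELECT CASE WHEN CAST(NULL AS INT) = CAST(NULL AS INT)
            THEN 'match'
            ELSE 'no match'
       END AS JoinStyleComparison
GO

-- Set comparison: INTERSECT treats the two NULLs as the same value
-- and returns a single row that contains NULL.
SELECT CAST(NULL AS INT) AS Col1
INTERSECT
SELECT CAST(NULL AS INT)
GO
```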
Now let's analyze the execution plan of the INTERSECT set operation. Because there is a supporting index on column Col1, the query optimizer can translate the INTERSECT operation into a traditional Inner Join logical operator.
But the Nested Loop (Inner Join) operator here does not really perform an inner join. Let's see why. When you look at the Nested Loop operator in the plan, you see that there is a residual predicate on the Clustered Index Seek (Clustered) operator beneath it.
The residual predicate is evaluated on Col2 because that column is not part of the navigation structure of the clustered index you just created. As I said at the beginning, SQL Server needs to find rows that match in all columns of the two tables. With the Clustered Index Seek (Clustered) operator and the residual predicate, SQL Server only checks whether a matching record with the same column values exists. The Nested Loop operator itself returns the column values from only one table, in this case T1.
So the Inner Join here is really just a Left Semi Join: SQL Server checks whether a matching record exists in the right table and, if so, returns the matching record from the left table. The residual predicate on the Clustered Index Seek (Clustered) operator can be eliminated by providing all the necessary columns in the navigation structure of an index, as follows:
-- Create a supporting Non-Clustered Index
CREATE NONCLUSTERED INDEX idx_nci ON T1 (Col1, Col2)
GO
When you now look at the execution plan of the INTERSECT operation again, you will see that SQL Server performs an Index Seek (NonClustered) on the index you just created, and the residual predicate is no longer needed.
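The left semi join logic described above can also be written by hand, which may help to see what the INTERSECT does. The following is only a sketch, assuming the tables T1 and T2 from this article. It is not a statement about what the optimizer produces internally, and its execution plan may differ from the INTERSECT plan. Each column comparison has to handle NULL explicitly, and DISTINCT is needed because INTERSECT returns distinct rows.

```sql
-- Assumes T1 and T2 as created earlier in the article.
-- Hand-written semi join that mimics the INTERSECT query, for illustration only.
SELECT DISTINCT T1.Col1, T1.Col2
FROM T1
WHERE EXISTS
(
    SELECT 1
    FROM T2
    -- A match requires every column to be equal, with NULL treated as equal to NULL
    WHERE (T2.Col1 = T1.Col1 OR (T2.Col1 IS NULL AND T1.Col1 IS NULL))
      AND (T2.Col2 = T1.Col2 OR (T2.Col2 IS NULL AND T1.Col2 IS NULL))
)
GO
```

With the test records from this article, this query returns the same two records as the INTERSECT: (2, 2) and (NULL, 3).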
Now let's drop all the supporting index structures and see what the execution plan becomes.
-- Drop all supporting indexes
DROP INDEX idx_nci ON T1
DROP INDEX idx_ci ON T1
DROP INDEX idx_ci ON T2
GO
When you INTERSECT the two tables again, you will see the Nested Loop (Left Semi Join) operator in the execution plan. SQL Server now has to perform the left semi join physically in the execution plan: it runs a Table Scan operator on the inner side and does a row-by-row comparison with the residual predicate in the Nested Loop operator.
This execution plan is not very efficient, because the Table Scan on the inner side has to be repeated for every row that comes back from the outer table. Supporting indexes are therefore very important if you want the INTERSECT set operation to run as efficiently as possible.
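One way to check the repeated scanning yourself is to turn on I/O statistics before running the query. This is a sketch that assumes the tables T1 and T2 from this article. The per-table scan count and logical reads appear in the Messages output, and running it once with and once without the supporting indexes lets you compare both plans. With only a few test records the numbers will be small.

```sql
-- Assumes T1 and T2 as created earlier in the article.
-- Report scan count and logical reads per table for the following statements
SET STATISTICS IO ON
GO

SELECT Col1, Col2 FROM T1
INTERSECT
SELECT Col1, Col2 FROM T2
GO

-- Switch the reporting off again
SET STATISTICS IO OFF
GO
```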
Summary
The INTERSECT set operation is not scary, but few people know it well. When you use it, you have to be aware of the differences between it and an INNER JOIN. As you have seen, a good index design is also important so that the query optimizer can generate a good execution plan.
Thanks for your attention!