This article introduces the optimization of "not in" and "not exists" in the T-SQL.
Not inAndNot existsCommand:
In some cases, select, update, and delete operations are required to isolate data. Isolated Data: The associated table does not exist in the master table.
To operate such data, the first response is to use the "not in" or "not exists" command. Using not in will seriously affect the performance, because this command will check each record one by one, resulting in resource shortage, especially when updating and deleting big data, the resource may be locked by these operations.
SelectNot inOrNot exists
Currently, SQL Server has two commands that can be used to insert, update, or delete big data. The performance is much better than not in, and the syntax is much easier than not exists, the written statement looks refreshing. Now let's see their debut, merge and mongot.
Example:
Create two tables first
1 use [MyTest]2 create table Test1 ([id] int, [name] varchar(20))3 create table Test2 ([id] int, [name] varchar(20), [address] varchar(100))
1 declare @RowC int2 declare @Name varchar(20)3 set @RowC = 04 while @RowC < 4000005 Begin6 set @Name = 'TestName' + CAST(@RowC as varchar(10))7 insert into Test1(id, name) values(@RowC, @Name)8 set @RowC = @RowC+19 end
1 declare @RowC int 2 declare @Name varchar(20) 3 declare @Address varchar(100) 4 set @RowC = 0 5 while @RowC < 500000 6 Begin 7 set @Name = 'TestName' + CAST(@RowC as varchar(10)) 8 set @Address = 'TestAddress' + CAST(@RowC as varchar(10)) 9 insert into Test2([id], [name], [address]) values(@RowC, @Name, @Address)10 set @RowC = @RowC+111 end
Use the not in commandSelect/update/deleteOperation:
1 SELECT [name] FROM Test2 where [name] not in (select [name] from Test1)2 UPDATE Test2 SET [name] =N'New_Name' where [name] not in (select [name] from Test1)3 DELETE Test2 FROM Test2 where [name] not in (select [name] from Test1)
Better performance, more concise syntaxMerge and merge t:
1 merge Test2 T using (select name from Test2 except select name from Test1 )S on t.name=s.name2 when matched then update SET name=N'New_Name' ;3 merge Test2 T using (select name from Test2 except select name from Test1 )S on t.name=s.name4 when matched then delete ;5 SELECT * FROM Test2 S where not exists (select 1 from Test2 inner join Test1 on Test2.name=Test1.name and Test2.name=s.name)
Note that some of the above still use not exists:
1 SELECT name FROM Test2 S where not exists (select 1 from Test2 inner join Test1 on Test2.name=Test1.name and Test2.name=s.name)
Now we need to use the concise example t:
1 select name from Test1 except select name from Test2
Here we just give an example, without showing the actual comparison data. However, the performance of merge and mongot commands in big data processing is better
Not in is much better, and the code is much more concise than not exists. Believe it or not, I believe it !!!
On the test data Hello:Test1 has400000Data entries in Test2500000Data entries.100000The names of data entries are different.
Select not in: 18 seconds
Select distinct T: almost no time spent
Update Not in: 19 seconds
Update into T, merge:
The delete operation is basically the same as the previous two operations, so we will not delete them here.
Not exists has no better performance than limit T. On my test data, the two operations are almost completed within one second !!!
Thank you @ Xu shaoxia for your comments and comments. I hope you can take a look.