A while back I ran into a stored procedure where one of the parameters is a string that is actually used as a query condition. Inside the procedure the string is split into key values and loaded into a temporary table, which then serves as the filter. The logic can be implemented in two ways.
INSERT INTO #t
SELECT [Key] FROM Split_function('passed-in string', ',')
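As an aside, on SQL Server 2016 and later the built-in STRING_SPLIT function can replace a hand-rolled split function. A minimal sketch (the #t layout, column names, and sample input here are assumptions for illustration, not the original procedure's definitions):

```sql
-- Hypothetical sample input; the real value is the procedure's string parameter.
DECLARE @keys varchar(max) = 'A01,B02,C03';

-- Assumes #t has a single varchar key column; a PRIMARY KEY also gives
-- the optimizer uniqueness/density information about the temp table.
CREATE TABLE #t ([Key] varchar(50) PRIMARY KEY);

INSERT INTO #t ([Key])
SELECT value FROM STRING_SPLIT(@keys, ',');  -- STRING_SPLIT returns a column named 'value'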
The first is to do an INNER JOIN with the physical tables, similar to the following:
SELECT * FROM TableA a
INNER JOIN TableB b ON a.id = b.id
INNER JOIN #t c ON b.[key] = c.[Key]
WHERE a.Column2 BETWEEN @parameter1 AND @parameter2
The second is to put the filter in the WHERE clause:
SELECT * FROM TableA a
INNER JOIN TableB b ON a.id = b.id
WHERE a.Column2 BETWEEN @parameter1 AND @parameter2
  AND b.[key] IN (SELECT [Key] FROM #t)
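For completeness, a third, logically equivalent formulation uses EXISTS. SQL Server usually treats IN and EXISTS against #t the same way (as a semi-join), but it is a cheap variant to try when tuning. This is a sketch reusing the table and column names from the example above:

```sql
SELECT *
FROM TableA a
INNER JOIN TableB b ON a.id = b.id
WHERE a.Column2 BETWEEN @parameter1 AND @parameter2
  AND EXISTS (SELECT 1 FROM #t c WHERE c.[Key] = b.[key]);
```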
The actual stored procedure is far more complex: a join across more than ten tables with several kinds of filtering logic, including a few large tables of close to ten million rows. But the core of it is the handling of this query condition, just as in the example above. The original logic was implemented the first way, and I focused on indexes, statistics, and so on without finding the cause of the poor performance. Here I only cite a simple example; the real data and logic are hard to simulate.
This question puzzled me for a long time. At first I did not think that the difference between filtering through an INNER JOIN against the intermediate result set and placing the condition directly in the WHERE clause could be the reason.
After carefully observing the execution plan, I found that with the first way, the plan looked like this: when filtering the physical tables at the start, it did not use the values in #t at all; it used only the Column2 condition on TableA to produce a result set, used that larger result set to drive the other tables, and only then joined with #t. In other words, the intermediate result set was very large, and the filtering join against #t happened last, so performance was poor.
Before realizing that the INNER JOIN with #t was the cause, I tried all kinds of optimization on indexes, statistics, and so on, with hardly any improvement.
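When index and statistics tuning does not change the join order, one blunt way to test whether the late #t join really is the problem is to dictate the order yourself. A sketch, assuming that joining #t first is actually the order we want:

```sql
-- FORCE ORDER makes the optimizer keep the join order as written,
-- so the #t filter is applied before the large tables are joined.
SELECT *
FROM #t c
INNER JOIN TableB b ON b.[key] = c.[Key]
INNER JOIN TableA a ON a.id = b.id
WHERE a.Column2 BETWEEN @parameter1 AND @parameter2
OPTION (FORCE ORDER);
```

Hints like this are best kept for diagnosis: if the forced plan is fast, the problem is confirmed to be join order rather than indexing.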
Later I found that with the second way, the physical table is handled like this: Column2 and #t are used together to filter TableA, producing a small intermediate result set first, which is then used to drive the other tables.
The graphical execution plan that was generated would probably not even fit on two screens; only after stepping through it for a long time did I spot this difference.
Although the final results are the same, the difference in query efficiency is huge: filtering TableA at the start yields a relatively small result set, so driving or joining the other tables afterwards is comparatively cheap.
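Rather than comparing two screens of graphical plan, SET STATISTICS IO / TIME reports per-table logical reads and CPU time for each version, which makes the difference easy to quantify. A sketch using the example queries from above:

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run each version in turn and compare the 'logical reads' per table in the
-- Messages output. If #t is applied late, the large tables show far more reads.
SELECT * FROM TableA a
INNER JOIN TableB b ON a.id = b.id
INNER JOIN #t c ON b.[key] = c.[Key]
WHERE a.Column2 BETWEEN @parameter1 AND @parameter2;

SELECT * FROM TableA a
INNER JOIN TableB b ON a.id = b.id
WHERE a.Column2 BETWEEN @parameter1 AND @parameter2
  AND b.[key] IN (SELECT [Key] FROM #t);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```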
This question puzzled me for many days. I originally wanted to write a demo to verify it, but the actual scenario is too complex; it is hard to simulate that kind of data and the logical relationships between the tables. Still, it is clear that for the two formulations above, a simple demo might show little difference, while in the real execution plans the difference is very obvious.
This round of SQL optimization yielded a number of lessons: especially under complex conditions, the handling of query conditions must be treated with caution.
The difference between using an INNER JOIN as a filter condition and a WHERE clause as a filter condition in SQL query optimization