Low efficiency
Index Scan: The full-scan index (including root, intermediate, and leaf pages):
Efficient
Index seek: Search forward and backward by index:
Explanation Explanation Index seek and index scan:
The index is a B-tree,
Index seek is to find the target row from the root node of the B-tree, first-level.
The index scan goes from left to right, traversing the entire B-tree.
Assume that the only target row is on the right-most leaf node of the Index tree (assuming a nonclustered index , tree depth 2, leaf node occupies the K-page physical storage).
The index seek causes an IO of 4, and the index scan causes the IO to be k, which has a huge difference in performance.
seek: The process from root B to leaf node
Scan: When SEEK is complete, the range or full scan is performed on the leaf node (depending on the selectivity of the query
For an index , you can read the online documentation carefully about the physical database Architecture section
Do not include operations in query conditions
These operations include string connections (such as: SELECT * from Users where UserName + ' pig ' = ' Zhang San Pig '), wildcards in the preceding like operations (such as: SELECT * from tb1 where col4 '? ') ), using other user-defined functions, system built-in functions, scalar functions, and so on (such as: SELECT * from Userlog where DATEPART (dd, logtime) = 3).
SQL Server has no way of estimating overhead when dealing with the above statement. The end result is of course clustered index scan or tablescan .
Do not include operations between different columns in the same table in the query condition
The so-called "operations" include subtraction or through some function (for example: SELECT * from TB where col1–col2 = 1997), also including comparison operations (such as: SELECT * from TB where col1 > Col2). In this case, SQL Server has no way of estimating the overhead. Whether there is an index on col1, col2, or a col1, an overlay index on col2, or an index that col1 include col2 is created.
However, this query has a workaround, you can create more than one calculated field on the table, its value is set to your "operation" result, and then create an index on the field, OK.
(Result set/total number of rows) is called selectivity, the greater the ratio, the higher the selectivity.
You get it, the focus of this article is selectivity.
Statistics, plainly, is the number of row result sets in a table when a field takes a value. Statistical information can be said to be a selective measure, SQL Server is based on it to estimate the merits of different query plans.
If the total number of rows in the table is 1w, the number of samples is 1w. A provider_no value of 21 is only 1 rows, whereas a row with a value of 500 has 4824 rows.
We know that SQL Server caches the query plan, if there is one stored procedure:
Create proc MyProc
(
@pno int
)
As
SELECT * from charge where provider_no = @pno
The first time we pass in a 21,ok, it caches the execution plan for the stored procedure for nonclustered index seek that. And then we passed in a 500, finished, the server found it has a myproc cache, so, and through nonclustered index seek execution, then your partner see your query spent a huge amount of Io, so, you are despised.
What does that mean? If your query is volatile, you should tell SQL Server not to cache the query plan, which should be re-evaluated and compiled each time. The implementation method is simple, and the tail of the query is added an option (RECOMPILE). And SQL2K5 also has a NB feature that can recompile only part of the stored procedure at a time (you can, of course, choose to recompile the entire stored procedure, depending on your needs.) See the online documentation. )
Original link: http://blog.csdn.net/pumaadamsjack/article/details/6597357
Index seek and Index scan