When using indexes to query data, the optimizer determines whether to perform index scanning or index search based on the statistics related to the index. However, the statistical step size cannot exceed 200 (the third part of the result returned by DBCC show_statistics). This makes the accuracy of statistical information increasingly inaccurate in tables with a large amount of data. Of course, this impact will not be fatal, and there will be very few opportunities. The key is that if the statistics are not updated in time, the optimizer will select an incorrect execution plan.
If we find many missing_column_statistics events in the event probe, we can set the asynchronous automatic update statistics option to true. To prevent time-out because the query waits for the completion of statistics, the query will definitely select an incorrect execution plan because of inaccurate statistics.When will the statistics be automatically updated?
In Compatibility ViewSYS. sysindexesInRowmodctrThe column records the number of records that have changed since the last update statistics. Each statistical information pair corresponds to a row.If the number of records is less than 500, automatic update is triggered only when the change exceeds 500. For a larger table, the update occurs after the number of records of the 500 + 20% * Table is modified (addition, deletion, modification, and so on are included ).Based on the ratio of this update, if we only rely on automatic update timing, it will cause the optimizer to select an incorrect execution plan.
The following is an example to demonstrate the query performance loss caused by this delayed automatic update operation. The orderdate of the orders table of the northwind database creates a non-unique non-clustered index.
-- Update the statistics and observe the related statistics and view information.
Update Statistics DBO. Orders With Fullscan
DBCC Show_statistics ( ' DBO. Orders ' , Orderdate)
-- The value of rowmodctr is 0, indicating that no records are modified after statistics are updated.
Select ID, [ Name ] , Dpages, rowcnt, rowmodctr From SYS. sysindexes Where ID = Object_id (N ' DBO. Orders ' )
-- Create a new order and observe the current statistics. The system does not update the statistics.
Insert Into DBO. Orders (orderdate) Values ( ' 2008-09-12 ' )
-- At this time, we can see that there is no '2017-09-12 'record in the statistics.
DBCC Show_statistics ( ' DBO. Orders ' , Orderdate)
-- The rowmodctr field is changed to 1.
Select ID, [ Name ] , Dpages, rowcnt, rowmodctr From SYS. sysindexes Where ID = Object_id (N ' DBO. Orders ' )
-- Currently, there is only one matching record, and index search is efficient. Option (recompile) does not save the cache plan, preventing subsequent queries from reusing the plan.
Select * From DBO. Orders Where Orderdate = ' 2008-09-12 ' Option (Recompile)
-- No matter whether you add or modify the Order Date of 500 orders, only one operation is executed to prevent exceeding the threshold value when auto update is triggered.
-- Update top (500) DBO. Orders set orderdate = '2017-09-12 'Where orderdate <> '2017-09-12'
Declare @ I Int
Set @ I = 500
While ( @ I > 0 )
Begin
Insert Into DBO. Orders (orderdate) Values ( ' 2008-09-12 ' )
Set @ I = @ I - 1
End
-- The statistical information does not include '2017-09-12 ',
DBCC Show_statistics ( ' DBO. Orders ' , Orderdate)
-- The rowmodctr field of orderdate is changed to 501. If you update orderdate, only the corresponding orderdate row is changed to 501.
Select ID, [ Name ] , Dpages, rowcnt, rowmodctr From SYS. sysindexes Where ID = Object_id (N ' DBO. Orders ' )
Exec Sp_spaceused [ DBO. Orders ] , True
Go
-- At this time, there are many qualified records, but the incorrect selection will continue to use index search because the statistics are not updated in a timely manner, resulting in excessive logic I/O
Select * From DBO. Orders Where Orderdate = ' 2008-09-12 ' Option (Recompile)
-- Update statistics
Update Statistics DBO. Orders With Fullscan
-- Select Table scan correctly.
Select * From DBO. Orders Where Orderdate = ' 2008-09-12 '
-- All rowmodctr values are reset to 0.
Select ID, [ Name ] , Rowcnt, rowmodctr From SYS. sysindexes Where ID = Object_id (N ' DBO. Orders ' )
-- The statistics contain the latest order date '2017-09-12'
DBCC Show_statistics ( ' DBO. Orders ' , Orderdate)
As shown in the preceding experiment, manual statistical updates on frequently modified tables can prevent the optimizer from generating incorrect execution plans.