Impact of dependency on automatic statistics on Performance

Source: Internet
Author: User

When using indexes to query data, the optimizer determines whether to perform index scanning or index search based on the statistics related to the index. However, the statistical step size cannot exceed 200 (the third part of the result returned by DBCC show_statistics). This makes the accuracy of statistical information increasingly inaccurate in tables with a large amount of data. Of course, this impact will not be fatal, and there will be very few opportunities. The key is that if the statistics are not updated in time, the optimizer will select an incorrect execution plan.

If we find many missing_column_statistics events in the event probe, we can set the asynchronous automatic update statistics option to true. To prevent time-out because the query waits for the completion of statistics, the query will definitely select an incorrect execution plan because of inaccurate statistics.When will the statistics be automatically updated?

In Compatibility ViewSYS. sysindexesInRowmodctrThe column records the number of records that have changed since the last update statistics. Each statistical information pair corresponds to a row.If the number of records is less than 500, automatic update is triggered only when the change exceeds 500. For a larger table, the update occurs after the number of records of the 500 + 20% * Table is modified (addition, deletion, modification, and so on are included ).Based on the ratio of this update, if we only rely on automatic update timing, it will cause the optimizer to select an incorrect execution plan.

The following is an example to demonstrate the query performance loss caused by this delayed automatic update operation. The orderdate of the orders table of the northwind database creates a non-unique non-clustered index.

-- Update the statistics and observe the related statistics and view information.
Update   Statistics DBO. Orders With Fullscan
DBCC Show_statistics ( ' DBO. Orders ' , Orderdate)
-- The value of rowmodctr is 0, indicating that no records are modified after statistics are updated.
Select ID, [ Name ] , Dpages, rowcnt, rowmodctr From SYS. sysindexes Where ID = Object_id (N ' DBO. Orders ' )
-- Create a new order and observe the current statistics. The system does not update the statistics.
Insert   Into DBO. Orders (orderdate) Values ( ' 2008-09-12 ' )
-- At this time, we can see that there is no '2017-09-12 'record in the statistics.
DBCC Show_statistics ( ' DBO. Orders ' , Orderdate)
-- The rowmodctr field is changed to 1.
Select ID, [ Name ] , Dpages, rowcnt, rowmodctr From SYS. sysindexes Where ID = Object_id (N ' DBO. Orders ' )
-- Currently, there is only one matching record, and index search is efficient. Option (recompile) does not save the cache plan, preventing subsequent queries from reusing the plan.
Select   *   From DBO. Orders Where Orderdate = ' 2008-09-12 '   Option (Recompile)
-- No matter whether you add or modify the Order Date of 500 orders, only one operation is executed to prevent exceeding the threshold value when auto update is triggered.
-- Update top (500) DBO. Orders set orderdate = '2017-09-12 'Where orderdate <> '2017-09-12'
Declare   @ I   Int
Set   @ I = 500
While ( @ I > 0 )
Begin
Insert   Into DBO. Orders (orderdate) Values ( ' 2008-09-12 ' )
Set   @ I = @ I - 1
End
-- The statistical information does not include '2017-09-12 ',
DBCC Show_statistics ( ' DBO. Orders ' , Orderdate)
-- The rowmodctr field of orderdate is changed to 501. If you update orderdate, only the corresponding orderdate row is changed to 501.
Select ID, [ Name ] , Dpages, rowcnt, rowmodctr From SYS. sysindexes Where ID = Object_id (N ' DBO. Orders ' )

Exec Sp_spaceused [ DBO. Orders ] , True
Go
-- At this time, there are many qualified records, but the incorrect selection will continue to use index search because the statistics are not updated in a timely manner, resulting in excessive logic I/O
Select   *   From DBO. Orders Where Orderdate = ' 2008-09-12 '   Option (Recompile)
-- Update statistics
Update   Statistics DBO. Orders With Fullscan
-- Select Table scan correctly.
Select   *   From DBO. Orders Where Orderdate = ' 2008-09-12 '
-- All rowmodctr values are reset to 0.
Select ID, [ Name ] , Rowcnt, rowmodctr From SYS. sysindexes Where ID = Object_id (N ' DBO. Orders ' )
-- The statistics contain the latest order date '2017-09-12'
DBCC Show_statistics ( ' DBO. Orders ' , Orderdate)

As shown in the preceding experiment, manual statistical updates on frequently modified tables can prevent the optimizer from generating incorrect execution plans.

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.