Statistics using self-growing key column values

Source: Internet
Author: User

Source: Statistics using self-growing key column values

in today's article I would like to talk about the very common problem in SQL Server: How to handle statistics with self-growing key columns. As we all know, there is a histogram associated with each statistic object in SQL Server. The histogram specifies the distribution of the column data using multiple step descriptions. In a histogram, SQL Server supports up to 200 steps, but it is a problem when you query the data range in the last step of the histogram. Let's look at the following code to reproduce the situation:

1 --Create A simple Orders table2 CREATE TABLEOrders3 (4OrderDate DATE not NULL,5Col2INT  not NULL,6Col3INT  not NULL7 )8 GO9 Ten --Create a non-unique Clustered Index on the table One CREATE CLUSTERED INDEXIdx_ci onOrders (OrderDate) A GO -  - --Insert 31465 rows from the ADVENTUREWORKS2008R2 database the INSERT  intoOrders (OrderDate, Col2, Col3)SELECTOrderDate, CustomerID, TerritoryID fromAdventureWorks2008R2.Sales.SalesOrderHeader - GO -  - --Rebuild the Clustered Index, so, we get fresh statistics. + --The last value of the histogram is 2008-07-31. - ALTER INDEXIdx_ci onOrders REBUILD + GO A  at --Insert Additional rows *after* The last step in the histogram - INSERT  intoOrders (OrderDate, Col2, Col3) - VALUES('20100101',1,1) - GO  $

After the index is rebuilt, we look at the histogram and we find that the last stepping value is 2008-07-31.

1 DBCC Show_statistics ('dbo. Orders'idx_ci' with histogram

As you've seen, we've inserted 200 additional records after the last step into the table. In this case, the histogram has no real feedback on the actual data distribution, but SQL Server is still doing cardinality calculations. Let's look at how SQL Server handles this problem in different versions.

SQL Server 2005 Sp1-sql Server

Before SQL Server 2014, cardinality calculations handled this problem very simply: SQL Server estimates the number of rows is 1, which you can see from the image below.

Click the toolbar display to include the actual execution plan and execute the following query:

SELECT *  from WHERE OrderDate='2010-01-01'

Since SQL Server 2005 SP1, the query optimizer can mark the 1 column as self-growth (ascending) to overcome the limitations just described. If you update the Statistics object 3 times with the self-growing column value, the column is marked as self-growing. To see if there are no columns marked as self-growing, you can use trace flag 2388. When you enable this trace flag, the output of theDBCC show_statistics is changed, with additional columns returned.

DBCC TRACEON (2388)DBCC show_statistics ('dbo. Orders'idx_ci')

now the following code updates the statistics 3 times, inserting rows at the end of our clustered index with the self-growth key column value each time.

1 --= 1st Update the Statistics on the table with a FULLSCAN2 UPDATE STATISTICSOrders withFULLSCAN3 GO4 5 --Insert Additional rows *after* The last step in the histogram6 INSERT  intoOrders (OrderDate, Col2, Col3)7 VALUES('20100201',1,1)8 GO  $9 Ten --= 2nd Update the Statistics on the table with a FULLSCAN One UPDATE STATISTICSOrders withFULLSCAN A GO -  - --Insert Additional rows *after* The last step in the histogram the INSERT  intoOrders (OrderDate, Col2, Col3) - VALUES('20100301',1,1) - GO  $ -  + --= 3rd Update the Statistics on the table with a FULLSCAN - UPDATE STATISTICSOrders withFULLSCAN + GO

Then, when we execute the DBCC SHOW_STATISTICS command, you will see that the SQL Server has said that the column is marked as ascending.

DBCC TRACEON (2388)DBCC show_statistics ('dbo. Orders'idx_ci')

2389 . If you enable this trace flag, the query optimizer is density vector (Density vector) for cardinality calculations.

 --  Now we query the newly inserted range which is currently not present in the histogram.  --  with Trace Flag 2389, the Query Optimizer uses the Density Vector to make the cardinality E Stimation.  select  *  from   Orders  where  OrderDate =   "   " option  (RECOMPILE, Querytraceon )  go  

Now look at the table density:

DBCC Traceoff (2388)DBCC show_statistics ('dbo. Orders'idx_ci')

The table density is now 0.0008873115, so the estimated number of rows for the query optimizer is 28.4516:0.0008873115* (32265-200).

This is not the best result, but it is much better than the estimated number of rows 1!

(Here is the problem, I am a local SQL Server 2008r2, test estimate number of rows or 1, I do not know why, hope to know the explanation of friends, thank you!

)

SQL Server

a new feature introduced in SQL Server 2014 is the new cardinality calculation . The new cardinality calculation is very simple to handle the self-growth key problem: By default, no trace flag is used to calculate cardinality using the density vectors of the statistics object. The following query enables cardinality calculations for 2312 trace flags to run the same query.

 1  --  2  select  *  from   Orders  3  =   " 20100401   '  

let's look at the cardinality calculation here, and you'll see that the query optimizer estimates the number of rows again to be 28.4516, but this time there's no self-growth on the table. This is the self-bringing feature of SQL Server 2014.

(SQL Server 2014 test failed with estimated number of rows is also 1 ...) )

Summary

In this article, I show you how SQL Server's query optimizer handles self-growing key issues. Before SQL Server 2014, you would need to enable 2389 trace flags to get a better cardinality calculation-so that column would be marked as self-growing (ascending). SQL Server 2014, the query optimizer uses density vectors by default for cardinality calculations, which makes it much easier. I hope you've learned something about it, and you'll have a better idea of how to handle self-growing key columns in SQL Server.

Thanks for your attention!

Statistics using self-growing key column values

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.