Statistics using self-growing key column values

Last Update:2015-07-29 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Source: Statistics using self-growing key column values

in today's article I would like to talk about the very common problem in SQL Server: How to handle statistics with self-growing key columns. As we all know, there is a histogram associated with each statistic object in SQL Server. The histogram specifies the distribution of the column data using multiple step descriptions. In a histogram, SQL Server supports up to 200 steps, but it is a problem when you query the data range in the last step of the histogram. Let's look at the following code to reproduce the situation:

1 --Create A simple Orders table2 CREATE TABLEOrders3 (4OrderDate DATE not NULL,5Col2INT  not NULL,6Col3INT  not NULL7 )8 GO9 Ten --Create a non-unique Clustered Index on the table One CREATE CLUSTERED INDEXIdx_ci onOrders (OrderDate) A GO -  - --Insert 31465 rows from the ADVENTUREWORKS2008R2 database the INSERT  intoOrders (OrderDate, Col2, Col3)SELECTOrderDate, CustomerID, TerritoryID fromAdventureWorks2008R2.Sales.SalesOrderHeader - GO -  - --Rebuild the Clustered Index, so, we get fresh statistics. + --The last value of the histogram is 2008-07-31. - ALTER INDEXIdx_ci onOrders REBUILD + GO A  at --Insert Additional rows *after* The last step in the histogram - INSERT  intoOrders (OrderDate, Col2, Col3) - VALUES('20100101',1,1) - GO  $

After the index is rebuilt, we look at the histogram and we find that the last stepping value is 2008-07-31.

1 DBCC Show_statistics ('dbo. Orders'idx_ci' with histogram

As you've seen, we've inserted 200 additional records after the last step into the table. In this case, the histogram has no real feedback on the actual data distribution, but SQL Server is still doing cardinality calculations. Let's look at how SQL Server handles this problem in different versions.

SQL Server 2005 Sp1-sql Server

Before SQL Server 2014, cardinality calculations handled this problem very simply: SQL Server estimates the number of rows is 1, which you can see from the image below.

Click the toolbar display to include the actual execution plan and execute the following query:

SELECT *  from WHERE OrderDate='2010-01-01'

Since SQL Server 2005 SP1, the query optimizer can mark the 1 column as self-growth (ascending) to overcome the limitations just described. If you update the Statistics object 3 times with the self-growing column value, the column is marked as self-growing. To see if there are no columns marked as self-growing, you can use trace flag 2388. When you enable this trace flag, the output of theDBCC show_statistics is changed, with additional columns returned.

DBCC TRACEON (2388)DBCC show_statistics ('dbo. Orders'idx_ci')

now the following code updates the statistics 3 times, inserting rows at the end of our clustered index with the self-growth key column value each time.

1 --= 1st Update the Statistics on the table with a FULLSCAN2 UPDATE STATISTICSOrders withFULLSCAN3 GO4 5 --Insert Additional rows *after* The last step in the histogram6 INSERT  intoOrders (OrderDate, Col2, Col3)7 VALUES('20100201',1,1)8 GO  $9 Ten --= 2nd Update the Statistics on the table with a FULLSCAN One UPDATE STATISTICSOrders withFULLSCAN A GO -  - --Insert Additional rows *after* The last step in the histogram the INSERT  intoOrders (OrderDate, Col2, Col3) - VALUES('20100301',1,1) - GO  $ -  + --= 3rd Update the Statistics on the table with a FULLSCAN - UPDATE STATISTICSOrders withFULLSCAN + GO

Then, when we execute the DBCC SHOW_STATISTICS command, you will see that the SQL Server has said that the column is marked as ascending.

DBCC TRACEON (2388)DBCC show_statistics ('dbo. Orders'idx_ci')

2389 . If you enable this trace flag, the query optimizer is density vector (Density vector) for cardinality calculations.

 --  Now we query the newly inserted range which is currently not present in the histogram.  --  with Trace Flag 2389, the Query Optimizer uses the Density Vector to make the cardinality E Stimation.  select  *  from   Orders  where  OrderDate =   "   " option  (RECOMPILE, Querytraceon )  go

Now look at the table density:

DBCC Traceoff (2388)DBCC show_statistics ('dbo. Orders'idx_ci')

The table density is now 0.0008873115, so the estimated number of rows for the query optimizer is 28.4516:0.0008873115* (32265-200).

This is not the best result, but it is much better than the estimated number of rows 1!

(Here is the problem, I am a local SQL Server 2008r2, test estimate number of rows or 1, I do not know why, hope to know the explanation of friends, thank you!

)

SQL Server

a new feature introduced in SQL Server 2014 is the new cardinality calculation . The new cardinality calculation is very simple to handle the self-growth key problem: By default, no trace flag is used to calculate cardinality using the density vectors of the statistics object. The following query enables cardinality calculations for 2312 trace flags to run the same query.

 1  --  2  select  *  from   Orders  3  =   " 20100401   '

let's look at the cardinality calculation here, and you'll see that the query optimizer estimates the number of rows again to be 28.4516, but this time there's no self-growth on the table. This is the self-bringing feature of SQL Server 2014.

(SQL Server 2014 test failed with estimated number of rows is also 1 ...) ）

Summary

In this article, I show you how SQL Server's query optimizer handles self-growing key issues. Before SQL Server 2014, you would need to enable 2389 trace flags to get a better cardinality calculation-so that column would be marked as self-growing (ascending). SQL Server 2014, the query optimizer uses density vectors by default for cardinality calculations, which makes it much easier. I hope you've learned something about it, and you'll have a better idea of how to handle self-growing key columns in SQL Server.

Thanks for your attention!

Statistics using self-growing key column values

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Statistics using self-growing key column values

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Statistics using self-growing key column values

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support