Statistics on the uniqueidentifier column

Source: Internet
Author: User

The statistics on the uniqueidentifier column are interesting, and there are some nasty behaviors on it. Let's see.

problem Recurrence (the repro)

To show you the behavior we just complained about, I created a database with the following simple table definitions, and I forced the primary key constraint on the uniqueidentifier column. This means that SQL Server generates a unique clustered index in the background, and the clustered index itself has a statistics object that describes the data distribution for that column. Of course, the data distribution is linear, because each value in the uniqueidentifier column is unique to itself.

1 --Create A new table with a uniqueidentifier column as primary key.2 --SQL Server would enforce the primary KEY constraint through unique clustered index in the background.3 CREATE TABLECustomerstableguid4 (5Iduniqueidentifier  not NULL PRIMARY KEY,6FirstNameVARCHAR( -),7LastNameVARCHAR( -)8 )9 GO

Next i insert 1 million records into the table.

1 --Insert 1 million records2 DECLARE @i INT = 03  while(@i <= 1000000)4 BEGIN5     INSERT  intoCustomerstableguid (ID, FirstName, LastName)6     VALUES7     (8         NEWID(),9         'FirstName' + CAST(@i  as VARCHAR),Ten         'LastName' + CAST(@i  as VARCHAR) One     ) A      -     SET @i +=1 - END the GO

Now we use FULLSCAN to update our statistics on the table. FULLSCAN means that SQL Server scans the entire table to update the statistics object within the data.

1 -- Let's update the Statistics with a FULLSCAN. 2 UPDATE STATISTICS  with FULLSCAN 3 GO

But when you look at the stats object now, you'll see that in the histogram, SQL Server only generates 4 steps.

1 ' dbo. Customerstableguid'23DBCC show_statistics ('dbo. Customerstableguid'pk__customer__3214ec271273c1cd')

In the header information you can see that during the statistics update, 1 million lines are sampled, but the histogram shows only 4 steps! But when you now update the statistics object with a smaller sampling interval, things will change:

1 -- Let's update the Statistics with a smaller sampling interval. 2 UPDATE STATISTICS  with  - PERCENT 3 GO

When you look at the histogram now, you will see that we have a lot of different steps:

Summary

Remember this when you use the uniqueidentifier column in your database design. As long as these values are unique, you will have a performance problem, because for the histogram, you have a huge range,avg_range_row can only make 1 rows of the correct estimate.

Thanks for your attention!

Statistics on the uniqueidentifier column

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.