Performance tests on the impact of SQL Server statistics on queries

Source: Internet
Author: User
Tags sql server query

Introduction
The SQL Server query analyzer is overhead-based. Generally, the query Analyzer determines how to select an efficient query route based on the predicate, such as the index to be selected. When the analyzer looks for a path, it does not count the number of rows and ranges of values contained in the index every time, instead, the information is created and updated based on certain conditions and saved to the database. This is the so-called statistical information.

 

How to View statistics
It is very simple to view SQL Server statistics. Run the following command:

DBCC SHOW_STATISTICS ('Table name', 'index name ')

The result is shown in Figure 1.

Figure 1. Statistics

 

How statistics affect queries
The following is a simple example to show how statistics affect the query analyzer. I have created a test table with two INT-value columns, where id is auto-incrementing, ref is used to create non-clustered indexes, and 100 pieces of data are inserted, from 1 to 100, insert 9900 data records equal to 100. The statistical information in Figure 1 is the statistical information of the sample data.

At this time, I use the ref value after where as the query condition, but given different values, we can see that the query analyzer made different choices based on the statistical information, as shown in figure 2.

   

Figure 2. The Query Optimizer makes different choices based on different predicates

 

In fact, for the query analyzer, the bar chart is very useful for directly identifiable predicates, such:

Where date = getdate ()
Where id = 12345.
Where monthly_sales: <10000/12
Where name like "Careyson" + "%"

 

However,

Where price = @ vari
Where total_sales> (select sum (qty) from sales)
Where a. id = B. ref_id

Where col1 = 1 and col2 = 2

 

This type of query can only know the value at runtime, and the sampling step is obviously not so easy to use. In addition, if the predicate in the fourth row is two query conditions, the sampling step is not easy to use. No matter how many columns the index has, the sampling step only stores the first column of the index. When the bar chart is no longer used, SQL Server uses density to determine the optimal query route.

The formula for density is: 1/the number of unique values in the table. The smaller the density, the easier the index to be selected. For example, in the second table in figure 1, we can use the following formula to calculate the density:

   

Figure 3. Density of a column

 

According to the formula, when the data volume in the table increases gradually, the density will decrease.

For queries that cannot be selected based on the sampling step, the query analyzer uses density to estimate the number of rows. The formula is: Estimated number of rows = number of rows in the table * density.

Then, based on this formula, if I make a query, the estimated number of rows will be the number shown in figure 4.

   

Figure 4. Estimated number of rows

 

Let's verify this conclusion, as shown in figure 5.

   

Figure 5. Estimated number of rows

 

Therefore, we can see that the estimated number of rows is different from the actual number of rows. When the data distribution is even or the data volume is large, this error will change to very small Statistics updates.
As shown in the preceding example, because the query analyzer depends on statistics for queries, outdated statistics may lead to inefficient queries. Statistics can be managed by SQL Server, updated manually, or updated manually when updated.

When automatic update is enabled, data changes in the SQL Server Monitoring table are automatically updated when the critical value is reached. This standard is:

When inserting data into an empty table
The number of tables with less than 500 rows increases by 500 rows or more.
When the number of rows in a table exceeds 500, the amount of data changes is greater than 20%.
If the preceding conditions are met, the statistics are updated.

Of course, you can also use the following statement to manually update statistics.

     

Update statistics table name [index name]

 

Column-level statistics
SQL Server can also create statistics for columns that do not belong to any index to help the query analyzer obtain the "estimated number of rows". when you enable the database-level option "automatically create Statistics", as shown in figure 6.

SQL Server can also create statistics for columns that do not belong to any index to help the query analyzer obtain the "estimated number of rows". when you enable the database-level option "automatically create Statistics", as shown in figure 6.

   

Figure 6. Automatically create statistics

 

When this option is set to True, when the where predicate specifies a column not on any index, the column statistics will be created, but there are two exceptions:

  • The cost of creating statistics exceeds the cost of generating a query plan
  • Statistics are not automatically generated when SQL Server is busy.

 

You can view these statistics in the system view sys. stats, as shown in Figure 7.

   

Figure 7. View statistics in the system View

 

You can also use the following statement to manually create statistics:

Create statistics Statistical name ON table name (column name [,... n])

 

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.