Statistics for SQL Server

Source: Internet
Author: User

You can see that the statistics are divided into three parts, header information, data field selectivity and histogram.

2.1 Head Information
Column Name Description
Name The name of the statistic information.
Updated Date when the statistics were last updated
Rows The number of rows in the estimated table is not necessarily accurate
Rows Sampled The number of sampled rows for the statistic, if less than rows, indicates that the histogram and density results are more sampled row estimates
Steps Number of steps in the histogram.
Number of steps in the histogram.
Each step spans a column value range, followed by the upper-bound column value. The histogram cascade is defined based on the first key column in the statistics. The maximum number of steps is 200.
Density The calculated formula is a distinct value of all the values in the first key column of the statistics object (excluding the histogram boundary values). The query optimizer does not use this Density value, and the purpose of displaying this value is to be backwards compatible with versions prior to SQL Server 2008.
Average Key length The average number of bytes per value of all key columns in the statistics object.
String Index Yes indicates that the statistics object contains string summary statistics to improve cardinality estimates for query predicates that use the LIKE operator, such as WHERE ProductName '%bike '.
Yes indicates the Statistics object contains string summary statistics to improve the cardinality estimates for query pred Icates that with the LIKE operator; For example, where ProductName like '%bike '.
String Summary statistics are stored separately from the histogram and when it is the type of the statistic object the first key column is created on char, varchar, nchar, nvarchar, varchar (max), nvarchar (max), text, or ntext.
Filter Expression A predicate that contains a subset of the table rows in the Statistics object. NULL = non-filtered statistics. For more information about the filtered predicates, see Create Filtered Indexes. For more information about filtered statistics, see statistics.
Unfiltered Rows The total number of rows in the table before the filter expression is applied. If the filter expression is NULL, unfiltered rows equals rows.

2.2 Data Field selectivity
Column Name Description
Density Density is 1/non-repeating value. The result displays the density of each prefix of each column in the statistics object, with each density displaying a row. A distinct value is a distinct list of column values for each row prefix and column prefix. For example, if the statistics object contains a key column (A, B, c), the result will report the density of the list of non-repeating values for each of the following column prefixes: (a), (A, b), as well as (A, B., C). Using prefixes (A, B, C), each of the following lists is a distinct list of values: (3, 5, 6), (4, 4, 6), (4, 5, 6), and (4, 5, 7). Using prefixes (A, B), the same column values have the following list of distinct values: (3, 5), (4, 4), and (4, 5)
Average Length The average length, in bytes, of the column value list that stores the column prefix. For example, if each value in the list (3, 5, 6) requires 4 bytes, the length is 12 bytes.
Columns The name of the column in the prefix for which all density and Average length are displayed.
2.3 Histogram
Column Name Description
Range_hi_key The upper column value of the histogram cascade. Column values are also known as key values.
Range_rows The estimated number of rows whose column values are within the histogram cascade (excluding the upper bound).
Eq_rows The estimated number of rows whose column values are equal to the upper bounds of the histogram cascade.
Distinct_range_rows The estimated number of rows with distinct column values that are within the histogram cascade (excluding the upper bound).
Avg_range_rows The average number of rows in the histogram cascade (not including the upper limit) of the repeating column value (or range_rows/distinct_range_rows if distinct_range_rows > 0).
A histogram that calculates how often each distinct value in the data appears. Use the column values in the first key column of the statistics object to calculate the histogram, either by sampling the rows or by a full table scan.    If it is a sample creation, then the total number of rows of the storage row of the side here is the estimated value. When you create a histogram, the query optimizer sorts the column values, calculates the number of distinct column value matches, and divides the column's distinct column values into 1-200 consecutive histogram steps, each of which contains a range of column values, all possible column values between the two boundary values, without the boundary value itself, The smallest sort column value is the upper value of the first histogram step. Top (go to top) 3 options for influencing statistics

When each table or indexed view creates statistics, which columns are based on which statistics are created and when statistics are updated, you need to Auto_create_statistics, Auto_update_statistics, Auto_update_ Statistics_async to determine the three database-level options that can be viewed through a system view or by selecting the properties of the database from the graphical interface to see options.

1--View database Statistics option set value 2 SELECT3       name dbname,4       is_auto_create_stats_on,5          is_auto_update_stats_on,6          is_ Auto_update_stats_async_on7 from sys.databases
3.1 auto_create_statisticsThe default is on.    Automatically create statistics options, apply only to tabular statistics!!!    The query optimizer, based on the use of query predicates, creates statistics on a single column on a table (these columns do not create a histogram for the time being), and assists in querying the cardinality estimates for the plan.    This option does not determine whether to create statistics for the index or to produce filtered statistics. The statistics that are created from this option start with the _wa name. Can be viewed through the sys.stats view.
1 SELECT object_name (s.object_id) as object_name,2     col_name (sc.object_id, sc.column_id) as column_name,3     S.name as Statistics_name4 from Sys.stats as S joins Sys.stats_columns as SC5 on     s.stats_id = sc.stats_id and s.object_ id = sc.object_id6 WHERE s.name like ' _wa% ' 7 ORDER by S.name;
3.2 auto_update_statisticsThe default is on. The Automatically update STATISTICS option, the query optimizer automatically determines when statistics will expire when it needs to be updated.    Typically, from the time of the last automatic update, if a large number of data changes are accumulated during the period, including insertions, deletions and modifications, or table structure changes, the statistics will expire. This option is useful for creating statistics objects for indexes, single columns in query predicates, and statistics created using the CREATE STATISTICS statement. 3.3 Auto_update_statistics_async is off by default. Asynchronously updates the statistics options automatically to determine whether the query optimizer is updating with synchronous statistics or asynchronous statistics. Off means that the statistics are automatically updated with synchronization, so that the query plan always compiles with the latest statistics, and if the statistics expire, it waits for the update statistics before the query compiles, and if the statistics are automatically updated asynchronously when the statistics expire,    Compile and execute directly with the existing statistics, even if the compilation is suboptimal due to the statistics expiration, and the execution plan is not optimal, but still runs according to the compiled results. This option is used for statistics that are created for an index, a single column in a query predicate, and a statistic that is created by using the Create STATISTICS statement. Typically, the automatic update of statistics with synchronization is set to off, and in the following two cases, it can be turned on (from the official website):
    • Application the rich and poor do the same query or similar query, compared with the synchronous statistics update, the response time of using asynchronous Statistics Update query can be unaffected, avoid waiting for the most up-to-date statistics information;
    • The application encountered a client request time-out that was caused by one or more queries waiting for updated statistics. In some cases, waiting for synchronization statistics can cause the application to fail over a long time.
Top (go to top) 4 When to create and update4.1 Create
    • The query optimizer automatically creates
      • When you create an index, the query optimizer automatically creates statistics for the index on the table or view
      • When Auto_create_statistics is on, the query optimizer creates statistics for a single column in the query predicate
    • Manually perform the creation
      • Create STATISTICS Creation

In general, the statistics created by the query optimizer can meet most of our needs, but you can consider creating them manually if the following conditions are true:

    • Database Engine Tuning Advisor recommends creating
    • The query predicate contains more than one related column that is not already in the same index
    • The query selects data from a subset of the data
    • Query Missing statistics
4.2 Update STATISTICS defined on a normal table, when any of the following changes occur, the statistics are considered obsolete, and the update action is triggered automatically the next time it is used:
    • -table from no data to greater than or equal to 1 data;
    • -For tables with a data volume of less than 500 rows, when the first field of statistics has a cumulative change of data greater than 500;
    • -For a table with a data volume greater than 500 rows, when the first field data of the statistic is more cumulative than the number of 20%* (the total amount of tabular data).
In these three cases, the third situation is most prone to update not timely situation, such as a 1 million table, its recent one months of data growth is about 150,000, due to less than 20%, statistical information is not updated, which led to the last one months of data SQL execution has not been very correct information provided,    Then you need to check regularly and update the statistics in time! Statistics can be found on temporary tables, and their maintenance policies are basically the same as normal tables, but statistics cannot be established on table variables.
1--Update specified statistics 2 update STATISTICS Sales.SalesOrderDetail ak_salesorderdetail_rowguid; 3 GO 4  5--Update all statistics on the form 6 update STATISTICS sales.salesorderdetail; 7 GO 8  9--Update all statistics on the entire database EXEC Sp_updatesta Ts;11 12--Delete statistics STATISTICS Purchasing.Vendor.VendorCredit, sales.salesorderheader.customertotal;14 GO15 16-- View statistics last update time SELECT19        object_name (object_id) from Sys.stats21 WHERE stats_date (object_id, stats_id) was not Null

Reference: Https://msdn.microsoft.com/zh-cn/library/ms174384.aspx

Statistics for SQL Server

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.