Preface:
Indexes always play an important role in performance. In fact, the query optimizer first checks the statistics on the predicate columns and then decides which indexes to use. By default, statistics are created on the index key columns whenever an index is created. That does not mean, however, that statistics on non-indexed columns are useless for performance.
Indexing every column in a table is more than a database can afford, and it is not a good idea. Indexing every column used in a predicate is not a good approach either, because indexes carry overhead: they need storage space, and every DML statement has to update them.
In general, it is recommended to index the columns that appear in WHERE or ON clauses. When it is impractical to index every predicate column, creating statistics on those columns is the minimum improvement you can make. If Auto_Create_Statistics is ON, the optimizer takes care of this step for you.
Preparations:
By default, Auto_Create_Statistics is ON at the database level. For this demonstration it has to be turned OFF, together with Auto_Update_Statistics:
    ALTER DATABASE AdventureWorks2012 SET AUTO_CREATE_STATISTICS OFF
    GO
    ALTER DATABASE AdventureWorks2012 SET AUTO_UPDATE_STATISTICS OFF
    GO
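Before going further, it may be worth confirming that both options are really off. The following is a minimal sketch that reads the two flags from the sys.databases catalog view; it assumes the database is named AdventureWorks2012, as in the statements above.

```sql
-- Show the current statistics options for the demo database.
-- A value of 0 means the option is OFF, 1 means it is ON.
SELECT name,
       is_auto_create_stats_on,
       is_auto_update_stats_on
FROM sys.databases
WHERE name = N'AdventureWorks2012';
```

Both flag columns should show 0 after the ALTER DATABASE statements have run.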
Create a new table for use in this article:
    SELECT *
    INTO SalesOrdDemo
    FROM Sales.SalesOrderHeader
    GO
Steps:
1. A new table has no statistics. For the table created above, the following statement can be used to verify this:
    SELECT object_id,
           OBJECT_NAME(object_id) AS TableName,
           name AS StatisticsName,
           auto_created
    FROM sys.stats
    WHERE object_id = OBJECT_ID('SalesOrdDemo')
    ORDER BY object_id DESC
    GO
Because there are no statistics yet, this query returns no rows.
2. Create a clustered index on the new table:
    CREATE CLUSTERED INDEX idx_SalesOrdDemo_SalesOrderID ON SalesOrdDemo(SalesOrderID)
    GO
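Once the index exists, it can be useful to see not only that a statistics object was created but also which column it covers. The query below is one possible way to do that, joining sys.stats to sys.stats_columns and sys.columns; the column aliases are illustrative and not part of the original script.

```sql
-- List every statistics object on SalesOrdDemo together with its columns.
-- auto_created = 1: created by the optimizer; user_created = 1: created with CREATE STATISTICS.
-- Statistics that belong to an index show 0 in both flags.
SELECT s.name AS StatisticsName,
       c.name AS ColumnName,
       s.auto_created,
       s.user_created
FROM sys.stats AS s
    INNER JOIN sys.stats_columns AS sc
        ON sc.object_id = s.object_id
       AND sc.stats_id = s.stats_id
    INNER JOIN sys.columns AS c
        ON c.object_id = sc.object_id
       AND c.column_id = sc.column_id
WHERE s.object_id = OBJECT_ID('SalesOrdDemo')
ORDER BY s.stats_id, sc.stats_column_id;
```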
3. Run the script from step 1 again. This time it returns a row, because creating the index also created statistics on its key column. Then run the following statement with the actual execution plan enabled:
    SELECT s.salesorderid,
           so.SalesOrderDetailID
    FROM salesordDemo AS s
        INNER JOIN Sales.SalesOrderDetail AS so
            ON s.salesorderid = so.SalesOrderID
    WHERE s.duedate = '2005-09-19 00:00:00.000'
4. Look at the execution plan from step 3. The SalesOrdDemo table gets a clustered index scan, which is reasonable because the WHERE clause does not filter on the SalesOrderID column, while the SalesOrderDetail table gets a nonclustered index scan. The actual number of rows differs significantly from the estimated number of rows.
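If the graphical plan is not at hand, the same comparison of actual and estimated row counts can be made in text form. The following is a sketch using SET STATISTICS PROFILE, which returns one row per plan operator after the query result; it reruns the query from step 3, with the identifier casing tidied up.

```sql
-- Return the executed plan as an extra result set.
-- Compare the Rows (actual) and EstimateRows (estimated) columns per operator.
SET STATISTICS PROFILE ON;
GO

SELECT s.SalesOrderID,
       so.SalesOrderDetailID
FROM SalesOrdDemo AS s
    INNER JOIN Sales.SalesOrderDetail AS so
        ON s.SalesOrderID = so.SalesOrderID
WHERE s.DueDate = '2005-09-19 00:00:00.000';
GO

SET STATISTICS PROFILE OFF;
GO
```

Running the same batch again after step 5 gives a second set of numbers to compare against.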
5. Now create statistics on the DueDate column of the new table, because this column is used in the query predicate but is not covered by any index:
    CREATE STATISTICS st_SaledOrdDemo_DueDate ON SalesOrdDemo(DueDate)
    GO
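To get a feel for what the optimizer can now read about DueDate, the histogram of the new statistics object can be inspected. This is only an illustrative sketch; it uses the statistics name exactly as it was created above.

```sql
-- Show the histogram of the DueDate statistics.
-- RANGE_HI_KEY is the upper boundary value of each step and
-- EQ_ROWS is the estimated number of rows equal to that boundary value.
DBCC SHOW_STATISTICS ('SalesOrdDemo', st_SaledOrdDemo_DueDate) WITH HISTOGRAM;
GO
```

The step covering '2005-09-19' is where the optimizer gets its row estimate for the predicate in the demo query.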
6. Execute the script in step 3 again without any changes:
    SELECT s.salesorderid,
           so.SalesOrderDetailID
    FROM salesordDemo AS s
        INNER JOIN Sales.SalesOrderDetail AS so
            ON s.salesorderid = so.SalesOrderID
    WHERE s.duedate = '2005-09-19 00:00:00.000'
7. Compare this execution plan with the previous one. The index scan on the SalesOrderDetail table has become a clustered index seek, its cost is only 2%, and, more importantly, the actual number of rows is now almost the same as the estimated number of rows.
Analysis:
If the optimizer has statistics available, it knows how many rows each operator is likely to return, which helps it choose the best execution plan.
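Since the preparations switched off two database-level options, a cleanup step is probably wanted once the experiment is finished. The sketch below assumes both options were ON beforehand (the default) and that the demo table is no longer needed.

```sql
-- Remove the demo table; its index and statistics are dropped with it.
DROP TABLE SalesOrdDemo;
GO

-- Restore the default statistics behaviour (assumes both options were ON before the demo).
ALTER DATABASE AdventureWorks2012 SET AUTO_CREATE_STATISTICS ON;
GO
ALTER DATABASE AdventureWorks2012 SET AUTO_UPDATE_STATISTICS ON;
GO
```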