Original: 12th Chapter--sqlserver Statistics (2)--Influence of statistics on non-indexed keys
Objective:
Indexes always play an important role in performance, and in fact, the query optimizer first checks the statistics on the predicate before deciding on what index to use. In general, statistics are created on index columns when the index is created by default. However, statistics that do not represent on non-indexed keys are not available for performance.
If all the columns on the table are indexed, then the database will not be affordable and not a good idea, including all the columns used in the predicate are also not a good way to index. Because the index brings load. Because space is required to store the index, each DML statement needs to update the index.
In general, it is recommended to add an index on a column that appears in the where or ON clause, but it is difficult to create an index on all predicates in some cases, where creating statistics is a minimal improvement. If auto_create_statistics is on, the optimizer will help you with this step.
Preparatory work:
By default,Auto_create_statistics is set to on at the database level, but it is changed to OFFfor the following purposes:
ALTER database AdventureWorks2012 set auto_create_statistics offgoalter database AdventureWorks2012 set Auto_update_ STATISTICS Offgo
Then create a new table for use in this article:
SELECT *into salesorddemofrom Sales.salesorderheadergo
Steps:
1. for the new table, there is no statistics on it, so the following statement can be used to verify:
SELECT object_id, object_name (object_id) as TableName, NAME as Statisticsname, Auto_createdfrom sys.statswhere object_id = object_id (' Salesorddemo ') ORDER by object_id DESC GO
Because there is no statistics, there is no data for this query.
2. now create a clustered index on the new table:
CREATE CLUSTERED INDEX Idx_salesorddemo_salesorderid on Salesorddemo (SalesOrderID) GO
3. run the script for step one again, and you can see that the data is already available, now execute the following statement and open the execution plan:
SELECT S.salesorderid, so . Salesorderdetailidfrom Salesorddemo as S INNER JOIN sales.salesorderdetail as so on s.salesorderid = so. Salesorderidwhere s.duedate = ' 2005-09-19 00:00:00.000 '
4. The following is the execution plan in step 3 , it is reasonable to follow the Salesorddemo table on a clustered index scan, because there is no where clause in the use of SalesOrderID column. Instead, the salesorderdetails table has a nonclustered index scan. You can also see a big difference between the actual number of rows and the estimated number of rows.
5. Now it is time to create statistics on the duedate of the new table, because this column is not included in the index in the query.
CREATE STATISTICS st_saledorddemo_duedate on Salesorddemo (duedate) GO
6. execute the script for step 3 again without any changes:
SELECT S.salesorderid, so . Salesorderdetailidfrom Salesorddemo as S INNER JOIN sales.salesorderdetail as so on s.salesorderid = so. Salesorderidwhere s.duedate = ' 2005-09-19 00:00:00.000 '
7. compared to the execution plan above, the salesorderdetails table has been scanned from a nonclustered index into a clustered index lookup, and the cost is only 2%, more than the actual number of rows and estimated number of rows:
Analysis:
If the optimizer can get the statistics listed on the predicate, the meeting knows the number of rows to be returned and helps the optimizer choose the best way to execute.
12th--sqlserver Statistics (2)--effects of statistics on non-indexed keys