SQL Server collects and analyzes statistics on statement running.

Source: Internet
Author: User

In addition to the execution plan, there are other factors to consider when running a statement, such as the statement Compilation Time, execution time, and number of disk reads.

If the DBA can test the problem statement separately, you can enable the following three switches before running to collect statistics about the statement running.
This information is of great value for analysis.Copy codeThe Code is as follows: SET STATISTICS TIME ON
SET STATISTICS IO ON
SET STATISTICS PROFILE ON

SET STATISTICS TIME ON
--------------------------------------------------------------------------------
First, let's see what information will be returned by set statistics time on. Run the following statement first:Copy codeThe Code is as follows: DBCC DROPCLEANBUFFERS
-- Clear All cached data in the buffer pool
DBCC freeproccache
GO
-- Clear all cache execution plans in the buffer pool
SET STATISTICS TIME ON
GO
USE [AdventureWorks]
GO
Select distinct ([ProductID]), [UnitPrice] FROM [dbo]. [SalesOrderDetail_test]
WHERE [ProductID] = 777
GO
SET STATISTICS TIME OFF
GO

In addition to the result set, SQLSERVER returns the following information:Copy codeThe Code is as follows: SQL Server Analysis and Compilation Time:
CPU time = 15 ms, occupied time = 104 Ms.
SQL Server Analysis and Compilation Time:
CPU time = 0 ms, occupied time = 0 ms.
(Four rows affected)
SQL Server execution time:
CPU time = 171 milliseconds, occupied time = 1903 milliseconds.
SQL Server Analysis and Compilation Time:
CPU time = 0 ms, occupied time = 0 ms.

You know that SQL Server statements are executed in the following stages: Analysis-compile-Execute
Analyze the appropriate execution plan based on the statistical information of the table, compile the statement, and finally execute the statement.

What does the above output mean?:
--------------------------------------------------------------------------------
1. CPU time: the value indicates the amount of pure CPU time spent by SQLSERVER in this step. That is to say, the number of CPU resources consumed by the statement
2. Time used: This value indicates the total time used in this step. That is to say, this is the length of time for the statement to run. Some actions may have I/O operations, resulting in I/O waits, or blocking waits. In short, the time is used up, but no CPU resources are used. Therefore, it is normal to take longer than the CPU time, but the CPU time is the total time of the statement on all CPUs. If the statement uses multiple CPUs, and there is almost no other waiting time, it is normal that the CPU time exceeds the Occupied time.
3. Analysis and Compilation Time: This step is the statement Compilation Time. Since all execution plans are cleared before the statement is run, SQLSERVER must compile it.
The compilation time here is not 0. Since compilation mainly involves CPU operations, the CPU time is generally the same as the time used. If the difference is large, it is necessary to check whether SQL Server has bottlenecks in system resources.
Here they are a 15 ms, one is 104 Ms
4. SQLSERVER execution time: the actual running time of the statement. Since the statement was run for the first time, SQLSERVER needs to read data from the disk to the memory, where the statement runs for a long I/O wait. Therefore, the difference between the CPU time and the Occupied time is very large, one is 171 milliseconds, and the other is 1903 milliseconds.

In general, this statement took 104 + 1903 + 186 = 2193 milliseconds, of which the CPU time was 15 + 171 = 186 milliseconds. The main time of the statement should be spent on I/O wait.

Statement again, but no cache is cleared Copy codeThe Code is as follows: SET STATISTICS TIME ON
GO
Select distinct ([ProductID]), [UnitPrice] FROM [dbo]. [SalesOrderDetail_test]
WHERE [ProductID] = 777
GO
SET STATISTICS TIME OFF
GO

This is much faster than the previous one. The output time statistics are:Copy codeThe Code is as follows: SQL Server Analysis and Compilation Time:
CPU time = 0 ms, occupied time = 0 ms.
SQL Server Analysis and Compilation Time:
CPU time = 0 ms, occupied time = 0 ms.
(Four rows affected)
SQL Server execution time:
CPU time = 156 milliseconds, occupied time = 169 milliseconds.
SQL Server Analysis and Compilation Time:
CPU time = 0 ms, occupied time = 0 ms.

Because the execution plan is reused, the "SQL analysis and compilation time" CPU time is 0, and the Occupied time is 0.
Because the data is already cached in the memory and does not need to be read from the disk, the SQL Execution time is 156 CPU time, which is very close to the CPU time, which is 169.

The running time saved here is 1903-169 = 1734 milliseconds. From here, we can see that the cache plays a crucial role in statement execution performance.
To avoid affecting other tests, run the following statement to disable set statistics time on.Copy codeThe Code is as follows: SET STATISTICS TIME OFF
GO

SET STATISTICS IO ON
--------------------------------------------------------------------------------
This switch can output the number of physical reads and logical reads performed by the statement. It plays an important role in analyzing the complexity of statements.
Take the query as an example.Copy codeThe Code is as follows: DBCC DROPCLEANBUFFERS
GO
SET STATISTICS IO ON
GO
Select distinct ([ProductID]), [UnitPrice] FROM [dbo]. [SalesOrderDetail_test]
WHERE [ProductID] = 777
GO

The returned result is:
(Four rows affected)
Table 'salesorderdetail _ test '. 5 scans, 15064 logical reads, 0 physical reads, 15064 pre-reads, 0 lob logical reads, 0 lob physical reads, and 0 lob pre-reads.
The meaning of each output is:
--------------------------------------------------------------------------------
Table: the name of the table. The table here is SalesOrderDetail_test.

Scan count: the number of scans performed. According to the execution plan, the table was scanned several times. Generally, the more times a large table is scanned, the worse it is. The only exception is that if the execution plan is run concurrently, multiple threads read a table at the same time. Each thread reads a part of the table, but the number of all threads is displayed here. That is, when several threads are concurrently executed, several scans will be performed. In this case, the number is too large.

Logical read: the number of pages read from the data cache. The more pages, the larger the data size to be accessed, the larger the memory consumption, and the more expensive the query.

Check whether the index should be adjusted to reduce the number of scans and narrow the scan scope.
Physical read: Number of pages read from the disk
Pre-read: Number of pages that are pre-read into the cache for query
Physical read + pre-read: the number of pages that SQLSERVER reads from the disk to complete this query. If the value is not 0, the data is not cached in the memory. The running speed will be affected.
LOB logic reading: Number of pages of the text, ntext, image, Volume type (varchar (max), nvarchar (max), and varbinary (max) types read from the data cache
LOB physical read: Number of pages of the text, ntext, image, and volume types read from the disk
LOB pre-read: Number of cached pages of the text, ntext, image, and expires types for query
Run it again without clearing the cache.Copy codeThe Code is as follows: SET STATISTICS IO ON
GO
Select distinct ([ProductID]), [UnitPrice] FROM [dbo]. [SalesOrderDetail_test]
WHERE [ProductID] = 777
GO

Result set return:Copy codeThe Code is as follows: 1 Table 'salesorderdetail _ test '. 5 scans, 15064 logical reads, 0 physical reads, 0 pre-reads, and 0 lob logical reads,
2 lob physical reads 0 times, and lob pre-reads 0 times.

The logic read remains unchanged for this time, and the page is still 15064. However, the physical read and pre-read operations are both 0. This indicates that the data has been cached in the memory and does not need to be read from the disk for the second operation. This saves time. To avoid affecting other tests, run the following statement to disable set statistics io on.Copy codeThe Code is as follows: SET STATISTICS IO OFF
GO

SET STATISTICS PROFILE ON
--------------------------------------------------------------------------------
This is the most complex one returned from the three settings. It returns the statement execution plan and the actual number of returned rows when the statement runs in each step.
Through this result, you can not only get the execution plan, understand the statement execution process, analyze the statement optimization direction, but also determine whether SQL Server is
Select a correct execution plan.Copy codeThe Code is as follows: SET STATISTICS PROFILE ON
GO
Select count (B. [SalesOrderID])
FROM [dbo]. [SalesOrderHeader_test]
Inner join [dbo]. [SalesOrderDetail_test] B
ON a. [SalesOrderID] = B. [SalesOrderID]
WHERE a. [SalesOrderID]> 43659 AND a. [SalesOrderID] <53660
GO

The returned result set is very long. The following describes important fields.
--------------------------------------------------------------------------------

Note:: It is shown from the bottom, that is, it is executed from the bottom until the result set is obtained (Row 1) the value displayed in the rows field is the result set returned by this query.

The number of rows indicates the number of steps performed by SQLSERVER. The number of rows indicates that SQLSRVER has performed 6 steps !!
Rows: The actual number of rows returned by each step of the Execution Plan
Executes: Number of times the execution plan is run in each step
StmtText: The specific content of the execution plan. The execution plan is displayed as a tree. Each row is a running step, and a result set is returned and its own cost is also available.
EstimateRows: SQLSERVER estimates the number of rows returned for each step based on the statistical information in the table. When analyzing the execution plan, we often compare the Rows and EstimateRows columns to check whether SQL Server estimates are correct and whether statistics are updated.
EstimateIO: SQLSERVER estimates the I/O cost generated by each step based on the length of the field recorded in EstimateRows and statistics.
EstimateCPU: SQLSERVR estimates the CPU cost generated by each step based on the length of the field recorded in EstimateRows and statistics, as well as the complexity of the task to be done.
TotalSubtreeCost: SQLSERVER calculates the cost of the sub-tree of each execution plan (including the sum of its own cost and the cost of all its lower-layer steps) based on some calculation formula of EstimateIO and EstimateCPU ), the cost described below is about this field value.
Warnings: The warning that SQLSERVER encounters when running each step. For example, if no statistical information is provided for a certain step, cost estimation is supported.
Parallel: Is parallel execution plan used in this step of execution plan?

The above results show that the execution plan is divided into four steps, and the first step is divided into two parallel sub-steps.

Step a1 (5th rows): Find all values of a. [SalesOrderID]> 43659 AND a. [SalesOrderID] <53660 in the [SalesOrderHeader_test] table.
Because the table has a clustered index on this field, SQL can directly use the seek of this index.
SQL prediction returns 10000 records, and 10000 records .. This prediction is accurate. The cost in this step is 0.202 (totalsubtreecost)
Step a2 (6th rows): Find all values of a. [SalesOrderID]> 43659 AND a. [SalesOrderID] <53660 in the [SalesOrderDetail_test] table.
Because the table has a non-clustered index on this field, SQL can directly use the seek of this index. here we can see that SQL is smart. Although the query statement only defines a in the [SalesOrderHeader_test] table. [SalesOrderID] & gt; 43659 AND. [SalesOrderID] <53660 filtering condition, but according to semantic analysis, SQL knows that this condition is true on [SalesOrderDetail_test. Therefore, SQL selects to filter the condition and then join. This greatly reduces the cost of join Operations.
In this step, the SQL statement returns an estimated 50561 records, and the actual return is 50577. Cost is 0.127, not high
Step B (Row 3 ):Join the result set obtained in steps a1 and a2. Because SQL knows through estimation that these two result sets are relatively large, he chooses the join method of Hash Match directly.
SQL estimates that the join operation can return 50313 rows, and the actual return is 50577 rows. Because SQL has statistical information on the two tables [SalesOrderID], the estimation here is very accurate.
Cost in this step equals totalsubtreecost minus his sub-step, 0.715-0.202-0.127 = 0.386. Since the estimated value is very accurate, we can believe that the cost here is the actual cost of each step.
Step c (Row 3 ):Calculating the value of count (*) Based on the result set returned by join is relatively simple. The result of count (*) is always 1, so the predicted value is correct.
In fact, the cost in this step is estimated based on the result set returned by the previous (B) join Operation. We know that the estimated return value of Step B is very accurate, so the estimated cost in this step will not have any major problems.
The cost of this subtree is 0.745, minus the cost of its child node, and his own cost is 0.745-0.715 = 0.03. Is a very small step
Step B (Row 3 ):Convert the value returned by Step c to the int type and return the result.
This step continues from the previous step, which is simpler. The cost required for the Data Type of a convert value is almost negligible. So the cost of this subtree is equal to that of its subnode, both of which are 0.745.
That is to say, his own cost is 0.
By using this method, you can understand the statement execution plan, SQL Server estimation accuracy, and cost distribution.
Finally, let's talk about it.: Different SQL Server versions may have different cost machines, such as SQL2005 and SQL2008.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.