SQL Server collects statistics and analyzes _mssql of statement operations

Source: Internet
Author: User
For a statement to run, there are other factors to consider, in addition to the execution plan itself, such as the compile time of the statement, the execution time, how many disk reads have been made, and so on.

If the DBA is able to run the problem statement individually, you can open the following three switches before running to collect statistics about the statement running.
This information is valuable for analyzing problems.
Copy Code code as follows:

SET STATISTICS time on
SET STATISTICS IO on
SET STATISTICS profile on

SET STATISTICS time on
--------------------------------------------------------------------------------
Let's see what information the set STATISTICS time on returns. To run the statement first:
Copy Code code as follows:

DBCC dropcleanbuffers
--Clears all cached data in the buffer pool
DBCC Freeproccache
Go
--Clears all cached execution plans in the buffer pool
SET STATISTICS time on
Go
Use [AdventureWorks]
Go
SELECT DISTINCT ([ProductID]), [UnitPrice] from [dbo]. [Salesorderdetail_test]
WHERE [productid]=777
Go
SET STATISTICS Time off
Go

SQL Server returns the following two pieces of information in addition to the result set
Copy Code code as follows:

SQL Server profiling and compilation time:
CPU time = 15 milliseconds, elapsed time = 104 milliseconds.
SQL Server profiling and compilation time:
CPU time = 0 milliseconds, elapsed time = 0 milliseconds.
(4 rows affected)
SQL Server Execution Time:
CPU time = 171 milliseconds, elapsed time = 1903 milliseconds.
SQL Server profiling and compilation time:
CPU time = 0 milliseconds, elapsed time = 0 milliseconds.

Everyone knows that SQL Server execution statements are divided into the following phases: analysis-compiling-"executing"
Analyze the appropriate execution plan based on the statistical information of the table, then compile the statement and finally execute the statement

now, what do you mean by the above output?
--------------------------------------------------------------------------------
1. CPU time: The meaning of this value refers to the amount of pure CPU time that SQL Server spends in this step. In other words, how much CPU resources the statement spends
2. Elapsed time: This value refers to how much time is spent in this step altogether. That is, this is the length of time the statement is running, some actions may have I/O operations, I/O waiting, or a blocking, blocking wait. In short, time is lost, but no CPU resources are used. So it's normal to take longer than CPU time, but CPU time is the sum of time on all CPUs of the statement. If the statement uses more than one CPU, and the other waits for little, then the CPU time is greater than the elapsed time is normal
3, analysis and compile time: This step, is the statement compile time. SQL Server must compile the statement before it clears all execution plans before it runs.
The compilation time here is not 0. Because the compilation is mainly CPU operation, so the general CPU time and occupy time is similar. If this is a big difference, it's important to see if SQL Server has a bottleneck on system resources.
Here they are a 15 millisecond, one is 104 milliseconds
4. SQL Server execution time: When the statement actually runs. Because the statement is the first run, SQL Server needs to read the data from the disk into memory, where the statement runs a longer I/O wait. So there's a big difference between CPU time and occupancy time, one is 171 milliseconds, and the other is 1903 milliseconds.

In general, this statement takes 104+1903+186=2193 milliseconds, where CPU time is 15+171=186 milliseconds. The main time of the statement should be spent on I/O waiting

now do the statement again, but do not clear any cache
Copy Code code as follows:

SET STATISTICS time on
Go
SELECT DISTINCT ([ProductID]), [UnitPrice] from [dbo]. [Salesorderdetail_test]
WHERE [productid]=777
Go
SET STATISTICS Time off
Go

This time is much faster than last time. The output time statistics are:
Copy Code code as follows:

SQL Server profiling and compilation time:
CPU time = 0 milliseconds, elapsed time = 0 milliseconds.
SQL Server profiling and compilation time:
CPU time = 0 milliseconds, elapsed time = 0 milliseconds.
(4 rows affected)
SQL Server Execution Time:
CPU time = 156 milliseconds, elapsed time = 169 milliseconds.
SQL Server profiling and compilation time:
CPU time = 0 milliseconds, elapsed time = 0 milliseconds.

Because the execution plan is reused, "SQL analysis and compile time" CPU time is 0, taking up time is 0
Because the data has been cached in memory, does not need to read from the disk, the SQL execution time CPU time is 156, take the time and CPU time is very close, is 169.

This saves the run time 1903-169=1734 milliseconds, and it can be seen again that caching plays a vital role in statement execution performance
To not affect other tests, run the following statement to turn off set STATISTICS time on
Copy Code code as follows:

SET STATISTICS Time off
Go

SET STATISTICS IO on
--------------------------------------------------------------------------------
This switch can output the number of physical and logical reads made by the statement. Plays an important role in analyzing the complexity of the statement.
Or just that query as an example
Copy Code code as follows:

DBCC dropcleanbuffers
Go
SET STATISTICS IO on
Go
SELECT DISTINCT ([ProductID]), [UnitPrice] from [dbo]. [Salesorderdetail_test]
WHERE [productid]=777
Go

His return is:
(4 rows affected)
Table ' Salesorderdetail_test '. Scan Count 5, logical read 15,064 times, physical read 0 times, read 15,064 times, LOB logic read 0 times, lob physical read 0 times, lob read 0 times.
The meaning of each output is:
--------------------------------------------------------------------------------
Table: The name of the table. The table here is salesorderdetail_test.

Scan count: Number of scans performed. According to the execution plan, the table was scanned several times. Generally speaking, the larger table scan the more times the worse. The only exception is if the execution plan chooses to run concurrently, a table is read by multiple thread threads at the same time, and each thread reads a portion of it, but the number of threads is shown here. That is, there are several thread in the concurrent do, there will be several scans. It's no problem when the number is big.

Logical reads: The number of pages read from the data cache. The more pages, the greater the amount of data to be accessed by the query, the greater the memory consumption, and the more expensive the query.

Can check whether the index should be adjusted, reduce the number of scans, narrow the scanning range
Physical reads: Number of pages read from disk
Pre-read: Number of pages that are read into the cache for query purposes
Physical Read + pre-read: Is the number of pages that SQL Server reads from disk to complete this query. If not 0, the data is not cached in memory. The speed of operation must be affected
LOB logical reads: number of text, ntext, image, large value types (varchar (max), nvarchar (max), varbinary (max) pages read from the data cache
LOB physical reads: Number of text, ntext, image, large value type pages read from disk
LOB prefetching: The number of text, ntext, image, and large value type pages put into the cache for query purposes
Then run it again without emptying the cache
Copy Code code as follows:

SET STATISTICS IO on
Go
SELECT DISTINCT ([ProductID]), [UnitPrice] from [dbo]. [Salesorderdetail_test]
WHERE [productid]=777
Go

Result set Returns:
Copy Code code as follows:

1 table ' Salesorderdetail_test '. Scan Count 5, logical read 15,064 times, physical read 0 times, pre-read 0 times, LOB logic read 0 times,
2 LOB physical reads 0 times, LOB reads 0 times.

This time the logical read is unchanged, or 15064 pages. But physical reading and prefetching are 0. Indicates that the data has been cached in memory the second run does not need to be read from disk again, save time to not affect other tests, run the following statement to turn off set STATISTICS IO on
Copy Code code as follows:

SET STATISTICS IO off
Go

SET STATISTICS profile on
--------------------------------------------------------------------------------
This is one of the most complex returns in three settings, and he returns the execution plan for the statement, as well as the actual number of rows that the statement runs at each step.
Through this result, not only can get the execution plan, understand the statement execution process, the Analysis statement tuning direction, can also determine whether SQL Server
A correct execution plan was selected.
Copy Code code as follows:

SET STATISTICS profile on
Go
SELECT COUNT (B.[salesorderid])
from [dbo]. [Salesorderheader_test] A
INNER JOIN [dbo]. [Salesorderdetail_test] B
On A.[salesorderid]=b.[salesorderid]
WHERE a.[salesorderid]>43659 and a.[salesorderid]<53660
Go

The result set returned is very long, so here's the important field.
--------------------------------------------------------------------------------

Attention: This is where you start from the bottom up, which means you start at the bottom until you get the result set, so the Rows field in row 1 shows the value of the result set that the query returns.

And how many lines indicate how many steps SQL Server has executed, here are 6 lines, indicating that Sqlsrver performed 6 steps!!
Rows: The actual number of rows returned in each step of the execution plan
executes: How many times each step of the execution plan was run
Stmttext: The specific content of the implementation plan. The execution plan is displayed in a tree form. Each row is a running step, there will be a result set back, there will be their own cost
EstimateRows: SQL Server returns the number of rows that are estimated for each step based on the statistics on the table. When analyzing the execution plan, we often compare rows and estimaterows to make sure that SQL Server is estimated correctly to determine if the statistics are updated
Estimateio: SQL Server estimates the I/O cost per step, based on the length of the field recorded in EstimateRows and statistics
estimatecpu: sqlservr estimates the CPU cost per step, based on the length of the field recorded in EstimateRows and statistics, and the complexity of what is to be done
Totalsubtreecost: SQL Server calculates the cost of each step execution plan subtree based on Estimateio and ESTIMATECPU, including its own and all of his underlying steps, which are described below for the value of this field
Warnings: The warning that SQL Server encounters when running each step, for example, there is no statistical information to support cost estimates at a certain step.
Parallel: Does this step of the execution plan use a parallel execution plan

From the results above you can see that the execution plan is divided into 4 steps, where the first step is divided into parallel two sub steps

Step A1 (line 5th): Find all A In the [Salesorderheader_test] list. [salesorderid]>43659 and a.[salesorderid]<53660 values
Because the table has a clustered index on this field, SQL can use the seek of this index directly
The SQL forecast returns 10,000 records and actually returns 10,000 records ... This prediction is accurate. The cost of this step is 0.202 (totalsubtreecost)
Step A2 (line 6th): Find the values of all a.[salesorderid]>43659 and a.[salesorderid]<53660 from the [salesorderdetail_test] table
Because the table has a nonclustered index on this field, SQL can directly use this index's seek here to see where the SQL is smart. Although a query statement only defines a on a [salesorderheader_test] table. [salesorderid]>43659 and a.[salesorderid]<53660 filter conditions, but according to semantic analysis, SQL knows that this condition is true on [salesorderdetail_test]. So SQL chooses to filter this condition first and then join. This will greatly reduce the cost of the join
In this step, the SQL estimate returns 50,561 records, and actually returns 50,577. The cost is 0.127, and it's not high.
Step B (line 4th):A join is made of the result set A1 and A2 two steps. Because SQL estimates that the two result sets are large, he chooses the join method of the hash match directly.
SQL estimates this join can return 50313 rows and actually return 50577 rows. Because SQL has statistics on [SalesOrderID] on both tables, the estimates here are very accurate
The cost of this step equals Totalsubtreecost minus his sub step, 0.715-0.202-0.127=0.386. Since the estimate is very accurate, we can believe that the cost here is the actual cost per step
Step C (line 3rd):The value of Count (*) based on the result set returned by join is simpler, and the result of Count (*) is always 1, so the predicted value is correct.
In fact, the cost of this step is estimated based on the size of the result set returned from the previous step (b) join. We know that the estimated return value of step B is very accurate, so the estimated cost of this step will not be a big problem
The cost of the Shang tree is 0.745, minus his sub node cost, and his own cost is 0.745-0.715=0.03. It's a very small step to take.
Step B (line 2nd):Converts the value returned by step c to the int type, returning as the result
This step is a continuation of the previous step, simpler. The cost of the data type of the convert value is almost negligible. So the cost of the Shang tree is equal to his child nodes, all 0.745.
In other words, his own cost is 0.
In this way, the user can understand the execution plan of the statement, the accuracy of SQL Server estimate, the distribution of cost
For the last moment,: Different SQL Server versions, different machine cost may vary, such as SQL2005, SQL2008
Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.