SQL Server Tuning Series Advanced (Monitoring Several Key Indicator Values When Running Query Statements)
Objective
In the previous article we analyzed how the query optimizer works, including the detailed steps the optimizer goes through, how it analyzes filter conditions, how it chooses index items, and so on.
In this article we look at how to monitor several key indicator values while our statements run.
By analyzing these indicators we can see how a statement actually executes and how it can be optimized.
Through this article you will learn a few of the most frequently used tuning tools!
Without further ado, let's get to the subject of this article.
Technical preparation
The database version is SQL Server 2008 R2, and the analysis uses Microsoft's compact sample database Northwind.
Tool one, IO statistics
With IO statistics we can see how many data pages the current query statement has to scan. There are several important concepts here, which we will analyze in turn.
The method is simple; one line of code does it:
SET STATISTICS IO ON
Let's see an example.
SELECT * FROM Person.Contact
GO
From the output you can see how many reads against the table's data pages this statement performed.
The so-called data page is the database's underlying unit of storage: SQL Server stores table row data in data pages, and each data page is 8 KB.
8 KB = 8192 bytes - 96 bytes (page header) - 36 bytes (row offsets) = 8060 bytes
In other words, one data page holds at most 8060 bytes of actual row data.
Let's explain the several read concepts in turn:
Logical reads
This is the total number of data pages that must be accessed to process a query, i.e., the total number of pages read to complete the statement.
These pages may come from memory or may have to be read from disk.
Physical reads
This is the number of pages read from disk. We know that SQL Server keeps data pages in memory as much as possible, so the next read can be served directly from memory, which improves read speed.
So there is a certain probability that the next visit finds the data page already in memory, which gives us a metric: cache hit ratio.
Cache hit ratio = (logical reads - physical reads) / logical reads
This metric measures how useful the cached data pages in memory are. For example, if a data page sits in the cache but is never used again, it should be evicted from memory promptly; after all, memory is an expensive resource, and it should be reserved for data pages with a high hit rate.
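If you want a server-wide view of this ratio, the buffer manager counters expose it. A minimal sketch; the counter and object names below are what I believe sys.dm_os_performance_counters reports, so verify them on your own instance:
-- Server-wide buffer cache hit ratio, the same idea as the formula above.
SELECT CAST(a.cntr_value * 100.0 / b.cntr_value AS DECIMAL(5, 2)) AS buffer_cache_hit_ratio_pct
FROM sys.dm_os_performance_counters a
JOIN sys.dm_os_performance_counters b
    ON a.[object_name] = b.[object_name]
WHERE a.counter_name = 'Buffer cache hit ratio'
  AND b.counter_name = 'Buffer cache hit ratio base'
  AND a.[object_name] LIKE '%Buffer Manager%'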
Read-ahead reads
Read-ahead is the number of data pages that SQL Server reads into memory in advance while the statement is being optimized. These pages are estimated ahead of time, which is part of what the query optimizer does, as covered in the previous article.
Of course, not all of these pre-read pages end up being used, but they basically cover the pages the query will need.
It is worth pointing out that the read-ahead is performed by a separate thread that runs in parallel with the thread optimizing and executing the statement; the goal is to fetch the data early and speed up the query.
From this indicator we can analyze many problems. Let's look at an example.
First we create a new test table with the following script:
--Create a new test table
SELECT * INTO NewOrders FROM Orders
GO
--Add a new wide column
ALTER TABLE NewOrders ADD full_details CHAR(2000) NOT NULL DEFAULT 'fulldetails'
GO
Then use the following script to look at the size of this table
EXEC sp_spaceused NewOrders, true
GO
We can see that the total size of the table's data is 2216 KB. Since one page is 8 KB, we can infer the number of data pages:
2216 KB (total data size) / 8 KB (one data page) = 277 pages
That is, the table should have 277 data pages.
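Besides doing the division by hand, the page count can also be pulled straight from the partition statistics. A sketch using only standard system views (run it in the Northwind database):
-- in_row_data_page_count gives the in-row data pages of the heap; * 8 converts to KB.
SELECT SUM(ps.in_row_data_page_count) AS data_pages,
       SUM(ps.in_row_data_page_count) * 8 AS data_kb
FROM sys.dm_db_partition_stats ps
WHERE ps.[object_id] = OBJECT_ID('NewOrders')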
Of course, we can also view the table's data pages with the DMV below:
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID('Northwind'), OBJECT_ID('NewOrders'), NULL, NULL, 'DETAILED')
Based on the above analysis, we can predict that a full table scan of this table should produce 277 logical reads of data pages.
Let's verify with the following statement:
SELECT * FROM NewOrders
GO
Well, that's odd... the logical reads here are 1047 pages, which doesn't match the 277 pages we inferred above... What is going on?!
This is the result of data page splits. Because we added a new column, full_details, a CHAR(2000) column, to the freshly created table afterwards, the existing data pages no longer had enough room to hold the newly added column data, which caused the pages to split.
For background on data page splits, see Brother Song's article at http://www.cnblogs.com/CareySon/archive/2012/01/06/2313897.html, which also gives solutions to this problem.
We can see how many split pages the table now has through the DMV view below:
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(N'Northwind'), OBJECT_ID('NewOrders'), NULL, NULL, 'DETAILED')
As you can see, there are 770 split pages here, so we can account for the logical read count:
277 (original data pages) + 770 (split pages) = 1047 pages
That explains the discrepancy we saw above.
This also exposes a page-split problem in the table, and page splits hurt query performance.
There are many ways to fix this; the simplest is to rebuild the clustered index (for a heap like this one, that means creating one first).
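A minimal sketch of that fix; the key column OrderID and the index name are my assumptions, since the source does not show the actual script:
-- NewOrders is currently a heap, so creating a clustered index rewrites all
-- the data pages and removes the split pages.
CREATE CLUSTERED INDEX IX_NewOrders_OrderID ON NewOrders (OrderID)
GO
-- Once the index exists, later fragmentation can be fixed by rebuilding it.
ALTER INDEX IX_NewOrders_OrderID ON NewOrders REBUILD
GO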
SELECT * FROM NewOrders
GO
Apart from spotting the fragmentation caused by page splits, as above, the more important use of IO statistics is to compare the read counts of different query statements and to optimize a statement by reducing its number of reads.
As for read-ahead, we analyzed earlier that the data is loaded in advance by another thread while the T-SQL statement is still being optimized.
So this thread needs some reference when reading the data, so that most of what it reads is actually useful, that is, so that the cache hit ratio mentioned above stays high.
I looked into what this reference is, and it actually breaks down into two cases.
First, if the table is a heap, SQL Server can only access it through a full table scan. To avoid repeated reads and extra overhead, a single read-ahead does not fetch just one data page;
it reads up to 64 physically contiguous pages at a time.
Take a look at the official explanation in Books Online:
The read-ahead mechanism allows the Database Engine to read up to 64 contiguous pages (512 KB) from one file. The read is performed as a single scatter-gather read into the corresponding number of (possibly non-contiguous) buffers in the buffer cache. If any pages in the range are already present in the buffer cache, the corresponding pages from the read are discarded when the read completes. The range of pages may also be "trimmed" from either end if the corresponding pages are already in the cache.
So if the table's data pages are not physically contiguous, the exact number of read-ahead reads is hard to pin down.
Let's look at an example of a heap table.
SET STATISTICS IO ON
GO
SELECT * FROM NewOrders_test
GO
The read-ahead count here is 8. The table has roughly 279 data pages (we will see the exact figure below), so if the pages were physically contiguous, about 279 / 64 ≈ 5 reads would be enough; 8 reads is 3 more than that, so I infer the underlying data pages are definitely not contiguous.
We can run DBCC IND() to verify this inference.
DBCC IND ('Northwind', 'NewOrders_test', 1)
There is quite a lot of output, so I pasted it into Excel and made a line chart. The gaps in the chart are ranges where the table has no data pages, so no read is issued for them.
The read order is a separate question, but the total is definitely 8 reads.
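If you prefer not to paste into Excel, the page list can be captured into a temp table and sorted there. A sketch; the column list is what I believe DBCC IND returns on SQL Server 2008 R2, so treat it as an assumption and adjust if the insert fails:
CREATE TABLE #pages
(
    PageFID tinyint, PagePID int, IAMFID tinyint, IAMPID int,
    ObjectID int, IndexID int, PartitionNumber tinyint, PartitionID bigint,
    iam_chain_type varchar(30), PageType tinyint, IndexLevel tinyint,
    NextPageFID tinyint, NextPagePID int, PrevPageFID tinyint, PrevPagePID int
)
-- Capture the undocumented DBCC output so it can be queried like a normal table.
INSERT INTO #pages
EXEC ('DBCC IND (''Northwind'', ''NewOrders_test'', 1)')
-- PageType = 1 are data pages; sorting by PagePID makes the gaps easy to spot.
SELECT PagePID, PageType FROM #pages WHERE PageType = 1 ORDER BY PagePID
DROP TABLE #pages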
I hope this way of presenting it gets the idea across, and that I have expressed my understanding clearly.
Second, if the table is not a heap, that is, it has a clustered index, then SQL Server has an easy reference for read-ahead: the statistics.
We also know that in this case the data is stored in a B-tree structure, and the data pages read are the leaf nodes, so there is basically no notion of contiguous reads here:
one leaf node is one data page, and one data page is one read-ahead read.
Let's look at an example.
We add a clustered index to the table above, empty the cache again, and execute the query; the script is as follows.
SELECT * FROM NewOrders_test
GO
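As a sketch of the three steps just described (the key column OrderID and the index name are my assumptions, since the source does not show them), the full script would look roughly like this:
-- 1. Add a clustered index to the heap (assumed key column).
CREATE CLUSTERED INDEX IX_NewOrders_test_OrderID ON NewOrders_test (OrderID)
GO
-- 2. Empty the buffer cache so the next read must come from disk (test servers only).
CHECKPOINT
DBCC DROPCLEANBUFFERS
GO
-- 3. Run the query with IO statistics on and look at the read-ahead count.
SET STATISTICS IO ON
SELECT * FROM NewOrders_test
GO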
Once the clustered index is added, SQL Server treats the statistics like a treasure and reads ahead according to them.
So if the statistics are wrong, the read-ahead turns into largely random reads, the cache hit ratio drops sharply, pages are swapped in and out of memory much more often, and CPU usage climbs.
Well, to continue: the page count that drives the read-ahead mentioned above can be seen in the following DMV.
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(N'Northwind'), OBJECT_ID('NewOrders_test'), NULL, NULL, 'DETAILED')
From this DMV you can see that the statistics for this table say 277 data pages, so 277 read-ahead reads are issued.
However, the table actually has 279 pages; the statistics are slightly off, so the read-ahead misses 2 data pages, and to make up for that oversight there are 2 physical reads that fetch them from disk.
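When statistics drift like this, refreshing them brings the read-ahead estimate back in line with reality. A minimal sketch (FULLSCAN is the most thorough but also the most expensive sampling option):
-- Rebuild the table's statistics from a full scan of the data.
UPDATE STATISTICS NewOrders_test WITH FULLSCAN
GO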
Tool two, TIME statistics
TIME statistics, simply put, report timing information for T-SQL statement execution, including CPU time, parse and compile time, total elapsed time, and so on.
Using it is also very simple, one line of code:
SET STATISTICS TIME ON
With this setting on we can see the information above. It is mainly used to compare execution times when tuning query statements; the goal is to reduce execution time.
For example, by turning on time statistics we can compare the time of the first run of the query above with a later run (when the data is already cached), which shows how important the cache is.
Execution time on the second run:
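A sketch of that comparison; DBCC DROPCLEANBUFFERS empties the buffer cache, so only do this on a test server, and the table name is the test table used above:
SET STATISTICS TIME ON
GO
-- Cold run: flush dirty pages, then drop clean buffers so data must come from disk.
CHECKPOINT
DBCC DROPCLEANBUFFERS
GO
SELECT * FROM NewOrders_test
GO
-- Warm run: the data pages are now in the buffer pool, so elapsed time drops.
SELECT * FROM NewOrders_test
GO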
This article has gotten a bit long, so let's stop here for now.
Tuning is a very broad topic; we will cover more of it in future installments, so follow along if you are interested.
Reference documents
- Microsoft Books Online: Reading Pages
- The book series Inside Microsoft SQL Server 2005 (SQL Server 2005 Technical Insider)
If you have any questions, you can leave a comment or send a private message; I look forward to studying SQL Server in depth with all of you, learning and making progress together.
Links to the previous articles are given at the end; their content basically covers decomposing the query operations we write every day, so it seems it is time to put together a table of contents.
SQL Server Tuning Series Basics
SQL Server Tuning Series Basics (Summary of Common Operators)
SQL Server Tuning Series Basics (Union Operators Summary)
SQL Server Tuning Series Basics (Parallel Operations Summary)
SQL Server Tuning Series Basics (Parallel Operations Summary, Part II)
SQL Server Tuning Series Basics (Index Operations Summary)
SQL Server Tuning Series Basics (Subquery Operations Summary)
-----------------The advanced articles follow-------------------
SQL Server Tuning Series Advanced (How the Query Optimizer Runs)
SQL Server Tuning Series Advanced (Monitoring Several Key Indicator Values When Running Query Statements)
If you have read this post and feel you have gained something from it, please don't hold back on your "Recommend".