SQL Server Tuning Series Advanced (query statements run several indicator value monitoring)

Last Update:2015-05-07 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Read Catalogue

Objective
Technical preparation

Objective

In the previous article, we analyzed how the query optimizer works, including: detailed operation steps of the query optimizer, analysis of filter conditions, optimization of index items, and other information.

In this article we analyze the detection of several key indicator values in the course of our operation.

These indicator values are analyzed to analyze the operation of the statement and to analyze how it is optimized.

Through this article we can learn the tuning of a few of the most frequently used tools!

Talk less, start the subject of this article.

The database version is SQL SERVER2008R2 and is analyzed using a more concise case library (Northwind) from Microsoft.

Weapon one, IO statistics

Through this IO statistic, we can analyze the number of data pages to be scanned by the current query statement. There are several important concepts that we analyze in turn.

The method is simple and one line of code is done:

1	`SETSTATISTICSIO` `ON`

Let's see an example.

123	`SETSTATISTICS` `IO` `ONGOSELECT` `*` `FROMPerson.Contact`

Here you can see how many times this statement is used for the data table, based on the scanned item of the data page.

The so-called data page is the underlying data storage for the database, and SQL Server stores the table row data as a data page. Each data page is 8K,

8k=8192 bytes-96 bytes (page header)-36 bytes (row offset) = 8060 bytes

It also says that a data page stores 8060 bytes of pure data content.

Let's explain in turn that there are several read concepts:

Logical Read

Represents the total number of access pages required to process a query. That is, the total number of data pages that need to be read to complete a query statement.

The data page here is likely to come from memory, or it may be from a hard disk read.

Physical Reading

This means the number of pages read from the hard disk. We know that SQL Server will be reading the data page as much as possible in memory, to facilitate the next direct read, improve read speed.

So here's the probability of the next visit to the data page stored in memory, which presents a metric: Cache Hit ratio

Cache Hit Ratio = (logical read-physical read)/logical read

The proposed indicator is designed to measure the validity of cached data pages in memory. For example: if the cache and in-memory data pages are not used once, for this should be removed from memory in a timely manner, after all, for memory resources is very expensive. It should be used to cache data pages with high hit ratios.

Pre-read

Read-ahead is actually the number of pages of data that SQL statements pre-read into memory when optimized. This pre-read data page is evaluated in advance, which is what the query optimizer is going to do in the previous article.

Of course, these pre-read data pages are sometimes not all used, but it basically covers the data pages that the query uses.

It is to be suggested here that the read-ahead data is Cheng Fei by another thread, and the statement optimization line runs in parallel with the same thread, with the goal of quickly acquiring data and increasing the speed of query acquisition.

From this indicator we can analyze a lot of problems, to give an example:

Let's add a new test table with the following script

123456789 --perform the following script to a new table select * into neworders from orders go --add a new column alter table neworders add FULL_DETAILS&NBSP; char not null default ' full Details ' go

Then use the following script to look at the size of this table

12	`EXECsp_spaceused NewOrders,TRUEGO`

We can see the total size of this table data page is 2216KB, we know a page is 8KB, we can infer that the table data page has:

2216 (total data page size)/8 (one data page size) = 277 pages

This means that the data table has 277 data pages.

Of course, we can also view the page's data pages by using the DMV view below

123	`SELECT*FROMSYS.dm_db_index_physical_stats(DB_ID(‘Northwind‘),object_id(‘NewOrders‘),NULL,NULL,‘detailed‘)`

Through the above analysis,

We can speculate that when querying this table for a full table scan, the logical reading of the theoretical data page should be 277 times

Verify the following statement by following the

123456 --先清空缓存数据，生产机慎用DBCC DROPCLEANBUFFERSSET STATISTICS IO ONSELECT * FROM NewOrders

I'll go...

The logic read here is 1047 pages, and the inferred 277 page above does not match ... Wipe... God horse reason!!!

Here is the data page we are analyzing forwarded the record phenomenon. Because we are in the newly created table, the newly added column of data: Full_details, a data column of type char (2000), when the variable-length column in the data row grows so that the original page cannot hold the data row, the data is moved to the new page, and a pointer to the new page is left in the original location. This is called: Forwarded record

We can view the number of pages formed by the forwarded record of the table by the DMV view below.

123	`SELECT*FROMSYS.dm_db_index_physical_stats(DB_ID(N‘Northwind‘),object_id(‘NewOrders‘),NULL,NULL,‘detailed‘)`

Correct: The 770 data page for the forwarded record page, the concept of non-split page (thanks to the friends of the hospital wy123 pointed out).

See, the forwarded record page here is 770 pages, then we can extrapolate the number of logical readings we have.

277 (Original data page) +770 (forwarded record page) = 1047 page

So the above questions are analyzed for reasons.

We also showed a problem with the forwarded record page through this table: it affects query performance.

There are many ways to solve this, and the simplest way is to rebuild the clustered index.

1234567 CREATECLUSTERED INDEX orderID_C ON NewOrders(OrderID)GODROP INDEX NewOrders.orderID_CGOSET STATISTICS IO ONSELECT * FROMNewOrdersGO

With IO statistics, in addition to analyzing the fragments caused by the forwarded record page above, it is more important to compare the number of reads between different query statements and to optimize the statements by reducing the number of reads.

As for the pre-read scenario, we have previously analyzed that the data is preloaded by another thread when the T-SQL query statement is optimized.

So this thread actually has a reference value when reading the data, and the data can be read from this reference to ensure that most of the data is useful, that is, to increase the cache hit rate mentioned above.

About this reference value, I analyzed the next, in fact, is divided into two cases of analysis.

first , if the data table is a heap table, SQL Server gets it only through full table scan. This way, in order to avoid repeated reads, increase the consumption, so that the one-time pre-reading is not read a data page,

It's a physics 64 pages in a row.

Take a look at the official explanations for Books Online:

The read-ahead mechanism allows the database engine to read up to 64 contiguous pages (512KB) from a file. The read is performed as a scatter-gather read of the corresponding number of buffers in the buffer cache (which may be non-contiguous). If any page in this range already exists in the buffer cache, the corresponding page read will be discarded when the read is complete. If the page already exists in the cache, you can also "crop" the extent of the page from either end.

So, if our tables are not physically contiguous pages, then the number of reads is not good how to determine.

Let's look at an example of a heap table.

1234	`SETSTATISTICS` `IO` `ON--新建个测试表SELECT` `` `INTO` `NewOrders_TEST` `FROM` `NewOrdersSELECT` `` `FROMNewOrders_TEST`

The number of pre-reading here is 8 times, so I estimate that the underlying data page is definitely not sequential. So the result was 3 more times.

We can make a query under DBCC IND () to verify my inference.

1	`DBCC IND(‘Northwind‘,‘NewOrders_TEST‘,1)`

Data information is more, I paste it into Excel, and then make a line chart, where the part of the painting is actually no data page, so does not produce a read.

There is also a problem with the reading order, but the total number is definitely 8 times ....

Hope this way, you crossing can read ... I hope I can express my understanding.

second , if the table is not a heap table, that is, there is a clustered index entry, well, SQL Server is easy to find its pre-reading reference: statistical information.

Also, we know that the data is stored in b-tree numbers, and the data pages that are read are present with the leaf nodes. So basically there is no continuous reading of the gratitude.

A leaf node is a data page, and a data page is a pre-read.

Let's look at an example:

We add the above table to the clustered index entry, empty the cache again, execute the query, and the script is as follows

123	`CREATECLUSTERED` `INDEX` `NewOrders_TESTIndex` `ON` `NewOrders_TEST(OrderID)GOSELECT` `*` `FROMNewOrders_TEST`

The clustered index is added here, and SQL Server looks like a savior, and according to statistics, read-ahead data.

So if there is an error in the statistics, it causes read-ahead reading .... Then severely reduces the cache hit rate ... and then seriously increases the speed of swap-in in memory .... Added CPU ....

Well, let's continue the article, the pre-read data line we mentioned above can be found in the following DMV.

123	`SELECT*FROMSYS.dm_db_index_physical_stats(DB_ID(N‘Northwind‘),object_id(‘NewOrders_TEST‘),NULL,NULL,‘detailed‘)`

From this DMV view, you can see that this table statistic is 277 data pages, so 277 read-ahead is formed.

However, the fact that the data table is 279 pages, that is, the statistical information is problematic, resulting in a few read read 2 data pages, and in order to compensate for this statistical negligence, there are 2 physical reads, re-obtained from the hard disk.

Tool two, TIME statistics

The simple thing about time statistics is that the statistics T-SQL executes the statement execution time items, including CPU time, statement compile time, total statement execution time, and so on.

The use of the method is also very simple, a line of code

1	`SETSTATISTICS` `TIMEON`

Through this parameter, we can analyze the above information, which is mainly used to compare the execution time in the query statement tuning, our goal is to reduce the execution time.

Example: We compare the above query statement to the time of the first run and later run (the data has been cached) to understand the importance of the cache by turning on time statistics.

Time to execute again

Cache tracking (added on December 25, 2014)

Of course, we can go a little further, if you want to see the details of this part of the data cache in memory, you can see the following DMV script

1234	`SELECT*` `FROM` `sys.dm_os_buffer_descriptorsWHERE` `DB_NAME(database_id)=‘Northwind‘AND` `page_type=‘DATA_PAGE‘ORDER` `BYpage_id`

The DMV can also be used to analyze the size ratio of each library in memory, and the script is as follows:

1234567891011 --清除缓存dbcc dropcleanbuffers--查看缓存内容中在内存大小SELECT COUNT(*)*8/1024 as ‘Cached Size(MB)‘ ,CASE database_id WHEN 32767 THEN ‘ResourceDB‘ ELSE DB_NAME(database_id) END AS ‘Database‘FROM sys.dm_os_buffer_descriptorsGROUP BY DB_NAME(database_id),database_idORDER BY ‘Cached Size(MB)‘ DESC

After this query, this table has all been cached in memory, because the whole table is a total of 2MB size

SQL Server Tuning Series Advanced (query statements run several indicator value monitoring)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

SQL Server Tuning Series Advanced (query statements run several indicator value monitoring)

Contact Us

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support