Optimize SQL Server query performance with set STATISTICS IO and set STATISTICS time

Last Update:2014-06-17 Source: Internet

Author: User

Tags sql server query

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

The first thing to note is that the content of this article is not how to tune SQL Server query performance (about which you can write a book), but how to leverage the set STATISTICS IO and set STATISTICS in SQL Server query performance tuning Time these two frequently ignored Transact-SQL commands.

On the surface, the adjustment of query performance is a very simple matter. In essence, we want queries to run as fast as possible, either by shrinking the query run time from 10 minutes to 1 minutes, or by shortening the run time from 2 seconds to 1 seconds, and our ultimate goal is to reduce the running time.

Although there are many reasons why query performance tuning is difficult, this article will only cover one of them, the most important of which is that performance tuning is required whenever the use environment changes, so it is difficult to figure out how to tune the performance of the query.

If a performance survey is performed on a test server as most users do, the effect is often not quite satisfactory, because the environment of the test server is not exactly the same as the server environment in which it is actually applied. As the resource requirements evolve, SQL Server automatically adjusts itself.

If you have questions about this, you can run the same query repeatedly on a server that is heavily loaded, and in most cases, the time it takes to execute the query is not the same. Of course, the gap is not big, but its change is enough to make the performance adjustment more difficult than it should be.

What the hell is going on here? Is your idea wrong or is the server overloaded when you run the query? Is this the cause of the increase in running time? Although it is possible to run the query multiple times over and over again to get an average time, this is a lot of work. We need to compare the performance of each test with a very scientific standard.

Measuring server resources is the key to solving query performance tuning issues

There are a number of server resources that are used when executing queries on the server. One of the resources is CPU time, assuming that the database has not changed, repeatedly running the same query its CPU time will be very close. Here, I'm not referring to the time a query runs from start to finish, but rather the amount of CPU resources required to run the query, and the time it takes to run a query is related to how busy the server is.

Another resource required by SQL Server is IO. Whenever a query is run, SQL Server must read the data (logical read) from the data buffer, and if the required data is not in the buffer, it needs to be read to disk (physical read).

As you can see from the discussion, the more CPU and IO Resources A query requires, the slower the query runs, so another way to describe the query performance tuning task is to rewrite the query command in a way that uses less CPU, IO resources, and if you can complete the query in such a way, The performance of the query will be increased.

If the purpose of tuning query performance is to make it use as few server resources as possible, rather than the shortest time to run the query, it is easier to test whether the actions you take are improving the performance of the query or reducing the performance of the query. This is especially true on servers where resource utilization is constantly changing. First, you need to figure out how to test the resource usage of our servers when tuning the query.

Recall the set STATISTICS IO and set STATISTICS time

SQL Server has long supported the two Transact-SQL commands for set STATISTICS IO and set STATISTICS time, but for some other reason, many DBAs (data are system administrators) ignore them when tuning query performance. Maybe they're not very attractive. But whatever the reason, we'll see below that they're still useful in tuning query performance.

There are three ways to use these two commands: Use Transact-SQL command-line mode, use Query Analyzer, and set the appropriate connection properties for the current connection in Query Analyzer. In this article, we will use the Transact-SQL command line to demonstrate how they are used.

Set STATISTICS io and set STATISTICS time function as switches, you can turn on or off the various reporting information for our queries using resources. By default, these settings are turned off. Let's start with an example of how these commands open and see what information they will report.

Before starting our example, start query Analyzer and connect to a SQL Server. In this example, we will use the Northwind database and make it the default database for this connection.

Then, run the following query:

SELECT * FROM [Order Details]

If you have not changed the Order Details table, this query will return 2,155 records. This is a typical result and I believe you have seen it many times in Query Analyzer.

Now let's run the same query, but we'll run the set STATISTICS IO and set STATISTICS time commands first before we run the query. It should be remembered that the opening of these two commands is only valid for the current connection, and once you open one or two of these commands and then close the current connection and open a new connection, you need to execute the corresponding command again. If you want to close the two commands in the current connection, simply change the original command to OFF, and then do it again.

Before starting our example, run the following two commands (do not execute on the server you are using), and these two commands will clear the SQL Server data and process buffers so that we can be at the same starting point each time we execute the query, otherwise The results of each execution of a query are not comparable:

DBCC dropcleanbuffers

DBCC Freeproccache

　Enter and run the following Transact-SQL command:

SET STATISTICS IO on

SET STATISTICS time on

Once the above preparatory work is complete, run the following query:

SELECT * FROM [Order Details]

If you run all of the above commands at the same time, the output you get will be different from mine, and it's hard to figure out what's going on.
After you run the above command, you will see new data in the results window that you have not seen before, at the top of the window, with the following information:

SQL Server parse and compile time: (SQL Server parsing and compiling:)
CPU time = Ten ms, elapsed time = Ms.
SQL Server parse and compile time: (SQL Server parsing and compiling:)
CPU time = 0 ms, Elapsed time = 0 Ms.

When the above data is displayed, the records that are queried are displayed. After displaying 2,155 records, the following information is displayed:

Table ' Order Details '. Scan count 1, logical reads, physical reads 1, Read-ahead reads 9.
(table: Order Details, scan count 1, logical read 10, physical read 1, read ahead 9)
SQL Server Execution times:
(SQL Server Execution time:)
CPU time = ms, Elapsed time = 387 Ms.

(each result may be different, as we discuss the information shown below.) ）

So what is the specific implication of this information? Let's analyze it in detail below.

Result of SET STATISTICS time

The SET STATISTICS time command is used to test the runtime of various operations, some of which may not be useful for tuning query performance. Run this command to get the following display information on the screen:

At the beginning of the output:

SQL Server parse and compile time:
CPU time = Ten ms, elapsed time = Ms.
SQL Server parse and compile time:
CPU time = 0 ms, Elapsed time = 0 Ms.

At the end of the output:

SQL Server Execution times:

CPU time = ms, Elapsed time = 387 Ms.

At the beginning of the output we can see two test times, but the first line takes the CPU time and total time required to perform an operation, but the second line does not seem to be the same.

"SQL Server parse and compile time" means that SQL Server resolves the "elect * from [Order Details]" command and places the resolved results into the SQL Server procedure buffer for SQL Server to use the required The CPU run time and the total time.

In this example, the CPU runs for 10 milliseconds and the total time is 61 milliseconds. Due to the different configuration and load of the server, the two values you get for CPU uptime and total time may differ from the test results in this example.

The second line of SQL Server parse and compile time means that SQL Server takes the parse result from the procedure buffer for execution, in most cases the two values will be 0 because the process executes fairly quickly.

If you run the SELECT * FROM [Order Details] command again without clearing the buffer, the CPU run time and compile time will be 0 because SQL Server reuses the parse result in the buffer, so there is no time to compile again.

Does this information really help you in the tuning of query performance? That may not be the case, but I'll explain the true meaning of this information and you'll be amazed that most DBAs don't really understand the implications of this information:

What we are most interested in is the time information displayed at the end of the output:

SQL Server Execution times:

CPU time = ms, Elapsed time = 387 Ms.

The information shown above shows how much CPU run time was used to execute the query and how much time was used to run the query. CPU uptime is a relatively stable measure of the CPU resources needed to run queries, and it has nothing to do with the CPU's level of free time. However, this number will be different each time the query is run, except that the range of changes does not vary greatly over time. The total time is the amount of time that is required to execute the query (no time to block or read data), and because the load on the server is constantly changing, the range of this data can sometimes be quite large.

Because CPU time is relatively stable, you can use this data as a way to measure whether your throttling measures improve query performance or reduce query performance.

Effect of SET STATISTICS IO

The output information for SET STATISTICS io is displayed at the end of the output, and here is an example of what it shows:

Table ' Order Details '. Scan count 1, logical reads, physical reads 1, Read-ahead reads 9.

Part of this information is very useful, and the other part is not, so let's look at each part and understand its implications:

Scan Count: The number of times the table that is involved in the query is accessed. In our case, the table is accessed only 1 times, and this information is not very useful because the connection command is not included in the query, but this information is useful if the query contains one or more connections.

The scan count value for a table outside a loop is 1, but for a table within a loop, its value is the number of cycles. As you can imagine, for a table within a loop, the smaller the scan count value, the less resources it uses, and the higher the performance of the query. Therefore, when adjusting the performance of a connected query, you need to pay attention to the value of scan count, and when adjusting, observe whether it is increased or decreased.

Logical Reads: This is the most useful data provided by the set STATISTICS IO or set STATISTICS time command. We know that SQL Server must first read the data into its data buffer before it can manipulate any data. In addition, we know when SQL Server reads data from the data buffer and reads the data into a page of size 8K bytes.

So what is the meaning of logical reads? Logical reads refers to the number of pages that SQL Server must read from the data buffer to get the results in the query. When executing a query, SQL Server does not read more or less data than is actually required, so when executing the same query on the same dataset, the resulting logical reads numbers are always the same.

Why is it important to know that the logical reads value of SQL Server when executing a query in tuning query performance? Because this number does not change each time the same query is executed. So, when it comes to tuning query performance, this is a good standard to measure the success of your adjustment measures.

When tuning the performance of the query, if the logical reads value drops, it indicates that the query uses fewer server resources and the query has improved performance. If the logical reads value is increased, the throttling action reduces the performance of the query. In the case of other conditions, the less logical reading a query uses, the more efficient it is, and the faster the query will become.

Physical Reads: What I'm going to say here may sound paradoxical at first, but as long as you think over and over, you will understand the true meaning.

Physical reading means that SQL Server must read the data it needs from disk to the data buffer before performing a real query operation. Before SQL Server starts executing a query, the first thing it does is to check if the data it needs is in the data buffer, if it reads from it, and if not, SQL Server must first read the data it needs from the disk into the data buffer.

As we can imagine, SQL Server requires more server resources to perform a physical read than to perform a logical read. Therefore, in an ideal situation, we should try to avoid physical reading operations.

This part of the following sounds easy to confuse. When tuning the performance of a query, you can ignore physical reads and focus only on logical reads. You're going to wonder, didn't you just say that physical reading requires more server resources than logical reading?

Indeed, the physical reads that SQL Server needs to perform queries are not likely to be reduced by performance tuning. Reducing the number of physical reads is an important task for DBAs, but it involves tuning the performance of the entire server, not just the performance of the query. When performing query performance tuning, we cannot control the size of the data buffer or how busy the server is, whether the data required to complete the query is in the data buffer or on disk, and the only data that we can control is the number of logical reads that need to be performed to get the results of the query.

Therefore, in the tuning of query performance, we can safely disregard the value of the physical read provided by the set STATISTICS IO command. (One way to reduce the number of physical reads and speed up SQL Server is to make sure that the server has enough physical memory.) ）

Read-ahead Reads: Like physical Reads, this value does not have any users in query performance tuning. Read-ahead reads represents the physical page that SQL Server reads when it executes a read-ahead mechanism. To optimize its performance, a pre-read data page may or may not be useful, depending on how accurately SQL Server predicts the data needs before SQL Server reads a subset of the data before it needs data.

In this case, the value of Read-ahead reads is 9,physical Read and the value of logical reads is 10, and there is a simple additive relationship between them. So what happens when I execute a query on the server? First, SQL Server begins to check that the data needed to complete the query is in the data buffer, and it quickly discovers that the data is not in the data buffer and initiates a read-ahead mechanism to read the first 9 of the 10 data pages it needs into the data buffer. When SQL Server checks that all the required data is already in the data buffer, it will find that there are already 9 data pages in the data buffer and one is not, and it will read the disk again and read the required page to the data buffer. Once all the data is in the data buffer, SQL Server can process the query.

What should we do?

As I mentioned at the beginning of this article, it is important to use some scientific criteria to measure the effectiveness of your adjustment measures when querying for performance. The problem is that the load on SQL servers is dynamically changing, and it is not a reasonable approach to use the total run time of the query to measure whether the performance of the query you are tuning is improving or not.

A better approach is to compare multiple data, such as the number of logical reads or the CPU time used by the query. Therefore, when tuning the performance of a query, you need to first use the set STATISTICS IO and set STATISTICS time commands to provide you with the necessary data to determine whether the measures you have tuned to the performance of the query have really got the purpose.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More