The first thing to note is that the content of this article is not how to adjust the SQL Server query performance (you can write a book about this content ), instead, how can we use the set statistics Io and set statistics time frequently ignored Transact-SQL commands in the adjustment of SQL Server query performance.
On the surface, it is very simple to adjust the query performance. In essence, we hope that the query speed can be as fast as possible, whether it is to reduce the query running time from 10 minutes to 1 minute, we can also reduce the running time from 2 seconds to 1 second. Our goal is to reduce the running time.
Although the query performance is difficult to adjust for many reasons, this article will only cover one of the reasons, the most important of which is that whenever the use environment changes, therefore, it is difficult to figure out how to adjust the query performance.
If performance tests are performed on a tested server like most users, the results are often not very satisfactory, because the test server environment is not exactly the same as the actual application server environment. As requirements for resources change, SQL Server automatically adjusts itself.
If you have any questions about this, you can run the same query repeatedly on a server with a large load. In most cases, the query execution time is different. Of course, the gap is not big, but the change is enough to make the performance adjustment more difficult than it should be.
What exactly is this? Is your idea wrong or is the server overloaded when running the query? Is this the reason for the increase in running time? Although you can run the query multiple times to obtain an average time, the workload is very large. We need to use a very scientific standard to compare the performance of each test.
Measuring server resources is critical to solving query performance adjustment problems
Many types of server resources are used for query on the server. One of these resources is the CPU usage time. If the database has not changed, it is very close to running the same query repeatedly. Here, I am not referring to the time from the start to the end of a query, but to the number of CPU resources required to run this query, the time required to run a query depends on the server's busyness.
Another resource required by SQL Server is I/O. No matter when the query is run, SQL Server must read data (logical read) from the data buffer. If the required data is not in the buffer, read from the disk (physical read ).
From the discussion, we can know that the more CPU and IO resources a query requires, the slower the query speed. Therefore, another way to describe a query performance tuning task is, the query command should be rewritten in a way that uses less CPU and IO resources. If the query can be completed in this way, the query performance will be improved.
If the purpose of tuning query performance is to make it use as few server resources as possible, rather than the minimum query Run Time, therefore, it is easier to test whether the query performance is improved or reduced. This is especially true for servers with constantly changing resource utilization. First, we need to figure out how to test the resource usage of our servers when adjusting queries.
It reminds me of set statistics Io and set statistics time.
SQL Server supports the set statistics Io and set statistics time statements a long time ago. However, for other reasons, when tuning query performance, many DBAs (whose data is the system administrator) Ignore them. Maybe they are not very attractive. However, for whatever reason, we will find that they are useful in tuning query performance.
There are three ways to use these two Commands: using the transact-SQL command line method, using query analyzer, and setting the appropriate connection properties for the current connection in query analyzer. In this article, we will use the transact-SQL command line to demonstrate their usage.
The functions of set statistics Io and set statistics time are as follows: You can enable or disable various report information of resources used for query. These settings are disabled by default. Let's first look at an example of how to open these commands and see what information they will report.
Before starting our example, start query analyzer and connect to an SQL Server. In this example, we will use the northwind database and use it as the default database for this connection.
Then, run the following query:
Select * from [Order Details]
If you have not modified the Order Details table, 2155 records will be returned for this query. This is a typical result. I believe you have seen it many times in query analyzer.
Now we can run the same query. However, before running this query, we will first run the set statistics Io and set statistics time commands. Note that the opening of these two commands is only valid for the current connection. After one or two commands are opened, after closing the current connection and opening a new connection, you need to execute the corresponding command again. If you want to disable the two commands in the current connection, you only need to replace on in the original command with off and execute it again.
Before starting our example, run the following two commands (do not execute them on the server in use). These two Commands will clear the data and process buffer of SQL Server, this allows us to execute a query at the same starting point each time. Otherwise, the results of each query execution will not be comparable:
DBCC dropcleanbuffers
DBCC freeproccache
Enter and run the following Transact-SQL command:
Set statistics Io on
Set statistics time on
Once the preceding preparation is complete, run the following query:
Select * from [Order Details]
If you run all the above commands at the same time, your output will be different from mine, and it is difficult to figure out what happened.
After running the preceding command, the new information that has not been seen before will be displayed in the result window. At the top of the window, the following information will be displayed:
SQL Server parse and compile time: (SQL Server Parsing and Compilation Time :) CPU time = 10 MS, elapsed time = 61 Ms. SQL Server parse and compile time: (SQL Server Parsing and Compilation Time :) CPU time = 0 MS, elapsed time = 0 ms. |
After the above data is displayed, the queried records are displayed. After 2155 records are displayed, the following information is displayed:
Table 'order details'. Scan count 1, logical reads 10, physical reads 1, read-ahead reads 9. (Table: Order details, number of scans 1, logical read 10, physical read 1, read 9 in advance) SQL Server execution times: (SQL Server execution time :) CPU time = 30 MS, elapsed time = 387 Ms. |
(The results may be different each time. We will mention this in the following discussion .)
So what is the specific meaning of the information? Next we will analyze it in detail.
Result of set statistics time
The Set statistics time command is used to test the running time of various operations, some of which may be of little use for tuning query performance. Run this command to display the following information on the screen:
At the beginning of the output:
SQL Server parse and compile time: CPU time = 10 MS, elapsed time = 61 Ms. SQL Server parse and compile time: CPU time = 0 MS, elapsed time = 0 ms. |
Output end:
SQL Server execution times:
CPU time = 30 MS, elapsed time = 387 Ms.
At the beginning of the output, we can see the second test time, but the CPU time and total time required for the first line to execute a certain operation, but the second line does not seem to be the same.
"SQL Server parse and compile time" indicates that SQL Server parses the "elect * from [Order Details]" command and puts the parsed result in the SQL server process buffer for SQL Server to use. CPU running time and total time.
In this example, the CPU running time is 10 ms, and the total time is 61 Ms. Because the server configuration and load are different, the CPU running time and total time values may be different from the test results in this example.
"SQL Server parse and compile time" in the second line indicates the time for SQL Server to extract the parsing result from the process buffer for execution. In most cases, the two values are 0, because this process is executed quite fast.
If you run the select * from [Order Details] command again without clearing the buffer, the CPU running time and Compilation Time are both 0, because SQL server will reuse the parsing result in the buffer, therefore, you do not need to re-compile the time.
Is it really helpful for you to adjust the query performance? Maybe not, but I will explain the true meaning of the information. You will be surprised that most DBAs do not really understand the meaning of the information:
We are most interested in displaying the last time information in the output:
SQL Server execution times:
CPU time = 30 MS, elapsed time = 387 Ms.
The information shown above indicates the CPU running time used for executing this query and the running query time. The CPU running time is a relatively stable Measurement Method for the CPU resources required to run the query. It has nothing to do with the idle CPU usage. However, this number varies each time you run a query, but the change range does not have a large change in the total time. The total time is the time required for query execution (no blocking or data reading time is calculated). Because the server load is constantly changing, therefore, the scope of this data change is sometimes quite large.
Because the CPU usage time is relatively stable, you can use this data as a way to measure whether your adjustment measures improve the query performance or reduce the query performance.
Set statistics Io Effect
The output information of set statistics Io is displayed at the end of the output. An example is shown below:
Table 'order details'. Scan count 1, logical reads 10, physical reads 1, read-ahead reads 9.
One part of the information is very useful, and the other part is not. Let's take a look at each part and understand its meaning:
Scan count: the number of times the table involved in the query is accessed. In our example, the table is only accessed once. This information is not very useful because the query does not include the connection command, however, this information is useful if the query contains one or more connections.
The scan Count value of a table outside a cycle is 1, but the value of scan count is the number of cycles for a table in a cycle. As you can imagine, for a table in a loop, the smaller the scan Count value, the fewer resources it uses, and the higher the query performance. Therefore, when adjusting the performance of a query with a connection, you need to pay attention to the value of scan count. When adjusting the performance, observe whether it increases or decreases.
Logical reads: This is the most useful data provided by the set statistics Io or set statistics time command. We know that before SQL server can operate on any data, it must first read the data to its data buffer. In addition, we also know when SQL server will read data from the data buffer and read the data to a page of 8 K bytes.
So what is the significance of logical reads? Logical reads refers to the number of pages that SQL server must read from the data buffer to obtain the query results. When a query is executed, SQL server does not read more or less data than the actual requirement. Therefore, when the same query is executed on the same dataset, the resulting logical reads numbers are always the same.
Why is it important to know the logical reads value when SQL Server executes the query during query performance adjustment? This is because this value does not change every time the same query is executed. Therefore, when tuning the query performance, this is a good standard for measuring whether your adjustment measures are successful.
When you adjust the query performance, if the logical reads value drops, it indicates that the server resources used for the query are reduced, and the query performance is improved. If the logical reads value is increased, the adjustment measures reduce the query performance. When other conditions remain unchanged, the less logical reads a query uses, the higher the efficiency, and the faster the query speed.
Physical reads.
Physical read means that before performing a real query operation, SQL Server must read the required data from the disk to the data buffer. Before SQL server starts executing a query, the first thing it needs to do is to check whether the data it needs is in the data buffer. If it is in, it will read the data from it. If it is not, SQL Server must first read the required data from the disk to the data buffer.
As we can imagine, SQL server requires more server resources when performing physical reads than executing logical reads. Therefore, we should avoid physical read operations as much as possible.
The following part sounds confusing. When tuning the query performance, you can ignore physical reads and focus only on logical reads. You will be bored. Didn't you say that physical reads require more server resources than logical reads?
This is indeed the case. The number of physical reads required by SQL Server during query execution cannot be reduced through performance adjustment. Reducing the number of physical reads is an important task of DBA, but it involves adjusting the performance of the entire server, not just the query performance. When adjusting the query performance, we cannot control the size of the data buffer, the server's busyness, and whether the data required to complete the query is in the data buffer or on the disk, the only data we can control is the number of logical reads required to obtain the query results.
Therefore, in the query performance adjustment, we can safely ignore the physical read value provided by the set statistics Io command. (One way to reduce the number of physical reads and speed up the running of SQL Server is to ensure that the physical memory of the server is sufficient .)
Read-ahead reads: similar to physical reads, this value does not have any users in query performance tuning. Read-ahead reads indicates the physical page that the SQL Server reads when executing the pre-read mechanism. To optimize its performance, SQL Server reads a part of data in advance before considering that it needs data. Based on the Accuracy predicted by SQL Server for data requirements, pre-read data pages may be useful or useless.
In this example, the value of read-ahead reads is 9, the value of physical read is 1, and the value of logical reads is 10. There is a simple addition relationship between them. So what is the Query Process on the server? First, SQL Server checks whether the data required to complete the query is in the data buffer. It will soon find that the data is not in the data buffer, the pre-read mechanism is enabled to read the first nine of the 10 Data Pages required by the system to the data buffer zone. When SQL Server checks whether all the required data is already in the data buffer, it will find that nine data pages are already in the data buffer, and one is not, it will immediately read the disk and read the required pages to the data buffer zone. Once all the data is in the data buffer, SQL Server can process the query.
What should we do?
At the beginning of this article, I mentioned that it is very important to use scientific standards to measure the effectiveness of your adjustment measures when tuning the query performance. The problem is that the load of SQL servers is dynamically changing. It is not a reasonable way to use the total running time of the query to measure whether the query performance you are tuning is improved or not.
A better way is to compare multiple data items, such as the number of logical reads or the CPU time used for query. Therefore, to adjust the query performance, you must first use the set statistics Io and set statistics time commands to provide you with the necessary data, in order to determine whether your measures to adjust the query performance are actually achieved.