Using Profiler to optimize a SQL Server database system


Author: Su Youquan

Content on this page
Overview
Traditional methods for analyzing Profiler trace files and their limitations
The Read80Trace tool and its normalization function
Analyzing the normalized data with a stored procedure
Tips for using usp_getaccesspattern
Postscript: which database is the hottest?

Overview

When your SQL Server database system runs slowly, you probably already know that you can use SQL Server Profiler to trace and analyze it. Profiler can capture every statement sent to SQL Server together with each statement's execution performance data (such as the number of pages read and written, CPU usage, and duration) for later analysis. This article, however, is not about how to use Profiler itself. Instead, it describes how to combine the Read80Trace tool with a custom stored procedure to summarize and analyze the trace files Profiler captures, and finally produce a genuinely useful data analysis report that lets you optimize the SQL Server database system.

This article is particularly useful to readers who need to analyze the performance of large SQL Server database systems, such as DBAs. In a system with a heavy workload and complex application logic, the files Profiler generates are often huge: even if you configure Profiler to capture only basic statement events, the trace captured over two hours of running can easily reach gigabytes in size. The method described here not only saves a great deal of the time and effort needed to analyze such traces, it also frees you from wading through the raw data. It lets you understand the database system's access pattern, so that you know which classes of statements have the greatest impact on performance and which need to be optimized.
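For readers who prefer not to run the Profiler GUI against a busy production server, the same kind of lightweight capture can be scripted as a server-side trace. The following is only a minimal sketch, not part of the original article: it records batch-completed events with the columns the later analysis needs (TextData, Reads, Writes, CPU, Duration); the file path, rollover size, and capture duration are placeholders you would adjust.

declare @traceid int, @maxsize bigint, @on bit
set @maxsize = 500   -- rollover files of 500 MB each (placeholder)
set @on = 1

-- option 2 = TRACE_FILE_ROLLOVER; the path is a placeholder and gets a .trc extension automatically
exec sp_trace_create @traceid output, 2, N'c:\traces\mytrace', @maxsize
-- event 12 = SQL:BatchCompleted; column IDs: 1 TextData, 12 SPID, 13 Duration, 14 StartTime, 16 Reads, 17 Writes, 18 CPU
exec sp_trace_setevent @traceid, 12, 1,  @on
exec sp_trace_setevent @traceid, 12, 12, @on
exec sp_trace_setevent @traceid, 12, 13, @on
exec sp_trace_setevent @traceid, 12, 14, @on
exec sp_trace_setevent @traceid, 12, 16, @on
exec sp_trace_setevent @traceid, 12, 17, @on
exec sp_trace_setevent @traceid, 12, 18, @on
exec sp_trace_setstatus @traceid, 1   -- start capturing
-- ...let it run for a couple of hours, then stop and close the trace:
-- exec sp_trace_setstatus @traceid, 0
-- exec sp_trace_setstatus @traceid, 2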


Traditional methods for analyzing Profiler trace files and their limitations

First, let's talk about the access pattern of a database system. Besides using trace files to troubleshoot problems such as deadlocks, blocking, and timeouts, their most common and most important use is to extract the following information from the trace file:

1. The statements that run most frequently
2. The key statements that affect system performance
3. The proportions of the various statement classes and their performance statistics

By "access pattern" I mean exactly these three kinds of information. The modules of a database system are essentially fixed, and the way each module accesses SQL Server is also essentially fixed; the specific menus and buttons rarely change. Over a long enough period, the SQL statements sent to SQL Server and their relative proportions are therefore essentially fixed as well. In other words, as long as the Profiler sampling time is long enough (I generally run it for more than two hours), the database system's access pattern can be derived statistically from the trace file. Every database system has its own distinctive access pattern, and an important goal of analyzing a Profiler trace file is to find it. Once you have the access pattern, you can optimize the system with real confidence. Unfortunately, until now there has been no convenient tool for obtaining this information.

There are two traditional ways to analyze a trace. One is to use Profiler itself: for example, you can use its filter to show only statements that ran for more than 10 seconds, or sort by CPU to find the statements that consumed the most CPU. The other is to import the trace file into a database and analyze it with T-SQL statements. Both methods work for small trace files, but for large traces (for example, four large trace files) they have serious limitations. The first limitation is that, because the files are so large, analysis and aggregation are awkward and it is hard to get a global view of the execution performance of all statements; it is easy to be distracted by individual statements and spend your energy on them when they are not the key statements that really deserve attention. The second limitation is that, although many statements follow the same pattern and differ only in their parameter values, there is no simple way to group them together for statistics. In short, you cannot easily obtain the database system's access pattern, and you cannot work at the architectural level when optimizing. These are the limits of the traditional approaches. The Read80Trace tool and the custom stored procedure described below overcome them.
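For reference, the second traditional approach looks roughly like the sketch below (the file and table names are placeholders). It also makes the limitation concrete: because the raw TextData differs for every parameter value, a plain GROUP BY cannot merge statements that share the same pattern, so sorting is about all you can do.

-- load the trace into a table (SQL Server 2000 syntax; the path is a placeholder)
select * into trace_data
from ::fn_trace_gettable('c:\traces\mytrace.trc', default)

-- top statements by CPU; statements that differ only in parameter values stay on separate rows
select top 20 TextData, CPU, Reads, Writes, Duration
from trace_data
order by CPU desc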


The Read80Trace tool and its normalization function

Read80Trace is a command-line tool. Using it can greatly reduce the time spent analyzing trace files, with excellent results for the effort. Its basic operation is to read a trace file, normalize the statements in it, import the results into a database, and generate an HTML page of performance statistics. In addition, Read80Trace can generate RML files, which the companion ostress tool can then use to replay all the events in the trace file in multiple threads; this makes it possible to replay on another server the statements Profiler captured. This article does not cover Read80Trace or ostress in detail; interested readers can consult the accompanying documentation. The software (the download package is named RML) is available from the Microsoft website ( http://www.microsoft.com/downloads ).

What I want here is Read80Trace's normalization function. What is normalization? It groups together all statements that share the same statement pattern but have different parameter values. For example, suppose the trace contains these statements:

 
select * from authors where au_lname = 'white'
select * from authors where au_lname = 'green'
select * from authors where au_lname = 'carson'

After normalization, all three reduce to the same statement:

 
select * from authors where au_lname = {STR}

With the statements normalized, working out the database system's access pattern is no longer difficult. When running Read80Trace, I usually use the following command line:

 
read80trace -f -dmydb -Imytrace.trc

The -f switch suppresses generation of RML files, since I do not need the replay function. The RML files it would otherwise produce are large, so we recommend using -f whenever you do not need to replay the trace.

The -d switch tells Read80Trace to save the results of processing the trace file into the mydb database; the stored procedure we create later works against the tables Read80Trace generates in mydb. The -I switch specifies the trace file to analyze. Read80Trace is also clever about rollover files: if the directory contains a series of trace files produced by Profiler, such as mytrace.trc, mytrace1.trc, mytrace2.trc, and so on, it reads and processes them in sequence.
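Once the import finishes, a quick sanity check against mydb shows what Read80Trace produced. The two tables below are the ones the stored procedure in the next section relies on; the table and column names follow their usage there.

use mydb
-- one row per captured batch, with its reads/cpu/duration/writes
select count(*) as batch_count from tblBatches
-- one row per normalized statement pattern
select count(*) as pattern_count from tblUniqueBatches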

Beyond the switches described above, Read80Trace has a number of other interesting features. For example, the -I switch also accepts trace files inside ZIP or CAB archives, so you do not have to unpack them yourself. All the switches are described in detail in Read80Trace.chm. What I like most is its performance: analyzing a few gigabytes of trace files takes it less than an hour, even on my old machine with only a modest amount of memory. I am very satisfied with that.

You may be tempted to run Read80Trace on trace files generated by a stress test. I suggest analyzing trace files captured from the production environment instead. Many stress-testing tools cannot genuinely simulate the real environment, so the traces they produce do not reflect reality; some even just execute hand-written statements in a loop, which does not reflect the true access pattern at all. Treat a stress-test trace as a reference only.


Analyzing the normalized data with a stored procedure

With the statements normalized, a stored procedure can do the statistical analysis. The basic idea is to GROUP BY all statements of the same pattern and aggregate their reads, CPU, and duration, which yields the access pattern information:

1. For each class of statement: the total number of executions, the average reads (in pages), the average CPU time, and the average duration.
2. The share each class of statement takes of the whole workload, in terms of execution count, reads, CPU, and so on.

The definition of the stored procedure, with inline comments, is as follows:

/*************************************************************/
create procedure usp_getaccesspattern
@duration_filter int = -1   -- input parameter: filter the statistics by statement execution time
as begin

/* First get the overall performance totals for all statements */
declare @sum_total float, @sum_cpu float, @sum_reads float, @sum_duration float, @sum_writes float
select  @sum_total    = count(*)      * 0.01,  -- total number of statements
        @sum_cpu      = sum(cpu)      * 0.01,  -- total CPU time consumed by all statements
        @sum_reads    = sum(reads)    * 0.01,  -- total reads consumed by all statements (8 KB pages)
        @sum_writes   = sum(writes)   * 0.01,  -- total writes consumed by all statements (8 KB pages)
        @sum_duration = sum(duration) * 0.01   -- total execution time of all statements
from tblBatches                        -- table generated by Read80Trace; holds every statement in the trace file
where duration >= @duration_filter     -- optional filter on execution time

/* Then GROUP BY to get the share taken by each class of statement */
select  ltrim(str(count(*))) exec_stats,
        '' + str(count(*)/@sum_total, 4, 1) + '%' execratio,
        ltrim(str(sum(cpu))) + ' : ' + ltrim(str(avg(cpu))) cpu_stats,
        '' + str(sum(cpu)/@sum_cpu, 4, 1) + '%' cpuratio,
        ltrim(str(sum(reads))) + ' : ' + ltrim(str(avg(reads))) reads_stats,
        '' + str(sum(reads)/@sum_reads, 4, 1) + '%' readsratio,
        -- ltrim(str(sum(writes))) + ' : ' + ltrim(str(avg(writes))) writes_stats,
        -- '' + str(sum(writes)/@sum_writes, 4, 1) + '%' writesratio,
        ltrim(str(sum(duration))) + ' : ' + ltrim(str(avg(duration))) duration_stats,
        '' + str(sum(duration)/@sum_duration, 4, 1) + '%' durratio,
        textdata,
        count(*)/@sum_total tp,
        sum(cpu)/@sum_cpu cp,
        sum(reads)/@sum_reads rp,
        sum(duration)/@sum_duration dp
into #queries_staticstics
from
        /* tblUniqueBatches stores all the normalized statements */
        (select reads, cpu, duration, writes, convert(varchar(2000), normtext) textdata
         from tblBatches
         inner join tblUniqueBatches on tblBatches.hashid = tblUniqueBatches.hashid
         where duration > @duration_filter) b
group by textdata   -- this GROUP BY is the key step: it classifies and aggregates the statements

print 'Top 10 order by cpu+reads+duration'
select top 10 * from #queries_staticstics order by cp + rp + dp desc

print 'Top 10 order by cpu'
select top 10 * from #queries_staticstics order by cp desc

print 'Top 10 order by reads'
select top 10 * from #queries_staticstics order by rp desc

print 'Top 10 order by duration'
select top 10 * from #queries_staticstics order by dp desc

print 'Top 10 order by batches'
select top 10 * from #queries_staticstics order by tp desc

end
/*************************************************************/

Because the output would otherwise be too wide, the writes columns are commented out of the stored procedure; most database systems are dominated by reads anyway. You can easily modify the stored procedure to include writes; a rough way to check whether your workload warrants it is sketched below.
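Before re-enabling the writes columns, a one-off aggregate over the same tblBatches table gives a rough idea of whether the system really is read-dominated (a sketch only; reads and writes here are the page counts Read80Trace imported):

select sum(reads)  as total_reads,
       sum(writes) as total_writes,
       sum(writes) * 100.0 / nullif(sum(reads) + sum(writes), 0) as write_pct
from tblBatches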

The stored procedure itself is not complex and is easy to follow. The statistics land in the #queries_staticstics temporary table and are then output sorted by different criteria. For example:

 
select top 10 * from #queries_staticstics order by cp desc

This statement sorts the rows in #queries_staticstics by cp, the share of total CPU consumed by each class of statement (that is, sum(cpu)/@sum_cpu), and outputs them in descending order. When you are analyzing a server's CPU performance, this quickly locates the classes of statements that consume the most CPU.

Now let's look at the output from a real case:

 
use mydb
exec usp_getaccesspattern
/* You can pass an execution time in milliseconds as a filter parameter, for example: exec usp_getaccesspattern 1000 */

Output result 1 (partial; because the original output is very wide, it is split into two parts for readability):

Figure 1: sample output result 1

The example above was sampled from the business system of a large company. The complaint was that the application was slow and the SQL Server machine's CPU usage was high (fluctuating between 90% and 100% across all CPUs). I used the PSSDIAG tool to sample for about two hours, then ran Read80Trace and usp_getaccesspattern to get the result above. The report is clear at a glance: the stored procedure dbo.x_dedup_proc ran 75 times in two hours, yet accounted for 90.8% of the CPU and 94.6% of the reads. From the access pattern's point of view, this stored procedure is the key statement behind the high CPU and the slow system; once it is optimized, the performance problem is solved. You might doubt that 75 executions in two hours can matter, but look at the procedure's average CPU time: 681961 milliseconds, roughly 11 minutes. That means one CPU can run it at most (60*2)/11 ≈ 10 times in two hours. The system has 8 CPUs in total, so even if every CPU did nothing but run this statement, the ceiling would be about 10*8 = 80 executions. The observed 75 executions show the procedure was running essentially non-stop.

So which statements does the system run most frequently? Another part of the result is shown in Figure 2:

Figure 2: sample output result 2

From the table above, we can see that the statement run most frequently is:

 
Use XB set quoted_identifier, ansi_null_dflt_on...

Obviously, this is just a connection-environment configuration batch and has no analytical value. The other two statements, which together account for 8.2% of the total statement count, are worth noting:

 
select count(*) from x_process_stats where process ...
select count(*) from x_process_stats where process ...

In this example, however, the key statement dbo.x_dedup_proc is so dominant that even these two statements can safely be ignored.
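If connection-setup noise like the SET batch above does get in the way in other systems, one hypothetical refinement is to discard it from the temporary table inside usp_getaccesspattern, just before the top-10 selects, for example:

-- hypothetical tweak inside usp_getaccesspattern, placed before the "top 10" selects:
delete from #queries_staticstics
where textdata like '%quoted_identifier%'   -- drop environment-configuration batches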

Let's look at another example (Figure 3):

Figure 3: sample output result 3

In this example, the key statements are the following two:

 
select count(*) from gtbl7ms
select caseno from patientdata_sum where mrn = @p1

Subsequent checks showed that the relevant tables lacked useful indexes; once indexes were added, performance improved immediately. Resolving these two statements is not the end, though: you continue analyzing and optimizing with the same method until system performance is acceptable. Note that performance tuning is a long-term process; you are unlikely to solve every problem within a day or two. You may well fix 80% of the problems at the start, and then spend the other 80% of the time chasing the remaining 20%.
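The article does not show the fix itself. Going by the second normalized statement above, the missing index would be something like the sketch below; the table and column names come from that statement, and the index name is made up.

-- hypothetical covering index for: select caseno from patientdata_sum where mrn = @p1
create index IX_patientdata_sum_mrn on patientdata_sum (mrn, caseno)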


Tips for using usp_getaccesspattern

The report produced by usp_getaccesspattern contains a wealth of information. When analyzing it, keep the overall picture in mind and pick out the parts relevant to your problem. If the system is CPU-bound, focus on the statement classes with the highest CPU share; if it is disk-I/O-bound, look for the classes with a large share of reads and a high average reads value. Bear in mind that the statements that run most often are not necessarily the ones that deserve your attention; the ideal case is when the key statements are also the most frequent ones. And sometimes, even if the most frequent statements do not consume a large share of resources, a small optimization to them can still bring considerable benefit because their frequency amplifies the gain.

When using usp_getaccesspattern, make good use of the @duration_filter parameter. Since it is expressed in milliseconds, I recommend values no smaller than 1000, preferably multiples of 1000. This parameter often produces very interesting output that partially overlaps the result of running with no filter, and the overlapping statements are usually the ones that deserve attention. Note that the most frequently executed statements do not necessarily run for more than 1000 milliseconds, so the filtered results may miss them entirely. I usually cross-analyze four result sets at once: one produced with no filter, and the other three produced with progressively larger duration filters (5000 milliseconds, for example). Comparing the four usually gives me a clear and thorough picture of the database system's access pattern, as sketched below.
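Concretely, the cross-analysis is just several runs of the procedure with different filters; a sketch (1000 and 5000 milliseconds are the values mentioned above, the last filter is an arbitrary larger value of your choosing):

use mydb
exec usp_getaccesspattern           -- no filter: the full access pattern
exec usp_getaccesspattern 1000      -- only statements that ran for at least 1 second
exec usp_getaccesspattern 5000      -- only statements that ran for at least 5 seconds
exec usp_getaccesspattern 10000     -- arbitrary larger filter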

When running the stored procedure you may hit an int overflow error. This happens because some of the numeric columns of tblBatches (reads, cpu, writes) are int rather than bigint. You can fix it with the following statements:

 
alter table tblbatches alter column reads bigint
alter table tblbatches alter column cpu bigint
alter table tblbatches alter column writes bigint

After modification, the overflow problem will be solved.


Postscript: which database is the hottest?

This article is nearly done. You now know how to use Read80Trace and usp_getaccesspattern to obtain a database system's access pattern, and how to read the access-pattern report from a global viewpoint, so that you can optimize the system with the full picture in hand.

In addition, the same analysis idea can be applied to work out how much of the system's resources each database consumes. This is useful when the SQL Server instance hosts multiple databases: the report tells you immediately which database is the "hottest" one, consuming the most system resources. The statement is as follows:

print 'Group by dbid'

declare @sum_total float, @sum_cpu float, @sum_reads float, @sum_duration float, @sum_writes float
select  @sum_total    = count(*)      * 0.01,
        @sum_cpu      = sum(cpu)      * 0.01,
        @sum_reads    = sum(reads)    * 0.01,
        @sum_writes   = sum(writes)   * 0.01,
        @sum_duration = sum(duration) * 0.01
from tblBatches

select  dbid,
        ltrim(str(count(*))) exec_stats,
        '' + str(count(*)/@sum_total, 4, 1) + '%' execratio,
        ltrim(str(sum(cpu))) + ' : ' + ltrim(str(avg(cpu))) cpu_stats,
        '' + str(sum(cpu)/@sum_cpu, 4, 1) + '%' cpuratio,
        ltrim(str(sum(reads))) + ' : ' + ltrim(str(avg(reads))) reads_stats,
        '' + str(sum(reads)/@sum_reads, 4, 1) + '%' readsratio,
        ltrim(str(sum(duration))) + ' : ' + ltrim(str(avg(duration))) duration_stats,
        '' + str(sum(duration)/@sum_duration, 4, 1) + '%' durratio,
        count(*)/@sum_total tp,
        sum(cpu)/@sum_cpu cp,
        sum(reads)/@sum_reads rp,
        sum(duration)/@sum_duration dp
into #queries_staticstics_groupbydb
from (select reads, cpu, duration, writes, convert(varchar(2000), normtext) textdata, dbid
      from tblBatches
      inner join tblUniqueBatches on tblBatches.hashid = tblUniqueBatches.hashid) b
group by dbid
order by sum(reads) desc

select dbid, execratio batches, cpuratio cpu, readsratio reads, durratio duration
from #queries_staticstics_groupbydb

Here is an example of the output of the statement above:

 
dbid   batches   cpu     reads   duration
----   -------   -----   -----   --------
37      21.1%    18.7%   29.1%    27.1%
33      12.7%    32.4%   19.5%    24.8%
36       5.6%    28.3%   15.6%    26.1%
20      53.9%     2.9%   14.2%     2.1%
...      0.8%     7.2%   13.2%     6.6%
...      1.0%      ...     ...      ...
16       0.0%     1.5%    1.9%     0.7%
35       2.0%     2.7%    1.8%     5.7%
7        0.1%     0.1%    1.1%      ...

The results clearly show that the databases with IDs 37, 33, and 36 are the most active ones. An interesting detail is that database 20 issues 53.9% of all statements, yet takes up only a small share of the system's resources.
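The report shows only database IDs. If you run the follow-up query below in the same session as the script above (and on the server the trace came from, so the IDs resolve to the right databases), db_name() translates them into names:

select dbid, db_name(dbid) as database_name,
       execratio as batches, cpuratio as cpu, readsratio as reads, durratio as duration
from #queries_staticstics_groupbydb
order by rp desc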
