Solving performance problems in a database system can be a daunting task. Knowing how to find the problem is important, but it is more important to understand why the system responds the way it does to specific requests. Many factors affect CPU utilization on a database server: compilation and recompilation of SQL statements, missing indexes, multithreaded operations, disk bottlenecks, memory bottlenecks, routine maintenance, and extraction, transformation, and loading (ETL) activity, among others. Using the CPU is not in itself a bad thing; executing tasks is what the CPU is for. The key to healthy CPU utilization is making sure the CPU spends its time on the work you need done, rather than wasting cycles on poorly optimized code or slow hardware.
Two ways to achieve the same goal
In short, there are two ways to assess CPU performance. The first is to examine the hardware performance of the system, which helps determine where to start with the second approach: examining the server's query efficiency. The second approach is usually the more effective way to identify SQL Server performance issues, but unless you know exactly where the query performance problem lies, you should always start with a system-level performance assessment. Ultimately, the two approaches tend to converge on the same answer. Let's start with some basics so we can look at both approaches.
Understanding the Basics
The way Hyper-Threading affects SQL Server makes it a very interesting topic. Hyper-Threading presents two logical processors to the operating system for each physical processor, essentially leveraging idle time on the physical processor so that each processor is more fully utilized. The Intel Web site (intel.com/technology/platform-technology/hyper-threading/index.htm) provides a more comprehensive overview of how Hyper-Threading works.
For SQL Server systems, the DBMS manages its own extremely efficient queues and threads, so Hyper-Threading helps only when CPU utilization is already high and the physical CPUs on the system are overloaded. When SQL Server queues task requests on more than one scheduler, the operating system has to switch thread contexts back and forth on the physical processor to satisfy the incoming requests, even when the two logical processors sit on the same physical processor. If Context Switches/sec exceeds 5,000 per physical processor, I strongly recommend you consider turning Hyper-Threading off and retesting performance.
Applications rarely use Hyper-Threading efficiently when CPU utilization on SQL Server is high. Before implementing changes on a production system, always test the application against SQL Server with Hyper-Threading both enabled and disabled.
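If you are unsure how the processor configuration looks from SQL Server's point of view, SQL Server 2005 exposes it through the sys.dm_os_sys_info DMV. A quick sketch (note that hyperthread_ratio counts logical CPUs per physical socket, so it also reflects multi-core chips, not just Hyper-Threading):

```sql
-- Compare logical CPUs with the physical socket count (SQL Server 2005 and later).
-- hyperthread_ratio > 1 means multiple logical CPUs per socket; treat the result
-- as a starting point, not proof that Hyper-Threading is enabled.
SELECT cpu_count                     AS logical_cpus,
       hyperthread_ratio             AS logical_per_socket,
       cpu_count / hyperthread_ratio AS physical_sockets
FROM sys.dm_os_sys_info;
```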
A high-end dual-core processor is certainly faster than the RAM in a computer, and RAM in turn is faster than attached storage. A good CPU can handle roughly six times the throughput of today's top-end DDR2 memory, and about twice that of top-end DDR3 memory. Typical memory throughput is roughly 10 times that of the fastest Fibre Channel drives. A hard disk can perform only a limited number of IOPS (input/output operations per second), a figure bounded entirely by the number of seeks the drive can perform per second. To be fair, it is uncommon for a single storage drive to handle all the storage needs of an enterprise database system; most organizations now provision their enterprise database servers with a storage area network (SAN) or larger RAID groups, which eliminate or minimize disk I/O pressure on the processor. Most importantly, regardless of the size of your organization, disk and memory bottlenecks can affect the performance of your processors.
Because of the difference in I/O speed, retrieving data from disk costs far more than retrieving it from memory. A data page in SQL Server is 8KB, and an extent consists of eight 8KB pages, so an extent is 64KB. This is important to understand because when SQL Server requests a particular data page from disk, it retrieves not just that page but the entire extent the page resides in. There are in fact various factors that make this more cost-effective for SQL Server, but I won't elaborate on them here. At its best, fetching a cached data page from the buffer pool should take less than half a millisecond; in an optimal environment, retrieving a single extent from disk takes 2 to 4 milliseconds. In general, I expect reads from a well-functioning, healthy disk subsystem to take between 4 and 10 milliseconds. Retrieving a data page from memory is therefore typically 4 to 20 times faster than retrieving it from disk.
When SQL Server requests a data page, it checks the in-memory buffer cache before looking to the disk subsystem. If the page is in the buffer pool, the processor retrieves the data and performs the requested work; this is called a soft page fault. Soft page faults are ideal for SQL Server, because data retrieved as part of a request must be in the buffer cache before it can be used. A data page not found in the buffer cache must be retrieved from the server's disk subsystem; when the operating system has to fetch the page from disk, it is called a hard page fault.
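To see which databases are occupying the buffer pool (and therefore benefiting from soft page faults), the SQL Server 2005 sys.dm_os_buffer_descriptors DMV can be summarized as in this sketch; the arithmetic mirrors the 8KB page size described above:

```sql
-- Buffer pool usage per database: each row in the DMV is one cached 8KB page.
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*)             AS cached_pages,
       COUNT(*) * 8 / 1024  AS cached_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_pages DESC;
```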
When it comes to correlating memory, disk, and CPU performance, one common measure helps us see the whole picture: throughput. In less formal terms, throughput is a measure of how much data can be pushed through a pipeline of finite size.
Route 1: System performance
In fact, there are only a few ways to determine whether a server is experiencing a CPU bottleneck, and not many potential factors lead to high CPU utilization. Some of these issues can be tracked with PerfMon or a similar system monitoring tool, others with SQL Server Profiler or similar tools, and still others with SQL commands issued through Query Analyzer or SQL Server Management Studio (SSMS).
The basic principle I use when evaluating system performance is "investigate broadly, then focus in": only after identifying the scope of the problem can you concentrate your deeper research. After assessing overall CPU utilization with a tool like PerfMon, you can use the same tool to look at two very simple, easy-to-understand performance counters.
One of the most familiar performance counters is % Processor Time; it is the counter highlighted when you open the Add Counters window in PerfMon. % Processor Time is the percentage of time the processor spends busy executing tasks. The processor is generally considered highly utilized when this value sits at 80% or more for most of the peak operating period. Even when the server runs below 80% utilization, you will typically see spikes to 100%, which you should expect.
The other counter to look at is Processor Queue Length, found under the System performance object in PerfMon. Processor Queue Length shows the number of threads waiting to execute on the CPU. SQL Server manages its work through schedulers in the database engine, queuing and processing requests there; because it manages its own tasks, it uses only a single CPU thread per logical processor. This means that on a system dedicated to SQL Server, the number of threads waiting in the processor queue should be minimal. Typically, on a dedicated SQL Server, the queue should not exceed five times the number of physical processors, though I consider anything over twice the number of processors a problem. On servers where the DBMS shares the system with other applications, you should view this counter together with the % Processor Time and Context Switches/sec performance counters (I will describe context switching shortly) to determine whether to move other applications or the DBMS to a different server.
Having covered processor queues and high CPU utilization, let's look at the Compilations/sec and Re-Compilations/sec counters under the SQL Server:SQL Statistics performance object (see Figure 1). Compiling and recompiling query plans adds to the system's CPU utilization. You would like to see Re-Compilations/sec close to 0, but observing trends over time lets you determine the server's usual behavior and its normal number of compilations. Recompiles cannot always be avoided, but queries and stored procedures can be refined to minimize recompilation and to reuse query plans. Compare these values with the actual SQL statements arriving at the system via Batch Requests/sec (also found under the SQL Server:SQL Statistics performance object). If compilations and recompilations per second make up a high percentage of the batch requests entering the system, this is an area to examine. In some cases, SQL developers may not realize how and why their code contributes to this kind of system resource problem. Later in this article, I provide some references to help you minimize this behavior.
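The same counters PerfMon exposes can also be read from inside SQL Server 2005 through the sys.dm_os_performance_counters DMV. This sketch pulls the three counters discussed above so they can be compared side by side; the values are cumulative, so sample twice and take the difference to get per-second rates:

```sql
-- Cumulative compilation and batch counters; diff two samples for actual rates.
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%SQL Statistics%'
  AND counter_name IN ('Batch Requests/sec',
                       'SQL Compilations/sec',
                       'SQL Re-Compilations/sec');
```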
Figure 1 Selecting counters for monitoring
In PerfMon, check the performance counter named Context Switches/sec (see Figure 2). This counter shows how often threads must be taken off the operating system scheduler (not the SQL scheduler) so that work can be performed for other waiting threads. Context switches are often more frequent on database systems shared with other applications, such as IIS or vendor application server components. The Context Switches/sec threshold I use is about 5,000 times the number of processors in the server. This threshold can also run higher on systems that have Hyper-Threading enabled along with high CPU utilization. If CPU utilization and context switches both exceed their respective thresholds frequently and at the same time, this indicates a CPU bottleneck. If this happens regularly and your system is aging, you should start planning for more or faster CPUs. For more information, see the "Hyper-Threading" sidebar.
Figure 2 Performance counters to note

| Counter | Threshold | Notes |
| --- | --- | --- |
| % Processor Time | > 80% | Potential causes include insufficient memory, low query plan reuse rates, and non-optimized queries. |
| Context Switches/sec | > 5,000 x number of processors | Potential causes include other applications on the server, more than one instance of SQL Server running on the same server, and Hyper-Threading turned on. |
| Processor Queue Length | > 5 x number of processors | Potential causes include other applications on the server, frequent compilations or recompilations, and more than one instance of SQL Server running on the same server. |
| Compilations/sec | | Compare with Batch Requests/sec. |
| Re-Compilations/sec | | Compare with Batch Requests/sec. |
| Batch Requests/sec | | Compare with the number of compilations and recompilations per second. |
| Page Life Expectancy | < 300 | Potential for memory pressure. |
| Lazy Writes/sec | | Potential for large data-cache flushes or memory pressure. |
| Checkpoint Pages/sec | | Evaluate checkpoints against PLE and Lazy Writes/sec. |
| Cache Hit Ratio: SQL Plans | | A low value indicates low plan reuse. |
| Buffer Cache Hit Ratio | | Potential for memory pressure. |
When CPU utilization is high, you should also monitor the SQL Server lazy writer (in SQL Server 2000) or resource monitor (in SQL Server 2005). Flushing the buffer cache and procedure cache through the resource monitor threads consumes CPU time. The resource monitor is the SQL Server process that determines which pages to keep and which to flush from the buffer pool to disk. Each page in the buffer and procedure caches is initially assigned a cost representing the resources consumed in placing it into the cache; the resource monitor decrements that cost each time it scans the page. When a request needs cache space, pages are flushed based on their associated costs, lowest-cost pages first. Resource monitor activity can be tracked with the Lazy Writes/sec performance counter under the SQL Server:Buffer Manager object in PerfMon. You should trend this value to determine the typical threshold on your system, and view it along with the Page Life Expectancy and Checkpoint Pages/sec counters to determine whether memory is insufficient.
The Page Life Expectancy (PLE) counter helps determine whether memory is insufficient. PLE shows how long a data page stays in the buffer cache. The accepted industry threshold for this counter is 300 seconds. If the value averages below 300 seconds for a sustained period, data pages are being flushed from memory too frequently, which puts a heavier load on the resource monitor and drives up processor activity. PLE should be evaluated together with the Checkpoint Pages/sec counter. When a checkpoint occurs, dirty data pages in the buffer cache are flushed to disk, lowering the PLE value. The resource monitor process is the mechanism that actually writes these pages to disk, so you should also see Lazy Writes/sec rise during these checkpoints. If the PLE value climbs back immediately after a checkpoint completes, you can ignore the momentary dip. On the other hand, if you find PLE routinely below the threshold, adding memory is a good way to alleviate the problem while freeing some resources back to the CPU. All of these counters are found under the SQL Server:Buffer Manager performance object.
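These buffer manager counters can likewise be queried from within the server. A sketch using sys.dm_os_performance_counters (SQL Server 2005); note that Page life expectancy is a point-in-time value, while Lazy writes/sec and Checkpoint pages/sec are cumulative and need two samples to convert to rates:

```sql
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Page life expectancy',
                       'Lazy writes/sec',
                       'Checkpoint pages/sec');
```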
Route 2: Query performance
When tracing SQL Server applications, you should familiarize yourself with the trace stored procedures. Tracing through the GUI (SQL Server Profiler) can add 15% to 25% of load to the system; tracing through the stored procedures instead cuts that overhead roughly in half.
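As a rough illustration of the lower-overhead approach, here is a minimal server-side trace sketch using the sp_trace_* stored procedures. The file path is an assumption, and a real trace would register many more events and columns (event 12 is SQL:BatchCompleted; columns 1 and 18 are TextData and CPU):

```sql
DECLARE @TraceID int, @maxfilesize bigint, @on bit;
SET @maxfilesize = 50;   -- MB per rollover file
SET @on = 1;

-- Create the trace; SQL Server appends .trc to the file name automatically.
EXEC sp_trace_create @TraceID OUTPUT, 0, N'S:\Traces\cpu_trace', @maxfilesize, NULL;

-- Capture statement text and CPU for completed batches.
EXEC sp_trace_setevent @TraceID, 12, 1,  @on;   -- SQL:BatchCompleted / TextData
EXEC sp_trace_setevent @TraceID, 12, 18, @on;   -- SQL:BatchCompleted / CPU
EXEC sp_trace_setstatus @TraceID, 1;            -- 1 = start the trace
```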
I run the following query when I know the system has a bottleneck somewhere and I want to determine which SQL statements are causing the server trouble. It shows individual statements, the resources they are currently using, and which statements should be reviewed to improve performance. For more information about SQL tracing, see msdn2.microsoft.com/ms191006.aspx.
```sql
SELECT SUBSTRING(st.text, qs.statement_start_offset / 2,
           (CASE WHEN qs.statement_end_offset = -1
                 THEN LEN(CONVERT(nvarchar(max), st.text)) * 2
                 ELSE qs.statement_end_offset
            END - qs.statement_start_offset) / 2) AS statement_text,
       qs.plan_generation_num AS recompiles,
       qs.execution_count AS execution_count,
       qs.total_elapsed_time - qs.total_worker_time AS total_wait_time,
       qs.total_worker_time AS cpu_time,
       qs.total_logical_reads AS reads,
       qs.total_logical_writes AS writes
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
LEFT JOIN sys.dm_exec_requests r
    ON qs.sql_handle = r.sql_handle
ORDER BY 3 DESC;
```
After a new query is submitted to SQL Server, its plan is evaluated, optimized, compiled, and placed into the procedure cache. Each time a query is submitted to the server, the procedure cache is checked for a matching query plan. If none is found, SQL Server creates a new plan for it, which can be costly.
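You can watch this matching behavior directly in the procedure cache. This sketch uses the SQL Server 2005 sys.dm_exec_cached_plans DMV: a plan with a high usecounts value is being reused, while a large number of single-use plans with an objtype of 'Adhoc' suggests poor reuse:

```sql
-- Most-reused plans currently in the procedure cache.
SELECT TOP 20 cp.usecounts, cp.objtype, cp.size_in_bytes, st.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
ORDER BY cp.usecounts DESC;
```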
T-SQL CPU optimizations should be noted in the following areas:
- Query plan Reuse
- Reduced compilation and recompilation
- Sort operations
- Inappropriate joins
- Missing index
- Table/Index Scan
- function usage in SELECT and WHERE clauses
- Multithreaded operations
Let's look at these in more detail. SQL Server constantly moves data between memory and disk, and it rarely needs only a single data page. In most cases, multiple parts of an application operate on a single record, run several smaller queries, or join tables to provide a complete view of the relevant data. In an OLAP environment, an application might pull hundreds of rows from one or two tables in order to merge, accumulate, or summarize data for a regional sales report. In these cases, if the data is in memory it can be returned in milliseconds; retrieving the same data from disk rather than RAM can take minutes.
Consider two examples. The first is a highly transactional situation in which plan reuse depends on the application: low plan reuse leads to heavy compilation of SQL statements, and with it a large amount of CPU overhead. The second is heavy use of system resources: because data must be flushed from the buffer cache frequently to make room for large numbers of new data pages, the system's CPU becomes hyperactive.
Suppose a highly transactional system executes the SQL statement shown below 2,000 times over 15 minutes to retrieve shipping-container information. Without query plan reuse, each execution might take roughly 450 milliseconds. If the same query plan is reused after the first execution, each subsequent query completes in about 2 milliseconds, bringing the overall execution time down to about 5 seconds.
```sql
USE shipping_dist01;
SELECT container_id,
       carton_id,
       product_id,
       ProductCount,
       ModifiedDate
FROM Container.carton
WHERE carton_id = 982350144;
```
Query plan reuse is critical to getting the best performance from a highly transactional system, and in most cases it is achieved through parameterized queries or stored procedures. Here are some useful resources on query plan reuse:
- Batch compilation, recompilation, and scheduling cache issues in SQL Server 2005 (microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx)
- Optimizing SQL Server stored procedures to avoid recompilation (sql-server-performance.com/rd_optimizing_sp_recompiles.asp)
- Query recompilation in SQL Server 2000 (msdn2.microsoft.com/aa902682.aspx)
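As a simple illustration of parameterization, the carton lookup from the example above could be issued through sp_executesql so that a single cached plan serves every carton_id value (a sketch; the table and columns are those from the earlier example):

```sql
EXEC sp_executesql
     N'SELECT container_id, carton_id, product_id, ProductCount, ModifiedDate
       FROM Container.carton
       WHERE carton_id = @carton_id',
     N'@carton_id int',
     @carton_id = 982350144;
```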
The SQL Server 2005 dynamic management views (DMVs) contain a wealth of helpful information. When CPU utilization is high, I use several DMVs to help determine whether the CPU is being used appropriately.
One DMV I look at is sys.dm_os_wait_stats, which lets DBAs determine every resource type or function SQL Server waits on and measure the system wait time attributable to that resource. The counters in this DMV are cumulative. This means that to see clearly which resources are affecting different areas of the system, you must first issue DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR) to reset the counters after you have reviewed the data for any outstanding issues. The sys.dm_os_wait_stats DMV is the equivalent of the SQL Server 2000 database consistency check command DBCC SQLPERF(WAITSTATS). You can find more information about the different wait types in SQL Server Books Online at msdn2.microsoft.com/ms179984.aspx.
Understand that some waiting is normal even when everything is running optimally; you only need to determine whether the waits are caused by a CPU bottleneck. Signal wait time is the time a task spends waiting for a processor after the resource it was waiting on has become available (the total wait time minus the resource wait time), so signal waits should be as small as possible relative to the total wait time. As a rule of thumb, time spent waiting specifically on processor resources should not exceed roughly 20% of the total wait time.
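A sketch of how to read sys.dm_os_wait_stats along these lines: the top waits by total time, plus an overall signal-wait percentage. Remember to reset the counters with DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR) before a measurement interval:

```sql
-- Top waits since the counters were last cleared.
SELECT TOP 10 wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms,
       wait_time_ms - signal_wait_time_ms AS resource_wait_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;

-- Share of all wait time spent waiting for the CPU itself.
SELECT 100.0 * SUM(signal_wait_time_ms) / NULLIF(SUM(wait_time_ms), 0)
           AS pct_signal_waits
FROM sys.dm_os_wait_stats;
```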
The sys.dm_exec_sessions DMV shows all open sessions on SQL Server. It provides a high-level view of each session's performance and of all the work performed since the session started, including the session's total wait time, total CPU usage, memory usage, and read and write counts. The DMV also gives you the login, login time, host name, and the time of the session's most recent SQL Server request.
With the sys.dm_exec_sessions DMV, you can quickly identify the active sessions, so if you see high CPU usage, this view is the first place to look. Start with the sessions showing high CPU time, identify the application and user performing the work, and then drill in. Pairing sys.dm_exec_sessions with the sys.dm_exec_requests DMV provides much of the information available through the sp_who and sp_who2 stored procedures. If you join this data to the sys.dm_exec_sql_text dynamic management function (DMF) through the sql_handle column, you can retrieve the query each session is currently running. The code snippet in Figure 3 shows how to pull this data together to help determine what is currently happening on a server.
Figure 3 Determining Server Activity
```sql
SELECT es.session_id,
       es.program_name,
       es.login_name,
       es.nt_user_name,
       es.login_time,
       es.host_name,
       es.cpu_time,
       es.total_scheduled_time,
       es.total_elapsed_time,
       es.memory_usage,
       es.logical_reads,
       es.reads,
       es.writes,
       st.text
FROM sys.dm_exec_sessions es
LEFT JOIN sys.dm_exec_connections ec
    ON es.session_id = ec.session_id
LEFT JOIN sys.dm_exec_requests er
    ON es.session_id = er.session_id
OUTER APPLY sys.dm_exec_sql_text(er.sql_handle) st
WHERE es.session_id > 50    -- exclude system sessions
ORDER BY es.cpu_time DESC;
```
I find this statement helpful for identifying which applications need attention. When comparing CPU, memory, reads, writes, and logical reads across an application's sessions shows CPU consumption standing far above the other resources in use, I start focusing on those SQL statements.
To fully trace an application's SQL statements, I use SQL Server tracing, either through the SQL Server Profiler tool or through the trace system stored procedures, to help evaluate what is going on. (For more information on this topic, see the "SP Trace" sidebar.) In Profiler, review statements with high CPU usage, as well as hash and sort warnings, cache misses, and other red flags. This helps you narrow your scope to specific SQL statements or to a specific window of time that produced high resource usage. Profiler can track SQL statement text, execution plans, CPU usage, memory usage, logical reads, writes, query plan caching, recompilations, eviction of query plans from the cache, cache misses, table and index scans, missing statistics, and many other events.
After collecting data through the sp_trace stored procedures or SQL Server Profiler, I typically load it into a database, populated in one of two ways: after the fact from the trace files, or by configuring the trace to write directly to the database. Loading the database after the fact can be done with a SQL Server system function named fn_trace_gettable. The advantage of this approach is that I can query and sort the data in many ways to see which SQL statements used the most CPU or performed the most reads, how many recompilations occurred, and so on. The following example shows how to use this function to load a table from a Profiler trace file; specifying DEFAULT loads all of the trace's rollover files in the order they were created:
```sql
SELECT *
INTO trc_20070401
FROM fn_trace_gettable('S:\Mountpoints\TRC_20070401_1.trc', DEFAULT);
GO
```
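Once the trace is in a table, it can be sliced however you like. For example, this sketch (using the table name from the example above; TextData is converted because the ntext type cannot be grouped directly) totals CPU by statement text:

```sql
SELECT TOP 10 CONVERT(nvarchar(4000), TextData) AS statement_text,
       COUNT(*) AS executions,
       SUM(CPU) AS total_cpu_ms
FROM trc_20070401
GROUP BY CONVERT(nvarchar(4000), TextData)
ORDER BY total_cpu_ms DESC;
```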
As you can see, high CPU usage does not necessarily indicate a CPU bottleneck; it can also be caused by other applications or by hardware bottlenecks. Once you have determined that CPU usage is too high even though the other counters look healthy, you can track down the cause within the system and then find a solution (whether buying more CPUs or optimizing your SQL code). Whatever you do, don't give up! With the tips provided in this article plus a little practice and research, an implementation plan for optimizing CPU utilization under SQL Server is entirely achievable.