Usage of performance counters monitored by LoadRunner


Memory: memory usage may be the most important factor in system performance. If the system pages frequently, memory is insufficient. Paging moves fixed-size blocks of code and data, called pages, from RAM to disk to free memory. Some paging is acceptable, since it lets Windows 2000 use more memory than is physically present, but frequent paging degrades system performance, and reducing it significantly improves response time. To monitor for insufficient memory, start with the following object counters:

Available MBytes: the amount of available physical memory, in megabytes. If this value is small (4 MB or less), the total memory on the computer may be insufficient, or a program is not releasing memory.

Pages/sec: the number of pages read from disk to resolve hard page faults, plus the number of pages written to disk to free working-set space. Generally, if Pages/sec stays above several hundred, you should investigate paging activity further; you may need to add memory to reduce the demand for paging (multiply this number by 4 KB to estimate the resulting hard-disk traffic). A large Pages/sec value does not necessarily indicate a memory problem; it may be caused by programs that use memory-mapped files.
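The multiply-by-4-KB rule of thumb above can be sketched as a quick calculation. This is a minimal illustration, assuming the standard 4 KB x86 page size the article refers to:

```python
# Estimate the disk traffic caused by paging from the Memory\Pages/sec counter.
# Assumes the standard 4 KB x86 page size mentioned in the text.
PAGE_SIZE_BYTES = 4 * 1024

def paging_traffic_bytes_per_sec(pages_per_sec: float) -> float:
    """Hard-disk throughput (bytes/s) attributable to paging."""
    return pages_per_sec * PAGE_SIZE_BYTES

# A sustained 300 pages/sec corresponds to about 1.2 MB/s of paging I/O.
traffic = paging_traffic_bytes_per_sec(300)
```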

Page Reads/sec: hard page faults, a subset of Pages/sec; the number of times the page file had to be read to resolve memory references. The threshold is 5, and lower is better. A large value indicates disk reads rather than cache reads.

Because excessive paging consumes a lot of disk bandwidth, insufficient memory that causes paging can be confused with a disk bottleneck that results from paging. Therefore, when investigating whether paging is caused by low memory, track the following disk usage counters along with the memory counters:
Physical Disk \ % Disk Time
Physical Disk \ Avg. Disk Queue Length

For example, examine Page Reads/sec, % Disk Time, and Avg. Disk Queue Length together. If the page read rate is low while % Disk Time and Avg. Disk Queue Length are high, there may be a disk bottleneck. However, if the queue length increases while the page read rate does not decrease, memory is insufficient.

To determine the impact of excessive paging on disk activity, multiply the values of the Physical Disk \ Avg. Disk sec/Transfer and Memory \ Pages/sec counters. If the product exceeds 0.1, paging is taking more than 10 percent of disk access time. If this persists for a long period, you probably need more memory.
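The product heuristic above can be written out directly. A minimal sketch, with illustrative counter values rather than real measurements:

```python
# Heuristic from the text: multiply PhysicalDisk\Avg. Disk sec/Transfer by
# Memory\Pages/sec; if the product exceeds 0.1, paging consumes more than
# 10% of disk access time.
def paging_disk_fraction(avg_disk_sec_per_transfer: float,
                         pages_per_sec: float) -> float:
    """Approximate fraction of disk time spent servicing paging traffic."""
    return avg_disk_sec_per_transfer * pages_per_sec

def paging_is_excessive(avg_disk_sec_per_transfer: float,
                        pages_per_sec: float) -> bool:
    return paging_disk_fraction(avg_disk_sec_per_transfer, pages_per_sec) > 0.1

# 5 ms per transfer at 30 pages/sec -> 0.15, i.e. paging dominates disk time.
flag = paging_is_excessive(0.005, 30)
```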

Page Faults/sec: the number of page faults per second, including soft faults (satisfied directly from elsewhere in memory) and hard faults (which require a disk read). Unlike Pages/sec, it counts every case where data was not immediately available in the process's working set.

Cache Bytes: the file-system cache, which by default takes up to 50% of available physical memory. IIS 5.0 automatically trims this cache when memory runs low, so watch the trend of this counter.

If you suspect a memory leak, monitor Memory \ Available Bytes and Memory \ Committed Bytes to observe overall memory behavior, and monitor Process \ Private Bytes, Process \ Working Set, and Process \ Handle Count for the process suspected of leaking. If you suspect that a kernel-mode component is leaking, also monitor Memory \ Pool Nonpaged Bytes, Memory \ Pool Nonpaged Allocs, and Process(process_name) \ Pool Nonpaged Bytes.
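The leak check just described amounts to watching whether Process \ Private Bytes only ever climbs. A minimal sketch of that heuristic, with hypothetical sample values (not a real counter-collection API):

```python
# Flag a process as a leak suspect when successive Private Bytes samples
# keep climbing without ever dropping back. Samples are hypothetical values
# collected periodically (e.g. via Performance Monitor).
def looks_like_leak(private_bytes_samples: list, min_samples: int = 5) -> bool:
    """True if the series is strictly increasing across enough samples."""
    if len(private_bytes_samples) < min_samples:
        return False  # too little data to judge
    return all(later > earlier
               for earlier, later in zip(private_bytes_samples,
                                         private_bytes_samples[1:]))
```

A real investigation would sample over hours, since working sets legitimately grow and shrink in the short term.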

Pages per second: the number of pages retrieved per second. This number should be less than one page per second.

Process:

% Processor Time: the percentage of processor time consumed by the process. If the server is dedicated to SQL Server, the maximum acceptable limit is 80-85%.

Page faults/sec: Compares page faults generated by a process with those generated by the system to determine the impact of the process on system page faults.

Working Set: the set of memory pages recently used by the process's threads, reflecting the amount of memory each process is using. If the server has enough free memory, pages are left in the working set; when free memory falls below a threshold, pages are trimmed from it.

Inetinfo: Private Bytes: the number of bytes currently allocated by this process that cannot be shared with other processes. If system performance degrades over time, this counter can be the best indicator of a memory leak.

Processor: monitor the Processor and System object counters; they provide valuable information about processor usage and help you determine whether a bottleneck exists.

% Processor Time: if this value continuously exceeds 95%, the bottleneck is the CPU. Consider adding a processor or upgrading to a faster one.

% User Time: indicates CPU-intensive database operations, such as sorting and executing aggregate functions. If this value is very high, consider adding indexes, and try to reduce it through simpler table joins or horizontal table partitioning.

% Privileged Time: (CPU kernel time) the percentage of time spent executing thread code in privileged mode. If this value and the Physical Disk counters both remain high, there is a disk I/O bottleneck; consider a faster disk subsystem. In addition, you can place tempdb in RAM and reduce "max async IO" and "max lazy writer IO".

In addition, the Server Work Queues \ Queue Length counter, which tracks the current length of the computer's server work queue, can reveal a processor bottleneck. A queue length greater than 4 may indicate processor congestion. Note that this counter is an instantaneous value, not an average over an interval.

% DPC Time: the lower the better. On a multiprocessor system, if this value is greater than 50% and Processor \ % Processor Time is very high, adding a NIC may improve performance, provided the network itself is not saturated.

Thread

Context Switches/sec: (instantiate for the inetinfo and dllhost processes) if you decide to increase the size of the thread pool, you should monitor these three counters (including the one above). Increasing the number of threads may increase context switching to the point where performance goes down instead of up; if the context-switch values for these instances are very high, reduce the thread pool size.

Physical Disk:

% Disk Time: the percentage of time the selected disk drive is busy servicing read or write requests. If all three disk counters are large, the disk is not the bottleneck; if only % Disk Time is large while the other two are moderate, the disk may be the bottleneck. Before recording this counter, run diskperf -yd in a Windows 2000 command-prompt window. If the value stays above 80%, a memory leak may be involved.

Avg. Disk Queue Length: the average number of read and write requests queued for the selected disk during the sample interval. This value should not exceed 1.5 to 2 times the number of disks. To improve performance, you can add disks. Note: a RAID volume actually contains multiple disks.
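The 1.5-2x-spindles rule above is easy to encode. A minimal sketch; the factor and spindle counts are the article's rule of thumb, not a hard limit:

```python
# Rule of thumb from the text: Avg. Disk Queue Length should not exceed
# 1.5-2x the number of physical disks. A RAID set counts each member disk.
def queue_length_ok(avg_queue_length: float, spindle_count: int,
                    factor: float = 2.0) -> bool:
    """True while the queue stays within `factor` times the spindle count."""
    return avg_queue_length <= factor * spindle_count

# A queue of 5 on a 2-disk volume exceeds the 2x guideline.
ok = queue_length_ok(5.0, spindle_count=2)
```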

Avg. Disk Read/Write Queue Length: the average number of queued read (or write) requests.

Disk Reads (Writes)/sec: the number of reads (writes) on a physical disk per second. The sum of the two should be lower than the maximum capacity of the disk device.

Avg. Disk sec/Read: the average time, in seconds, required to read data from this disk.

Avg. Disk sec/Write: the average time, in seconds, required to write data to this disk.

Network Interface:

Bytes Total/sec: the rate at which bytes are sent and received, including framing characters. To determine whether the network connection is a bottleneck, compare this counter's value against the current network bandwidth.
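The comparison against link bandwidth suggested above can be sketched as a utilization calculation. The bandwidth figure is an assumed nominal link speed, not a measured value:

```python
# Compare Network Interface\Bytes Total/sec against the nominal link speed.
# Note the counter is in bytes/sec while link speeds are quoted in bits/sec.
def network_utilization(bytes_total_per_sec: float,
                        bandwidth_bps: float) -> float:
    """Link utilization as a fraction (0.0 to 1.0)."""
    return (bytes_total_per_sec * 8) / bandwidth_bps

# 12.5 MB/s on a 100 Mbit/s link is a fully saturated connection.
util = network_utilization(12_500_000, 100_000_000)
```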

Sqlserver performance counters:

Access Methods: monitors the methods used to access logical pages in the database.

Full Scans/sec (full table scans per second): the number of unrestricted full scans per second; these can be base-table scans or full index scans. If this counter shows a value greater than 1 or 2, analyze your queries to determine whether full table scans are really necessary and whether the SQL can be optimized.

Page Splits/sec (page splits per second): the number of page splits per second caused by data update operations.

Buffer Manager: monitors how Microsoft SQL Server uses memory to store data pages, internal data structures, and the procedure cache, and monitors the physical I/O performed as SQL Server reads database pages from disk and writes them back. Monitoring the memory and counters used by SQL Server helps determine whether a bottleneck exists because too little physical memory is available to cache frequently accessed data, forcing SQL Server to retrieve it from disk, and whether query performance could be improved by adding memory or making more memory available to the data cache or SQL Server's internal structures.

These counters show how frequently SQL Server reads data from disk. Compared with other operations such as memory access, physical I/O takes far more time, so minimizing physical I/O improves query performance.

Page Reads/sec: the number of physical database page reads per second. This statistic shows the total number of physical page reads across all databases. Because physical I/O is expensive, you can minimize it with a larger data cache, intelligent indexes, more efficient queries, or changes to the database design.

Page Writes/sec: the number of physical database page writes per second.

Buffer Cache Hit Ratio: the percentage of pages found in the buffer cache (buffer pool) without having to read from disk. This ratio is the total number of cache hits divided by the total number of cache lookups since the SQL Server instance started, so over a long period it changes very little. Because reading from the cache is far cheaper than reading from disk, a high value is desirable. The counter value depends on the application, but 90% or higher is best; add memory until this value stays above 90%, meaning more than 90% of data requests are satisfied from the data cache.
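The ratio and the 90% target described above can be sketched directly. The hit and lookup counts below are illustrative numbers, not real counter readings:

```python
# Buffer cache hit ratio: cache hits divided by total cache lookups since
# the SQL Server instance started, with the text's 90% guideline.
def buffer_cache_hit_ratio(cache_hits: int, cache_lookups: int) -> float:
    if cache_lookups == 0:
        return 0.0  # no lookups yet; avoid division by zero
    return cache_hits / cache_lookups

def needs_more_memory(cache_hits: int, cache_lookups: int,
                      target: float = 0.90) -> bool:
    """True when the hit ratio falls below the recommended target."""
    return buffer_cache_hit_ratio(cache_hits, cache_lookups) < target
```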

Lazy Writes/sec (lazy writes per second): the number of buffers written per second by the lazy writer process. Ideally this value is 0.

The Cache Manager object provides counters for monitoring how Microsoft SQL Server uses memory to store objects such as stored procedures, ad hoc and prepared Transact-SQL statements, and triggers.

Cache Hit Ratio: the overall cache hit rate. In SQL Server the cache includes the log cache, buffer cache, and procedure cache, and this counter is the aggregate ratio of cache hits to cache lookups. It is a good counter for seeing how well SQL Server's caching is working for your system. If this value is low and stays below 80%, more memory needs to be added.

Latches: monitors internal SQL Server resource locks called latches. Monitoring latches to gauge user activity and resource usage helps identify performance bottlenecks.

Average Latch Wait Time (ms): the average time, in milliseconds, that a SQL Server thread must wait for a latch. If this value is high, you may be experiencing serious contention.

Latch Waits/sec (latch waits per second): the number of latch waits per second. A high value indicates heavy contention for resources.

Locks: provides information about SQL Server locks on individual resource types. SQL Server takes locks on resources (for example, when reading or modifying rows in a transaction) to prevent multiple transactions from using a resource concurrently. For example, if a transaction holds an exclusive (X) lock on a row of a table, no other transaction can modify that row until the lock is released. Using as few locks as possible improves concurrency and performance. Multiple instances of the Locks object can be monitored at the same time, each representing a lock on one resource type.

Number of Deadlocks/sec: the number of deadlocks per second.

Average Wait Time (ms): the average time threads wait for a lock of a given type.

Lock Requests/sec (lock requests per second): the number of lock requests of a given type per second.

Memory Manager: monitors overall server memory usage to gauge user activity and resource usage and help identify performance bottlenecks. Monitoring the memory used by a SQL Server instance helps determine:

Whether the bottleneck exists due to the lack of available physical memory to store frequently accessed data in the cache. If so, SQL Server must retrieve data from the disk.

Whether more memory can be added or used for data cache or SQL Server internal structure to improve query performance.

Lock Blocks: the number of lock blocks on the server; locks are held on resources such as pages, rows, and tables. You do not want to see this value growing.

Total Server Memory: the total amount of dynamic memory the SQL Server instance is currently using.

Counters to monitor for IIS

Internet Information Services Global:

File Cache Hits %, File Cache Flushes, File Cache Hits

File Cache Hits % is the percentage of cache hits among all cache requests and reflects IIS's file cache settings. For a site consisting mostly of static pages, this value should be around 80%. File Cache Hits is the absolute number of file-cache hits, and File Cache Flushes is the number of file-cache flushes since the server started. If flushing is too slow, memory is wasted; if it is too fast, objects are discarded from the cache too frequently to be useful. Comparing File Cache Hits with File Cache Flushes gives the ratio of cache hits to cache evictions; observe both values to arrive at an appropriate flush interval (see the ObjectCacheTTL, MemCacheSize, and MaxCachedFileSize settings in IIS).
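The hit-percentage and hits-vs-flushes comparison above can be sketched as follows. The 80% target comes from the text; the flush-to-hit threshold is a hypothetical heuristic chosen for illustration:

```python
# IIS file cache analysis as described above: compute the hit percentage
# and compare hits against flushes to judge the flush (TTL) setting.
def file_cache_hit_percent(hits: int, requests: int) -> float:
    """Percentage of cache requests served from the file cache."""
    return 100.0 * hits / requests if requests else 0.0

def flushes_too_aggressive(hits: int, flushes: int,
                           max_ratio: float = 0.5) -> bool:
    """Hypothetical heuristic: flag when flushes approach the hit count."""
    return flushes > max_ratio * hits
```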

Web Service:

Bytes Total/sec: the total number of bytes sent and received by the Web server. A low value indicates that IIS is transferring data slowly.

Connection Refused: the lower the better. A high value indicates a bottleneck in the network adapter or processor.

Not Found Errors: the number of requests the server could not satisfy because the requested file was not found (HTTP status code 404).

Page breakdown
If a transaction takes too long, how do you analyze where the problem is? You can use page breakdown to decompose each page:

DNS resolution time: when a browser accesses a website, it generally uses a domain name, which a DNS server must resolve to an IP address; this is the DNS resolution time. If we access by IP address on a LAN, this step is no longer necessary.
Connection: after the Web server's IP address is resolved, the browser's request is sent to the server, and an initial HTTP connection must be established between the browser and the Web server. The server does two things: receive the request and allocate a process. The connection time is the time spent establishing this connection.
First buffer: once the connection is established, the first data packet is sent from the Web server and travels across the network to the client. First buffer is the time until the browser successfully receives the first byte; it reflects both the Web server's latency and the network's response time.
Receive: the time from when the browser receives the first byte until it successfully receives the last byte and the download completes.
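The four components above add up to the page response time that page breakdown decomposes. A minimal sketch; the times are purely illustrative values in seconds:

```python
# Page breakdown: total page time is the sum of its components.
# All values are illustrative, in seconds.
def page_time(dns: float, connection: float,
              first_buffer: float, receive: float) -> float:
    """Total page response time from its breakdown components."""
    return dns + connection + first_buffer + receive

# A slow page dominated by receive time suggests a download/bandwidth issue
# rather than DNS or server latency.
total = page_time(dns=0.02, connection=0.05, first_buffer=0.30, receive=1.10)
```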

Other times include SSL handshaking (the SSL handshake, relevant only to pages that use SSL), client time (delay on the client side, which may come from the browser's think time or other client-side factors), and error time (the time from sending an HTTP request to the Web server returning an HTTP error message).

From: http://hi.baidu.com/apr13th/blog/item/dcc41eb3d8d685a0d8335aeb.html
