Basic metrics for Windows Performance Monitor (CPU, memory, hard disk parameters)

Source: Internet
Author: User

Reprint: http://kms.lenovots.com/kb/article.php?id=7045

For a systems engineer, understanding monitoring data is essential, because it underpins performance optimization and problem analysis. This article describes some basic Windows Performance Monitor indicators (CPU, memory, and hard disk parameters) to help with future optimization and troubleshooting.

Windows-processor


Each entry lists the indicator name, its description, the recommended range, and the unit.

CPU utilization (% Processor Time)

Description: % Processor Time is the percentage of elapsed time that the processor spends executing non-idle threads. This counter is the primary indicator of processor activity. It is calculated by measuring the time the processor spends executing the idle thread in each sample interval and subtracting that value from 100%. It can be regarded as the percentage of the sample interval spent doing useful work.

Recommended range: Depending on the application, a value that fluctuates within 80% ± 5% is appropriate. If it is much lower, the server's CPU is underused; if it is higher, the CPU may become the system's processing bottleneck.

Unit: %

Interrupt rate (Interrupts/sec)

Description: The number of times per second that devices interrupt the processor. A device sends an interrupt to the processor when it completes a task or needs attention. Devices that generate interrupts include the system timer, the mouse, data communication lines, network cards, and other peripherals. During an interrupt, normal thread execution is suspended, and the interrupt may cause the processor to switch to another thread with a higher priority. Clock interrupts are frequent and periodic, and they create a background of interrupt activity.

Recommended range: Depends on the processor; the lower the better, and no more than 1,000. If the value rises significantly without a corresponding increase in system activity, a hardware problem is likely; check the network adapter, disk, or other hardware that is generating the interrupts.

Unit: times per second

System call rate (System Calls/sec)

Description: The combined rate at which all processes running on the computer call operating system service routines. These routines perform the basic scheduling and synchronization of activities on the computer, and provide access to non-graphical devices, memory management, and namespace management.

Recommended range: If Interrupts/sec is greater than System Calls/sec, a hardware device in the system is probably generating excessive interrupts.

Unit: times per second

Processor Queue Length

Description: The number of threads in the processor queue. This counter counts only ready threads, not threads that are running.

Recommended range: If the processor queue consistently holds more than two threads, the processor is usually congested.

Context switch rate (Context Switches/sec)

Description: The combined rate at which all processors on the computer switch from one thread to another. A context switch occurs when a running thread voluntarily gives up the processor, is preempted by a higher-priority ready thread, or switches between user mode and privileged (kernel) mode to use an executive or subsystem service.

Recommended range: A large value indicates heavy lock contention, or threads switching frequently between user mode and kernel mode.
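As a rough illustration, the processor guidelines above can be written as a small check. This is only a sketch: the function names and inputs are hypothetical, the values would have to be read from a performance log, and the thresholds are simply the ones quoted in this section.

```python
from typing import List, Sequence


def processor_time_percent(idle_seconds: float, interval_seconds: float) -> float:
    """% Processor Time for one sample interval: 100% minus the idle share."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return 100.0 - 100.0 * idle_seconds / interval_seconds


def processor_findings(cpu_percent: float,
                       interrupts_per_sec: float,
                       system_calls_per_sec: float,
                       queue_lengths: Sequence[int]) -> List[str]:
    """Apply the thresholds quoted in this section to one set of samples."""
    findings = []
    if abs(cpu_percent - 80.0) > 5.0:
        findings.append("CPU utilization outside 80% +/- 5%")
    if interrupts_per_sec > 1000:
        findings.append("Interrupts/sec above 1,000")
    if interrupts_per_sec > system_calls_per_sec:
        findings.append("Interrupts/sec exceeds System Calls/sec")
    # "Always more than two threads": every sample must exceed 2.
    if queue_lengths and min(queue_lengths) > 2:
        findings.append("Processor Queue Length consistently above 2")
    return findings


if __name__ == "__main__":
    cpu = processor_time_percent(idle_seconds=2.0, interval_seconds=10.0)
    print(cpu, processor_findings(cpu, 500, 20000, [0, 1, 3]))
```

A single sample is rarely conclusive; in practice the same check would be run over many intervals.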

Windows-memory

Each entry lists the indicator name, its description, the recommended range, and the unit.

Pages/sec, Pages Input/sec, Pages Output/sec, Page Faults/sec

Description: Page Faults/sec is the number of page faults handled per second, including both soft and hard faults. Pages Input/sec is the number of pages read from disk to resolve hard page faults, while Page Reads/sec is the number of disk read operations performed to resolve hard page faults. Pages/sec is the sum of Pages Input/sec and Pages Output/sec. This series is the primary indicator of the kinds of faults that cause system-wide delays.

A page fault occurs when a process references a page (code or data) that is not at the expected location in memory. If the page is found elsewhere in memory, the fault is a soft fault (measured by Transition Faults/sec); if the page must be read back from disk, it is a hard fault. Most processors can handle large numbers of soft faults without trouble, but hard faults can cause noticeable delays.

Recommended range: If Page Reads/sec stays persistently high, memory may be insufficient. The recommended range for Pages/sec is 0-20. The value stays high if the server does not have enough memory for its workload. A value above 80 indicates a problem (too many reads and writes are going to disk; consider adding memory or optimizing the algorithms that read and write data). Low values in this series mean that requests are being served quickly; high values may be caused by a memory shortage (or by a cache that is too large and leaves too little memory for the system).

Unit: times per second

Available Bytes

Description: The amount of physical memory currently available, equal to the sum of the memory on the standby (cached), free, and zeroed page lists. Free memory is ready for immediate use. Zeroed memory consists of pages filled with zeros, so that later processes cannot see data left behind by an earlier process. Standby memory has been removed from a process's working set (its physical memory) on its way to disk, but it can still be reclaimed. The counter shows only the last observed value, not an average.

Recommended range: When this value becomes small, Windows starts to use the disk paging file heavily. If it is very small, for example below 5 MB, the system spends most of its time operating on the paging file. As a rule, keep at least 10% of memory available. The minimum is 4 MB; a smaller value may indicate insufficient memory or a memory leak.

Committed Bytes

Description: The amount of committed virtual memory, in bytes. Committed memory is physical memory for which space has been reserved in the disk paging file.

Recommended range: No more than 75% of physical memory.
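The memory guidelines above can be sketched the same way. The function name and the finding strings are hypothetical, and the "10% available" rule is interpreted here as 10% of physical memory, which is an assumption.

```python
from typing import List

MB = 1024 * 1024
GB = 1024 * MB


def memory_findings(pages_input: float, pages_output: float,
                    available_bytes: float, committed_bytes: float,
                    physical_bytes: float) -> List[str]:
    """Apply the paging and memory thresholds quoted in this section."""
    findings = []
    pages = pages_input + pages_output  # Pages/sec is the sum of the two
    if pages > 80:
        findings.append("Pages/sec above 80")
    elif pages > 20:
        findings.append("Pages/sec above the recommended 0-20")
    if available_bytes < 4 * MB:
        findings.append("Available Bytes below the 4 MB minimum")
    elif available_bytes < 0.10 * physical_bytes:
        # Assumption: "10%" is taken relative to physical memory.
        findings.append("Available Bytes below 10% of physical memory")
    if committed_bytes > 0.75 * physical_bytes:
        findings.append("Committed Bytes above 75% of physical memory")
    return findings


if __name__ == "__main__":
    print(memory_findings(60, 30, 2 * MB, 7 * GB, 8 * GB))
```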

Windows-disk

Each entry lists the indicator name, its description, the recommended range, and the unit.

% Disk Time

Description: The percentage of time that the selected disk drive is busy servicing read or write requests.

Recommended range: Normally below 10. A larger value means too much time is spent accessing the disk; consider adding memory, switching to a faster drive, or optimizing the algorithms that read and write data. If the value stays above 80 while the processor and network connections are not saturated, there may be a memory leak.

Current Disk Queue Length

Description: The number of requests outstanding on the disk at the moment the performance data is collected, including requests being serviced at that moment. It is an instantaneous snapshot, not an average over the sample interval. A multi-spindle disk device can have several requests active at once while other concurrent requests wait for service. This counter may show a momentarily high or low queue length, but under a sustained load on the disk drive it is likely to stay high.

Recommended range: Request delay is proportional to the length of this queue minus the number of spindles on the disk. For good performance, this difference should average less than two.

Avg. Disk Queue Length, Avg. Disk Read Queue Length, Avg. Disk Write Queue Length

Description: The average number of read and write requests queued for the selected disk during the sample interval.

Recommended range: Avg. Disk Queue Length is normally below 0.5. A larger value indicates that disk I/O is too slow; consider switching to a faster hard drive.
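A minimal sketch of the disk guidelines above follows. The function name and inputs are hypothetical, and the spindle count has to be supplied by whoever knows the hardware.

```python
from statistics import mean
from typing import List, Sequence


def disk_findings(disk_time_percent: float,
                  current_queue_samples: Sequence[float],
                  avg_queue_length: float,
                  spindles: int = 1) -> List[str]:
    """Apply the disk thresholds quoted in this section."""
    findings = []
    if disk_time_percent > 80:
        findings.append("% Disk Time above 80")
    elif disk_time_percent >= 10:
        findings.append("% Disk Time at or above the normal limit of 10")
    # Queue length minus spindle count should average less than two.
    if mean(current_queue_samples) - spindles >= 2:
        findings.append("Current Disk Queue Length minus spindles averages 2 or more")
    if avg_queue_length >= 0.5:
        findings.append("Avg. Disk Queue Length at or above 0.5")
    return findings


if __name__ == "__main__":
    print(disk_findings(90, [4, 5, 6], 1.5, spindles=1))
```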

Performance Monitor

Performance Monitor is the system resource and performance monitoring tool that ships with Windows. It reports information such as CPU usage, memory allocation, exception dispatching, and thread scheduling frequency in quantified form. For ASP.NET, it can also report the number of requests per second, request response time, and so on. Performance Monitor can track the use of these resources over time and provides both averages and peaks.

Performance Monitor gives you concrete performance metrics, so you can watch how system resources change when a problem occurs. Examining how some of the important counters change often yields useful clues. For example, checking whether requests per second, request response time, and CPU utilization follow the same curve shows whether performance is related to load.

When troubleshooting performance problems, customers are usually asked to add the following counters to the performance collection.

    • All counters under the Process object
    • All counters under the Processor object
    • All counters under the System object
    • All counters under the Memory object
    • If the customer's program is a .NET program, all counters under the objects whose names start with .NET
    • If the customer uses ASP.NET, all counters under the objects whose names start with ASP
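Performance counter paths are generally written as \Object(instance)\Counter, with * as a wildcard. The sketch below builds such paths for the four base objects in the list above. The helper name and the BASE_OBJECTS mapping are hypothetical, and the .NET and ASP.NET object names are left to the caller because they depend on what is installed on the machine.

```python
from typing import Dict, List

# Object name -> whether the object has instances (one per process, CPU, ...).
BASE_OBJECTS: Dict[str, bool] = {
    "Process": True,
    "Processor": True,
    "System": False,
    "Memory": False,
}


def counter_paths(objects: Dict[str, bool]) -> List[str]:
    """Build an "all counters" wildcard path for each object."""
    paths = []
    for name, has_instances in objects.items():
        instance = "(*)" if has_instances else ""
        paths.append("\\" + name + instance + "\\*")
    return paths


if __name__ == "__main__":
    for path in counter_paths(BASE_OBJECTS):
        print(path)
```

The resulting list could be saved to a text file and handed to whichever collection tool is in use.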

When analyzing the performance log, focus on the following counters.

Process Object

=============

The counters in the Process object let you analyze memory, CPU, thread count, and handle count for a specific process. Select the process that has the problem, and then analyze the following counters for it.

% Processor Time

-------------------

This counter indicates how much CPU the process consumes. Even when the process is busy, its average CPU usage should stay below 80%. If it exceeds this value, the program can be considered to have a high-CPU problem. A second kind of problem is wide CPU fluctuation: the average usage is not high, but it jumps up and down frequently, producing short periods of sustained high CPU.
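Both symptoms can be estimated from a series of samples. The sketch below uses the 80% average from the text; the swing_limit used to flag fluctuation is an arbitrary illustrative value, not a figure from the source.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def cpu_profile(samples: Sequence[float],
                avg_limit: float = 80.0,
                swing_limit: float = 25.0) -> Dict[str, object]:
    """Summarize a process's % Processor Time samples.

    swing_limit is a hypothetical threshold on the standard deviation.
    """
    average = mean(samples)
    swing = pstdev(samples)
    return {
        "average": average,
        "high_average": average > avg_limit,
        "erratic": swing > swing_limit,
    }


if __name__ == "__main__":
    print(cpu_profile([10, 95, 10, 95]))
```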

Handle Count

------------------

This counter records the number of kernel object handles the process currently holds. Kernel objects are an important system resource. Once the program reaches a steady state, Handle Count should stay within a stable range. If Handle Count trends upward continuously over the program's lifetime, consider whether the program has a handle leak.
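One way to make "a continuous upward trend" concrete is the least-squares slope of the samples. This is a sketch under assumptions: the helper names are hypothetical, and min_slope (handles gained per sample) has to be chosen for the program being studied.

```python
from typing import Sequence


def trend_slope(samples: Sequence[float]) -> float:
    """Least-squares slope of the samples against their index (0, 1, 2, ...)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def looks_like_leak(samples: Sequence[float], min_slope: float) -> bool:
    """True if the counter grows faster than min_slope per sample."""
    return trend_slope(samples) > min_slope


if __name__ == "__main__":
    print(trend_slope([100, 110, 120, 130]))
```

The same check could be applied to any counter that is expected to stay flat in steady state.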

ID Process

------------------

This counter records the process ID of the target process. It may seem odd to monitor an ID, but the process ID reveals whether a program has restarted. For example, an ASP.NET worker process may be recycled automatically. Because the process name stays the same, the process ID is the only way to tell that a restart occurred. If the ID changes, check whether the program crashed or was recycled.
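Detecting restarts then amounts to counting changes in the logged ID. A minimal sketch, with a hypothetical function name and made-up sample IDs:

```python
from typing import Sequence


def count_restarts(pid_samples: Sequence[int]) -> int:
    """Count how many times the process ID changes between consecutive samples."""
    restarts = 0
    for previous, current in zip(pid_samples, pid_samples[1:]):
        if current != previous:
            restarts += 1
    return restarts


if __name__ == "__main__":
    print(count_restarts([4120, 4120, 4120, 5336, 5336, 712]))
```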

Private Bytes

------------------

This counter records the amount of memory the process has currently committed through the VirtualAlloc API. Memory requested by calling the API directly, memory requested by the heap manager, and the CLR's managed heap are all counted. As with Handle Count, a continuous upward trend over the program's lifetime indicates a memory leak.

Virtual Bytes

------------------

This counter records the total user-mode address space the process has successfully reserved, including the addresses occupied by DLL and EXE images and the addresses reserved through the VirtualAlloc API, so it should always be larger than Private Bytes. In general, Virtual Bytes and Private Bytes change in broadly the same way. Because of memory fragmentation, the two usually keep a fairly stable ratio. When the ratio of Virtual Bytes to Private Bytes exceeds 2, the program tends to have serious address-space fragmentation.
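The ratio rule can be checked directly. The function name is hypothetical; the limit of 2 is the one given above.

```python
def fragmentation_suspected(virtual_bytes: float,
                            private_bytes: float,
                            ratio_limit: float = 2.0) -> bool:
    """True if Virtual Bytes / Private Bytes exceeds the limit."""
    if private_bytes <= 0:
        raise ValueError("private_bytes must be positive")
    return virtual_bytes / private_bytes > ratio_limit


if __name__ == "__main__":
    print(fragmentation_suspected(900 * 2**20, 300 * 2**20))
```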

Processor Object

==============

The Processor object records the CPU load of the system. Because an ordinary program is normally not bound to a specific CPU, observing the _Total instance is sufficient on a multi-CPU machine.

% Processor Time

----------------------

This counter is the same as % Processor Time in the Process object, except that it is recorded for the entire system rather than for a particular process. Comparing it with the counter of the same name under Process shows whether a system-wide high-CPU problem is caused by a single process.

System Object

==============

The System object records statistics for the system as a whole, so it has no instances. Comparing the trends of the counters under the System object with those of other counters often reveals useful clues.

Context Switches/sec

--------------------

Context Switches/sec indicates how frequently threads are scheduled and switched across the whole system. A thread switch is a fairly expensive operation, and frequent switching wastes a large number of CPU cycles. So when you see high CPU, be sure to compare it with Context Switches/sec. If both follow the same trend, the high CPU is often caused by thread contention rather than by an infinite loop.
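"Following the same trend" can be quantified with a correlation coefficient. The sketch below computes the Pearson correlation between two equally sampled series. Using Pearson correlation here is an illustrative choice, not something the source prescribes, and how close to 1 counts as "the same trend" is left to the reader.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two series of equal length."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    x_var = sum((x - x_mean) ** 2 for x in xs)
    y_var = sum((y - y_mean) ** 2 for y in ys)
    if x_var == 0 or y_var == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(x_var * y_var)


if __name__ == "__main__":
    cpu = [20, 45, 70, 95]                  # % Processor Time samples
    switches = [3000, 9000, 15000, 21000]   # Context Switches/sec samples
    print(pearson(cpu, switches))
```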

Exception Dispatches/sec

-------------------

Exception Dispatches/sec indicates how frequently exceptions are dispatched and handled in the system. Like thread switching, exception handling carries significant CPU overhead. The analysis method is the same as for Context Switches/sec.

File Data Operations/sec

-------------------

File Data Operations/sec records how frequently disk files are read and written on the system. Observing its trend alongside other performance counters helps determine whether disk file operations are a performance bottleneck. A similar counter is Bytes Total/sec under the Network Interface object.

Memory Object

=============

The memory object records the statistics of the overall RAM in the current system.

Available MBytes and Committed Bytes

---------------------

Available MBytes records the amount of physical memory currently free. Committed Bytes records the amount of memory committed by all processes. Together, the two counters show the following:

    1. Their sum gives a rough estimate of the total memory available to the system, which helps in estimating the physical configuration.
    2. When Available MBytes is below 100 MB, memory is tight across the whole system and the performance of every process is affected. Consider adding physical memory or checking for memory leaks.
    3. Comparing them with Private Bytes and Virtual Bytes in the Process object makes it easier to confirm whether there is a memory leak and whether it is caused by a single process.
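The first two observations translate into a short calculation. The function name and result keys are hypothetical. Note the unit difference: Available MBytes is in megabytes, while Committed Bytes is in bytes.

```python
from typing import Dict

MB = 1024 * 1024


def memory_summary(available_mbytes: float, committed_bytes: float) -> Dict[str, object]:
    """Combine Available MBytes and Committed Bytes as described above."""
    estimated_total_mb = available_mbytes + committed_bytes / MB
    return {
        "estimated_total_mb": estimated_total_mb,
        "under_pressure": available_mbytes < 100,  # 100 MB threshold from the text
    }


if __name__ == "__main__":
    print(memory_summary(512, 1536 * MB))
```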

Free System Page Table Entries, Pool Paged Bytes and Pool Nonpaged Bytes

--------------------

These three counters measure the amount of free memory in kernel mode. In particular, when the /3GB switch is used, the kernel address space is compressed, which can lead to a shortage of kernel memory and, in turn, to some very strange problems.

.NET CLR Memory Object

=============

The .NET CLR Memory object records CLR memory information for processes that host the CLR. All the counters in this category are useful, and their meanings are straightforward. Testing and studying them with a sample program is recommended. The two most commonly used counters follow.

# Bytes in all Heaps

----------------------

# Bytes in all Heaps records the memory occupied by all CLR objects in the process that could not be reclaimed at the last GC. The counter is not real-time; it is updated each time a GC occurs. Comparing it with Private Bytes for the same process in the Process object separates changes in the managed heap from changes in native memory. For a memory leak, this makes it easy to tell whether the managed heap or native memory is leaking.
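A rough way to apply this comparison is to look at how much of the growth in Private Bytes is matched by growth in # Bytes in all Heaps over the same period. The function name and the managed_share cut-off of 0.5 are assumptions made for illustration only.

```python
def leak_origin(private_bytes_growth: float,
                heap_bytes_growth: float,
                managed_share: float = 0.5) -> str:
    """Guess where a leak lives from the growth of the two counters.

    managed_share is a hypothetical cut-off: if at least this fraction of the
    Private Bytes growth is managed heap growth, blame the managed heap.
    """
    if private_bytes_growth <= 0:
        return "no growth"
    if heap_bytes_growth / private_bytes_growth >= managed_share:
        return "managed heap"
    return "native memory"


if __name__ == "__main__":
    MB = 1024 * 1024
    print(leak_origin(100 * MB, 90 * MB))
```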

% Time in GC

----------------------

% Time in GC records the percentage of time spent in garbage collection, which reflects how frequently the GC occurs. Generally, a value up to about 15% is normal. Above 20%, the GC is occurring too frequently. Because a GC not only consumes a lot of CPU but also suspends the CLR threads of the target process, high-frequency GC is very dangerous. Comparing this counter with CPU utilization and other performance metrics often shows the performance impact of GC. High-frequency GC is often caused by:

    1. A load that is too high.
    2. A poorly designed architecture that uses memory inefficiently.
    3. Memory leaks or memory fragmentation that create memory pressure.

ASP.NET performance monitoring

============

If the target program is an ASP.NET application, the counters in the objects whose names start with ASP are useful for measuring ASP.NET performance. Because many counters exist in more than one object category, only the counter names are listed below, without the specific object.

Application Restarts

-------------------

Application Restarts records the number of times the ASP.NET application domain has restarted. An ASP.NET AppDomain restart is often caused by a change in the virtual directory, for example when the Web.config file is modified or an antivirus program scans the parent directory. This counter shows whether abnormal restarts are occurring.

Request Execution Time

-------------------

Request Execution Time records how long requests take to execute and is the most direct measure of ASP.NET performance. Use the counter's average value to judge whether performance meets expectations. Note that because Windows is not a real-time system, spikes cannot be used to measure overall performance. For example, when a GC occurs during a request, the request execution time necessarily exceeds the GC time. The average is therefore the valid yardstick.

Requests Current

-------------------

Requests Current records the number of requests currently being processed or waiting to be processed. Ideally, Requests Current equals the number of CPUs, which means the request load matches the hardware's capacity for concurrent processing and the hardware investment is being used optimally. In practice, Requests Current rises when the load is heavy. If Requests Current stays above 10 for a period of time, there is a performance problem. Check CPU and other resources for the same period. If CPU usage is not high, the program is probably blocked somewhere, for example waiting on a database request, so requests cannot complete in time.
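This reasoning can be sketched as a two-step check over samples from the same period. The function name and result strings are hypothetical, and the cpu_high value of 80% is borrowed from the Process section as an assumption about what "not high" means.

```python
from statistics import mean
from typing import Sequence


def diagnose_requests(requests_current: Sequence[float],
                      cpu_percent: Sequence[float],
                      request_limit: float = 10,
                      cpu_high: float = 80.0) -> str:
    """Classify a period using Requests Current and CPU samples."""
    if not requests_current or not cpu_percent:
        raise ValueError("both sample series must be non-empty")
    # "Stays above 10": every sample in the period exceeds the limit.
    if min(requests_current) <= request_limit:
        return "ok"
    if mean(cpu_percent) < cpu_high:
        return "backlog, CPU not busy: suspect blocking"
    return "backlog, CPU busy: suspect CPU-bound load"


if __name__ == "__main__":
    print(diagnose_requests([12, 15, 20], [20, 25, 30]))
```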

Requests/Sec

------------------

Requests/Sec records the number of requests that reach ASP.NET each second. It is a direct measure of ASP.NET throughput. Check whether Requests/Sec exceeds the program's expected throughput. If Requests/Sec fluctuates suddenly, look for a denial-of-service attack. Comparing Requests/Sec, Requests Current, Request Execution Time, and system resources often shows how overall ASP.NET performance changes and how each factor contributes.

Requests In Application Queue

------------------

When ASP.NET has no free worker thread to handle a new incoming request, the request is placed in the application queue. When the backlog in the application queue also exceeds the configured limit, ASP.NET returns a 503 Server Too Busy error and discards the request. Under normal conditions, Requests In Application Queue should always be 0; any other value means that requests have backed up and the performance problem is serious.

Excerpt from "Efficient Troubleshooting of Windows User-Mode Programs"
