Linux performance monitoring and analysis: CPU

Source: Internet
Author: User
Tags: time interval cpu usage

CPU Performance Metrics

1. Percentage of CPU time used by user processes

2. Percentage of CPU time used by system processes

3. wio: percentage of time the CPU is idle while waiting for I/O

4. CPU idle percentage

5. Percentage of CPU time spent on context switches

6. nice

7. real-time

8. Run queue length

9. Load average


The common tools for monitoring CPU performance under Linux are:

1. iostat

Shows only the average across all CPUs.

2. vmstat

Shows the average across all CPUs.

Also shows CPU run queue information.

3. mpstat

Shows statistics for each individual CPU as well as for all CPUs.

4. sar

Similar to mpstat.

5. top

6. nmon


iostat

$ iostat
Linux 2.6.18-92.el5          08/30/2012

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.16    0.01    0.62    0.18    0.00   98.03

vmstat
$ vmstat -n 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0     1261196 981892 3638872    0    0     0    1    1  1 1 + 0 0

The -n option prints the header only once, and the 5 is the delay: the output is refreshed every 5 seconds.
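To work with these columns programmatically, vmstat output can be split into one record per row. The following is a minimal sketch, not a definitive parser: the function name parse_vmstat and the sample text are assumptions made for illustration, and the sample values are hypothetical rather than taken from the machine above.

```python
def parse_vmstat(text):
    """Return one dict per data row, keyed by the vmstat column names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The column-name line is the one whose first two fields are "r" and "b".
    header_idx = next(
        i for i, ln in enumerate(lines) if ln.split()[:2] == ["r", "b"]
    )
    columns = lines[header_idx].split()
    rows = []
    for ln in lines[header_idx + 1:]:
        fields = ln.split()
        # Skip repeated headers and rows that do not line up with the columns.
        if len(fields) != len(columns) or not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(columns, map(int, fields))))
    return rows


# Hypothetical sample output, used only to exercise the parser.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  90112 402516    0    0     3     7  120  250  4  2 93  1  0
"""

rows = parse_vmstat(SAMPLE)
print(rows[0]["r"], rows[0]["us"], rows[0]["id"])
```

Rows that do not match the header width are skipped rather than guessed at, so damaged lines do not produce misleading numbers.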

procs

r: the number of processes in the run queue. If this value stays above the number of CPUs in the system, the system runs slowly because many processes are waiting for a CPU. If r exceeds 4 times the number of CPUs, the system is short of CPU or the CPUs are too slow, and the system runs too slowly as a result.
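The two thresholds for r can be written down as a small check. This is only a sketch of the rule of thumb stated here: the function name and the returned labels are made up for illustration, and "continuously" is interpreted as "true for every sample in the series".

```python
def classify_run_queue(r_samples, ncpus):
    """Classify a series of vmstat r values against the CPU count."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    if not r_samples:
        return "ok"
    # The rule talks about sustained values, so judge by the lowest sample.
    lowest = min(r_samples)
    if lowest > 4 * ncpus:
        return "cpu shortage"
    if lowest > ncpus:
        return "waiting for cpu"
    return "ok"


print(classify_run_queue([9, 10, 12], ncpus=2))
print(classify_run_queue([3, 5, 4], ncpus=2))
print(classify_run_queue([0, 5, 1], ncpus=2))
```

On a live system the CPU count could come from os.cpu_count() and the samples from the r column of successive vmstat rows.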

system

in: number of interrupts per second

cs: number of context switches per second

The larger these two values are, the more CPU time the kernel consumes.

cpu

us: percentage of CPU time consumed by user processes. If it stays high over a long period, the program needs to be optimized.

sy: percentage of CPU time consumed by the kernel. A persistently high sy value is not a healthy sign.

wa: percentage of CPU time spent waiting for I/O. A high value means I/O waits are serious, possibly because of a large amount of random disk access or a disk bottleneck.

id: percentage of time the CPU is idle. If id stays at 0 and sy is twice us, the system is short of CPU resources. When this happens, tune the application's CPU usage so that it uses the CPU more efficiently. You can also consider adding more CPUs.
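The rule for id, sy and us can be sketched as a predicate over several vmstat rows. The function name is hypothetical, and two readings are assumptions: "sy is twice us" is taken to mean "sy is at least twice us", and a row with sy equal to 0 is never flagged.

```python
def cpu_shortage(samples):
    """samples: (us, sy, id) percentages, one tuple per vmstat row.

    Returns True only if every row shows id at 0 with sy at least twice us.
    """
    samples = list(samples)
    if not samples:
        return False
    return all(
        idle == 0 and sy > 0 and sy >= 2 * us
        for us, sy, idle in samples
    )


# Hypothetical readings, for illustration only.
print(cpu_shortage([(30, 70, 0), (25, 75, 0)]))
print(cpu_shortage([(30, 70, 0), (60, 30, 10)]))
```

Requiring the condition on every row keeps a single busy interval from being reported as a lasting shortage.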

mpstat (Multiprocessor Statistics)

Real-time monitoring; the statistics come from the /proc/stat file.

$ mpstat -P ALL 2
Linux 2.6.18-92.el5 ()         08/30/2012

08:16:34 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
08:16:36 PM  all    0.78    0.00    0.26    0.26    0.00    0.26    0.00   98.44   1058.85
08:16:36 PM    0    0.52    0.00    0.52    0.00    0.00    0.52    0.00   98.44   1058.85
08:16:36 PM    1    0.52    0.00    0.00    0.00    0.00    0.00    0.00   99.48      0.00

The command above samples the usage of all CPUs every 2 seconds. Adding a count, as in mpstat -P ALL 2 10, stops after 10 samples. The syntax is as follows:

mpstat [-P {cpu | ALL}] [interval [count]]

-P: which CPU to monitor; ALL is normally used

interval: time between samples, in seconds

count: number of samples
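When mpstat is driven from a script, the syntax above maps onto an argument list. The helper below is a hypothetical sketch (mpstat_argv is not part of any library); it only builds the list and enforces that count is given together with interval.

```python
def mpstat_argv(cpu="ALL", interval=None, count=None):
    """Build an argument list following: mpstat [-P {cpu | ALL}] [interval [count]]."""
    argv = ["mpstat", "-P", str(cpu)]
    if interval is None:
        if count is not None:
            raise ValueError("count can only be given together with interval")
        return argv
    argv.append(str(interval))
    if count is not None:
        argv.append(str(count))
    return argv


print(mpstat_argv("ALL", 2, 10))
print(mpstat_argv(0, 5))
```

The resulting list could be passed to subprocess.run(argv, capture_output=True, text=True) on a machine where mpstat is installed.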

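Meaning of the output columns

%user: percentage of CPU time spent in user mode

%nice: percentage of CPU time spent in user mode on processes running with a nice priority

%sys: percentage of CPU time spent in kernel mode

%iowait: percentage of time spent waiting for I/O

%irq: percentage of CPU time spent servicing hardware interrupts

%soft: percentage of CPU time spent servicing software interrupts

%idle: percentage of time the CPU is idle

intr/s: number of interrupts received by the CPU per second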
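Since these percentages are derived from the counters in /proc/stat, the calculation can be sketched by taking two readings of a cpu line and dividing each counter's increase by the total increase. This is an approximation of the idea, not mpstat's actual code; the function names and the two sample lines are assumptions for illustration.

```python
# Order of the first eight counters on a "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Split a /proc/stat cpu line into (label, {field: ticks})."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return parts[0], dict(zip(FIELDS, values))


def cpu_percentages(before, after):
    """Percentage of elapsed CPU time spent in each state between two readings."""
    deltas = {k: after.get(k, 0) - before.get(k, 0) for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * d / total for k, d in deltas.items()}


# Hypothetical readings taken some seconds apart.
_, first = parse_cpu_line("cpu  1000 10 500 8000 100 5 15 0 0 0")
_, second = parse_cpu_line("cpu  1100 10 550 8820 130 5 15 0 0 0")

pct = cpu_percentages(first, second)
print({k: round(v, 2) for k, v in pct.items()})
```

On a Linux machine the two lines would come from reading the first line of /proc/stat twice, with a sleep of the chosen interval in between.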


sar

$ sar -u 2
Linux 2.6.18-92.el5 ()         08/30/2012

08:28:36 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
08:28:38 PM       all      0.26      0.00      0.00      0.78      0.00     98.97
08:28:40 PM       all      0.52      0.00      0.52      0.00      0.00     98.97

sar [options] [-A] [-o file] t [n]

On the command line, the t and n parameters together define the sampling interval and the number of samples. t is the sampling interval and is required; n is the number of samples and is optional, defaulting to 1. -o file stores the command results in binary format in a file; "file" here is not a keyword but a file name. options are the command-line options. The sar command has many options; only the common ones are listed below:

-A: Sum of all reports.
-u: CPU utilization.
-v: Status of the process, inode, file, and lock tables.
-d: Disk usage report.
-r: Memory and swap space usage statistics.
-g: Serial I/O activity.
-b: Buffer usage.
-a: File read/write activity.
-c: System call activity.
-q: Run queue length and system load average.
-R: Process activity.
-y: Terminal device activity.
-w: System swapping activity.
-x {pid | SELF | ALL}: Statistics for the specified process ID; the SELF keyword reports on the sar process itself, and the ALL keyword reports on all system processes.



%user: percentage of CPU time spent in user mode.
%nice: percentage of CPU time spent in user mode on processes with a nice value.
%system: percentage of CPU time spent in system mode.
%iowait: percentage of time the CPU is idle while waiting for I/O to complete.
%steal: percentage of time a virtual CPU waits involuntarily while the hypervisor services another virtual processor.
%idle: percentage of time the CPU is idle.
Of all these columns, %iowait and %idle deserve the most attention. A %iowait value that is too high indicates an I/O bottleneck on the disk. A high %idle value means the CPU is mostly idle; if %idle is high but the system still responds slowly, the CPU may be waiting for memory to be allocated. If %idle stays below 10, the system's CPU processing power is relatively low, and the CPU is the resource the system needs most.
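These two checks can be sketched as a function over several sar -u rows. The %idle limit of 10 comes from the text; the %iowait limit of 20 is an arbitrary placeholder, because the text does not say what "too high" means. The function name and messages are also assumptions.

```python
def diagnose(samples, iowait_limit=20.0, idle_floor=10.0):
    """samples: list of dicts with "%iowait" and "%idle" keys, one per sar row.

    iowait_limit is an assumed threshold; idle_floor follows the text.
    """
    findings = []
    if not samples:
        return findings
    # Only report conditions that hold for every sample in the series.
    if all(s["%iowait"] > iowait_limit for s in samples):
        findings.append("possible disk I/O bottleneck")
    if all(s["%idle"] < idle_floor for s in samples):
        findings.append("CPU is the scarcest resource")
    return findings


# The two rows from the sar -u output shown earlier in this section.
quiet = [
    {"%iowait": 0.78, "%idle": 98.97},
    {"%iowait": 0.00, "%idle": 98.97},
]
# Hypothetical rows from a CPU-bound machine.
busy = [
    {"%iowait": 1.0, "%idle": 4.0},
    {"%iowait": 0.5, "%idle": 6.5},
]

print(diagnose(quiet))
print(diagnose(busy))
```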


Using sar to analyze the run queue length:
# sar -q 2 10
Linux 2.6.18-53.el5pae (Localhost.localdomain) 03/28/2009
07:58:14 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
07:58:16 PM         0       493      0.64      0.56      0.49
07:58:18 PM         1       491      0.64      0.56      0.49
07:58:20 PM         1       488      0.59      0.55      0.49
07:58:22 PM         0       487      0.59      0.55      0.49
07:58:24 PM         0       485      0.59      0.55      0.49
07:58:26 PM         1       483      0.78      0.59      0.50
07:58:28 PM         0       481      0.78      0.59      0.50
07:58:30 PM         1       480      0.72      0.58      0.50
07:58:32 PM         0       477      0.72      0.58      0.50
07:58:34 PM         0       474      0.72      0.58      0.50
Average:            0       484      0.68      0.57      0.49

runq-sz: length of the run queue, that is, the number of processes ready to run
plist-sz: number of processes and threads in the process list
ldavg-1: system load average for the last minute
ldavg-5: system load average for the last 5 minutes
ldavg-15: system load average for the last 15 minutes
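The Average row can be checked by averaging each column over the ten samples. The sketch below reuses the sample rows from the sar -q output in this section; the function names are assumptions, and the parser expects exactly this "time AM/PM values" layout.

```python
SAR_Q = """\
07:58:14 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
07:58:16 PM 0 493 0.64 0.56 0.49
07:58:18 PM 1 491 0.64 0.56 0.49
07:58:20 PM 1 488 0.59 0.55 0.49
07:58:22 PM 0 487 0.59 0.55 0.49
07:58:24 PM 0 485 0.59 0.55 0.49
07:58:26 PM 1 483 0.78 0.59 0.50
07:58:28 PM 0 481 0.78 0.59 0.50
07:58:30 PM 1 480 0.72 0.58 0.50
07:58:32 PM 0 477 0.72 0.58 0.50
07:58:34 PM 0 474 0.72 0.58 0.50
"""


def parse_sar_q(text):
    """Return one dict of floats per sample row, keyed by column name."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header = lines[0][2:]  # drop the time and the AM/PM marker
    return [dict(zip(header, map(float, ln[2:]))) for ln in lines[1:]]


def column_means(rows):
    """Arithmetic mean of every column across all rows."""
    return {k: sum(r[k] for r in rows) / len(rows) for k in rows[0]}


means = column_means(parse_sar_q(SAR_Q))
for name, value in means.items():
    print(f"{name:>9}: {value:.3f}")
```

Rounded to the precision sar prints, the plist-sz, ldavg-1 and ldavg-5 means agree with the Average row (484, 0.68, 0.57).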

By the way, the meaning of load average
The load average can be understood as the average number of processes waiting for the CPU to run them.
On Linux, the sar -q, uptime, w, and top commands all print the system load average. So what is the system load average?
The system load average is defined as the average number of tasks in the run queue over a specific time interval. A process is in the run queue if all of the following are true:
- It is not waiting for the result of an I/O operation
- It has not voluntarily entered a wait state (that is, it has not called 'wait')
- It is not stopped (for example: waiting to terminate)
For example:
# uptime
20:55:40 up, 3:06, 1 user, load average: 8.13, 5.90, 4.94
The last part of the output shows the average number of processes in the run queue over the past 1, 5, and 15 minutes.
In general, as long as the number of active processes per CPU is no greater than 3, system performance is good; if the number of tasks per CPU is greater than 5, the machine has a serious performance problem. For the example above, assuming the system has two CPUs, the number of tasks per CPU is 8.13/2 = 4.065, which falls between the two thresholds, so the performance of the system is acceptable.
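The per-CPU arithmetic and the two thresholds can be sketched as follows. The function names and the three labels are assumptions chosen to mirror the wording here; os.getloadavg() and os.cpu_count() are standard-library calls, used only inside a helper that is not run by default.

```python
import os


def load_per_cpu(load, ncpus):
    """Divide a load average by the number of CPUs."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    return load / ncpus


def rate_load(load, ncpus):
    """Apply the thresholds from the text: up to 3 is good, above 5 is serious."""
    per_cpu = load_per_cpu(load, ncpus)
    if per_cpu > 5:
        return "serious problem"
    if per_cpu > 3:
        return "acceptable"
    return "good"


def rate_current_system():
    """Rate the 1-minute load average of the machine this runs on (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return rate_load(one_minute, os.cpu_count() or 1)


# The uptime example from the text: load 8.13 on an assumed two-CPU system.
print(load_per_cpu(8.13, 2), rate_load(8.13, 2))
```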
