Linux commands for viewing CPU performance and running status: mpstat, vmstat, iostat, sar, top

Source: Internet
Author: User

Metrics for measuring CPU performance:

1. CPU time used by user processes:
CPU running regular user processes
CPU running niced processes
CPU running real-time processes

2. CPU time used by the system:
For I/O management: interrupts and drivers
For memory management: page swapping
For process management: process startup and context switches

3. wio: the proportion of time the CPU is idle while processes wait for disk I/O.

4. CPU idle time, excluding the wio above

5. The proportion of CPU time spent on context switching

6. nice

7. real-time

8. The length of the run queue

9. Load average

The tools commonly used on Linux to monitor overall CPU performance are:

§ mpstat: shows the average statistics for all CPUs as well as the statistics for a specified CPU.

§ vmstat: shows only the average statistics for all CPUs, plus CPU run queue information.

§ iostat: shows only the average statistics for all CPUs.

§ sar: like mpstat, shows both the average CPU statistics and the statistics for a specified CPU.

§ top: displays information similar to ps, but also shows CPU consumption and refreshes the display at an interval specified by the user.

Each tool is described below:

One, vmstat

# vmstat -n 3 (refresh once every 3 seconds)
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa
 1  0    144 186164 105252 2386848    0    0    18   166   83    2 48 21 31  0
 2  0    144 189620 105252 2386848    0    0     0   177 1039 1210 34 10 56  0
 0  0    144 214324 105252 2386848    0    0     0    10 1071  670 32  5 63  0
 0  0    144 202212 105252 2386848    0    0     0   189 1035  558 20  3 77  0
 2  0    144 158772 105252 2386848    0    0     0   203 1065 2832 70 14 15  0

The CPU-related columns are r (under procs), in and cs (under system), and us, sy, id and wa (under cpu).

procs
--r: the number of processes in the run queue. If r stays above the number of CPUs in the system, the system is running slowly and many processes are waiting for the CPU.
If r is more than 4 times the number of available CPUs, the system is facing a CPU shortage or the CPUs are too slow: most processes are waiting for the CPU, and the system runs very slowly.
system
--in: number of interrupts per second
--cs: number of context switches per second
The larger these two values are, the more CPU time the kernel consumes.

cpu
-us: percentage of CPU time consumed by user processes
A high us value means that user processes consume a lot of CPU time. If us stays above 50% for a long period, consider optimizing the program's algorithms or accelerating the program (for example, for PHP/Perl).
-sy: percentage of CPU time consumed by the kernel. A high sy value means the kernel is consuming a lot of CPU resources; this is not a healthy sign, and the cause should be investigated.
-wa: percentage of CPU time spent waiting for I/O
A high wa value means that I/O waits are severe. This may be caused by a large number of random disk accesses, or by a bottleneck (block operations) on the disk.
-id: percentage of time the CPU is idle. If idle time (id) stays at 0 and system time (sy) is twice the user time (us), the system faces a shortage of CPU resources.
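
The rules of thumb above can be sketched as a small check on one vmstat sample. This is only an illustration: the function names, the finding labels and the CPU count of 2 are assumptions made for the example, and the text speaks of sustained values, so a real check would look at several consecutive samples rather than one.

```python
# Column order of the 16-column vmstat output shown above.
# Newer vmstat versions append further columns (such as "st"); zip() ignores them.
VMSTAT_COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
                  "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def parse_vmstat_row(line):
    """Turn one vmstat data row into a dict keyed by column name."""
    values = [int(v) for v in line.split()]
    return dict(zip(VMSTAT_COLUMNS, values))


def assess_cpu(sample, ncpu):
    """Apply the article's rules of thumb to a single sample.

    The labels returned here are hypothetical names chosen for this sketch.
    """
    findings = []
    if sample["r"] > 4 * ncpu:
        findings.append("cpu-shortage")          # r above 4x the CPU count
    elif sample["r"] > ncpu:
        findings.append("processes-waiting")     # r above the CPU count
    if sample["us"] > 50:
        findings.append("high-user-time")        # us above 50%
    if sample["id"] == 0 and sample["sy"] >= 2 * sample["us"]:
        findings.append("kernel-bound-shortage")  # id at 0 and sy twice us
    return findings


if __name__ == "__main__":
    # Last row of the vmstat output above; ncpu=2 is an assumed CPU count.
    row = "2 0 144 158772 105252 2386848 0 0 0 203 1065 2832 70 14 15 0"
    print(assess_cpu(parse_vmstat_row(row), ncpu=2))
```

For that last sample only the us rule fires, since r equals the assumed CPU count and id is not 0.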

Workaround:
When the problems above occur, adjust how the application uses the CPU so that it uses the CPU more efficiently. You can also consider adding more CPUs. To analyze CPU usage, combine these figures with commands such as mpstat, ps aux, top and prstat -a, and look at how each CPU is being used and which processes are taking up a lot of CPU time. In most cases the problem lies in the application; for example, poorly written SQL statements can cause this behavior.

Two, sar
sar [options] [-A] [-o file] t [n]

On the command line, the t and n parameters together define the sampling interval and the number of samples. t is the sampling interval and is required; n is the number of samples and is optional, with a default value of 1. -o file stores the command's results in a file in binary format, where file is a file name, not a keyword. options are the command-line options. sar has many options; only the common ones are listed below:

-A: the sum of all reports.
-u: CPU utilization.
-v: state of the process, inode, file and lock tables.
-d: disk usage report.
-r: usage statistics for memory and swap space.
-g: serial I/O activity.
-b: buffer usage.
-a: file read and write activity.
-c: system call activity.
-q: run queue length and system load average.
-R: process activity.
-y: terminal device activity.
-w: system swapping activity.
-x {pid | SELF | ALL}: reports statistics for the specified process ID. The SELF keyword reports statistics for the sar process itself, and the ALL keyword reports statistics for all system processes.

Analyzing CPU utilization with sar:
# sar -u 2 10
Linux 2.6.18-53.el5pae (localhost.localdomain) 03/28/2009
07:40:17 PM  CPU    %user    %nice  %system  %iowait   %steal    %idle
07:40:19 PM  all    12.44     0.00     6.97     1.74     0.00    78.86
07:40:21 PM  all    26.75     0.00    12.50    16.00     0.00    44.75
07:40:23 PM  all    16.96     0.00     7.98     0.00     0.00    75.06
07:40:25 PM  all    22.50     0.00     7.00     3.25     0.00    67.25
07:40:27 PM  all     7.25     0.00     2.75     2.50     0.00    87.50
07:40:29 PM  all    20.05     0.00     8.56     2.93     0.00    68.46
07:40:31 PM  all    13.97     0.00     6.23     3.49     0.00    76.31
07:40:33 PM  all     8.25     0.00     0.75     3.50     0.00    87.50
07:40:35 PM  all    13.25     0.00     5.75     4.00     0.00    77.00
07:40:37 PM  all    10.03     0.00     0.50     2.51     0.00    86.97
Average:     all    15.15     0.00     5.91     3.99     0.00    74.95

The output contains:

%user: percentage of time the CPU spent in user mode.
%nice: percentage of time the CPU spent in user mode running niced processes.
%system: percentage of time the CPU spent in system mode.
%iowait: percentage of time the CPU spent waiting for I/O to complete.
%steal: percentage of time the virtual CPU spent in involuntary wait while the hypervisor was servicing another virtual processor.
%idle: percentage of time the CPU was idle.
Of these values, pay the most attention to %iowait and %idle. A %iowait that is too high indicates an I/O bottleneck on the disk. A high %idle means the CPU is mostly idle; if %idle is high but the system still responds slowly, the CPU may be waiting for memory to be allocated, in which case memory should be added. If %idle stays below 10, the system's CPU capacity is insufficient, and the CPU is the resource that most needs attention.
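
As a rough illustration of this reading, the sketch below parses the Average line of the sar -u output and applies the two checks. The %idle limit of 10 comes from the text; the text gives no figure for when %iowait is "too high", so the iowait_limit default of 20.0 is purely an assumption for the example, as are the function names.

```python
SAR_CPU_FIELDS = ["user", "nice", "system", "iowait", "steal", "idle"]


def parse_sar_average(line):
    """Parse a line such as 'Average: all 15.15 0.00 5.91 3.99 0.00 74.95'."""
    fields = line.split()
    # fields[0] is "Average:", fields[1] is the CPU label, then six percentages.
    return dict(zip(SAR_CPU_FIELDS, (float(v) for v in fields[2:8])))


def assess_sar(avg, iowait_limit=20.0):
    """Return notes based on %iowait and %idle.

    iowait_limit is an assumed threshold, not a value given by the article.
    """
    notes = []
    if avg["iowait"] > iowait_limit:
        notes.append("possible disk I/O bottleneck")
    if avg["idle"] < 10:
        notes.append("CPU is the limiting resource")
    return notes


if __name__ == "__main__":
    avg = parse_sar_average("Average:     all    15.15     0.00     5.91     3.99     0.00    74.95")
    print(avg["idle"], assess_sar(avg))
```

For the averages shown above neither check fires, which matches a system that is mostly idle.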

Analyzing the run queue length with sar:
# sar -q 2 10
Linux 2.6.18-53.el5pae (localhost.localdomain) 03/28/2009
07:58:14 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
07:58:16 PM         0       493      0.64      0.56      0.49
07:58:18 PM         1       491      0.64      0.56      0.49
07:58:20 PM         1       488      0.59      0.55      0.49
07:58:22 PM         0       487      0.59      0.55      0.49
07:58:24 PM         0       485      0.59      0.55      0.49
07:58:26 PM         1       483      0.78      0.59      0.50
07:58:28 PM         0       481      0.78      0.59      0.50
07:58:30 PM         1       480      0.72      0.58      0.50
07:58:32 PM         0       477      0.72      0.58      0.50
07:58:34 PM         0       474      0.72      0.58      0.50
Average:            0       484      0.68      0.57      0.49

runq-sz: length of the run queue (the number of processes waiting to run).
plist-sz: number of processes and threads in the process list.
ldavg-1: system load average over the last minute.
ldavg-5: system load average over the last five minutes.
ldavg-15: system load average over the last 15 minutes.

A note on the meaning of load average
The load average can be understood as the average number of processes waiting for the CPU to run them.
On Linux, commands such as sar -q, uptime, w and top all print the system load average. So what is the system load average?
The system load average is defined as the average number of tasks in the run queue during a specific time interval. A process is in the run queue if it meets the following criteria:
- It is not waiting for the result of an I/O operation
- It has not voluntarily entered a wait state (that is, it has not called "wait")
- It is not stopped (for example, waiting to be terminated)
For example:
# uptime
20:55:40 up, 3:06, 1 user, load average: 8.13, 5.90, 4.94
The last three values in the output are the average number of processes in the run queue over the past 1, 5 and 15 minutes.
In general, as long as the number of active processes per CPU is no greater than 3, system performance is good; if the number of tasks per CPU is greater than 5, the machine has a serious performance problem. In the example above, assuming the system has two CPUs, the number of tasks per CPU is 8.13/2 = 4.065, which indicates that system performance is acceptable.
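
The per-CPU arithmetic above can be written out as a short sketch. The function names and the three labels are made up for this example; the thresholds of 3 and 5 are the ones given in the text, and the two-CPU count is the same assumption the example makes.

```python
import os


def load_per_cpu(load_average, ncpu):
    """Divide a load average by the number of CPUs."""
    return load_average / ncpu


def rate_load(per_cpu):
    """Classify load per CPU using the thresholds from the text."""
    if per_cpu <= 3:
        return "good"
    if per_cpu > 5:
        return "serious problem"
    return "acceptable"


if __name__ == "__main__":
    # Worked example from the text: load average 8.13 on an assumed 2-CPU system.
    per_cpu = load_per_cpu(8.13, 2)
    print(per_cpu, rate_load(per_cpu))

    # On a live Unix system the same figures are available from the standard library.
    one_minute, five_minutes, fifteen_minutes = os.getloadavg()
    print(rate_load(load_per_cpu(one_minute, os.cpu_count() or 1)))
```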

Three, iostat

# iostat -c 2 10
Linux 2.6.18-53.el5pae (localhost.localdomain) 03/28/2009

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          30.10    0.00    4.89    5.63    0.00   59.38

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.46    0.00    1.74    0.25    0.00   89.55

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.06    0.00   11.28    1.25    0.00   65.41

Four, mpstat
mpstat is short for Multiprocessor Statistics and is a real-time system monitoring tool. It reports CPU-related statistics, which are read from the /proc/stat file. On a multi-CPU system it can show both the average status of all CPUs and the status of a specific CPU. Only the CPU-related parameters of mpstat are described here. The mpstat syntax is as follows:

mpstat [-P {cpu | ALL}] [interval [count]]

The parameters have the following meanings:

Parameter: interpretation

-P {cpu | ALL}: which CPU to monitor; cpu takes a value in [0, number of CPUs - 1]

interval: the time between two adjacent samples

count: the number of samples; count can only be used together with interval

When run without parameters, mpstat displays averages for all statistics since the system started. When interval is given, the first line of output is the average since the system started; from the second line on, each line is the average over the preceding interval. The CPU-related output fields have the following meanings:

Parameter: interpretation; how it is computed from /proc/stat

CPU: processor ID

user: CPU time in user mode during the interval (%), excluding processes with a positive nice value; duser/dtotal*100

nice: CPU time of processes with a positive nice value (lowered priority) during the interval (%); dnice/dtotal*100

system: kernel time during the interval (%); dsystem/dtotal*100

iowait: time spent waiting for disk I/O during the interval (%); diowait/dtotal*100

irq: hard interrupt time during the interval (%); dirq/dtotal*100

soft: soft interrupt time during the interval (%); dsoftirq/dtotal*100

idle: time during the interval in which the CPU was idle for any reason other than waiting for a disk I/O operation (%); didle/dtotal*100

intr/s: number of interrupts received by the CPU per second during the interval; dintr/interval

Total CPU time: total_cur = user_cur + system_cur + nice_cur + idle_cur + iowait_cur + irq_cur + softirq_cur

total_pre = user_pre + system_pre + nice_pre + idle_pre + iowait_pre + irq_pre + softirq_pre

duser = user_cur - user_pre

dtotal = total_cur - total_pre

Here _cur denotes the current value and _pre denotes the value one interval earlier. The other deltas (dnice, dsystem and so on) are computed in the same way as duser. All values in the table above are rounded to two decimal places.
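
The delta formulas above can be sketched in a few lines of Python. This is a minimal illustration, not how mpstat itself is implemented: the function names are made up for the example, the counter values in the usage section are invented, and only the first seven counters of a "cpu" line from /proc/stat are used, matching the seven terms in the total above.

```python
# Order of the first seven counters on a "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Parse a /proc/stat line such as 'cpu 150 0 70 900 40 20 20'."""
    parts = line.split()
    # parts[0] is the label ("cpu", "cpu0", ...); later counters are ignored.
    return dict(zip(FIELDS, (int(v) for v in parts[1:8])))


def cpu_percentages(pre, cur):
    """Compute d<field>/dtotal*100 for each field, rounded to two decimals."""
    dtotal = sum(cur.values()) - sum(pre.values())
    if dtotal <= 0:
        raise ValueError("counters did not advance between the two samples")
    return {name: round((cur[name] - pre[name]) / dtotal * 100, 2)
            for name in FIELDS}


if __name__ == "__main__":
    # Invented counter values, standing in for two reads one interval apart.
    pre = parse_cpu_line("cpu 100 0 50 800 30 10 10")
    cur = parse_cpu_line("cpu 150 0 70 900 40 20 20")
    print(cpu_percentages(pre, cur))
```

On a Linux system the two samples would come from reading the first line of /proc/stat twice, one interval apart.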
# mpstat -P ALL 2 10
Linux 2.6.18-53.el5pae (localhost.localdomain) 03/28/2009

10:07:57 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
10:07:59 PM  all   20.75    0.00   10.50    1.50    0.25    0.25    0.00   66.75   1294.50
10:07:59 PM    0   16.00    0.00    9.00    1.50    0.00    0.00    0.00   73.50   1000.50
10:07:59 PM    1   25.76    0.00   12.12    1.52    0.00    0.51    0.00   60.10    294.00

