Linux Performance monitoring: CPU

Last Update: 2015-06-15 Source: Internet

Author: User


How heavily the CPU is used depends mainly on what kind of work is running on it. Copying a file, for example, usually consumes little CPU, because most of the work is done by DMA (Direct Memory Access); the CPU is only notified once the copy has finished. Scientific computing usually consumes much more CPU: most of the computation has to be done on the CPU itself, while subsystems such as memory and disk only store data temporarily. To monitor and understand CPU performance you need to know some operating system basics, such as interrupts, process scheduling, context switches, and run queues. Here VPSee uses an analogy to introduce these concepts and how they relate. The CPU is an innocent, hard-working employee. He always has work to do (processes, threads) and keeps his own to-do list (the run queue). The boss (the process scheduler) decides what he should work on. He has to communicate with the boss to learn what the boss wants and adjust his work promptly (context switches), and some work has to be reported to the boss right away (interrupts). So besides doing his own work, the employee (the CPU) spends a lot of time and energy on communication and reporting.

The CPU is also a hardware resource and, like any other hardware device, it needs a driver and a management program before it can be used. We can think of the kernel's process scheduler as the CPU's management program: it manages and allocates CPU resources, schedules processes onto the CPU in a reasonable way, and decides which process runs and which process waits. The scheduler mainly dispatches two kinds of work, processes (or threads) and interrupts, and it assigns them different priorities: hardware interrupts have the highest priority, followed by kernel (system) processes, and finally user processes. Each CPU maintains a run queue that holds the threads that are ready to run. A thread is either sleeping (blocked, waiting for I/O) or runnable. If the CPU load is already too high and new requests keep arriving, the scheduler temporarily cannot keep up, and threads have to wait in the run queue. VPSee set out to discuss performance monitoring, yet so far this has been a pile of concepts with no mention of performance. So how do these concepts relate to performance monitoring? Closely. If you were the boss, how would you check an employee's efficiency (performance)? We would generally use the following information to judge whether an employee is slacking:

How much work the employee accepts, completes, and reports to the boss (interrupts);
How often the employee talks with the boss to negotiate the progress of each job (context switches);
Whether the employee's to-do list is full (run queue);
How efficiently the employee works, and whether he is slacking (CPU utilization).

Now replace the employee with the CPU: we can monitor CPU performance by looking at these important parameters: interrupts, context switches, the run queue, and CPU utilization.
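On Linux, the kernel exposes raw counters for all four of these parameters in /proc/stat. The following is a minimal Python sketch, not part of the original article: the function name parse_proc_stat and the sample text are made up for illustration, and the field layout is assumed to be the one described in the proc(5) manual page.

```python
def parse_proc_stat(text):
    """Extract interrupt, context-switch, run-queue and CPU-time counters
    from the contents of /proc/stat (layout as described in proc(5))."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        key = fields[0]
        if key == "cpu":
            # Aggregate line: user nice system idle iowait irq softirq ...
            names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]
            stats["cpu"] = dict(zip(names, map(int, fields[1:8])))
        elif key == "intr":
            # The first number is the total of all interrupts since boot.
            stats["interrupts"] = int(fields[1])
        elif key == "ctxt":
            stats["context_switches"] = int(fields[1])
        elif key == "procs_running":
            stats["runnable"] = int(fields[1])
        elif key == "procs_blocked":
            stats["blocked"] = int(fields[1])
    return stats


# Made-up sample in /proc/stat format; on a real system, read the file
# with open("/proc/stat").read() instead.
SAMPLE = """cpu  100 0 50 800 40 5 5 0 0 0
cpu0 100 0 50 800 40 5 5 0 0 0
intr 12345 10 0 0
ctxt 67890
procs_running 2
procs_blocked 1
"""

stats = parse_proc_stat(SAMPLE)
busy = stats["cpu"]["user"] + stats["cpu"]["system"]
total = sum(stats["cpu"].values())
print(stats["interrupts"], stats["context_switches"], stats["runnable"])
print(f"user+system share of CPU time: {100 * busy / total:.1f}%")
```

The interrupt, context-switch, and CPU-time counters are cumulative since boot, so a per-second rate needs two readings taken an interval apart and subtracted.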

Baseline

The previous article, Linux Performance Monitoring: Introduction, explained that you need to know the baseline before you start monitoring performance. So what is the baseline for CPU performance? Usually we expect the system to meet the following targets:

CPU utilization: if the CPU is 100% utilized, the time should be balanced at roughly 65%-70% user time, 30%-35% system time, and 0%-5% idle time;
Context switches: these should be judged together with CPU utilization. If CPU utilization stays within the balance above, a large number of context switches is acceptable;
Run queue: each run queue should hold no more than 3 threads (per processor). For example, a dual-processor system should have no more than 6 threads in its run queue.
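These targets can be written down as a simple check. The sketch below is only an illustration: the function name check_baseline and the sample numbers are hypothetical, and the thresholds are the rule-of-thumb values listed above rather than hard limits.

```python
def check_baseline(us, sy, idle, run_queue, processors):
    """Compare one sample against the rule-of-thumb targets.
    Returns a list of warnings (empty if every target holds)."""
    warnings = []
    # The 65-70 / 30-35 / 0-5 split only applies to a fully loaded CPU,
    # so skip it when the CPU still has more than 5% idle time.
    if idle <= 5:
        if not 65 <= us <= 70:
            warnings.append(f"user time {us}% outside 65-70%")
        if not 30 <= sy <= 35:
            warnings.append(f"system time {sy}% outside 30-35%")
    # No more than 3 runnable threads per processor.
    limit = 3 * processors
    if run_queue > limit:
        warnings.append(f"run queue {run_queue} exceeds {limit}")
    return warnings


# Hypothetical samples: a balanced dual-processor system, then an
# overloaded quad-core system dominated by system time.
print(check_baseline(us=68, sy=30, idle=2, run_queue=5, processors=2))
print(check_baseline(us=20, sy=80, idle=0, run_queue=14, processors=4))
```

The first call returns an empty list; the second reports the user time, the system time, and the run queue.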

vmstat

vmstat is a small tool for viewing the overall performance of a system. It works well even under very heavy load, and it can collect performance data continuously at a fixed interval.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 2  1    140 2787980 336304 3531996    0    0     0     ? 1166 5033  3  3  ?  ?  0
 0  1      ? 2788296 336304 3531996    0    0     0     0 1194 5605  3  3  ?  ?  0
 0  1      ? 2788436 336304 3531996    0    0     0     0 1249 8036  5  4  ?  ?  0
 0  1      ? 2782688 336304 3531996    0    0     0     0 1333 7792  6  6  ?  ?  0
 3  1      ? 2779292 336304 3531992    0    0     0    28 1323 7087  4  5  ?  ?  0

Field descriptions:

r: the number of threads in the run queue; these threads are all runnable, but no CPU is available for them yet;
b: the number of blocked processes waiting for I/O requests to complete;
in: the number of interrupts handled per second;
cs: the number of context switches per second on the system;
us: the percentage of CPU time consumed by user code;
sy: the percentage of CPU time consumed by the kernel and interrupts;
wa: the percentage of CPU time spent idle because all runnable threads are blocked waiting for I/O;
id: the percentage of CPU time spent completely idle.
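To work with these fields programmatically, the column header can be used to name each value. The following Python sketch is an illustration only: parse_vmstat is a hypothetical helper, and the sample text is made up in the usual vmstat layout rather than taken from a real measurement.

```python
def parse_vmstat(output):
    """Turn vmstat text output into a list of dicts keyed by column name."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    # The column names are on the line that starts with "r"; the line above
    # it only groups the columns (procs, memory, swap, ...).
    header_index = next(i for i, ln in enumerate(lines) if ln.split()[0] == "r")
    names = lines[header_index].split()
    samples = []
    for ln in lines[header_index + 1:]:
        values = ln.split()
        if len(values) != len(names) or not values[0].isdigit():
            continue  # skip repeated headers or truncated lines
        samples.append(dict(zip(names, map(int, values))))
    return samples


# Made-up sample; live data could come from, for example,
# subprocess.run(["vmstat", "1", "3"], capture_output=True, text=True).stdout
SAMPLE = """procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 100000  20000 300000    0    0     0     8  250  600 10  5 84  1  0
 3  1      0  99000  20000 300000    0    0     4     0  300  900 20 10 60 10  0
"""

for sample in parse_vmstat(SAMPLE):
    print(sample["r"], sample["in"], sample["cs"], sample["us"], sample["sy"])
```

Keying on the header line rather than on fixed positions keeps the sketch usable if a particular vmstat version prints a slightly different set of columns.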

Here are two real-world examples to analyze:

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 4  0    140 2915476 341288 3951700    0    0     0     0 1057  523  ?  ?  0  0  0
 4  0      ? 2915724 341296 3951700    0    0     0     0 1048  546  ?  ?  0  0  0
 4  0      ? 2915848 341296 3951700    0    0     0     0 1044  514  ?  ?  0  0  0
 4  0      ? 2915848 341296 3951700    0    0     0    24 1044  564  ?  ?  0  0  0
 4  0      ? 2915848 341296 3951700    0    0     0     0 1060  546  ?  ?  0  0  0

From the data above, you can see the following:

Interrupts (in) are very high while context switches (cs) are relatively low, which suggests that the CPU is constantly requesting resources;
System time (sy) stays above 80% while context switches (cs) stay low, which suggests that a single process may be hogging the CPU (constantly requesting resources);
The run queue (r) holds exactly 4 threads.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
14  0    140 2904316 341912 3952308    0    0     0   460 1106 9593  ?  ?  1  0  0
17  0      ? 2903492 341912 3951780    0    0     0     0 1037 9614  ?  ?  1  0  0
20  0      ? 2902016 341912 3952000    0    0     0     0 1046 9739  ?  ?  1  0  0
17  0      ? 2903904 341912 3951888    0    0     0     ? 1044 9879  ?  ?  0  0  0
16  0      ? 2904580 341912 3952108    0    0     0     0 1055 9808  ?  ?  1  0  0

From the data above, you can see the following:

Context switches (cs) are much higher than interrupts (in), which means the kernel has to keep switching between processes;
Looking further, system time (sy) is very high and user time (us) is low; combined with the high rate of context switches (cs), this suggests that the running application is making a large number of system calls;
The run queue (r) holds 14 or more threads; given this test machine's hardware (quad core), it should stay within 12.
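The two examples can be compared side by side using the r, in, and cs values shown in the listings. This Python sketch is an illustration under stated assumptions: summarize is a hypothetical helper, the limit of 3 runnable threads per core is the rule of thumb from the baseline section, and the first example is assumed to come from the same quad-core machine as the second.

```python
from statistics import mean

# (r, in, cs) per sample, copied from the two vmstat listings above.
EXAMPLE_1 = [(4, 1057, 523), (4, 1048, 546), (4, 1044, 514),
             (4, 1044, 564), (4, 1060, 546)]
EXAMPLE_2 = [(14, 1106, 9593), (17, 1037, 9614), (20, 1046, 9739),
             (17, 1044, 9879), (16, 1055, 9808)]


def summarize(samples, cores, per_core_limit=3):
    """Summarize context switches per interrupt and run-queue pressure."""
    run_queue = [r for r, _, _ in samples]
    interrupts = [intr for _, intr, _ in samples]
    switches = [cs for _, _, cs in samples]
    limit = cores * per_core_limit
    return {
        "cs_per_interrupt": round(mean(switches) / mean(interrupts), 2),
        "max_run_queue": max(run_queue),
        "run_queue_limit": limit,
        "run_queue_ok": max(run_queue) <= limit,
    }


for name, samples in (("example 1", EXAMPLE_1), ("example 2", EXAMPLE_2)):
    print(name, summarize(samples, cores=4))
```

In the first example there is about one context switch for every two interrupts and the run queue stays within the limit; in the second there are roughly nine context switches per interrupt and the run queue peaks at 20, well above 12.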

mpstat

mpstat is similar to vmstat; the difference is that mpstat can report data for each processor separately. The output below shows that CPU1 and CPU2 are basically not being put to use, so the system has enough capacity to handle more tasks.

$ mpstat -P ALL 1
Linux 2.6.18-164.el5 (vpsee)    11/13/2009

02:24:33 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
02:24:34 PM  all    5.26    0.00    4.01   25.06    0.00    0.00    0.00   65.66   1446.00
02:24:34 PM    0    7.00    0.00    8.00    0.00    0.00    0.00    0.00   85.00   1001.00
02:24:34 PM    1   13.00    0.00    8.00    0.00    0.00    0.00    0.00   79.00    444.00
02:24:34 PM    2    0.00    0.00    0.00  100.00    0.00    0.00    0.00    0.00      0.00
02:24:34 PM    3    0.99    0.00    0.99    0.00    0.00    0.00    0.00   98.02      0.00
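Per-processor output like this is easy to scan by program. The sketch below parses the mpstat sample shown above; per_cpu_stats is a hypothetical helper, and the 95% idle and 90% iowait thresholds are arbitrary values chosen for illustration, not figures from the article.

```python
MPSTAT_OUTPUT = """\
02:24:33 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
02:24:34 PM  all    5.26    0.00    4.01   25.06    0.00    0.00    0.00   65.66   1446.00
02:24:34 PM    0    7.00    0.00    8.00    0.00    0.00    0.00    0.00   85.00   1001.00
02:24:34 PM    1   13.00    0.00    8.00    0.00    0.00    0.00    0.00   79.00    444.00
02:24:34 PM    2    0.00    0.00    0.00  100.00    0.00    0.00    0.00    0.00      0.00
02:24:34 PM    3    0.99    0.00    0.99    0.00    0.00    0.00    0.00   98.02      0.00
"""


def per_cpu_stats(output):
    """Map each CPU number to its column values, using the header for names."""
    lines = [ln.split() for ln in output.splitlines() if ln.strip()]
    header = next(fields for fields in lines if "CPU" in fields)
    cpu_col = header.index("CPU")
    names = header[cpu_col + 1:]
    stats = {}
    for fields in lines:
        if fields is header or len(fields) != len(header):
            continue
        cpu = fields[cpu_col]
        if cpu.isdigit():  # skip the "all" summary row
            stats[int(cpu)] = dict(zip(names, map(float, fields[cpu_col + 1:])))
    return stats


stats = per_cpu_stats(MPSTAT_OUTPUT)
mostly_idle = [cpu for cpu, s in stats.items() if s["%idle"] > 95]
io_bound = [cpu for cpu, s in stats.items() if s["%iowait"] > 90]
print("mostly idle:", mostly_idle)
print("waiting on I/O:", io_bound)
```

Column names differ between sysstat versions, which is why the sketch reads them from the header line instead of hard-coding positions.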

How can you see how much CPU a particular program or process consumes? Below is Firefox running on one of VPSee's Sun Ray servers, with only 2 users currently using Firefox:

$ while :; do ps -eo pid,ni,pri,pcpu,psr,comm | grep 'firefox'; sleep 1; done
  PID  NI PRI %CPU PSR COMMAND
 7252   0   ?  3.2   3 firefox
 9846   0   ?  8.8   0 firefox
 7252   0   ?  3.2   2 firefox
 9846   0   ?  8.8   0 firefox
 7252   0  24  3.2   2 firefox
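Repeated ps samples like these can be grouped per process to see how each one behaves over time. The following Python sketch is only an illustration: cpu_by_pid is a hypothetical helper, and the sample text is made up in the same six-column layout (PID, NI, PRI, %CPU, PSR, COMMAND) rather than copied from the server above.

```python
from collections import defaultdict


def cpu_by_pid(ps_output, command):
    """Collect the %CPU readings per PID for lines whose command matches."""
    readings = defaultdict(list)
    for line in ps_output.splitlines():
        fields = line.split()
        # Expected columns: PID NI PRI %CPU PSR COMMAND
        if len(fields) != 6 or fields[5] != command:
            continue
        readings[int(fields[0])].append(float(fields[3]))
    return dict(readings)


# Made-up sample; live data could be collected by running
# ps -eo pid,ni,pri,pcpu,psr,comm once per second and joining the output.
SAMPLE = """\
  PID  NI PRI %CPU PSR COMMAND
 1001   0  24  3.0   3 firefox
 1002   0  24  9.0   0 firefox
 1003   0  24  0.5   1 sshd
 1001   0  24  5.0   2 firefox
 1002   0  24  7.0   0 firefox
"""

for pid, values in cpu_by_pid(SAMPLE, "firefox").items():
    print(pid, "average %CPU:", sum(values) / len(values))
```

Matching on the exact command name avoids the false positives that a plain grep can produce when another process merely contains the same string.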

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.