https://help.aliyun.com/knowledge_detail/41225.html?spm=5176.7841174.2.2.ifP9Sc
Note: The configuration and instructions in this article were tested on 64-bit CentOS 6.5. Configuration may differ on other operating systems and versions; refer to the official documentation for your operating system.
If the CPU usage of an ECS Linux system stays high, it affects the stability of the system and the operation of the cloud server. This article briefly describes how to troubleshoot high CPU usage.
Viewing CPU load
Use vmstat to view CPU load at the system level
vmstat shows CPU usage for the system as a whole.
Usage:
Format: vmstat -n 1
# -n prints the header only once; 1 refreshes the output every second.
Example output:
$ vmstat -n 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs  us  sy  id  wa  st
 0  0      0  70352 169448 448452    0    0     0     4   10   11   0   0  99   0   0
 0  0      0  70376 169448 448484    0    0     0     0  175  406   0   0 100   0   0
 0  0      0  70376 169448 448484    0    0     0     0  173  414   0   1  99   0   0
 0  0      0  70376 169448 448484    0    0     0   128  212  429   3   0  96   1   0
^C
Output description:
The main columns in the output are:
- r: The number of runnable processes (running or waiting for CPU time). Because a CPU core can run only one thread at a time, a persistently large value usually means the system is running slowly.
- us: Percentage of CPU time spent in user mode. A high value means user processes are consuming a lot of CPU time. If it stays above 50% for a long period, the application's algorithms or code likely need optimizing.
- sy: Percentage of CPU time spent in kernel mode.
- wa: Percentage of CPU time spent waiting for I/O. A high value indicates serious I/O wait, which may be caused by heavy random disk access or by a disk performance bottleneck.
- id: Percentage of CPU time spent idle. If this value stays at 0 and sy is twice us, the system is usually short of CPU resources.
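The column rules above can be applied to captured vmstat output automatically. The following is a minimal sketch, not part of the original article: the function name vmstat_check is hypothetical, the 17-column layout is assumed to match the example output above, and the 20% threshold for wa is an arbitrary illustrative value (the article only says "high").

```shell
# Flag vmstat samples that match the warning signs described above.
# Reads `vmstat -n <interval>` output on stdin; $1 is the number of CPU cores.
# Hypothetical helper; the wa > 20 threshold is an assumption for illustration.
vmstat_check() {
    awk -v cores="${1:-1}" '
        # Data rows start with a number and have 17 columns; header lines are skipped.
        $1 ~ /^[0-9]+$/ && NF >= 17 {
            n++
            r = $1; us = $13; sy = $14; id = $15; wa = $16
            msg = ""
            if (r > cores)                msg = msg " run-queue>cores"
            if (us > 50)                  msg = msg " high-user-cpu"
            if (id == 0 && sy >= 2 * us)  msg = msg " cpu-shortage"
            if (wa > 20)                  msg = msg " io-wait"
            if (msg != "") print "sample " n ":" msg
        }'
}

# Usage (takes 5 one-second samples on a 2-core machine):
#   vmstat -n 1 5 | vmstat_check 2
```

A single flagged sample means little; the article's guidance concerns values that persist over a long period, so look for the same flag across many consecutive samples.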
Use top to view CPU load at the process level
top shows the CPU, memory, and other resource usage of each process.
Usage:
Format: top
Example output:
top - 17:27:13 up 27 days,  3:13,  1 user,  load average: 0.02, 0.03, 0.05
Tasks:  94 total,   1 running,  93 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.1 sy,  0.0 ni, 99.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.1 st
KiB Mem:   1016656 total,   946628 used,    70028 free,   169536 buffers
KiB Swap:        0 total,        0 used,        0 free.   448644 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
    1 root      20   0   41412   3824   2308 S  0.0  0.4   0:19.01 systemd
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.04 kthreadd
Output description:
The third line of the default screen shows overall CPU usage, and the resource usage of each process is listed below it.
Press uppercase P in the interface to sort the results by CPU usage in descending order, then locate the processes that consume the most CPU. Finally, use the system log and the program's own logs to analyze the process further and determine why its CPU consumption is high.
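The same ranking can be produced non-interactively, which is convenient for scripts or for saving a snapshot to a log. This is a minimal sketch under stated assumptions: top_cpu is a hypothetical helper name, and the ps invocation assumes the procps version of ps, where a trailing "=" after a column name suppresses the header.

```shell
# Print the N lines with the highest %CPU from "PID %CPU COMMAND" lines on stdin.
# Hypothetical helper name; N defaults to 5.
top_cpu() {
    sort -k2,2nr | head -n "${1:-5}"
}

# Usage:
#   ps -eo pid=,pcpu=,comm= | top_cpu 5
```

Note that the pcpu value reported by ps is CPU time divided by the process's running time, so it reflects an average over the process lifetime rather than the near-real-time figure top displays. For a one-off snapshot of top's own figures, top -b -n 1 runs top in batch mode for a single iteration.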
Operation case: Use top to terminate CPU-intensive processes directly
As described above, you can use top to view the system load and locate the processes that consume the most CPU.
You can terminate an abnormal process directly from the top interface, as follows:
- Press the lowercase k key to start terminating a process.
- Enter the PID of the process you want to terminate (the first column of the top output). For example, to terminate the process with PID 23, enter 23 and press Enter.
- top then displays a prompt similar to "Send pid 23 signal [15/sigterm]" asking you to confirm the signal. Press Enter to confirm.
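Outside of top, the same signal 15 (SIGTERM) can be sent with kill. The sketch below is an illustration, not the article's procedure: the function name terminate_gracefully and the fallback to signal 9 (SIGKILL) after a grace period are assumptions added here.

```shell
# Send SIGTERM to a PID, wait up to N seconds for it to exit, then fall back to SIGKILL.
# Hypothetical helper; $1 is the PID, $2 is the grace period in seconds (default 5).
terminate_gracefully() {
    target=$1
    waited=0
    # Signal 15 is what top sends by default; fail if the PID is missing or not ours.
    kill -15 "$target" 2>/dev/null || return 1
    while [ "$waited" -lt "${2:-5}" ]; do
        # kill -0 sends no signal; it only tests whether the process still exists.
        kill -0 "$target" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done
    # Still present after the grace period: force termination.
    kill -9 "$target" 2>/dev/null
    return 0
}

# Usage:
#   terminate_gracefully 23 10
```

Trying SIGTERM first gives the program a chance to clean up before it is forced to stop.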
Low CPU usage but high load
- Problem description:
No business programs are running on the Linux system, and top shows that the CPU is mostly idle, yet the load average is very high.
- Solution:
Load average is a measure of system load: the higher the value, the longer the task queue and the more tasks are waiting to run. On Linux it counts tasks in uninterruptible sleep as well as runnable tasks, which is why it can be high while the CPU is idle.
When this happens, the cause may be hung processes stuck in the D state. Run ps -axjf to check whether any D-state processes exist.
The D state is uninterruptible sleep. A process in this state cannot be killed and cannot exit on its own. The only remedies are to restore the resource it depends on or to reboot the system.
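To pick the D-state processes out of a long process list, the state column can be filtered directly. A minimal sketch, assuming the procps version of ps; d_state is a hypothetical helper name.

```shell
# Keep only processes whose state starts with D (uninterruptible sleep).
# Reads "PID STAT COMMAND" lines on stdin. Hypothetical helper name.
d_state() {
    awk '$2 ~ /^D/'
}

# Usage:
#   ps -eo pid=,stat=,comm= | d_state
```

An empty result means no process was in the D state at that instant. Because processes enter and leave this state briefly during normal I/O, it is worth repeating the check a few times and looking for PIDs that stay in the list.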
kswapd0 process consumes high CPU
The operating system manages physical memory through paging and sets aside part of the disk as virtual memory. Because memory is much faster than disk, the system moves pages that are not currently needed out to disk and brings required pages back into memory. kswapd0 is the process in Linux virtual memory management that is responsible for this paging.
- Problem description:
The kswapd0 process consumes a large amount of CPU resources on the system.
- Solution:
When the system runs low on memory, kswapd0 pages frequently. Because paging is very expensive in CPU terms, the process keeps consuming a large share of CPU.
If a monitor such as top shows that kswapd0 stays in a non-sleep state for a long time and keeps consuming a lot of CPU, the system is usually paging continuously, and the problem should be treated as a memory shortage. Use free, ps, and similar commands to check the memory usage of the system and of individual processes for further troubleshooting and analysis.
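The memory checks mentioned above can be sketched as two small filters. These are illustrations under stated assumptions: the helper names are hypothetical, and "free or easily reclaimable" is approximated here as MemFree + Buffers + Cached from /proc/meminfo, which is a rough estimate rather than an exact figure.

```shell
# Print the percentage of RAM that is free or easily reclaimable
# (MemFree + Buffers + Cached), reading /proc/meminfo-style text on stdin.
# Hypothetical helper; the sum is an approximation.
mem_reclaimable_pct() {
    awk '
        /^MemTotal:/ { total = $2 }
        /^MemFree:/  { free = $2 }
        /^Buffers:/  { buffers = $2 }
        /^Cached:/   { cached = $2 }
        END {
            if (total > 0) printf "%.1f\n", (free + buffers + cached) * 100 / total
        }'
}

# List the N largest processes by resident memory from "PID RSS COMMAND" lines on stdin.
top_rss() {
    sort -k2,2nr | head -n "${1:-5}"
}

# Usage:
#   mem_reclaimable_pct < /proc/meminfo
#   ps -eo pid=,rss=,comm= | top_rss 5
```

A low percentage together with a busy kswapd0 supports the memory-shortage diagnosis, and the top_rss list shows which processes to examine first.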
Troubleshooting high CPU utilization on ECS Linux cloud servers