Every Linux system administrator should know how to verify the integrity and availability of hardware, resources, and primary processes. In addition, setting resource limits on a per-user basis is one of the required skills.
In this article, we'll cover some of the ways to ensure that your system's hardware and software are working properly, so you can avoid problems that could take a production environment offline and cost money.
Reporting Linux Processor Statistics
You can use mpstat to view the activity of each processor individually or of the system as a whole, either as a one-time snapshot or as a dynamically updated report.
To use this tool, you first need to install sysstat:
# yum update && yum install sysstat [On CentOS-based systems]
# aptitude update && aptitude install sysstat [On Ubuntu-based systems]
# zypper update && zypper install sysstat [On openSUSE-based systems]
To learn more about sysstat and the tools it includes, see mpstat, pidstat, iostat, and sar in Linux [3].
Once sysstat is installed, you can use mpstat to generate reports of processor statistics.
The following command displays the CPU utilization (-u) of all CPUs (-P ALL) a total of 3 times, at 2-second intervals.
# mpstat -P ALL -u 2 3
Example output:
Linux 3.19.0-32-generic (tecmint.com)   Wednesday March   _x86_64_   (4 CPU)

11:41:07 IST  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11:41:09 IST  all    5.85    0.00    1.12    0.12    0.00    0.00    0.00    0.00    0.00   92.91
11:41:09 IST    0    4.48    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   94.53
11:41:09 IST    1    2.50    0.00    0.50    0.00    0.00    0.00    0.00    0.00    0.00   97.00
11:41:09 IST    2    6.44    0.00    0.99    0.00    0.00    0.00    0.00    0.00    0.00   92.57
11:41:09 IST    3   10.45    0.00    1.99    0.00    0.00    0.00    0.00    0.00    0.00   87.56

11:41:09 IST  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11:41:11 IST  all   11.60    0.12    1.12    0.50    0.00    0.00    0.00    0.00    0.00   86.66
11:41:11 IST    0   10.50    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   88.50
11:41:11 IST    1   14.36    0.00    1.49    2.48    0.00    0.00    0.00    0.00    0.00   81.68
11:41:11 IST    2    2.00    0.50    1.00    0.00    0.00    0.00    0.00    0.00    0.00   96.50
11:41:11 IST    3   19.40    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   79.60

11:41:11 IST  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11:41:13 IST  all    5.69    0.00    1.24    0.00    0.00    0.00    0.00    0.00    0.00   93.07
11:41:13 IST    0    2.97    0.00    1.49    0.00    0.00    0.00    0.00    0.00    0.00   95.54
11:41:13 IST    1   10.78    0.00    1.47    0.00    0.00    0.00    0.00    0.00    0.00   87.75
11:41:13 IST    2    2.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   97.00
11:41:13 IST    3    6.93    0.00    0.50    0.00    0.00    0.00    0.00    0.00    0.00   92.57

Average:      CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:      all    7.71    0.04    1.16    0.21    0.00    0.00    0.00    0.00    0.00   90.89
Average:        0    5.97    0.00    1.16    0.00    0.00    0.00    0.00    0.00    0.00   92.87
Average:        1    9.24    0.00    1.16    0.83    0.00    0.00    0.00    0.00    0.00   88.78
Average:        2    3.49    0.17    1.00    0.00    0.00    0.00    0.00    0.00    0.00   95.35
Average:        3   12.25    0.00    1.16    0.00    0.00    0.00    0.00    0.00    0.00   86.59
To view the statistics of a single CPU (CPU 0 in the following example), you can use:
# mpstat -P 0 -u 2 3
Example output:
Linux 3.19.0-32-generic (tecmint.com)   Wednesday 30 March 2016   _x86_64_   (4 CPU)

11:42:08 IST  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11:42:10 IST    0    3.00    0.00    0.50    0.00    0.00    0.00    0.00    0.00    0.00   96.50
11:42:12 IST    0    4.08    0.00    0.00    2.55    0.00    0.00    0.00    0.00    0.00   93.37
11:42:14 IST    0    9.74    0.00    0.51    0.00    0.00    0.00    0.00    0.00    0.00   89.74
Average:        0    5.58    0.00    0.34    0.85    0.00    0.00    0.00    0.00    0.00   93.23
The output of the above command includes these columns:
CPU: The processor number as an integer, or the word all for the average across all processors.
%usr: The percentage of CPU time spent running user-level code (applications).
%nice: Same as %usr, but for user-level processes running with a nice priority.
%sys: The percentage of CPU time spent running kernel-level code. This does not include the time spent servicing hardware or software interrupts.
%iowait: The percentage of time that the given CPU (or all of them) was idle while the system had an outstanding I/O request, which points to an I/O-intensive workload. A more detailed explanation (with an example) can be viewed here [4].
%irq: The percentage of time spent servicing hardware interrupts.
%soft: Same as %irq, but for software interrupts.
%steal: The percentage of time a virtual machine spends in involuntary wait (steal or stolen time), that is, the time the virtual CPU waits while the hypervisor serves other guests competing for the physical CPU. This value should be kept as small as possible. A high value means the virtual machine is stalling, or soon will be.
%guest: The percentage of time spent running a virtual processor.
%idle: The percentage of time that the CPU was not running any tasks. A low value means the system is under heavy load. In that case, you will need to take a closer look at the process list, as discussed below, to determine what is causing it.
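Since a low %idle value is the signal to watch for, it can help to filter the report automatically. The following is only a minimal sketch: low_idle is a hypothetical helper name, it assumes the column layout shown above (with %idle as the last column), and the sample rows are copied from the "Average:" block of the first example output.

```shell
# low_idle THRESHOLD: read mpstat output on stdin and print the CPU label and
# %idle of every "Average:" data row whose %idle (last column) is below THRESHOLD.
low_idle() {
    awk -v limit="$1" '
        $1 == "Average:" && $2 != "CPU" && $NF + 0 < limit + 0 { print $2, $NF }
    '
}

# Sample rows taken from the "Average:" block of the example output.
sample='Average:  CPU  %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
Average:  all  7.71 0.04 1.16 0.21 0.00 0.00 0.00 0.00 0.00 90.89
Average:    0  5.97 0.00 1.16 0.00 0.00 0.00 0.00 0.00 0.00 92.87
Average:    1  9.24 0.00 1.16 0.83 0.00 0.00 0.00 0.00 0.00 88.78
Average:    2  3.49 0.17 1.00 0.00 0.00 0.00 0.00 0.00 0.00 95.35
Average:    3 12.25 0.00 1.16 0.00 0.00 0.00 0.00 0.00 0.00 86.59'

# List the CPUs that were idle less than 90% of the time on average.
printf '%s\n' "$sample" | low_idle 90
```

On a live system the same function could be fed directly, for instance with mpstat -P ALL -u 2 3 | low_idle 90. The threshold of 90 is arbitrary and chosen only to match the sample data.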
Run the following commands to put the processor under very high load, executing mpstat in another terminal each time:
# dd if=/dev/zero of=test.iso bs=1G count=1
# mpstat -u -P 0 2 3
# ping -f localhost # Interrupt with Ctrl + C after mpstat below completes
# mpstat -u -P 0 2 3
Finally, compare the output of mpstat with the "normal" case:
Linux Processor related Statistics report
As the image above shows, CPU 0 was under heavy load during the first two examples, as indicated by the %idle column.
In the next section, we'll discuss how to identify resource-hungry processes, how to get more information about them, and how to take appropriate action.
Linux Process Report
We can use the well-known ps command to list processes sorted by CPU utilization, combining the -eo option (select all processes and print them in a user-defined format) with the --sort option (specify a custom sort order), for example:
# ps -eo pid,ppid,cmd,%cpu,%mem --sort=-%cpu
The above command shows only the PID, the PPID, the command associated with each process, and its CPU and RAM usage percentages, sorted in descending order of CPU usage. When run while the .iso file was being created, the first few lines of the output were:
Find processes based on CPU utilization
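On a busy system the full listing is long, so in practice you will usually keep only the first few rows. A minimal sketch, where top_cpu is a hypothetical helper name and the default of 5 rows is an arbitrary choice:

```shell
# top_cpu N: print the ps header plus the N processes using the most CPU
# (N defaults to 5 when omitted).
top_cpu() {
    ps -eo pid,ppid,cmd,%cpu,%mem --sort=-%cpu | head -n "$(( ${1:-5} + 1 ))"
}

top_cpu 5
```

The extra line requested from head accounts for the header row that ps prints before the process list.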
Once we have found a process of interest (for example, the one with PID=2822), we can go to /proc/PID (in this case /proc/2822) and list the contents of the directory.
While the process is running, this directory holds several files and subdirectories with detailed information about it.
For example:
/proc/2822/io contains I/O statistics for the process (the number of characters read and written during I/O operations).
/proc/2822/attr/current shows the current SELinux security attributes of the process.
/proc/2822/cgroup describes the control groups (cgroups for short) to which the process belongs, provided the CONFIG_CGROUPS kernel configuration option is enabled. You can verify that it is enabled with:
# cat /boot/config-$(uname -r) | grep -i cgroups
If this option is enabled, you should see:
CONFIG_CGROUPS=y
Using cgroups, you can manage the amount of resources each process is allowed to use, as explained in Chapters 1 through 4 of the Red Hat Enterprise Linux 7 Resource Management Guide [5], Chapter 9 of the openSUSE System Analysis and Tuning Guide [6], and the Control Groups section of the Ubuntu 14.04 Server documentation [7].
/proc/2822/fd is a directory that contains one symbolic link for each file descriptor the process has open. The following image shows this information for the process that was started in tty1 (the first terminal) to create the .iso image:
Find Linux Process Information
The image above shows that stdin (file descriptor 0), stdout (file descriptor 1), and stderr (file descriptor 2) are mapped to /dev/zero, /root/test.iso, and /dev/tty1, respectively.
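The same mapping can be read without an image by resolving the symbolic links under /proc/PID/fd. The following is a rough sketch that assumes a Linux system with /proc mounted; show_fds is a hypothetical helper name, and inspecting another user's process generally requires root privileges.

```shell
# show_fds PID: print where file descriptors 0, 1 and 2 of PID point.
show_fds() {
    for fd in 0 1 2; do
        # Each entry in /proc/PID/fd is a symlink; readlink prints its target.
        target=$(readlink "/proc/$1/fd/$fd" 2>/dev/null)
        printf 'fd %s -> %s\n' "$fd" "${target:-<not available>}"
    done
}

# Inspect the current shell ($$ expands to its own PID).
show_fds "$$"
```

For the process in the example above, the call would be show_fds 2822.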
More information about /proc can be found in "The /proc filesystem" document maintained by kernel.org, and in the Linux Programmer's Manual.
Set resource limits for each user in Linux
If you are not careful and allow any user to run an unlimited number of processes, you may eventually experience an unexpected system shutdown or find the system locked in an unusable state. To prevent this from happening, you should place a limit on the number of processes users can start.
You can set the limit by adding the following line at the end of the /etc/security/limits.conf file:
* hard nproc 10
The first field can be used to indicate a user, a group, or everyone (*), whereas the second field enforces a hard limit on the number of processes (nproc), here set to 10. Log out and back in for the setting to take effect.
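After logging back in, it is worth confirming that the limit is actually in force. One possible way to check, sketched here under the assumption of a Linux system with /proc mounted (show_nproc is a hypothetical helper name): resource limits are inherited by child processes, so the limits that awk reports for itself are those of the session that started it.

```shell
# show_nproc: print the soft and hard limits on the number of processes
# that apply to the current session.
show_nproc() {
    # /proc/self/limits has one row per limit:
    #   Max processes   <soft>   <hard>   processes
    awk '/^Max processes/ { print "soft:", $3, "hard:", $4 }' /proc/self/limits
}

show_nproc
```

With the limits.conf line above in effect, the hard value reported for a regular user should be 10.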
Now let's look at what happens when a non-root user (whether a legitimate one or not) attempts to start a shell fork bomb (see the wiki entry [8]). If we had not set a limit, the fork bomb would initially launch two instances of a function and then keep duplicating each of them in a never-ending loop, eventually bringing your system to a halt.
However, with the above restriction in place, the fork bomb does not succeed, although the user may still be locked out until the system administrator kills the associated processes.
Run the Shell Fork bomb
TIP: The other restrictions that ulimit makes possible are documented in the limits.conf file.
Other Linux Process management tools
In addition to the tools discussed above, a system administrator may also need to:
a) Adjust the execution priority (use of system resources) of a process by using renice. This means that the kernel will allocate more or fewer system resources to the process based on the assigned priority (a number commonly known as "niceness", in a range of integers from -20 to 19).
The lower the value, the higher the execution priority. Regular users (other than root) can only increase the niceness of the processes they own (meaning a lower execution priority), whereas root can either increase or decrease the niceness of any process.
The basic syntax of the renice command is as follows:
# renice [-n] <new priority> <UID, GID, PGID, or empty> identifier
If the argument after the new priority value is not present (empty), it defaults to PID. In that case, the niceness of the process with PID=identifier is set to <new priority>.
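As a quick illustration of the default (PID) case, the sketch below lowers the priority of a throwaway background job and reads the niceness back with ps. The sleep process is only a stand-in for a real long-running job, nice_of is a hypothetical helper name, and the example assumes the shell itself runs at the default niceness of 0.

```shell
# nice_of PID: print the current niceness of a process.
nice_of() { ps -o ni= -p "$1" | tr -d ' '; }

sleep 60 &                           # harmless stand-in for a long-running job
pid=$!
echo "before: $(nice_of "$pid")"
renice -n 5 -p "$pid" >/dev/null     # raise niceness, i.e. lower the priority
echo "after:  $(nice_of "$pid")"
kill "$pid"                          # clean up the stand-in process
```

Because the niceness is only being increased here, the command works for a regular user. Lowering it again would require root.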
b) Interrupt the normal execution of a process when needed. This is commonly known as "killing" the process [9]. Under the hood, it means sending the process a signal that asks it to finish its execution properly and release any resources it holds in an orderly manner.
Use the kill command to kill a process as follows []:
# kill PID
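To see what an orderly shutdown looks like from the process side, consider the sketch below. It starts a background worker that installs a handler for SIGTERM (the signal kill sends when none is specified), then terminates it. The worker loop and the "cleaning up" message are illustrative stand-ins, not part of any real service.

```shell
# Background worker that cleans up and exits when it receives SIGTERM.
(
    trap 'echo "cleaning up"; exit 0' TERM
    while :; do sleep 1; done
) &
pid=$!

sleep 1                   # give the worker a moment to install its handler
kill "$pid"               # no signal given, so SIGTERM is sent
wait "$pid"               # collect the worker's exit status
echo "worker exit status: $?"
```

A process without such a handler is simply terminated by SIGTERM. The handler is what gives it the chance to release resources first.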
Alternatively, you can use pkill [11] to terminate all processes of a given user (-u), of a given group (-G), or even those that share a common parent process ID (-P). These options may be followed by either a number or a name as the identifier.
# pkill [options] identifier
For example:
# pkill -G 1000
will kill all processes owned by the group with GID=1000, and
# pkill -P 4993
will kill all processes whose PPID is 4993.
Before running pkill, it is a good idea to test the results with pgrep first, perhaps with the -l option as well to list the process names. pgrep takes the same options as pkill but only returns the PIDs of the matching processes (without taking any further action), that is, the processes that pkill would kill.
# pgrep -l -u gacanepa
The following image illustrates this:
Find a user-run process in Linux
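The preview-then-kill workflow can be tried safely on throwaway processes. In this sketch the two sleep jobs are stand-ins for real workloads, and matching is restricted to children of the current shell (-P "$$") whose name is exactly sleep (-x), so nothing else on the system is affected.

```shell
sleep 300 &                       # two harmless stand-in processes
sleep 300 &
sleep 1                           # give them a moment to start

pgrep -l -P "$$" -x sleep         # preview: PIDs and names that would match
pkill -P "$$" -x sleep            # same criteria, but this time send SIGTERM
wait 2>/dev/null                  # reap the terminated background jobs
```

Running pgrep again with the same options afterwards should print nothing, confirming that only the previewed processes were killed.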
Summary
In this article we explored some of the methods used to monitor resources in order to verify the integrity and availability of important hardware and software components in a Linux system.
We also learned how to take appropriate action in special cases (by adjusting the execution priority of a given process or ending a process).