How to Monitor Linux Process Resources and Set Process Limits

Last Update:2016-10-07 Source: Internet

Author: User

Tags cpu usage pkill


Guide

Every Linux system administrator should know how to verify the integrity and availability of hardware, resources, and primary processes. In addition, setting resource limits on a per-user basis is one of the required skills.

In this article, we'll cover some of the ways you can make sure your system's hardware and software are working properly, so that you avoid problems that could take a production environment offline and cost money.

Reporting Linux Processor Statistics

You can use mpstat to view the activity of each processor individually or of the system as a whole, either as a one-time snapshot or as a periodically updated report.

To use this tool, you first need to install the sysstat package:

# yum update && yum install sysstat              [CentOS-based systems]
# aptitude update && aptitude install sysstat    [Ubuntu-based systems]
# zypper update && zypper install sysstat        [openSUSE-based systems]

To learn more about sysstat and the tools it includes, see the article on mpstat, pidstat, iostat, and sar in Linux.

Once sysstat is installed, you can use mpstat to generate reports of processor statistics.

The following command displays the CPU utilization (-u) of all CPUs (-P ALL) at 2-second intervals, 3 times in total:

# mpstat -P ALL -u 2 3

Example output:

Linux 3.19.0-32-generic (tecmint.com)   Wednesday March   _x86_64_   (4 CPU)

11:41:07  IST  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11:41:09  IST  all    5.85    0.00    1.12    0.12    0.00    0.00    0.00    0.00    0.00   92.91
11:41:09  IST    0    4.48    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   94.53
11:41:09  IST    1    2.50    0.00    0.50    0.00    0.00    0.00    0.00    0.00    0.00   97.00
11:41:09  IST    2    6.44    0.00    0.99    0.00    0.00    0.00    0.00    0.00    0.00   92.57
11:41:09  IST    3   10.45    0.00    1.99    0.00    0.00    0.00    0.00    0.00    0.00   87.56

11:41:09  IST  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11:41:11  IST  all   11.60    0.12    1.12    0.50    0.00    0.00    0.00    0.00    0.00   86.66
11:41:11  IST    0   10.50    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   88.50
11:41:11  IST    1   14.36    0.00    1.49    2.48    0.00    0.00    0.00    0.00    0.00   81.68
11:41:11  IST    2    2.00    0.50    1.00    0.00    0.00    0.00    0.00    0.00    0.00   96.50
11:41:11  IST    3   19.40    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   79.60

11:41:11  IST  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11:41:13  IST  all    5.69    0.00    1.24    0.00    0.00    0.00    0.00    0.00    0.00   93.07
11:41:13  IST    0    2.97    0.00    1.49    0.00    0.00    0.00    0.00    0.00    0.00   95.54
11:41:13  IST    1   10.78    0.00    1.47    0.00    0.00    0.00    0.00    0.00    0.00   87.75
11:41:13  IST    2    2.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   97.00
11:41:13  IST    3    6.93    0.00    0.50    0.00    0.00    0.00    0.00    0.00    0.00   92.57

Average:       CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:       all    7.71    0.04    1.16    0.21    0.00    0.00    0.00    0.00    0.00   90.89
Average:         0    5.97    0.00    1.16    0.00    0.00    0.00    0.00    0.00    0.00   92.87
Average:         1    9.24    0.00    1.16    0.83    0.00    0.00    0.00    0.00    0.00   88.78
Average:         2    3.49    0.17    1.00    0.00    0.00    0.00    0.00    0.00    0.00   95.35
Average:         3   12.25    0.00    1.16    0.00    0.00    0.00    0.00    0.00    0.00   86.59

To view the same statistics for a specific CPU (CPU 0 in the following example), use:

# mpstat -P 0 -u 2 3

Example output:

Linux 3.19.0-32-generic (tecmint.com)   Wednesday March   _x86_64_   (4 CPU)

11:42:08  IST  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
11:42:10  IST    0    3.00    0.00    0.50    0.00    0.00    0.00    0.00    0.00    0.00   96.50
11:42:12  IST    0    4.08    0.00    0.00    2.55    0.00    0.00    0.00    0.00    0.00   93.37
11:42:14  IST    0    9.74    0.00    0.51    0.00    0.00    0.00    0.00    0.00    0.00   89.74
Average:         0    5.58    0.00    0.34    0.85    0.00    0.00    0.00    0.00    0.00   93.23

The output of the above commands includes these columns:
CPU: The processor number as an integer, or the word all for the average across all processors.
%usr: The percentage of CPU utilization spent running applications at the user level.
%nice: Same as %usr, but for user-level processes running with nice priority.
%sys: The percentage of CPU utilization spent running at the kernel level. This does not include the time spent servicing hardware or software interrupts.
%iowait: The percentage of time the given CPU (or all CPUs) was idle while an I/O request was outstanding. A high value points to I/O-intensive activity.
%irq: The percentage of time spent servicing hardware interrupts.
%soft: Same as %irq, but for software interrupts.
%steal: The percentage of time a virtual machine spends in involuntary wait, that is, time it loses to the hypervisor while competing for the physical CPU. This value should be kept as small as possible. A high value means the virtual machine is stalling or soon will be.
%guest: The percentage of time spent running a virtual processor.
%gnice: Same as %guest, but for a guest running with nice priority.
%idle: The percentage of time the CPU was not running any task. A low value means the system is under heavy load. In that case, you will need to look more closely at the process list, as discussed below, to determine what is causing it.
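
Reading %idle by eye works for a handful of CPUs, but the check can also be scripted. The following is a minimal sketch, not part of the original article: it assumes the summary rows are labelled "Average:" (as in an English locale) and that %idle is the last column. The function name low_idle_cpus and the threshold of 90 are illustrative choices, and the sample rows are copied from the mpstat output above.

```shell
# Print the CPUs whose average %idle is below a threshold.
# Reads mpstat-style text on stdin; %idle is assumed to be the last column.
low_idle_cpus() {
    threshold=$1
    awk -v t="$threshold" '
        $1 == "Average:" && $2 != "CPU" && $NF + 0 < t + 0 {
            printf "CPU %s idle %s%%\n", $2, $NF
        }'
}

# Sample "Average:" rows from the example output (assumed English labels).
sample='Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all    7.71    0.04    1.16    0.21    0.00    0.00    0.00    0.00    0.00   90.89
Average:       1    9.24    0.00    1.16    0.83    0.00    0.00    0.00    0.00    0.00   88.78
Average:       3   12.25    0.00    1.16    0.00    0.00    0.00    0.00    0.00    0.00   86.59'

printf '%s\n' "$sample" | low_idle_cpus 90
```

On a live system the same function could read from a pipe, for example mpstat -P ALL -u 2 3 | low_idle_cpus 90, provided the column layout matches these assumptions.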
To see these values change, put the processor under heavy load with the commands below, running mpstat in a separate terminal as indicated:

# dd if=/dev/zero of=test.iso bs=1G count=1
# mpstat -u -P 0 2 3
# ping -f localhost # Interrupt with Ctrl + C after mpstat below completes
# mpstat -u -P 0 2 3

Finally, compare the output of mpstat with the output under "normal" conditions.

As the comparison shows, in the two load tests above the value of %idle is low enough to conclude that CPU 0 was under heavy load.

In the next section, we'll discuss how to identify resource-hungry processes, how to get more information about them, and how to take appropriate action.

Linux Process Report

We can use the well-known ps command to list processes sorted by CPU usage, combining the -eo option (select all processes and use a user-defined output format) with the --sort option (specify a custom sort order), for example:

# ps -eo pid,ppid,cmd,%cpu,%mem --sort=-%cpu

The above command shows only the PID, the PPID, the command associated with each process, and the percentages of CPU and RAM usage, sorted by CPU usage in descending order. If you run it while the .iso file is being created, the process creating the file should appear in the first few lines of the output.

Once we have identified a process of interest (such as the one with PID=2822), we can go to /proc/PID (/proc/2822 in this case) and list the contents of the directory.

This directory exists for as long as the process is running and holds several files and subdirectories with detailed information about it.

For example:

/proc/2822/io: Contains I/O statistics for the process (the number of characters and bytes read and written during I/O operations).

/proc/2822/attr/current: Shows the current SELinux security attributes of the process.

/proc/2822/cgroup: Describes the control groups (cgroups) to which the process belongs, provided the CONFIG_CGROUPS kernel configuration option is enabled. You can verify that it is enabled with the following command:

# cat /boot/config-$(uname -r) | grep -i cgroups

If this option is enabled, you should see:

CONFIG_CGROUPS=y

With cgroups you can manage the amount of resources each process is allowed to use, as explained in Chapters 1 through 4 of the Red Hat Enterprise Linux 7 Resource Management Guide, Chapter 9 of the openSUSE System Analysis and Tuning Guide, and the Control Groups section of the Ubuntu 14.04 Server documentation.

/proc/2822/fd: This directory contains one symbolic link for each file descriptor the process has open. For the process creating the .iso image, which was started on tty1 (the first terminal), the links show that stdin (file descriptor 0), stdout (file descriptor 1), and stderr (file descriptor 2) are mapped to /dev/zero, /root/test.iso, and /dev/tty1 respectively.
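
The file descriptor links can be read for any process you are allowed to inspect. Below is a minimal sketch that prints each descriptor and its target; the function name list_fds is an illustrative choice rather than a standard tool, and the example inspects the current shell ($$) instead of PID 2822 so that it runs anywhere a Linux /proc filesystem is mounted.

```shell
# List the open file descriptors of a process and where each one points.
list_fds() {
    for link in "/proc/$1/fd"/*; do
        # Skip anything that is not a symbolic link (for example, an unmatched glob).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
    done
}

# $$ is the PID of the current shell, so this inspects the script itself.
list_fds "$$"
```

Inspecting another user's process this way normally requires root privileges.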

Setting Resource Limits for Each User in Linux

If you are not careful and allow any user to run an unlimited number of processes, you may eventually face an unexpected shutdown or a lockup as the system becomes unusable. To prevent this, you should set an upper limit on the number of processes users can start.

You can set the limit by adding a line of the following form at the end of the /etc/security/limits.conf file:

*    hard    nproc    <limit>

The first field identifies a user, a group, or everyone (*). The second and third fields enforce a hard limit on the number of processes (nproc), and <limit> stands for the maximum number you want to allow. Log out and log back in for the setting to take effect.
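
After editing the file it is useful to confirm which process limits are actually configured. The sketch below is one possible way to do that, under the assumption that entries follow the four-field layout (domain, type, item, value). The function name nproc_rules and every entry in the demo file are hypothetical and chosen only for illustration.

```shell
# Print every nproc rule in a limits.conf-style file as: domain type value.
# Comment lines and blank lines are skipped.
nproc_rules() {
    awk '$1 !~ /^#/ && NF >= 4 && $3 == "nproc" { print $1, $2, $4 }' "$1"
}

# Demonstrate on a throwaway file; the entries and values are hypothetical.
demo=$(mktemp)
cat > "$demo" <<'EOF'
# <domain>  <type>  <item>  <value>
*           hard    nproc   100
@students   soft    nproc   20
*           hard    nofile  4096
EOF

nproc_rules "$demo"
rm -f "$demo"
```

Pointing the function at /etc/security/limits.conf lists the rules on a real system; files under /etc/security/limits.d, where present, may define further rules.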

Now let's see what happens when a non-root user (legitimate or not) tries to launch a shell fork bomb. Without a limit in place, the fork bomb starts two instances of a function, each of which starts two more, and so on in an endless loop, until the system grinds to a halt.

With the above restriction in place, however, the fork bomb does not succeed, although the user may still be locked out until the system administrator kills the associated processes.

Tip: The limits.conf file documents the other restrictions that ulimit can change.

Other Linux Process Management Tools

In addition to the tools discussed above, a system administrator may also need to:

a) Adjust the execution priority (the share of system resources) of a process with renice. The kernel allocates more or less system resources to the process according to its assigned priority (known as its "niceness", an integer ranging from -20 to 19).

The lower the value, the higher the execution priority. Regular users (other than root) can only increase the niceness of processes they own (that is, lower their priority), whereas root can raise or lower the niceness of any process.

The basic syntax of the renice command is as follows:

# renice [-n] <new priority> [-p | -g | -u] <identifier>

If no selector follows the new priority, the identifier is treated as a PID by default, and the niceness of the process with PID=identifier is set to <new priority>. With -g the identifier is a process group ID, and with -u it is a user name or user ID.
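
The effect of renice can be observed directly in /proc. The following is a small sketch under a few assumptions: it uses a throwaway sleep job as the target, the helper name nice_of is illustrative, and it reads the niceness from field 19 of /proc/PID/stat. Implementations differ on whether -n sets an absolute value or an increment, so the example starts from a fresh process, which normally has niceness 0 and ends at 10 either way.

```shell
# Start a throwaway background job to experiment on.
sleep 30 &
pid=$!

# Field 19 of /proc/PID/stat holds the niceness of the process.
nice_of() {
    awk '{ print $19 }' "/proc/$1/stat"
}

before=$(nice_of "$pid")
renice -n 10 -p "$pid" >/dev/null   # raising niceness needs no root privileges
after=$(nice_of "$pid")
echo "PID $pid niceness: $before -> $after"

# Clean up the test job.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

Trying to lower the value again as a regular user is normally refused, which matches the rule described above.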

b) Interrupt the normal execution of a process when needed. This is commonly known as "killing" the process. Under the hood, it means sending the process a signal that asks it to finish its execution properly and release the resources it is using in an orderly manner.

To kill a process, use the kill command as follows:

# kill PID
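
What "orderly" means depends on the program: by default kill sends SIGTERM, which a process can catch in order to clean up before exiting. The sketch below illustrates this with a hypothetical worker subshell that removes its own temporary file when it is terminated; the variable names and the one-second delays are arbitrary choices.

```shell
# A worker that cleans up after itself when it receives SIGTERM.
marker=$(mktemp)
(
    trap 'rm -f "$marker"; exit 0' TERM
    while :; do sleep 1; done
) &
worker=$!

sleep 1            # give the worker time to install its trap
kill "$worker"     # kill sends SIGTERM by default
wait "$worker"     # returns once the worker has run its handler and exited
[ -e "$marker" ] || echo "worker removed its temporary file before exiting"
```

A process killed with kill -9 (SIGKILL) gets no such chance, because that signal cannot be caught.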

Alternatively, you can use pkill to terminate all processes of a given owner (-u), of a given group owner (-G), or even all processes that share a parent process ID (-P). The user and group options accept either a numeric ID or a name.

# pkill [options] identifier

For example:

Kill all processes owned by the group with GID=1000:

# pkill -G 1000

Kill all processes whose PPID is 4993:

# pkill -P 4993

Before running pkill, it is a good idea to test the selection with pgrep first, perhaps adding the -l option to list the process names as well. pgrep takes the same options as pkill but only returns the PIDs of the matching processes, without taking any further action, whereas pkill kills them.

# pgrep -l -u gacanepa
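
The preview-then-kill habit can be wrapped in a small helper. This is only a sketch: the function name kill_children is hypothetical, and it assumes a pgrep and pkill that support -l, -P, and -x (exact name match), as the procps versions do.

```shell
# Show the children of a parent PID that match a name, then terminate them.
# pgrep and pkill accept the same selection options, so the preview and the
# kill act on the same set of processes.
kill_children() {
    parent=$1
    name=$2
    if pgrep -l -P "$parent" -x "$name"; then
        pkill -P "$parent" -x "$name"
    else
        echo "no child of $parent named $name"
    fi
}
```

For example, after starting sleep 300 & in a shell, running kill_children "$$" sleep in that same shell lists the job and then terminates it.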


Summary

In this article we explored some of the methods used to monitor resources in order to verify the integrity and availability of important hardware and software components in a Linux system. We also learned how to take appropriate action in special cases (by adjusting the execution priority of a given process or ending a process). We hope that the concepts described in this article will help you.

Originally from: http://www.linuxprobe.com/linux-process-monitoring.html

