Linux Process Management
Lab Environment:
User name: Shiyanlou
Password: ajw3tui5
Introduction to the Linux process management and control experiment
In this experiment we learn some of the tools Linux provides for viewing and controlling processes, so that when a process behaves abnormally we can inspect it and resolve the problem in time.
Knowledge points covered in the experiment
- Viewing the running status of processes
- Terminating processes
- Execution order of processes
I. Viewing processes
Whether in testing, in a real production environment, or in everyday personal use, processes sometimes misbehave, so Linux provides tools for inspecting their status. With top we can watch process status and system information such as CPU and memory usage dynamically; with ps we can take a static snapshot of the current processes; and with pstree we can view the currently active processes as a tree.
1.1 Using the top tool
top is a commonly used tool that shows, in real time, how key information about the processes in our system changes.
top
top runs in the foreground, so once started it takes over the terminal with its own interface, where we can use a few commands to filter and act on what is shown. First, let's look at the information it displays.
The first row of the top display:
Content | Explanation
top | Name of the current program
11:05:18 | Current system time
up 8 days, 17:12 | How long the machine has been running
1 user | Only one user is currently logged in
load average: 0.29, 0.20, 0.25 | Average system load over the last 1, 5, and 15 minutes
Wikipedia describes load average as follows: system load is a measure of the amount of work a computer system is doing, that is, a measure of the current CPU workload. Specifically, it is the average length of the run queue, a computed value related to the average number of processes waiting for the CPU.
How should we read this load average data?
Suppose our system has a single CPU with a single core. Think of it as a one-lane bridge, with CPU tasks as cars.
- Load = 0 means there are no cars on the bridge: the CPU has no tasks;
- Load < 1 means there are few cars on the bridge and traffic flows smoothly: the CPU has few tasks and plenty of spare capacity;
- Load = 1 means the bridge is full of cars with no gaps: the CPU is fully busy and all its capacity is in use. It is still within its limits, just a little slow;
- Load > 1 means the bridge is full and cars are queuing at the entrance: the CPU is working flat out and many processes are still waiting for it. A value of 2 or 3 means the demand is 2 or 3 times what the CPU can handle, and a value above 5 means the system is seriously overloaded. [Note 1]
This is the single-CPU, single-core case. On a real machine we need to divide the load value by the number of cores. We can check the number of CPUs and cores with the following commands.
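# Number of physical CPUs
cat /proc/cpuinfo | grep "physical id" | sort | uniq | wc -l
# Number of cores per CPU (counts the logical processors listed under physical id 0)
cat /proc/cpuinfo | grep "physical id" | grep "0" | wc -l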
From the above, the critical value of load is 1, but in practice experienced operations engineers and system administrators set the threshold at 0.7. The values here are the load divided by the number of cores, so do not confuse them with the raw figures.
- If load < 0.7, there is no need to worry;
- If 0.7 < load < 1, pay some attention: the system can cope, but the value is not far from the critical point;
- If load = 1, be vigilant, because there is no spare capacity left and the CPU is already doing its utmost;
- If load > 5, the system is in serious trouble and you will be working overtime to fix it.
We usually look at the 15-minute value for the general trend, then compare it with the 5-minute value to see whether the load is falling.
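The per-core reading of load can be sketched as a small shell helper. This is only an illustration: the function name `classify_load` and the labels it prints are made up here, the cut-offs are the 0.7, 1, and 5 values from the text, and `nproc` reports logical processors, which may exceed the number of physical cores.

```shell
# Hypothetical helper: divide a raw load average by the core count and
# label the result using the thresholds discussed above (0.7, 1, 5).
classify_load() {
    local load="$1" cores="$2"
    awk -v load="$load" -v cores="$cores" 'BEGIN {
        per_core = load / cores
        if (per_core < 0.7)      level = "ok"
        else if (per_core < 1)   level = "watch"
        else if (per_core <= 5)  level = "saturated"
        else                     level = "overloaded"
        printf "%.2f %s\n", per_core, level
    }'
}

# Live usage, assuming /proc/loadavg and nproc are available.
# Fields of /proc/loadavg: 1-, 5-, and 15-minute averages come first.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen _ < /proc/loadavg
    classify_load "$fifteen" "$(nproc)"
fi
```

For instance, a raw load of 3.2 on a 4-core machine is 0.80 per core, which falls in the "pay some attention" band rather than the overloaded one.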
Reading the BusyBox source shows that top reads the load from /proc/loadavg. The kernel checks the number of active processes every 5 seconds and computes the value from it. The kernel source for the load calculation is as follows:
#define FSHIFT      11              /* nr of bits of precision */
#define FIXED_1     (1<<FSHIFT)     /* 1.0 as fixed-point */
#define LOAD_FREQ   (5*HZ)          /* 5 sec intervals: recompute the load every 5 seconds */

#define CALC_LOAD(load,exp,n) \
    load *= exp; \
    load += n*(FIXED_1-exp); \
    load >>= FSHIFT;

unsigned long avenrun[3];
EXPORT_SYMBOL(avenrun);

/*
 * calc_load - given tick count, update the avenrun load estimates.
 * This is called while holding a write_lock on xtime_lock.
 */
static inline void calc_load(unsigned long ticks)
{
    unsigned long active_tasks; /* fixed-point */
    static int count = LOAD_FREQ;

    count -= ticks;
    if (count < 0) {
        count += LOAD_FREQ;
        active_tasks = count_active_tasks();
        CALC_LOAD(avenrun[0], EXP_1, active_tasks);
        CALC_LOAD(avenrun[1], EXP_5, active_tasks);
        CALC_LOAD(avenrun[2], EXP_15, active_tasks);
    }
}
Interested readers can study how the calculation works. The last part of the code is equivalent to the formula for the load.
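The update step in that kernel code is an exponentially damped average: each new value is the old value scaled by a decay factor plus the current task count scaled by the remainder. The sketch below redoes one step in floating point with awk instead of the kernel's fixed-point arithmetic. The function name `update_load` is hypothetical, and the decay factor exp(-5 s / period) is an assumption about what the kernel's EXP_1, EXP_5, and EXP_15 constants encode.

```shell
# One update step of the damped average:
#   new = old * e + n * (1 - e),   e = exp(-interval / period)
# $1 = previous load, $2 = active tasks now, $3 = averaging period in seconds
update_load() {
    awk -v old="$1" -v n="$2" -v period="$3" 'BEGIN {
        interval = 5                    # seconds between samples (LOAD_FREQ)
        e = exp(-interval / period)     # decay factor for this period
        printf "%.4f\n", old * e + n * (1 - e)
    }'
}

# Simulate one minute (12 samples) with a constant 2 active tasks,
# starting from an idle system, for the 1-minute average.
load=0
i=0
while [ "$i" -lt 12 ]; do
    load=$(update_load "$load" 2 60)
    i=$((i + 1))
done
echo "1-minute load after 60 s of 2 active tasks: $load"
```

The simulated value stays well below 2 after one minute, which is why a short burst of work shows up only gradually in the 1-minute figure and even more slowly in the 5- and 15-minute ones.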
Back to the main topic. The second row of top is a summary of process counts.
Content | Explanation
Tasks: 26 total | Total number of processes
1 running | 1 process is running
25 sleeping | 25 processes are sleeping
0 stopped | No processes are stopped
0 zombie | No zombie processes
The third row of top is a summary of CPU usage.
Content | Explanation
Cpu(s): 1.0%us | Percentage of CPU time spent in user space
1.0%sy | Percentage of CPU time spent in kernel space
0.0%ni | Percentage of CPU time spent on user-space processes whose priority (nice value) has been changed
97.9%id | Percentage of CPU time idle
0.0%wa | Percentage of CPU time spent waiting for I/O
0.1%hi | Percentage of CPU time spent servicing hardware interrupts (hardware IRQs)
0.0%si | Percentage of CPU time spent servicing software interrupts (software IRQs)
0.0%st | Steal time: percentage of time the virtual CPU waits for a real CPU while the hypervisor services another virtual processor
CPU utilization is a statistic of how much of the CPU's time was used over a period; it shows how busy the CPU was during that time. Load average describes the load on the CPU: it does not report CPU usage, but counts, over a period, the processes running on the CPU plus those waiting for it. The two indicators are not the same.
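To make the difference concrete, CPU utilization can be computed from two samples of the first line of /proc/stat, which holds cumulative time counters. The helper below is a sketch under assumptions: the name `cpu_busy_percent` is hypothetical, only the fields user through steal are summed, and iowait is counted as not busy, which is one possible convention.

```shell
# cpu_busy_percent "<cpu line at t0>" "<cpu line at t1>"
# Each argument is the first line of /proc/stat:
#   cpu user nice system idle iowait irq softirq steal ...
cpu_busy_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i   # user .. steal
            idle = $5 + $6                                      # idle + iowait
            if (NR == 1) { t0 = total; i0 = idle }
            else         { t1 = total; i1 = idle }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (dt - (i1 - i0)) / dt
        }'
}

# Live usage, assuming /proc/stat is readable: sample twice, one second apart.
if [ -r /proc/stat ]; then
    before=$(head -n 1 /proc/stat)
    sleep 1
    after=$(head -n 1 /proc/stat)
    echo "CPU busy over the last second: $(cpu_busy_percent "$before" "$after")%"
fi
```

A machine can show low utilization here and still have a high load average, for example when many processes are blocked waiting rather than using the CPU.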
The fourth row of top is a summary of memory usage.
Content | Explanation
8176740 total | Total physical memory
8032104 used | Physical memory in use
144636 free | Free physical memory
313088 buffers | Memory used as kernel buffers
Note
The maximum amount of physical memory actually available is not the free value alone, but the sum of free, buffers, and cached (the cached figure appears on the swap line).
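The estimate free + buffers + cached can be computed directly from /proc/meminfo. The sketch below assumes the usual field names `MemFree`, `Buffers`, and `Cached`; the function name `estimate_available_kb` is hypothetical. Newer kernels also report a `MemAvailable` field, which is generally a better estimate where it exists.

```shell
# Reads /proc/meminfo-style text on stdin and prints
# MemFree + Buffers + Cached in kB.
estimate_available_kb() {
    awk '
        $1 == "MemFree:" { free = $2 }
        $1 == "Buffers:" { buffers = $2 }
        $1 == "Cached:"  { cached = $2 }
        END { print free + buffers + cached }'
}

# Live usage, assuming /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    echo "Estimated available memory: $(estimate_available_kb < /proc/meminfo) kB"
fi
```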
The fifth row of top is a summary of swap usage.
Content | Explanation
total | Total swap space
used | Swap space in use
free | Free swap space
cached | Memory used as page cache; although it is shown on the swap line, it describes physical memory rather than swap space
Below the summary rows comes the per-process list.
Column | Explanation
PID | Process ID
USER | User who owns the process
PR | Priority value used when scheduling the process
NI | Nice value of the process
VIRT | Total virtual memory used by the process
RES | Physical memory used by the process, also called resident memory
SHR | Shared memory used by the process
S | Process state: S = sleeping, R = running, Z = zombie
%CPU | CPU usage of the process
%MEM | Memory usage of the process
TIME+ | Total CPU time the process has used
COMMAND | Name of the running command
Note
The nice value, called the static priority, is the priority value used in user space and ranges from -20 to 19. The smaller the value, the higher the priority; the larger the value, the lower the priority. -20 is the highest priority, 0 is the default, and 19 is the lowest.
The PR value is called the dynamic priority and is the actual priority of the process inside the kernel. The range of process priorities is defined by a macro named MAX_PRIO, whose value is 140, so Linux implements 140 priority levels numbered 0 to 139; the smaller the value, the higher the priority. Values 0 to 99 are real-time priorities, and 100 to 139 are for ordinary user processes.
For the 100 to 139 part of the range, the PR value that top displays is PR = 20 + nice, where nice runs from -20 to +19 (the kernel value is 120 + nice). Both values express priority and are closely related, but their ranges differ.
VIRT is the total virtual memory used by the task, including all code, data, and shared libraries, plus pages that have been swapped out.
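The relation between nice and PR for ordinary (non-real-time) processes can be written as two one-line helpers. This is a sketch that assumes the mapping PR = 20 + nice as displayed by top, and 120 + nice for the kernel's 0 to 139 scale; the function names are hypothetical.

```shell
# PR as shown by top for an ordinary process with the given nice value
nice_to_pr() {
    echo $(( 20 + $1 ))
}

# Position on the kernel's 0-139 scale (100-139 is the non-real-time band)
nice_to_kernel_prio() {
    echo $(( 120 + $1 ))
}

# Walk the two ends and the default of the nice range.
for n in -20 0 19; do
    echo "nice=$n  PR=$(nice_to_pr "$n")  kernel=$(nice_to_kernel_prio "$n")"
done
```

This covers only the ordinary band; real-time processes use the 0 to 99 band and are not described by a nice value.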
As mentioned above, top is a foreground program, so it is interactive and accepts commands while it runs.
Common interactive command | Explanation
q | Quit the program
l | Toggle the display of the load average and uptime line
P | Sort by CPU usage percentage
M | Sort by resident memory size
i | Ignore idle and zombie processes; this is a toggle
k | Terminate a process. top prompts for the PID and the signal to send. Signal 15 is normally used; if the process does not exit normally, use signal 9. This command is disabled in secure mode.
Making good use of top helps us spot bottlenecks and other problems in the system.
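Besides the interactive mode, top's summary can be captured for scripts. The sketch below pulls the three load averages out of the first line of top's output; the function name `top_load_averages` is hypothetical, and the parsing assumes the English "load average: a, b, c" wording with dots as decimal separators. On systems with procps top, batch mode (`top -b -n 1`) is one way to produce that line non-interactively.

```shell
# Reads top's output on stdin and prints the 1-, 5-, and 15-minute
# load averages from the first line that contains them.
top_load_averages() {
    sed -n 's/.*load average: *\([0-9.]*\), *\([0-9.]*\), *\([0-9.]*\).*/\1 \2 \3/p' | head -n 1
}

# Demo on the header line from the example above.
echo "top - 11:05:18 up 8 days, 17:12,  1 user,  load average: 0.29, 0.20, 0.25" | top_load_averages

# Live usage, assuming procps top is installed:
#   top -b -n 1 | top_load_averages
```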
1.2 Using the ps tool
ps is also one of the most commonly used tools for viewing processes. Let's run a couple of commands to see what information it gives us.
ps aux
ps axjf
Let's take a general look at what is displayed and what the information means.
Content | Explanation
F | Process flags: 4 means the process used superuser (root) privileges, and 1 means the process was forked but has not yet called exec
USER | Owner of the process
PID | Process ID
PPID | PID of the parent process
SID | Session ID
TPGID | ID of the foreground process group
%CPU | Percentage of CPU used by the process
%MEM | Percentage of memory used by the process
NI | Nice value of the process
VSZ | Virtual memory size used by the process
RSS | Size of the pages resident in memory
TTY | Terminal ID
S or STAT | Process state
WCHAN | What the process is waiting on (the kernel function in which it is sleeping)
START | Time the process was started
TIME | CPU time consumed by the process
COMMAND | Command name and arguments
A TPGID of -1 means the process has no controlling terminal, that is, it is a daemon.
STAT shows the state of the process. There are many states, as shown in the following table.
Status | Explanation
R | Running
S | Interruptible sleep, waiting for an event
D | Uninterruptible sleep, usually waiting for I/O
T | Stopped, suspended or being traced
X | Dead, about to be removed
Z | Zombie process
W | Paging (memory being swapped)
N | Low-priority process
< | High-priority process
s | Session leader
L | Has pages locked in memory
l | Multithreaded
+ | In the foreground process group
D is uninterruptible sleep. A process in this state does not respond to any external signal, so a process in the D state cannot be killed with the kill command, whether kill, kill -9, or kill -15. A process is usually in this state because of an I/O problem.
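Because processes stuck in the D state cannot be killed, it helps to be able to list them. The filter below is a sketch: it expects rows of the form "PID STAT COMMAND" on stdin and keeps those whose state starts with D. The function name `filter_d_state` is hypothetical, and the live command in the comment assumes a ps that accepts `-eo pid,stat,comm` (procps does).

```shell
# Keep only the rows whose second column (STAT) starts with D,
# i.e. processes in uninterruptible sleep.
filter_d_state() {
    awk '$2 ~ /^D/ { print }'
}

# Demo on made-up sample rows (PID STAT COMMAND):
printf '%s\n' '1 Ss init' '42 D+ dd' '77 R+ bash' | filter_d_state

# Live usage, assuming procps ps:
#   ps -eo pid,stat,comm | filter_d_state
```

An empty result on a live system is the normal case; entries that stay in the list across several runs usually point to a storage or network file system problem.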
ps has many options. Some commonly used ones are explained below.
The -l option displays detailed information about the processes belonging to the bash of this login.
ps -l
By contrast, we more often use the following command, which lists information for all processes.
ps aux
To look for a particular process, we can combine ps with grep and regular expressions.
ps aux | grep zsh
We can also display the processes as a tree.
ps axjf
Of course, if this output does not put the information you want together, we can customize the columns to display with this command.
ps -afxo user,ppid,pid,pgid,command
ps is a simple and practical tool. To use it more flexibly and learn about more of its options, consult man ps.
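The output of ps is plain text, so beyond grep it can also be summarized with awk. As an illustration, the sketch below totals resident memory (the RSS column) per user from `ps aux` output. The function name `rss_by_user` is hypothetical, and it assumes the standard `ps aux` layout in which USER is column 1 and RSS is column 6.

```shell
# Reads `ps aux` output on stdin (header line included) and prints
# "USER total_rss_kb" per user, largest total first.
rss_by_user() {
    awk 'NR > 1 { rss[$1] += $6 }
         END    { for (u in rss) print u, rss[u] }' | sort -k2,2nr
}

# Demo on made-up sample rows in ps aux format:
printf '%s\n' \
    'USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND' \
    'root 1 0.0 0.1 1000 500 ? Ss 10:00 0:01 init' \
    'alice 20 0.0 0.2 3000 1500 pts/0 S 10:01 0:00 zsh' | rss_by_user

# Live usage:
#   ps aux | rss_by_user
```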
1.3 Using the pstree tool
pstree shows the process tree directly, and above all it lets us see the relationships between all processes.
pstree
pstree -up
# Options:
# -A: connect the branches of the tree with ASCII characters;
# -p: also show the PID of each process;
# -u: also show the name of the account that owns each process.
II. Managing processes
2.1 Mastering the kill command
In the previous experiment we discussed how processes are derived and how they relate to each other. To review: when a process ends, normally or abnormally, it reports its termination to its parent process, and a process may also end because it receives a signal such as SIGHUP. Signals like SIGHUP are not sent only by the system; we can send them ourselves with kill to make a process end, restart, and so on.
In the last lesson we used the kill command to manage some of our jobs. In this lesson we use kill on processes that are not jobs, addressing them directly by PID.
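# First we open gedit and gvim from the graphical interface; ps shows them
ps aux
# Use signal 9 to force the gedit process to terminate
kill -9 1608
# Searching for the process again no longer finds it
ps aux | grep gedit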
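Signal 9 gives the process no chance to clean up, so a common pattern is to send signal 15 first and fall back to signal 9 only if the process is still there after a grace period. The helper below is a minimal sketch of that pattern; the function name `terminate` and the default 5-second timeout are choices made for this example, not part of any standard tool.

```shell
# terminate PID [TIMEOUT_SECONDS]
# Sends SIGTERM (15), waits up to TIMEOUT seconds for the process to go
# away, then falls back to SIGKILL (9).
terminate() {
    local pid="$1" timeout="${2:-5}"
    kill -15 "$pid" 2>/dev/null || return 1      # no such process, or not permitted
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # signal 0 only checks existence
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -9 "$pid" 2>/dev/null
}

# Demo: start a throwaway background process and terminate it.
sleep 300 &
target=$!
terminate "$target" 3
wait "$target" 2>/dev/null || true   # collect the exit status of the killed child
```

The same function works for any PID found with ps, for example the gedit process above, as long as the caller has permission to signal it.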
2.2 Execution order of the process
When we use the ps command, we can see that most processes are sleeping. If several of them are woken up, which one gets the CPU first, and in what order do the rest run? How should the process scheduling queue be arranged?
The scheduling order is decided by the priority value of each process, and that priority is controlled and reflected by the PR and nice values described above.
The nice value can be set with the nice command, within the range -20 to 19. root has full authority: it can adjust its own processes and those of other users, and may use any value in the range. An ordinary user can only adjust their own processes and can only use values from 0 to 19; the system sets this limit to prevent ordinary users from grabbing system resources.
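# This experiment cannot be done in the lab environment because of insufficient permissions; try it on your own machine
# Start a program in the background, or open it from the graphical interface
nice -n -5 vim &
# Use ps to check its priority
ps -afxo user,ppid,pid,stat,pri,ni,time,command | grep vim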
We can also use renice to change the priority of a process that already exists. Again, for permission reasons this cannot be tried in the lab environment.
renice -5 pid
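Lowering a nice value needs root, but raising it does not, so the effect of nice can still be observed as an ordinary user. The sketch below relies on the fact that `nice` run without a command prints the niceness it is running at; wrapping it in `nice -n 5` therefore shows the adjusted value without needing ps. The variable names are arbitrary.

```shell
# `nice` with no command prints the current niceness.
base=$(nice)

# Run `nice` again under an adjustment of +5: the inner call reports
# the niceness it was started with.
raised=$(nice -n 5 nice)

echo "base niceness: $base"
echo "niceness under 'nice -n 5': $raised"
```

The adjustment is relative to the current niceness and is capped at 19, so the second value is normally the first plus 5. A negative adjustment such as `nice -n -5` is the case that requires root, as described above.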