I. Introduction
This article analyzes the process model of the Linux kernel in depth, based on the source code of kernel version 4.13.0-36-generic. It covers the following:
1. How the operating system organizes processes
2. How process status is converted
3. How the process is scheduled
4. My own view of the operating system process model
(Note: the Linux kernel 4.13 source code can be browsed at https://elixir.bootlin.com/linux/v4.13/source/kernel)
II. Process
2.1 Understanding the process
1). A process is an abstraction of a running program: an instance of an executing program, including the current values of the program counter, registers, and variables.
2). Narrow definition: a process is a running instance of a program (an instance of a computer program that is being executed).
3). Broad definition: a process is one execution of a program, with some independent function, on a particular set of data. It is the basic unit of dynamic execution in the operating system. In a traditional operating system, the process is both the basic unit of resource allocation and the basic unit of execution.
2.2 Approaching the process
Rather than getting tangled up in the definition of a process, it helps to look at real processes.
On the Windows operating system, we can view the running processes by opening Task Manager.
On the Linux operating system, we can type ps aux at the terminal command line and press Enter to view all processes.
The fields in the output mean the following:
(PID: process ID; %CPU: percentage of CPU time used; %MEM: percentage of physical memory used; VSZ: virtual memory size; RSS: physical memory actually in use; STAT: process state; START: start time; COMMAND: the command that started the process)
III. How the operating system organizes processes
task_struct is the Linux kernel data structure that holds the information about a process. It is defined in include/linux/sched.h, and all processes in the system exist in the kernel as a linked list of task_struct structures.
Some of the main members of task_struct are:
pid: identifies the process
processor: identifies the CPU the process is using
state: identifies the state of the process (the main values are described below)
rt_priority: priority of real-time processes; has no effect for normal processes
policy: the scheduling policy of the process
A. Process identifier PID
The process identifier PID is a unique identifier that distinguishes this process from all others. It is defined in task_struct as a member of type pid_t named pid.
The range of PID values is set in include/linux/threads.h by the macro PID_MAX_DEFAULT, defined as (CONFIG_BASE_SMALL ? 0x1000 : 0x8000).
When CONFIG_BASE_SMALL is configured as 0, PID values range from 0 to 32767, so the system can hold at most 32,768 processes.
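As a small user-space illustration of how that limit depends on the configuration, the selection can be sketched as a function. The name demo_pid_max_default is made up for this example; only the two constants come from the kernel's PID_MAX_DEFAULT macro.

```c
#include <assert.h>

/* Hypothetical helper mirroring PID_MAX_DEFAULT:
 * 0x1000 when CONFIG_BASE_SMALL is set, 0x8000 otherwise. */
static int demo_pid_max_default(int config_base_small)
{
    assert(config_base_small == 0 || config_base_small == 1);
    return config_base_small ? 0x1000 : 0x8000;
}
```

With config_base_small equal to 0 the function returns 32768, which matches the PID range 0 to 32767 described above.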
B. Status of the process
The state is defined in task_struct as the member volatile long state.
The possible values of the state member are defined as macros in include/linux/sched.h.
The main states of a process in Linux are:
1. Runnable state (TASK_RUNNING): the process is either executing or ready to execute.
2. Interruptible blocked state (TASK_INTERRUPTIBLE): the process is blocked until some condition becomes true. When the condition is met, or a signal arrives, the process becomes runnable again.
3. Uninterruptible blocked state (TASK_UNINTERRUPTIBLE): like the interruptible state, except that the process cannot be woken by a signal.
4. Zombie state (TASK_ZOMBIE, named EXIT_ZOMBIE in the 4.13 source): the process has terminated, but its parent has not yet used a system call such as wait() to collect its termination information.
5. Stopped state (TASK_STOPPED): the process has stopped executing.
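The difference between the two blocked states can be sketched as a tiny state machine. This is a simplified user-space model, not kernel code: the enum names and values are invented for the example and are not the kernel's TASK_* constants, and only two kinds of event are modeled.

```c
/* Illustrative state names for this sketch only. */
enum demo_state {
    DEMO_RUNNING,
    DEMO_INTERRUPTIBLE,
    DEMO_UNINTERRUPTIBLE,
    DEMO_ZOMBIE,
    DEMO_STOPPED
};

/* The two events this simplified model knows about. */
enum demo_event { DEMO_EV_CONDITION_MET, DEMO_EV_SIGNAL };

/* Return the state a task moves to when an event reaches it. */
static enum demo_state demo_next_state(enum demo_state s, enum demo_event ev)
{
    switch (s) {
    case DEMO_INTERRUPTIBLE:
        /* Woken by the awaited condition or by a signal. */
        return DEMO_RUNNING;
    case DEMO_UNINTERRUPTIBLE:
        /* Only the awaited condition wakes it; a signal does not. */
        return ev == DEMO_EV_CONDITION_MET ? DEMO_RUNNING
                                           : DEMO_UNINTERRUPTIBLE;
    default:
        /* Other states are left unchanged in this model. */
        return s;
    }
}
```

A signal therefore moves an interruptible sleeper back to the runnable state, while an uninterruptible sleeper stays where it is until its condition is met.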
[Figure: the basic states of a process and the transitions between them]
C. Priority of the process
The priority is held in several members of task_struct: int prio, int static_prio, int normal_prio, and unsigned int rt_priority.
IV. How the process is scheduled
4.1 Understanding scheduling
When several processes are ready to run at the same time, the operating system must choose which one runs next. The part of the operating system that makes this choice is called the scheduler, and the algorithm it uses is called the scheduling algorithm. Depending on how they handle clock interrupts, scheduling algorithms fall into two categories: non-preemptive and preemptive. Different environments need different scheduling algorithms, because what the scheduler should optimize differs from system to system. It is therefore useful to distinguish three environments: batch, interactive, and real-time.
Typical scheduling algorithms are:
(1) First-come, first-served
(2) Shortest job first
(3) Priority scheduling
(4) Highest response ratio next
(5) Round-robin (time-slice rotation)
Here I would like to introduce CFS, the Completely Fair Scheduler.
4.2 CFS scheduling algorithm
CFS has been part of the Linux kernel since version 2.6.23. It is implemented with a red-black tree, so its operations run in O(log n) time.
A scheduling algorithm has to answer two main questions: which process to run next, and how long the chosen process should run. The former is the scheduling policy; the latter is the execution time.
4.2.1 Scheduling policy
CFS maintains a virtual clock, vruntime, for each process in the cfs_rq (the CFS run queue). The scheduler always chooses the process with the lowest vruntime to run next. vruntime records how long the process has run, scaled by its weight; its value is related to the process's weight and actual run time as follows:
vruntime = actual run time * 1024 / process weight
All processes take the weight 1024, which corresponds to a nice value of 0, as the baseline for calculating how fast their own vruntime grows. Some derived relationships are given below:
time allocated to process = scheduling period * process weight / sum of all process weights
vruntime = actual run time * 1024 / process weight
vruntime = (scheduling period * process weight / sum of all process weights) * 1024 / process weight
vruntime = (scheduling period / sum of all process weights) * 1024
The last line shows that, although processes have different weights, each process's vruntime advances by the same amount over one scheduling period, regardless of its weight. A process with a small vruntime has therefore had less than its fair share of the CPU, so it is chosen as the next process to run.
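The vruntime formula can be checked with a few lines of C. This is a minimal sketch using plain integer division; the kernel's own arithmetic is more involved, and the names demo_vruntime_delta and DEMO_NICE_0_WEIGHT are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NICE_0_WEIGHT 1024ULL  /* weight of a nice-0 process */

/* vruntime increment = actual run time * 1024 / process weight */
static uint64_t demo_vruntime_delta(uint64_t exec_ns, uint64_t weight)
{
    assert(weight > 0);
    return exec_ns * DEMO_NICE_0_WEIGHT / weight;
}
```

Suppose a 12 ms scheduling period is shared by two processes with weights 1024 and 2048. Their time slices are 4 ms and 8 ms, and demo_vruntime_delta(4000000, 1024) and demo_vruntime_delta(8000000, 2048) both return 4000000, so both virtual clocks advance by the same amount.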
4.2.2 Execution time
CFS uses each process's weight, as a proportion of the total weight of all runnable processes in the system, to determine the time slice that each process receives, namely:
time allocated to process = scheduling period * process weight / sum of all process weights
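The time-slice formula can be applied to a whole run queue at once. The sketch below is a user-space illustration with an invented function name, demo_split_period; it assumes at least one process with a non-zero weight and uses integer division, so slices may be rounded down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill slice_ns[i] = period_ns * weight[i] / (sum of all weights). */
static void demo_split_period(uint64_t period_ns, const uint64_t *weight,
                              uint64_t *slice_ns, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += weight[i];
    assert(total > 0);  /* needs at least one runnable process */

    for (size_t i = 0; i < n; i++)
        slice_ns[i] = period_ns * weight[i] / total;
}
```

For a period of 12 ms and weights 1024 and 2048, the slices come out as 4 ms and 8 ms: the heavier process gets twice the CPU time.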
4.2.3 Kernel implementation of the CFS scheduling algorithm
A. Red-black tree: the skeleton
1. A red-black tree is self-balancing: no path from the root to a leaf is more than twice as long as any other.
2. Operations on the tree run in O(log n) time (n is the number of nodes in the tree), so tasks can be inserted or deleted quickly and efficiently.
[Figure: the red-black tree data structure]
(The content above and the related structures can be found in include/linux/sched.h.)
Next, let's go over the CFS structures in detail. The scheduling entity sched_entity represents a unit to be scheduled; when group scheduling is disabled, it can be treated as equivalent to a process. Each task_struct contains a sched_entity, and the process's vruntime and weight are stored in that structure.
The sched_entity structures are organized in a red-black tree: every sched_entity is inserted into the tree with its vruntime as the key, and the leftmost node of the tree, which has the smallest vruntime, is cached so that the process with the least vruntime can be selected quickly.
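The idea of keying the tree by vruntime and caching the leftmost node can be sketched in user space. To stay short, this example swaps in a plain, unbalanced binary search tree for the red-black tree, so it shows the ordering and the cache but not the rebalancing. The types demo_node and demo_rq and both functions are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sched_entity: the key and the tree links. */
struct demo_node {
    uint64_t vruntime;
    struct demo_node *left, *right;
};

/* Hypothetical stand-in for cfs_rq: tree root plus cached leftmost node. */
struct demo_rq {
    struct demo_node *root;
    struct demo_node *leftmost;
};

/* Insert se keyed by vruntime and refresh the leftmost cache if needed. */
static void demo_enqueue(struct demo_rq *rq, struct demo_node *se)
{
    struct demo_node **link = &rq->root;
    int is_leftmost = 1;

    assert(rq != NULL && se != NULL);
    se->left = se->right = NULL;

    while (*link) {
        if (se->vruntime < (*link)->vruntime) {
            link = &(*link)->left;
        } else {
            link = &(*link)->right;
            is_leftmost = 0;  /* went right once, so not the minimum */
        }
    }
    *link = se;

    if (is_leftmost)
        rq->leftmost = se;
}

/* The next entity to run is the cached leftmost node: no tree walk. */
static struct demo_node *demo_pick_next(const struct demo_rq *rq)
{
    return rq->leftmost;
}
```

Because the cache is updated during insertion, picking the next process is a single pointer read. A real red-black tree adds rebalancing so that insertion stays O(log n) even when keys arrive in sorted order.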
B. Two important structures
The completely fair run queue cfs_rq: describes the run-time information for the normal processes in the TASK_RUNNING state on one CPU.
struct cfs_rq {
    struct load_weight load;                 // total weight of the processes on the run queue
    unsigned int nr_running, h_nr_running;   // number of processes

    u64 exec_clock;                          // running clock
    u64 min_vruntime;                        // vruntime baseline that this CPU's run queue advances; generally the smallest vruntime in the red-black tree
#ifndef CONFIG_64BIT
    u64 min_vruntime_copy;
#endif

    struct rb_root tasks_timeline;           // root node of the red-black tree
    struct rb_node *rb_leftmost;             // points to the node with the lowest vruntime

    struct sched_entity *curr, *next, *last, *skip;

#ifdef CONFIG_SCHED_DEBUG
    unsigned int nr_spread_over;
#endif

#ifdef CONFIG_SMP
    struct sched_avg avg;
    u64 runnable_load_sum;
    unsigned long runnable_load_avg;
#ifdef CONFIG_FAIR_GROUP_SCHED
    unsigned long tg_load_avg_contrib;
    unsigned long propagate_avg;
#endif
    atomic_long_t removed_load_avg, removed_util_avg;
#ifndef CONFIG_64BIT
    u64 load_last_update_time_copy;
#endif

#ifdef CONFIG_FAIR_GROUP_SCHED
    unsigned long h_load;
    u64 last_h_load_update;
    struct sched_entity *h_load_next;
#endif /* CONFIG_FAIR_GROUP_SCHED */
#endif /* CONFIG_SMP */

#ifdef CONFIG_FAIR_GROUP_SCHED
    struct rq *rq;                           // the system has a run queue for normal processes and one for real-time processes; both are contained in the rq run queue

    int on_list;
    struct list_head leaf_cfs_rq_list;
    struct task_group *tg;                   /* group that "owns" this runqueue */

#ifdef CONFIG_CFS_BANDWIDTH
    int runtime_enabled;
    u64 runtime_expires;
    s64 runtime_remaining;

    u64 throttled_clock, throttled_clock_task;
    u64 throttled_clock_task_time;
    int throttled, throttle_count;
    struct list_head throttled_list;
#endif /* CONFIG_CFS_BANDWIDTH */
#endif /* CONFIG_FAIR_GROUP_SCHED */
};
The scheduling entity sched_entity: records the run-state information for one process.
struct sched_entity {
    /* For load-balancing: */
    struct load_weight load;                 // weight of the process
    struct rb_node run_node;                 // this entity's node in the run queue's red-black tree
    struct list_head group_node;             // related to group scheduling
    unsigned int on_rq;                      // whether the entity is currently on a run queue

    u64 exec_start;                          // start time of the current scheduling tick
    u64 sum_exec_runtime;                    // actual time the process has run
    u64 vruntime;                            // virtual run time
    u64 prev_sum_exec_runtime;               // time the process had run before this dispatch

    u64 nr_migrations;

    struct sched_statistics statistics;

#ifdef CONFIG_FAIR_GROUP_SCHED
    int depth;
    struct sched_entity *parent;             // parent entity in group scheduling
    /* rq on which this entity is queued: */
    struct cfs_rq *cfs_rq;                   // run queue the process is on now
    /* rq "owned" by this entity/group: */
    struct cfs_rq *my_q;
#endif

    struct sched_avg avg ____cacheline_aligned_in_smp;
};
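To show how the accounting fields work together, here is a cut-down user-space sketch of charging run time to an entity. The struct demo_acct and the function demo_update_curr are invented for this example and keep only four fields; the real kernel routine does considerably more, so treat this as an approximation of the bookkeeping rather than the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced mirror of sched_entity: accounting fields only. */
struct demo_acct {
    uint64_t weight;            /* stands in for load.weight */
    uint64_t exec_start;        /* when the current interval began */
    uint64_t sum_exec_runtime;  /* total real run time so far */
    uint64_t vruntime;          /* weighted virtual run time */
};

/* Charge the time since exec_start to the entity, then restart the interval. */
static void demo_update_curr(struct demo_acct *se, uint64_t now)
{
    assert(se->weight > 0 && now >= se->exec_start);

    uint64_t delta = now - se->exec_start;

    se->sum_exec_runtime += delta;              /* real time */
    se->vruntime += delta * 1024 / se->weight;  /* scaled by nice-0 weight */
    se->exec_start = now;
}
```

For an entity with weight 2048 that started at time 1000, calling demo_update_curr at time 3000 adds 2000 to sum_exec_runtime but only 1000 to vruntime, which is why a heavier process keeps its place near the left of the tree for longer.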
V. Views on the operating system process model
The operating system provides a conceptual model of sequential processes running in parallel. Processes can be created and terminated dynamically, and each process has its own address space. Processes exchange information through inter-process communication primitives, and a process can be in the running, ready, or blocked state. As for process scheduling, the CFS scheduler was introduced in Linux kernel version 2.6.23; before that, the O(1) scheduler was used. CFS is responsible for allocating CPU resources to running processes, with the goal of maximizing interactive performance while also maximizing overall CPU utilization. It is implemented with a red-black tree.