Deep source analysis of Linux process models

Source: Internet
Author: User
Tags: data structures, prev, switches

1. Preface (Experimental content)
    • How the operating system organizes processes
    • How process states change (with a process state transition diagram)
    • How processes are scheduled
    • My view of the operating system process model
2. About the process

  (1) Definition:

A process is one running activity of a computer program over a set of data. It is the basic unit of resource allocation and scheduling in the system and the foundation of the operating system's structure. In early process-oriented computer architectures, the process was the basic execution entity of a program; in contemporary thread-oriented architectures, the process is a container for threads. A program is a description of instructions, data, and their organization, and a process is a running instance of that program. The concept has two main points. First, a process is an entity. Each process has its own address space, which generally includes the text region, the data region, and the stack. The text region stores the code executed by the processor, the data region stores variables and the memory dynamically allocated during execution, and the stack stores the return addresses and local variables of the active procedure calls. Second, a process is a "program in execution". A program is an inanimate entity; only when the operating system runs it on a processor does it become an active entity, which we call a process.

(2) Characteristics of the process:

    • Dynamic: A process is, in essence, one execution of a program in a multiprogramming system; processes are created and destroyed dynamically.
    • Concurrency: Any process can execute concurrently with other processes.
    • Independence: A process is a basic unit that can run independently, and it is also an independent unit of resource allocation and scheduling.
    • Asynchrony: Because processes constrain one another, execution is intermittent; each process advances at its own independent, unpredictable speed.

(3) The difference between a process and a program, a thread:

  In a process-oriented system (such as early versions of Unix, or Linux 2.4 and earlier), a process is the basic execution entity of a program. In a thread-oriented system (such as most modern operating systems, including Linux 2.6 and later), the process itself is not the basic unit of execution but a container for threads. Simply put, the difference between a process and a program is the difference between dynamic and static. Many processes can correspond to one program, and likewise many threads can belong to one process.

3. How the operating system organizes processes

In Linux, a process is described by the task_struct structure, which is defined in the header file /linux/include/linux/sched.h. Each instance of the structure represents one process. task_struct has many members; some of the important ones are analyzed below.

  • Identifier: A unique identifier associated with the process, used to distinguish it from other processes.
  • State: Describes the state of the process. A process can be suspended, blocked, running, and so on, so a field records its current execution state.
  • Priority: When several processes are runnable, the order in which they run depends on their priority.
  • Program counter: The address of the next instruction to be executed in the program.
  • Memory pointers: Pointers to the program code and process-related data.
  • Context data: The data in the processor's registers while the process executes.
  • I/O status information: Includes outstanding I/O requests, the I/O devices assigned to the process, and the list of files used by the process.
  • Accounting information: Includes the total processor time used, account numbers, and so on.
3.1 Process status (state)

In task_struct, the process state is declared as:

    volatile long state;    /* -1 unrunnable, 0 runnable, >0 stopped */

The volatile keyword tells the compiler not to optimize away accesses to this variable, so its value is read from memory every time. This keeps reads of the process state up to date.
The possible values of state are defined in the header file /linux/include/linux/sched.h as follows:

    /*
     * Task state bitmask. NOTE! These bits are also
     * encoded in fs/proc/array.c: get_task_state().
     *
     * We have two separate sets of flags: task->state
     * is about runnability, while task->exit_state are
     * about the task exiting. Confusing, but this way
     * modifying one set can't modify the other one by
     * mistake.
     */
    #define TASK_RUNNING            0
    #define TASK_INTERRUPTIBLE      1
    #define TASK_UNINTERRUPTIBLE    2
    #define TASK_STOPPED            4
    #define TASK_TRACED             8

    /* in tsk->exit_state */
    #define EXIT_ZOMBIE             16
    #define EXIT_DEAD               32

    /* in tsk->state again */
    #define TASK_NONINTERACTIVE     64
    #define TASK_DEAD               128

According to the comment that follows state, state < 0 means the process is not runnable, state = 0 means it is runnable, and state > 0 means it is stopped.
The following table lists the common values of state:

| Status | Description |
| :---------------------- | :----------------------------------------------------------- |
| 0 (TASK_RUNNING) | The process is running or ready to run |
| 1 (TASK_INTERRUPTIBLE) | The process is in an interruptible sleep and can be woken by a signal |
| 2 (TASK_UNINTERRUPTIBLE) | The process is in an uninterruptible sleep and cannot be woken by a signal |
| 4 (TASK_STOPPED) | The process has been stopped |
| 8 (TASK_TRACED) | The process is being traced by another process, such as a debugger |
| 16 (EXIT_ZOMBIE) | Zombie state: the process has terminated, but its parent has not yet collected information about its termination |
| 32 (EXIT_DEAD) | The process is dead; this is the final state of a process |
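
The table above can be turned into a small lookup, which is sometimes handy when reading raw state values. The following is a minimal sketch, not kernel code: the constants are copied from the listing above, while `state_name` and `is_sleeping` are hypothetical helper names introduced only for illustration.

```c
/* Values copied from the 2.6-era listing above. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_STOPPED          4
#define TASK_TRACED           8
#define EXIT_ZOMBIE           16
#define EXIT_DEAD             32

/* Hypothetical helper: map one state value to a readable name. */
const char *state_name(long state)
{
    switch (state) {
    case TASK_RUNNING:         return "running or runnable";
    case TASK_INTERRUPTIBLE:   return "interruptible sleep";
    case TASK_UNINTERRUPTIBLE: return "uninterruptible sleep";
    case TASK_STOPPED:         return "stopped";
    case TASK_TRACED:          return "traced";
    case EXIT_ZOMBIE:          return "zombie";
    case EXIT_DEAD:            return "dead";
    default:                   return "unknown";
    }
}

/* Hypothetical helper: because the values are distinct bits,
 * several states can be tested with a single mask. */
int is_sleeping(long state)
{
    return (state & (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)) != 0;
}
```

Note that TASK_RUNNING is 0, so it cannot be tested with a mask; it has to be compared for equality, as the switch statement does.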

3.2 Process identifier (PID)

    pid_t pid;   /* unique identifier of the process */
    pid_t tgid;  /* identifier of the thread group */

In Linux, all threads in a thread group present the same PID to user space: the PID of the thread group leader (the first lightweight process in the group). That value is stored in the tgid member. Only the thread group leader has a pid member equal to its tgid. Note that the getpid() system call returns the tgid of the current process, not its pid. (A thread is the smallest unit of execution of a program, and a process is the basic unit in which the program runs.)
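
The pid/tgid distinction can be observed from user space. The sketch below assumes a Linux system: `getpid()` reports the kernel's tgid, and the raw `gettid` system call (invoked through `syscall(SYS_gettid)`) reports the kernel's pid of the calling thread. The three function names are hypothetical wrappers added for illustration.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel "pid" of the calling thread (user space calls this the TID). */
pid_t kernel_pid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Kernel "tgid" of the calling thread; this is what getpid() returns. */
pid_t kernel_tgid(void)
{
    return getpid();
}

/* True when the caller is the thread group leader (pid == tgid). */
int is_group_leader(void)
{
    return kernel_pid() == kernel_tgid();
}
```

Called from the initial thread of a program, `is_group_leader()` should return 1; called from a thread created later, it should return 0 while `kernel_tgid()` stays the same.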

3.3 Process flags (flags)

    unsigned int flags; /* per process flags, defined below */

flags holds information that reflects the status of the process but not its run state. The kernel uses it to identify the current condition of the process and decide what to do next.

The possible values of the flags member are listed below; the macros all begin with PF (process flag).

    /*
     * Per process flags
     */
    #define PF_ALIGNWARN     0x00000001 /* Print alignment warning msgs */
                                        /* Not implemented yet, only for 486 */
    #define PF_STARTING      0x00000002 /* Being created */
    #define PF_EXITING       0x00000004 /* Getting shut down */
    #define PF_EXITPIDONE    0x00000008 /* PI exit done on shut down */
    #define PF_FORKNOEXEC    0x00000040 /* Forked but didn't exec */
    #define PF_SUPERPRIV     0x00000100 /* Used super-user privileges */
    #define PF_DUMPCORE      0x00000200 /* Dumped core */
    #define PF_SIGNALED      0x00000400 /* Killed by a signal */
    #define PF_MEMALLOC      0x00000800 /* Allocating memory */
    #define PF_FLUSHER       0x00001000 /* Responsible for disk writeback */
    #define PF_USED_MATH     0x00002000 /* If unset the FPU must be initialized before use */
    #define PF_NOFREEZE      0x00008000 /* This thread should not be frozen */
    #define PF_FROZEN        0x00010000 /* Frozen for system suspend */
    #define PF_FSTRANS       0x00020000 /* Inside a filesystem transaction */
    #define PF_KSWAPD        0x00040000 /* I am kswapd */
    #define PF_SWAPOFF       0x00080000 /* I am in swapoff */
    #define PF_LESS_THROTTLE 0x00100000 /* Throttle me less: I clean memory */
    #define PF_BORROWED_MM   0x00200000 /* I am a kthread doing use_mm */
    #define PF_RANDOMIZE     0x00400000 /* Randomize virtual address space */
    #define PF_SWAPWRITE     0x00800000 /* Allowed to write to swap */
    #define PF_SPREAD_PAGE   0x01000000 /* Spread page cache over cpuset */
    #define PF_SPREAD_SLAB   0x02000000 /* Spread some slab caches over cpuset */
    #define PF_MEMPOLICY     0x10000000 /* Non-default NUMA mempolicy */
    #define PF_MUTEX_TESTER  0x20000000 /* Thread belongs to the rt mutex tester */
    #define PF_FREEZER_SKIP  0x40000000 /* Freezer should not count it as freezable */

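
Each PF value occupies its own bit, so several flags can be combined in one unsigned int and tested independently. A minimal sketch of that usage follows; the two constants are copied from the listing above, and the `toy_` helper names are hypothetical, not kernel functions.

```c
/* Two values copied from the flag listing above. */
#define PF_EXITING  0x00000004u  /* getting shut down */
#define PF_KSWAPD   0x00040000u  /* I am kswapd */

/* Return flags with the given bit switched on. */
unsigned int toy_set_flag(unsigned int flags, unsigned int flag)
{
    return flags | flag;
}

/* Return flags with the given bit switched off. */
unsigned int toy_clear_flag(unsigned int flags, unsigned int flag)
{
    return flags & ~flag;
}

/* Return 1 if the given bit is set in flags, otherwise 0. */
int toy_test_flag(unsigned int flags, unsigned int flag)
{
    return (flags & flag) != 0;
}
```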
3.4 The relationship between processes
    /*
     * pointers to (original) parent process, youngest child, younger sibling,
     * older sibling, respectively.  (p->father can be replaced with
     * p->parent->pid)
     */
    struct task_struct *real_parent;  /* real parent process (when being debugged) */
    struct task_struct *parent;       /* parent process */
    /*
     * children/sibling forms the list of my children plus the
     * tasks I'm ptracing.
     */
    struct list_head children;        /* list of my children */
    struct list_head sibling;         /* linkage in my parent's children list */
    struct task_struct *group_leader; /* threadgroup leader */


In Linux, all processes are directly or indirectly related: each process has a parent and may have zero or more children. Processes that have the same parent are siblings.

real_parent points to the process's parent; if the parent that created it no longer exists, it points to the init process with PID 1. parent points to the process that must be sent a signal when this process terminates; its value is usually the same as real_parent. children is the head of a list whose elements are all of the process's children (the child list). sibling links the current process into its parent's child list (the sibling list). group_leader points to the leader of the thread group to which the process belongs.
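
To make the parent, child, and sibling links concrete, here is a small sketch that models them with plain pointers. It is an illustration only: the kernel uses the doubly linked `struct list_head`, whereas `struct toy_proc` and every function name below are hypothetical and use a simpler singly linked child list.

```c
#include <stddef.h>

/* Toy process node; not the kernel's task_struct layout. */
struct toy_proc {
    int pid;
    struct toy_proc *parent;        /* process that owns this one */
    struct toy_proc *first_child;   /* head of this process's child list */
    struct toy_proc *next_sibling;  /* next entry in the parent's child list */
};

/* Link child at the front of parent's child list. */
void toy_add_child(struct toy_proc *parent, struct toy_proc *child)
{
    child->parent = parent;
    child->next_sibling = parent->first_child;
    parent->first_child = child;
}

/* Count the direct children of p by walking the sibling links. */
int toy_count_children(const struct toy_proc *p)
{
    int n = 0;
    for (const struct toy_proc *c = p->first_child; c != NULL; c = c->next_sibling)
        n++;
    return n;
}

/* When p goes away, hand its children to init (PID 1), as described above. */
void toy_reparent_to_init(struct toy_proc *p, struct toy_proc *init)
{
    while (p->first_child != NULL) {
        struct toy_proc *c = p->first_child;
        p->first_child = c->next_sibling;
        toy_add_child(init, c);
    }
}
```

For example, after adding two children to a process and then calling `toy_reparent_to_init` on it, both children report init as their parent and the original process has an empty child list.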

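3.5 Process Scheduling

3.5.1 Priority

    int prio, static_prio, normal_prio;
    unsigned int rt_priority;
    /*
     * prio:        holds the dynamic priority
     * static_prio: holds the static priority; it can be changed with the nice system call
     * normal_prio: its value depends on the static priority and the scheduling policy
     * rt_priority: holds the real-time priority
     */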
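
The link between a nice value and static_prio is a fixed offset. The sketch below reproduces the conversion macros as I recall them from the 2.6 scheduler source (MAX_RT_PRIO of 100, 40 nice levels); treat the exact definitions as an assumption to check against your kernel tree. `static_prio_from_nice` is a hypothetical wrapper added for illustration.

```c
#include <assert.h>

/* Assumed 2.6-era definitions; verify against your kernel version. */
#define MAX_RT_PRIO         100
#define MAX_PRIO            (MAX_RT_PRIO + 40)
#define NICE_TO_PRIO(nice)  (MAX_RT_PRIO + (nice) + 20)
#define PRIO_TO_NICE(prio)  ((prio) - MAX_RT_PRIO - 20)

/* Hypothetical wrapper: static priority for a nice value in [-20, 19]. */
int static_prio_from_nice(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return NICE_TO_PRIO(nice);
}
```

Under these definitions, nice 0 maps to static priority 120, nice -20 to 100, and nice 19 to 139, so a numerically smaller priority value means a more favored process.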
3.5.2 Scheduling Policy
    unsigned int policy;
    cpumask_t cpus_allowed;
    /*
     * policy:       the scheduling policy of the process
     * cpus_allowed: controls which processors the process may run on
     */

policy is the scheduling policy of the process. There are currently five policies:

    /*
     * Scheduling policies
     */
    #define SCHED_NORMAL    0   /* scheduled by priority */
    #define SCHED_FIFO      1   /* first-in, first-out scheduling */
    #define SCHED_RR        2   /* round-robin scheduling with time slices */
    #define SCHED_BATCH     3   /* for non-interactive, CPU-bound processes */
    #define SCHED_IDLE      5   /* scheduled when system load is very low */

| Field | Description | Scheduler class |
| :---------------------- | :----------------------------------------------------------- | :--- |
| SCHED_NORMAL | Also called SCHED_OTHER. Used for normal processes and implemented by the CFS scheduler. SCHED_BATCH is used for non-interactive, CPU-bound processes, and SCHED_IDLE is used when the system load is low | CFS |
| SCHED_FIFO | First-in, first-out scheduling (a real-time policy). Tasks with the same priority are served first come, first served, and higher-priority tasks can preempt lower-priority ones | RT |
| SCHED_RR | Round-robin scheduling (a real-time policy). It uses time slices: a task that uses up its slice is moved to the end of the queue for its priority to ensure fairness. Higher-priority tasks can likewise preempt lower-priority ones. Real-time tasks with different requirements can set their policy with the sched_setscheduler() API as needed | RT |
| SCHED_BATCH | A variant of the SCHED_NORMAL policy for ordinary processes. It uses time sharing and allocates CPU time according to dynamic priority (which can be set with the nice() API). Processes of this type have lower priority than the two real-time types above; when a real-time process is runnable, it is scheduled first. The policy is optimized for throughput | CFS |
| SCHED_IDLE | The lowest priority. Such processes run only when the system is idle (for example, using spare computing resources for the search for extraterrestrial intelligence or for protein structure analysis) | CFS |

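
From user space, the policy of a process can be inspected through the POSIX scheduling interface in `<sched.h>`. The sketch below uses only `sched_getscheduler()`, `sched_get_priority_min()`, and `sched_get_priority_max()`; the wrapper names are hypothetical. User space calls the default policy SCHED_OTHER, which corresponds to the kernel's SCHED_NORMAL.

```c
#include <sched.h>

/* Hypothetical helper: readable name for the three POSIX policies. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

/* Policy of the calling process (pid 0 means "myself"); -1 on error. */
int current_policy(void)
{
    return sched_getscheduler(0);
}

/* Store the valid rt_priority range for a policy; returns 0 on success. */
int rt_priority_range(int policy, int *min, int *max)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);
    if (lo == -1 || hi == -1)
        return -1;
    *min = lo;
    *max = hi;
    return 0;
}
```

Changing the policy with `sched_setscheduler()` to one of the real-time classes normally requires elevated privileges, which is why this sketch only reads values.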
3.6 The address space of the process

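Each process has its own resources, chiefly its own address space. In task_struct, the process address space is declared as follows:

    struct mm_struct *mm, *active_mm;
    /*
     * mm:        memory descriptor of the user address space owned by the
     *            process; for a kernel thread, mm is NULL.
     * active_mm: points to the memory descriptor the process uses while it
     *            runs. For an ordinary process, the two pointers hold the
     *            same value. A kernel thread has no process address space,
     *            so its tsk->mm field is NULL. The kernel still needs to
     *            know what the user address space contains, so the kernel
     *            thread's active_mm is set to the active_mm of the
     *            previously running process.
     */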

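If the task that ran just before a kernel thread was itself a kernel thread, the new thread takes over the active_mm that the previous thread had borrowed. A kernel thread's mm is always NULL, and its active_mm is also NULL whenever the thread is not running.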
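
The borrowing of active_mm can be modeled in a few lines. This is a simplified sketch of the idea only, written under the assumption that the outgoing kernel thread drops its borrowed pointer when it is switched out; `struct toy_mm`, `struct toy_task`, and `toy_context_switch` are hypothetical names, and reference counting and TLB handling are left out entirely.

```c
#include <stddef.h>

struct toy_mm { int id; };          /* stands in for mm_struct */

struct toy_task {
    struct toy_mm *mm;              /* NULL for a kernel thread */
    struct toy_mm *active_mm;       /* address space in use while running */
};

/* Model the mm hand-over when the CPU switches from prev to next. */
void toy_context_switch(struct toy_task *prev, struct toy_task *next)
{
    struct toy_mm *oldmm = prev->active_mm;

    if (next->mm == NULL)
        next->active_mm = oldmm;     /* kernel thread borrows the previous mm */
    else
        next->active_mm = next->mm;  /* user process runs on its own mm */

    if (prev->mm == NULL)
        prev->active_mm = NULL;      /* outgoing kernel thread gives it back */
}
```

Switching from a user process to a kernel thread and then to a second kernel thread leaves the second thread using the user process's descriptor, while the first thread's active_mm is NULL again.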
This concludes the analysis of how the operating system organizes processes. With it as a basis, we can move on to the next part.

4. How the process state is converted

The definition of the Linux process state, its values, and their descriptions were analyzed in detail in section 3.1, so they are not repeated here.
The following diagram shows how the states of a process convert into one another:

[Figure: process state transition diagram]

5. How the process is scheduled

5.1 Data structures related to process scheduling

Before looking at how processes are scheduled, we need to understand some of the data structures involved.

5.1.1 Run Queue (runqueue)

In the file /kernel/sched.c, the run queue is defined as struct rq. Each CPU has one struct rq, which mainly stores basic information used for scheduling, covering both real-time scheduling and CFS scheduling. struct rq is a very important data structure in Linux kernel 2.6. Let's look at some of its important fields:

    /* Selected fields, with comments */

    /* Spin lock of the runqueue; it must be held while operating on the
       runqueue. Since each CPU has its own runqueue, lock contention is
       greatly reduced. */
    spinlock_t lock;

    /* Records the earliest time at which a process in the active array
       used up its time slice */
    unsigned long expired_timestamp;

    /* Total number of runnable processes on this CPU, that is, the sum of
       the processes in the active array and the expired array */
    unsigned long nr_running;

    /* Number of process switches since the CPU started running */
    unsigned long long nr_switches;

    /* Number of processes on this CPU in the uninterruptible state */
    unsigned long nr_uninterruptible;

    /* The most important part of rq; analyzed in detail below */
    struct prio_array *active, *expired, arrays[2];

5.1.2 Priority array (prio_array)

In Linux kernel 2.6, rq contains two priority arrays: the active array and the expired array.
Both have the type struct prio_array, which is defined in /kernel/sched.c. Its data structure is:

    struct prio_array {
        unsigned int nr_active;
        DECLARE_BITMAP(bitmap, MAX_PRIO+1); /* include 1 bit for delimiter */
        /*
         * Reserves MAX_PRIO + 1 bits. When a task of some priority is in the
         * TASK_RUNNING state, the bit for that priority is set to 1. To find
         * the highest priority that needs to run, it is enough to find which
         * bit of the bitmap is set to 1.
         */
        struct list_head queue[MAX_PRIO]; /* one list head per priority */
    };

The active array is the queue of runnable processes from which the CPU chooses what to execute. The processes in it still have time slice remaining, and the *active pointer always points to it.
The expired array stores the processes from the active array that have used up their time slices; the *expired pointer always points to it.
Once a normal process in the active array uses up its time slice, the scheduler recalculates its time slice and priority, removes it from the active array, and inserts it into the queue for its priority in the expired array.
When all tasks in the active array have used up their time slices, the run queues are switched simply by swapping the *active and *expired pointers.
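
The pointer swap described above can be sketched with a toy run queue that tracks only the task counts. This is an illustration under simplifying assumptions: the per-priority queues and bitmap are omitted, and `struct toy_prio_array`, `struct toy_rq`, and the `toy_` functions are hypothetical names.

```c
#include <assert.h>

struct toy_prio_array {
    unsigned int nr_active;          /* tasks currently held in this array */
};

struct toy_rq {
    struct toy_prio_array *active, *expired;
    struct toy_prio_array arrays[2];
};

/* Start with both arrays empty; active and expired point at one each. */
void toy_rq_init(struct toy_rq *rq)
{
    rq->arrays[0].nr_active = 0;
    rq->arrays[1].nr_active = 0;
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* One task used up its time slice: move it from active to expired. */
void toy_expire_one(struct toy_rq *rq)
{
    assert(rq->active->nr_active > 0);
    rq->active->nr_active--;
    rq->expired->nr_active++;
}

/* If the active array is empty, swap the two pointers. Returns 1 if swapped. */
int toy_swap_if_empty(struct toy_rq *rq)
{
    if (rq->active->nr_active != 0)
        return 0;
    struct toy_prio_array *tmp = rq->active;
    rq->active = rq->expired;
    rq->expired = tmp;
    return 1;
}
```

The swap costs the same no matter how many tasks are waiting, because no task is moved; only the two pointers change.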

5.1.3 Scheduler main function (schedule ())

The schedule() function is in /kernel/sched.c. It is a very important function of the Linux kernel: it picks the next process to run and carries out the switch to it, which makes it the main executor of process scheduling.

5.2 Scheduling algorithm (O(1) algorithm)

5.2.1 Introduction to the O(1) algorithm

What is the O(1) algorithm? The algorithm can always select the highest-priority process within a bounded, constant time, regardless of the number of runnable processes in the system, hence the name O(1) algorithm.

5.2.2 Principles of the O(1) algorithm

We mentioned earlier the two priority-ordered arrays, the active array and the expired array. These two arrays are the key to implementing the O(1) algorithm.
Each time it runs, the O(1) scheduling algorithm selects the highest-priority process in the active array.
So how does the algorithm find the highest-priority process? Recall the DECLARE_BITMAP(bitmap, MAX_PRIO+1); field in prio_array. This is where it comes into play (see the code comment for details): finding the first bit in the bitmap that is set to 1 gives the highest priority among the runnable tasks (idx, obtained with sched_find_first_bit()). The process list (queue) for idx then holds the runnable processes with the highest priority, and they are executed in turn.
This procedure is implemented in the schedule() function; the main code is:

    struct task_struct *prev, *next;
    struct list_head *queue;
    struct prio_array *array;
    int idx;

    prev = current;
    array = rq->active;
    idx = sched_find_first_bit(array->bitmap);  /* index of the first non-zero bit in the bitmap */
    queue = array->queue + idx;                 /* head of the corresponding queue list */
    next = list_entry(queue->next, struct task_struct, run_list);  /* get the process descriptor */
    if (prev != next)   /* if the chosen process is not the current one, switch context */
        context_switch();

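
The bitmap lookup at the heart of this code can be sketched in portable C. This is a toy version under stated assumptions, not the kernel's sched_find_first_bit(): `TOY_MAX_PRIO` is assumed to be 140, the `toy_` names are hypothetical, and a simple shift loop replaces the architecture-specific bit instructions the kernel relies on.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_MAX_PRIO   140   /* assumed number of priority levels */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS   ((TOY_MAX_PRIO + 1 + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct toy_bitmap { unsigned long words[BITMAP_WORDS]; };

/* Mark priority prio as having at least one runnable task. */
void toy_set_bit(struct toy_bitmap *b, int prio)
{
    assert(prio >= 0 && prio <= TOY_MAX_PRIO);
    b->words[prio / BITS_PER_WORD] |= 1UL << (prio % BITS_PER_WORD);
}

/* Mark priority prio as having no runnable task. */
void toy_clear_bit(struct toy_bitmap *b, int prio)
{
    assert(prio >= 0 && prio <= TOY_MAX_PRIO);
    b->words[prio / BITS_PER_WORD] &= ~(1UL << (prio % BITS_PER_WORD));
}

/* Clear every bit, then set the delimiter bit so a search always stops. */
void toy_bitmap_init(struct toy_bitmap *b)
{
    for (size_t w = 0; w < BITMAP_WORDS; w++)
        b->words[w] = 0;
    toy_set_bit(b, TOY_MAX_PRIO);
}

/* Index of the lowest set bit, i.e. the most favored runnable priority. */
int toy_find_first_bit(const struct toy_bitmap *b)
{
    for (size_t w = 0; w < BITMAP_WORDS; w++) {
        unsigned long word = b->words[w];
        if (word != 0) {
            int bit = 0;
            while (!(word & 1UL)) {
                word >>= 1;
                bit++;
            }
            return (int)(w * BITS_PER_WORD) + bit;
        }
    }
    return TOY_MAX_PRIO;  /* not reached once the delimiter bit is set */
}
```

The search examines a fixed number of words and bits that depends only on `TOY_MAX_PRIO`, never on how many tasks are queued, which is what gives the scheduler its constant-time bound. A result equal to `TOY_MAX_PRIO` means only the delimiter bit is set and no task is runnable.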
6. View of the operating system process model

Many years ago, some people said that Linux would definitely replace Windows. After all these years, as far as I can tell, more and more people use Windows and more and more abandon Linux. Put simply, on the desktop I do not think Linux has beaten Windows. Windows is produced by an enterprising commercial company; it is attractive and convenient. It is not that Linux lacks technology, but what it produces is appreciated by almost no one and is hard for others to understand. Still, Linux has its advantages, and I support it. If it does not change, though, it can only stay as it is, circulating among a group of enthusiastic professional and amateur programmers.

 

  
