First assignment: Linux 2.6.28 process model and CFS scheduler analysis

Last Update:2018-04-26 Source: Internet

Author: User


1. Summary

This article examines Linux kernel 2.6.28. It describes what a process is, how the kernel organizes processes, and how processes are scheduled.

Linux kernel source browser: https://elixir.bootlin.com/linux/v4.6/source/include/linux/types.h

2. What is a process

2.1 The concept of a process

A textbook definition of a process:

A process is one execution of a program with a certain independent function over a set of data, and it is the independent unit that the operating system uses for resource allocation and scheduling.

In short, a process is a management instance that the operating system creates for a running program.

A process consists of five entities:

P: the data structure the OS uses to manage the running program
C: the in-memory code of the running program
D: the in-memory data of the running program
R: the general-purpose register contents of the running program
PSW: the program status word, which the OS uses to control program execution

2.2 Viewing processes

2.2.1 Processes on Windows

2.2.2 Processes on Ubuntu

3. How processes are organized

In the Linux kernel, a single struct describes a process and ties its information together: the task_struct data structure. It is defined in include/linux/sched.h, and its definition runs to roughly 400 lines.

3.1 Process ID

The process ID types are defined in include/linux/pid.h:

    enum pid_type
    {
        PIDTYPE_PID,
        PIDTYPE_PGID,
        PIDTYPE_SID,
        PIDTYPE_MAX
    };

Here we focus on the most important of these, the PID.

3.1.1 Process identifier (PID)

Linux assigns every process a unique process ID, the PID. It uniquely identifies the process in the system for as long as the process exists, but a PID is not permanently tied to a program: the same program run at different times generally receives different PIDs. A process created with the fork or clone system call is assigned a new, unique PID by the kernel.

pid_t pid;

As the code above shows, the pid field in task_struct has the type pid_t, which is essentially an int, so a PID is simply a number.

3.1.2 Range of the PID

In include/linux/threads.h, the kernel sets the default upper limit for PID values:

#define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000)

Thus, with the default configuration, PID values stay below 0x8000 (32,768), which also caps the number of processes that can exist at the same time at roughly 32,768.

3.1.3 How PIDs are generated

So where does a PID come from? kernel/pid.c answers this question:

    static int alloc_pidmap(struct pid_namespace *pid_ns)
    {
        int i, offset, max_scan, pid, last = pid_ns->last_pid;
        struct pidmap *map;

        pid = last + 1;
        if (pid >= pid_max)
            pid = RESERVED_PIDS;
        offset = pid & BITS_PER_PAGE_MASK;
        map = &pid_ns->pidmap[pid/BITS_PER_PAGE];
        max_scan = (pid_max + BITS_PER_PAGE - 1)/BITS_PER_PAGE - !offset;
        for (i = 0; i <= max_scan; ++i) {
            if (unlikely(!map->page)) {
                void *page = kzalloc(PAGE_SIZE, GFP_KERNEL);
                /*
                 * Free the page if someone raced with us
                 * installing it:
                 */
                spin_lock_irq(&pidmap_lock);
                if (map->page)
                    kfree(page);
                else
                    map->page = page;
                spin_unlock_irq(&pidmap_lock);
                if (unlikely(!map->page))
                    break;
            }
            if (likely(atomic_read(&map->nr_free))) {
                do {
                    if (!test_and_set_bit(offset, map->page)) {
                        atomic_dec(&map->nr_free);
                        pid_ns->last_pid = pid;
                        return pid;
                    }
                    offset = find_next_offset(map, offset);
                    pid = mk_pid(pid_ns, map, offset);
                /*
                 * find_next_offset() found a bit, the pid from it
                 * is in-bounds, and if we fell back to the last
                 * bitmap block and the final block was the same
                 * as the starting point, pid is before last_pid.
                 */
                } while (offset < BITS_PER_PAGE && pid < pid_max &&
                        (i != max_scan || pid < last ||
                            !((last+1) & BITS_PER_PAGE_MASK)));
            }
            if (map < &pid_ns->pidmap[(pid_max-1)/BITS_PER_PAGE]) {
                ++map;
                offset = 0;
            } else {
                map = &pid_ns->pidmap[0];
                offset = RESERVED_PIDS;
                if (unlikely(last == offset))
                    break;
            }
            pid = mk_pid(pid_ns, map, offset);
        }
        return -1;
    }

The alloc_pidmap function allocates a PID. Similarly, kernel/pid.c also defines the function that releases a PID:

    static void free_pidmap(struct upid *upid)
    {
        int nr = upid->nr;
        struct pidmap *map = upid->ns->pidmap + nr / BITS_PER_PAGE;
        int offset = nr & BITS_PER_PAGE_MASK;

        clear_bit(offset, map->page);
        atomic_inc(&map->nr_free);
    }

3.2 Process states

3.2.1 Process state definitions

In Linux, there are six main process states:

Code	Name	Description
R	TASK_RUNNING	Runnable
S	TASK_INTERRUPTIBLE	Interruptible sleep
D	TASK_UNINTERRUPTIBLE	Uninterruptible sleep
T	TASK_STOPPED or TASK_TRACED	Stopped or being traced
Z	TASK_DEAD - EXIT_ZOMBIE	Exiting; the process has become a zombie
X	TASK_DEAD - EXIT_DEAD	Exiting; the process is about to be destroyed

They are defined in include/linux/sched.h:

    #define TASK_RUNNING            0
    #define TASK_INTERRUPTIBLE      1
    #define TASK_UNINTERRUPTIBLE    2
    #define TASK_STOPPED            4
    #define EXIT_ZOMBIE            16
    #define EXIT_DEAD              32

Some operating system textbooks use "running" only for a process that is currently executing on a CPU and call a process that is runnable but not yet scheduled "ready". Linux folds both into the single TASK_RUNNING state.
On a normally running machine, most processes are in the TASK_INTERRUPTIBLE state. This keeps the system responsive without spending CPU time on processes that have nothing to do.
Why is sleep split into an interruptible and an uninterruptible kind? The uninterruptible state prevents a process from being interrupted in the middle of an interaction with a device, which could leave the machine in an uncontrolled state.
A process is in the TASK_DEAD state while it exits. At this point most of the resources it held are reclaimed, except for task_struct and a few other special resources. What remains is only an empty shell, which is why such a process is called a zombie (EXIT_ZOMBIE).

3.2.2 Process State transitions

The following diagram provides a brief overview of the transition of process state in the system:

Although the system has six process states, state transitions essentially happen in only two directions: from TASK_RUNNING to a non-TASK_RUNNING state, and from a non-TASK_RUNNING state back to TASK_RUNNING.

For example, when a process in the TASK_INTERRUPTIBLE state is told to terminate, it does not go directly to TASK_DEAD. It is first woken up into TASK_RUNNING, and only from there does it enter TASK_DEAD. A process in the TASK_RUNNING state can leave it in only two ways: by responding to a signal and entering TASK_STOPPED or TASK_DEAD, or by making a system call and entering TASK_INTERRUPTIBLE.

4. How processes are scheduled

4.1 The CFS scheduler

Starting with Linux kernel 2.6.23, the O(1) scheduler was replaced by CFS (the Completely Fair Scheduler).

CFS uses vruntime to measure the priority of a process. It is calculated with the following formula:

vruntime = run time allocated to the process * NICE_0_LOAD / process weight

Here NICE_0_LOAD is the weight of a process with nice value 0, which is 1024. Process weights correspond one-to-one to nice values, and the conversion is done through the global array prio_to_weight.

    static const int prio_to_weight[40] = {
     /* -20 */     88761,     71755,     56483,     46273,     36291,
     /* -15 */     29154,     23254,     18705,     14949,     11916,
     /* -10 */      9548,      7620,      6100,      4904,      3906,
     /*  -5 */      3121,      2501,      1991,      1586,      1277,
     /*   0 */      1024,       820,       655,       526,       423,
     /*   5 */       335,       272,       215,       172,       137,
     /*  10 */       110,        87,        70,        56,        45,
     /*  15 */        36,        29,        23,        18,        15,
    };

But how is the actual run time of a process determined?

It is calculated with the formula

actual run time of the process = scheduling period * process weight / sum of the weights of all processes

Here the scheduling period is the time it takes for every process in the TASK_RUNNING state to be scheduled once.

In the ideal case, the actual run time of a process equals the run time the system allocated to it. Combining the two formulas then gives

vruntime = (scheduling period * process weight / sum of the weights of all processes) * 1024 / process weight = scheduling period * 1024 / sum of the weights of all processes

From this we can see that, even though different processes have different weights, their vruntime values are ideally the same. So if a process has a small vruntime, it has not received the run time it is due, and the operating system should choose it to run first.

The above is the main idea of CFS.

vruntime and the process weight are stored in the sched_entity data structure, which represents a scheduling entity and is defined in include/linux/sched.h:

    struct sched_entity {
        struct load_weight  load;       /* for load-balancing */
        struct rb_node      run_node;
        struct list_head    group_node;
        unsigned int        on_rq;

        u64                 exec_start;
        u64                 sum_exec_runtime;
        u64                 vruntime;
        u64                 prev_sum_exec_runtime;

        u64                 last_wakeup;
        u64                 avg_overlap;

    #ifdef CONFIG_SCHEDSTATS
        u64                 wait_start;
        u64                 wait_max;
        u64                 wait_count;
        u64                 wait_sum;

        u64                 sleep_start;
        u64                 sleep_max;
        s64                 sum_sleep_runtime;

        u64                 block_start;
        u64                 block_max;
        u64                 exec_max;
        u64                 slice_max;

        u64                 nr_migrations;
        u64                 nr_migrations_cold;
        u64                 nr_failed_migrations_affine;
        u64                 nr_failed_migrations_running;
        u64                 nr_failed_migrations_hot;
        u64                 nr_forced_migrations;
        u64                 nr_forced2_migrations;

        u64                 nr_wakeups;
        u64                 nr_wakeups_sync;
        u64                 nr_wakeups_migrate;
        u64                 nr_wakeups_local;
        u64                 nr_wakeups_remote;
        u64                 nr_wakeups_affine;
        u64                 nr_wakeups_affine_attempts;
        u64                 nr_wakeups_passive;
        u64                 nr_wakeups_idle;
    #endif

    #ifdef CONFIG_FAIR_GROUP_SCHED
        struct sched_entity *parent;
        /* rq on which this entity is queued: */
        struct cfs_rq       *cfs_rq;
        /* rq "owned" by this entity/group: */
        struct cfs_rq       *my_q;
    #endif
    };

4.2 Red-black tree

The individual sched_entity instances are organized in a red-black tree ordered by vruntime.

The process with the smallest vruntime is stored at the leftmost node of the tree, so the scheduler can quickly pick the process with the smallest vruntime.

5. Views on the operating system process model

Operating systems have long tried to define fairness: must interactive processes always be given precedence? CFS gives its own answer. It no longer tries to single out interactive processes but treats all processes equally, as the name Completely Fair suggests. Its arrival turned the well-known O(1) scheduler into a short-lived episode. Linux has gone through many versions since then without CFS being replaced, which speaks for the strength of its design.



This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
