First assignment: The process model of Linux and the analysis of the CFS scheduler algorithm

Source: Internet
Author: User

1. About processes

1.1. Definition of a process

Process: the basic unit that can run independently and to which the system allocates resources. It consists of a set of machine instructions, data, and a stack, and is an active entity that can run independently.

    1. A process is a single execution of a program.
    2. A process can execute in parallel with other computations.
    3. A process is the activity of a program running on a data set; it is an independent unit of system resource allocation and scheduling.

1.2. Characteristics of the process

1. Dynamic: A process is, in essence, one execution of a program; it is created dynamically and terminates dynamically.
2. Concurrency: Any process can execute concurrently with other processes.
3. Independence: A process is a unit that runs independently, and it is also the independent unit to which the system allocates resources and which it schedules.
4. Asynchrony: Because processes constrain one another, each executes intermittently; that is, each process advances at its own independent, unpredictable speed.

2. Organization of processes

task_struct is a Linux kernel data structure that resides in RAM and holds the information about a process. Each process keeps its information in its own task_struct, which contains the following:

Identifier: A unique identifier that distinguishes the process from other processes.

Status: Task status, exit code, exit signal, etc.

Priority: The priority relative to other processes.

Program counter: The address of the next instruction that will be executed in the program.

Memory pointers: Pointers to program code and process-related data, as well as memory blocks shared with other processes.

Context data: The data in the processor's registers when the process executes.

I/O status information: Includes outstanding I/O requests, the I/O devices assigned to the process, and the list of files used by the process.

Accounting information: May include total processor time, the number of clock ticks used, time limits, account numbers, and so on.

The data structure that holds the process information is called task_struct and is defined in include/linux/sched.h. Inside the kernel, the processes running in the system are therefore kept as linked lists of task_struct structures.
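
To make the idea of "a list of descriptors" concrete, here is a minimal user-space sketch. It is not kernel code: struct mini_task, task_list_add() and task_list_find() are hypothetical names, the real task_struct has far more members, and the kernel links descriptors with its own list_head type rather than a bare next pointer.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, heavily simplified stand-in for the kernel's task_struct. */
struct mini_task {
    pid_t pid;                 /* identifier */
    long state;                /* status */
    int prio;                  /* priority */
    unsigned long sum_exec_ns; /* accounting information */
    struct mini_task *next;    /* link to the next descriptor in the list */
};

/* Push a descriptor at the head of the list and return the new head. */
struct mini_task *task_list_add(struct mini_task *head, struct mini_task *t)
{
    t->next = head;
    return t;
}

/* Walk the list and return the descriptor with the given PID, or NULL. */
struct mini_task *task_list_find(struct mini_task *head, pid_t pid)
{
    for (struct mini_task *t = head; t != NULL; t = t->next)
        if (t->pid == pid)
            return t;
    return NULL;
}
```

Looking up a process then amounts to walking the list until the identifier matches, which is why the identifier has to be unique among live processes.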

2.1. Process Status

2.1.1. Process status

volatile long state;
int exit_state;

2.1.2. Possible values of the state member

#define TASK_RUNNING            0
#define TASK_INTERRUPTIBLE      1
#define TASK_UNINTERRUPTIBLE    2
#define __TASK_STOPPED          4
#define __TASK_TRACED           8
/* in tsk->exit_state */
#define EXIT_ZOMBIE
#define EXIT_DEAD
/* in tsk->state again */
#define TASK_DEAD
#define TASK_WAKEKILL
#define TASK_WAKING

2.1.3. Each state of a process

TASK_RUNNING The process is either executing or ready to execute.
TASK_INTERRUPTIBLE The process is blocked (sleeping) while it waits for some condition. As soon as the condition is met, it moves from this state to the ready state.
TASK_UNINTERRUPTIBLE Similar to TASK_INTERRUPTIBLE, except that the process cannot be woken by an arbitrary signal; it wakes only when the resource it is waiting for becomes available.
TASK_STOPPED The process has stopped executing.
TASK_TRACED The process is being monitored by another process, such as a debugger.
EXIT_ZOMBIE The process has terminated, but its parent has not yet called the wait() system call to collect its termination information; the process is then a zombie.
EXIT_DEAD The final state of the process, once it has been completely killed.
TASK_KILLABLE A newer sleep state that works like TASK_UNINTERRUPTIBLE, except that the process can respond to a fatal signal.
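
The difference between the three sleep states comes down to which signals may wake the sleeper. The sketch below illustrates that with bit tests. The numeric value of TASK_WAKEKILL is an assumption modelled on a 2.6.3x-era include/linux/sched.h, and signal_can_wake() is a hypothetical helper written for this example, not a kernel function; the composition of TASK_KILLABLE from TASK_WAKEKILL and TASK_UNINTERRUPTIBLE follows the kernel's own definition.

```c
#include <stdbool.h>

/* Assumed values, modelled on a 2.6.3x-era include/linux/sched.h. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_WAKEKILL         128
#define TASK_KILLABLE         (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

/* Hypothetical helper: may a signal wake a task that is in this state? */
bool signal_can_wake(long state, bool fatal)
{
    if (state & TASK_INTERRUPTIBLE)
        return true;                  /* any signal wakes it */
    if ((state & TASK_WAKEKILL) && fatal)
        return true;                  /* only a fatal signal wakes it */
    return false;                     /* running, or uninterruptible sleep */
}
```

Because the states are bit flags, TASK_KILLABLE needs no new bit of its own: it is an uninterruptible sleep with the "wake on fatal signal" bit added.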

2.1.4. State transition Diagram

2.2. Process identifier (PID)

2.2.1. Identifier definitions

pid_t pid; // identifier of the process

2.2.2. About identifiers

A PID (process ID) is a number that uniquely identifies a process within its PID namespace in Linux.

When a program is run, the system automatically assigns the new process a unique PID. A PID is only temporarily unique: after the process terminates, the system reclaims the number and may assign it to a later process.
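
From user space, the life cycle of a PID can be observed with the standard POSIX calls fork(), getpid() and waitpid(). The following is a small sketch; spawn_and_reap() is a name made up for this example. The child's PID stays reserved while the child is a zombie and becomes reusable only after the parent has reaped it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately, reap it, and return its PID
 * (or -1 on error). Once waitpid() has returned, the kernel is free
 * to hand the same number to a later process.
 */
pid_t spawn_and_reap(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(0);                     /* child: terminate right away */

    int status;
    if (waitpid(child, &status, 0) != child)
        return -1;
    return child;
}
```

A caller would see a positive value that differs from its own getpid(), since parent and child never share a PID while both exist.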

2.3. Process Markers

2.3.1. Marker

unsigned int flags; /* per process flags, defined below */

The flags field reflects state information about the process, which the kernel uses to identify the process's current state.

2.3.2. Possible values of flags

#define PF_EXITING          0x00000004  /* getting shut down */
#define PF_EXITPIDONE       0x00000008  /* pi exit done on shut down */
#define PF_VCPU             0x00000010  /* I'm a virtual CPU */
#define PF_WQ_WORKER        0x00000020  /* I'm a workqueue worker */
#define PF_FORKNOEXEC       0x00000040  /* forked but didn't exec */
#define PF_MCE_PROCESS      0x00000080  /* process policy on mce errors */
#define PF_SUPERPRIV        0x00000100  /* used super-user privileges */
#define PF_DUMPCORE         0x00000200  /* dumped core */
#define PF_SIGNALED         0x00000400  /* killed by a signal */
#define PF_MEMALLOC         0x00000800  /* allocating memory */
#define PF_NPROC_EXCEEDED   0x00001000  /* set_user noticed that RLIMIT_NPROC was exceeded */
#define PF_USED_MATH        0x00002000  /* if unset the fpu must be initialized before use */
#define PF_USED_ASYNC       0x00004000  /* used async_schedule*(), used by module init */
#define PF_NOFREEZE         0x00008000  /* this thread should not be frozen */
#define PF_FROZEN           0x00010000  /* frozen for system suspend */
#define PF_FSTRANS          0x00020000  /* inside a filesystem transaction */
#define PF_KSWAPD           0x00040000  /* I am kswapd */
#define PF_MEMALLOC_NOIO    0x00080000  /* allocating memory without IO involved */
#define PF_LESS_THROTTLE    0x00100000  /* throttle me less: I clean memory */
#define PF_KTHREAD          0x00200000  /* I am a kernel thread */
#define PF_RANDOMIZE        0x00400000  /* randomize virtual address space */
#define PF_SWAPWRITE        0x00800000  /* allowed to write to swap */
#define PF_NO_SETAFFINITY   0x04000000  /* userland is not allowed to meddle with cpus_allowed */
#define PF_MCE_EARLY        0x08000000  /* early kill for mce process policy */
#define PF_MUTEX_TESTER     0x20000000  /* thread belongs to the rt mutex tester */
#define PF_FREEZER_SKIP     0x40000000  /* freezer should not count it as freezable */
#define PF_SUSPEND_TASK     0x80000000  /* this thread called freeze_processes */

A few common flags are listed below.

Flag Description
PF_FORKNOEXEC The process has been forked but has not yet called exec
PF_SUPERPRIV The process has used superuser privileges
PF_SIGNALED The process was killed by a signal
PF_EXITING The process is shutting down
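
Each flag occupies one bit of the flags word, so several can be set at once and each can be tested independently. The sketch below shows the usual bitwise idiom with four of the values listed above; flag_set(), flag_clear() and flag_test() are hypothetical helper names chosen for this example, not kernel functions.

```c
#include <stdbool.h>

/* Values taken from the flag list in section 2.3.2. */
#define PF_EXITING     0x00000004
#define PF_FORKNOEXEC  0x00000040
#define PF_SUPERPRIV   0x00000100
#define PF_SIGNALED    0x00000400

/* Set one bit in a flags word and return the new word. */
unsigned int flag_set(unsigned int flags, unsigned int bit)
{
    return flags | bit;
}

/* Clear one bit in a flags word and return the new word. */
unsigned int flag_clear(unsigned int flags, unsigned int bit)
{
    return flags & ~bit;
}

/* Report whether a bit is set. */
bool flag_test(unsigned int flags, unsigned int bit)
{
    return (flags & bit) != 0;
}
```

For instance, a process that was just forked would carry PF_FORKNOEXEC until it calls exec, at which point that bit is cleared while any other bits stay untouched.
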
2.4. Members representing process kinship

struct task_struct __rcu *real_parent; /* real parent process */
struct task_struct __rcu *parent;      /* recipient of SIGCHLD, wait4() reports */
/*
 * children/sibling forms the list of my natural children
 */
struct list_head children;             /* list of my children */
struct list_head sibling;              /* linkage in my parent's children list */
struct task_struct *group_leader;      /* threadgroup leader */

These members can be understood informally as follows. real_parent is the process's "biological father", whether or not the process has since been "fostered"; parent is the process's current parent, which may be a "stepfather". children is the head of the list of the process's children; all of the children's process descriptors can be reached by walking it with list_for_each and list_entry (list_entry is in fact a direct wrapper around container_of). Likewise, sibling links the process into the list of its siblings, that is, the list of all the children of its parent, and it is used in the same way as children. struct task_struct *group_leader points to the process descriptor of the main thread. You may wonder why a thread is represented by a process descriptor: Linux does not implement a separate structure for threads; it uses a process in place of a thread and gives it some special handling. struct list_head thread_group is the list that links the threads of the process together.
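
The way list_for_each, list_entry and container_of cooperate is easier to see in a small user-space sketch. The macros below are simplified re-creations of the kernel helpers of the same names (the kernel versions add type checking), and struct kin_task, list_init(), add_child() and sum_child_pids() are hypothetical names introduced only for this example.

```c
#include <stddef.h>

/* Minimal user-space copies of the kernel's intrusive list helpers. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

/* Hypothetical cut-down descriptor: only the kinship members. */
struct kin_task {
    int pid;
    struct list_head children;  /* head of this task's child list */
    struct list_head sibling;   /* node in the parent's child list */
};

/* Make a list head point at itself, which means "empty list". */
void list_init(struct list_head *h) { h->next = h->prev = h; }

/* Link child->sibling at the tail of parent->children. */
void add_child(struct kin_task *parent, struct kin_task *child)
{
    struct list_head *node = &child->sibling, *head = &parent->children;
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Sum the PIDs of all children by walking the parent's list. */
int sum_child_pids(struct kin_task *parent)
{
    struct list_head *pos;
    int sum = 0;
    list_for_each(pos, &parent->children)
        sum += list_entry(pos, struct kin_task, sibling)->pid;
    return sum;
}
```

Note that the list stores pointers to the embedded sibling member, not to the descriptor itself; list_entry subtracts the member's offset to get back to the enclosing structure.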

3. Scheduling of processes

3.1. Completely Fair Scheduler (CFS)

CFS (the Completely Fair Scheduler) is the process scheduler used by the Linux kernel since 2.6.23. It takes the idea of complete fairness from RSDL/SD: it no longer tracks how long a process has slept and no longer tries to identify interactive processes. It treats all processes uniformly, and that is what fairness means here. Its rationale is as follows. A scheduling period (sched_latency_ns) is set, with the goal that every process runs at least once within the period; put another way, no process waits for the CPU longer than one scheduling period. The processes then divide the CPU time of the period among themselves according to how many there are. Because process priorities, that is, nice values, differ, the split is weighted. The cumulative run time of each process is kept in its vruntime field, and the process with the smallest vruntime is the one entitled to run. The algorithm and implementation of CFS are quite simple, and many tests show that it also performs very well.

The scheduling policies SCHED_NORMAL and SCHED_BATCH are the ones primarily handled by CFS. These macros are defined in include/linux/sched.h. The file kernel/sched.c contains the implementation of the kernel scheduler and the related system calls. The core scheduling function is schedule() in sched.c, which encapsulates the framework of the kernel scheduler. The details are delegated to functions of a specific scheduling class, implemented in files such as kernel/sched_fair.c or kernel/sched_rt.c.
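
The weighted split of one scheduling period can be written as a one-line calculation. The sketch below assumes that each process has a numeric load weight derived from its nice value and that a process's share is its weight divided by the sum of all weights; cfs_slice_ns() is a hypothetical name for this example, and the figures used in the note that follows (a 20 ms period, weight 1024 for a nice-0 process) are illustrative assumptions rather than values stated in the text.

```c
#include <stdint.h>

/*
 * Share of one scheduling period, in nanoseconds, for a process with the
 * given weight when the weights of all runnable processes sum to
 * total_weight. Integer division rounds down.
 */
uint64_t cfs_slice_ns(uint64_t period_ns, unsigned long weight,
                      unsigned long total_weight)
{
    return period_ns * weight / total_weight;
}
```

With two equally weighted processes and a 20 ms period, each gets 10 ms. If one of them instead had three times the weight of the other, the period would be split 15 ms to 5 ms.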

3.2. Algorithm for process scheduling

In CFS, on every clock tick the timer interrupt code calls scheduler_tick() in sched.c directly, with interrupts disabled. Note that the fork code also causes scheduler_tick() to be called when it modifies the parent process's time. scheduler_tick() first updates the scheduling information and then adjusts the position of the current process in the red-black tree. If the current process is found to be no longer the leftmost leaf, the need_resched flag is set, and schedule() is called on return from the interrupt to complete the process switch; otherwise the current process continues to occupy the CPU. This differs from the earlier scheduler, in which the tick interrupt decremented the time slice, and priority adjustment and rescheduling were triggered only when the time slice was exhausted. The code of scheduler_tick() is as follows:

void scheduler_tick(void)
{
    int cpu = smp_processor_id();
    struct rq *rq = cpu_rq(cpu);
    struct task_struct *curr = rq->curr;

    sched_clock_tick();

    spin_lock(&rq->lock);
    update_rq_clock(rq);
    update_cpu_load(rq);
    curr->sched_class->task_tick(rq, curr, 0);
    spin_unlock(&rq->lock);

    perf_event_task_tick(curr, cpu);

#ifdef CONFIG_SMP
    rq->idle_at_tick = idle_cpu(cpu);
    trigger_load_balance(rq, cpu);
#endif
}

It first obtains the process currently running on the current CPU's run queue, updates the run queue's clock variable, and then, through the task_tick hook in sched_class, invokes the CFS tick handler task_tick_fair() to handle the clock interrupt. The CFS algorithm is implemented in kernel/sched_fair.c.
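
The tick-driven decision described in this section (charge the running process, then ask for a reschedule if it no longer has the smallest vruntime) can be modelled in a few lines. This is a toy model under simplifying assumptions: a plain array stands in for the red-black tree, every process has the same weight, and struct mini_rq, mini_tick() and mini_schedule() are hypothetical names, not kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical miniature run queue: an array stands in for the red-black tree. */
struct mini_rq {
    uint64_t vruntime[8]; /* per-process virtual runtime, in ns */
    int nr;               /* number of runnable processes */
    int curr;             /* index of the running process */
    bool need_resched;    /* set by the tick, consumed by the scheduler */
};

/* Charge one tick to the running process, then request a reschedule
 * if some other process now has a smaller vruntime. */
void mini_tick(struct mini_rq *rq, uint64_t tick_ns)
{
    rq->vruntime[rq->curr] += tick_ns;
    for (int i = 0; i < rq->nr; i++)
        if (rq->vruntime[i] < rq->vruntime[rq->curr])
            rq->need_resched = true;
}

/* Switch to the process with the smallest vruntime and clear the flag. */
void mini_schedule(struct mini_rq *rq)
{
    int best = 0;
    for (int i = 1; i < rq->nr; i++)
        if (rq->vruntime[i] < rq->vruntime[best])
            best = i;
    rq->curr = best;
    rq->need_resched = false;
}
```

The tick handler only sets a flag; the actual switch happens later in the scheduler. The linear scans here cost O(n), which is the work the kernel avoids by keeping the entities sorted in a tree and reading the leftmost node.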

The CFS scheduling class is defined as follows:

static const struct sched_class fair_sched_class = {
    .next               = &idle_sched_class,
    .enqueue_task       = enqueue_task_fair,
    .dequeue_task       = dequeue_task_fair,
    .yield_task         = yield_task_fair,

    .check_preempt_curr = check_preempt_wakeup,

    .pick_next_task     = pick_next_task_fair,
    .put_prev_task      = put_prev_task_fair,

#ifdef CONFIG_SMP
    .select_task_rq     = select_task_rq_fair,

    .load_balance       = load_balance_fair,
    .move_one_task      = move_one_task_fair,
    .rq_online          = rq_online_fair,
    .rq_offline         = rq_offline_fair,

    .task_waking        = task_waking_fair,
#endif

    .set_curr_task      = set_curr_task_fair,
    .task_tick          = task_tick_fair,
    .task_fork          = task_fork_fair,

    .prio_changed       = prio_changed_fair,
    .switched_to        = switched_to_fair,

    .get_rr_interval    = get_rr_interval_fair,

#ifdef CONFIG_FAIR_GROUP_SCHED
    .task_move_group    = task_move_group_fair,
#endif
};
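
A scheduling class is, in effect, a table of function pointers, which is what lets the core code call curr->sched_class->task_tick(...) without knowing which policy is in charge. The following user-space sketch reduces the idea to a single hook. All names in it (struct demo_class, struct demo_task, the *_demo functions and tick()) are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

struct demo_task;

/* Hypothetical one-hook version of a scheduling class. */
struct demo_class {
    const struct demo_class *next;            /* next lower-priority class */
    void (*task_tick)(struct demo_task *p);   /* called on every tick */
};

struct demo_task {
    const struct demo_class *sched_class;     /* policy in charge of this task */
    unsigned long ticks_fair;                 /* ticks handled by the fair class */
    unsigned long ticks_idle;                 /* ticks handled by the idle class */
};

void task_tick_fair_demo(struct demo_task *p) { p->ticks_fair++; }
void task_tick_idle_demo(struct demo_task *p) { p->ticks_idle++; }

const struct demo_class idle_class_demo = {
    .next      = NULL,
    .task_tick = task_tick_idle_demo,
};

const struct demo_class fair_class_demo = {
    .next      = &idle_class_demo,
    .task_tick = task_tick_fair_demo,
};

/* The core tick code never names a policy; it only follows the pointer. */
void tick(struct demo_task *curr)
{
    curr->sched_class->task_tick(curr);
}
```

Changing a task's policy is then just a matter of pointing sched_class at a different table; the caller of tick() does not change.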

The task_tick_fair() function handles the periodic tick for a process in the CFS scheduling class. It is implemented as follows:

static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
{
    struct cfs_rq *cfs_rq;
    struct sched_entity *se = &curr->se;

    for_each_sched_entity(se) {
        cfs_rq = cfs_rq_of(se);
        entity_tick(cfs_rq, se, queued);
    }
}
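
The loop in task_tick_fair() depends on each scheduling entity being able to reach the entity of its enclosing group. A minimal sketch of such a walk is shown below, assuming a simple parent pointer that is NULL at the top level. struct demo_entity, for_each_demo_entity and tick_all_levels() are hypothetical names for this example; the macro has the same shape as the kernel's for_each_sched_entity when group scheduling is enabled.

```c
#include <stddef.h>

/* Hypothetical cut-down scheduling entity: a task or a task group. */
struct demo_entity {
    struct demo_entity *parent; /* enclosing group, NULL at the top level */
    unsigned int ticks;         /* ticks charged to this level */
};

/* Walk from an entity up through every enclosing group. */
#define for_each_demo_entity(se) for (; (se) != NULL; (se) = (se)->parent)

/* Charge one tick to the entity and to every group above it;
 * return the number of levels visited. */
int tick_all_levels(struct demo_entity *se)
{
    int levels = 0;
    for_each_demo_entity(se) {
        se->ticks++;
        levels++;
    }
    return levels;
}
```

Without group scheduling there is only one level, so the loop body runs exactly once for the task's own entity.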

task_tick_fair() walks the scheduling entities at each level of the hierarchy, obtains the CFS run queue of each entity, and calls entity_tick() to process it. The source code of entity_tick() in kernel/sched_fair.c is as follows:

static void
entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
{
    /*
     * Update run-time statistics of the 'current'.
     */
    update_curr(cfs_rq);

#ifdef CONFIG_SCHED_HRTICK
    /*
     * queued ticks are scheduled to match the slice, so don't bother
     * validating it and just reschedule.
     */
    if (queued) {
        resched_task(rq_of(cfs_rq)->curr);
        return;
    }
    /*
     * don't let the period tick interfere with the hrtick preemption
     */
    if (!sched_feat(DOUBLE_TICK) &&
            hrtimer_active(&rq_of(cfs_rq)->hrtick_timer))
        return;
#endif

    if (cfs_rq->nr_running > 1 || !sched_feat(WAKEUP_PREEMPT))
        check_preempt_tick(cfs_rq, curr);
}

entity_tick() updates the run-time statistics of the current process with update_curr() and then calls check_preempt_tick() (both in kernel/sched_fair.c) to check whether a reschedule is needed, so that the next process can preempt the current one. update_curr() does the bookkeeping and is called periodically from the system timer; its helper __update_curr() is implemented as follows:

static inline void
__update_curr(struct cfs_rq *cfs_rq, struct sched_entity *curr,
              unsigned long delta_exec)
{
    unsigned long delta_exec_weighted;

    schedstat_set(curr->exec_max, max((u64)delta_exec, curr->exec_max));

    curr->sum_exec_runtime += delta_exec;          /* total run time update */
    schedstat_add(cfs_rq, exec_clock, delta_exec); /* update cfs_rq's exec_clock */
    /* use priority and delta_exec to calculate the weighted delta for updating vruntime */
    delta_exec_weighted = calc_delta_fair(delta_exec, curr);
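
The listing stops right after the weighted delta is computed. As a rough illustration of what that weighting means, the sketch below scales real run time by a reference weight divided by the process's own weight and adds the result to vruntime. This is an assumption-laden model, not the kernel's code: NICE_0_LOAD is assumed to be 1024 (the weight of a nice-0 process), the division is plain integer arithmetic rather than the kernel's fixed-point routine, and struct vr_entity, delta_fair() and account_runtime() are hypothetical names.

```c
#include <stdint.h>

#define NICE_0_LOAD 1024UL  /* assumed reference weight of a nice-0 process */

/* Hypothetical cut-down entity holding only the fields touched here. */
struct vr_entity {
    unsigned long weight;        /* load weight derived from the nice value */
    uint64_t sum_exec_runtime;   /* real CPU time consumed, in ns */
    uint64_t vruntime;           /* weighted CPU time, in ns */
};

/* Scale real run time by NICE_0_LOAD / weight: heavier processes age more slowly. */
uint64_t delta_fair(uint64_t delta_exec, unsigned long weight)
{
    return delta_exec * NICE_0_LOAD / weight;
}

/* Account delta_exec nanoseconds of execution to the entity. */
void account_runtime(struct vr_entity *se, uint64_t delta_exec)
{
    se->sum_exec_runtime += delta_exec;
    se->vruntime += delta_fair(delta_exec, se->weight);
}
```

Under this model, a process with twice the reference weight accumulates vruntime at half the rate, so it stays toward the left of the tree for longer and receives proportionally more CPU time.
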
4. View of the operating system process model

The operating system (OS) is not, in essence, the windows, menus, and applications that we normally see. Those are only the clouds in the sky. The operating system is the part hidden behind them that we simply do not see. Broadly, its work comprises process management, memory management, file management, device management, and so on. The core concept of the operating system is the process, which is also one of the most important and basic concepts in concurrent programming. A process is dynamic: it has a life cycle, it owns resources, it is the execution of a program, and its state keeps changing. The scheduler is part of process management.

The first Linux scheduler used an O(n) algorithm. Its disadvantage is that when the kernel has many tasks, the scheduler itself becomes time-consuming, so the well-known O(1) scheduler was introduced starting with Linux 2.5. The O(1) scheduler was in turn replaced by a better one, CFS, the Completely Fair Scheduler. It was also introduced in the 2.6 kernel series, specifically in 2.6.23; from that version on, the kernel uses CFS as its default scheduler, and the O(1) scheduler was dropped. In practice, however, no scheduling algorithm can meet the needs of every application, and CFS has received some negative test reports as well. I believe that new scheduling algorithms will appear as Linux develops; we shall wait and see.
