Process scheduling 1

To schedule a process, the kernel keeps the relevant information in the process descriptor (task_struct):

  1. prio and normal_prio: dynamic priority;
  2. static_prio: static priority, set when the process is created and changeable at run time;
  3. rt_priority: the priority of a real-time process;
  4. sched_class: the scheduling class the process belongs to;
  5. se (sched_entity): the scheduler can dispatch not only individual processes but also process groups, so each process embeds a scheduling entity;
  6. policy: the scheduling policy applied to the process;
  7. cpus_allowed: a bit mask recording which processors the process may run on;
  8. run_list and time_slice: used by the round-robin real-time scheduler to keep the process on a run list and to record the remaining CPU time the process may use;
The scheduler can always find a process to run, because each CPU has a swapper (idle) process with pid 0 that runs when nothing else is runnable.
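
For reference, here is a trimmed sketch of how these fields appear in the process descriptor (based on a kernel of roughly the 3.x era; names and layout vary between versions, so treat it as illustrative rather than authoritative):

struct task_struct {
    /* ... other fields ... */
    int prio, static_prio, normal_prio;     /* dynamic and static priorities */
    unsigned int rt_priority;               /* real-time priority (1..99) */
    const struct sched_class *sched_class;  /* scheduling class */
    struct sched_entity se;                 /* CFS scheduling entity */
    struct sched_rt_entity rt;              /* real-time entity (run_list, time_slice) */
    unsigned int policy;                    /* SCHED_NORMAL, SCHED_FIFO, SCHED_RR, ... */
    cpumask_t cpus_allowed;                 /* processors the task may run on */
    /* ... other fields ... */
};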


Linux provides the following types of scheduling policy:

SCHED_NORMAL: ordinary time-sharing processes (the main focus here is CFS, the completely fair scheduler);

SCHED_RR: real-time processes scheduled by time-slice round-robin;

SCHED_FIFO: first-in, first-out real-time processes;
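
As a quick illustration (not part of the original article), a user-space program selects one of these policies through the sched_setscheduler() system call; the minimal sketch below switches the calling process to SCHED_RR with real-time priority 10 (this requires root or CAP_SYS_NICE):

#include <sched.h>
#include <stdio.h>

int main(void)
{
    struct sched_param param = { .sched_priority = 10 };

    /* 0 means "the calling process". */
    if (sched_setscheduler(0, SCHED_RR, &param) == -1) {
        perror("sched_setscheduler");
        return 1;
    }
    printf("now running under SCHED_RR\n");
    return 0;
}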


Scheduling of normal processes:

A new process always inherits the static priority of its parent;

The nice() system call can change a process's priority. The kernel uses the numbers 100 to 139 to represent the static priority of an ordinary process; the smaller the number, the higher the priority.

Each process has a time slice. An ordinary process with a higher static priority obtains a larger CPU time slice. To keep the remaining processes from starving, a high-priority process that has used up its time slice is replaced by processes whose time slices have not yet run out.
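
To make the 100..139 range concrete, the static priority of an ordinary task is derived from its nice value (-20..19); the macros below show the mapping as it appears in mainline kernels of that era (quoted from memory, so treat the exact spellings as an approximation):

#define MAX_RT_PRIO          100                          /* priorities 0..99 are real-time */
#define NICE_TO_PRIO(nice)   (MAX_RT_PRIO + (nice) + 20)  /* nice -20..19 -> 100..139 */
#define PRIO_TO_NICE(prio)   ((prio) - MAX_RT_PRIO - 20)

/* nice -20 -> static_prio 100 (highest ordinary priority)
 * nice   0 -> static_prio 120 (default)
 * nice +19 -> static_prio 139 (lowest ordinary priority) */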

Active processes:

Processes whose time slices are not yet exhausted; they are allowed to continue running.

Expired processes:

Processes whose time slices have run out; they may not run again until all active processes have expired as well.
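
The active/expired split comes from the older O(1) scheduler, whose per-CPU runqueue holds two priority arrays (unlike the CFS-era struct rq shown later in this article). A minimal sketch of the idea, using a hypothetical helper name swap_arrays_if_drained() and simplified fields rather than the literal kernel code:

/* Illustrative only: when every task in the active array has used up its
 * time slice, swap the arrays so the expired tasks, whose time slices have
 * been refreshed, become runnable again. */
static void swap_arrays_if_drained(struct rq *rq)
{
    struct prio_array *array = rq->active;

    if (array->nr_active == 0) {
        rq->active  = rq->expired;
        rq->expired = array;
    }
}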

Scheduling of real-time processes:

Each real-time process has a real-time priority in the range 1 to 99. The scheduler always runs the highest-priority process first, and real-time processes are always treated as active processes.

Situations in which a real-time process is replaced:

The process is preempted by a higher-priority process;

The process blocks and goes to sleep;

The process stops or is killed;

The process gives up the CPU voluntarily via a system call (e.g. sched_yield()).

The main scheduling data structures involved are: the scheduling entity (sched_entity), the scheduling class (sched_class) and the run queue (rq).

Scheduling queue:

The runqueue is a per-CPU variable; each CPU has its own struct rq;
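
For orientation, the runqueues are declared as a per-CPU variable and reached through small helper macros, roughly as below (reproduced from memory of the kernel sources of that era, so consider it a hedged sketch):

DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);

#define cpu_rq(cpu)   (&per_cpu(runqueues, (cpu)))   /* runqueue of a given CPU */
#define this_rq()     (&__get_cpu_var(runqueues))    /* runqueue of the current CPU */
#define task_rq(p)    cpu_rq(task_cpu(p))            /* runqueue a task is currently on */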

The struct rq definition below is copied from the kernel source, together with the author's annotations:

/*
 * This is the main, per-CPU runqueue data structure.
 *
 * Locking rule: those places that want to lock multiple runqueues
 * (such as the load balancing or the thread migration code) must
 * order lock acquisitions by ascending &runqueue address.
 */
struct rq {
    /* runqueue lock: */
    raw_spinlock_t lock;

    /*
     * nr_running and cpu_load should be in the same cacheline because
     * remote CPUs use both these fields when doing load calculation.
     */
    unsigned int nr_running;        /* number of runnable tasks on this CPU's runqueue */

#define CPU_LOAD_IDX_MAX 5
    /*
     * cpu_load[] represents the load of this processor. It is updated by
     * update_cpu_load_active() on every scheduler tick and is initialized
     * to 0 by sched_init(). The update rule (see update_cpu_load()) is
     * roughly:
     *   cpu_load[0] = rq->load.weight
     *   cpu_load[1] = (cpu_load[1] * (2 - 1)  + cpu_load[0]) / 2
     *   cpu_load[2] = (cpu_load[2] * (4 - 1)  + cpu_load[0]) / 4
     *   cpu_load[3] = (cpu_load[3] * (8 - 1)  + cpu_load[0]) / 8
     *   cpu_load[4] = (cpu_load[4] * (16 - 1) + cpu_load[0]) / 16
     * this_cpu_load() returns cpu_load[0]. During CPU balancing or task
     * migration, source_load()/target_load() read an index of cpu_load[]
     * for their calculations. (The original author notes some remaining
     * doubt about these details.)
     */
    unsigned long cpu_load[CPU_LOAD_IDX_MAX];
    unsigned long last_load_update_tick;
#ifdef CONFIG_NO_HZ
    u64 nohz_stamp;
    unsigned long nohz_flags;
#endif
    int skip_clock_update;

    /*
     * Capture load from *all* tasks on this CPU: load.weight is the sum of
     * the load->weight of all scheduling entities currently on this
     * runqueue; the larger it is, the heavier the load this processor
     * carries.
     */
    struct load_weight load;

    /* incremented on each update_cpu_load() call; counts CPU load updates */
    unsigned long nr_load_updates;

    /* cumulative number of context switches performed in schedule() on this CPU */
    u64 nr_switches;

    struct cfs_rq cfs;      /* CFS (completely fair) runqueue */
    struct rt_rq rt;        /* real-time runqueue */

#ifdef CONFIG_FAIR_GROUP_SCHED
    /* list of leaf cfs_rq on this CPU: */
    struct list_head leaf_cfs_rq_list;
#ifdef CONFIG_SMP
    unsigned long h_load_throttle;
#endif /* CONFIG_SMP */
#endif /* CONFIG_FAIR_GROUP_SCHED */
#ifdef CONFIG_RT_GROUP_SCHED
    struct list_head leaf_rt_rq_list;
#endif

    /*
     * This is part of a global counter where only the total sum over all
     * CPUs matters. A task can increase this counter on one CPU and, if it
     * is migrated afterwards, decrease it on another CPU. Always updated
     * under the runqueue lock.
     *
     * A Linux task can be in TASK_RUNNING, TASK_INTERRUPTIBLE (sleeping),
     * TASK_UNINTERRUPTIBLE (deactivated and removed from the runqueue) or
     * TASK_STOPPED. This field counts how many tasks of this runqueue are
     * currently in TASK_UNINTERRUPTIBLE. When activate_task() is called,
     * nr_uninterruptible is decremented and the task is put back on the
     * runqueue through its scheduling class's enqueue_task(), incrementing
     * nr_running.
     */
    unsigned long nr_uninterruptible;

    /*
     * curr: the task currently executing on this processor;
     * idle: the idle task (idle-task scheduling class);
     * stop: the highest-priority task (stop-task scheduling class).
     */
    struct task_struct *curr, *idle, *stop;

    /* jiffies value of this processor's next load-balancing point */
    unsigned long next_balance;

    /*
     * Memory-management structure of the previous task across a context
     * switch; finish_task_switch() releases it via mmdrop().
     */
    struct mm_struct *prev_mm;

    /*
     * Current runqueue clock, essentially the value returned by
     * sched_clock_cpu() and refreshed by update_rq_clock() on every
     * scheduler_tick(). Internally sched_clock_cpu() obtains the per-CPU
     * sched_clock_data via sched_clock_local() or sched_clock_remote(),
     * and that data is itself updated by sched_clock_tick() on every
     * scheduler tick.
     */
    u64 clock;
    u64 clock_task;

    /*
     * Number of tasks on this runqueue sleeping while waiting for I/O.
     * For example, a driver waiting for an I/O reply can call
     * io_schedule(), which increments nr_iowait, sets the task's iowait
     * flag to 1 and then triggers scheduling so other tasks can use the
     * processor.
     */
    atomic_t nr_iowait;

#ifdef CONFIG_SMP
    /*
     * root_domain is a mechanism for multi-core machines. The runqueue
     * records the root domain currently in use, which holds CPU masks
     * (span, online, rto/overload), a reference count and the cpupri
     * structure. refcount is incremented when a runqueue references the
     * root domain and decremented when it drops it. span is the CPU mask
     * the runqueue may run on; online is the CPU mask of processors
     * currently able to run real-time tasks. See pull_rt_task(): when a
     * real-time task finishes on a runqueue, pull_rt_task() looks at the
     * processors in rto_mask to find one with more than one runnable
     * real-time task and, if found, pulls a task over to the processor
     * that just finished its real-time task.
     *
     * cpupri differs from the task's own priority: a task has 140
     * priorities (0-99 real-time, 100-139 for nice -20..19), while the CPU
     * priority has 102 levels (-1 invalid, 0 idle, 1 normal, 2-101 mapping
     * to real-time priorities 99..0). See convert_prio(): a task priority
     * of 140 maps to CPU idle, a priority >= 100 maps to CPU normal, and
     * priorities 0-99 map to CPU priorities 101-2.
     *
     * In operation, cpupri_find() uses cpupri's pri_to_cpu[] to select,
     * for a real-time task about to be inserted, a processor currently
     * running a real-time task of lower priority than the one being
     * inserted, returning the available processors through a CPU mask
     * (lowest_mask). Most of this lives in kernel/sched_cpupri.c. During
     * initialization, sched_init() calls init_defrootdomain() to set up
     * the default root domain and the CPU priority machinery.
     */
    struct root_domain *rd;

    /*
     * sched_domain models the topology of multi-core machines. Each
     * processor has a base scheduling domain, and domains form a
     * hierarchy: parent points to the enclosing domain and child to the
     * contained one (NULL terminates the hierarchy). The span field gives
     * the processor range covered by the domain; the base domain usually
     * covers all processors in the system, and a child domain never
     * covers more processors than its parent. Load balancing within a
     * scheduling domain is bounded by the processors the domain covers.
     *
     * Each scheduling domain contains one or more CPU groups
     * (struct sched_group) linked through next into a circular list; each
     * group has a cpumask defining the processors it covers, and those
     * processors must lie within the domain's own span. When a scheduling
     * domain balances, it works at CPU-group granularity, comparing
     * cpu_power (the task load carried by the group's processors) between
     * groups and moving tasks to even out the load.
     *
     * On SMP, sched_init() calls open_softirq() to register SCHED_SOFTIRQ
     * with run_rebalance_domains() as its callback. On each
     * scheduler_tick(), trigger_load_balance() checks whether jiffies has
     * passed the runqueue's next_balance and, if so, raises SCHED_SOFTIRQ
     * via raise_softirq(). When the softirq runs, run_rebalance_domains()
     * calls rebalance_domains() to load-balance the processor's scheduling
     * domains. See also Documentation/scheduler/sched-domains.txt.
     */
    struct sched_domain *sd;

    unsigned long cpu_power;

    unsigned char idle_balance;
    /* for active balancing */
    /*
     * If post_schedule is non-zero, a completion function is called after
     * scheduling finishes (the inline post_schedule() in kernel/sched.c);
     * currently only the real-time scheduling class uses this mechanism
     * (see has_pushable_tasks() in kernel/sched_rt.c).
     */
    int post_schedule;
    int active_balance;
    int push_cpu;
    struct cpu_stop_work active_balance_work;
    /* cpu of this runqueue: */
    int cpu;
    int online;

    struct list_head cfs_tasks;

    u64 rt_avg;
    u64 age_stamp;
    u64 idle_stamp;
    u64 avg_idle;
#endif

#ifdef CONFIG_IRQ_TIME_ACCOUNTING
    u64 prev_irq_time;
#endif
#ifdef CONFIG_PARAVIRT
    u64 prev_steal_time;
#endif
#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
    u64 prev_steal_time_rq;
#endif

    /* calc_load related fields */
    unsigned long calc_load_update;
    long calc_load_active;

#ifdef CONFIG_SCHED_HRTICK
#ifdef CONFIG_SMP
    /*
     * init_rq_hrtick() initializes the runqueue's high-resolution tick and
     * sets hrtick_csd_pending to 0. In hrtick_start(), if the runqueue is
     * the one used by the current processor, the hrtimer is restarted
     * directly; otherwise, if hrtick_csd_pending is 0,
     * __smp_call_function_single() makes the runqueue's own processor run
     * rq->hrtick_csd.func, i.e. __hrtick_start(), and hrtick_csd_pending
     * is set to 1 until that completes. In other words, hrtick_csd_pending
     * protects the SMP case where processor A asks processor B to run
     * __hrtick_start(). For how one processor makes another execute a
     * function, see the smp_call_function_* helpers in kernel/smp.c.
     */
    int hrtick_csd_pending;
    struct call_single_data hrtick_csd;
#endif
    struct hrtimer hrtick_timer;
#endif

#ifdef CONFIG_SCHEDSTATS
    /* latency stats */
    struct sched_info rq_sched_info;
    unsigned long long rq_cpu_time;
    /* could the above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime? */

    /* sys_sched_yield() stats */
    unsigned int yld_count;

    /* schedule() stats */
    unsigned int sched_count;
    unsigned int sched_goidle;

    /* try_to_wake_up() stats */
    unsigned int ttwu_count;
    unsigned int ttwu_local;
#endif

#ifdef CONFIG_SMP
    struct llist_head wake_list;
#endif
};
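
The per-index cpu_load[] update quoted in the comments above follows one pattern: cpu_load[i] = (cpu_load[i] * (2^i - 1) + cpu_load[0]) / 2^i, so larger indexes react more slowly to changes in the instantaneous load. A minimal sketch of that rule (update_cpu_load_sketch is a hypothetical name; the real __update_cpu_load() also handles rounding and tickless idle):

/* Decay the load-tracking array toward the instantaneous load (illustrative). */
static void update_cpu_load_sketch(unsigned long cpu_load[], unsigned long this_load)
{
    int i;

    cpu_load[0] = this_load;                 /* rq->load.weight */
    for (i = 1; i < 5; i++) {
        unsigned long scale = 1UL << i;      /* 2, 4, 8, 16 */

        cpu_load[i] = (cpu_load[i] * (scale - 1) + this_load) / scale;
    }
}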
Scheduling class: sched_class

struct sched_class {
    const struct sched_class *next;     /* points to the next scheduling class */

    /*
     * When a task becomes runnable, enqueue_task() is called back to put
     * the task into the runqueue (for CFS, into the red-black tree), and
     * inc_nr_running() increments the runqueue's nr_running at the same
     * time.
     */
    void (*enqueue_task) (struct rq *rq, struct task_struct *p, int flags);
    void (*dequeue_task) (struct rq *rq, struct task_struct *p, int flags);
    void (*yield_task) (struct rq *rq);
    bool (*yield_to_task) (struct rq *rq, struct task_struct *p, bool preempt);

    void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags);

    struct task_struct * (*pick_next_task) (struct rq *rq);
    void (*put_prev_task) (struct rq *rq, struct task_struct *p);

#ifdef CONFIG_SMP
    int  (*select_task_rq) (struct task_struct *p, int sd_flag, int flags);

    void (*pre_schedule) (struct rq *this_rq, struct task_struct *task);
    void (*post_schedule) (struct rq *this_rq);
    void (*task_waking) (struct task_struct *task);
    void (*task_woken) (struct rq *this_rq, struct task_struct *task);

    void (*set_cpus_allowed) (struct task_struct *p, const struct cpumask *newmask);

    void (*rq_online) (struct rq *rq);
    void (*rq_offline) (struct rq *rq);
#endif

    void (*set_curr_task) (struct rq *rq);
    void (*task_tick) (struct rq *rq, struct task_struct *p, int queued);
    void (*task_fork) (struct task_struct *p);

    void (*switched_from) (struct rq *this_rq, struct task_struct *task);
    void (*switched_to) (struct rq *this_rq, struct task_struct *task);
    void (*prio_changed) (struct rq *this_rq, struct task_struct *task, int oldprio);

    unsigned int (*get_rr_interval) (struct rq *rq, struct task_struct *task);

#ifdef CONFIG_FAIR_GROUP_SCHED
    void (*task_move_group) (struct task_struct *p, int on_rq);
#endif
};

struct load_weight {
    unsigned long weight, inv_weight;
};
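
Each scheduler implements this interface; for instance, CFS fills in a sched_class instance roughly like the following (abridged and quoted from memory of kernel/sched_fair.c of that era, so the exact set and order of callbacks may differ by version):

const struct sched_class fair_sched_class = {
    .next               = &idle_sched_class,
    .enqueue_task       = enqueue_task_fair,
    .dequeue_task       = dequeue_task_fair,
    .yield_task         = yield_task_fair,
    .check_preempt_curr = check_preempt_wakeup,
    .pick_next_task     = pick_next_task_fair,
    .put_prev_task      = put_prev_task_fair,
    .set_curr_task      = set_curr_task_fair,
    .task_tick          = task_tick_fair,
    .task_fork          = task_fork_fair,
    /* SMP and group-scheduling callbacks omitted */
};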

Scheduling classes: CFS and real-time scheduling use different run-queue structures. CFS keeps its runnable entities in a red-black tree, while real-time scheduling uses per-priority linked lists.
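
For concreteness, the two queue types look roughly like this (abridged from the kernel sources of that era; treat it as a sketch):

/* CFS: runnable entities are kept in a red-black tree ordered by vruntime. */
struct cfs_rq {
    struct load_weight load;
    unsigned int nr_running;

    u64 min_vruntime;

    struct rb_root tasks_timeline;      /* the red-black tree */
    struct rb_node *rb_leftmost;        /* cached leftmost (next-to-run) entity */
    /* ... */
};

/* Real-time: one linked list per priority plus a bitmap of non-empty lists. */
struct rt_prio_array {
    DECLARE_BITMAP(bitmap, MAX_RT_PRIO + 1);    /* include 1 bit for delimiter */
    struct list_head queue[MAX_RT_PRIO];
};

struct rt_rq {
    struct rt_prio_array active;
    /* ... */
};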

Scheduling entities: the scheduler can operate on entities larger than a single process (for example, process groups), so the following structures are used to represent them:

struct sched_entity {
    struct load_weight  load;       /* for load-balancing */
    struct rb_node      run_node;
    struct list_head    group_node;
    unsigned int        on_rq;

    u64                 exec_start;
    u64                 sum_exec_runtime;
    u64                 vruntime;
    u64                 prev_sum_exec_runtime;

    u64                 nr_migrations;

#ifdef CONFIG_SCHEDSTATS
    struct sched_statistics statistics;
#endif

#ifdef CONFIG_FAIR_GROUP_SCHED
    struct sched_entity *parent;
    /* rq on which this entity is (to be) queued: */
    struct cfs_rq       *cfs_rq;
    /* rq "owned" by this entity/group: */
    struct cfs_rq       *my_q;
#endif
};

struct sched_rt_entity {
    struct list_head    run_list;
    unsigned long       timeout;
    unsigned int        time_slice;

    struct sched_rt_entity *back;
#ifdef CONFIG_RT_GROUP_SCHED
    struct sched_rt_entity *parent;
    /* rq on which this entity is (to be) queued: */
    struct rt_rq        *rt_rq;
    /* rq "owned" by this entity/group: */
    struct rt_rq        *my_q;
#endif
};
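
The key field for CFS is vruntime: each time an entity runs, its vruntime advances by the real execution time scaled by the entity's weight, roughly as sketched below (update_vruntime_sketch is a hypothetical name, simplified from the shape of update_curr()/calc_delta_fair(); NICE_0_LOAD is the weight of a nice-0 task, 1024):

/* Illustrative sketch: advance the virtual runtime of the current entity.
 * Heavier (higher-priority) entities accumulate vruntime more slowly, so
 * the red-black tree, ordered by vruntime, lets them run more often. */
static void update_vruntime_sketch(struct sched_entity *se, u64 delta_exec)
{
    u64 delta = delta_exec;

    if (se->load.weight != NICE_0_LOAD)
        delta = delta_exec * NICE_0_LOAD / se->load.weight;

    se->vruntime += delta;
    se->sum_exec_runtime += delta_exec;
}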



Process scheduling priority:




Load:

The priority of a process is determined not only by its priority value but also by the load weight stored in se.load in task_struct. set_load_weight() computes the load weight from the process type and its static priority.

Load weights

struct Load_weight {
unsigned long weight, inv_weight;
};
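
A hedged sketch of how set_load_weight() derives these values: the kernel's prio_to_weight[] table maps nice -20..19 to weights, giving nice 0 the reference weight 1024 and scaling by roughly 1.25 per nice step, while prio_to_wmult[] holds the precomputed 2^32 / weight used for inv_weight. The helper name set_load_weight_sketch below is hypothetical and the real function also special-cases SCHED_IDLE tasks:

/* Look up the load weight for the task's static priority (illustrative). */
static void set_load_weight_sketch(struct task_struct *p)
{
    int idx = p->static_prio - MAX_RT_PRIO;     /* 0..39, i.e. nice + 20 */

    p->se.load.weight     = prio_to_weight[idx];
    p->se.load.inv_weight = prio_to_wmult[idx]; /* ~ 2^32 / weight */
}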
