Transferred from: http://www.cnblogs.com/zfyouxi/p/4504042.html
This article covers the kernel's process scheduling mechanism. Process scheduling is one of the kernel's most important tasks and is carried out by the scheduler.
Process states
The entities that the kernel scheduler dispatches (KSE, kernel scheduling entities) are processes and threads. The kernel must know the state of every process and thread; for example, it is pointless to give a time slice to a blocked process. From the kernel's point of view, a process is in one of three states:
1. Running: the process is currently executing
2. Waiting: the process is not executing but is runnable and is waiting for a time slice
3. Sleeping, that is, blocked; this covers both interruptible and uninterruptible blocking. A sleeping process waits for an event to occur, and the scheduler cannot select it at the next task switch
A process constantly moves between these states. The possible transitions are:
1. A running process goes to sleep because it has to wait for an event
2. A running process gives up the CPU and enters the waiting state
3. The event a sleeping process was waiting for occurs; the process enters the waiting state and cannot go directly to the running state
4. A waiting process is given the CPU and becomes running
5. A running process finishes and enters the terminated state
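The five transitions above can be written down as a small state machine. This is only an illustrative sketch: the names State, TRANSITIONS, apply and can_reach_directly are made up for this example and are not kernel identifiers.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    WAITING = "waiting"        # runnable, waiting for a time slice
    SLEEPING = "sleeping"      # blocked, waiting for an event
    TERMINATED = "terminated"


# Transition numbers follow the list above: number -> (from, to).
TRANSITIONS = {
    1: (State.RUNNING, State.SLEEPING),
    2: (State.RUNNING, State.WAITING),
    3: (State.SLEEPING, State.WAITING),
    4: (State.WAITING, State.RUNNING),
    5: (State.RUNNING, State.TERMINATED),
}


def apply(state, transition):
    """Return the state reached by taking the numbered transition."""
    src, dst = TRANSITIONS[transition]
    if state is not src:
        raise ValueError(f"transition {transition} is not allowed from {state.name}")
    return dst


def can_reach_directly(src, dst):
    """True if a single transition leads from src to dst."""
    return (src, dst) in TRANSITIONS.values()
```

For example, can_reach_directly(State.SLEEPING, State.RUNNING) is False: a sleeping process has to pass through the waiting state first.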
The kernel keeps all processes in a single process table, whether they are running, waiting, or sleeping. Sleeping processes are specially flagged, so the scheduler knows that they cannot run immediately and will not select them at the next task switch. Sleeping processes are sorted into multiple queues so that they can be woken at the appropriate time. There are two kinds of sleep:
1. TASK_INTERRUPTIBLE, interruptible sleep. The process is waiting for some event. When the kernel signals to the process that the event has occurred, or a signal is delivered to it, its state changes to TASK_RUNNING, which means it is runnable and only needs the scheduler to select it
2. TASK_UNINTERRUPTIBLE, uninterruptible sleep. The process cannot be woken by external signals, that is, it does not respond to them; it can only be woken by the kernel itself.
A zombie process is a process whose resources have been released but which still has an entry in the process table. The usual cause is that a child process has terminated but its parent has not yet called the wait4 system call to acknowledge the child's death. The process has already freed its resources, but it stays a zombie because its death has not been confirmed by the parent.
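The parent's side of this can be sketched in a few lines. The example assumes a Unix-like system where os.fork and os.wait4 are available; the helper name spawn_and_reap is made up for illustration.

```python
import os


def spawn_and_reap(exit_code):
    """Fork a child that exits at once, then reap it with wait4."""
    pid = os.fork()
    if pid == 0:
        # Child: terminate immediately without running any cleanup handlers.
        os._exit(exit_code)
    # Parent: from the moment the child exits until the call below returns,
    # the child is a zombie (resources freed, process table entry kept).
    _, status, _rusage = os.wait4(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None
```

If the parent never made the wait4 call, the child's entry would remain in the process table until the parent itself exited.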
Looking at process state from the point of view of privilege, a process runs either in user mode or in kernel mode. In user mode a process is restricted and can only access its own data. In kernel mode it has unrestricted access to any data.
There are two ways to switch from user mode to kernel mode. One is a system call made from user mode. The other is an interrupt: when an interrupt occurs, the CPU also switches to kernel mode.
In a preemptive scheduling model:
1. Interrupts have the highest priority and can preempt a process whether it is in user mode or in kernel mode
2. A process that is in kernel mode and executing a system call cannot be preempted by other processes, although it can of course be interrupted
3. A process running in user mode can be preempted at any time
Scheduler
The scheduler mainly solves two problems:
1. Scheduling policy: deciding how much execution time to allocate to each process, when to switch to the next process, and which process runs next
2. Context switching: when switching from process A to process B, making sure that process B's execution environment is exactly as it was when B last gave up the CPU, for example the register contents and the data structures of its virtual address space.
The Linux scheduler differs from a traditional time-slice-based scheduler in that it considers the wait time of each process, that is, how long each runnable process has been waiting in the ready queue. The process with the most urgent need for CPU time is selected to run. The processes in the ready queue are organized in a red-black tree to speed up operations, and the process that has waited longest is the leftmost node.
The main components of the scheduler subsystem are as follows:
1. The main scheduler and the periodic scheduler, together called the generic scheduler, decide whether rescheduling is needed. The former handles processes that intend to sleep or give up the CPU for some other reason; the latter runs at a fixed frequency and checks whether a context switch is needed
2. Scheduler classes pick the next process to run. Different scheduler classes implement different scheduling algorithms, such as completely fair scheduling and real-time scheduling, in a modular way
3. After the next process to run has been selected, a context switch is performed; this step is tightly coupled to the CPU
4. Each process belongs to exactly one scheduler class, which manages it; the generic scheduler itself is not involved in managing processes
The scheduling-related members of a process's task_struct structure are as follows:
1. prio, static_prio, and normal_prio hold the priority of the process. static_prio is the static priority, derived from the nice value assigned when the process starts. normal_prio is computed from static_prio and the scheduling policy, and a child process inherits normal_prio when the process forks. prio is the priority the scheduler actually considers; the kernel sometimes boosts the priority of a process temporarily, which changes prio without affecting static_prio and normal_prio
2. sched_class indicates the scheduler class the process belongs to
3. sched_entity represents the scheduling entity for the process; the scheduler can dispatch not only processes but also other scheduling entities such as process groups and threads
4. policy holds the scheduling policy of the process, for example SCHED_NORMAL for normal processes, which are handled by the completely fair scheduler class. Other policies are SCHED_BATCH, SCHED_IDLE, SCHED_RR, SCHED_FIFO, and so on
5. time_slice specifies the remaining time slice the process may use
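The relationship between the three priority fields can be illustrated with a small model. The constants and formulas here are assumptions taken from the 2.6-era kernel sources, not from this article: MAX_RT_PRIO is 100, static_prio is MAX_RT_PRIO + nice + 20, real-time processes get MAX_RT_PRIO - 1 - rt_priority, and a lower number means a higher priority. The Task class and its boost field are hypothetical names for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RT_PRIO = 100  # assumption: kernel priorities 0..99 are real-time


def nice_to_static_prio(nice):
    """Map a nice value (-20..19) to a static priority (100..139)."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    return MAX_RT_PRIO + nice + 20


@dataclass
class Task:
    nice: int = 0
    policy: str = "SCHED_NORMAL"
    rt_priority: int = 0
    boost: Optional[int] = None  # hypothetical: a temporary boost, if any

    @property
    def static_prio(self):
        return nice_to_static_prio(self.nice)

    @property
    def normal_prio(self):
        # Real-time policies ignore nice and use rt_priority instead.
        if self.policy in ("SCHED_FIFO", "SCHED_RR"):
            return MAX_RT_PRIO - 1 - self.rt_priority
        return self.static_prio

    @property
    def prio(self):
        # A boost changes only prio; the other two fields stay as they are.
        if self.boost is not None:
            return min(self.boost, self.normal_prio)
        return self.normal_prio
```

With these assumptions, a process with nice 0 has static_prio 120, and boosting it changes only the value returned by prio.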
Each scheduler class must provide an instance of struct sched_class, which specifies the operations the class supports:
1. enqueue_task adds a process to the ready queue; this happens when a process changes from sleeping to runnable
2. dequeue_task removes a process from the ready queue, for example when a process switches from runnable to not runnable
3. yield_task is used when a process voluntarily gives up the CPU
4. check_preempt_curr lets a newly woken process preempt the current process where appropriate, for example when wake_up_new_task wakes a new process
5. pick_next_task selects the next process to run. Switching between processes additionally requires a low-level context switch
6. task_tick is called by the periodic scheduler each time it is activated
7. task_new notifies the scheduler class that a new process has been created by fork
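The idea of a scheduler class as a table of operations can be sketched with an abstract base class. This is a toy model, not the kernel's struct sched_class: only four of the operations are modelled, the ready queue is a plain deque, and ToyRoundRobin is a made-up class for this example.

```python
from abc import ABC, abstractmethod
from collections import deque


class SchedClass(ABC):
    """Operations a scheduler class offers to the generic scheduler."""

    @abstractmethod
    def enqueue_task(self, rq, task): ...

    @abstractmethod
    def dequeue_task(self, rq, task): ...

    @abstractmethod
    def yield_task(self, rq): ...

    @abstractmethod
    def pick_next_task(self, rq): ...


class ToyRoundRobin(SchedClass):
    """Hypothetical class: the head of the deque is the next task to run."""

    def enqueue_task(self, rq, task):
        rq.append(task)

    def dequeue_task(self, rq, task):
        rq.remove(task)

    def yield_task(self, rq):
        rq.rotate(-1)  # the head gives up the CPU and moves to the tail

    def pick_next_task(self, rq):
        return rq[0] if rq else None
```

The generic scheduler only ever calls these methods; how the queue is ordered is left entirely to the class, which is what makes the design modular.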
User-level programs cannot interact with the scheduler classes directly; they only know the scheduling policy constants. SCHED_NORMAL, SCHED_BATCH, and SCHED_IDLE map to the completely fair scheduler class fair_sched_class, while SCHED_RR and SCHED_FIFO map to the real-time scheduler class rt_sched_class.
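The mapping can be written as a lookup table. The sketch also shows how a user-level program can query its own policy constant, assuming a Linux system where Python exposes os.sched_getscheduler; note that user space calls SCHED_NORMAL by the name SCHED_OTHER. The helper names are made up for this example.

```python
import os

POLICY_TO_CLASS = {
    "SCHED_NORMAL": "fair_sched_class",
    "SCHED_BATCH": "fair_sched_class",
    "SCHED_IDLE": "fair_sched_class",
    "SCHED_RR": "rt_sched_class",
    "SCHED_FIFO": "rt_sched_class",
}


def scheduler_class_for(policy):
    """Return the name of the scheduler class that handles a policy."""
    return POLICY_TO_CLASS[policy]


def current_policy():
    """Return the policy name of the calling process, or None if unknown."""
    if not hasattr(os, "sched_getscheduler"):
        return None  # not available on this platform
    names = {}
    for attr, name in (
        ("SCHED_OTHER", "SCHED_NORMAL"),  # user-space name for SCHED_NORMAL
        ("SCHED_BATCH", "SCHED_BATCH"),
        ("SCHED_IDLE", "SCHED_IDLE"),
        ("SCHED_RR", "SCHED_RR"),
        ("SCHED_FIFO", "SCHED_FIFO"),
    ):
        if hasattr(os, attr):
            names[getattr(os, attr)] = name
    return names.get(os.sched_getscheduler(0))
```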
Each CPU has its own ready queue, and an active process can be on only one ready queue at a time. A process therefore cannot run on several CPUs at the same moment. In a multithreaded program, however, different threads originating from the same process can run on multiple CPUs at the same time. The structure of the ready queue is as follows:
1. nr_running and load describe the current load on the ready queue. The speed of the ready queue's virtual clock is based on this information
2. cfs_rq is the sub-ready queue for the completely fair scheduler class, and rt_rq is the sub-ready queue for the real-time scheduler class
3. curr points to the currently running process
4. clock implements the ready queue's own clock; each call to the periodic scheduler updates its value
The structure of a scheduling entity is as follows. It represents a generic schedulable entity, such as a process, a process group, or a thread.
1. load is the load weight of the entity, which determines its share of the total load on the queue. Computing load weights is a task of the scheduler class, and the weight affects the speed of the virtual clock
2. run_node is a red-black tree node that allows the entity to be placed on the red-black tree
3. on_rq indicates whether the entity is currently on the ready queue
4. exec_start records the time at which the process started running, and sum_exec_runtime is the total CPU time the process has consumed. Each time the process starts running, exec_start is set; each call to the update_curr function computes the difference between the current time and exec_start, adds it to sum_exec_runtime, and sets exec_start to the current time
5. When the process is taken off the CPU, its sum_exec_runtime is saved to prev_sum_exec_runtime. sum_exec_runtime is not reset when the process is preempted; it keeps growing monotonically
6. vruntime records how much time has elapsed on the virtual clock while the process was executing
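The accounting described above can be sketched as follows. This is a simplified model rather than the kernel's update_curr: times are plain integers, and the scaling of vruntime by NICE_0_LOAD / weight with NICE_0_LOAD = 1024 is an assumption taken from the kernel sources, not from this article.

```python
from dataclasses import dataclass

NICE_0_LOAD = 1024  # assumption: load weight of a process with nice 0


@dataclass
class SchedEntity:
    load_weight: int = NICE_0_LOAD
    exec_start: int = 0
    sum_exec_runtime: int = 0
    prev_sum_exec_runtime: int = 0
    vruntime: int = 0


def update_curr(se, now):
    """Charge the time since exec_start to the entity; return that time."""
    delta_exec = now - se.exec_start
    se.sum_exec_runtime += delta_exec
    # Virtual time runs slower for heavier (more important) entities.
    se.vruntime += delta_exec * NICE_0_LOAD // se.load_weight
    se.exec_start = now
    return delta_exec
```

An entity with twice the default weight that runs for 200 time units is charged the full 200 units of real time but only 100 units of virtual time.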
The generic scheduler has two parts: the periodic scheduler and the main scheduler. The periodic scheduler is implemented in the scheduler_tick function. The kernel calls this function automatically at a fixed frequency, and it in turn activates the periodic scheduling method of the scheduler class of the current process.
If the current process needs to be rescheduled, the scheduler class sets the TIF_NEED_RESCHED flag in task_struct, and the kernel fulfills the rescheduling request at the next appropriate moment.
The main scheduler is responsible for handing the CPU from one process to another. It is implemented in the schedule function. When returning from a system call, the kernel checks whether the current process has the reschedule flag TIF_NEED_RESCHED set. If the flag is set, the kernel calls the schedule function to switch processes.
1. The schedule function first determines the current ready queue and saves the task_struct of the currently running process in the prev pointer. It then updates the ready queue's clock and clears the TIF_NEED_RESCHED flag
2. If the current process is in interruptible sleep and has received a signal, its state is set back to runnable; otherwise the process is deactivated and removed from the ready queue
3. Calling the scheduler class's put_prev_task notifies the class that the current process is about to be replaced. pick_next_task is then called to select the next process. A new process is not necessarily selected: for example, if all other processes are asleep and only one process is runnable, that process keeps running. If a different process has been selected, a hardware-level context switch is prepared
4. context_switch performs the hardware-level context switch
5. The reschedule flag is checked again; if the current process has TIF_NEED_RESCHED set, execution jumps back to the need_resched label and the steps above are repeated.
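Steps 1 to 4 can be traced with a toy model. Everything here is made up for illustration: the Task and RunQueue classes, the idle task that runs when nothing else is runnable, and a counter that stands in for context_switch. Step 5, the retry loop, is left out.

```python
from collections import deque


class Task:
    def __init__(self, name):
        self.name = name
        self.state = "runnable"      # or "interruptible" (sleeping)
        self.signal_pending = False
        self.need_resched = False


class RunQueue:
    def __init__(self, first):
        self.curr = first
        self.queue = deque()         # runnable tasks that are not running
        self.idle = Task("idle")     # runs when nothing else is runnable
        self.clock = 0
        self.switches = 0            # stands in for context_switch calls


def schedule(rq):
    prev = rq.curr
    rq.clock += 1                    # step 1: update clock, clear the flag
    prev.need_resched = False
    if prev.state == "interruptible" and prev.signal_pending:
        prev.state = "runnable"      # step 2: a signal cancels the sleep
    if prev is not rq.idle and prev.state == "runnable":
        rq.queue.append(prev)        # step 3: prev stays on the ready queue
    nxt = rq.queue.popleft() if rq.queue else rq.idle
    if nxt is not prev:
        rq.switches += 1             # step 4: switch only if the task changes
    rq.curr = nxt
    return nxt
```

When the current task is the only runnable one, schedule returns it again and the switch counter does not change, which matches the remark in step 3.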
Let us look at what the context switch does.
1. It first calls the prepare_task_switch function, which runs architecture-specific preparation for the switch
2. mm and oldmm are the user-space virtual address contexts of the next process and of the previous process, respectively
3. The switch_mm function switches the user-space virtual address context described by task_struct's mm member, for example by loading the page tables and flushing the TLB. What is switched is mainly the data that the CPU keeps in its caches and in the TLB. Most other data resides in memory and does not need to be switched; it is loaded from memory when needed
4. The switch_to function switches the registers and the kernel stack. A key point about register contents is that when a process enters kernel mode from user mode, the register contents of the user-space program are saved on the kernel stack. The context switch therefore does not need to handle the user-space register contents specially. A process always resumes in kernel mode, and when it returns to user space, the register contents are restored from the kernel stack.
The basic principle of the completely fair scheduler class is to compute a virtual clock value for each process, which measures the CPU time a waiting process would be entitled to. This value is calculated from the real clock and the load weight of the process. All calculations related to the virtual clock are performed in the update_curr function, which is triggered by the periodic scheduler, among other callers.
The leftmost node of the red-black tree in the ready queue is the next node to be selected; the tree is sorted by vruntime. The ready queue also maintains a min_vruntime value that tracks the minimum virtual clock value in the whole tree. When the vruntime of the leftmost node is greater than min_vruntime, min_vruntime is updated to that value.
1. The vruntime of a running process always increases, so its node moves to the right in the red-black tree. The more important the process is, the more slowly its vruntime grows, and so the more slowly it moves to the right
2. min_vruntime increases monotonically. When a process goes to sleep, its vruntime stays constant, so when it wakes up, its position in the red-black tree is further to the left relative to the other processes and it is given priority.
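The selection rule and the monotonic min_vruntime can be sketched with a sorted list standing in for the red-black tree. The class CfsRunQueue and its methods are hypothetical names for this illustration, and the model ignores the currently running process, which the real kernel also takes into account.

```python
import bisect


class CfsRunQueue:
    def __init__(self):
        self._tree = []          # sorted (vruntime, name) pairs
        self.min_vruntime = 0

    def enqueue(self, name, vruntime):
        bisect.insort(self._tree, (vruntime, name))
        self._update_min_vruntime()

    def dequeue(self, name):
        self._tree = [entry for entry in self._tree if entry[1] != name]
        self._update_min_vruntime()

    def leftmost(self):
        """Name of the entity with the smallest vruntime, or None."""
        return self._tree[0][1] if self._tree else None

    def _update_min_vruntime(self):
        # min_vruntime only ever moves forward, never backward.
        if self._tree:
            self.min_vruntime = max(self.min_vruntime, self._tree[0][0])
```

An entity that slept and therefore kept a small vruntime lands at the left edge when it is enqueued again, while min_vruntime stays where it was.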
Process scheduling is based on the ready queue, and the waiting processes in the ready queue are runnable. Sleeping processes, by contrast, are on wait queues, and processes on a wait queue are not selected by the scheduler. When a sleeping process is woken, it moves to the ready queue. There are two kinds of sleep, interruptible and uninterruptible. A process sleeping on a wait queue can be woken in the following ways:
1. A signal arrives; this applies to processes in interruptible sleep
2. The awaited event occurs. For example, a process that reads data from a network card goes to sleep when the card has no data; when data arrives at the NIC, the wake_up function is called to wake the processes waiting for that data
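The two wake-up paths can be modelled with a toy wait queue. The WaitQueue class and its methods are made-up names for this sketch and only mimic the behaviour described above; they are not the kernel's wait queue API.

```python
class WaitQueue:
    def __init__(self):
        self.sleepers = []   # (task, interruptible) pairs

    def sleep(self, task, interruptible):
        """Put a task to sleep on this queue."""
        self.sleepers.append((task, interruptible))

    def signal(self, task, ready_queue):
        """Deliver a signal; only interruptible sleepers wake up."""
        if (task, True) in self.sleepers:
            self.sleepers.remove((task, True))
            ready_queue.append(task)
            return True
        return False

    def wake_up(self, ready_queue):
        """The awaited event occurred: move every sleeper to the ready queue."""
        for task, _ in self.sleepers:
            ready_queue.append(task)
        self.sleepers.clear()
```

A task that sleeps uninterruptibly ignores signal and reaches the ready queue only through wake_up.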