Date | Kernel version | Architecture | Author | GitHub | CSDN
2016-06-14 | Linux-4.6 | X86 & Arm | Gatieme | Linuxdevicedrivers | Linux process management and scheduling
The kernel keeps a unique descriptor for each process in memory and links it to other processes through several data structures.
The scheduler builds on these structures. Its task is to share CPU time among programs, creating the illusion of parallel execution. That task divides into two parts: one concerns the scheduling policy, and the other concerns context switching.
What is a scheduler
Generally speaking, the operating system is the intermediary between applications and the available resources.
Typical resources are memory and physical devices. The CPU can also be regarded as a resource: the scheduler temporarily assigns it to a task for the duration of a time slice. The scheduler makes it possible to run multiple programs seemingly at the same time, so the CPU can be shared among users with different requirements.
The kernel must share CPU time among processes as fairly as possible while still taking different task priorities into account.
An important goal of the scheduler is to allocate CPU time slices efficiently while providing a good user experience. The scheduler also faces conflicting goals, such as minimizing response time for critical real-time tasks while maximizing overall CPU utilization.
The general principle of the scheduler is to give each process in the system the greatest possible fairness in the computing power it is allocated or, seen from another angle, to ensure that no process is treated unfairly.
Scheduling policy
The scheduling algorithm of a traditional UNIX operating system must satisfy several conflicting objectives:
Process response time should be as short as possible
Throughput of background jobs should be as high as possible
Process starvation should be avoided as far as possible
The needs of low-priority and high-priority processes should be reconciled as far as possible
The task of the scheduling policy is to decide when to switch and which process to run on the CPU next.
Scheduling in traditional operating systems is based on time sharing: multiple processes take turns running. CPU time is divided into slices, and each runnable process is given one slice at a time. Of course, only one process can run on a CPU at any given moment.
A process switch can occur when the time slice (quantum) of the running process expires before the process has finished.
Time sharing relies on timer interrupts, so it is transparent to processes: no extra code needs to be inserted into programs to make them share the CPU.
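The time-sharing behavior described above can be illustrated with a small simulation. This is only a sketch, not kernel code: the function name, the process names, and the quantum value are made up for illustration, and a real kernel is driven by timer interrupts rather than a loop.

```python
from collections import deque

def round_robin(burst_times, quantum):
    """Simulate time-sliced execution on a single CPU.

    burst_times maps a (hypothetical) process name to the CPU time it needs.
    Returns the (name, time_run) slices in the order they were executed.
    """
    ready = deque(burst_times.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # The quantum expired before the process finished:
            # it is switched out and goes to the back of the queue.
            ready.append((name, remaining))
    return timeline

print(round_robin({"A": 5, "B": 2, "C": 4}, quantum=2))
```

Each process gets at most one quantum per turn, so a long-running process such as A cannot monopolize the CPU while B and C are waiting.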
Scheduling policies also rank processes by priority. The algorithm that computes a process's current priority is sometimes complex, but the end result is the same: each process is associated with a value (its priority) that indicates how appropriate it is to give that process the CPU.
In Linux, process priority is dynamic. The scheduler keeps track of what processes are doing and periodically adjusts their priorities. Processes that have gone without the CPU for a long time are boosted by dynamically raising their priority. Conversely, processes that have been running on the CPU for a long time are penalized by lowering their priority.
Process starvation
A process starves when its waiting time noticeably affects its progress and responsiveness. When the starvation becomes so severe that the task would no longer be meaningful even if it eventually completed, the process is said to have starved to death.
The main cause of starvation is as follows.
In a dynamic system, the operating system needs an allocation policy for each type of resource. The policy determines the order in which the resource is handed out when multiple processes request it at the same time.
Sometimes the allocation policy is unfair, that is, it cannot guarantee an upper bound on waiting time. In that case some processes may wait for a very long time even though the system is not deadlocked, and they starve in the sense defined above.
For example, suppose several processes need to print files and the system allocates the printer to the shortest file first. A long print job can then be postponed indefinitely as short files keep arriving, so it starves and may eventually starve to death.
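The printer example can be reproduced in a few lines. The sketch below assumes a simplified model that is not part of the original text: one new short file arrives in every round, and the printer then picks the shortest pending file. The function and job names are hypothetical.

```python
import heapq

def shortest_file_first(long_job_pages, short_job_pages, rounds):
    """Simulate a printer that always serves the shortest pending file.

    In every round one short file arrives before the printer chooses.
    Returns (names printed in order, names still waiting).
    """
    pending = [(long_job_pages, "long")]
    printed = []
    for i in range(rounds):
        heapq.heappush(pending, (short_job_pages, f"short-{i}"))
        _pages, name = heapq.heappop(pending)  # shortest file wins
        printed.append(name)
    waiting = [name for _pages, name in pending]
    return printed, waiting

printed, waiting = shortest_file_first(long_job_pages=100, short_job_pages=1, rounds=5)
print(printed)
print(waiting)
```

However many rounds are simulated, the long job is still waiting at the end, because the policy places no upper bound on its waiting time.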
Classification of processes
For scheduling purposes, processes are traditionally classified as I/O-bound or CPU-bound.
Type | Also known as | Description | Example
I/O-bound | I/O-intensive | Uses I/O devices frequently and spends much of its time waiting for I/O operations to complete | Database server, text editor
CPU-bound | Compute-intensive | Spends much of its CPU time on numerical computation | Graphics rendering program
Another taxonomy divides processes into three categories:
Type | Description | Example
Interactive process | These processes interact with the user frequently, so they spend a lot of time waiting for keyboard and mouse input. Once input arrives, the process must be woken up quickly, otherwise the user perceives the system as unresponsive | Shells, text editors, and graphical applications
Batch process | These processes do not need to interact with the user and therefore often run in the background. Because they do not need a fast response, they are often penalized by the scheduler | Compilers, database search engines, and scientific computing
Real-time process | These processes have strict scheduling requirements: they must never be blocked by a lower-priority process, and their response time should be as short as possible | Video and audio applications, robot controllers, and programs that collect data from physical sensors
Note
The two classifications above are largely independent of each other.
For example, a batch process may be I/O-bound (such as a database server) or CPU-bound (such as a graphics rendering program).
Real-time processes and normal processes
In Linux, the scheduling algorithm can identify every real-time process unambiguously, but it has no direct way to tell interactive programs from batch programs (collectively called normal processes). The Linux 2.6 scheduler used a heuristic based on a process's past behavior to decide whether to treat it as interactive or batch. The scheduler naturally favors interactive processes over batch processes.
Linux applies different scheduling policies to different classes of process.
For real-time processes, it uses a FIFO or round-robin policy.
For normal processes, interactive and batch workloads have to be told apart. Traditional Linux schedulers raised the priority of interactive applications so that they would be scheduled sooner. Newer schedulers such as CFS and RSDL are instead built around the idea of complete fairness. This design greatly simplifies the scheduler code and also supports a wide range of scheduling requirements better.
Note that Linux schedules processes and threads in the same way. A process can be viewed as a single thread, but it can also contain multiple threads that share certain resources (code and/or data). Process scheduling therefore also covers thread scheduling.
The scheduling algorithms for Linux processes have evolved considerably, but most of that evolution concerns normal processes. As mentioned above, real-time and normal processes use different scheduling policies, and normal processes additionally required heuristics to distinguish batch from interactive processes.
Scheduling real-time processes is simpler because they need only the fastest possible response, driven purely by priority. Each process is given a priority according to its importance, and at every scheduling decision the scheduler picks the highest-priority process. A lower-priority process can never preempt a higher-priority one, so a FIFO or round-robin policy is sufficient for real-time scheduling.
Scheduling normal processes is harder, because priority alone is not enough: they must share the CPU fairly, or some of them will starve. When that happens the system feels sluggish to the user and responds slowly.
In addition, if a runnable real-time process exists in the system, it is always scheduled before any normal process.
Evolution of the Linux Scheduler
The first scheduler used an O(n) algorithm: it traversed all tasks on every scheduling decision, hence the O(n) complexity. Its drawback is that when there are many tasks, the scheduler itself consumes a lot of time. This led to the well-known O(1) scheduler, introduced in Linux 2.5.
Linux, however, is developed by programmers all over the world, and there is never a best design, only a better one. The O(1) scheduler did not hold the spotlight for long before a better scheduler replaced it: the Completely Fair Scheduler (CFS). CFS was also introduced in the 2.6 series, specifically in 2.6.23; from that version on the kernel uses CFS as its default scheduler and the O(1) scheduler was dropped.
CFS itself grew out of several stages of development. The earliest was the staircase algorithm (SD), which was later refined into RSDL (Rotating Staircase Deadline Scheduler), where the idea of complete fairness first took shape. CFS, the scheduler the kernel finally adopted, took the idea of complete fairness from RSDL/SD: it no longer tracks process sleep time and no longer tries to identify interactive processes. It treats all processes uniformly, and that is what fairness means here. The algorithm and implementation of CFS are fairly simple, and many tests show that it also performs well.
For more information about CFS, see
http://www.ibm.com/developerworks/cn/linux/l-completely-fair-scheduler/index.html?ca=drs-cn-0125
The kernel document sched-design-CFS.txt also describes CFS.
Scheduler | Kernel versions
O(n) initial scheduling algorithm | Linux 0.11 to 2.4
O(1) scheduler | Linux 2.5
CFS scheduler | Linux 2.6 to present
Components of the Linux scheduler
Two schedulers
Scheduling can be activated in two ways:
Directly, for example when a process intends to sleep or gives up the CPU for some other reason
Through a periodic mechanism that runs at a fixed frequency and checks from time to time whether a process switch is necessary
The current Linux scheduler therefore consists of two schedulers: the main scheduler and the periodic scheduler. Together they are called the generic scheduler or core scheduler.
Each scheduler consists of two parts: the scheduling framework (essentially the skeleton of the two scheduler functions) and the scheduler classes.
Scheduler classes implement the different scheduling policies, for example the CFS class and the RT class.
The current kernel supports two scheduler classes, CFS (fair) and RT (real-time), and five scheduling policies; the sched_setscheduler system call changes the policy of a process. The policies are SCHED_NORMAL (the most common policy), SCHED_BATCH (preempts far less often than a regular task would, so tasks run longer and make better use of caches, which suits batch work), SCHED_IDLE (even weaker than nice 19, designed so as to avoid priority inversion), SCHED_RR (round-robin scheduling with time slices; a task whose slice expires goes to the end of its queue), and SCHED_FIFO (no time slices; a task can run for any length of time). The first three policies use the CFS scheduler class, and the last two use the RT scheduler class.
Two scheduler classes
The current kernel supports two scheduler classes: CFS (the fair scheduler) and RT (the real-time scheduler). The sched_setscheduler system call can change the policy of a process.
Five scheduling policies
Policy | Description | Scheduler class
SCHED_NORMAL | Also called SCHED_OTHER. Used for normal processes and implemented by the CFS scheduler. SCHED_BATCH is used for non-interactive, CPU-intensive processes, and SCHED_IDLE is used when the system load is low | CFS
SCHED_BATCH | A variant of SCHED_NORMAL for normal processes. It uses time sharing and allocates CPU time according to dynamic priority (which can be set with the nice() API). Processes of this type have lower priority than the two real-time policies below; in other words, real-time processes are scheduled first whenever any are runnable. To optimize throughput, it preempts less often than regular tasks do, so tasks run longer and make better use of caches, which suits batch work | CFS
SCHED_IDLE | The lowest priority: such processes run only when the system is idle (for example, projects that use spare computing resources to search for extraterrestrial civilizations or to analyze protein structures are typical uses of this policy) | CFS
SCHED_FIFO | First-in, first-out real-time policy. Tasks of the same priority are served in arrival order, and a higher-priority task can preempt a lower-priority one | RT
SCHED_RR | Round-robin real-time policy. It uses time slices: when a task exhausts its slice it is moved to the end of the queue for its priority, which keeps tasks of equal priority fair. As with SCHED_FIFO, a higher-priority task can preempt a lower-priority one. Real-time tasks with different requirements can set their policy with the sched_setscheduler() API as needed | RT
SCHED_DEADLINE | A newly supported real-time policy for workloads that are highly sensitive to latency and completion time, such as bursty computation. It is based on the Earliest Deadline First (EDF) algorithm |
The first three policies use the CFS scheduler class, and the next two use the RT scheduler class.
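The mapping from policy to scheduler class in the table can be written down as a small lookup. This is an illustrative sketch rather than kernel code: the numeric values are the policy constants from the kernel's include/uapi/linux/sched.h, while the dictionary and the helper function are hypothetical names. SCHED_DEADLINE is left out because the table above does not assign it to one of the two classes.

```python
# Policy numbers as defined in include/uapi/linux/sched.h.
SCHED_NORMAL = 0   # also known as SCHED_OTHER
SCHED_FIFO = 1
SCHED_RR = 2
SCHED_BATCH = 3
SCHED_IDLE = 5

# Hypothetical lookup table mirroring the last column of the table above.
POLICY_TO_CLASS = {
    SCHED_NORMAL: "CFS",
    SCHED_BATCH: "CFS",
    SCHED_IDLE: "CFS",
    SCHED_FIFO: "RT",
    SCHED_RR: "RT",
}

def scheduler_class(policy):
    """Return the name of the scheduler class that handles a policy."""
    return POLICY_TO_CLASS[policy]

print(scheduler_class(SCHED_BATCH))
print(scheduler_class(SCHED_RR))
```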
In addition, the scheduling framework and the scheduler classes each manage their own run queue. The scheduling framework knows only about rq (which is not itself a queue of runnable tasks). The CFS scheduler class has its own run queue, cfs_rq, which organizes scheduling entities in a red-black tree; the real-time class has rt_rq, which organizes scheduling entities with a priority bitmap and doubly linked lists.
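The two per-class run queues can be modeled roughly as follows. Several things here are assumptions of the sketch and not taken from the text: a binary heap stands in for the red-black tree, the CFS ordering key is called vruntime (virtual runtime), a lower index in the bitmap means a higher real-time priority, and all class and method names are hypothetical.

```python
import heapq
from collections import deque

class CfsRunQueue:
    """Entities ordered by virtual runtime (heap used in place of a red-black tree)."""
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal vruntimes stay in arrival order

    def enqueue(self, name, vruntime):
        heapq.heappush(self._heap, (vruntime, self._seq, name))
        self._seq += 1

    def pick_next(self):
        # The entity that has received the least CPU time so far runs next.
        return heapq.heappop(self._heap)[2] if self._heap else None

class RtRunQueue:
    """Priority bitmap plus one FIFO list per priority level."""
    def __init__(self, levels=100):
        self.bitmap = 0
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, name, prio):
        self.queues[prio].append(name)
        self.bitmap |= 1 << prio  # mark this level as non-empty

    def pick_next(self):
        if self.bitmap == 0:
            return None
        # Index of the lowest set bit = highest-priority non-empty level.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        name = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return name
```

The bitmap lets the real-time queue find the highest non-empty priority level without scanning every list, while the ordered structure lets the fair queue find the entity with the smallest key quickly.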
Essentially, the generic scheduler (core scheduler) is a dispatcher that interacts with two other components.
Scheduler classes determine which process runs next.
The kernel supports different scheduling policies (completely fair scheduling, real-time scheduling, and scheduling the idle process when there is nothing else to do; the idle process is process 0, also called the swapper process). Scheduling classes allow these policies to be implemented in a modular way, so that the code of one class does not need to interact with the code of another.
When the scheduler is invoked, it queries the scheduler classes to find out which process should run next.
After the process to run has been selected, the low-level task switch must be performed.
This requires close interaction with the CPU. Each process belongs to exactly one scheduling class, and each scheduling class is responsible for managing its processes. The generic scheduler itself is not involved in managing processes; that work is delegated entirely to the scheduler classes.
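The delegation from the generic scheduler to the scheduler classes can be sketched as a loop over the classes in priority order: real-time first, then fair, then idle. The class and function names below are hypothetical and only loosely modeled on the kernel; a simple list stands in for each class's run queue.

```python
class SchedClass:
    """A scheduler class that manages its own runnable tasks."""
    def __init__(self, name):
        self.name = name
        self.runnable = []

    def enqueue(self, task):
        self.runnable.append(task)

    def pick_next_task(self):
        return self.runnable.pop(0) if self.runnable else None

class IdleClass(SchedClass):
    """The idle class always has something to run: the idle (swapper) task."""
    def pick_next_task(self):
        return "swapper"

def pick_next_task(classes):
    """Generic scheduler: ask each class in priority order and take the first answer."""
    for cls in classes:
        task = cls.pick_next_task()
        if task is not None:
            return cls.name, task

rt, fair, idle = SchedClass("RT"), SchedClass("CFS"), IdleClass("idle")
classes = [rt, fair, idle]  # highest-priority class first

fair.enqueue("compiler")
rt.enqueue("audio")
print(pick_next_task(classes))  # the real-time task wins
print(pick_next_task(classes))  # then the normal task
print(pick_next_task(classes))  # nothing left, so the idle task runs
```

The generic function never inspects a task itself; it only knows the order of the classes, which is the modularity the text describes.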
Scheduling of processes
First, we need to know which processes the scheduler chooses from. Only processes in the TASK_RUNNING state are considered; processes in any other state are not eligible for scheduling.
Scheduling can occur at the following points:
When cond_resched() is called
When schedule() is called explicitly
When returning to user space from a system call or an exception
When returning to user space from interrupt context
When kernel preemption is enabled (the default), there are a few additional scheduling points:
When preempt_enable() is called in the context of a system call or an exception (if preempt_enable() calls are nested, scheduling happens only on the last one)
In interrupt context, when the interrupt handler returns to a preemptible context (this refers to the bottom half of interrupt handling. The top half runs with interrupts disabled, and a new interrupt that arrives meanwhile is only recorded; because the top half finishes very quickly, the new interrupt is handled once the top half completes, which is how interrupts become reentrant)
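The remark that nested preempt_enable() calls reschedule only on the last call can be modeled with a nesting counter. This is a simplified sketch, not the kernel's implementation: the class, its fields, and the counter of reschedules are hypothetical, and the call to the scheduler is replaced by incrementing a number.

```python
class PreemptState:
    """Model of a preemption counter with a pending-reschedule flag."""
    def __init__(self):
        self.count = 0             # nesting depth of preempt_disable()
        self.need_resched = False  # set when a reschedule has been requested
        self.reschedules = 0       # how many times the scheduler would have run

    def preempt_disable(self):
        self.count += 1

    def preempt_enable(self):
        self.count -= 1
        # Only the outermost enable may reschedule, and only if one is pending.
        if self.count == 0 and self.need_resched:
            self.need_resched = False
            self.reschedules += 1

state = PreemptState()
state.preempt_disable()
state.preempt_disable()      # nested critical section
state.need_resched = True    # a reschedule is requested while preemption is off
state.preempt_enable()       # inner enable: still nested, nothing happens
print(state.reschedules)
state.preempt_enable()       # outermost enable: the pending reschedule runs
print(state.reschedules)
```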
When the scheduler is initialized during boot, a scheduling timer is set up. The timer raises an interrupt at a fixed interval, and the interrupt handler updates the running time of the current process. If the process needs to be rescheduled, the timer interrupt sets a reschedule flag and then returns. As noted above, returning from interrupt context is a scheduling point, so every interrupt-return path in the kernel source checks whether the reschedule flag is set and, if so, calls schedule().
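The two-step mechanism described above, where the timer interrupt only sets a flag and the interrupt-return path acts on it, can be sketched as follows. The class, its method names, and the fixed slice length are assumptions made for illustration; the real kernel decides when to set the flag through the scheduler class of the running task.

```python
class Cpu:
    """Toy model of periodic scheduling on one CPU."""
    def __init__(self, slice_ticks):
        self.slice_ticks = slice_ticks  # hypothetical fixed time slice, in ticks
        self.current_runtime = 0
        self.need_resched = False
        self.switches = 0

    def timer_tick(self):
        """Timer interrupt: account runtime and, if needed, only set the flag."""
        self.current_runtime += 1
        if self.current_runtime >= self.slice_ticks:
            self.need_resched = True

    def interrupt_return(self):
        """Interrupt-return path: check the flag and reschedule if it is set."""
        if self.need_resched:
            self.schedule()

    def schedule(self):
        self.need_resched = False
        self.current_runtime = 0  # the next process starts a fresh slice
        self.switches += 1

def run(cpu, ticks):
    for _ in range(ticks):
        cpu.timer_tick()
        cpu.interrupt_return()
    return cpu.switches

print(run(Cpu(slice_ticks=3), ticks=10))
```

Keeping the interrupt handler limited to setting a flag keeps it short, and the actual switch happens at a well-defined point on the way out of the interrupt.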
Real-time and normal processes coexist, so how does the scheduler coordinate between them? It is simple: at every scheduling decision, the scheduler first checks the real-time run queue for a runnable real-time process. If there is none, it looks in the normal run queue for the next normal process to run. If that queue is empty too, the scheduler runs the idle process.
Later chapters go through the code in detail.
Scheduling is not allowed at all times. While the system is handling a hard interrupt, scheduling is disabled, and it is re-enabled once the hard interrupt completes. Scheduling is not disabled for exceptions, so scheduling may occur in exception context.
Linux Process Scheduler Overview--linux process management and scheduling (15)