"Linux kernel design and implementation" Learning summary CHAP4

Chapter 4: Process Scheduling
    • The scheduler is responsible for deciding which process runs, when it runs, and for how long. The process scheduler can be seen as the kernel subsystem that divides the finite resource of processor time among the runnable processes. Only through sensible scheduling decisions can system resources be used to their fullest and the system give the impression that many processes are executing at once.
    • The principle behind the scheduler is not complicated: to make the best use of processor time, as long as there are runnable processes, some process should always be running. If there are more runnable processes than processors, then at any given moment some processes cannot execute and must wait their turn. Deciding which of the runnable processes runs next is the fundamental decision the scheduler must make.
4.1 Multi-tasking
    • A multitasking operating system is one that can interleave the execution of more than one process. On a single-processor machine this creates the illusion that multiple processes run at the same time; on a multiprocessor machine it allows processes to genuinely run in parallel on different processors. A multitasking system also allows many processes to be blocked or sleeping, that is, not actually executing until they have work to do.
    • Such tasks are in memory but are not runnable. Instead, these processes use the kernel to block themselves until some event occurs (keyboard input, the arrival of network data, the passing of some amount of time, and so on). Consequently, a modern Linux system may have a hundred processes in memory but only one in a runnable state.
    • Multitasking systems come in two flavors: cooperative (non-preemptive) multitasking and preemptive multitasking. Like all UNIX variants and most modern operating systems, Linux provides preemptive multitasking.
4.2 Linux's Process Scheduler
    • From Linux's first version in 1991 through the 2.4 kernel series, the Linux scheduler was fairly rudimentary, almost primitive in design. It was easy to understand, but it scaled poorly with many runnable processes or many processors. Because of this, during the 2.5 kernel development series the scheduler received a major overhaul and a new scheduler, commonly called the O(1) scheduler, was introduced; it was named after its algorithmic behavior (constant-time scheduling decisions).
    • It solved many shortcomings of the previous Linux scheduler and introduced powerful new features and performance gains, mainly thanks to its static time-slice algorithm and its per-processor run queues, which removed the limitations of the earlier design.
    • The O(1) scheduler performed and scaled nearly perfectly on large server workloads with tens (though not hundreds) of processors, but it proved to have inherent weaknesses when scheduling latency-sensitive programs, which we call interactive processes; this class undoubtedly includes any program with which a user interacts. Thus, while the O(1) scheduler was ideal for large server workloads, it performed poorly on desktop systems running many interactive applications because it lacked support for interactive processes. Beginning early in the 2.6 kernel development series, developers introduced new scheduling algorithms to improve interactive performance; the best known of these was the Rotating Staircase Deadline scheduler, which drew on queuing theory and brought the concept of fair scheduling to the Linux scheduler. This idea eventually replaced the O(1) scheduler in kernel version 2.6.23 in the form of what is now called the Completely Fair Scheduler, or simply CFS.
4.3 Policy
4.3.1 I/O-Bound and Processor-Bound Processes
    • Scheduling policy typically seeks to balance two conflicting goals: fast process response (low latency) and maximal system utilization (high throughput). To satisfy these requirements, schedulers often use complex algorithms to determine the most worthwhile process to run, frequently at the cost of not guaranteeing that low-priority processes are treated fairly. UNIX schedulers tend to favor I/O-bound processes in order to provide better application response. Linux, aiming for good interactive application and desktop performance, optimizes for process response (short latency), and is therefore also biased toward I/O-bound processes; as the following sections show, however, the scheduler does not neglect processor-bound processes.
4.3.2 Process Priority
    • The most basic class of scheduling algorithms is priority-based scheduling: the idea of ranking processes based on their worth and their need for processor time. Usually, higher-priority processes run before lower-priority ones, and processes with the same priority are scheduled round-robin (one after the other, repeatedly).
    • On some systems, higher-priority processes also receive longer time slices. The scheduler always runs the runnable process whose time slice is not exhausted and whose priority is highest. Both the user and the system can influence scheduling by setting process priorities.
    • Linux implements two separate priority ranges. The first is the nice value, which ranges from -20 to +19; a larger nice value means a lower priority. The second range is the real-time priority, whose values by default range from 0 to 99; higher real-time priority values correspond to greater priority.
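    • A small user-space example of the two ranges, using standard calls: getpriority() reads the calling process's nice value, and sched_get_priority_min()/sched_get_priority_max() report the real-time priority range for a real-time policy such as SCHED_FIFO.

        /* Print the calling process's nice value and the real-time
         * priority range the system reports for SCHED_FIFO. */
        #include <stdio.h>
        #include <errno.h>
        #include <sched.h>
        #include <sys/resource.h>

        int main(void)
        {
            errno = 0;
            int nice_val = getpriority(PRIO_PROCESS, 0);   /* -20 .. +19 */
            if (nice_val == -1 && errno != 0)
                perror("getpriority");
            printf("current nice value: %d\n", nice_val);

            /* Real-time priorities live in a separate range. */
            printf("SCHED_FIFO priority range: %d .. %d\n",
                   sched_get_priority_min(SCHED_FIFO),
                   sched_get_priority_max(SCHED_FIFO));
            return 0;
        }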
4.3.3 Time Slice
    • The time slice is the numeric value that represents how long a task can run before it is preempted. Scheduling policy must dictate a default time slice, which is not a simple matter. Too long a time slice makes the system feel unresponsive to interaction, because applications no longer appear to run concurrently; too short a time slice markedly increases the processor cost of switching, because a significant share of the system's time is then spent switching between processes that each run for only a very short slice. In addition, the conflict between I/O-bound and processor-bound processes surfaces again here: I/O-bound processes do not need long time slices, while processor-bound processes want them as long as possible (to keep their caches hot, for example).
4.3.4 The Scheduling Policy in Action
4.4 The Linux Scheduling Algorithm
4.4.1 Scheduler Classes
    • The Linux scheduler is modular, which enables different types of processes to use different scheduling algorithms. This modular structure is called scheduler classes.
    • It allows different, pluggable scheduling algorithms to coexist, each scheduling the processes that belong to it.
    • Each scheduler class has a priority. The base scheduler code, defined in kernel/sched.c, iterates over the scheduler classes in order of priority; the highest-priority class that has a runnable process wins, and that class chooses which process runs next. The Completely Fair Scheduler (CFS) is the scheduler class for normal processes, called SCHED_NORMAL in Linux, and its code lives in kernel/sched_fair.c. A conceptual sketch of this class iteration follows.
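    • The sketch below is only a rough illustration of this design (the names are hypothetical, not the kernel's actual structures): scheduler classes are modeled as entries holding a function pointer, they are consulted from highest to lowest priority, and the first class that reports a runnable task decides what runs.

        /* Conceptual sketch only: hypothetical types mimicking the idea of
         * scheduler classes consulted in priority order. */
        #include <stdio.h>
        #include <stddef.h>

        struct task { const char *name; };

        struct sched_class_sketch {
            const char *name;
            /* Returns the class's best runnable task, or NULL if it has none. */
            struct task *(*pick_next)(void);
        };

        static struct task rt_task = { "rt-worker" };
        static struct task *pick_rt(void)   { return &rt_task; } /* pretend one RT task is runnable */
        static struct task *pick_fair(void) { return NULL; }     /* CFS would pick the min-vruntime task */

        int main(void)
        {
            /* Ordered from highest to lowest priority, real-time before fair. */
            struct sched_class_sketch classes[] = {
                { "rt",   pick_rt   },
                { "fair", pick_fair },
            };

            for (size_t i = 0; i < sizeof(classes) / sizeof(classes[0]); i++) {
                struct task *t = classes[i].pick_next();
                if (t) {
                    printf("class %s wins, running %s\n", classes[i].name, t->name);
                    break;
                }
            }
            return 0;
        }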
4.4.2 Process Scheduling in UNIX Systems
    • Most of these problems with traditional UNIX process scheduling can be solved by modifying the old scheduler; the changes are not small, but they are not structural. For example, making nice values scale geometrically rather than arithmetically addresses the second problem, and mapping nice values to time slices with a measurement decoupled from the timer tick addresses the third. These solutions, however, sidestep the real issue: allocating absolute time slices yields a fixed switching rate but variable fairness. The approach taken by CFS is a radical (for a process scheduler) rethinking of time-slice allocation: it does away with time slices completely and instead assigns each process a weighted proportion of the processor. In this way CFS yields constant fairness while letting the switching rate vary.
4.4.3 Fair Scheduling
    • CFS starts from a simple idea: process scheduling should behave as if the system had an ideal, perfectly multitasking processor. In such a system we could schedule processes for infinitesimally small periods, so that over any measurable interval all runnable processes would have received the same amount of processor time. For example, with two runnable processes, the standard UNIX model runs one for 5 ms and then the other for 5 ms, each consuming 100% of the processor while it runs. In the ideal, perfectly multitasking model, we would instead run both processes simultaneously for 5 ms, each using half of the processor's capacity.
4.5 The Linux Scheduling Implementation
4.5.1 Time Accounting
    • All process schedulers must account for the time a process runs. Most UNIX systems do so by assigning each process a time slice; on each tick of the system clock, the time slice is decremented by one tick period.
    • 1. Scheduler Entity Structure
    • 2. The Virtual Runtime
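    • The following is a small user-space illustration of virtual runtime (not kernel code; the task structure and weight values are made up, though they mirror the kernel's idea of scaling actual runtime by a nice-derived load weight): each task's vruntime advances at a rate inversely proportional to its weight, so a low-priority task accumulates virtual runtime faster than a high-priority one given the same real runtime.

        /* Illustrative sketch of virtual runtime accounting; the weights are
         * illustrative, not the kernel's real load-weight table. */
        #include <stdio.h>

        #define NICE_0_WEIGHT 1024  /* reference weight for a nice-0 task */

        struct demo_task {
            const char *name;
            unsigned long weight;    /* larger weight = higher priority */
            unsigned long vruntime;  /* weighted runtime, arbitrary units */
        };

        /* Charge delta_exec units of real runtime to a task's virtual runtime. */
        static void account_vruntime(struct demo_task *t, unsigned long delta_exec)
        {
            t->vruntime += delta_exec * NICE_0_WEIGHT / t->weight;
        }

        int main(void)
        {
            struct demo_task a = { "nice 0",  1024, 0 };
            struct demo_task b = { "nice +5",  335, 0 };  /* lower weight, penalized */

            /* Both tasks run 10 units of real time. */
            account_vruntime(&a, 10);
            account_vruntime(&b, 10);

            printf("%s vruntime = %lu\n", a.name, a.vruntime);  /* 10 */
            printf("%s vruntime = %lu\n", b.name, b.vruntime);  /* ~30 */
            return 0;
        }

    • With equal real runtime, the lower-priority task ends up with roughly three times the vruntime of the nice-0 task, which is exactly why CFS will subsequently prefer the higher-priority task when choosing what to run next.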
4.5.2 Process Selection
    • In the previous section we noted that on a perfect multitasking processor the vruntime of all runnable processes would be identical. No such processor exists, so CFS uses a simple rule to balance processes' virtual runtimes: when CFS needs to pick the next process to run, it picks the process with the smallest vruntime. This is, in fact, the core of the CFS scheduling algorithm: run the task with the minimum vruntime. The rest of this section discusses how that selection is implemented. CFS uses a red-black tree, called an rbtree in Linux, which is a self-balancing binary search tree; a red-black tree stores data in nodes, each associated with a key, and those keys can be used to retrieve a node's data quickly (importantly, the time needed to find a node by its key is proportional to the logarithm of the number of nodes in the tree). A simplified sketch of this selection follows the two subsections below.
    • 1. Picking the Next Task
    • 2. Adding Processes to the Tree
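    • As a stand-in for the red-black tree, the user-space sketch below (hypothetical types, not the kernel's) performs the same selection with a linear scan: the runnable task with the smallest vruntime is the one CFS runs next. In the kernel, runnable tasks are instead kept in an rbtree keyed by vruntime, adding a process means inserting it into that tree, and picking the next task means taking the (cached) leftmost node.

        /* Simplified stand-in for CFS process selection: scan an array for the
         * task with the minimum vruntime instead of walking a red-black tree. */
        #include <stdio.h>
        #include <stddef.h>

        struct demo_task {
            const char *name;
            unsigned long vruntime;
        };

        static struct demo_task *pick_next_task(struct demo_task *tasks, size_t n)
        {
            struct demo_task *leftmost = NULL;

            for (size_t i = 0; i < n; i++)
                if (!leftmost || tasks[i].vruntime < leftmost->vruntime)
                    leftmost = &tasks[i];
            return leftmost;  /* the task that has received the least weighted CPU time */
        }

        int main(void)
        {
            struct demo_task runnable[] = {
                { "editor",   420 },
                { "compiler", 980 },
                { "shell",    150 },
            };

            struct demo_task *next = pick_next_task(runnable, 3);
            printf("next to run: %s (vruntime %lu)\n", next->name, next->vruntime);
            return 0;
        }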
4.5.3 The Scheduler Entry Point

The primary entry point for process scheduling is schedule(), defined in kernel/sched.c.

4.5.4 Sleeping and Waking Up
    • A sleeping (blocked) process is in a special non-runnable state. This matters: without this special state, the scheduler might select a process that does not want to run, or, worse, sleeping would have to be implemented as busy polling. A process sleeps for a variety of reasons, but always while waiting for some event: a period of time to elapse, more data to arrive from a file read, some hardware event, and so on. A process can also be forced to sleep when it tries to acquire a kernel semaphore that is already held. A common cause of sleeping is file I/O, for example a process issuing a read() on a file whose contents must be fetched from disk, or a process waiting for keyboard input. In every case, the kernel behaves the same way: the process marks itself as sleeping, removes itself from the red-black tree of runnable processes, puts itself on a wait queue, and then calls schedule() to select and run a different process. Waking up is the reverse: the process is set to the runnable state, removed from the wait queue, and put back into the red-black tree.
4.6 Preemption and Context Switching
4.6.1 User Preemption
    • When the kernel is about to return to user space, it knows it is in a safe state: since it can continue executing the current process, it can just as well pick a new process to execute. Therefore, whenever the kernel is returning to user space, whether from an interrupt handler or a system call, it checks the need_resched flag; if the flag is set, the kernel selects a different (more suitable) process to run. The return paths from interrupt handlers and system calls are architecture-specific and are implemented in assembly in entry.S (this file contains not only the kernel entry code but the kernel exit code as well). In short, user preemption occurs:
    • When returning to user space from a system call.
    • When returning to user space from an interrupt handler.
4.6.2 Kernel Preemption
    • Unlike most other UNIX variants and most other operating systems, Linux fully supports kernel preemption. In kernels without kernel preemption, kernel code runs until it completes; that is, the scheduler cannot reschedule a task while it is executing in the kernel: kernel code is scheduled cooperatively, not preemptively, and keeps running until it finishes (returns to user space) or explicitly blocks. With the 2.6 kernel, the kernel became preemptive: it is now possible to preempt a task at any point, so long as rescheduling is safe.
4.7 Real-Time Scheduling Policies
    • Linux's real-time scheduling algorithms provide soft real-time behavior. Soft real-time means that the kernel tries to schedule processes within their deadlines but does not promise always to achieve this. Hard real-time systems, in contrast, are guaranteed to meet any scheduling requirement within certain limits. Linux makes no guarantees about the scheduling of real-time tasks; despite this, the performance of its real-time scheduling is quite good, and the 2.6 kernel can satisfy fairly strict timing requirements.
4.8 Scheduler-Related System Calls
4.8.1 Scheduling Policy and Priority System Calls
    • sched_setscheduler() and sched_getscheduler() set and get a process's scheduling policy and real-time priority, respectively. Like other system calls, their implementation consists largely of argument checking, initialization, and cleanup; the important work is reading or writing the policy and rt_priority fields of the process's task_struct.
    • sched_setparam() and sched_getparam() set and get a process's real-time priority, respectively. These calls merely encode rt_priority in a special sched_param structure. The maximum priority of a real-time scheduling policy is MAX_USER_RT_PRIO minus one; the minimum is one.
    • For normal processes, the nice() function increments a given process's static priority by the given amount. Only a superuser can supply a negative value and thereby raise a process's priority. The nice() function calls the kernel's set_user_nice() function, which sets the static_prio value in the process's task_struct.
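    • A minimal user-space example of these calls (switching to a real-time policy normally requires root privileges, so failures are simply reported rather than treated as fatal):

        /* Request SCHED_FIFO, read the real-time priority back, then adjust
         * the nice value as a normal process would. */
        #include <stdio.h>
        #include <errno.h>
        #include <sched.h>
        #include <unistd.h>

        int main(void)
        {
            struct sched_param sp = { .sched_priority = 10 };

            /* Ask for the SCHED_FIFO real-time policy with priority 10. */
            if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
                perror("sched_setscheduler (needs privilege)");
            else
                printf("policy is now %d\n", sched_getscheduler(0));

            /* Read the real-time priority back through a sched_param structure. */
            if (sched_getparam(0, &sp) == 0)
                printf("rt priority: %d\n", sp.sched_priority);

            /* For a normal (SCHED_NORMAL) process, nice() adjusts the nice value. */
            errno = 0;
            int new_nice = nice(5);
            if (new_nice == -1 && errno != 0)
                perror("nice");
            else
                printf("new nice value: %d\n", new_nice);
            return 0;
        }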
4.8.2 Processor Affinity System Calls
    • The Linux scheduler provides enforced processor affinity. That is, although it tries to keep a process on the same processor through soft (natural) affinity, it also lets a user insist that "this process must run only on these processors, no matter what." This hard affinity is stored in a bitmask in the process's descriptor; each bit of the mask corresponds to one processor in the system, and by default all bits are set.
4.8.3 Yielding Processor Time
    • Linux provides the sched_yield() system call as a mechanism for a process to explicitly yield the processor to other waiting processes. It works by removing the process from the active array (where it currently is, because it is running) and inserting it into the expired array. The effect is not only to preempt the process and put it at the end of its priority list, but also to put it on the expired list, which guarantees it will not run again for some time. Real-time processes are an exception: because they never expire, they are merely moved to the end of their priority list and are not placed on the expired array. In earlier versions of Linux, the process was placed only at the end of its priority list, so the yield was often not for very long. Today, applications and even kernel code should think carefully about whether they really want to give up processor time before calling sched_yield(). As a convenience, kernel code can call yield(), which first ensures the given task is actually in the runnable state and then calls sched_yield(); user-space applications use the sched_yield() system call directly.
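    • A short user-space example of both mechanisms: sched_setaffinity() (a Linux-specific call that requires _GNU_SOURCE) pins the calling process to CPU 0, and sched_yield() then explicitly relinquishes the processor.

        /* Pin the calling process to CPU 0, then voluntarily yield once. */
        #define _GNU_SOURCE
        #include <stdio.h>
        #include <sched.h>

        int main(void)
        {
            cpu_set_t mask;

            CPU_ZERO(&mask);
            CPU_SET(0, &mask);  /* allow this process to run only on CPU 0 */
            if (sched_setaffinity(0, sizeof(mask), &mask) == -1)
                perror("sched_setaffinity");

            /* Explicitly yield the processor to other runnable processes. */
            if (sched_yield() == -1)
                perror("sched_yield");

            printf("done\n");
            return 0;
        }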
4.9 Summary
    • The process scheduler is an important part of the kernel because, for most of us, running processes is the whole point of using a computer. Satisfying the competing demands of process scheduling is not trivial: it is hard to find a single algorithm that suits both large numbers of runnable processes and scalability requirements while also balancing scheduling latency against throughput across varied workloads. The Linux kernel's CFS scheduler, however, comes close to satisfying all of these needs and provides a near-optimal solution with good scalability and a novel approach. The previous chapter covered process management; this chapter examined the theory of process scheduling and the specific algorithms, implementation, and interfaces the Linux kernel uses when scheduling processes.
"Linux kernel design and implementation" Learning summary CHAP4

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.