Linux Scheduler: User-Space Interface

Source: Internet
Author: User

I. Preface
The Linux scheduler is mysterious and alluring, and every Linux engineer wants to dive into its internals. But there is an old Chinese saying that "appearance springs from the heart": a module's ingenious internal logic (the "heart") shows outwardly as a simple and elegant interface (what I call the "appearance"). From the definition of the external interface alone we can actually glean 60% or 70% of a module's internal information. This article therefore describes the interfaces that the Linux scheduler exposes to user space, in the hope that they will help us understand the scheduler's behavior.

II. The nice function

The nice function modifies the nice value of the calling process. Its interface is defined as follows:

#include <unistd.h>
int nice(int inc);

A practical example shows what the interface does. If a program calls nice(3), the nice value of the current process increases by 3, which lowers the priority of the process by 3 levels (a higher nice value means being "nicer" to others, and hence a lower priority for oneself). If the program calls nice(-5), the nice value of the current process decreases by 5, which raises its priority by 5 levels. On error the call returns -1. The return value on success is slightly ambiguous. The POSIX standard specifies that nice returns the new nice value, but the raw Linux system call returns 0 on success, and so did the wrapper in old versions of glibc (2.2.3 and earlier). With those, the call itself cannot tell you the current priority. Since glibc 2.2.4 the wrapper follows POSIX and returns the new nice value, which means -1 can be a legitimate return value, so the caller should clear errno before the call and check it afterwards. To read the current priority directly, call the getpriority function, which we describe in the next section.

Although nice is nominally used to adjust priority, adjusting the nice value actually adjusts the share of CPU time that the scheduler assigns to the process. How exactly does it affect CPU time? We will come back to that later, when we describe the kernel code. Also note that, according to the POSIX standard, the nice value is a per-process setting, but Linux does not comply with the standard here: in Linux the nice value is a per-thread attribute.
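The errno check matters whenever the C library returns the new nice value. The following is a minimal sketch, assuming a POSIX-conforming wrapper (glibc 2.2.4 or later); the helper name adjust_nice is ours and not part of any library.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a library function): add `inc` to the nice
 * value of the caller and store the resulting nice value in *new_nice.
 * Returns 0 on success, -1 on failure with errno set by nice().
 */
int adjust_nice(int inc, int *new_nice)
{
    errno = 0;                 /* -1 may be a valid nice value, so clear errno first */
    int ret = nice(inc);
    if (ret == -1 && errno != 0)
        return -1;             /* a real error, e.g. EPERM when raising priority */
    *new_nice = ret;           /* assumes the wrapper returns the new nice value */
    return 0;
}
```

Calling adjust_nice(3, &n) lowers the caller's priority by three levels. A negative increment raises it and will typically fail for an unprivileged process.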

III. The getpriority/setpriority functions

The previous section showed the limitations of nice: it can only modify the caller's own nice value, it offers no portable way to simply read the current nice value, and so on. Hence an enhanced version of the nice interface, the getpriority/setpriority functions, defined as follows:

#include <sys/time.h>
#include <sys/resource.h>

int getpriority(int which, id_t who);
int setpriority(int which, id_t who, int prio);

You may ask: if the interface only adds features, why change the name? Why not getnice/setnice? In fact, the previous section did not distinguish between scheduling priority and nice value. "Nice value" was the original term, but people soon felt it was hard to understand, especially for beginners, so the term "priority" was adopted to give users a better idea of what the API does. As it turned out, the change was not ideal, as we will describe later.

The getpriority/setpriority functions are powerful and can handle several kinds of request, selected by the which and who parameters. When which is PRIO_PROCESS, who is a process ID, and getpriority returns the nice value of that process. When which is PRIO_PGRP, who is a process group ID, and getpriority returns the highest priority (that is, the smallest nice value) among the processes in that group. When which is PRIO_USER, who is a user ID, and getpriority returns the smallest nice value among all processes that belong to that user. A who of 0 denotes the calling process (or its process group, or its user).

setpriority is similar to nice but a little more powerful: with PRIO_PROCESS, PRIO_PGRP or PRIO_USER it can set the nice value of a whole set of processes. Its return value follows the usual convention: 0 means success and -1 means failure. getpriority is trickier. As Linux programmers we all know that nice values range over [-20, 19]. If getpriority returns values in that range, a priority of -1 is awkward, because Linux C library functions generally return -1 to signal an error. How do we tell an error return of -1 from a legitimate priority of -1? getpriority is one of the few interface functions for which -1 may be a correct return value. Before calling getpriority, clear errno; if the call returns -1, check whether errno is still 0. If it is, the priority really is -1; otherwise an error occurred.
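The errno protocol just described can be wrapped once and reused. This is a small sketch rather than a recommended API; the helper name read_nice is hypothetical.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Hypothetical helper: read the nice value selected by (which, who) into
 * *nice_out. Returns 0 on success, -1 on error with errno set.
 */
int read_nice(int which, id_t who, int *nice_out)
{
    errno = 0;                          /* clear errno before the call */
    int prio = getpriority(which, who);
    if (prio == -1 && errno != 0)
        return -1;                      /* -1 and errno set: a real error */
    *nice_out = prio;                   /* -1 with errno == 0 is a valid nice value */
    return 0;
}
```

For example, read_nice(PRIO_PROCESS, 0, &n) reads the caller's own nice value, and read_nice(PRIO_PGRP, 0, &n) reads the smallest nice value in the caller's process group.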

IV. Interfaces for operating on RT priority

In traditional Unix-like kernels, the scheduler uses a round-robin time-sharing algorithm: if several processes are runnable, nobody needs to worry, everyone queues up and takes a turn. Each process is allocated a CPU time slice and the processes run in turn; once all the time slices are used up, a new round of allocation begins. Under such a model, the nice interface, which indirectly affects the CPU time slice, is sufficient. And since being allocated more time slices amounts to having a higher priority, the nice value is also called the priority of the process.

However, new requirements keep emerging (human desire is boundless), especially real-time ones, so the POSIX standard (2008 edition) includes real-time scheduling and provides the POSIX realtime scheduling API, which lets user space modify scheduling policies and scheduling priorities. This is a little awkward: everyone is used to calling the nice value the process priority, and now a real process priority arrives. How do we tell them apart? To solve this problem, a new term is introduced: the scheduling policy. A scheduler follows a set of rules to decide when to run which process and for how long. Those rules are the scheduling policy.

A good scheduling policy depends on how processes are classified. One class is the familiar normal process, scheduled by time-slice rotation. This class can be subdivided further: there are compute-intensive processes (SCHED_BATCH, which the scheduler should preferably not wake up too often) and idle-class processes (SCHED_IDLE). Idle-class processes have a very low priority: if the system has anything else to do (other processes to schedule), it does that first, and only when there is really nothing else to do does it consider them. Whatever kind of normal process it is, a single scheduling parameter, the nice value, is enough to describe its priority.

Besides normal processes, there is a class of processes scheduled strictly by priority. Anyone familiar with an RTOS will know priority-based schedulers: rank is everything, and as long as a higher-priority process is runnable, a lower-priority process gets no chance to run at all. The priority here is the real priority, but since the nice value is already called the process priority, this one is called the RT priority. Scheduling policies for RT processes come in two kinds, SCHED_FIFO and SCHED_RR, which differ slightly in how they treat processes of the same RT priority. Under SCHED_FIFO, whoever gets the CPU first keeps it until it voluntarily gives up the CPU or exits, and only then do other processes of the same RT priority get a chance to run. SCHED_RR is a little more humane: processes of the same RT priority each have a time slice and take turns. For real-time processes, the RT priority is the only parameter needed.

At this point a summary is in order. Process priority has two ranges. One is the nice value, set or read with the APIs from the previous two sections. The other is the RT priority, which completely overrides the nice value; the interfaces that operate on it are described in this section.

After this long preamble, we can finally introduce the realtime scheduling API, which is defined as follows:

#include <sched.h>

int sched_setscheduler(pid_t pid, int policy, const struct sched_param *param);

int sched_getscheduler(pid_t pid);

int sched_get_priority_max(int policy);  /* returns the largest RT priority for the specified policy */
int sched_get_priority_min(int policy);  /* returns the smallest RT priority for the specified policy */

int sched_setparam(pid_t pid, const struct sched_param *param);
int sched_getparam(pid_t pid, struct sched_param *param);

sched_get_priority_max and sched_get_priority_min return the maximum and minimum RT priority for the specified scheduling policy; different operating systems implement different numbers of priority levels. In Linux, real-time processes (SCHED_FIFO and SCHED_RR) have 99 RT priority levels, from a minimum of 1 to a maximum of 99. For other scheduling policies, these functions return 0.
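Because the range differs between operating systems, portable code should query it instead of hard-coding 1 and 99. A minimal sketch follows; print_rt_range is a hypothetical helper name.

```c
#include <sched.h>
#include <stdio.h>

/*
 * Hypothetical helper: print the RT priority range the running system
 * reports for one policy. Returns the number of distinct levels, or -1
 * if the policy is invalid (errno is then EINVAL).
 */
int print_rt_range(const char *name, int policy)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);
    if (lo == -1 || hi == -1)
        return -1;
    printf("%-12s min=%d max=%d\n", name, lo, hi);
    return hi - lo + 1;
}
```

Calling it for SCHED_FIFO, SCHED_RR and SCHED_OTHER in turn shows the ranges described above on the system at hand.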

The sched_getscheduler function returns the scheduling policy of the specified process (a pid of 0 means the calling process). The sched_setscheduler function sets the scheduling policy of the specified process and, for real-time policies, its RT priority as well. If the policy being set is a non-real-time one (for example SCHED_NORMAL, which user space calls SCHED_OTHER), the param argument carries no information and its sched_priority member must be set to 0. sched_setparam and sched_getparam are very simple; see the man pages.
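As an illustration of sched_setscheduler, the sketch below switches the caller to SCHED_FIFO and back. The helper names make_fifo and make_normal are ours, and we assume the caller usually needs privilege (CAP_SYS_NICE or a suitable RLIMIT_RTPRIO) for the first call to succeed.

```c
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: switch the calling process to SCHED_FIFO at the
 * given RT priority. Returns 0 on success, -1 on error with errno set
 * (typically EPERM for an unprivileged caller).
 */
int make_fifo(int rt_prio)
{
    struct sched_param param;
    memset(&param, 0, sizeof(param));
    param.sched_priority = rt_prio;
    return sched_setscheduler(0, SCHED_FIFO, &param);
}

/* Hypothetical helper: return to the default time-sharing policy. */
int make_normal(void)
{
    struct sched_param param;
    memset(&param, 0, sizeof(param));
    param.sched_priority = 0;   /* must be 0 for non-real-time policies */
    return sched_setscheduler(0, SCHED_OTHER, &param);
}
```

A SCHED_FIFO process that never blocks can starve everything of lower priority on its CPU, so experiments like make_fifo(10) are best kept short.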

V. One interface to unify them all

The APIs described in the previous section may seem sufficient, but the story is not over. From the discussion so far we have a basic picture of the scheduler's behavior: it works on priority (meaning RT priority), and the higher priority is always scheduled first. A process whose RT priority falls in [1, 99] is a real-time process; one whose RT priority is 0 is a normal process. For normal processes, the scheduler additionally takes the nice value into account (it, too, was once called priority; do not confuse it with RT priority). User-space processes can modify the scheduling policy, the nice value, and the RT priority through the APIs described above. Everything seems perfect: the CFS scheduler handles ordinary compute-intensive work (such as compiling a kernel) and interactive applications (such as editing a file in vi), and an application with real-time requirements can use the RT scheduler. However, SCHED_FIFO and SCHED_RR do not solve the problem of mixing several real-time applications that each have timing requirements. Under these policies a high-priority task can delay low-priority tasks indefinitely, so if a low-priority task has timing requirements of its own, there is simply no way to bound its scheduling latency.

To address the problem described in the previous paragraph, a new class of processes was defined. Their priority is higher than that of both real-time and normal processes, and they have characteristics of their own:

Such a process wakes up at fixed intervals to do some work, and handling that work takes a certain amount of time. A process of this kind is rather demanding. It tells the scheduler up front: I am not like those easygoing processes; in every period you must give me a fixed amount of CPU (the computation time), and of course that CPU time must be delivered within the period, hence the concept of a deadline. To meet this requirement, the 3.14 kernel introduced a new class of process called the deadline process, whose scheduling policy is SCHED_DEADLINE. The scheduler treats such processes with special regard: at the start of each period (that is, when the deadline process is woken up), the scheduler gives priority to the deadline process's CPU time needs and runs the process before the specified deadline. Once the process has consumed its specified CPU time, the scheduler may switch to other processes, but when the next period arrives, it will again do its utmost to run the deadline process before the deadline.

Although deadline processes take precedence over the other two classes, it is not appropriate to describe them with a "priority". They should instead be described by the following three parameters:

(1) Period (the cycle time)

(2) Deadline (the relative deadline)

(3) Computation time (how much CPU time is allocated within a single scheduling period)
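The relationship between the three parameters can be sketched in a few lines of C. The struct and function names below are hypothetical; the ordering runtime <= deadline <= period is the constraint the kernel places on SCHED_DEADLINE parameters, and runtime divided by period is the fraction of one CPU the task reserves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three parameters, all in nanoseconds. */
struct dl_params {
    uint64_t runtime;   /* computation time needed in each period */
    uint64_t deadline;  /* relative deadline, counted from the start of the period */
    uint64_t period;    /* length of one period */
};

/* True if 0 < runtime <= deadline <= period. */
bool dl_params_valid(const struct dl_params *p)
{
    return p->runtime > 0 &&
           p->runtime <= p->deadline &&
           p->deadline <= p->period;
}

/* Fraction of one CPU reserved by the task: runtime / period. */
double dl_utilization(const struct dl_params *p)
{
    return (double)p->runtime / (double)p->period;
}
```

For instance, a task that needs 10 ms of CPU within 30 ms of the start of every 100 ms period passes the check and reserves one tenth of a CPU.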

By now you have probably noticed that the interfaces described earlier cannot set these parameters, so the following APIs were added to GNU/Linux:

#include <sched.h>

int sched_setattr(pid_t pid, const struct sched_attr *attr, unsigned int flags);
int sched_getattr(pid_t pid, struct sched_attr *attr, unsigned int size, unsigned int flags);

The attr parameter is of type struct sched_attr, which contains every scheduling control parameter you could want: policy, nice value, RT priority, period, deadline, and so on. This interface can do everything the APIs from the previous sections can do. Its only drawback is that it is Linux-specific and not part of the POSIX standard, so whether an application should use it is a matter of judgment. Further details are not covered here; please refer to the man pages.

VI. Other

The APIs described above all deal with scheduler parameters. The Linux scheduler has two other kinds of interface. One is sched_getaffinity and sched_setaffinity, which manipulate a thread's CPU affinity. The other is sched_yield, which gives up the CPU and lets the Linux scheduler choose a suitable thread to run. These interfaces are very simple and reward a careful read of the man pages.
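Older C libraries ship no wrapper for these two calls, so a common approach is to invoke them through syscall(). The sketch below assumes a kernel of version 3.14 or later and headers that define SYS_sched_getattr and SYS_sched_setattr. The struct my_sched_attr and the my_ function names are our own; the struct mirrors the original 48-byte kernel layout and is deliberately named differently from struct sched_attr so it cannot clash with a definition that a newer C library may provide.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Local mirror of the original kernel layout of struct sched_attr. */
struct my_sched_attr {
    uint32_t size;            /* size of this structure in bytes */
    uint32_t sched_policy;    /* SCHED_OTHER, SCHED_FIFO, SCHED_DEADLINE, ... */
    uint64_t sched_flags;
    int32_t  sched_nice;      /* used by the normal policies */
    uint32_t sched_priority;  /* RT priority for SCHED_FIFO and SCHED_RR */
    uint64_t sched_runtime;   /* SCHED_DEADLINE parameters, in nanoseconds */
    uint64_t sched_deadline;
    uint64_t sched_period;
};

/* Read the scheduling attributes of `pid` (0 means the caller). */
int my_sched_getattr(pid_t pid, struct my_sched_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    return (int)syscall(SYS_sched_getattr, pid, attr,
                        (unsigned int)sizeof(*attr), 0U);
}

/* Apply `attr` to `pid`; attr->size must be set to sizeof(*attr). */
int my_sched_setattr(pid_t pid, const struct my_sched_attr *attr)
{
    return (int)syscall(SYS_sched_setattr, pid, attr, 0U);
}
```

Reading the caller's own attributes with my_sched_getattr(0, &attr) needs no privilege. Switching to SCHED_DEADLINE through my_sched_setattr normally does, and the kernel may also refuse the request if the CPU bandwidth asked for cannot be guaranteed.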
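A short sketch of the affinity calls, using the glibc CPU_* macros (which require _GNU_SOURCE). The helper names allowed_cpu_count and pin_to_cpu are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Hypothetical helper: count the CPUs the calling thread may run on.
 * Returns the count, or -1 on error.
 */
int allowed_cpu_count(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) == -1)
        return -1;
    return CPU_COUNT(&set);
}

/*
 * Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (for example EINVAL if that CPU
 * is not available to this thread).
 */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

sched_yield takes no arguments, so using it is simply a matter of calling sched_yield() at the point where the thread is willing to let others run.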

Reference Documentation:

1. POSIX standard 2008

2. The relevant Linux man pages

3. Linux 4.4.6 Kernel source code
