"Linux Kernel Development" reading notes (IV): Process scheduling


Main content:

    • What is scheduling
    • How scheduling works in principle
    • How scheduling is implemented on Linux
    • Scheduling-related system calls
1. What is scheduling

Modern operating systems are multi-tasking. For many tasks to run well on the system at the same time, a management program is needed to manage all the tasks (that is, processes) running concurrently on the computer.

This management program is the scheduler. Its job, put simply, is to:

    1. Decide which processes run and which processes wait
    2. Decide how long each process runs

In addition, for a better user experience, a running process should be able to be interrupted immediately by another, more urgent process.

In short, scheduling is a balancing act. On the one hand, it must let each running process make the most of the CPU (that is, switch processes as little as possible; with too much switching, CPU time is wasted on the switches themselves). On the other hand, it must ensure that processes share the CPU fairly (that is, prevent any one process from monopolizing the CPU for a long time).

2. Scheduling implementation principle

As mentioned earlier, the scheduler decides which process runs and how long it runs.

Which process runs is determined by process priority. To determine how long a process may run, scheduling also introduces the concept of the time slice.

2.1 Process priority

There are 2 ways to express a process's priority: one is the nice value, the other is the real-time priority.

The nice value ranges from -20 to +19; the higher the value, the lower the priority. In other words, a process with a nice value of -20 has the highest priority.

The real-time priority ranges from 0 to 99. Unlike the nice value, the higher the real-time priority, the higher the process's priority.

Real-time processes are processes with relatively strict response-time requirements, so in the run queue, processes with a real-time priority always preempt the running time of normal processes.

Having two kinds of priority raises some questions: which priority takes precedence? And what happens if a process has both priorities at once?

In fact, the Linux kernel settled both questions long ago.

First question: which priority takes precedence?

The answer is that real-time priority always beats the nice value. In the kernel, real-time priorities occupy the range 0 to MAX_RT_PRIO-1; for the definition of MAX_RT_PRIO, see include/linux/sched.h:

#define MAX_USER_RT_PRIO        100
#define MAX_RT_PRIO             MAX_USER_RT_PRIO

In the kernel, the nice value maps onto the range MAX_RT_PRIO to MAX_RT_PRIO+40, that is, MAX_RT_PRIO to MAX_PRIO:

#define MAX_PRIO                (MAX_RT_PRIO + 40)

Second question: what if a process has both priorities at the same time?

The simple answer: a process cannot have both. A process with a real-time priority has no nice value, and a process with a nice value has no real-time priority.

We can view the real-time priority and nice value of each process with the following command (where RTPRIO is the real-time priority and NI is the nice value):

$ ps -eo state,uid,pid,ppid,rtprio,ni,time,comm
S   UID   PID  PPID RTPRIO  NI     TIME COMMAND
S     0     1     0      -   0 00:00:00 systemd
S     0     2     0      -   0 00:00:00 kthreadd
S     0     3     2      -   0 00:00:00 ksoftirqd/0
S     0     6     2     99   - 00:00:00 migration/0
S     0     7     2     99   - 00:00:00 watchdog/0
S     0     8     2     99   - 00:00:00 migration/1
...

2.2 About time slices

Priority alone only decides who runs first. The scheduler also has to know when the next scheduling decision will happen, that is, how long the current process may keep running.

This is where the concept of the time slice comes in. A time slice is a numeric value indicating how long a process may keep running before it is preempted.

It can also be thought of as the amount of time the process runs before the next scheduling decision occurs (unless the process gives up the CPU voluntarily, or a real-time process preempts it).

Choosing the size of the time slice is not simple. Set it too large and the system becomes sluggish (scheduling happens rarely); set it too small and the processor wastes time on frequent process switches. The default time slice is typically 10 ms.

2.3 Scheduling implementation principle (based on priority and time slices)

Here is a concrete example to illustrate:

Assume the system has only 3 processes, ProcessA (nice=+10), ProcessB (nice=0), and ProcessC (nice=-10), and that the time slice is 10 ms.

1) Before scheduling, each process's priority is mapped to an amount of CPU time by some weight (assume here that every step of priority is worth an extra 5 ms of CPU time).

Assume ProcessA is assigned a 10 ms share of CPU time. ProcessB's priority is 10 steps higher than ProcessA's (the smaller the nice value, the higher the priority), so ProcessB should be assigned 10*5+10=60 ms; by the same logic, ProcessC is assigned 20*5+10=110 ms.

2) When scheduling begins, the process with the most allocated CPU time runs first. With ProcessA (10 ms), ProcessB (60 ms), and ProcessC (110 ms), ProcessC is obviously scheduled first.

3) After 10 ms (one time slice), scheduling happens again: ProcessA (10 ms), ProcessB (60 ms), ProcessC (100 ms). ProcessC has just run for 10 ms, so its remaining share drops to 100 ms; ProcessC is still the one scheduled.

4) After 4 more rounds (4 time slices): ProcessA (10 ms), ProcessB (60 ms), ProcessC (60 ms). Now ProcessB and ProcessC have the same CPU time left, so the choice depends on which of them sits ahead in the CPU run queue; assuming ProcessB is ahead, ProcessB is scheduled.

5) After 10 ms (one time slice): ProcessA (10 ms), ProcessB (50 ms), ProcessC (60 ms). ProcessC is scheduled again.

6) ProcessB and ProcessC run alternately until ProcessA (10 ms), ProcessB (10 ms), ProcessC (10 ms).

Now the choice depends on which of ProcessA, ProcessB, and ProcessC sits ahead in the run queue. Assume ProcessA is scheduled.

7) After 10 ms (one time slice): ProcessA exits with its time slice exhausted; ProcessB (10 ms), ProcessC (10 ms).

8) After 2 more time slices, ProcessB and ProcessC have also run out and exit.

This example is very simple and only meant to illustrate the principle of scheduling. Real scheduling algorithms are not this simple, but the basic idea is similar:

1) Determine how much CPU time each process may occupy (there are many algorithms for this, varying with the requirements)

2) The process with the most CPU time left runs first

3) After it runs, deduct the time it used from its share, and go back to 1)

3. How scheduling is implemented on Linux

The scheduling algorithm on Linux has kept evolving. Since kernel 2.6.23, Linux has used the "Completely Fair Scheduler", CFS for short.

When CFS allocates CPU time to processes, it does not assign them absolute amounts of CPU time; instead it assigns each process a percentage of CPU time based on its priority.

For example, take ProcessA, ProcessB, and ProcessC with relative weights of 1, 3, and 6 (the numbers are purely illustrative). Under the CFS approach their CPU shares would be ProcessA (10%), ProcessB (30%), ProcessC (60%).

Because the shares total 100%, ProcessB gets 3 times as much CPU as ProcessA, and ProcessC 6 times as much.

The CFS algorithm on Linux works in the following main steps (still using ProcessA (10%), ProcessB (30%), ProcessC (60%) as the example):

1) Compute each process's vruntime (Note 1); the vruntime of the running process is updated through the update_curr() function.

2) Pick the process with the smallest vruntime to run. (Note 2)

3) After the process has run, update its vruntime, then go back to step 2). (Note 3)

Note 1: vruntime here is the process's accumulated virtual running time. It is a field of struct sched_entity; the CFS code that maintains it lives in the kernel/sched_fair.c file.

Note 2: This is a bit hard to grasp at first. Picking the next process by vruntime seems to have nothing to do with the percentage of CPU time each process should get.

1) For example, suppose ProcessC runs first (vr is short for vruntime). After 10 ms: ProcessA (vr=0), ProcessB (vr=0), ProcessC (vr=10).

2) The next schedule can then only pick ProcessA or ProcessB (since the process with the smallest vruntime is selected).

Over a long run, ProcessA, ProcessB, and ProcessC would then simply alternate fairly, with no relation to priority at all.

In fact, vruntime is not the actual running time; it is the actual running time after a weighting operation.

For example, ProcessA (10%) above is allocated only 10% of the total CPU time, so when ProcessA runs for 10 ms, its vruntime increases by 100 ms.

By the same logic, when ProcessB runs for 10 ms its vruntime increases by 100/3 ms, and when ProcessC runs for 10 ms its vruntime increases by 100/6 ms.

During actual running, ProcessC's vruntime grows the slowest, so it ends up getting the most CPU time.

The weighting above is a simplification of my own, for ease of understanding; for how Linux actually weights vruntime, you will have to read the source ^-^

Note 3: To find the smallest vruntime quickly, Linux stores all runnable processes in a red-black tree. The leftmost leaf node of the tree is the process with the smallest vruntime, and the tree is updated whenever a new process joins or an old process exits.

In fact, schedulers on Linux are provided as modules, each with a different priority, so several scheduling algorithms can coexist on the same system.

Each process can choose its own scheduler. When Linux schedules, it first selects a scheduler by scheduler priority, and then selects a process managed by that scheduler.

4. Scheduling-related system calls

There are 2 main groups of scheduling-related system calls:

1) Those related to scheduling policy and process priority (the parameters mentioned above: policy, priority, time slice, and so on): the first 8 in the list below

2) Those related to processors: the last 3 in the list below

nice()
    Sets the nice value of a process

sched_setscheduler()
    Sets a process's scheduling policy, i.e. which scheduling algorithm the process uses

sched_getscheduler()
    Gets a process's scheduling policy

sched_setparam()
    Sets a process's real-time priority

sched_getparam()
    Gets a process's real-time priority

sched_get_priority_max()
    Gets the maximum real-time priority; because of permission restrictions, non-root users cannot set a real-time priority as high as 99

sched_get_priority_min()
    Gets the minimum real-time priority; similar to the above

sched_rr_get_interval()
    Gets a process's time slice

sched_setaffinity()
    Sets a process's processor affinity, which is in fact the cpus_allowed bitmask stored in task_struct. Each bit of the mask corresponds to one processor available in the system; by default all bits are set, i.e. the process may run on every processor in the system. With this call, a user can set a different mask so that the process may only run on one or several particular processors.

sched_getaffinity()
    Gets a process's processor affinity

sched_yield()
    Yields the processor temporarily
