A brief introduction to Linux Scheduler development


Introduction

Process scheduling is a core function of the operating system. The scheduler is only one part of the scheduling machinery; process scheduling as a whole is quite complex and requires several subsystems to cooperate. This article focuses only on the scheduler itself, whose main task is to pick the most appropriate process from all RUNNING (runnable) processes. As a general-purpose operating system, Linux divides processes into three categories:

Interactive processes

These processes involve a great deal of human-computer interaction, so they spend much of their time sleeping while waiting for user input. A typical example is an editor such as vi. Such processes demand a short response time from the system; otherwise the user perceives the system as sluggish.

Batch processes

These processes need no human-computer interaction and run in the background. They often consume a large amount of system resources, but they can tolerate delays in response. Compilers are a typical example.

Real-time processes

Real-time processes have the most demanding requirements on scheduling latency; they often perform very important operations that must be responded to and executed immediately, for example video playback software or an aircraft flight control system. Obviously such programs cannot tolerate long scheduling delays: in mild cases the viewing experience suffers, in serious cases the aircraft is destroyed and people die.

Linux applies different scheduling policies to these different classes of processes. For real-time processes it uses a FIFO or Round Robin policy. For normal processes it must further distinguish interactive processes from batch processes. Traditional Linux schedulers raise the priority of interactive processes so that they are dispatched more quickly. The core idea of newer schedulers such as CFS and RSDL is "complete fairness". This design not only greatly simplifies the scheduler code, it also provides more thorough support for a wide range of scheduling requirements.

Before discussing CFS and RSDL, let us first review the schedulers used in Linux 2.4 and Linux 2.6.0.


2 A Brief History of the Kernel Scheduler

2.1 The Linux 2.4 Scheduler

The scheduler used in Linux 2.4.18 was a priority-based design, not very different from the scheduler Linus released in 1992. Its pick-next algorithm is very simple: compare the priorities of all processes in the runqueue in turn and select the one with the highest priority as the next process to run. (The runqueue is the queue in the Linux kernel that holds all ready processes.) The term pick next is used here for the act of selecting the next process to dispatch from all candidates.
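As a rough illustration, the following sketch shows what such an O(n) pick-next loop looks like in plain C. The struct fields and the goodness() weighting are simplified stand-ins invented for this example, not the actual 2.4 kernel code.

#include <stddef.h>

/* Simplified model of a 2.4-era task; not the kernel's task_struct. */
struct task {
    int counter;           /* remaining time slice, doubles as dynamic priority */
    int nice;              /* static priority set by the user */
    struct task *next;     /* the runqueue is a simple linked list here */
};

/* Stand-in for the kernel's goodness(): a higher value is a better candidate. */
static int goodness(const struct task *p)
{
    return p->counter ? p->counter + (20 - p->nice) : 0;
}

/* pick_next: walk the whole runqueue and keep the best candidate, O(n). */
static struct task *pick_next(struct task *runqueue)
{
    struct task *best = NULL;
    int best_weight = -1;

    for (struct task *p = runqueue; p != NULL; p = p->next) {
        int w = goodness(p);
        if (w > best_weight) {
            best_weight = w;
            best = p;
        }
    }
    return best;
}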

Each process is assigned a time slice when it is created. The clock interrupt decrements the time slice of the running process; once a process has exhausted its slice, it must wait until time slices are redistributed before it can run again. The Linux 2.4 scheduler redistributes time slices only after every RUNNING process has used up its slice. This period is called an epoch. The design guarantees that every process gets a chance to execute.

Different processes have different requirements, and the Linux 2.4 scheduler meets them mainly by adjusting process priorities. In fact, all later schedulers also rely primarily on modifying process priorities to satisfy different scheduling requirements.

Real-time processes

The priority of a real-time process is set statically and is always higher than the priority of any normal process. Therefore a normal process is scheduled only when there is no real-time process in the runqueue.

Real-time processes use two scheduling policies: SCHED_FIFO and SCHED_RR. FIFO is first-in, first-out: among processes of the same priority, the one that entered the runqueue first is always scheduled first. Round Robin uses a fairer rotation, letting real-time processes of the same priority take turns on the CPU.

Normal processes

For normal processes, the scheduler tends to raise the priority of interactive processes because they need to respond quickly to the user. The priority of a normal process is determined mainly by the counter field in the process descriptor (plus the static priority set with nice). When a process is created, the child's counter is set to half of the parent's counter, which ensures that no process can gain extra execution time simply by calling fork() repeatedly.
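The rule can be pictured with a couple of lines of C. This is a sketch of the behaviour described above, reusing the simplified struct task from the earlier sketch; it is not the kernel's actual fork path.

/* Sketch: on fork, the remaining time slice is split between parent and
 * child, so forking repeatedly cannot mint extra CPU time. */
static void split_counter_on_fork(struct task *parent, struct task *child)
{
    child->counter = parent->counter >> 1;   /* the child gets half of the counter */
    parent->counter -= child->counter;       /* assumption: the parent gives up that half */
}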

How does the Linux 2.4 scheduler boost the priority of interactive processes? As mentioned earlier, when all RUNNING processes have exhausted their time slices, the scheduler recalculates the counter of every process, including processes that are sleeping, not only the RUNNING ones. A sleeping process has not used up its counter, and during recalculation the unused portion is added to the new value, raising its priority. Interactive processes sleep frequently while waiting for user input, so when they are woken and re-enter the runqueue they obtain the CPU ahead of other processes. From the user's point of view, interactive processes therefore respond faster.
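A minimal sketch of this epoch-end recalculation is shown below, again using the simplified struct task. The nice-to-ticks conversion is a placeholder, and the sketch assumes (as the 2.4 kernel did) that half of the leftover counter is carried over; only that carry-over is the point here.

/* Sketch: at the end of an epoch every task, including sleeping ones, gets a
 * fresh slice, and part of any unused counter is carried over, which is what
 * raises the priority of tasks that spent the epoch sleeping. */
static void recalculate_counters(struct task *all_tasks)
{
    for (struct task *p = all_tasks; p != NULL; p = p->next) {
        int base_slice = 20 - p->nice;                 /* placeholder conversion */
        p->counter = (p->counter >> 1) + base_slice;   /* unused half is kept */
    }
}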

The main disadvantages of this scheduler are:

Poor scalability: to choose a process the scheduler traverses the entire runqueue and picks the best candidate, so the running time of the algorithm is proportional to the number of processes. In addition, the time spent recalculating counters also grows linearly with the number of processes in the system; when there are many processes, updating the counters becomes very expensive and overall system performance degrades.

Poor scheduling performance under high load: the 2.4 scheduler pre-allocates a fairly large time slice to each process, so it is inefficient on heavily loaded servers, because the average waiting time of a process is proportional to the size of the time slice.

Imperfect treatment of interactive processes: Linux 2.4 identifies interactive processes on the assumption that they sleep more frequently than batch processes. In reality this is often not true: some batch processes have no user interaction but still perform I/O frequently, for example a database engine constantly doing disk I/O while processing queries. Such processes do not need fast user response, yet their priority is raised anyway. When the system is heavily loaded with processes of this kind, the response time of genuinely interactive processes suffers.

Insufficient support for real-time processes: the Linux 2.4 kernel is non-preemptive, so a process running in kernel mode cannot be preempted, which is unacceptable for real-time applications.

To solve these problems, Ingo Molnar developed the new O(1) scheduler, which was not only adopted by Linux 2.6 but also backported to Linux 2.4. Before CFS and RSDL, this scheduler was used in many commercial distributions.

2.2 The Linux 2.6 O(1) Scheduler

As the name suggests, the O(1) scheduler mainly addresses the scalability problem of its predecessor: the time spent in the scheduling algorithm is constant, independent of the number of processes in the system. In addition, the Linux 2.6 kernel supports kernel preemption, so real-time processes are better supported. Like its predecessor, the O(1) scheduler also distinguishes interactive processes from batch processes.

The Linux 2.6 kernel also supports three scheduling policies: SCHED_FIFO and SCHED_RR are used for real-time processes, while SCHED_NORMAL is used for normal processes. The O(1) scheduler changes the Linux 2.4 scheduler in two respects: the way process priorities are calculated, and the pick-next algorithm.

2.2.1 Process Priority Calculation

Priority calculation for normal processes

The priority of a normal process is calculated dynamically, and the formula involves the static priority. In general, the higher the static priority, the longer the time slice the process is given; users can modify a process's static priority with the nice system call.

The dynamic priority is computed by formula one:

Formula one
  dynamic priority = max(100, min(static priority - bonus + 5, 139))

Here bonus depends on the average sleep time of the process. As you can see, in Linux 2.6 the priority of a normal process is tied to its average sleep time: the longer the average sleep time, the larger the bonus, and therefore the higher the resulting priority.

The average sleep time is also used to decide whether a process is interactive. A process is considered interactive if it satisfies formula two:

Formula two
  dynamic priority ≤ 3 × static priority / 4 + 28

The average sleep time is the time the process spends sleeping while waiting: it increases while the process sleeps and decreases after it enters the running state. Updates to this value are scattered across many kernel functions: the clock interrupt scheduler_tick(), process creation, wakeup from the TASK_INTERRUPTIBLE state, load balancing, and so on.
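The two formulas can be written down directly in C. The sketch below assumes the normal-process priority range 100..139 and uses a placeholder mapping from average sleep time to the 0..10 bonus, so it illustrates the shape of the calculation rather than the kernel's exact arithmetic.

#define MAX_SLEEP_AVG 1000   /* placeholder scale for the average sleep time */

/* Map the average sleep time onto the 0..10 bonus range (simplified). */
static int bonus_from_sleep(int sleep_avg)
{
    return sleep_avg * 10 / MAX_SLEEP_AVG;
}

/* Formula one: dynamic priority = max(100, min(static priority - bonus + 5, 139)). */
static int effective_prio(int static_prio, int sleep_avg)
{
    int prio = static_prio - bonus_from_sleep(sleep_avg) + 5;

    if (prio < 100)
        prio = 100;
    if (prio > 139)
        prio = 139;
    return prio;
}

/* Formula two: the process counts as interactive when its dynamic priority
 * is no greater than 3 * static priority / 4 + 28. */
static int task_interactive(int static_prio, int sleep_avg)
{
    return effective_prio(static_prio, sleep_avg) <= 3 * static_prio / 4 + 28;
}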

Priority calculation for real-time processes

The priority of a real-time process is set through sched_setscheduler(). It is never modified dynamically and is always higher than the priority of any normal process. It is stored in the rt_priority field of the process descriptor.

2.2.2 The Pick Next Algorithm

The selection of a normal process is based on its priority: the process with the highest priority is chosen by the scheduler. In 2.4 the time-slice counter also served as the process's priority; in 2.6 the time slice is stored in the time_slice field of the task descriptor, while the priority is stored in prio (normal processes) or rt_priority (real-time processes).

The scheduler maintains two arrays of process queues per CPU: the active array and the expired array. Each element of an array is the head of the process queue for one priority level. There are 140 priority levels in the system, so both arrays have 140 entries.

When the process with the highest current priority has to be chosen, the 2.6 scheduler does not traverse the whole runqueue; instead it takes the first process of the highest-priority non-empty queue directly from the active array. Suppose the highest priority among all current processes is 50 (in other words, no process in the system has a higher priority). The scheduler reads active[49] directly and obtains the queue of priority-50 processes; the process at the head of that queue is the one selected. The complexity of this algorithm is O(1), which solves the scalability problem of the 2.4 scheduler.

To implement this, the active array maintains a bitmap: when a process is inserted into the list for a given priority, the corresponding bit is set. The sched_find_first_bit() function searches the bitmap and returns the index of the highest priority that currently has a bit set; in the example above it returns 49. On IA-32 processors this can be implemented with instructions such as bsfl.
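A compressed sketch of this lookup is shown below. The prio_array layout and the use of a compiler builtin in place of sched_find_first_bit() are assumptions made for the example; the kernel's real structures are more involved.

#include <stdint.h>

#define NR_PRIO 140

struct task;                         /* only pointers to tasks are needed here */

/* Simplified per-CPU priority array: one queue head per priority level plus
 * a bitmap with one bit per level. */
struct prio_array {
    uint64_t bitmap[3];              /* 3 x 64 bits cover the 140 levels */
    struct task *queue[NR_PRIO];     /* head of the list for each priority */
};

/* Return the lowest set bit index, i.e. the numerically smallest (and thus
 * highest) priority with runnable tasks; -1 if the array is empty. */
static int first_set_prio(const struct prio_array *a)
{
    for (int word = 0; word < 3; word++)
        if (a->bitmap[word])
            return word * 64 + __builtin_ctzll(a->bitmap[word]);
    return -1;
}

/* O(1) pick next: index the bitmap, then take the head of that list. */
static struct task *pick_next_o1(struct prio_array *active)
{
    int idx = first_set_prio(active);
    return idx < 0 ? 0 : active->queue[idx];
}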

To improve the response time of interactive processes, the O(1) scheduler not only raises their priority dynamically, it also treats them specially when their time slice expires:

A process's time slice (time_slice) is decremented on each clock tick. When time_slice reaches 0, the scheduler checks the type of the current process: if it is an interactive or real-time process, its time slice is reset and it is reinserted into the active array; otherwise it is moved from the active array to the expired array. In this way real-time and interactive processes always get the CPU first. However, these processes cannot stay in the active array forever, or the processes in the expired array would starve. When a process has already consumed a certain amount of CPU time, it is moved to the expired array even if it is a real-time or interactive process.

When all processes in the active array have been moved to the expired array, the scheduler swaps the active and expired arrays. Because each process's time slice was reset when it was moved into the expired array, the new active array starts in its initial state, the expired array is empty, and a new round of scheduling begins.
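The per-tick bookkeeping described in the last two paragraphs can be condensed into the following sketch. The field names, the interactive flag and the expired_starving test are placeholders for the kernel's heuristics; the point is the three possible outcomes of a tick.

#include <stdbool.h>

/* Minimal per-task model for this sketch; all fields are placeholders. */
struct o1_task {
    int  time_slice;     /* remaining ticks of the current slice */
    int  base_slice;     /* refill value derived from the static priority */
    bool interactive;    /* result of the interactivity heuristic */
};

enum tick_action { KEEP_RUNNING, REQUEUE_ACTIVE, MOVE_TO_EXPIRED };

/* Decide what happens to the running task on a clock tick. */
static enum tick_action o1_tick(struct o1_task *curr, bool expired_starving)
{
    if (--curr->time_slice > 0)
        return KEEP_RUNNING;                 /* slice not exhausted yet */

    curr->time_slice = curr->base_slice;     /* refill the slice */

    /* Interactive (and real-time) tasks normally go straight back onto the
     * active array, unless they have hogged the CPU long enough that tasks
     * waiting on the expired array would starve. */
    if (curr->interactive && !expired_starving)
        return REQUEUE_ACTIVE;

    return MOVE_TO_EXPIRED;
}

When the caller later finds the active array empty, it simply swaps the two array pointers, which is why the swap itself is also an O(1) operation.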

2.2.3 Summary of the O(1) Scheduler

The Linux 2.6 scheduler improves on the scalability of its predecessor: the time complexity of schedule() is O(1). This rests on two improvements:

1. The pick-next algorithm relies on the active array and no longer traverses the runqueue;

2. The periodic recalculation of every process's counter is eliminated; dynamic priority adjustments are spread across process switches, clock tick interrupts, and other kernel functions.

The O(1) scheduler's heuristics for distinguishing interactive processes from batch processes are a clear improvement over the past, but they still fail in many scenarios. Several well-known test programs can reliably defeat the scheduler and make interactive processes respond sluggishly:

fiftyp.c, thud.c, chew.c, ring-test.c, massive_intr.c

These deficiencies gave rise to Con Kolivas's staircase scheduling algorithm SD and its later improvement RSDL. After RSDL, Ingo Molnar developed CFS, which was eventually merged into the 2.6.23 kernel. The following sections introduce these next-generation schedulers.


3 New Generation Scheduler

Before Linux 2.6.0 was released, many people feared that problems with the scheduler would delay the new version: it still responded poorly to interactive applications and its NUMA support was imperfect. To address these problems, a large amount of complex, hard-to-maintain and hard-to-read code was added to the Linux 2.6.0 scheduler module. Many performance problems were indeed solved, but another serious problem kept troubling kernel developers: the complexity of the code.

In 2004 Con Kolivas proposed his first patch, the staircase scheduler, to improve the scheduler design. It introduced a new way of thinking about scheduling, and both RSDL and CFS later built on many of SD's basic ideas. This chapter briefly examines these three scheduling algorithms.

3.1 The Staircase Scheduling Algorithm (Staircase Scheduler)

The staircase algorithm (SD) differs greatly from the O(1) algorithm: it abandons the notion of dynamic priority and adopts a completely fair approach. Most of the complexity of its predecessors comes from computing dynamic priorities, where the scheduler corrects process priorities and identifies interactive processes from the average sleep time and a set of hard-to-understand empirical formulas. Such code is difficult to read and maintain.

The staircase algorithm is simple, yet experiments show that it responds to interactive processes better than its predecessor, and it greatly simplifies the code.

Like the O(1) algorithm, the staircase algorithm maintains a process list for each priority and organizes these lists in the active array. When the next process has to be chosen, the SD algorithm also reads it directly from the active array.

Unlike the O(1) algorithm, when a process has used up its time slice it is not moved to an expired array. Instead it is inserted into the list one priority level below in the active array; that is, it descends one level. Note that only the list it sits on changes; the priority of the task itself does not. When its time slice runs out again, the task is placed in the next lower queue once more. Like walking down a staircase, the task descends one step each time it uses up its time slice.

When the task reaches the lowest step and uses up its time slice once more, it does not stay there: it returns to the queue one level below the priority from which it started. For example, a process with priority 1 descends all the way to the last step, step 140; when its slice runs out again it returns to the priority-2 queue, that is, the second step. The time_slice assigned to the task, however, becomes twice the original: if the task's original time_slice was 10 ms, it now becomes 20 ms. The general principle is that whenever a task has walked down to the bottom of the staircase and used up its slice again, it restarts one step below where the previous walk began, with a larger time slice. This is summarized as follows:

If a task's own priority is P and it begins a descent from step n, then when it reaches the bottom it returns to step n+1 and is given n+1 times its original time slice.
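The descent-and-bounce rule can be modelled with a few fields. The names below are invented, and the growth of the per-step time slice (10 ms, 20 ms, 30 ms, ... on successive walks) follows the example above; treat it as a sketch of the described behaviour, not of Con Kolivas's actual implementation.

#define SD_LOWEST_STEP 140    /* bottom of the staircase in this model */

/* Invented bookkeeping for one task walking the staircase. */
struct sd_task {
    int own_prio;        /* the task's own priority; never modified */
    int step;            /* the queue (step) the task currently sits on */
    int descent_start;   /* step at which the current walk down began */
    int descents;        /* number of the current walk: 1, 2, 3, ... */
    int base_slice;      /* the time slice originally assigned */
    int time_slice;      /* slice granted on the current step */
};

static void sd_task_init(struct sd_task *t, int prio, int slice)
{
    t->own_prio = t->step = t->descent_start = prio;
    t->descents = 1;
    t->base_slice = t->time_slice = slice;
}

/* Called when the task exhausts its slice on the current step. */
static void sd_slice_expired(struct sd_task *t)
{
    if (t->step < SD_LOWEST_STEP) {
        t->step++;                       /* one step down the staircase */
    } else {
        /* Bottom reached: restart one step below where the last walk began,
         * with a proportionally larger slice per step. */
        t->descents++;
        t->descent_start++;
        t->step = t->descent_start;
    }
    t->time_slice = t->descents * t->base_slice;
}

Note that there is no expired array in this scheme: the bounce back up replaces it.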

The above describes scheduling for normal processes; real-time processes keep the original policies, namely FIFO or Round Robin.

The staircase algorithm avoids starvation: a high-priority process eventually descends to compete alongside low-priority processes, so low-priority processes eventually get a chance to run.

For an interactive application, while it sleeps the other processes that share its priority keep stepping down the staircase into lower-priority queues. When the interactive process wakes up, it is still on a high step, so the scheduler finds it sooner and the response time improves.

Advantages of the staircase algorithm

From an implementation standpoint, SD largely keeps the overall framework of O(1); it removes the complex code that dynamically adjusts priorities in the O(1) scheduler and eliminates the expired array, thereby simplifying the code. Its most important contribution is demonstrating that the idea of complete fairness is feasible.

3.2 RSDL (The Rotating Staircase Deadline Scheduler)

RSDL was also developed by Con Kolivas as an improvement on the SD algorithm. Its core idea is still "complete fairness", with no complex dynamic priority adjustment policies.

RSDL reintroduces the expired array. It assigns each priority level a "group time quota", which we denote Tg; within a priority level, every process has the same "priority time quota", which we denote Tp in the rest of this article.

When a process has used up its own Tp, it drops to the next lower priority group, just as in SD; in RSDL this is called a minor rotation. Note that Tp is not the process's time slice; it is smaller than the time slice. Figure 1 illustrates minor rotation: a process descends from priority 1 to priority 140 and then returns to the priority-2 queue, walks down the staircase again starting from priority 2, then bounces back to the priority-3 queue, and so on.

Figure 1.

In the SD algorithm, a low-priority process at the bottom of the staircase must wait until all higher-priority processes have finished before it gets the CPU, so its waiting time cannot be bounded. In RSDL, when a higher-priority group has used up its Tg (the group time quota), all processes in that group are forced down to the next priority group, regardless of whether their individual Tp has been exhausted. Low-priority tasks are therefore guaranteed to be scheduled within a predictable future, which improves the fairness of scheduling. This is what the "deadline" in RSDL stands for.

When a process has used up its entire time slice time_slice (T2 in the figures), it is placed into the expired array, in the queue of its initial priority (priority 1), as shown in Figure 2.

Figure 2

A major rotation is triggered when the active array becomes empty, that is, when all processes have dropped to the lowest priority. The major rotation swaps the active and expired arrays, all processes return to their initial state, and the minor rotations begin again from the start.
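The quota bookkeeping can be sketched as follows. The structures and the per-tick accounting are assumptions made for illustration; the three outcomes correspond to the behaviour described above.

/* Invented per-task and per-priority-group quota bookkeeping. */
struct rsdl_task {
    int tp;              /* remaining priority time quota (Tp) on this step */
    int time_slice;      /* remaining full time slice (T2 in the figures) */
};

struct rsdl_group {
    int tg;              /* remaining group time quota (Tg) for this priority */
};

enum rsdl_event { RSDL_STAY, RSDL_MINOR_ROTATION, RSDL_EXPIRE };

/* Charge one tick of CPU time to the running task and its priority group. */
static enum rsdl_event rsdl_tick(struct rsdl_task *t, struct rsdl_group *g)
{
    t->tp--;
    t->time_slice--;
    g->tg--;

    if (t->time_slice <= 0)
        return RSDL_EXPIRE;            /* back to its initial priority, expired array */

    if (t->tp <= 0 || g->tg <= 0)
        return RSDL_MINOR_ROTATION;    /* drop to the next priority group */

    return RSDL_STAY;
}

When Tg runs out, the caller demotes every task in the group, not just the running one; and when the active array finally empties, the major rotation swaps the active and expired arrays.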

RSDL Support for interactive processes

As in SD, while an interactive process sleeps, all of its competitors descend into lower-priority queues through minor rotations. When it becomes runnable again, it therefore holds a relatively high priority and can respond quickly.

3.3 CFS (The Completely Fair Scheduler)

CFS is the scheduler that the kernel eventually adopted. It borrows the idea of complete fairness from RSDL/SD: it no longer tracks a process's sleep time and no longer tries to identify interactive processes. It treats all processes uniformly, and that is what fairness means here. The algorithm and implementation of CFS are quite simple, and many benchmarks show that its performance is also excellent.

In the words of its author Ingo Molnar: "80% of CFS's design can be summed up in a single sentence: CFS models an ideal, precise multitasking CPU on real hardware." On such an ideal multitasking CPU, every process runs simultaneously and receives its share of computing power at every instant; with two runnable processes, each gets 50% of the CPU at any moment. On real hardware, however, while one process is using the CPU the others must wait, and that is where unfairness arises.

Suppose there are n processes in the runqueue and the current process has just run for 10 ms. On the ideal multitasking CPU those 10 ms should have been shared equally among the n processes (ignoring each process's nice value), so the current process was entitled to only 10/n ms, yet it actually ran for 10 ms. CFS therefore penalizes the current process so that the other processes can replace it as soon as possible at the next scheduling opportunity; over time this yields fair scheduling for all processes. The following sections look at the important parts of the CFS implementation to understand how it works.

How CFS implements pick next

CFS discards the active/expired arrays and uses a red-black tree to select the next process. All processes in the runnable state are inserted into the red-black tree, and at every scheduling point the CFS scheduler picks the leftmost leaf node of the tree as the next process to receive the CPU.
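A bare-bones sketch of that selection follows. The kernel stores tasks in struct rb_node red-black tree nodes, but an ordinary binary-search-tree walk is enough to show why the leftmost node is the next task.

#include <stddef.h>

/* Simplified tree node; the key is the value discussed below. */
struct cfs_node {
    long long key;                     /* fair_clock - wait_runtime */
    struct cfs_node *left, *right;
};

/* Pick next under CFS: follow left children down to the smallest key, i.e.
 * the task that has so far been treated least fairly. O(log n) in a
 * balanced (red-black) tree. */
static struct cfs_node *cfs_pick_next(struct cfs_node *root)
{
    if (!root)
        return NULL;        /* empty tree: nothing runnable */
    while (root->left)
        root = root->left;
    return root;
}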

Tick interrupt

Under CFS, the tick interrupt first updates the scheduling information and then adjusts the current process's position in the red-black tree. If, after the adjustment, the current process is no longer the leftmost node, the need_resched flag is set and schedule() is called when the interrupt returns to complete the process switch; otherwise the current process keeps the CPU. As you can see, CFS abandons the traditional notion of a time slice: the tick interrupt only updates the red-black tree, whereas all earlier schedulers decremented a time slice in the tick interrupt and triggered priority adjustment and rescheduling when the slice or quota ran out.

Red-black tree key calculation

The key to understanding CFS is understanding how the red-black tree key is computed. The key is determined by three factors: the CPU time the process has already consumed, the process's nice value, and the current CPU load.

The CPU time already consumed has the largest influence on the key; when trying to understand CFS we can simply pretend that the key equals the CPU time the process has used. The larger that time, the larger the key, and the further the process moves toward the right of the red-black tree. In addition, CFS specifies that a process with a nice value of 0 should get 10% more CPU time than a process with a nice value of 1; this factor is also folded into the key, so the larger the nice value, the larger the key.

CFS maintains two important variables for each process: fair_clock and wait_runtime. In this article we call variables maintained per process "process-level" variables and variables maintained per runqueue "runqueue-level" variables.

The key with which a process is inserted into the red-black tree is fair_clock - wait_runtime.

fair_clock, as its name suggests, is the CPU time the process should have received: it advances at the rate of the actual CPU time consumed divided by the number of processes in the current runqueue. wait_runtime is the time the process has spent waiting. Their difference measures how fairly a process has been treated: the larger the difference, the more the process has already been favoured relative to the others.

For an interactive task, wait_runtime is not consumed for long stretches, because the task sleeps instead of running, so its key fair_clock - wait_runtime stays relatively small. It therefore sits near the left of the red-black tree and gets a fast response.
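In terms of code, the key could be written as below. The field names follow the article, and the structure is a simplified stand-in for the kernel's sched_entity.

/* Simplified per-task state for the key computation. */
struct cfs_task {
    long long fair_clock;     /* CPU time the task should have received so far */
    long long wait_runtime;   /* time the task has spent waiting; consumed while it runs */
};

/* Key used when inserting the task into the red-black tree: a smaller key
 * means the task has been treated less fairly, so it sits further left and
 * is picked sooner. */
static long long cfs_key(const struct cfs_task *t)
{
    return t->fair_clock - t->wait_runtime;
}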

The red-black tree is a balanced tree; the scheduler always takes the leftmost leaf node, and the time complexity of this operation is O(log n).

Scheduler module manager

To support real-time processes, CFS introduces a scheduler module manager. A scheduling algorithm can be registered with the manager as a module, and different processes can then be handled by different scheduler modules. In 2.6.23 two modules are implemented: the CFS algorithm, used for normal processes, and a real-time scheduling module, used for real-time processes. Ingo Molnar also invited Con Kolivas to implement RSDL/SD as such a scheduling module.

CFS Source Code Analysis

The scheduler_tick() function in sched.c is called directly from the clock interrupt. It first updates the runqueue-level variable clock and then calls the CFS tick handler task_tick_fair(), which lives in sched_fair.c and whose main job is to call entity_tick().

The source code of entity_tick() is as follows:

static void entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
{
    struct sched_entity *next;

    /* Dequeue and re-enqueue the task to update its position in the tree. */
    dequeue_entity(cfs_rq, curr, 0);
    enqueue_entity(cfs_rq, curr, 0);

    /* Reschedule if another task is now leftmost. */
    next = __pick_next_entity(cfs_rq);
    if (next == curr)
        return;

    __check_preempt_curr_fair(cfs_rq, next, curr, sched_granularity(cfs_rq));
}

It first calls dequeue_entity() to remove the current process from the red-black tree and then enqueue_entity() to reinsert it; together these two operations adjust the process's position in the tree. __pick_next_entity() returns the leftmost node of the tree, and if that is no longer the current process, __check_preempt_curr_fair() is called. That function sets the rescheduling flag, so that schedule() is invoked when the interrupt returns.

The source code for the function enqueue_entity () is as follows:

static void
enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int wakeup)
{
    /* Update the fair clock. */
    update_curr(cfs_rq);

    if (wakeup)
        enqueue_sleeper(cfs_rq, se);

    update_stats_enqueue(cfs_rq, se);
    __enqueue_entity(cfs_rq, se);
}

Its first job is to update the scheduling information, after which the process is inserted into the red-black tree. The core of this update is the update_curr() function.

static void update_curr(struct cfs_rq *cfs_rq)
{
    struct sched_entity *curr = cfs_rq_curr(cfs_rq);
    unsigned long delta_exec;

    if (unlikely(!curr))
        return;

    /* CPU time consumed by the current task since the last update. */
    delta_exec = (unsigned long)(rq_of(cfs_rq)->clock - curr->exec_start);

    curr->delta_exec += delta_exec;

    if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) {
        __update_curr(cfs_rq, curr);
        curr->delta_exec = 0;
    }
    curr->exec_start = rq_of(cfs_rq)->clock;
}

The function first calculates the CPU time obtained by the current process. The value rq_of(cfs_rq)->clock is updated in the tick interrupt, and curr->exec_start is the timestamp at which the current process started running on the CPU; the difference between the two is the CPU time the process has obtained. This value is accumulated in curr->delta_exec, and then __update_curr() is called.

static inline void
__update_curr(struct cfs_rq *cfs_rq, struct sched_entity *curr)
{
    unsigned long delta, delta_exec, delta_fair, delta_mine;
    struct load_weight *lw = &cfs_rq->load;
    unsigned long load = lw->weight;

    delta_exec = curr->delta_exec;
    schedstat_set(curr->exec_max, max((u64)delta_exec, curr->exec_max));

    curr->sum_exec_runtime += delta_exec;
    cfs_rq->exec_clock += delta_exec;

    if (unlikely(!load))
        return;

    delta_fair = calc_delta_fair(delta_exec, lw);
    delta_mine = calc_delta_mine(delta_exec, curr->load.weight, lw);

    if (cfs_rq->sleeper_bonus > sysctl_sched_min_granularity) {
        delta = min((u64)delta_mine, cfs_rq->sleeper_bonus);
        delta = min(delta, (unsigned long)(
                (long)sysctl_sched_runtime_limit - curr->wait_runtime));
        cfs_rq->sleeper_bonus -= delta;
        delta_mine -= delta;
    }

    cfs_rq->fair_clock += delta_fair;
    /* delta_mine is the CPU time the task was entitled to during this period;
     * delta_mine - delta_exec is therefore negative and shrinks wait_runtime. */
    add_wait_runtime(cfs_rq, curr, delta_mine - delta_exec);
}

The main work of __update_curr() is to update the fair_clock and wait_runtime mentioned earlier; the difference between the two is the key under which the process will later be inserted into the red-black tree. The variable delta_exec holds the CPU time the current process has just consumed, computed earlier. calc_delta_fair() corrects delta_exec according to the CPU load (stored in the lw variable) and saves the result in delta_fair, which is then added to fair_clock. calc_delta_mine() corrects delta_exec according to the nice value (stored in curr->load.weight) and the CPU load, and stores the result in delta_mine; according to the comments in the source code, delta_mine is the CPU time the current process was actually entitled to.

delta_fair is then added to fair_clock, and delta_mine - delta_exec, a negative quantity, is added to wait_runtime via add_wait_runtime(), which therefore reduces wait_runtime. Since the calc_delta_xx() functions make only small corrections to delta_exec, we can ignore them for the purposes of this discussion: approximately, fair_clock grows while wait_runtime shrinks by roughly delta_exec, so the key fair_clock - wait_runtime increases. The larger key causes the current process to move toward the right when it is reinserted into the red-black tree.

CFS Summary

The discussion above shows that CFS changes the previous scheduler substantially: the priority arrays are replaced by a red-black tree, the dynamic priority policies are replaced by a completely fair policy, and a module manager is introduced; about 70% of the code of the original Linux 2.6.0 scheduler module is modified. The structure is simpler and more flexible, and the algorithm is more adaptable. Compared with RSDL, both are based on the principle of complete fairness, but their implementations are entirely different; by comparison, CFS is clearer, simpler, and more extensible.

CFS also has another important property: fine scheduling granularity. In schedulers before CFS, a process is preempted only after it has used up its time slice or its time quota, unless it calls a blocking function and yields voluntarily. CFS checks at every tick and preempts the current process as soon as it is no longer the leftmost node of the red-black tree. On heavily loaded servers, better scheduling behaviour can be obtained by tuning the scheduling granularity.

4 Summary

Following the development of the Linux scheduler gives a better understanding of the background against which CFS was developed. In fact, no scheduling algorithm yet satisfies the needs of all applications, and CFS has received some negative test reports as well. We believe that as Linux continues to develop, new scheduling algorithms will appear; let us wait and see. Aspiring programmers can also try to contribute to Linux in this area.
