The Linux kernel uses a scheduling algorithm called CFS (Completely Fair Scheduler). Most descriptions found on the internet are unintuitive and hard to read, but I found one that is very easy to understand (simplicity at its finest!):
http://people.redhat.com/mingo/cfs-scheduler/sched-design-CFS.txt
To guard against the link going dead, the full text is pasted below:
This is the CFS scheduler.

80% of CFS's design can be summed up in a single sentence: CFS basically models an "ideal, precise multi-tasking CPU" on real hardware.

"Ideal multi-tasking CPU" is a (non-existent :-)) CPU that has 100% physical power and which can run each task at precise equal speed, in parallel, each at 1/nr_running speed. For example: if there are 2 tasks running then it runs each at 50% physical power - totally in parallel.

On real hardware, we can run only a single task at once, so while that one task runs, the other tasks that are waiting for the CPU are at a disadvantage - the current task gets an unfair amount of CPU time. In CFS this fairness imbalance is expressed and tracked via the per-task p->wait_runtime (nanosec-unit) value. "wait_runtime" is the amount of time the task should now run on the CPU for it to become completely fair and balanced.

( Small detail: on 'ideal' hardware, the p->wait_runtime value would always be zero - no task would ever get 'out of balance' from the 'ideal' share of CPU time. )

CFS's task picking logic is based on this p->wait_runtime value and it is thus very simple: it always tries to run the task with the largest p->wait_runtime value. In other words, CFS tries to run the task with the 'gravest need' for more CPU time. So CFS always tries to split up CPU time between runnable tasks as close to 'ideal multitasking hardware' as possible.

Most of the rest of CFS's design just falls out of this really simple concept, with a few add-on embellishments like nice levels, multiprocessing and various algorithm variants to recognize sleepers.

In practice it works like this: the system runs a task a bit, and when the task schedules (or a scheduler tick happens) the task's CPU usage is 'accounted for': the (small) time it just spent using the physical CPU is deducted from p->wait_runtime. [ Minus the 'fair share' of CPU time it would have gotten anyway. ]
Once p->wait_runtime gets low enough so that another task becomes the 'leftmost task' of the time-ordered rbtree it maintains (plus a small amount of 'granularity' distance relative to the leftmost task so that we do not over-schedule tasks and trash the cache), then the new leftmost task is picked and the current task is preempted.

The rq->fair_clock value tracks the 'CPU time a runnable task would have fairly gotten, had it been runnable during that time'. So by using rq->fair_clock values we can accurately timestamp and measure the 'expected CPU time' a task should have gotten. All runnable tasks are sorted in the rbtree by the "rq->fair_clock - p->wait_runtime" key, and CFS picks the 'leftmost' task and sticks to it. As the system progresses forward, newly woken tasks are put into the tree more and more to the right - slowly but surely giving every task a chance to become the 'leftmost task' and thus get on the CPU within a deterministic amount of time.

Some implementation details:

 - The introduction of Scheduling Classes: an extensible hierarchy of scheduler modules. These modules encapsulate scheduling policy details and are handled by the scheduler core without the core code assuming too much about them.

 - sched_fair.c implements the 'CFS desktop scheduler': it is a replacement for the vanilla scheduler's SCHED_OTHER interactivity code. I'd like to give credit to Con Kolivas for the general approach here: he has proven via RSDL/SD that 'fair scheduling' is possible and that it results in better desktop scheduling. Kudos Con!

   The CFS patch uses a completely different approach and implementation from RSDL/SD. My goal was to make CFS's interactivity quality exceed that of RSDL/SD, which is a high standard to meet :-) Testing feedback is welcome to decide this one way or another. All of SD's logic could be added via a kernel/sched_sd.c module as well, if Con were interested in such an approach.
   CFS's design is quite radical: it does not use runqueues, it uses a time-ordered rbtree to build a 'timeline' of future task execution, and thus has no 'array switch' artifacts (by which both the vanilla scheduler and RSDL/SD are affected).

   CFS uses nanosecond granularity accounting and does not rely on any jiffies or other HZ detail. Thus the CFS scheduler has no notion of 'timeslices' and has no heuristics whatsoever. There is only one central tunable: /proc/sys/kernel/sched_granularity_ns, which can be used to tune the scheduler from 'desktop' (low latencies) to 'server' (good batching) workloads. It defaults to a setting suitable for desktop workloads. SCHED_BATCH is handled by the CFS scheduler module too.

   Due to its design, the CFS scheduler is not prone to any of the 'attacks' that exist today against the heuristics of the stock scheduler: fiftyp.c, thud.c, chew.c, ring-test.c, massive_intr.c all work fine, do not impact interactivity and produce the expected behavior.

   The CFS scheduler has a much stronger handling of nice levels and SCHED_BATCH: both types of workloads should be isolated much more aggressively than under the vanilla scheduler.

   ( Another detail: due to nanosec accounting and timeline sorting, sched_yield() support is very simple under CFS, and in fact under CFS sched_yield() behaves much better than under any other scheduler I have tested so far. )

 - sched_rt.c implements SCHED_FIFO and SCHED_RR semantics, in a simpler way than the vanilla scheduler does. It uses 100 runqueues (for all 100 RT priority levels, instead of 140 in the vanilla scheduler) and it needs no expired array.

 - Reworked/sanitized SMP load-balancing: the runqueue-walking assumptions are gone from the load-balancing code now, and iterators of the scheduling modules are used. The balancing code got quite a bit simpler as a result.
When you have time, it is worth reading the books "Linux Kernel Development" and "Understanding the Linux Kernel"; both discuss this topic and include code examples.
Of course, there is far more to Linux scheduling than this one algorithm. To be continuously updated.
:)
Linux: the scheduling algorithm in the Linux kernel