Linux Process Scheduling chapter: Some Notes

Last Update:2018-12-07 Source: Internet

Author: User

Over the past few days I have read a lot of articles about Linux process scheduling. Although I have grasped the rough idea, the more details I read, the less confident I feel about writing this article. A series of articles on Linux process scheduling matches my current state of mind very well: like a blind man feeling his way, learning something is very pleasant, but there is always a kind of fear. Still, I have not written a blog post for a long time, so I need to write one. Please point out and correct anything that is wrong.

The CPU is a valuable resource. Linux is a multitasking OS, which means every task or process must be given the impression that it has a CPU of its own. Think about how hard process switching becomes if there is only one CPU but 80 processes in the TASK_RUNNING state. If that difficulty is hard to picture, imagine having 80 girlfriends at the same time, each with a different personality and different hobbies, and trying to make every one of them think she is your only girlfriend.

Liu Ming has an interesting analogy in "Introduction to Linux scheduler BFS": the Linux kernel scheduler is like a housewife in an awkward position. Satisfying the children's requests for dinner may spoil the appetite of the elderly, and it is very hard to cook one meal that pleases everyone. Linux is a general-purpose operating system and cannot predict the characteristics of the processes that will run on it, just as tastes within a family differ: the children like sweet food, Dad likes salty food, and the elderly prefer light flavors, so the scheduler has a hard time pleasing all of them.

Generally, processes can be divided into several types based on their characteristics:

1. Interactive processes
This type is relatively easy to understand. Think about editing text in vi: there is a large amount of human-computer interaction, and the process keeps going to sleep while it waits for your input. CPU usage is low, but a fast response is required. If you press a key and it takes 10 seconds to show up in the editor, the user experience is terrible.

2. Batch processes
The most typical example is compiling a project. To build a large project we run make, and it may take an hour to finish, but nobody stares at the screen during the build; we care about the result. This kind of process is the unloved child: it has to keep running, but it does not need much attention or many resources.

3. Real-time processes
The original meaning of a real-time process is that a given job must be completed within a specified period of time. It does not have to be fast, but it must finish before the deadline. Such processes are used in precision systems such as rockets and missiles. This kind of real-time is called hard real-time. For a general-purpose operating system such as Linux, hard real-time is difficult to achieve: interrupts, disk seeks, bus contention, locks, and many other mechanisms introduce too much uncertainty. The real-time processes Linux speaks of are soft real-time: the system does its best to meet the deadline, and if it fails, nothing catastrophic happens. For example, if decoding in a video player is a little slow, playback stutters and the user experience is poor, but the machine is not destroyed.

Different types of processes are served by different scheduling policies:

For real-time processes, the real-time scheduling policies are used: FIFO (SCHED_FIFO) or round robin (SCHED_RR, time-slice rotation);
For batch and interactive processes, the CFS scheduler is used.

In earlier versions, when Linux used the O(1) scheduler, there were complicated heuristics to identify interactive processes and reward them, and the algorithm was complex. Since the CFS scheduler was adopted, the core idea is simply "completely fair", which reduces code complexity.

Let's talk about real-time scheduling first. Real-time scheduling has two dimensions: priority and scheduling policy. A Linux user program can call sched_setscheduler to set the priority and scheduling policy of a process.

Inside the kernel, real-time processes occupy priorities 0 to 99, 100 levels in total, where 0 is the highest priority and 99 is the lowest, and each level has its own queue. Note that the sched_priority value passed to sched_setscheduler from user space runs the other way: valid values for SCHED_FIFO and SCHED_RR are 1 to 99, and a larger value means a higher priority. As long as a runnable high-priority real-time process exists, lower-priority real-time processes get no CPU time, and ordinary processes get none either.

Next comes the scheduling policy. One policy is FIFO, first in, first out. If the highest-priority process is a FIFO process, it keeps running until it exits, blocks, calls sched_yield to give up the CPU voluntarily, or is preempted by a newly runnable process with a higher priority. The other policy is time-slice rotation (round robin). If the highest-priority real-time queue contains several processes, the current process is moved to the tail of that queue once it has used up its time slice, and the process at the head of the same queue is selected to run. This work is done in the task_tick_rt function.

As mentioned above, while a real-time process is runnable, ordinary processes get no CPU time at all. After group scheduling was added to Linux, two parameters were introduced: /proc/sys/kernel/sched_rt_period_us and /proc/sys/kernel/sched_rt_runtime_us. They mean that within each period of sched_rt_period_us, the total running time of all real-time processes may not exceed sched_rt_runtime_us. The default values are 1 second (1000000 us) and 0.95 seconds (950000 us). In other words, within any one second, real-time processes may run for at most 0.95 seconds in total, and the remaining 0.05 seconds are left to ordinary processes.

The CFS scheduler used for ordinary processes is a relatively independent topic.
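The real-time rules above (priority first, then FIFO or round robin within one priority level) can be sketched as a small simulation. This is only a toy model, not kernel code: the class RtRunQueue, its methods, and the time slice of a few ticks are made up for illustration. The only facts borrowed from the kernel are that there are 100 real-time levels and that a user-space sched_priority p corresponds to the internal value 99 - p.

```python
from collections import deque

MAX_RT_PRIO = 100  # number of real-time priority levels, one queue per level


def kernel_prio(sched_priority):
    """Map a user-space sched_priority (1..99, larger is more urgent)
    to the kernel-internal value (smaller is more urgent)."""
    if not 1 <= sched_priority <= 99:
        raise ValueError("real-time sched_priority must be in 1..99")
    return MAX_RT_PRIO - 1 - sched_priority


class RtRunQueue:
    """Toy model of one CPU's real-time run queue (hypothetical class)."""

    def __init__(self, rr_timeslice=3):
        self.queues = [deque() for _ in range(MAX_RT_PRIO)]
        self.rr_timeslice = rr_timeslice  # ticks per slice, an arbitrary value

    def enqueue(self, name, sched_priority, policy):
        """Add a task to the tail of the queue for its priority level."""
        task = {"name": name, "prio": kernel_prio(sched_priority),
                "policy": policy, "slice": self.rr_timeslice}
        self.queues[task["prio"]].append(task)

    def pick_next(self):
        """Return the head of the most urgent non-empty queue, or None."""
        for queue in self.queues:
            if queue:
                return queue[0]
        return None

    def tick(self):
        """Charge one timer tick to the running task and return its name."""
        task = self.pick_next()
        if task is None:
            return None
        if task["policy"] == "RR":
            task["slice"] -= 1
            if task["slice"] == 0:
                # Slice used up: refill it and move the task to the tail.
                task["slice"] = self.rr_timeslice
                self.queues[task["prio"]].rotate(-1)
        # A FIFO task is never rotated: it stays at the head until removed.
        return task["name"]


rq = RtRunQueue(rr_timeslice=2)
rq.enqueue("a", 10, "RR")
rq.enqueue("b", 10, "RR")
rq.enqueue("low", 5, "FIFO")
timeline = [rq.tick() for _ in range(5)]
print(timeline)
```

In this run the two round-robin tasks at the same level take turns, while the lower-priority task never gets a tick, which is exactly the starvation that sched_rt_runtime_us is meant to bound.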
If time permits, I will cover CFS in a separate blog post.

2. Process scheduling in the SMP era

What makes things worse is that we now live in the multi-core era. Leaving aside high-end servers, even my modestly configured laptop is already quad-core; multi-core machines are everywhere. SMP introduces a new problem. With a single core there is only one CPU and therefore nothing to choose: there is one run queue. With multiple cores a question arises: should all CPUs pick runnable processes from one shared run queue, or should each CPU have its own run queue? In fact, either approach works.

1. Per-CPU run queue
The Linux kernel currently uses per-CPU run queues. Each CPU selects runnable processes from its own run queue, which reduces contention. Besides this benefit there is another, perhaps even more important one: cache reuse. A process sits in the run queue of one CPU, so after being scheduled several times (ignoring load balancing between CPUs for the moment) it is still run by the same CPU, and the memory it touched during its last run may still be in that CPU's cache. With a single run queue shared by several CPUs, the CPU that ran the process last time is probably not the CPU that runs it now, so the cache contents from the last run cannot be reused.

2. Single run queue
All CPUs share the same queue, which also works. Some people consider it a poor choice, because while one CPU is accessing the run queue the other CPUs cannot access it, and one big lock reduces performance, especially when there are many CPUs. If you have 1024 CPUs but only one run queue, the run queue is an obvious bottleneck.

At this point many readers may say: what is there to agonize over? Per-CPU run queues must be better. However, reality is much more complicated than theory.
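The cache-reuse argument for per-CPU run queues can be illustrated with a deliberately crude model. The two functions below are hypothetical and count how often a task runs on the same CPU as the previous time, which stands in for "the cache may still be warm". The model assumes every task runs for exactly one slice per round and that the per-CPU variant never migrates tasks; real schedulers are far more subtle.

```python
from collections import deque


def affinity_hits_single_queue(tasks, n_cpus, rounds):
    """All CPUs pop from one shared queue.
    Count the runs that land on the same CPU as the task's previous run."""
    queue = deque(tasks)
    last_cpu = {}
    hits = 0
    for _ in range(rounds):
        running = []
        for cpu in range(n_cpus):
            if not queue:
                break
            task = queue.popleft()
            if last_cpu.get(task) == cpu:
                hits += 1
            last_cpu[task] = cpu
            running.append(task)
        queue.extend(running)  # after their slice, tasks go back to the tail
    return hits


def affinity_hits_per_cpu(tasks, n_cpus, rounds):
    """Each CPU owns a queue; tasks are dealt out once and never migrate."""
    queues = [deque(tasks[cpu::n_cpus]) for cpu in range(n_cpus)]
    last_cpu = {}
    hits = 0
    for _ in range(rounds):
        for cpu, queue in enumerate(queues):
            if not queue:
                continue
            task = queue.popleft()
            if last_cpu.get(task) == cpu:
                hits += 1
            last_cpu[task] = cpu
            queue.append(task)
    return hits


tasks = ["A", "B", "C"]
shared = affinity_hits_single_queue(tasks, n_cpus=2, rounds=6)
per_cpu = affinity_hits_per_cpu(tasks, n_cpus=2, rounds=6)
print("shared queue hits:", shared, "per-CPU queue hits:", per_cpu)
```

Three tasks on two CPUs is a contrived worst case for the shared queue, since every task alternates between the CPUs. The model also shows the price of per-CPU queues: one CPU holds two tasks and the other only one, an imbalance that nothing in this sketch corrects.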
BFS (Brain Fuck Scheduler) is a single-queue scheduler, and it works very well for desktop applications. If you are interested, read "Introduction to Linux scheduler BFS" and "BFS: The Fast Future of Linux Desktop".

Per-CPU run queues are not flawless either. Since there are multiple queues, they can become unbalanced. For example, one CPU may be busy while another has nothing to do, which is a waste of CPU resources. To solve this problem, the kernel has to perform load balancing between CPUs. Balancing load between two CPUs requires taking the locks of both run queues, which hurts concurrency, and load balancing itself also consumes CPU time.

From our hardware knowledge: if I have two physical CPUs and each physical CPU has two cores, then I have 4 cores. Each core can provide multiple hardware threads, or virtual CPUs, through SMT (simultaneous multi-threading) technologies such as Intel's Hyper-Threading. If each core offers two virtual CPUs, my hardware has eight virtual CPUs. These virtual CPUs are not all equally closely related. To cope with multi-core and multi-CPU systems, Linux introduces the concept of the scheduling domain, which divides CPUs into scopes according to how closely they are related. From nearest to farthest there are four levels:

1. Hyper-threads: virtual CPUs derived from the same core through hyper-threading. We know that the CPU is much faster than memory. On a cache miss the CPU has nothing to do while it waits for memory, which is a waste of resources, so the core can switch to another hardware thread. In this way several threads share one core. To be honest, I don't really understand hyper-threading; I'll read the Intel manual another day.
2. Different cores of the same physical CPU.
3. Different physical CPUs on the same NUMA node.
4. Physical CPUs on different NUMA nodes.
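The four levels listed above can be expressed as a small function that reports how closely two virtual CPUs are related. This is a sketch under stated assumptions: CpuLocation, build_topology, and kinship_level are hypothetical names, and CPU ids are assumed to be numbered consecutively (thread, then core, then package, then node), which a real machine does not guarantee.

```python
from typing import NamedTuple


class CpuLocation(NamedTuple):
    node: int      # NUMA node
    package: int   # physical CPU (socket) within the node
    core: int      # core within the package
    thread: int    # hardware thread within the core


LEVELS = {
    1: "hyper-threads of the same core",
    2: "different cores of the same physical CPU",
    3: "different physical CPUs on the same NUMA node",
    4: "physical CPUs on different NUMA nodes",
}


def build_topology(nodes, packages_per_node, cores_per_package, threads_per_core):
    """Return a list of CpuLocation; the list index is the assumed CPU id."""
    return [
        CpuLocation(n, p, c, t)
        for n in range(nodes)
        for p in range(packages_per_node)
        for c in range(cores_per_package)
        for t in range(threads_per_core)
    ]


def kinship_level(a, b):
    """Return 1..4 as in the list above, or 0 if a and b are the same CPU."""
    if a == b:
        return 0
    if a.node != b.node:
        return 4
    if a.package != b.package:
        return 3
    if a.core != b.core:
        return 2
    return 1


# Two physical CPUs on one node, two cores each, two hyper-threads per core.
topo = build_topology(1, 2, 2, 2)
for other in (1, 2, 4):
    level = kinship_level(topo[0], topo[other])
    print(f"cpu0 vs cpu{other}: level {level} ({LEVELS[level]})")
```

The smaller the returned level, the more cache the two CPUs are likely to share, and the cheaper a migration between them should be.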
NUMA (Non-Uniform Memory Access): CPUs and RAM are grouped into nodes. A CPU accesses the local RAM of its own node at a lower cost than RAM on other nodes.

The closer the relationship between two CPUs, the lower the cost of migrating a process between them. For example, migrating a process between two cores of the same physical CPU costs less than migrating it between different physical CPUs, because some caches can continue to be used.

Load balancing means migrating a process from one CPU (CPU in the broad sense) to another. If a process moves from one hyper-thread of a core to the other hyper-thread of the same core, it hardly matters, because the cost is low. It is like moving from the fourth floor of an office building to the third floor. But the farther apart the CPUs are, the higher the price. It is like working in Nanjing in the morning and being told in the afternoon to relocate to Urumqi, so this kind of migration should happen as rarely as possible. The Linux kernel follows the same principle.

OK, back to scheduling domains. Suppose we have two physical CPUs, each with two cores, and each core has two virtual CPUs provided by hyper-threading. Take a look at the figure below (copied from "Linux Kernel SMP Load Balancing"):

[Figure: scheduling domains for two physical CPUs, each with two cores and two hyper-threads per core]

Each virtual CPU belongs to a chain of scheduling domains (sched_domain), one per level, just as I am in Yuhua District, in Nanjing, in Jiangsu, and in China: the different regions I belong to reflect different levels of my location. Each scheduling domain is divided into several groups, just as China is divided into more than 30 provincial administrative units, each province into cities, and each city into districts.

This is only the SMP scheduling of ordinary processes. Now think about real-time processes.
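The idea that balancing should look nearby first and go far away only when necessary can be sketched for the eight-CPU example. Everything here is a simplification for illustration: domain_chain and pick_pull_source are hypothetical functions, the level names are invented, CPU ids are assumed to be consecutive within a core and within a package, and "load" is just a count of runnable tasks. The kernel's actual balancing works on groups and uses many more criteria.

```python
def domain_chain(cpu, threads_per_core=2, cores_per_package=2, packages=2):
    """Return [(level_name, span, groups), ...] for one CPU, nearest level first.
    span is the list of CPUs in the domain; groups splits the span into
    the units of the level below."""
    per_package = threads_per_core * cores_per_package
    total = per_package * packages
    core_start = cpu - cpu % threads_per_core
    package_start = cpu - cpu % per_package

    smt_span = list(range(core_start, core_start + threads_per_core))
    package_span = list(range(package_start, package_start + per_package))
    system_span = list(range(total))

    def split(span, size):
        return [span[i:i + size] for i in range(0, len(span), size)]

    return [
        ("smt", smt_span, split(smt_span, 1)),
        ("package", package_span, split(package_span, threads_per_core)),
        ("system", system_span, split(system_span, per_package)),
    ]


def pick_pull_source(cpu, load, imbalance=2):
    """Walk the domains from nearest to farthest and return (level, busiest_cpu)
    for the first level where some CPU is at least `imbalance` tasks busier
    than `cpu`. Return None if every level looks balanced."""
    for level, span, _groups in domain_chain(cpu):
        busiest = max(span, key=lambda c: load[c])
        if load[busiest] - load[cpu] >= imbalance:
            return level, busiest
    return None


# cpu1 (the hyper-thread sibling of cpu0) and cpu4 (other package) are both busy.
load = [0, 3, 0, 0, 5, 0, 0, 0]
print(pick_pull_source(0, load))
print(domain_chain(5))
```

In this example cpu0 pulls from its sibling cpu1 even though cpu4 is busier, because the nearer domain already shows enough imbalance and the migration is cheaper. Only when the near domains are balanced does the search reach the other package.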
Each CPU has its own queue, so the highest-priority real-time process on CPU-A may have a lower priority than the second-highest-priority real-time process on CPU-B. In that case the waiting real-time process on CPU-B has to be pulled over to CPU-A to run; otherwise a lower-priority real-time process would wrongly run first. This is what pull_rt_task does.

References
1. Linux Kernel SMP Load Balancing
2. Linux Process Scheduling
3. Linux Scheduler
4. Introduction to Linux scheduler BFS
5. BFS: The Fast Future of Linux Desktop

This article is from http://blog.chinaunix.net/uid-24774106-id-3372932.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
