In-depth analysis of Linux scheduling mechanisms

Source: Internet
Author: User

One. Description

Taking linux-2.4.10 as an example, this article analyzes the schedule() function and its related functions in the Linux process scheduling module. The necessary background knowledge is also explained. The platform assumed throughout is a PC with the i386 architecture.

Two. Prerequisite knowledge

Before analyzing schedule(), it is necessary to briefly explain the system boot process and memory allocation. This leads naturally into the scheduling module.

First, consider the dependencies between the functional modules of Linux: process scheduling is the core of the entire kernel. This part first explains how a PC loads the operating system from the hard disk into memory and starts the process scheduling module. The detailed analysis of schedule() follows afterwards.

Booting the operating system involves three files: arch/i386/boot/bootsect.S, arch/i386/boot/setup.S, and arch/i386/boot/compressed/head.S. After a Linux system has been compiled and installed, the bootsect.S module is placed in the first sector of the bootable device (the disk boot sector, 512 bytes). The boot process then begins, during which the code from these three files is loaded into memory and moved between locations.

After this series of steps, execution jumps to the initialization code in the system module, that is, the init/main.c file. This code performs a series of initialization tasks, such as register initialization, memory initialization, and interrupt setup, after which the memory layout is fixed.

From then on, the CPU reads the program from memory sequentially and executes it. After main has moved from kernel mode to user mode, the operating system has established task 0. From this point the scheduling module takes over process creation (fork), dispatch (schedule), termination (exit), and the allocation and management of resources for the whole Linux operating system. Note that the first process created is init (pid=1), which is not the same thing as the init/main.c code mentioned above. On a Debian GNU/Linux system, the init process runs the run-command scripts under rcS.d and rcN.d (rc0.d to rc6.d) in order, followed by rc.local. At that point system initialization is complete, a series of system services have been started, and the system enters single-user or multi-user mode. Following /etc/inittab, init also starts the terminal devices (by exec'ing getty) so that users can log in. Debian, for example, starts six ttys, which can be switched with the key combination Ctrl+Alt+Fn (F1 to F6).

We now know how Linux starts the process scheduling module, that the first process started is init, and how system initialization and login proceed afterwards. What follows is an analysis of the schedule() code and the functions it calls.

Three. Data structures involved in process scheduling

File: include/linux/sched.h

The following briefly introduces two fields of the task_struct data structure.

In Linux, every process is described by one core data structure (Linux uses lightweight processes to emulate threads). A process is represented in the kernel by a task_struct structure, which holds a large amount of information describing the process, including the following scheduler-related fields:

1. state

volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */

Linux process states fall into three main categories: runnable (TASK_RUNNING, which covers both the running state and the ready state); suspended (TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, and TASK_STOPPED); and no longer runnable (TASK_ZOMBIE). The scheduler mainly deals with processes in the runnable and suspended categories. TASK_STOPPED is used specifically in response to stop signals such as SIGSTOP, and TASK_ZOMBIE describes a "zombie" process that has exited but has not yet been reaped by its parent.

2. counter

long counter;

This field records how much time (in clock ticks) the process is still allowed to run within its current time slice.
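As a minimal sketch of how these two fields work together, the following user-space C fragment models a stripped-down task that has only state and counter. The state constants use the values from the 2.4 sched.h; the struct name mini_task and the helper tick() are hypothetical and exist only for illustration, they are not kernel code.

```c
#include <assert.h>

/* Process state values as defined in the 2.4 include/linux/sched.h. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_ZOMBIE           4
#define TASK_STOPPED          8

/* Hypothetical, heavily reduced stand-in for task_struct. */
struct mini_task {
    volatile long state;  /* -1 unrunnable, 0 runnable, >0 stopped */
    long counter;         /* clock ticks left in the current time slice */
};

/*
 * Model one clock tick charged to the running task.
 * Returns 1 when the time slice is used up, 0 otherwise.
 */
static int tick(struct mini_task *t)
{
    assert(t->state == TASK_RUNNING);  /* only a running task is charged */
    if (t->counter > 0)
        t->counter--;
    return t->counter == 0;
}
```

A task that starts with counter = 2 survives one tick and is reported as exhausted on the second.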

Four. Ready process selection algorithm (that is, the process scheduling algorithm)

File: kernel/sched.c

1. Context switches

Switching from the context of one process to that of another happens so frequently that it is often the key to the scheduler's efficiency. The schedule() function invokes the switch_to macro, which performs the actual switch between processes; its code is in include/asm-i386/system.h. The macro is written in inline assembly, which makes it harder to follow.

When switch_to() completes, the return address on the stack is the thread.eip stored in the new process's task_struct, that is, the position at which the new process was set to resume when it was last suspended (the label "1:" from its previous execution of switch_to()). From then on, execution continues in the context of the new process.

This also involves functions such as wake_up() and sleep_on(), which wake processes up and put them to sleep.
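The idea that a suspended task later resumes exactly where it was switched out can be tried in user space. The sketch below is only an analogy: it uses the POSIX ucontext API (getcontext, makecontext, swapcontext) instead of the kernel's switch_to macro, and the names task_body, run_switch_demo, and trace are hypothetical.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;   /* saved contexts of the two "tasks" */
static char task_stack[64 * 1024];      /* private stack for the second task */
static int trace[4];                    /* order in which the steps executed */
static int steps;

/* Body of the second task: runs, switches away, and later resumes here. */
static void task_body(void)
{
    trace[steps++] = 1;
    swapcontext(&task_ctx, &main_ctx);  /* suspend; resume point is saved */
    trace[steps++] = 3;                 /* continues right after the switch */
}

/* Switch back and forth twice and return the number of recorded steps. */
static int run_switch_demo(void)
{
    steps = 0;
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;       /* where to go when task_body returns */
    makecontext(&task_ctx, task_body, 0);

    swapcontext(&main_ctx, &task_ctx);  /* first switch: the task starts */
    trace[steps++] = 2;
    swapcontext(&main_ctx, &task_ctx);  /* second switch: the task resumes */
    trace[steps++] = 4;
    assert(steps == 4);
    return steps;
}
```

After run_switch_demo() the trace array holds 1, 2, 3, 4: the second switch does not restart task_body but continues it after its own swapcontext() call, which is the same property the "1:" label gives switch_to().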

2. Selection algorithm

The Linux schedule() function iterates over all processes in the run queue, calls goodness() to calculate a weight for each of them, and selects the process with the highest weight to run.
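The selection loop can be sketched in user-space C as follows. This is a simplified model loosely based on the 2.4 goodness() function: it keeps the remaining time slice, the nice value, and the real-time bonus, and leaves out the SMP and shared-address-space adjustments. The mini_task structure and the pick_next() helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_OTHER 0
#define SCHED_FIFO  1
#define SCHED_RR    2

/* Hypothetical reduced task descriptor; field names follow task_struct. */
struct mini_task {
    int  policy;       /* SCHED_OTHER, SCHED_FIFO or SCHED_RR */
    long counter;      /* ticks left in the current time slice */
    long nice;         /* -20 (favoured) to 19 */
    long rt_priority;  /* only meaningful for real-time policies */
};

/* Simplified weight: real-time tasks always outrank normal ones. */
static long goodness(const struct mini_task *p)
{
    if (p->policy != SCHED_OTHER)
        return 1000 + p->rt_priority;
    if (p->counter == 0)
        return 0;                       /* time slice used up */
    return p->counter + 20 - p->nice;
}

/* Scan the run queue and return the task with the highest weight. */
static const struct mini_task *pick_next(const struct mini_task *rq, size_t n)
{
    const struct mini_task *next = NULL;
    long best = -1;
    size_t i;

    assert(rq != NULL || n == 0);
    for (i = 0; i < n; i++) {
        long w = goodness(&rq[i]);
        if (w > best) {
            best = w;
            next = &rq[i];
        }
    }
    return next;
}
```

With this weighting a normal task with a small remaining slice but a low nice value can still beat a task with more ticks left, and any real-time task beats every normal task.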

The Linux scheduler is implemented primarily in the schedule() function, whose workflow is as follows:

(1) Clean up the currently running process

(2) Select the next process to run (pick_next_task() in later kernels)

(3) Set up the environment in which the new process will run

(4) Perform the process context switch
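The four steps can be modelled in a few lines of user-space C. This is a sketch under simplifying assumptions, not the kernel's schedule(): the run queue is a fixed array, the context switch is reduced to changing an index, and BASE_SLICE is a hypothetical constant standing in for the nice-dependent slice length. The refill rule (halve the old counter, then add a fresh slice) follows the recalculation loop in the 2.4 scheduler.

```c
#include <assert.h>

#define TASK_RUNNING        0
#define TASK_INTERRUPTIBLE  1
#define NTASKS              3
#define BASE_SLICE          6   /* hypothetical default time slice in ticks */

struct mini_task {
    long state;
    long counter;
};

static struct mini_task tasks[NTASKS];
static int current_idx;         /* index of the task that owns the CPU */

/* Return the runnable task with the most ticks left, or -1 if none. */
static int pick_next(void)
{
    int i, next = -1;
    long best = -1;

    for (i = 0; i < NTASKS; i++) {
        if (tasks[i].state == TASK_RUNNING && tasks[i].counter > best) {
            best = tasks[i].counter;
            next = i;
        }
    }
    return next;
}

/* Simplified schedule(): returns the index of the task now running. */
static int schedule(void)
{
    int i, next;

    assert(current_idx >= 0 && current_idx < NTASKS);

    /* (1) A previous task that set a sleeping state is skipped below,
     *     because only TASK_RUNNING tasks count as being on the run queue. */

    /* (2) Select the next task; refill all slices if they are exhausted. */
    next = pick_next();
    if (next >= 0 && tasks[next].counter == 0) {
        for (i = 0; i < NTASKS; i++)
            tasks[i].counter = (tasks[i].counter >> 1) + BASE_SLICE;
        next = pick_next();
    }
    if (next < 0)
        return current_idx;     /* nothing runnable: keep the CPU */

    /* (3) and (4) Prepare and switch; here the switch is only an index. */
    current_idx = next;
    return current_idx;
}
```

Because sleeping tasks also take part in the refill, a task that slept through part of its slice comes back with a larger counter than one that used its slice up, which is how this design favours tasks that block often.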

Five. The Linux scheduler divides processes into three categories

Process scheduling is a core function of the operating system. The scheduler itself is only one part of it: process scheduling is a complex procedure that requires several subsystems to work together. This article focuses only on the scheduler, whose main task is to select the most appropriate process from all runnable (TASK_RUNNING) processes. As a general-purpose operating system, Linux divides processes into three categories:

1. Interactive processes

These processes involve a lot of human-computer interaction, so they spend much of their time asleep, waiting for user input. A typical example is the vi editor. Such processes demand fast response from the system; otherwise the user perceives the system as sluggish.

2. Batch processes

These processes need no human-computer interaction and run in the background. They consume a large amount of system resources but can tolerate response delays. Compilers are an example.

3. Real-time processes

Real-time processes have the strictest requirements on scheduling latency. They often perform very important operations that must be answered and executed immediately. Examples are video playback software and aircraft flight control systems. Such programs clearly cannot tolerate long scheduling delays: at best the playback quality suffers, at worst an aircraft is lost along with the lives on board.

Linux applies different scheduling policies to these categories. Real-time processes use a FIFO or round-robin policy. Normal processes have to be divided further into interactive and batch processes. Traditional Linux schedulers raise the priority of interactive applications so that they are scheduled sooner. The core idea of newer schedulers such as CFS and RSDL is "complete fairness". This design not only greatly reduces the code complexity of the scheduler but also gives better support for a variety of scheduling requirements.
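From user space these policies are visible through the POSIX scheduling interface in <sched.h>. The following sketch only names the three classic policies and asks the system how many static priority levels each one offers; the helper names policy_name() and priority_levels() are hypothetical.

```c
#include <assert.h>
#include <sched.h>

/* Map a POSIX scheduling policy constant to a readable name. */
static const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";   /* real-time, first in first out */
    case SCHED_RR:    return "SCHED_RR";     /* real-time, round robin */
    case SCHED_OTHER: return "SCHED_OTHER";  /* normal time-sharing */
    default:          return "unknown";
    }
}

/* Number of static priority levels of a policy, or -1 on error. */
static int priority_levels(int policy)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;
    assert(hi >= lo);
    return hi - lo + 1;
}
```

A program could call policy_name(sched_getscheduler(0)) to see which policy it is currently running under. Switching to a real-time policy with sched_setscheduler() normally requires elevated privileges, so the sketch only reads information.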

Six. Timing: when does scheduling occur? That is, when is the schedule() function called?

Scheduling occurs in two main ways:

1: Active scheduling (voluntary scheduling)

The scheduler function schedule() is called directly inside the kernel. When a process has to wait for a resource and temporarily cannot continue, it sets its own state to suspended (sleeping) and actively requests scheduling, giving up the CPU.

2: Passive scheduling (preemptive scheduling, forced scheduling)

User preemption (Linux 2.4 and 2.6)

Kernel preemption (Linux 2.6)

(1) User preemption occurs when returning to user space from a system call, and when returning to user space from an interrupt handler.

In both cases the kernel is about to return to user space; if the need_resched flag is set at that point, schedule() is called.

Active scheduling means that a user program triggers scheduling on its own initiative. One may ask whether user code can call schedule() directly. It cannot, but it can call wait4(). Even without looking at the wait4 code, we know that its effect is to suspend the parent process. Being suspended means not running and giving up the CPU, so process scheduling obviously takes place here. Indeed, the code contains these lines:

current->state = TASK_INTERRUPTIBLE; schedule();

The same applies to exit:

current->state = TASK_ZOMBIE; schedule();

Both cases can be recognized from the code (the state is changed to sleeping or zombie and schedule() is then called to pick another process, so the current process no longer occupies the CPU) as well as from their effect. In this sense a user program can trigger process scheduling itself.
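The effect described above can be observed from user space. In the sketch below the parent blocks until its child has exited, and the child passes through the zombie state until the parent collects its status. The portable POSIX call waitpid() is used here in place of wait4(), and the function name wait_for_child() is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits with `code`, then block in waitpid() until it
 * has done so. Returns the child's exit status, or -1 on failure.
 */
static int wait_for_child(int code)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: stays a zombie until reaped */

    /* Parent: sleeps inside the kernel until the child has exited. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status));
    return WEXITSTATUS(status);
}
```

While the parent sits in waitpid() it consumes no CPU time; the kernel has put it to sleep and scheduled something else, which is the voluntary case discussed in the text.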

(2) Kernel preemption: in a system that does not support kernel preemption, once a process or thread is running in kernel space it keeps running until it voluntarily gives up the CPU or its time slice runs out. As a result, a very urgent process or thread may be unable to run for a long time.

In systems that support kernel preemption, a higher-priority process or thread can preempt a lower-priority one that is running in kernel space.

To understand preemptive (forced) scheduling, note that after the CPU finishes the current instruction and before it executes the next one, it checks whether an interrupt or exception has occurred. If so, the priority of the incoming interrupt is compared with that of the work currently in progress. This is done with hardware support: for example, the 8259A interrupt controller determines priority by comparing register values, and the entry address of the interrupt service routine is also formed with hardware assistance (see the relevant literature for details). If the new request has the higher priority, the interrupt service routine is executed, and on return from the interrupt the scheduler function schedule() is executed.

Besides wait4 and exit (both of which are voluntary scheduling), schedule appears in one more place in the system code: the interrupt return code, where an additional restriction applies. The interrupt return code is the code that restores the interrupted context; every interrupt, whatever its source, passes through it. The code is:

    testl $(VM_MASK | 3),%eax    # return to VM86 mode or non-supervisor?
    jne ret_with_reschedule
    jmp restore_all

Note the condition in front of jne ret_with_reschedule. Without going into the details of the code, it means that ret_with_reschedule is executed only when the interrupt occurred while running in user space. So process scheduling can also take place just before an interrupt returns to user space.

In short, process scheduling occurs in two cases: just before an interrupt returns to user space, and when a user program voluntarily gives up the CPU.
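The test in the interrupt return path can be written out in C to make the condition explicit. This is a sketch that assumes the usual i386 conventions: the low two bits of the saved CS register hold the privilege level, and VM_MASK is the virtual-8086 bit of EFLAGS. The function names are hypothetical.

```c
#include <assert.h>

#define VM_MASK 0x00020000UL   /* EFLAGS bit for virtual-8086 mode */

/*
 * Mirror of the test in the interrupt return path: the low two bits of the
 * saved CS hold the privilege level (0 = kernel, 3 = user).
 * Returns 1 if the interrupted code was running in user or VM86 mode.
 */
static int returning_to_user(unsigned long eflags, unsigned long cs)
{
    return ((eflags & VM_MASK) | (cs & 3)) != 0;
}

/* Decide whether schedule() would be called on the way out of an interrupt. */
static int should_reschedule(unsigned long eflags, unsigned long cs,
                             int need_resched)
{
    assert(need_resched == 0 || need_resched == 1);
    return returning_to_user(eflags, cs) && need_resched;
}
```

With a user code segment selector such as 0x23 the check succeeds, while a kernel selector such as 0x10 takes the restore_all path regardless of the flag.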

Even in a system that supports kernel preemption, the kernel must not be preempted in the following situations:

(a) The kernel is running an interrupt handler. The scheduler function schedule() checks for this and prints an error message if it is called from within an interrupt.

(b) The kernel is processing a bottom half in interrupt context. Softirqs are executed before the hardware interrupt returns, while still in interrupt context.

(c) The process holds a spinlock, a read-write lock (write lock or read lock), or a similar lock. While holding such a lock it should not be preempted; otherwise other CPUs may be unable to acquire the lock for a long time and may deadlock.

(d) The kernel is executing the scheduler itself.

To guarantee that the kernel is not preempted in these situations, a preemptible kernel uses a variable called preempt_count, the kernel preemption count. The variable is stored in the process's thread_info structure. Whenever the kernel enters one of the states above, preempt_count is incremented by 1, indicating that preemption is not allowed; whenever it leaves such a state, preempt_count is decremented by 1.

Kernel preemption can occur:

1: When an interrupt handler completes, before returning to kernel space

2: When kernel code becomes preemptible again, for example when releasing a lock or enabling softirqs.
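The counting scheme can be illustrated with a small user-space model. The names preempt_disable() and preempt_enable() follow the kernel's terminology, but the signatures, the mini_thread_info structure, and the return value are simplifications made for this sketch.

```c
#include <assert.h>

/* Hypothetical reduced thread_info with the two fields discussed above. */
struct mini_thread_info {
    int preempt_count;  /* > 0 means preemption is currently forbidden */
    int need_resched;   /* 1 when a reschedule has been requested */
};

/* Enter a region that must not be preempted (lock held, interrupt, ...). */
static void preempt_disable(struct mini_thread_info *ti)
{
    ti->preempt_count++;
}

/* Leave one level; returns 1 if a pending reschedule may now run. */
static int preempt_enable(struct mini_thread_info *ti)
{
    assert(ti->preempt_count > 0);   /* must pair with preempt_disable() */
    ti->preempt_count--;
    return ti->preempt_count == 0 && ti->need_resched;
}
```

Because the count nests, releasing an inner lock does not make the code preemptible while an outer lock is still held; only the call that brings the count back to zero can trigger the pending reschedule.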

Scheduling flag: TIF_NEED_RESCHED

Purpose: the kernel provides a need_resched flag to indicate whether scheduling needs to be performed again.

Set: the flag is set when a process has used up its time slice.

The flag is also set when a higher-priority process becomes runnable.

Concurrency among processes cannot depend on processes giving up the CPU of their own accord; it has to be enforced by interrupts (the clock interrupt).
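The two situations in which the flag is set can be sketched as follows. This is a user-space model with hypothetical names (mini_task, timer_tick, wake_up_task) and a single numeric priority field chosen for illustration; it only shows when the flag is raised, not the scheduling that follows.

```c
#include <assert.h>

struct mini_task {
    long counter;       /* ticks left in the current time slice */
    long priority;      /* hypothetical: larger value = more urgent */
    int  need_resched;  /* set to request a call to schedule() */
};

/* Clock interrupt: charge one tick to the running task. */
static void timer_tick(struct mini_task *curr)
{
    assert(curr->counter >= 0);
    if (curr->counter > 0)
        curr->counter--;
    if (curr->counter == 0)
        curr->need_resched = 1;   /* time slice exhausted */
}

/* A task becomes runnable: request a reschedule if it outranks curr. */
static void wake_up_task(struct mini_task *curr, const struct mini_task *woken)
{
    if (woken->priority > curr->priority)
        curr->need_resched = 1;
}
```

Neither function calls the scheduler directly. They only record the request, and the actual call to schedule() happens later at one of the points described earlier, such as the return to user space.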

Seven. Kernel scheduling and understanding the kernel

1. Is kernel scheduling a task?

A: No. Kernel scheduling is best described as a task scheduling algorithm. It does not run all the time; it executes only when a task ends or a time slice expires, in order to select the next task to run.

2. What is the relationship between a task and the kernel?

A: A task runs under the management of the kernel; put differently, the kernel is the environment in which the task runs.

Kernel scheduling is only one part of the kernel's functionality. The kernel itself is not scheduled. It can be regarded as always running, mainly within and between tasks, and it handles the resources that tasks need.

3. How does the kernel relate to the highest-priority task that is running?

A: However high its priority, a task runs in the kernel environment. The kernel is always present; it simply assigns the CPU and other resources to the task and lets it run.

4. What is the kernel?

A: The kernel is not a process, nor is it a separate path of execution.

The kernel is woven into applications through the APIs it provides. In other words, the kernel is an abstraction: it does not exist as an independent running entity, but runs at specific times and under specific conditions to provide applications with services.
