7. What exactly does switch_to do?

Source: Internet
Author: User

In Lecture Six (Fork + Execve: The birth of a process) we covered how a process is created. An operating system runs many processes, though, so how does it switch between them, and what goes on underneath? Let's go back to the source code and read it carefully.

Operating system textbooks describe a large number of process scheduling algorithms. From an implementation standpoint, every one of them simply picks a new process from the run queue; they differ only in the policy used to make that choice.

For understanding how the operating system works, the timing of process scheduling and the mechanism of process switching matter more than the choice of algorithm.

Timing of process scheduling:

1. Interrupt handling (including clock interrupts, I/O interrupts, system calls, and exceptions) either calls schedule() directly, or calls schedule() on the way back to user mode if the need_resched flag is set;

2. A kernel thread can call schedule() directly to switch processes, and it can also be scheduled during interrupt handling. In other words, a kernel thread, as a special kind of process, can be scheduled both actively and passively;

3. A user-mode process cannot schedule actively. It can only be scheduled at some point after it has entered kernel mode, that is, during interrupt handling.
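The passive path in the list above can be sketched in user space. This is a minimal sketch, assuming a made-up toy_task with a need_resched flag and a three-tick time slice; none of the toy_ names exist in the kernel. The point it illustrates is that the clock interrupt only sets the flag, and the actual rescheduling happens later, on the way back to user mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for per-task scheduler state. */
struct toy_task {
    bool need_resched;   /* set when the task should give up the CPU */
    int  time_slice;     /* remaining clock ticks */
};

static int toy_resched_count;   /* how many times the toy scheduler ran */

/* Stand-in for schedule(): here it only clears the flag and refills the slice. */
static void toy_resched_now(struct toy_task *t)
{
    assert(t->need_resched);    /* only called when the flag is set */
    toy_resched_count++;
    t->need_resched = false;
    t->time_slice = 3;          /* arbitrary refill chosen for the sketch */
}

/* Clock interrupt handler: marks the task, but does not switch. */
static void toy_timer_tick(struct toy_task *t)
{
    if (t->time_slice > 0 && --t->time_slice == 0)
        t->need_resched = true;
}

/* Runs on the way back to user mode after an interrupt or system call. */
static void toy_return_to_user(struct toy_task *t)
{
    if (t->need_resched)
        toy_resched_now(t);
}

/* Deliver `ticks` clock interrupts and report how often rescheduling ran. */
static int toy_run_ticks(int ticks)
{
    struct toy_task t = { .need_resched = false, .time_slice = 3 };

    toy_resched_count = 0;
    for (int i = 0; i < ticks; i++) {
        toy_timer_tick(&t);
        toy_return_to_user(&t);
    }
    return toy_resched_count;
}
```

With a three-tick slice, toy_run_ticks(7) reschedules on the third and sixth tick, while toy_run_ticks(2) never does.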

Switching of processes

To control process execution, the kernel must be able to suspend the process currently running on the CPU and resume a process that was suspended earlier. This is called process switching, task switching, or context switching;

Suspending the process running on the CPU is different from saving state when an interrupt arrives: before and after an interrupt, execution stays within the same process context and merely moves from user mode to kernel mode;

The process context contains all the information the process needs in order to run:

1. User address space: including program code, data, user stack, etc.

2. Control information: process descriptor, kernel stack, etc.

3. Hardware context (note that an interrupt also saves the hardware context, only by a different method)

The schedule() function selects a new process to run and calls context_switch() to perform the context switch, which in turn invokes the switch_to macro for the critical part of the switch.

    next = pick_next_task(rq, prev);    /* the scheduling algorithm is encapsulated inside this function */

    context_switch(rq, prev, next);     /* process context switch */

switch_to uses the two parameters prev and next: prev points to the current process, and next points to the process that has been selected to run.
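The pick-then-switch shape described above can be sketched as a toy model. This is a sketch under stated assumptions, not the kernel's code: toy_rq, toy_task, and the round-robin policy are invented for illustration, and the toy context switch only updates an index where the real one saves and restores state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; the real struct rq and task_struct are far larger. */
struct toy_task {
    int pid;
    int runnable;
};

struct toy_rq {
    struct toy_task *tasks;
    size_t nr;                  /* number of tasks on this run queue */
    size_t curr;                /* index of the running task */
    unsigned long nr_switches;  /* counts real switches, as rq->nr_switches does */
};

/* Policy (assumed round robin): first runnable task after the current one. */
static size_t toy_pick_next_task(const struct toy_rq *rq)
{
    for (size_t step = 1; step <= rq->nr; step++) {
        size_t i = (rq->curr + step) % rq->nr;
        if (rq->tasks[i].runnable)
            return i;
    }
    return rq->curr;            /* nothing runnable: keep the current task */
}

/* Mechanism: the real context_switch() would save prev and restore next. */
static void toy_context_switch(struct toy_rq *rq, size_t prev, size_t next)
{
    (void)prev;                 /* prev's state is not modeled in this sketch */
    rq->curr = next;
}

/* Shape of __schedule(): pick next, and switch only if it differs from prev. */
static int toy_schedule(struct toy_rq *rq)
{
    size_t prev = rq->curr;
    size_t next = toy_pick_next_task(rq);

    assert(next < rq->nr);
    if (prev != next) {
        rq->nr_switches++;
        toy_context_switch(rq, prev, next);
    }
    return rq->tasks[rq->curr].pid;
}
```

Keeping the policy in toy_pick_next_task() and the mechanism in toy_context_switch() mirrors the split the article relies on: the policy could be replaced without touching the switch itself.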

Let's start with the schedule() function and step through it to see what the kernel does when it switches processes. The schedule() function is located in the linux-3.18.6\kernel\sched\core.c file.

    asmlinkage __visible void __sched schedule(void)
    {
            struct task_struct *tsk = current;

            sched_submit_work(tsk);
            __schedule();
    }


__visible marks schedule() as externally visible, so the symbol stays available to callers throughout the kernel. The last thing schedule() does is call __schedule(), so we step into __schedule(), which is also located in the linux-3.18.6\kernel\sched\core.c file.

After entering __schedule(), the call stack is: schedule() -> __schedule().


    static void __sched __schedule(void)
    {
            struct task_struct *prev, *next;
            unsigned long *switch_count;
            struct rq *rq;
            int cpu;

    need_resched:
            preempt_disable();
            cpu = smp_processor_id();
            rq = cpu_rq(cpu);
            rcu_note_context_switch(cpu);
            prev = rq->curr;

            schedule_debug(prev);

            if (sched_feat(HRTICK))
                    hrtick_clear(rq);

            /*
             * Make sure that signal_pending_state()->signal_pending() below
             * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
             * done by the caller to avoid the race with signal_wake_up().
             */
            smp_mb__before_spinlock();
            raw_spin_lock_irq(&rq->lock);

            switch_count = &prev->nivcsw;
            if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
                    if (unlikely(signal_pending_state(prev->state, prev))) {
                            prev->state = TASK_RUNNING;
                    } else {
                            deactivate_task(rq, prev, DEQUEUE_SLEEP);
                            prev->on_rq = 0;

                            /*
                             * If a worker went to sleep, notify and ask workqueue
                             * whether it wants to wake up a task to maintain
                             * concurrency.
                             */
                            if (prev->flags & PF_WQ_WORKER) {
                                    struct task_struct *to_wakeup;

                                    to_wakeup = wq_worker_sleeping(prev, cpu);
                                    if (to_wakeup)
                                            try_to_wake_up_local(to_wakeup);
                            }
                    }
                    switch_count = &prev->nvcsw;
            }

            if (task_on_rq_queued(prev) || rq->skip_clock_update < 0)
                    update_rq_clock(rq);

            next = pick_next_task(rq, prev);
            clear_tsk_need_resched(prev);
            clear_preempt_need_resched();
            rq->skip_clock_update = 0;

            if (likely(prev != next)) {
                    rq->nr_switches++;
                    rq->curr = next;
                    ++*switch_count;

                    context_switch(rq, prev, next); /* unlocks the rq */
                    /*
                     * The context switch has flipped the stack from under us
                     * and restored the local variables which were saved when
                     * this task called schedule() in the past. prev == current
                     * is still correct, but it can be moved to another cpu/rq.
                     */
                    cpu = smp_processor_id();
                    rq = cpu_rq(cpu);
            } else
                    raw_spin_unlock_irq(&rq->lock);

            post_schedule(rq);

            sched_preempt_enable_no_resched();
            if (need_resched())
                    goto need_resched;
    }


The key line in __schedule() is

    next = pick_next_task(rq, prev);


pick_next_task() encapsulates the Linux process scheduling policy. We do not care here which scheduling policy Linux uses (interested readers can dig into pick_next_task() themselves); what matters is that Linux chooses the next process to run.
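For readers who do want a feel for what "encapsulates the policy" means, here is a minimal sketch, loosely modeled on the way the kernel asks its scheduling classes for a task in priority order. Everything here is an assumption made for illustration: the three classes, the toy_ names, and the rt_ready and fair_ready flags are invented, and the real pick_next_task() takes rq and prev and has extra fast paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy task; only the name matters for this sketch. */
struct toy_task {
    const char *name;
};

/* One policy = one function that may or may not have a task to offer. */
typedef struct toy_task *(*toy_pick_fn)(void);

static struct toy_task rt_task   = { "rt" };
static struct toy_task fair_task = { "fair" };
static struct toy_task idle_task = { "idle" };

/* Invented flags standing in for "this class has a runnable task". */
static int rt_ready, fair_ready;

static struct toy_task *pick_rt(void)   { return rt_ready   ? &rt_task   : NULL; }
static struct toy_task *pick_fair(void) { return fair_ready ? &fair_task : NULL; }
static struct toy_task *pick_idle(void) { return &idle_task; }

/* Highest-priority class first; the idle class always has something to run. */
static const toy_pick_fn toy_classes[] = { pick_rt, pick_fair, pick_idle };

/* Callers see one function; which policy produced the task stays hidden. */
static struct toy_task *toy_pick_next_task(void)
{
    size_t n = sizeof(toy_classes) / sizeof(toy_classes[0]);

    for (size_t i = 0; i < n; i++) {
        struct toy_task *p = toy_classes[i]();
        if (p)
            return p;
    }
    assert(!"unreachable: the idle class always offers a task");
    return &idle_task;
}
```

The caller never learns which policy supplied the task, which is why the rest of this walkthrough can ignore the policy entirely.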

Once next has been chosen, the remaining work is the process context switch, which is done mainly by

    context_switch(rq, prev, next);

Let's step into context_switch(rq, prev, next) to see how the process context switch proceeds. context_switch() is located in the linux-3.18.6\kernel\sched\core.c file.

Note that after entering context_switch() the call stack is: schedule() -> __schedule() -> context_switch().


    static inline void
    context_switch(struct rq *rq, struct task_struct *prev,
                   struct task_struct *next)
    {
            struct mm_struct *mm, *oldmm;

            prepare_task_switch(rq, prev, next);

            mm = next->mm;
            oldmm = prev->active_mm;
            /*
             * For paravirt, this is coupled with an exit in switch_to to
             * combine the page table reload and the switch backend into
             * one hypercall.
             */
            arch_start_context_switch(prev);

            if (!mm) {
                    next->active_mm = oldmm;
                    atomic_inc(&oldmm->mm_count);
                    enter_lazy_tlb(oldmm, next);
            } else
                    switch_mm(oldmm, mm, next);

            if (!prev->mm) {
                    prev->active_mm = NULL;
                    rq->prev_mm = oldmm;
            }
            /*
             * Since the runqueue lock will be released by the next
             * task (which is an invalid locking op but in the case
             * of the scheduler it's an obvious special-case), so we
             * do an early lockdep release here:
             */
            spin_release(&rq->lock.dep_map, 1, _THIS_IP_);

            context_tracking_task_switch(prev, next);
            /* Here we just switch the register state and the stack. */
            switch_to(prev, next, prev);

            barrier();
            /*
             * this_rq must be evaluated again because prev may have moved
             * CPUs since it called schedule(), thus the 'rq' on its stack
             * frame will be invalid.
             */
            finish_task_switch(this_rq(), prev);
    }


The key line in context_switch() is

    switch_to(prev, next, prev);


switch_to() is a macro, not a function. Note the comment above the switch_to() call, /* Here we just switch the register state and the stack. */: it switches the register state and switches between the kernel stacks of prev and next.

Let's look at how switch_to() is implemented. switch_to() is located in the linux-3.18.6\arch\x86\include\asm\switch_to.h file. After entering switch_to() the call stack is: schedule() -> __schedule() -> context_switch() -> switch_to(). (switch_to() is not a function, but to show the flow of execution we list it in the call stack anyway.)


    /*
     * Saving eflags is important. It switches not only IOPL between tasks,
     * it also protects other tasks from NT leaking through sysenter etc.
     */
    #define switch_to(prev, next, last)                                     \
    do {                                                                    \
            /*                                                              \
             * Context-switching clobbers all registers, so we clobber      \
             * them explicitly, via unused output variables.                \
             * (EAX and EBP are not listed because EBP is saved/restored    \
             * explicitly for wchan access and EAX is the return value of   \
             * __switch_to())                                               \
             */                                                             \
            unsigned long ebx, ecx, edx, esi, edi;                          \
                                                                            \
            asm volatile("pushfl\n\t"               /* save    flags */     \
                         "pushl %%ebp\n\t"          /* save    EBP   */     \
                         "movl %%esp,%[prev_sp]\n\t" /* save    ESP   */    \
                         "movl %[next_sp],%%esp\n\t" /* restore ESP   */    \
                         "movl $1f,%[prev_ip]\n\t"  /* save    EIP   */     \
                         "pushl %[next_ip]\n\t"     /* restore EIP   */     \
                         __switch_canary                                    \
                         "jmp __switch_to\n"        /* regparm call  */     \
                         "1:\t"                                             \
                         "popl %%ebp\n\t"           /* restore EBP   */     \
                         "popfl\n"                  /* restore flags */     \
                                                                            \
                         /* output parameters */                            \
                         : [prev_sp] "=m" (prev->thread.sp),                \
                           [prev_ip] "=m" (prev->thread.ip),                \
                           "=a" (last),                                     \
                                                                            \
                           /* clobbered output registers: */                \
                           "=b" (ebx), "=c" (ecx), "=d" (edx),              \
                           "=S" (esi), "=D" (edi)                           \
                                                                            \
                           __switch_canary_oparam                           \
                                                                            \
                           /* input parameters: */                          \
                         : [next_sp]  "m" (next->thread.sp),                \
                           [next_ip]  "m" (next->thread.ip),                \
                                                                            \
                           /* regparm parameters for __switch_to(): */      \
                           [prev]     "a" (prev),                           \
                           [next]     "d" (next)                            \
                                                                            \
                           __switch_canary_iparam                           \
                                                                            \
                         : /* reloaded segment registers */                 \
                            "memory");                                      \
    } while (0)


switch_to() is written as GCC inline assembly in AT&T syntax. It carries many comments; I will go through it one statement at a time.

1.

    "pushfl\n\t"

Pushes the flags register of the current process (prev) onto the kernel stack of the current process;

2.

    "pushl %%ebp\n\t"

Pushes the kernel stack frame base of the current process (prev), that is, the value of the %ebp register, onto the kernel stack of the current process;

3.

    "movl %%esp,%[prev_sp]\n\t"

Combined with /* output parameters */ [prev_sp] "=m" (prev->thread.sp), %[prev_sp] stands for prev->thread.sp. That is, the top of the kernel stack of the current process (prev), the value of the %esp register, is saved to prev->thread.sp.

4.

    "movl %[next_sp],%%esp\n\t"

Combined with /* input parameters: */ [next_sp] "m" (next->thread.sp), %[next_sp] stands for next->thread.sp. That is, the saved top of the kernel stack of the next process (the value of next->thread.sp) is loaded into the %esp register.

5.

    "movl %%esp,%[prev_sp]\n\t"
    "movl %[next_sp],%%esp\n\t"

These two instructions complete the switch of the kernel stack. Once

    "movl %[next_sp],%%esp\n\t"

has executed, every stack operation that follows takes place on the kernel stack of the next process (next).

6.

    "movl $1f,%[prev_ip]\n\t"

The address of the label "1:\t" is stored in prev->thread.ip. When prev is later switched back in by switch_to, it resumes at "1:\t". The two instructions that follow "1:\t" are "popl %%ebp\n\t" and "popfl\n", which restore the kernel stack frame base and the flags of prev.

7.

    "pushl %[next_ip]\n\t"

Combined with /* input parameters: */ [next_ip] "m" (next->thread.ip), %[next_ip] stands for next->thread.ip. That is, the saved instruction pointer of the next process (the point where it will start executing) is pushed onto the top of the next process's kernel stack.

8.

    "jmp __switch_to\n"

__switch_to is a function, but it is not entered with a call instruction. The macro jumps to it with jmp __switch_to, and the arguments are passed in registers. See the /* regparm parameters for __switch_to(): */ comment:

    [prev]     "a" (prev),
    [next]     "d" (next)

The parameters are passed through "a" (the EAX register) and "d" (the EDX register). Because no call instruction is used, the jump itself does not push the address of the following instruction onto the kernel stack of the next process. That does not stop __switch_to from returning: when it executes ret, it pops the value on top of the next process's kernel stack into the instruction pointer, and next starts executing from that address.

What is the value that is popped when __switch_to returns? next is the process about to be switched in, so it must have been switched out at some earlier time (a newly forked process that has never run is the exception: its thread.ip was set up at fork time). Two instructions determine the value:

    "movl $1f,%[prev_ip]\n\t"
    "pushl %[next_ip]\n\t"

When next was switched out, it played the role of prev, and the movl saved the address of "1:\t" in its thread.ip. Now that it is being switched in, the pushl places that saved value on top of next's kernel stack.

So when __switch_to returns, it pops the address of "1:\t" into the instruction pointer, and next resumes execution at "1:\t".

9.

    "popl %%ebp\n\t"    /* restore EBP   */
    "popfl\n"           /* restore flags */

These restore the kernel stack frame base and the flags of the next process, and the next process carries on executing.
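Steps 1 to 9 can be replayed on a toy machine to see why the values end up where they do. This is a sketch under stated assumptions, not the kernel's code: toy_cpu, toy_thread, LABEL_1 and toy_init_parked() are invented, the stacks are small arrays of long, and all the other work the real __switch_to does between the jmp and the ret is skipped.

```c
#include <assert.h>

enum { STACK_WORDS = 16 };
#define LABEL_1 100L    /* made-up number standing for the address of "1:" */

/* Hypothetical per-task state: a kernel stack plus saved thread.sp/thread.ip. */
struct toy_thread {
    long  stack[STACK_WORDS];   /* grows toward index 0 */
    long *sp;                   /* stands for thread.sp */
    long  ip;                   /* stands for thread.ip */
};

/* Hypothetical CPU state: only the registers the macro touches. */
struct toy_cpu {
    long *esp;
    long  ebp, eflags, eip;
};

static void push(struct toy_cpu *c, long v) { *--c->esp = v; }
static long pop(struct toy_cpu *c)          { return *c->esp++; }

/* Lay out a stack as if the task had already been switched out once. */
static void toy_init_parked(struct toy_thread *t, long ebp, long eflags)
{
    t->sp = t->stack + STACK_WORDS;
    *--t->sp = eflags;          /* what its pushfl left behind */
    *--t->sp = ebp;             /* what its pushl %ebp left behind */
    t->ip = LABEL_1;            /* what its movl $1f saved */
}

static void toy_switch_to(struct toy_cpu *c, struct toy_thread *prev,
                          struct toy_thread *next)
{
    push(c, c->eflags);         /* 1. pushfl */
    push(c, c->ebp);            /* 2. pushl %ebp */
    prev->sp = c->esp;          /* 3. movl %esp, prev->thread.sp */
    c->esp = next->sp;          /* 4. movl next->thread.sp, %esp */
    prev->ip = LABEL_1;         /* 6. movl $1f, prev->thread.ip */
    push(c, next->ip);          /* 7. pushl next->thread.ip */
    c->eip = pop(c);            /* 8. jmp __switch_to, then its ret */
    assert(c->eip == LABEL_1);  /* every task in this sketch resumes at "1:" */
    c->ebp = pop(c);            /* 9. popl %ebp */
    c->eflags = pop(c);         /*    popfl */
}

/* Switch A -> B -> A; returns 1 if A gets its registers and stack back. */
static int toy_switch_demo(void)
{
    static struct toy_thread a, b;
    struct toy_cpu cpu = { a.stack + STACK_WORDS, 0xA0, 0xA1, 0 };

    toy_init_parked(&b, 0xB0, 0xB1);
    toy_switch_to(&cpu, &a, &b);
    if (cpu.ebp != 0xB0 || cpu.eflags != 0xB1)
        return 0;
    cpu.ebp = 0xB2;             /* B runs for a while and changes a register */
    toy_switch_to(&cpu, &b, &a);
    return cpu.ebp == 0xA0 && cpu.eflags == 0xA1 &&
           cpu.esp == a.stack + STACK_WORDS;
}
```

Note that the pops in step 9 read from the stack of next, even though the pushes in steps 1 and 2 wrote to the stack of prev; the values that come back are the ones next saved when it was itself switched out.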

The fuzzy zone in switch_to:

    "movl %%esp,%[prev_sp]\n\t"
    "movl %[next_sp],%%esp\n\t"

These two instructions switch from the kernel stack of prev to the kernel stack of next.

But only at

    "1:\t"

does next begin executing its own first instruction.

The section in between,

    "movl $1f,%[prev_ip]\n\t"    /* save    EIP   */
    "pushl %[next_ip]\n\t"       /* restore EIP   */
    __switch_canary
    "jmp __switch_to\n"          /* regparm call  */

uses the kernel stack of next, yet it still executes as part of prev.

Broadly speaking, the sequence

    "movl %%esp,%[prev_sp]\n\t"
    "movl %[next_sp],%%esp\n\t"
    "movl $1f,%[prev_ip]\n\t"    /* save    EIP   */
    "pushl %[next_ip]\n\t"       /* restore EIP   */
    __switch_canary
    "jmp __switch_to\n"          /* regparm call  */
    "1:\t"

is a gray area between prev and next: it is not clear-cut which process this stretch of execution belongs to.
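The same "resume right after your own switch" behaviour can be observed in user space. The following is only an analogy, assuming a Linux system with glibc's getcontext, makecontext, and swapcontext; the task names, stack sizes, and trace buffer are invented, and swapcontext is not how the kernel switches processes. Each swapcontext call saves the caller's stack pointer and resume address and loads the other context's, so a context that is switched back in continues on the line after the swapcontext it called earlier, much as a process resumes at "1:\t".

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

/* Hypothetical names throughout; a user-space analogy, not kernel code. */
static ucontext_t main_ctx, task_a, task_b;
static char stack_a[64 * 1024], stack_b[64 * 1024];
static char trace[16];
static int  trace_len;

/* Append one character to the execution trace. */
static void record(char c)
{
    assert(trace_len < (int)sizeof(trace) - 1);
    trace[trace_len++] = c;
}

static void run_a(void)
{
    record('a');
    swapcontext(&task_a, &task_b);  /* save A, start B */
    record('A');                    /* A resumes here, after its own swap */
    swapcontext(&task_a, &task_b);  /* save A again, resume B */
}

static void run_b(void)
{
    record('b');
    swapcontext(&task_b, &task_a);  /* save B, resume A */
    record('B');                    /* B resumes here, after its own swap */
    /* returning ends this context; uc_link takes us back to main_ctx */
}

/* Give a context its own stack and entry function. */
static void toy_make(ucontext_t *ctx, char *stack, size_t size, void (*fn)(void))
{
    getcontext(ctx);
    ctx->uc_stack.ss_sp = stack;
    ctx->uc_stack.ss_size = size;
    ctx->uc_link = &main_ctx;
    makecontext(ctx, fn, 0);
}

/* Run the two tasks and return the order in which their code executed. */
static const char *switch_demo(void)
{
    memset(trace, 0, sizeof(trace));
    trace_len = 0;
    toy_make(&task_a, stack_a, sizeof(stack_a), run_a);
    toy_make(&task_b, stack_b, sizeof(stack_b), run_b);
    swapcontext(&main_ctx, &task_a);
    return trace;
}
```

The trace interleaves the two tasks: each one picks up exactly where its own earlier swapcontext left off, on its own stack.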
