2017-2018-1 20179209 "Linux Kernel Fundamentals and Analysis" Ninth Week Assignment

Last Update:2017-11-25 Source: Internet

Author: User

Understanding the Timing of Process Scheduling

During interrupt handling (including clock interrupts, I/O interrupts, system calls, and exceptions), the kernel either calls schedule() directly or calls schedule() when returning to user mode, depending on the need_resched flag;
A kernel thread can call schedule() directly to switch processes, and it can also be scheduled during interrupt handling. In other words, a kernel thread, as a special kind of process, can be scheduled both actively and passively;
A user-mode process cannot schedule actively. It can only be scheduled at some point after it has trapped into kernel mode, that is, during interrupt handling.
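The three cases above can be sketched as a small user-space model. This is only an illustration, not kernel code: the sim_* names, the struct layout, and the stub scheduler are all hypothetical, and the real kernel performs these checks in its interrupt-return path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a task; not the kernel's task_struct. */
struct sim_sched_task {
    bool need_resched;      /* models the need_resched flag */
    bool is_kernel_thread;  /* true for kernel threads */
};

static int sim_schedule_calls;  /* how many times the stub scheduler ran */

/* Stub for schedule(): it only records the call and clears the flag. */
static void sim_schedule(struct sim_sched_task *cur)
{
    cur->need_resched = false;
    sim_schedule_calls++;
}

/* Models a clock interrupt: the handler marks the task but does not switch. */
void sim_timer_tick(struct sim_sched_task *cur, bool timeslice_expired)
{
    if (timeslice_expired)
        cur->need_resched = true;
}

/* Models the return to user mode: schedule only if the flag is set. */
void sim_return_to_user(struct sim_sched_task *cur)
{
    if (cur->need_resched)
        sim_schedule(cur);
}

/* Models active scheduling, which only a kernel thread can do. */
void sim_kthread_yield(struct sim_sched_task *cur)
{
    assert(cur->is_kernel_thread);  /* user-mode tasks never reach schedule() directly */
    sim_schedule(cur);
}

/* Accessor so callers can observe how often scheduling happened. */
int sim_schedule_count(void)
{
    return sim_schedule_calls;
}
```

In this model a user-mode task is rescheduled only after sim_timer_tick() has set the flag and sim_return_to_user() has noticed it, while a kernel thread can reach the scheduler through either path.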

Analysis of the schedule() function

The schedule() function is located in Linux-3.18.6/kernel/sched/core.c. Its key call is pick_next_task(), which selects the next process to run according to the process scheduling policy. It is followed by context_switch(), which performs the process context switch and is implemented as follows:

    static inline void
    context_switch(struct rq *rq, struct task_struct *prev,
                   struct task_struct *next)
    {
        struct mm_struct *mm, *oldmm;

        prepare_task_switch(rq, prev, next);

        mm = next->mm;
        oldmm = prev->active_mm;
        arch_start_context_switch(prev);

        if (!mm) {
            next->active_mm = oldmm;
            atomic_inc(&oldmm->mm_count);
            enter_lazy_tlb(oldmm, next);
        } else
            switch_mm(oldmm, mm, next);

        if (!prev->mm) {
            prev->active_mm = NULL;
            rq->prev_mm = oldmm;
        }

        spin_release(&rq->lock.dep_map, 1, _THIS_IP_);

        context_tracking_task_switch(prev, next);
        switch_to(prev, next, prev);    /* key function */

        barrier();
        /*
         * this_rq must be evaluated again because prev could have moved
         * CPUs since it called schedule(), thus the 'rq' on its stack
         * frame would be invalid.
         */
        finish_task_switch(this_rq(), prev);
    }

The most important part of context_switch() is switch_to(), which is implemented mainly in inline assembly and performs the actual process switch.

    #define switch_to(prev, next, last) \
    do { \
        unsigned long ebx, ecx, edx, esi, edi; \
        \
        asm volatile("pushfl\n\t"               /* save the current process's flags */ \
                     "pushl %%ebp\n\t"          /* push the current process's EBP onto its own stack */ \
                     "movl %%esp,%[prev_sp]\n\t" /* save the current ESP to prev->thread.sp */ \
                     "movl %[next_sp],%%esp\n\t" /* load next->thread.sp into ESP */ \
                     "movl $1f,%[prev_ip]\n\t"  /* store the address of label 1 in prev->thread.ip */ \
                     "pushl %[next_ip]\n\t"     /* reset EIP */ \
                     __switch_canary \
                     "jmp __switch_to\n"        /* jump to the __switch_to function */ \
                     "1:\t" \
                     "popl %%ebp\n\t"           /* restore EBP */ \
                     "popfl\n"                  /* restore flags */ \
        \
                     : [prev_sp] "=m" (prev->thread.sp), \
                       [prev_ip] "=m" (prev->thread.ip), \
                       "=a" (last), \
                       "=b" (ebx), "=c" (ecx), "=d" (edx), \
                       "=S" (esi), "=D" (edi) \
                       __switch_canary_oparam \
        \
                     : [next_sp] "m" (next->thread.sp), \
                       [next_ip] "m" (next->thread.ip), \
                       [prev] "a" (prev), \
                       [next] "d" (next) \
                       __switch_canary_iparam \
        \
                     : "memory"); \
    } while (0)

switch_to() switches from process A to process B: at switch_to(), process A is switched out and process B is switched in. Inside the macro, the initial pushfl and pushl %ebp still belong to process A. Once ESP points to B's stack, the instruction stream from that point on belongs to process B. At this moment B is not fully ready, because EBP, the hardware context, and other state have not yet been switched to B; the rest of the macro completes that work. Process A never notices that it was interrupted and believes it has been executing continuously. Apart from changing the prev variable in process A, switch_to() has no other effect on A. Every process in the system sees things this way: each one assumes it has been running by itself without interruption.
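The idea that each side resumes exactly where it was switched out can be tried in user space. The following is only an analogy, assuming a glibc system that provides getcontext(), makecontext(), and swapcontext() from <ucontext.h>; it is not how the kernel implements switch_to(), and the task_a, task_b, and run_switch_demo names are made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

static ucontext_t ctx_main, ctx_a, ctx_b;
static char stack_a[64 * 1024], stack_b[64 * 1024];  /* one private stack per task */
static char trace[32];                               /* records the execution order */

static void task_a(void)
{
    strcat(trace, "A1 ");
    swapcontext(&ctx_a, &ctx_b);  /* A is switched out here, B is switched in */
    strcat(trace, "A2 ");         /* A resumes right after its own switch point */
    swapcontext(&ctx_a, &ctx_b);  /* A stays suspended after this switch */
}

static void task_b(void)
{
    strcat(trace, "B1 ");
    swapcontext(&ctx_b, &ctx_a);  /* B is switched out, A is switched back in */
    strcat(trace, "B2 ");
    /* Returning ends B; uc_link then resumes ctx_main. */
}

/* Prepare a context that runs fn on its own stack. */
static void init_task(ucontext_t *ctx, char *stack, size_t size, void (*fn)(void))
{
    int rc = getcontext(ctx);
    assert(rc == 0);
    (void)rc;
    ctx->uc_stack.ss_sp = stack;
    ctx->uc_stack.ss_size = size;
    ctx->uc_link = &ctx_main;
    makecontext(ctx, fn, 0);
}

/* Runs the two tasks and returns the recorded order of execution. */
const char *run_switch_demo(void)
{
    trace[0] = '\0';
    init_task(&ctx_a, stack_a, sizeof stack_a, task_a);
    init_task(&ctx_b, stack_b, sizeof stack_b, task_b);
    swapcontext(&ctx_main, &ctx_a);  /* start with task A */
    return trace;
}
```

Neither task contains any code for "being resumed": from each task's point of view, swapcontext() is just a call that eventually returns, much as process A sees switch_to().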

Tracing the fork command with GDB

Thinking about the schedule() function reminded me of the fork command. Since the fork command was added to MenuOS in the last class, I used GDB to trace it.
Breakpoint Settings:

b schedule
b context_switch
b pick_next_task

I originally wanted to set a breakpoint at switch_to, but several attempts failed. At first I wondered why; later I found that switch_to is not a standalone function but a macro, so a breakpoint cannot be set on it. I then tried to set breakpoints on the functions near it, also without result, because the "functions" around it in context_switch are written as macros as well. So I gave up on that.
schedule() breakpoint:

pick_next_task() breakpoint:

context_switch() breakpoint:

Summary

The general process-switching sequence is:
A user-mode process A is running;
An interrupt occurs: cs:eip, esp, and eflags are saved to the kernel stack;
SAVE_ALL saves the rest of the context;
schedule() is called, either directly during interrupt handling or just before the interrupt returns; the assembly code in switch_to does the critical work of switching the kernel stack from the current process to the next process;
The next process starts running from label 1 (this requires that the process was previously switched out along this same path; a newly created process is not covered by this case);
The next process's context is restored, and eip and esp are popped from the kernel stack;
The next process continues running the user-mode program it was running before.
Special cases:

Through scheduling during interrupt handling, a user-mode process and a kernel thread switch to each other; this is similar to the general case;
A kernel thread calls schedule() on its own initiative; only a process context switch takes place, with no interrupt context switch;
A system call that creates a child process, such as fork: the child process begins executing and returns to user mode from within that call;
A new executable program is loaded and execution returns to user mode, as with execve.
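The fork case can be observed from user space. The sketch below uses the standard POSIX fork() and waitpid() calls; the function name child_exit_code_after_fork and the exit code 42 are arbitrary choices for illustration. It shows only the user-visible behavior, namely that the child's first user-mode step is the return from fork() with the value 0.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child and returns the child's exit code, or -1 on failure. */
int child_exit_code_after_fork(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;      /* fork failed, no child was created */

    if (pid == 0) {
        /* Child: it did not call fork(), yet it starts by returning from it. */
        _exit(42);
    }

    /* Parent: fork() returned the child's PID. */
    assert(pid > 0);
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The single call to fork() returns twice, once in each process, which is why this case differs from a process that resumes at label 1 after having been switched out earlier.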

Textbook content

mm_struct and kernel threads

When a process is scheduled, the address space referenced by the process's mm field is loaded, and the active_mm field in the process descriptor is updated to point to the new address space. A kernel thread has no address space, so its mm field is NULL. When a kernel thread is scheduled, the kernel sees that mm is NULL and keeps the previous process's address space loaded. It then updates the active_mm field in the kernel thread's process descriptor to point to the previous process's memory descriptor, so the kernel thread can use the previous process's page tables when needed. Because kernel threads do not access user-space memory, they use only the part of the address space that relates to kernel memory, and that part is the same for all processes.
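The borrowing rule can be written down as a small model. This is a sketch under simplifying assumptions, not kernel code: the sim_mm and sim_mm_task types and the sim_switch_mm function are hypothetical, the reference count is a plain integer, and the reference is dropped immediately where the kernel defers the drop until the switch has finished.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory descriptor. */
struct sim_mm {
    int mm_count;   /* number of users of this descriptor */
};

/* Hypothetical stand-in for the two mm fields of a process descriptor. */
struct sim_mm_task {
    struct sim_mm *mm;          /* NULL for a kernel thread */
    struct sim_mm *active_mm;   /* address space actually in use */
};

/* Switches from prev to next and returns the address space in use afterwards. */
struct sim_mm *sim_switch_mm(struct sim_mm_task *prev, struct sim_mm_task *next)
{
    struct sim_mm *oldmm = prev->active_mm;
    assert(oldmm != NULL);  /* a running task always has an active address space */

    if (next->mm == NULL) {
        /* Kernel thread: borrow the previous address space. */
        next->active_mm = oldmm;
        oldmm->mm_count++;
    } else {
        /* Ordinary process: use its own address space. */
        next->active_mm = next->mm;
    }

    if (prev->mm == NULL) {
        /* A kernel thread going off the CPU gives back what it borrowed. */
        prev->active_mm = NULL;
        oldmm->mm_count--;
    }
    return next->active_mm;
}
```

Switching from a user process to a kernel thread therefore leaves the same address space in use, and only the later switch to another user process changes it.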

Page table

Linux uses three-level page tables for address translation. Multi-level page tables reduce the storage that the translation structures need to occupy.

The top-level page table is the page global directory (PGD), an array of type pgd_t; on most architectures pgd_t is equivalent to unsigned long. Entries in the PGD point to entries in the second-level directory, the PMD.
The second-level page table is the page middle directory (PMD), an array of type pmd_t, whose entries point to entries in the PTE.
The last level is simply called the page table; it contains page table entries of type pte_t, which point to physical pages.
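A three-level walk can be sketched as follows. The 2 + 9 + 9 + 12 bit split of a 32-bit virtual address is an assumption chosen for the example (it resembles 32-bit x86 with PAE); the sim_* types are simplified stand-ins rather than the kernel's pgd_t, pmd_t, and pte_t, and real entries hold physical addresses and flag bits, not C pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed address split: 2 bits PGD, 9 bits PMD, 9 bits PTE, 12 bits offset. */
#define SIM_PAGE_SHIFT   12
#define SIM_PTE_BITS     9
#define SIM_PMD_BITS     9
#define SIM_PGD_BITS     2
#define SIM_PMD_SHIFT    (SIM_PAGE_SHIFT + SIM_PTE_BITS)  /* 21 */
#define SIM_PGDIR_SHIFT  (SIM_PMD_SHIFT + SIM_PMD_BITS)   /* 30 */

typedef struct { uint32_t frame; int present; } sim_pte_t;  /* points to a physical page */
typedef struct { sim_pte_t *table; } sim_pmd_t;             /* points to a page table */
typedef struct { sim_pmd_t *table; } sim_pgd_t;             /* points to a PMD */

static unsigned sim_pgd_index(uint32_t va)
{
    return (va >> SIM_PGDIR_SHIFT) & ((1u << SIM_PGD_BITS) - 1);
}

static unsigned sim_pmd_index(uint32_t va)
{
    return (va >> SIM_PMD_SHIFT) & ((1u << SIM_PMD_BITS) - 1);
}

static unsigned sim_pte_index(uint32_t va)
{
    return (va >> SIM_PAGE_SHIFT) & ((1u << SIM_PTE_BITS) - 1);
}

/* Walks PGD -> PMD -> PTE. Returns 0 and stores the physical address in *pa,
 * or returns -1 if any level has no mapping. */
int sim_translate(const sim_pgd_t *pgd, uint32_t va, uint32_t *pa)
{
    assert(pgd != NULL && pa != NULL);

    const sim_pgd_t *pgd_entry = &pgd[sim_pgd_index(va)];
    if (pgd_entry->table == NULL)
        return -1;

    const sim_pmd_t *pmd_entry = &pgd_entry->table[sim_pmd_index(va)];
    if (pmd_entry->table == NULL)
        return -1;

    const sim_pte_t *pte = &pmd_entry->table[sim_pte_index(va)];
    if (!pte->present)
        return -1;

    *pa = (pte->frame << SIM_PAGE_SHIFT) | (va & ((1u << SIM_PAGE_SHIFT) - 1));
    return 0;
}
```

Only the tables that are actually reached need to exist, which is where the space saving of a multi-level layout comes from: an unmapped region costs a single empty entry at the upper level.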

Why caches exist (review)

Disk access is much slower than memory access, so reading data from memory is faster than reading it from disk.
Once data has been accessed, it is likely to be accessed again soon. This tendency for accesses to the same piece of data to cluster within a short time is called temporal locality.

Radix tree

Because the kernel must check whether a page is already in the page cache before starting any page I/O, this frequent check has to be fast and efficient; otherwise the cost of searching the page cache would cancel out the benefit of caching. The page cache is searched with two parameters: an address_space object plus an offset. Each address_space object has its own radix tree, stored in its page_tree field. The textbook describes the radix tree as a kind of binary tree; given only the file offset, it can quickly locate the desired page. The page cache search function find_get_page() calls radix_tree_lookup(), which searches the given tree for the given page.
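The lookup by offset can be illustrated with a toy tree that consumes a fixed number of offset bits at each level. This is a simplified sketch, not the kernel's radix tree: the sim_* names, the fixed height of 2, and the 6 bits per level are assumptions made for the example, and no memory is freed.

```c
#include <assert.h>
#include <stdlib.h>

#define SIM_MAP_SHIFT 6                      /* assumed bits consumed per level */
#define SIM_MAP_SIZE  (1u << SIM_MAP_SHIFT)  /* 64 slots per node */
#define SIM_MAP_MASK  (SIM_MAP_SIZE - 1)
#define SIM_HEIGHT    2                      /* covers offsets 0 .. 4095 */

struct sim_node { void *slots[SIM_MAP_SIZE]; };
struct sim_tree { struct sim_node *root; };

/* Stores page at offset. Returns 0 on success, -1 if out of range or out of memory. */
int sim_radix_insert(struct sim_tree *t, unsigned long offset, void *page)
{
    assert(t != NULL);
    if (offset >> (SIM_MAP_SHIFT * SIM_HEIGHT))
        return -1;
    if (!t->root && !(t->root = calloc(1, sizeof *t->root)))
        return -1;

    struct sim_node *node = t->root;
    for (int level = SIM_HEIGHT - 1; level > 0; level--) {
        unsigned idx = (offset >> (level * SIM_MAP_SHIFT)) & SIM_MAP_MASK;
        if (!node->slots[idx] && !(node->slots[idx] = calloc(1, sizeof *node)))
            return -1;
        node = node->slots[idx];   /* descend one level */
    }
    node->slots[offset & SIM_MAP_MASK] = page;
    return 0;
}

/* Returns the page stored at offset, or NULL if there is none. */
void *sim_radix_lookup(const struct sim_tree *t, unsigned long offset)
{
    assert(t != NULL);
    if (!t->root || (offset >> (SIM_MAP_SHIFT * SIM_HEIGHT)))
        return NULL;

    const struct sim_node *node = t->root;
    for (int level = SIM_HEIGHT - 1; level > 0; level--) {
        node = node->slots[(offset >> (level * SIM_MAP_SHIFT)) & SIM_MAP_MASK];
        if (!node)
            return NULL;           /* no subtree covers this offset */
    }
    return node->slots[offset & SIM_MAP_MASK];
}
```

The number of steps in a lookup depends only on the height of the tree, not on how many pages are cached, which is what makes the check cheap enough to run before every page I/O.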
PS: The more I learn about how Linux is implemented, the more I appreciate the importance of data structures. At first I felt the data structures course was a "waste", probably because I had written too little code. Now it seems that data structures really are the art of arts.

Using multiple threads to avoid congestion

A single thread can get stuck on one congested device queue even though not all disks are saturated. Because disk throughput is very limited, a lone thread performing page writeback can easily end up waiting on a single disk operation. To prevent this, the kernel runs multiple writeback threads concurrently, so that congestion on one device queue does not become a bottleneck for the whole system.
