"Linux kernel design and implementation" and "Linux kernel source code scenario Analysis" Reading notes __linux

Source: Internet
Author: User
Tags: semaphore

Chapter 1: Introduction to the Kernel

At any given moment, the processor is doing exactly one of the following:

a. running in kernel space, in process context, executing on behalf of a specific process;

b. running in kernel space, in interrupt context, not associated with any process, handling a specific interrupt;

c. running in user space, executing a user process.

While a process executes, the values in the CPU's registers, the process's state, and the contents of its stack are collectively called the context of the process. When the kernel needs to switch to another process, it must save all of the current process's state, i.e. save its context, so that when the process runs again it can resume from the saved state. In Linux, the current process context is stored in the process's task data structure. When an interrupt occurs, the kernel executes the interrupt service routine in the context of the interrupted process, but all resources that the routine needs to use are preserved, so that execution of the interrupted process can be restored once the interrupt service routine finishes.

Chapter 3: Process Management

1, The kernel stores processes in a circular doubly linked list called the task list; each item in the list is a structure of type task_struct, called a process descriptor, which contains all the information about a specific process.

2, The kernel identifies each process with a unique process identification value, or PID, which is stored in each process's descriptor.

3, On x86, registers are scarce, so the kernel cannot dedicate one to the current task. Instead, a thread_info structure is placed at one end of the process's kernel stack (the bottom, for stacks that grow down; the top, for stacks that grow up); masking the low bits of the stack pointer yields its address, and the task_struct is then found indirectly through the task pointer it contains.

4, Process status:

a. TASK_RUNNING: the process is executing, or is on a run queue waiting to execute;

b. TASK_INTERRUPTIBLE: the process is sleeping; when the condition it waits for becomes true, or a signal arrives, the kernel sets its state to running and it is put into operation;

c. TASK_UNINTERRUPTIBLE: the process is sleeping and does not respond to signals;

d. TASK_ZOMBIE: the process has terminated, but its process descriptor is retained so the parent can learn of the child's fate;

e. TASK_STOPPED: the process has stopped executing, having received a signal such as SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU.

5, System calls and exception handlers are well-defined interfaces into the kernel; a process can enter kernel execution only through these interfaces, and all access to the kernel must pass through them.

6, The relationships between processes are stored in the process descriptor: each task_struct contains a pointer to its parent's task_struct, named parent, and a list of child processes, named children.

7, Copy-on-write: after fork(), the kernel lets the parent and child share a single copy of their data; the data is duplicated only when one of them attempts to write to it.

8, fork() and vfork() invoke clone(); clone() calls do_fork(), and do_fork() calls copy_process(). For details, see pp. 27-28.

9, To the kernel, a thread is an ordinary process that happens to share certain resources with other processes. Each thread has its own task_struct, and threads are created by invoking clone().

10, A kernel thread differs from a normal process in that it has no separate address space and runs only in kernel space. It executes the function it was given at creation; that function usually contains a loop: when needed, the kernel thread is awakened and performs its task, then puts itself back to sleep.

11, A process terminates via do_exit() (p. 31); the parent collects its status through the wait() family of functions, implemented by the wait4() system call. When the process descriptor is finally released, release_task() is invoked (p. 32).

12, How the kernel handles orphaned processes: it looks for another thread in the current thread group to serve as the child's parent; failing that, it makes init the parent. This requires traversing both the children list and the ptraced children list.

13, When a program executes a system call or triggers an exception, it traps into kernel space; the kernel then runs on behalf of the process, in process context. In this context the kernel may sleep and may invoke the scheduler. The current macro refers to the current process.

Chapter 4: Process Scheduling

1, In preemptive multitasking, the scheduler decides when a running process is to stop so that other processes get a chance to execute. This forcible suspension is called preemption. The amount of time a process may run before it is preempted is set in advance and is called the process's timeslice.

2, Policy determines what the scheduler runs and when. I/O-bound processes spend most of their time submitting or waiting on I/O requests (I/O here meaning any blocking resource); they are frequently runnable but typically run only briefly. Processor-bound processes spend most of their time executing code and tend to run until they are preempted.

3, Scheduling policy balances two goals: fast process response and maximal system utilization.

4, Process priority: processes are ranked by their value and their processor-time requirements. Linux has two priority ranges: the nice value and the real-time priority. The nice value acts as a weight that scales the processor time a process receives; a higher nice value gives a lower weight, so the process yields part of its share of the processor. The CFS scheduler's preemption decision depends on how much of the processor the newly runnable program has consumed: if it is less than the current process's share, the new process runs immediately and the current one is preempted. For example, suppose a text editor and a video encoder are the only two runnable programs at some moment and have the same nice value. The editor spends most of its time waiting for user input, so its processor usage is certainly under 50%, lower than the encoder's. When user input wakes the editor, CFS therefore runs it at once, preempting the encoder; after handling the input, the editor goes back to sleep awaiting the user's next input.

5, The Linux scheduler is modular, allowing different types of processes to use different scheduling algorithms; the modules are called scheduler classes. The Completely Fair Scheduler (CFS) is the scheduler class for normal processes: it lets each process run for a while, rotating through them, and always selects the process that has run the least as the next to run.

6, The scheduler entity structure, struct sched_entity, is embedded in the process descriptor struct task_struct as a member named se. The vruntime variable in se records the (weighted) time the process has spent running; the core of the CFS algorithm is to select the task with the smallest vruntime.

CFS uses a red-black tree to organize its run queue, and its process-selection algorithm reduces to: run the process represented by the leftmost leaf node of the rbtree. A process is added to the tree when it becomes runnable or is first created via a fork() call; it is removed from the tree when it blocks (becomes unrunnable) or terminates.

schedule() is the scheduler's entry point. It walks the scheduler classes in priority order, starting with the highest; each scheduling class has its own run queue, from which the next process to run is taken.

Sleep: the process marks itself as sleeping, removes itself from the executable red-black tree, puts itself on a wait queue, and then invokes schedule() to select and execute a different process.

Wakeup: the process is set to runnable and moved from the wait queue into the executable red-black tree.

A wait queue is a simple list of processes waiting for certain events to occur. See p. 50 for the detailed steps by which a process joins a wait queue.

Context switch: handled by the context_switch() function, which calls switch_mm() to switch the virtual memory mapping from the previous process to the new one, and switch_to() to switch the processor state from the previous process to the new one, including saving and restoring stack information and register state.

7, The kernel provides a need_resched flag to indicate whether the scheduler should be invoked again.

8, User preemption occurs when returning to user space from a system call or from an interrupt handler. Kernel preemption occurs when an interrupt handler exits, before returning to kernel space; when kernel code becomes preemptible again; if a task in the kernel explicitly calls schedule(); or if a task in the kernel blocks (which also results in a call to schedule()).

Chapter 5: System Calls

1, In Linux, system calls are the only means by which user space accesses the kernel; apart from exceptions and traps, they are the kernel's only legitimate entry point. Unix system calls abstract out functions for accomplishing some well-defined purpose; how the kernel implements them is of no concern to the caller: provide mechanism, not policy.

2, In Linux, each system call is assigned a system call number; the kernel records the list of all registered system calls in the system call table, stored in sys_call_table.

3, User-space programs cannot directly execute kernel code, because the kernel resides in a protected address space. An application notifies the kernel via a software interrupt mechanism: it raises an exception, the system switches to kernel mode, and the exception handler executed is the system call handler, system_call(). On entry to the kernel, the system call number is passed in the eax register; parameters and the return value are likewise passed in registers. When the system call returns, system_call() is responsible for switching back to user space and letting the user process resume.

Chapter 6: Kernel Data Structures

1, The kernel linked list differs from a conventional list: a conventional list node contains the payload directly, while the kernel list separates the payload from the list machinery, so the payload structure embeds the list node as a member.

i = (int)(&((struct advadvteacher *)0)->age); /* yields the offset of the payload member within the structure (the offsetof trick) */

2, Kernel red-black tree: see http://blog.csdn.net/yang_yulei/article/details/26066409

Chapter 7: Interrupts and Interrupt Handlers

1, An interrupt is an electrical signal generated by hardware and sent directly to the interrupt controller; the controller in turn sends a signal to the processor, the processor notifies the operating system that an interrupt has arrived, and the operating system handles it.

2, The kernel can be interrupted at any time by a new interrupt. The purpose of a hardware interrupt is to get the kernel's attention.

3, Each interrupt has a unique value; these interrupt lines are called IRQ (interrupt request) lines, and each IRQ line is associated with a numeric value.

4, The difference between exceptions and interrupts: exceptions are synchronous with the processor clock and are produced by software, whereas interrupts are asynchronous and generated by hardware.

5, The function the kernel runs in response to an interrupt is called the interrupt handler or interrupt service routine (ISR). A device's ISR is part of its device driver, the kernel code that manages the device.

6, When an ISR executes, the kernel is in interrupt context. Interrupt context is not associated with any process (the current macro, though it points to the interrupted process, is not relevant), and sleeping is not allowed: there is no backing process to reschedule, so the scheduler cannot be invoked.

7, The interrupt handler is the top half. Because of its asynchronous nature it interrupts other code (possibly even another interrupt handler on a different interrupt line), so all interrupt handlers must be as quick and simple as possible. Work is separated out of the interrupt handler and performed in the bottom half, because the bottom half can run at a more convenient time.

8, Interrupt handlers have their own stack, one per processor, one page in size: the interrupt stack.

Chapter 8: Bottom Halves and Deferring Work

1, Minimize the work done in the interrupt handler, because while it runs the current interrupt line is masked on all processors. Shortening the time interrupts are masked is critical to system responsiveness and performance. The key point about running the bottom half is that all interrupts are enabled while it executes.

2, Softirqs: a softirq never preempts another softirq; the only thing that can preempt a softirq is an interrupt handler. However, softirqs, including those of the same type, can run concurrently on other processors.

3, Tasklets: built on top of softirqs; unlike softirqs, two tasklets of the same type never run concurrently.

4, Work queues defer the work to a kernel thread; this bottom half always runs in process context, so it may reschedule and sleep.

Chapter 9: An Introduction to Kernel Synchronization

1, The Linux kernel is preemptive: in the absence of protection, the scheduler can preempt running kernel code at any time and schedule another process to run.

2, The main difference between the various locking mechanisms is their behavior when the lock is already held by another thread and thus unavailable: some locks simply busy-wait when contended, while others put the current task to sleep until the lock becomes available.

3, In fact, synchronous means the caller waits for the callee to return before proceeding to the next step; asynchronous means the caller initiates the call and continues without waiting for it to return.

4, Causes of concurrent execution in the kernel: a. interrupts; b. softirqs and tasklets; c. kernel preemption; d. sleeping and synchronization with user space; e. symmetric multiprocessing.

5, Most kernel data structures need locking: lock data, not code.

6, Deadlock: self-deadlock occurs when an executing thread tries to acquire a lock it already holds and waits forever for it to be released. In an ABBA deadlock, each thread waits for a lock held by the other, and neither releases the lock it took first. Rules for avoiding deadlock: a. acquire locks in a fixed order; b. prevent starvation; c. do not request the same lock twice; d. keep the design simple. Release locks in the reverse order of acquisition.

Chapter 10: Kernel Synchronization Methods

1, A spin lock can be held by at most one thread of execution. If a thread tries to acquire a spin lock held by another thread, it busy-loops (spins), waiting for the lock to become available again. This wastes processor time, so spin locks should not be held for long.

2, Spin locks can be used in interrupt handlers; semaphores cannot, because semaphores sleep. When using a spin lock in an interrupt handler, you must disable local interrupts (interrupt requests on the current processor) before acquiring the lock. Otherwise an interrupt handler could interrupt kernel code that holds the lock and try to acquire the same, already held, spin lock; the handler would then spin waiting for the lock to become available, but the lock holder cannot run until the handler finishes. This is the double-acquire deadlock.

3, When a bottom half shares data with process context, since the bottom half can preempt process context, you must protect the shared data in process context by both taking a lock and disabling bottom halves. When an interrupt handler shares data with a bottom half, since the handler can preempt the bottom half, you must likewise both take the appropriate lock and disable interrupts. Tasklets of the same type cannot run at the same time, so data shared only within one tasklet type needs no protection. When data is shared between two different tasklets, you must take a spin lock before accessing the data in the bottom half, but need not disable bottom halves, since tasklets never preempt one another on the same processor. For data shared with or between softirqs, a lock is likewise needed, because softirqs, even of the same type, can run concurrently on other processors.

4, Reader-writer spin locks: one or more readers can concurrently hold the reader lock; the writer lock can be held by only one writer, and no readers may hold the lock at the same time.

5, Semaphores: if a task tries to acquire an unavailable semaphore, the semaphore places it on a wait queue and puts it to sleep, freeing the processor to execute other code. When the semaphore becomes available, a task on the wait queue is awakened and acquires the semaphore.

6, Semaphores are suited to locks that are held for a long time. Semaphore locks can be acquired only in process context.

7, Reader-writer semaphores: as long as there is no writer, the number of readers concurrently holding the read lock is unlimited; conversely, only a single writer can hold the write lock, and only when there are no readers.

8, Mutex: a sleeping lock that behaves like a semaphore with a usage count of one.

9, Completion variables, sequential locks (seqlocks), barriers.

Memory Management

1, The addresses in compiled program code are logical addresses. To translate a logical address into a physical address, the CPU takes two steps: first, the segmentation memory management unit converts the logical address into a linear address; then the paging memory management unit converts the linear address into the final physical address.

2, Linux uses paged memory management. Because the i386 CPU must remain backward compatible, the Linux kernel performs segmentation only as an unnecessary but obligatory formality: every segment starts at address 0 and spans the entire 4 GB virtual address space, so the mapping from virtual address to linear address leaves the value unchanged.

3, Each process has 4 GB of virtual address space: the lower 3 GB is its own user space, and the upper 1 GB is the system space shared by all processes and the kernel. Although system space occupies the top 1 GB of every virtual address space, it maps to physical memory starting at the lowest address (0).

For system space, the address mapping is a simple linear mapping: given a virtual address x, its physical address is x - PAGE_OFFSET, where PAGE_OFFSET = 0xC0000000; correspondingly, a physical address x corresponds to virtual address x + PAGE_OFFSET. Whatever the process, once execution enters system space, the page mappings are identical.

For user space, address mapping is the essence of paging. The Linux page mapping mechanism has three levels: the page global directory PGD, the page middle directory PMD, and the page table PT, whose entries are called PTEs. Each process has its own PGD, PMD, and PT, all three of which are arrays.

Take the linear address 0000 1000 0000 0100 1000 0101 0110 1000. Its top 10 bits are decimal 32, so the i386 CPU uses 32 as the index into the page directory. The high 20 bits of that directory entry point to a page table: appending 12 zero bits to those 20 bits gives the CPU a pointer to the page table (each page table occupies one page and is therefore naturally aligned on a 4 KB boundary, so the low 12 bits of its starting address must be 0). Having found the page table, the CPU looks at the middle 10 bits of the linear address, decimal 72, and uses that as the index to find the corresponding entry in the page table. As with the directory entry, the high 20 bits of the 32-bit page table entry point to a physical memory page; appending 12 zeros yields the page's starting address. Adding the lowest 12 bits of the linear address to that starting address gives the final physical memory address.

4, Out-of-bounds access: a page fault occurs when 1. the corresponding page directory entry or page table entry is empty; 2. the corresponding physical page is not in memory; or 3. the access mode specified by the instruction does not match the page's permissions. The CPU then raises a page fault exception and executes the registered page fault handler, which sends the SIGSEGV signal to the process. Every time the process returns from an interrupt or exception, the kernel checks whether the current process has a pending signal to handle; the signal is then delivered, "Segmentation fault" is reported, and the process terminates.

5, Extension of the user stack:

Suppose the running process has exhausted its allocated stack space, i.e. the stack pointer esp already points at the lowest address of the stack area, and a subroutine must now be called, so the CPU needs to push the return address onto the stack, writing it at esp - 4, where nothing is mapped. Out-of-bounds accesses caused by stack operations are treated as a special case: the fault handler checks whether the faulting address is close to where the stack pointer points (the criterion is that it be no lower than esp - 32). If it is not, the access is an illegal out-of-bounds access; if it is, pages are allocated starting at the faulting area, mapped, and merged into the stack space, so that the stack expands.

6, The difference between interrupts/traps and exceptions: when an interrupt or a trap occurs, the CPU pushes the address of the next instruction (the one that should execute next) onto the stack as the interrupt service routine's return address; when an exception occurs, the CPU pushes the address of the aborted instruction itself (not the next instruction's address), so that the unfinished work can be completed upon return from exception handling.

7, Memory allocation algorithms: http://blog.chinaunix.net/xmlrpc.php?r=blog/article&uid=28820980&id=3848787
