After this period of study, I have gained some understanding of Linux. Today's post is a summary of what I have learned so far.
My earlier Linux posts go progressively deeper, from the top down. They are listed below:
First: "Linux Operating System Analysis": how the computer starts and how it operates
Second: "Linux Operating System Analysis": interrupts and time-slice round-robin scheduling in a stripped-down Linux kernel
Third: "Linux Operating System Analysis": tracing the boot process of the Linux kernel
Fourth: "Linux Operating System Analysis": using the same system call in two ways, through the library function API and through assembly code embedded in C
Fifth: "Linux Operating System Analysis": how system_call handles a system call
Sixth: "Linux Operating System Analysis": how the Linux kernel creates a new process
Seventh: "Linux Operating System Analysis": how the Linux kernel loads and starts an executable program
Eighth: "Linux Operating System Analysis": when process scheduling happens, and tracing process scheduling and process switching
Any form of learning is not an overnight process, to undergo a painful period of discipline.
At the beginning of the contact with Linux, a variety of commands, but after a period of time you can remember. As for the order, I will write another blog to summarize some of the basic commands that I have learned about bird's private dishes.
To get to the next, now I learn to understand the Linux kernel this book, do a periodic summary.
I'll summarize the following aspects of Linux. This includes computer startup, kernel loading, processes, interrupts and exceptions, process scheduling, system calls, and program execution. Other sections are analyzed and summarized after the next study is completed.
I. Computer startup
As I said in my first Linux post, we all wonder how a computer starts. How does it turn from a lump of metal into a machine that calculates far faster than any human? I'll borrow a phrase from that post: "pull oneself up by one's bootstraps", which literally means lifting yourself by pulling on your own shoelaces. Obviously nobody can do that. To start, the computer needs a first instruction; after that it can go on to fetch the following ones. So where does the first instruction come from? It is stored in the BIOS, at a fixed location. Once the first instruction has been loaded, everything else can follow.
Put simply, a running computer is like a person waiting for work: when there is something to do he does it, and when there is nothing he idles. If several things arrive at the same time, he decides the order in which to handle them according to his own priorities.
For a detailed analysis, see my first blog post.
II. Kernel loading
Kernel loading proceeds as follows. The function asmlinkage __visible void __init start_kernel(void) runs. It calls set_task_stack_end_magic(&init_task); the structure init_task is set up as the current task when Linux starts (this is the idle process, which is now running), and the rest of the kernel state is initialized as well. Execution then reaches rest_init(). rest_init calls kernel_thread(kernel_init, NULL, CLONE_FS), which starts the first kernel thread, kernel_init. kernel_init in turn runs /sbin/init through do_execve. This is the init process we see, with PID 1. Once initialization is complete, Linux calls schedule() and the whole system is running.
So idle is the process with PID 0. It is the first process created during Linux boot, and after the system is loaded it turns into the process that handles scheduling, switching, and memory management. The idle process on the main processor evolves from the original process (pid=0). The idle processes on the secondary processors are forked by the init process, but their PIDs are also 0. The idle process has the lowest priority and does not take part in normal scheduling; it runs only when the run queue is empty. The idle loop waits for need_resched to be set. Process 1 is the init process, created by process 0 to complete the initialization of the system. It is the ancestor of all other user processes in the system.
For a detailed analysis, see my third blog post.
III. Processes
Generally speaking, a process is an instance of a program in execution; it can be seen as the collection of data structures that fully describes how far the program's execution has progressed.
Processes are like human beings: they are created, lead a more or less meaningful life, may produce one or more child processes, and eventually die. Of course, processes have no sex, so each has only one parent.
From the kernel's point of view, the purpose of a process is to act as the entity to which system resources are allocated.
A process can share code with other processes, but they have separate copies of the data (stacks and heaps) that do not affect each other.
Each process has its own process descriptor (a structure of type task_struct), which contains the basic information about the process, a pointer to the memory area descriptor, the TTY associated with the process, the current directory, a pointer to the file descriptors, the signals received, and so on.
A process is in one of the following states: running (TASK_RUNNING), interruptible wait (TASK_INTERRUPTIBLE), uninterruptible wait (TASK_UNINTERRUPTIBLE), stopped (TASK_STOPPED), and traced (TASK_TRACED). Two further states can be stored in either the state field or the exit_state field; a process enters one of them only when its execution terminates. They are the zombie state (EXIT_ZOMBIE) and the dead state (EXIT_DEAD).
Of course, each process has a unique process identifier.
Process switching: to control the execution of processes, the kernel must be able to suspend the process running on the CPU and resume a previously suspended process. For a detailed analysis, see my eighth blog post. Here is the short version, using a person as an example. Someone is doing task A when another task, B, comes up and has to be handled. To switch between the two, he must first know where to pick A up again once B is finished, so he saves the information about how far A has got. Second, he needs to know where B starts and where to find its tools, so that information has to be read in. Once the saving and reading are done, he can work on B, and the process switch is complete.
My summary of this topic is as follows:
First, for each process, Linux packs two different data structures into a single memory area allocated for the process: one is the kernel-mode process stack, the other is thread_info, a small structure linked to the process descriptor and called the thread descriptor. (The area is declared as a union and is typically 8192 bytes, that is, two page frames.) The thread descriptor sits at the beginning of this memory area, and the stack grows down from the end (the high address). Because thread_info is 52 bytes long, the kernel stack can grow to 8140 bytes.
Second, a newly created process is almost identical to its parent. It receives a (logical) copy of the parent's address space and executes the same code as the parent, beginning at the instruction that follows the process-creation system call. Although parent and child can share the pages that hold the program code, each has its own copy of the data (stack and heap), so a change the child makes to a memory location is not visible to the parent (and vice versa).
Third, process creation roughly follows this path: sys_clone -> do_fork -> copy_process -> dup_task_struct -> copy_thread -> ret_from_fork.
For a detailed analysis, see my sixth blog post.
IV. Interrupts and exceptions
For a detailed analysis, see my second blog post.
Put simply, an interrupt means that while you are doing one thing, you are interrupted to go and do something else. Here is an example from everyday life. I am boiling a kettle of water on the gas stove in the kitchen, so I have to stay there and wait for it to boil; if the water boils over and puts out the flame, a disaster could follow. While I wait, a startled cry comes from outside: "Why didn't you turn off the tap?" Ashamed, I realize that I was so busy grumbling about this boring chore after filling the kettle that I forgot all about it. I rush to the water pipe and quickly turn off the tap, and the voice reaches my ears again: "How can you be so careless?" I stick out my tongue, the small incident is over, and my lonely eyes fall back on the kettle. Then a rousing theme song sounds from outside the door: my favorite costume drama is about to start. I really want to rush out, but hearing the kettle bubbling, I know that until the water boils there will be no enjoying life for me.
There are two types of interrupts:
Synchronous interrupts: produced by the CPU control unit while it executes instructions. They are called synchronous because the control unit issues them only after it has finished executing an instruction.
Asynchronous interrupts: generated by other hardware devices at arbitrary times with respect to the CPU clock signals.
Interrupt signal: provides a special way for the processor to switch to code outside the normal control flow. When an interrupt signal arrives, the CPU must stop what it is currently doing and switch to a new activity. (To do this, it saves the current value of the program counter, that is, the contents of the EIP and CS registers, on the kernel stack, and puts an address associated with the interrupt type into the program counter.)
Classification of interrupts and exceptions:
Interrupts: maskable interrupts and nonmaskable interrupts.
Exceptions:
Processor-detected exceptions: faults, traps, and aborts.
Programmed exceptions (also called software interrupts): occur at the request of the programmer and are triggered by an int or int3 instruction.
V. Process scheduling
For a detailed analysis, see my eighth blog post.
In a nutshell, process scheduling works like this: the schedule() function selects a new process to run and calls context_switch to switch contexts, and context_switch in turn calls switch_to to do the critical part of the context switch.
next = pick_next_task(rq, prev); The process scheduling algorithm is encapsulated in this function, which selects the process to run next.
context_switch(rq, prev, next); The process context switch.
switch_to uses the two parameters prev and next: prev points to the current process, and next points to the process that has been selected to run.
After studying this part, I summarize it as follows:
First, when process scheduling happens
1. During interrupt handling (including clock interrupts, I/O interrupts, system calls, and exceptions), schedule() is either called directly or called on the way back to user mode, depending on the need_resched flag;
2. Kernel threads can call schedule() directly to switch processes, and can also be scheduled during interrupt handling. In other words, kernel threads, as a special kind of process, can be scheduled both actively and passively;
3. User-mode processes cannot schedule actively. They can only be scheduled at some point after entering kernel mode, that is, during interrupt handling.
Second, process switching
1. To control the execution of processes, the kernel must be able to suspend the process executing on the CPU and resume a previously suspended process. This is called a process switch, task switch, or context switch;
2. Suspending the process executing on the CPU is different from saving the context at an interrupt. Before and after an interrupt, execution stays in the same process context and only moves from user mode to kernel mode;
3. The process context contains all the information needed to execute the process:
3.1. User address space: program code, data, user stack, and so on
3.2. Control information: process descriptor, kernel stack, and so on
3.3. Hardware context (note that interrupts also save the hardware context, only in a different way)
4. The execution flow of a process switch:
schedule() -> pick_next_task() -> context_switch() -> switch_to -> __switch_to()
Third, several special cases
1. When scheduling happens through interrupt handling, a switch between a user-mode process and a kernel thread, or between two kernel threads, is very similar to the most general case. The only difference is that an interrupt taken while a kernel thread is running involves no transition between user mode and kernel mode;
2. When a kernel thread actively calls schedule(), only the process context is switched. No interrupt context is switched, so this is slightly simpler than the most general case;
3. The system call that creates a child process, such as fork: where the child starts executing and how it returns to user mode. (At step 6 of the switch_to process, execution returns to ret_from_fork.)
4. Returning to user mode after loading a new executable program, as with execve. (At step 6 of the switch_to process, a statically linked program returns to the start of the program, and a dynamically linked one returns to the dynamic linker.)
VI. System calls
For a detailed analysis, see my fourth and fifth blog posts.
In Linux, the system call number is passed in the register %eax, so calling fork (system call number 2 on 32-bit x86) means storing 2 in %eax and then making the system call. Parameters are also passed in registers. Linux allows at most 6 parameters to be passed to a system call, carried in order by the 6 registers %ebx, %ecx, %edx, %esi, %edi, and %ebp. A process uses different stacks in user mode and in kernel mode, called the user stack and the kernel stack; each handles the function calls made at its own privilege level. On a system call, the process not only switches from user mode to kernel mode but also switches stacks, so that the kernel's system call code runs on the kernel stack. When the system call returns, the process switches back to the user stack and continues its function calls in user mode.
The system_call flow is as follows:
1) System calls are initialized in this order: start_kernel() -> trap_init() -> set_system_trap_gate(SYSCALL_VECTOR, &system_call);
2) User mode enters kernel mode through interrupt 0x80. During kernel initialization, trap_init() calls set_system_trap_gate() to set up the Interrupt Descriptor Table entry for vector 128, so that the corresponding interrupt service routine is entered.
3) The system_call() function first saves the system call number and all the CPU registers that the handler may need onto the appropriate stack, and then calls the service routine. When the system call service routine finishes, system_call() gets its return value from eax. A series of checks follows, and finally execution of the user-mode process resumes.
VII. Program execution
For a detailed analysis, see my seventh blog post.
My summary of program execution is as follows:
First, loading an executable program is done by a system call. When an executable is run, the execve system call traps into kernel mode, loads the executable file, and overwrites the current process with it. When the execve system call returns, it returns into the new executable (as the programmer sees it, execution starts at main).
Second, the execve function loads and runs a new program in the context of the current process. It overwrites the address space of the current process but does not create a new process. The new program still has the same PID and inherits all the file descriptors that were open when execve was called.
The execve function loads and runs the executable object file filename with the argument list argv and the environment variable list envp. The argv variable points to a null-terminated array of pointers, each of which points to an argument string; by convention, argv[0] is the name of the executable object file. The envp variable has the same structure as argv, except that each environment variable string is a name-value pair of the form "NAME=VALUE".
The difference between execve and fork: execve returns to the calling program only when an error occurs. That is, execve does not return on success and returns -1 only on failure, whereas fork is called once and returns twice.
Summary:
After this period of study, I have gained some understanding of Linux.
I also went through the three stages of "seeing the mountain as a mountain, seeing the mountain as not a mountain, and seeing the mountain as a mountain again". In the end, I feel that the more you learn, the less you realize you know.
Fortunately, I learned the relevant material and also strengthened my ability to overcome difficulties.
Of course, I have one regret: the dull stretches of study sometimes made me relax my standards, and I did not always persevere.
Note:
Yang Junpeng + Original work; please credit the source when reposting + "Linux Kernel Analysis" MOOC course http://mooc.study.163.com/course/USTC-1000029000