Process Creation Process

Last Update:2018-12-04 Source: Internet

Author: User


------ Source code analysis based on Linux 0.11

1. Background

Process creation is undoubtedly one of the most important things an operating system does. Many books and textbooks cover the principles but skip most of the details. They say, for example, that a child process copies the resources owned by its parent, or that the child shares the same physical pages with its parent while having its own address space, and that once the child has been created it is scheduled and executed like every other process.

Books on principles concentrate on what each key part of process creation is for. Because that treatment is abstract and hard to follow, it is well worth working through the process in practice. Seemingly abstract notions such as the parent's resources, the physical pages owned by the parent, or even the parent's address space become much more concrete once you have handled them yourself. I worked through process creation and scheduling with the Linux 0.11 source code as a reference and learned a great deal from it. Here I summarize the main things I learned.

2. Process 0

A child process is created from its parent, so there must be one original process that has no parent. In Linux its process number is 0: the legendary process 0 (unfortunately, many theory books never mention this important process).

An ordinary child process is created by a standard function such as fork(), which copies the parent. Process 0 has nothing to copy or refer to, so all of its information and resources are set explicitly rather than copied. I call this manual setup: process 0 is a purely "handmade" process, the most primitive process in the operating system. It is the template from which every later process is derived.

Building process 0 by hand has two main parts. The first is creating all the information process 0 needs to run, that is, giving it "flesh and blood". The second is scheduling process 0 for execution, that is, bringing it to life, since a process is by nature a dynamic concept.

What counts as process information differs slightly between operating systems and between versions of the same operating system, but the key parts and the logic are basically the same. Here I describe the key steps and details of process creation as implemented in Linux 0.11.

1) Filling in the information for process 0

A process has a great deal of state, but in general all of it is recorded in the process descriptor (also known as the process control block), so filling in process 0 comes down to filling in its descriptor. The following is the process descriptor structure in Linux 0.11:

struct task_struct {
    long state, counter, priority, signal;
    struct sigaction sigaction[32];
    long blocked;                       /* bitmap of masked signals */
    int exit_code;
    unsigned long start_code, end_code, end_data, brk, start_stack;
    long pid, father, pgrp, session, leader;
    unsigned short uid, euid, suid, gid, egid, sgid;
    long alarm;
    long utime, stime, cutime, cstime, start_time;
    unsigned short used_math;
    int tty;                            /* -1 if no tty, so it must be signed */
    unsigned short umask;
    struct m_inode * pwd;
    struct m_inode * root;
    struct m_inode * executable;
    unsigned long close_on_exec;
    struct file * filp[NR_OPEN];
    struct desc_struct ldt[3];          /* 0 - zero, 1 - cs, 2 - ds & ss */
    struct tss_struct tss;
};

The process descriptor holds a lot of information, which falls into the following groups:

A. Run-time information, such as the current state of the process (state), its accumulated running times (utime, stime, etc.), its pending signals (signal), and its priority (priority).

B. Basic identity information, such as the process ID (pid) and user ID (uid).

C. Resource information, such as the number of the TTY sub-device in use (tty) and the inode of the process's root directory (root).

D. Information needed for execution and for CPU task switching: the local descriptor table (LDT) and the task state segment (TSS).

Not all of this information is fixed when the process is created. Most fields are simply given an initial value that changes as the process runs, but some must be set correctly before the process first runs. What really has to be filled in is whatever lets the operating system switch to process 0 smoothly, and the most important pieces are the LDT and TSS of the process. The TSS holds the state the CPU saves and restores when it switches tasks, and the LDT is the process's local descriptor table. Process 0 is the first process to run in user mode and needs its own LDT. The TSS and LDT are important mechanisms for isolating processes from one another.

One more important piece of information lives not in the process descriptor but in the global descriptor table (GDT). Because all processes are managed centrally by the operating system, the kernel must at least keep an index to each of them, and this kind of index belongs in the kernel's GDT. In Linux 0.11, every process has one LDT descriptor and one TSS descriptor in the GDT. From Linux 2.4 on, there is one TSS descriptor per CPU in the GDT instead of one per process. This changes some details of process creation and switching, but the essence of task switching stays the same.

The following is a macro that manually fills process 0's process descriptor information in linux0.11:

#define INIT_TASK \
/* state etc */ { 0,15,15, \
/* signals */   0,{{},},0, \
/* ec,brk... */ 0,0,0,0,0,0, \
/* pid etc.. */ 0,-1,0,0,0, \
/* uid etc */   0,0,0,0,0,0, \
/* alarm */     0,0,0,0,0,0, \
/* math */      0, \
/* fs info */   -1,0022,NULL,NULL,NULL,0, \
/* filp */      {NULL,}, \
    { \
        {0,0}, \
/* ldt */       {0x9f,0xc0fa00}, \
        {0x9f,0xc0f200}, \
    }, \
/* tss */ {0,PAGE_SIZE+(long)&init_task,0x10,0,0,0,0,(long)&pg_dir,\
     0,0,0,0,0,0,0,0, \
     0,0,0x17,0x17,0x17,0x17,0x17,0x17, \
     _LDT(0),0x80000000, \
        {} \
    }, \
}
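
The two LDT entries in the macro, {0x9f,0xc0fa00} and {0x9f,0xc0f200}, are raw x86 segment descriptors. As a rough illustration of what they encode, the sketch below decodes a descriptor given as two 32-bit words. The names seg_info and decode_descriptor are made up for this example and do not appear in the kernel; the bit positions follow the standard x86 descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* The descriptor fields that matter for this discussion. */
struct seg_info {
    uint32_t base;        /* linear base address */
    uint64_t size_bytes;  /* segment size in bytes */
    unsigned dpl;         /* descriptor privilege level, 0..3 */
    int is_code;          /* 1 for a code segment, 0 for a data segment */
};

/* Decode the two 32-bit words of a descriptor (low word first, as in INIT_TASK). */
static struct seg_info decode_descriptor(uint32_t lo, uint32_t hi)
{
    struct seg_info s;
    uint32_t limit = (lo & 0xffffu) | (hi & 0x000f0000u);  /* 20-bit limit */
    int granular = (hi >> 23) & 1;       /* G bit: limit counts 4 KiB pages */

    assert((hi >> 12) & 1);              /* S bit set: code or data descriptor */
    s.base = (lo >> 16) | ((hi & 0xffu) << 16) | (hi & 0xff000000u);
    s.size_bytes = granular ? ((uint64_t)limit + 1) << 12 : (uint64_t)limit + 1;
    s.dpl = (hi >> 13) & 3;
    s.is_code = (hi >> 11) & 1;          /* top bit of the 4-bit type field */
    return s;
}
```

Decoding {0x9f,0xc0fa00} this way gives a code segment with base 0, a size of 640 KiB, and privilege level 3; {0x9f,0xc0f200} is the matching data segment. Process 0 therefore runs at user privilege over the first 640 KiB of memory.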

Besides filling in the process descriptor, the related entries in the GDT must also be set, namely the LDT and TSS descriptors of process 0. This is done in sched_init:

void sched_init(void)
{
    ...
    set_tss_desc(gdt + FIRST_TSS_ENTRY, &(init_task.task.tss));
    set_ldt_desc(gdt + FIRST_LDT_ENTRY, &(init_task.task.ldt));
    ...
    ltr(0);     /* load TR with the TSS selector of task 0 */
    lldt(0);    /* load LDTR with the LDT selector of task 0 */
}

As you can see, once the TSS and LDT descriptors of process 0 have been written into the GDT, the TR and LDTR registers are loaded immediately, in preparation for running process 0.
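
To make the GDT indexing concrete, here is a small sketch of how a task number maps to its TSS and LDT selectors. The slot numbers 4 and 5 are what I recall from sched.h in Linux 0.11 and should be treated as an assumption; the function names tss_selector, ldt_selector, and gdt_index are illustrative stand-ins for the kernel's _TSS(n) and _LDT(n) macros.

```c
#include <assert.h>

/* GDT slot numbers, assumed to match Linux 0.11's sched.h. */
#define FIRST_TSS_ENTRY 4
#define FIRST_LDT_ENTRY (FIRST_TSS_ENTRY + 1)

/* Each GDT entry is 8 bytes and each task owns two consecutive entries
 * (TSS, then LDT), so selectors advance by 16 per task. */
static unsigned long tss_selector(unsigned long n)
{
    return (n << 4) + (FIRST_TSS_ENTRY << 3);
}

static unsigned long ldt_selector(unsigned long n)
{
    return (n << 4) + (FIRST_LDT_ENTRY << 3);
}

/* Recover the GDT entry number from a selector built by the functions above. */
static unsigned long gdt_index(unsigned long selector)
{
    assert((selector & 7) == 0);   /* GDT selector with RPL 0 */
    return selector >> 3;
}
```

Under these assumptions, ltr(0) loads selector 0x20 (GDT entry 4) and lldt(0) loads selector 0x28 (GDT entry 5). Task 1 would use entries 6 and 7.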

2) Running process 0

Process 0 runs in user mode, so starting it means switching from privilege level 0 to privilege level 3. This is done with the CPU instruction iret, which simulates a return from an interrupt. The work is done by move_to_user_mode:

/* Build an iret frame (SS, ESP, EFLAGS, CS, EIP) and "return" into user mode. */
#define move_to_user_mode() \
__asm__ ("movl %%esp,%%eax\n\t" \
    "pushl $0x17\n\t" \
    "pushl %%eax\n\t" \
    "pushfl\n\t" \
    "pushl $0x0f\n\t" \
    "pushl $1f\n\t" \
    "iret\n" \
    "1:\tmovl $0x17,%%eax\n\t" \
    ...)

This macro pushes the SS, ESP, EFLAGS, CS, and EIP values that process 0 will start with onto the stack. When iret executes, the CPU pops these values off the stack into the corresponding registers, and process 0 starts running. This also shows that the key registers process 0 starts with are set up by hand before it runs. Everything from the process descriptor to the execution state is set manually, which is why I call it a purely "handmade" process.
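
The constants 0x17 and 0x0f pushed by the macro are segment selectors. The sketch below splits a selector into its three fields according to the standard x86 selector format; the struct and function names are made up for this illustration.

```c
#include <assert.h>

struct selector {
    unsigned index;   /* entry number within the descriptor table */
    unsigned in_ldt;  /* table indicator: 1 = LDT, 0 = GDT */
    unsigned rpl;     /* requested privilege level, 0..3 */
};

/* Split a 16-bit segment selector into index, table indicator, and RPL. */
static struct selector decode_selector(unsigned sel)
{
    struct selector s;

    assert(sel <= 0xffffu);
    s.index = sel >> 3;
    s.in_ldt = (sel >> 2) & 1;
    s.rpl = sel & 3;
    return s;
}
```

Decoded this way, 0x0f is entry 1 of the LDT with RPL 3 (the user code segment) and 0x17 is entry 2 of the LDT with RPL 3 (the user data and stack segment). These are the two LDT entries that INIT_TASK filled in, which is why iret lands in user mode.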

3. Creating a child process

With process 0 in place, creating a child process is easier to understand. Every process other than process 0 is created with the fork() system call. The actual work is done in the kernel by _sys_fork:

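_sys_fork:
    call _find_empty_process    # find a free task slot; result in %eax
    testl %eax,%eax
    js 1f                       # negative result: no free slot
    push %gs                    # push the arguments for copy_process
    pushl %esi
    pushl %edi
    pushl %ebp
    pushl %eax                  # slot number of the new task
    call _copy_process
    addl $20,%esp               # drop the five pushed values
1:  ret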

As you can see, creating a process has two main steps. The first is to find a free process slot (find_empty_process). Linux 0.11 can hold at most 64 processes at a time, so a free entry in the process table must be obtained first to index the new process. The second is the copy itself (copy_process), which creates the child process as a copy of the parent.
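
The slot search can be sketched in user space as follows. This is a simplified model written from my reading of the Linux 0.11 source, not the kernel code itself: struct task, task_table, find_empty_slot, and SLOT_TABLE_FULL are illustrative names, and the real function returns a kernel error code instead of -1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_TASKS 64            /* size of the task table in Linux 0.11 */
#define SLOT_TABLE_FULL (-1)   /* hypothetical stand-in for the kernel's error code */

struct task { long pid; };     /* reduced, illustrative descriptor */

static struct task *task_table[NR_TASKS];
static long last_pid;

/* Choose a PID that no live task uses, then return the index of a free
 * slot. Slot 0 belongs to process 0 and is never handed out. */
static int find_empty_slot(void)
{
    int i, in_use;

    do {
        last_pid = (last_pid == LONG_MAX) ? 1 : last_pid + 1;  /* wrap around */
        in_use = 0;
        for (i = 0; i < NR_TASKS; i++)
            if (task_table[i] != NULL && task_table[i]->pid == last_pid)
                in_use = 1;
    } while (in_use);
    assert(last_pid > 0);      /* PIDs handed out are always positive */

    for (i = 1; i < NR_TASKS; i++)
        if (task_table[i] == NULL)
            return i;
    return SLOT_TABLE_FULL;
}
```

The negative return value when the table is full is what the js 1f in _sys_fork tests for.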

The main steps of copy_process are as follows:

1) Allocate a physical page in memory for the new process, place the new process's descriptor at the start of that page, and fill in the descriptor's fields;

2) Copy the parent's page tables so that parent and child map the same physical pages, and mark the shared page table entries read-only so that copy-on-write can take effect later;

3) Set the TSS and LDT descriptors for the new process in the GDT.
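
Step 1 can be sketched in user space like this. The struct is heavily reduced, malloc stands in for the kernel's page allocator, and the name copy_descriptor is made up; the field overrides reflect my reading of copy_process in Linux 0.11 and should be taken as an approximation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096

/* Reduced, illustrative descriptor; the real task_struct has many more fields. */
struct mini_task {
    long state;
    long pid, father;
    long counter, priority;
    long eax;        /* stands in for tss.eax, the saved return-value register */
    uintptr_t esp0;  /* stands in for tss.esp0, the kernel stack pointer */
};

enum { TASK_RUNNING = 0, TASK_UNINTERRUPTIBLE = 2 };

/* Build a child descriptor in a fresh page by copying the parent's and
 * then overriding the fields that must differ. Returns NULL on failure. */
static struct mini_task *copy_descriptor(const struct mini_task *parent, long new_pid)
{
    struct mini_task *child;

    assert(sizeof(struct mini_task) <= PAGE_SIZE);
    child = malloc(PAGE_SIZE);             /* one page holds descriptor + stack */
    if (child == NULL)
        return NULL;
    *child = *parent;                      /* start as an exact copy */
    child->state = TASK_UNINTERRUPTIBLE;   /* not schedulable while being set up */
    child->pid = new_pid;
    child->father = parent->pid;
    child->counter = child->priority;      /* fresh time slice */
    child->eax = 0;                        /* fork() returns 0 in the child */
    child->esp0 = (uintptr_t)child + PAGE_SIZE;  /* kernel stack starts at page top */
    child->state = TASK_RUNNING;           /* now eligible for scheduling */
    return child;
}
```

The esp0 line mirrors the PAGE_SIZE+(long)&init_task entry in INIT_TASK: the descriptor sits at the bottom of the page and the kernel stack grows down from the top of the same page.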

These are the main steps in setting up a child process in Linux 0.11. Other versions differ and perform better, but the basic procedure in this version reflects the essentials of how an operating system creates a process.

4. Running the child process

A newly created child process does not run immediately. It has to be selected by the scheduler at least once. Unlike process 0, it does not need values pushed on the stack by hand followed by iret; it starts through an ordinary task switch. Leaving aside the scheduling algorithm and how the next process is selected, the switch itself is performed by the following macro:

/* Switch to task n: ljmp through a far pointer whose selector is n's TSS. */
#define switch_to(n) {\
struct {long a,b;} __tmp; \
__asm__("cmpl %%ecx,_current\n\t" \
    "je 1f\n\t" \
    "movw %%dx,%1\n\t" \
    "xchgl %%ecx,_current\n\t" \
    "ljmp %0\n\t" \
    "cmpl %%ecx,_last_task_used_math\n\t" \
    "jne 1f\n\t" \
    "clts\n" \
    "1:" \
    ::"m" (*&__tmp.a),"m" (*&__tmp.b), \
    "d" (_TSS(n)),"c" ((long) task[n])); \
}

The switch comes down to a single ljmp. Its operand is a far pointer whose selector refers to the TSS descriptor of the new task, which makes the CPU perform a task switch: it loads CS, EIP, EFLAGS, SS, ESP, and the other registers from the new process's TSS and starts executing the new code. Because the pages shared with the parent were marked read-only during the copy, the first time the child writes to one of these pages a page protection fault is raised. The fault handler then performs the copy-on-write and allocates a separate page for the child.
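
The copy-on-write behaviour can be modelled in user space with a share count per page. This is only a simulation under simplifying assumptions: struct frame, struct pte, share_page, and write_byte are names invented for the sketch, and a real kernel reacts to a hardware page fault instead of checking a flag on every write.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

/* A simulated physical page with a share count. */
struct frame {
    unsigned char data[PAGE_SIZE];
    int refs;
};

/* A simulated page table entry: the mapped frame and a writable flag. */
struct pte {
    struct frame *frame;
    int writable;
};

/* Fork-time sharing: the child maps the same frame, both entries read-only. */
static void share_page(struct pte *parent, struct pte *child)
{
    parent->writable = 0;
    *child = *parent;
    parent->frame->refs++;
}

/* Write one byte. A write through a read-only entry takes the "fault" path,
 * which copies the frame if it is still shared. Returns 0 on success. */
static int write_byte(struct pte *e, size_t off, unsigned char v)
{
    assert(off < PAGE_SIZE);
    if (!e->writable) {
        if (e->frame->refs > 1) {
            struct frame *copy = malloc(sizeof *copy);
            if (copy == NULL)
                return -1;
            memcpy(copy->data, e->frame->data, PAGE_SIZE);
            copy->refs = 1;
            e->frame->refs--;      /* the old frame loses one sharer */
            e->frame = copy;
        }
        e->writable = 1;           /* sole owner now: writes may proceed */
    }
    e->frame->data[off] = v;
    return 0;
}
```

After share_page, a write through either entry gives the writer its own private copy of the page, while the other side keeps the original frame and can simply be made writable again once it is the only owner.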

Appendix: the difference between a task and a process

Tasks and processes are easily confused. Linux adds to the confusion by naming the process descriptor struct task_struct rather than something with "process" in it. My personal view is that a task is the lower-level concept and belongs to the CPU, while a process is the higher-level concept and belongs to the operating system.

A task is a group of program operations that together implement some function, and it ultimately comes down to the instruction level. When we talk about task switching, we are ultimately talking about CPU instructions.

A process usually means a program in execution, which is a dynamic thing. Besides the program being run, a process also carries a lot of run-time information, such as its running time and its signals.

Source: http://www.51x86.com/article/09-12/1628861261320676.html? Sort = 899_0_0_0

