Fork introduction:
Fork means "to branch" or "to split". In operating systems, fork is the well-known Unix (and Unix-like, such as Linux or Minix) system call for creating a child process.
[Note1]
What does fork do? In other words, what is it used for?
-It creates a new process, as everyone on Earth knows.
What kind of process does it create?
-One that is basically the same as the process that called fork (); it is in fact a copy of your original process.
Is it exactly the same?
-Of course it cannot be exactly the same. What would you do with two processes that are identical in everything except the PID?
What's different?
-The most important difference, of course, is the code each one executes after fork (). You know, I know :)
How is that achieved?
-On Windows, you would have to pass a long list of things to the call to specify all of this ......
I use Unix.
-Then it is very simple. UNIX makes the two processes (yes, there used to be one; UNIX copied it for you, and now there are two)
see different values after fork (): the return values differ. In one process (the one with the new PID), fork () returns zero;
this process is the "child process". In the other process (the one with the original PID), fork () returns the PID of that child;
this process is the "parent process".
What then?
-Whoever writes the code is not stupid. The code checks whether the return value is zero to decide: am I in the child process or
in the parent process? The child runs the child's code, and the parent runs the parent's code ......
Some hardcore Windows fans argue that Windows does it better: there the child process holds only the child's code and the parent holds only the parent's code.
Isn't UNIX being stupid? The child contains both the parent's and the child's code, and so does the parent. Doesn't that waste memory?
-As far as I know, Unix code segments are reentrant. That is, duplicating a process does not copy the code segment; several processes
share the same one. What a new process adds is only the global data, the file descriptor references, and the stack. If you have a process
with 10 MB of code and fork () three or four child processes, memory usage grows only a little (provided you do not use many global variables),
rather than exceeding 40 MB.
[Note2]
From fork onward the program branches (not a branch in the strict sense): one branch is the parent process, where pid > 0 (pid holds the child's process ID); the other is the child process, where pid == 0. From then on they are two separate tasks.
In fact, the two branches already exist inside fork, and the data segment has been copied, so there are two copies of pid.
The statement pid = fork () assigns the return value to pid, and this assignment runs in both processes.
To the parent process, fork returns a value greater than 0: the PID of the process it created.
To the child process, fork returns 0.
Needless to say, the if ... else comparison also runs in both processes.
[Note3]
Analysis of Fork
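The program is as follows:
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    pid = fork();   /* returns once in the parent and once in the child */
    if (pid < 0)
        printf("error in fork!\n");
    else if (pid == 0)   /* fork returned 0: this is the child */
        printf("I am the child process, my process ID is %d\n", (int)getpid());
    else                 /* fork returned the child's PID: this is the parent */
        printf("I am the parent process, my process ID is %d\n", (int)getpid());
    return 0;
}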
The result is
[root@localhost c]# ./a.out
I am the child process, my process ID is 4286
I am the parent process, my process ID is 4285
I:
To understand how fork executes, we must first be clear about the concept of a "process" in an operating system. A process consists of three main elements:
o an executable program;
o all data associated with the process (variables, memory space, buffers, and so on);
o the execution context of the program.
Simply put, a process represents the state of one execution of an executable program. The operating system typically manages processes through a process table, in which each entry records one process currently in the system. On a single CPU, only one process occupies the CPU at any given moment, but several active processes (waiting to run or to continue running) may exist in the system at the same time. A register called the "program counter (PC)" indicates the position of the next instruction to be executed by the process that occupies the CPU. When the CPU time allocated to a process is used up, the operating system saves the values of that process's registers into its entry in the process table,
then reads the context of the process that will take over the CPU from the process table and loads it into the corresponding registers (this is called a "process context switch"; a real context switch involves more data, but that is irrelevant to fork. Just remember that the program counter PC records how far the program has executed and is an important part of the process context: a process being switched off the CPU must save the value of this register, and a process being switched onto the CPU must restore it from the execution context saved in its process table entry).
With these concepts in place, we can talk about fork. When your program executes the following statement:
pid = fork();
the operating system creates a new process (the child process) and adds a new entry for it in the process table. The new process runs the same executable program as the original process, and its context and data are mostly copies of those of the original process (the parent), but they are two independent processes! At this point the program counter PC in the contexts of both parent and child indicates that the current fork call is about to return (the child is not on the CPU at this moment, so its PC is not actually held in the register but is stored as part of its context in its process table entry). The question is how the call returns, and this is where parent and child part ways.
The parent process continues to execute. The operating system implements fork so that, in the parent, the call returns the PID of the newly created child (a positive integer). In the if statement that follows, neither the pid < 0 branch nor the pid == 0 branch is taken, so the output is "I am the parent process ...".
The child process is scheduled at some later time, and its context is swapped in so that it occupies the CPU. The operating system implements fork so that the call returns 0 in the child. So in this process (note that this is not the parent: although it is the same program, it is another execution of that program, which the operating system represents as a separate process, and in terms of execution it is independent of the parent) pid is 0. When this process continues to run, the test pid < 0 fails but pid == 0 is true, so the output is "I am the child process ...".
Why does it seem that two mutually exclusive branches in the program are executed? This is of course impossible during one execution of a program; but the two lines of output you see come from two processes, which come from two executions of the same program.
After fork, the operating system has made a child process that is a near-identical copy of the parent. Although the two are parent and child, from the operating system's point of view they are more like siblings: they share the code space, but their data spaces are independent of each other. The contents of the child's data space are a complete copy of the parent's, and the instruction pointers are identical. There is only one difference: if fork succeeds, it returns 0 in the child and the child's process number in the parent. If fork fails, it returns an error (-1) in the parent.
You can picture the two processes as having run in lockstep all along, step for step. After fork they each go off to do a different job, that is, they split. This is why fork is called fork.
Once fork () is used in a piece of code, the program branches and there are two processes. Which one runs first depends on the system's scheduling algorithm.
If the parent and child processes need to coordinate, this can be done with synchronization primitives.
II:
Process Creation:
The system call for creating a process is simple. You only need to call the fork function.
#include <unistd.h>
pid_t fork(void);
After a process calls fork, the system creates a child process. The child differs from the parent essentially only in its process ID and parent process ID; everything else is the same, as if the parent had cloned itself. Of course, creating two identical processes would be pointless. To tell parent and child apart, we must check the return value of fork. When fork fails (not enough memory, or the maximum number of processes has been reached), it returns -1. Otherwise, fork returns the child's process ID to the parent and 0 to the child, and we distinguish the two by this value. Why would a parent process create a child process? As mentioned earlier, Linux is a multi-user operating system in which many users compete for system resources at the same time,
so a process may create child processes to compete for resources. Once a child is created, parent and child both continue running from the fork call and compete for system resources. Sometimes we want the child to keep running while the parent blocks until the child has finished its task. In that case we can call wait or waitpid.
Summary:
1. The PID of the parent process does not change;
2. To the child process, fork returns 0, but the child's PID is certainly not 0. fork can return 0 to the child because the child can call getpid () at any time to obtain its own PID;
3. After fork, there is no telling which of parent and child runs first or ends first unless some synchronization method is used. The claim that the parent returns from fork only after the child has ended is not true for fork; it describes vfork.
[Note4]
First, it must be clear that (on x86) a function's return value is stored in the register eax.
Second, when fork returns, the new process gets 0 because eax is set to 0 when its task structure is initialized;
Inside fork, the child process is added to the run queue, and the process scheduler picks it to run at an appropriate time. From this point on, the current process has split into two concurrent processes.
Whichever process is scheduled, it executes the remaining code of the fork function and then returns its own value.
[Note5]
For fork, the parent and child processes share the same code space, so it looks as if one call returns twice. In fact, for the parent process that called fork, if the forked child has not yet been scheduled, the parent simply returns from the fork system call; analyzing sys_fork shows that fork returns the child's ID. Now look at the forked child: the copy_process function shows that the child's return address is ret_from_fork (it returns to the same code point as the parent) and that its return value is set directly to 0. So when the child is scheduled, it too returns from fork, with return value 0.
Two things to note: 1. Where the parent or child process resumes execution after fork returns (each process first takes the eax value of its own context as the return value). 2. Where the two returned values are stored (in eax).
The process calls copy_process to obtain the value of last_pid (placed in eax; when fork returns normally, the value returned in the parent is last_pid).
The eax in the child's task state segment (TSS) is set to 0:
fork.c
p->tss.eax = 0; (for the child to run, a process switch is needed; at the switch, the eax value in the child's TSS is loaded into the eax register, so when the child runs, the content of eax is what gets returned first)
When the child starts executing, the call appears to return with that eax value.
After fork (), two tasks run concurrently. The parent uses its own TSS and the child uses its own TSS, and each picks up its own eax value at the switch.
Therefore, "one call, two returns" actually involves two different processes!
Look at this statement: pid = fork ()
When this statement executes, the current process enters fork (), which uses inline assembly to make the system call: int 0x80 (for the code, see the _syscall0 macro at line 133 of unistd.h in kernel version 0.11). The kernel then runs the sys_fork system call, selected by the system call number written to eax beforehand. sys_fork first calls the C function find_empty_process to obtain a process number for the new process, and then calls the C function copy_process to copy the parent's contents to the child, except that eax in the child's TSS is set to 0 (this is why the child returns 0). Once that is done, copy_process returns the PID of the new process (the child), which is saved in eax. The child now exists. Child and parent have the same code space, and the instruction pointer register EIP of each points to the same next instruction. When fork returns normally in the parent, fork () yields the child's process number and the else branch (pid > 0) runs. When a process switch selects the child, the child's execution environment is restored first, that is, the child's TSS (task state segment) is loaded, and with it the eax value (set to 0 in copy_process) is loaded into the eax register. So when the child runs, fork returns 0 and the if (pid == 0) branch runs.
[Note5]
The key to understanding it is to understand stack switching, pushing, and popping!
Child process return:
The child process copies the parent's stack contents, listed here from high addresses to low:
SS
ESP
EFLAGS
CS
EIP ----- This is the instruction after int 0x80. It is also where the child process starts executing !!!!
DS
ES
FS
EDX
ECX
EBX
GS
ESI
EDI
EBP
EAX (0)
Because eax = 0, fork returns 0 in the child process.
Note: the user stack of the new process is set to the user stack of the parent process (the SS and ESP that are popped last). If the parent and child share the user stack in copy-on-write mode
(as in Linux), and the parent modified the stack before the child ran (almost certain if the parent returns first), then the system has already made a copy of the user stack for the parent, and the parent's original user stack is left to the child. By then the kernel stack of the new process has been emptied, and the new process returns to user mode and back into the fork function.
[Note6]
Fork discussion and evaluation:
Is fork any good? Compare other operating systems such as Windows, which has functions like CreateProcess that create an independent new process from scratch, with a long list of parameters to tell it exactly what the process should be...
Thanks, but no thanks !!!
K.I.S.S. (Keep it simple, stupid.) is the highest principle of UNIX.
Fork originated in the UNIX operating system. It was a genius invention of Ken Thompson and Dennis Ritchie at Bell Labs (the fathers of UNIX and C !!!). Linux has Unix blood in its veins, so it inherited this invention.
This method is highly efficient, because the cost of copying is very low. In implementing computer networks, and in implementing the server side of a client/server system, fork is often the most natural, effective, and appropriate tool. Many people even wonder which came first, fork or client/server, because fork seems to have been designed specifically for that purpose! An even more important advantage is that fork makes it easy to set up a simple and effective inter-process communication channel between parent and child through pipe, which gave rise to the pipeline mechanism of the shell, the operating system's user interface. This had a profound influence on the development and adoption of UNIX and on the formation of the Unix programming environment
and the Unix programming style. It can fairly be called a genius invention, one that greatly changed the direction of operating system development.