Learning fork() and vfork()
Through the previous sections, we learned about the concept of processes and the implementation of processes in Linux. In this section, we will learn how to create a process in Linux.
I. Preface:
From the theory covered earlier, we know that each process is identified by a process ID (PID). When a process is created, the system assigns it a unique PID. When a process terminates and its termination status has been delivered to its parent process (the process that created it), the entire life cycle of the process is over. At that point all resources held by the process, including its PID, are released.
How to create a process in Linux?
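There are two ways to create a process: it can be created by the operating system itself, or it can be created by a parent process (in which case the new process is called a child process).
The fork system call creates a new process. vfork can also create a process. It does not literally call the fork function; in the Linux 2.6 kernel both are implemented by the same internal routine, do_fork(), called with different flags.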
Tiger-John note:
1. The processes created by the operating system are equal and there is no resource inheritance relationship between them.
2. A process created by a parent process generally inherits resources from its parent.
3. When the system starts, the OS creates a number of processes that are responsible for managing and allocating system resources. These are the system processes.
For example, process 0, the idle process, is the first thread of execution, built from scratch at boot; once the system is up, its main purpose is to save power when nothing else can run. Process 0 is the process that boots the system, and its PCB (process control block) is init_task. Note that init_task is the head PCB belonging to process 0, not to process 1 (the init process). After boot is complete, process 0 becomes the idle process on its CPU. Each CPU has its own idle process, registered in the init_tasks[] array. The idle process never enters the ready queue. Once the system is stable, the idle process is scheduled only when the ready queue is empty, so if no other process is runnable it occupies the CPU for long stretches.
Process 1 (the init process) is a user-level process started by the kernel, and it is the ancestor of all user processes. During the initialization phase, Linux 2.6 first creates it as a kernel thread, kernel_init:
kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
The flags CLONE_FS and CLONE_SIGHAND mean that process 0 and process 1 share file system information (CLONE_FS) and signal handlers (CLONE_SIGHAND). Where CLONE_FILES is also passed, the two share the table of open files as well. When the scheduler selects the kernel_init kernel thread, kernel_init runs a series of kernel initialization functions to finish bringing up the system.
So how does the kernel_init () kernel thread become a user process?
The kernel function kernel_init() calls the execve() system call, which loads the user-mode executable init (/sbin/init). Note that the kernel function kernel_init() and the user-mode executable init are different pieces of code and run in different modes. init is therefore an ordinary process started from a kernel thread, and it is the first process to run in user mode. The init process never terminates, because it creates and monitors the activity of all processes outside the operating system kernel.
II. Learning the fork() and vfork() functions:
1. The fork() function
A call to fork() splits the current process in two: the original parent process and a newly created child process. In the parent, fork() returns the process ID of the child; in the child, it returns 0. If the process cannot be created, fork() returns -1 to the caller and no child exists. The usual causes of failure are that the number of processes the caller is allowed to own exceeds the limit (errno is set to EAGAIN) or that memory is insufficient (errno is set to ENOMEM). The return value therefore tells us which process we are in. After fork() returns, whether the parent or the child runs first is not fixed; it depends on the scheduling algorithm used by the kernel. In general the OS gives all processes an equal right to run unless some have a higher priority. If the parent dies before the child, the child becomes an orphan process and is adopted by the init process.
Example: creating a process with the fork() function:
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    pid_t pid;

    printf("PID before fork(): %d\n", (int)getpid());

    pid = fork();
    if (pid < 0) {
        /* fork() failed: no child was created */
        printf("error in fork!\n");
    }
    else if (0 == pid) {
        /* child: fork() returned 0, so the value printed for pid is 0 */
        printf("I'm the child process, curppid is %d, parentpid is %d\n", (int)pid, (int)getppid());
    }
    else {
        /* parent: fork() returned the child's PID */
        printf("I'm the parent process, child PID is %d, parentpid is %d\n", (int)pid, (int)getpid());
    }
    return 0;
}
Running the program gives the following result:
think@ubuntu:~/work/process_thread/fork$ ./fork
PID before fork(): 4566
I'm the parent process, child PID is 4567, parentpid is 4566
I'm the child process, curppid is 0, parentpid is 4566
The output shows that a single call to fork() returns in two processes. In the child the return value is 0, and in the parent the return value is the PID of the newly created child.
Tiger-John note:
1> A Linux process generally consists of a code segment, a data segment, and a stack segment. The code segment holds the program's executable code. The data segment holds the program's global variables, constants, and static variables. The heap holds dynamically allocated memory. The stack is used for function calls and holds function parameters and the local variables defined inside functions.
2> Some people say that fork() returns two values. That is not accurate. When fork() is called, the system copies the contents of the calling process into a newly created child process, so the current process is split into two processes that execute separately and do not interfere with each other. fork() then returns once in each of them.
3> The following example shows how the system executes fork().
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(void)
{
    pid_t pid;
    int count = 0;

    pid = fork();

    /* From here on, parent and child both execute every statement. */
    printf("this is first time, pid = %d\n", (int)pid);
    printf("this is the second time, pid = %d\n", (int)pid);
    count++;    /* each process increments its own copy of count */
    printf("Count = %d\n", count);

    if (pid > 0) {
        printf("this is the parent process, the child has the PID: %d\n", (int)pid);
    }
    else if (!pid) {
        printf("this is the child process.\n");
    }
    else {
        printf("fork failed.\n");
    }

    printf("this is third, pid = %d\n", (int)pid);
    printf("this is four time, pid = %d\n", (int)pid);
    return 0;
}
Program output:
think@ubuntu:~/work/process_thread/fork1$ ./fork
this is first time, pid = 4614
this is the second time, pid = 4614
Count = 1
this is the parent process, the child has the PID: 4614
this is third, pid = 4614
this is four time, pid = 4614
this is first time, pid = 0
this is the second time, pid = 0
Count = 1
this is the child process.
this is third, pid = 0
this is four time, pid = 0
think@ubuntu:~/work/process_thread/fork1$
Tiger-John note:
The output above looks strange at first: every printf statement appears twice, yet the "count++;" statement seems to have run only once, because both processes print Count = 1.
After fork() is called, the system splits the process into two processes that execute separately without interfering with each other. Each one runs the rest of main() on its own copy of count, so each increments its own copy from 0 to 1.
2. Differences between vfork() and fork():
1> vfork() can also create a process. It does not literally call fork(); in the kernel both are implemented by the same underlying routine, called with different flags.
2> Let's look at two programs and their output before explaining the differences.
Example 1: using the fork() function to create a process
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>

int globvar = 5;

int main(void)
{
    pid_t pid;
    int var = 1;
    int i;

    printf("fork is different with vfork\n");

    pid = fork();
    if (!pid) {
        /* child: works on its own copies of globvar and var */
        i = 3;
        while (i-- > 0) {
            printf("child process is running\n");
            globvar++;
            var++;
            sleep(1);
        }
        printf("Child's globvar = %d, var = %d\n", globvar, var);
    }
    else if (pid > 0) {
        /* parent: test pid > 0 so that -1 reaches the error branch */
        i = 5;
        while (i-- > 0) {
            printf("parent process is running\n");
            globvar++;
            var++;
            sleep(1);
        }
        printf("Parent's globvar = %d, var = %d\n", globvar, var);
        exit(0);
    }
    else {
        /* perror() appends the error text and a newline itself */
        perror("process creation failed");
        exit(EXIT_FAILURE);
    }
    return 0;
}
Running the program gives:
think@ubuntu:~/work/process_thread/fork3$ ./fork
fork is different with vfork
parent process is running
child process is running
child process is running
parent process is running
child process is running
parent process is running
Child's globvar = 8, var = 4
parent process is running
parent process is running
Parent's globvar = 10, var = 6
Example 2: using the vfork() function to create a process
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>

int globvar = 5;

int main(void)
{
    pid_t pid;
    int var = 1;
    int i;

    printf("fork is different with vfork!\n");

    pid = vfork();
    if (!pid) {
        /* child: runs in the parent's address space while the parent is suspended */
        i = 3;
        while (i-- > 0) {
            printf("child process is running\n");
            globvar++;
            var++;
            sleep(1);
        }
        printf("Child's globvar = %d, var = %d\n", globvar, var);
        /* A vfork() child must not return from main(); leave with _exit(). */
        _exit(0);
    }
    else if (pid > 0) {
        /* parent: resumes only after the child has exited */
        i = 5;
        while (i-- > 0) {
            printf("parent process is running\n");
            globvar++;
            var++;
            sleep(1);
        }
        printf("Parent's globvar = %d, var = %d\n", globvar, var);
        exit(0);
    }
    else {
        perror("process creation failed");
        exit(EXIT_FAILURE);
    }
}
The author's run produced the following result:
think@ubuntu:~/work/process_thread/fork3$ ./vfork
fork is different with vfork!
child process is running
child process is running
child process is running
Child's globvar = 8, var = 4
parent process is running
parent process is running
parent process is running
parent process is running
parent process is running
Parent's globvar = 13, var = 5
Tiger-John note:
Comparing the runs of the two programs above shows some differences:
1. When fork() creates a child process, the child inherits copies of the parent's global and local variables. In the child, the global variable globvar and the local variable var each increase by 3, to 8 and 4 respectively. Whether a variable is global or local, changes made by the child and by the parent do not affect each other.
2. In the parent, both variables increase by 5, giving final values of 10 and 6.
The output of the program shows that a child created by fork() has its own independent address space.
3. The order in which the child and the parent run is not fixed, so the output of the two processes is interleaved.
--------------------------------------------
1. After the child is created with vfork(), globvar in the parent ends at 13: the child's three increments (5 to 8) are visible to the parent, which then adds 5 more, for a total increase of 8. This is because a vfork() child shares the parent's address space, so the child's changes to variables are visible to the parent. The local variable var ends at 5 in this run rather than the 9 that sharing alone would predict. POSIX leaves the behavior undefined when a vfork() child modifies data or returns from the function that called vfork(), so the value of a local variable on the shared stack cannot be relied on.
2. With vfork(), the child's output is printed before the parent's. vfork() guarantees that the child runs first: the parent stays blocked until the child calls exec or exit.
3> Now let's summarize the differences between fork() and vfork():
(1) fork(): the child receives its own complete copy of the parent's resources, so the child is independent of the parent and the two can run concurrently.
vfork(): the operating system does not copy the parent's address space for the child. Instead the child shares it, that is, the child runs entirely inside the parent's address space, and any change the child makes to data in that address space is seen by the parent.
(2) fork(): the execution order of parent and child is not fixed.
vfork(): the child is guaranteed to run first, and it shares data with the parent until it calls exec or exit. Only then can the parent be scheduled to run.
(3) Because the parent cannot run until the child calls exec or exit, a deadlock occurs if the child depends on some further action by the parent before it makes either call.