Start with an interview question: fork

Source: Internet
Author: User

Post from: http://blog.csdn.net/yuwenliang/archive/2010/01/18/5209239.aspx

 

The following C program is provided and compiled using GCC in Linux:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid1;
    pid_t pid2;

    pid1 = fork();   /* first fork */
    pid2 = fork();   /* second fork, executed by every process that exists at this point */

    printf("pid1: %d, pid2: %d\n", pid1, pid2);
    return 0;
}

 

The requirements are as follows:

Assume that no other new process is started on the system between the start of this program and the termination of all of its processes.

1. How many processes run in total once the program is executed?

2. If the output of one process is "pid1: 1001, pid2: 1002", write the output of the other processes (ignoring the order in which the processes run).

Obviously, the purpose of this question is to test understanding of the fork execution mechanism in Linux. Next we will analyze the question and discuss how fork works in Linux.

 

Prerequisites

Here we will first list some necessary preparations. If you are familiar with the process mechanism in Linux, you can skip this step.

1. A process can be seen as one execution of a program. In Linux, each process is identified by a unique PID. A PID is a positive integer; by default the upper limit is 32768 (configurable through /proc/sys/kernel/pid_max). PID 1 is usually the special process init, and other processes are numbered from 2 upward. Once the limit is reached, the kernel wraps around and reuses free PIDs from the low end of the range.

2. In Linux, a structure called the process table stores the currently running processes. You can run the "ps aux" command to view all of them.

3. Processes in Linux form a tree. init is the root node, and every other process has a parent process. The parent of a process is the process that started it, and the started process is called a child process of that parent.

4. fork creates a copy of the current process. All data of the new process (variables, environment variables, program counter, and so on) have the same values as in the original process, but it is a brand-new process and runs as a child process of the original one.

 

Key to solving problems

With the preparation above, let's look at the key to solving the problem. In my opinion, the key is to realize that fork cuts the program into two sections.

Consider a program containing a fork statement. The fork statement can be considered to split the program into two parts: A (the code before the fork) and B (the code after it). The whole program then runs as follows:

Step 1. Suppose the shell executes the program directly and creates the process P. P executes all the code of part A.

Step 2. When pid = fork(); is executed, P starts a process Q. Q is a child process of P and is a process of the same program as P. Q inherits the current values of all of P's variables, environment variables, and the program counter.

Step 3. In process P, fork() returns the PID of Q, which is assigned to the variable pid, and P continues to execute the code of part B.

Step 4. In process Q, fork() returns 0, which is assigned to pid, and Q continues to execute the code of part B.

There are four key points:

1. P executes the whole program, while Q executes only part B, that is, the code after the fork. (This is because Q inherits P's PC, the program counter.)

2. Q inherits the environment as it is at the moment the fork() statement is executed, not the initial environment of the program.

3. The fork() statement in P starts the process Q and returns Q's PID. The fork() statement in Q does not start a new process; it only returns 0.

4. After fork, whether the parent process or the child process runs first is undetermined; it depends on the scheduling algorithm used by the kernel.

Problem Solving

The following uses the knowledge described above to solve the problem. Here I analyze both questions together.

1. Run the program from the shell, which starts one process. Call this process P0 and let its PID be XXX (the PID does not need to be known to solve the problem).

2. When pid1 = fork(); is executed, P0 starts a child process P1, whose PID is 1001. Leave P1 aside for now.

3. The fork in P0 returns 1001 to pid1, and P0 continues to pid2 = fork();. At this point it starts another new process, P2, whose PID is 1002. Leave P2 aside for now as well.

4. The second fork in P0 returns 1002 to pid2, and P0 continues with the rest of the program. Therefore, P0 outputs "pid1: 1001, pid2: 1002".

5. Now look at P2. When P2 is created, pid1 = 1001 in P0, so pid1 in P2 inherits the value 1001 from P0, and pid2 = 0 because P2 is the child process. P2 starts execution after the second fork and outputs "pid1: 1001, pid2: 0".

6. Then look at P1. In P1, the first fork returns 0 to pid1, and P1 goes on to the next statement, which is pid2 = fork();. Executing it makes P1 create a new process, P3. Leave P3 aside for now.

7. In P1, the second fork returns the PID of P3 to pid2. The PID of P3 is 1003, so pid2 in P1 is 1003. P1 therefore outputs "pid1: 0, pid2: 1003".

8. As a child process of P1, P3 inherits pid1 = 0 from P1, and the second fork returns 0 to pid2. Therefore, P3 outputs "pid1: 0, pid2: 0".

9. At this point, the whole execution is complete.

Answer:

1. A total of four processes run (P0, P1, P2, P3).

2. The output of the other processes is as follows:

pid1: 1001, pid2: 0

pid1: 0, pid2: 1003

pid1: 0, pid2: 0

The process tree with P0 as the root is as follows:

P0
 +-- P1
 |    +-- P3
 +-- P2


Verify

Next we go to Linux and actually run the program to verify the answer.

The program is as follows:

 

The results after compiling it with GCC and running it are as follows:

 

Because we are unlikely to hit the exact situation in which the PID 1001 is allocated, the specific values differ from those in the answer. However, if we take 2710 as the base number, the result is consistent with the answer above. In addition, because it is undetermined whether the parent process or the child process runs first after fork (it depends on the scheduling algorithm used by the kernel), the order of the output lines may vary when the program is run several times.

 

Summary

This is neither a particularly difficult question nor a particularly tricky one. However, because the way fork runs is somewhat complex, the problem becomes quite involved when two forks appear side by side. The key to solving it is to have a reasonable understanding of the process mechanism in Linux and to grasp the key points about fork mentioned above. My friend said that five minutes were given for this question. That should be enough time, but in an interview it is still a real test of a person's grasp of processes and fork, and of their ability to reason on the spot.

I hope this article helps readers gain a clear understanding of the fork execution mechanism.
