A deep understanding of the Linux fork function


I. Introducing the problem
At work, a system designer posed the following question: how many "-" characters does the following code print?

/******************************************************************************
 Copyright by Thomas Hu, all rights reserved!
 Filename    : fork01.c
 Author      : Thomas Hu
 Date        : 2012-8-5
 Version     : 1.0
 Description : fork function problem prototype
******************************************************************************/
#include <unistd.h>
#include <stdio.h>

int main()
{
    int i = 0;
    for (i = 0; i < 2; i++)
    {
        fork();
        printf("-");
    }

    return 0;
}

After quite a long time, nobody had answered the question (perhaps everyone was busy and had no time for him ^_^).

If your answer is 2, you should first read the usage instructions for the fork function on Linux.
If your answer is 6, you have some understanding of fork, but you still need to read this article.
If your answer is 8 and you understand the reasoning behind it (not just the result of running the program), you do not need to read this article; feel free to skip it ^_^.

I did a rough analysis, then typed in the code, compiled it, and ran it. The result was 8, which I found incredible! (In theory it should be 6. I puzzled over the 8 for a long time without finding an explanation, and only after consulting references did I realize that I had not understood the essence of fork. Hence this article, so that we can discuss it together.)

To understand how fork executes, you must first understand the concept of a process in an operating system. A process consists mainly of three elements:
1. An executable program;
2. All the data associated with the process (variables, memory space, buffers, and so on);
3. The execution context of the program.
Put simply, a process represents one state of an executable program in the course of its execution. The operating system typically manages processes through a process table, each entry of which records the state of one process currently in the system. On a single CPU, only one process occupies the CPU at any given moment, but the system may contain many active processes (waiting to run or to resume running).
A register called the program counter (PC) indicates the position of the next instruction to be executed by the process that currently occupies the CPU.
When the CPU time allocated to a process runs out, the operating system saves the values of the registers associated with that process into its entry in the process table. It then reads the context of the process that is about to take over the CPU from the process table and updates the registers accordingly. This procedure is called a process context switch. A real context switch involves more data, but that is unrelated to fork and is not discussed further here. The main point to remember is that the program counter records where the program is currently executing and is an important part of the process context: a process being switched off the CPU must save the value of this register, and a process being switched onto the CPU must restore it from the execution context saved in the process table.

II. The fork function in detail

Header files:
#include <unistd.h>
#include <sys/types.h>
Function definition:
pid_t fork(void);
(pid_t is a type defined in <sys/types.h>, not a macro; it is a signed integer type, and on Linux it is essentially an int.)
Return value:
On success, fork returns twice: it returns 0 in the child process and the child's process ID in the parent process. On failure, it returns -1.
There are two common reasons for fork to fail: (1) the number of processes has reached the limit set by the system, in which case errno is set to EAGAIN; (2) the system is short of memory, in which case errno is set to ENOMEM.
Function description:
An existing process can call fork to create a new process. The new process created by fork is called the child process. The fork function is called once but returns twice. The only difference between the two returns is that fork returns 0 in the child and the child's process ID in the parent. The child's process ID is returned to the parent because a process can have more than one child, and there is no function that lets a process obtain the process IDs of all its children. fork returns 0 to the child because the child can always call getpid() to obtain its own PID, or getppid() to obtain the ID of its parent. (Process ID 0 is always used by the swapper process, so the process ID of a child can never be 0.)
The child process is a copy of the parent: it obtains a copy of the parent's data space, heap, stack, and so on. Note that the child holds a "copy" of these storage areas, which means they are not shared between parent and child.
Linux copies the contents of the parent's address space to the child, so the child has its own separate address space. You can picture two processes that have been running together in lockstep all along; after the fork, each does different work, that is, they diverge. This is why the function is called fork.
The Linux manual page describes the fork function in great detail, as follows:

DESCRIPTION
fork() creates a new process by duplicating the calling process. The new process, referred to as the child, is an exact duplicate of the calling process, referred to as the parent, except for the following points:

* The child has its own unique process ID, and this PID does not match the ID of any existing process group (setpgid(2)).

* The child's parent process ID is the same as the parent's process ID.

* The child does not inherit its parent's memory locks (mlock(2), mlockall(2)).

* Process resource utilizations (getrusage(2)) and CPU time counters (times(2)) are reset to zero in the child.

* The child's set of pending signals is initially empty (sigpending(2)).

* The child does not inherit semaphore adjustments from its parent (semop(2)).

* The child does not inherit record locks from its parent (fcntl(2)).

* The child does not inherit timers from its parent (setitimer(2), alarm(2), timer_create(2)).

* The child does not inherit outstanding asynchronous I/O operations from its parent (aio_read(3), aio_write(3)), nor does it inherit any asynchronous I/O contexts from its parent (see io_setup(2)).

The process attributes in the preceding list are all specified in POSIX.1-2001. The parent and child also differ with respect to the following Linux-specific process attributes:

* The child does not inherit directory change notifications (dnotify) from its parent (see the description of F_NOTIFY in fcntl(2)).

* The prctl(2) PR_SET_PDEATHSIG setting is reset so that the child does not receive a signal when its parent terminates.

* Memory mappings that have been marked with the madvise(2) MADV_DONTFORK flag are not inherited across a fork().

* The termination signal of the child is always SIGCHLD (see clone(2)).
Note the following further points:

* The child process is created with a single thread, the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.

* The child inherits copies of the parent's set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two descriptors share open file status flags, current file offset, and signal-driven I/O attributes (see the description of F_SETOWN and F_SETSIG in fcntl(2)).

* The child inherits copies of the parent's set of open message queue descriptors (see mq_overview(7)). Each descriptor in the child refers to the same open message queue description as the corresponding descriptor in the parent. This means that the two descriptors share the same flags (mq_flags).

* The child inherits copies of the parent's set of open directory streams (see opendir(3)). POSIX.1-2001 says that the corresponding directory streams in the parent and child may share the directory stream positioning; on Linux/glibc they do not.

RETURN VALUE
On success, the PID of the child process is returned in the parent, and 0 is returned in the child. On failure, -1 is returned in the parent, no child process is created, and errno is set appropriately.

ERRORS
EAGAIN fork() cannot allocate sufficient memory to copy the parent's page tables and allocate a task structure for the child.

EAGAIN It was not possible to create a new process because the caller's RLIMIT_NPROC resource limit was encountered. To exceed this limit, the process must have either the CAP_SYS_ADMIN or the CAP_SYS_RESOURCE capability.

ENOMEM fork() failed to allocate the necessary kernel structures because memory is tight.

CONFORMING TO
SVr4, 4.3BSD, POSIX.1-2001.

NOTES
Under Linux, fork() is implemented using copy-on-write pages, so the only penalty that it incurs is the time and memory required to duplicate the parent's page tables, and to create a unique task structure for the child.

Since version 2.3.3, rather than invoking the kernel's fork() system call, the glibc fork() wrapper that is provided as part of the NPTL threading implementation invokes clone(2) with flags that provide the same effect as the traditional system call. The glibc wrapper invokes any fork handlers that have been established using pthread_atfork(3).

I trust everyone can follow the English text above ^_^; if not, programming may really not be the right field for you. Next time I will provide a translation (if there is demand ^_^).

III. Analyzing the problem
All of the preceding material serves to explain the "weird" output of 8 "-" characters. The fork function makes the child replicate the parent's entire virtual address space (including the states of mutexes, condition variables, and other pthreads objects) and inherit copies of the parent's sets of open file descriptors, open message queue descriptors, and open directory streams. Memory locks, CPU time counters, pending signals, semaphore adjustments, record locks, timers, and so on are not inherited from the parent.
The following analyzes the source code step by step, starting from the for loop.
1. When i = 0, the fork call in the loop body is executed and the parent process (call it P) creates a child process (call it A). Process A has the same variable values as its parent, so i is also 0 in A. Processes P and A then each execute the printf statement. Note that there are now two processes in the system; they are analyzed separately below.
2. In process P, i is incremented to 1, which still satisfies the loop condition, so the loop body is entered again. The fork call creates another child process, B; in both P and B, i = 1. Processes P and B then each execute the printf statement.
3. In process A, i is incremented to 1, which still satisfies the loop condition, so the loop body is entered again. The fork call creates a child of process A (call it AA); in both A and AA, i = 1. Processes A and AA then each execute the printf statement.
4. In processes P, A, AA, and B, i is incremented again to 2, which no longer satisfies the loop condition. All 4 processes leave the loop, execute the return statement after it, and terminate.
The analysis above is illustrated in the figure (the same color denotes the same process):


A careful reader may exclaim: with 4 processes, weren't only 6 printf statements executed in total? How can 8 "-" be printed? Indeed, printf was executed only 6 times, no doubt about it!
This is because the printf("-"); statement is up to mischief! Devices under Linux/UNIX are divided into "block devices" and "character devices": a block device is accessed one block of data at a time, while a character device is accessed one character at a time. Disks are block devices; keyboards and serial ports are character devices. Block devices generally have caches and character devices generally do not. The buffer that matters here, however, is the standard I/O buffer that the C library keeps in the process's own memory: output written with printf is collected there first and is handed to the kernel only when the buffer is flushed.
So in the program above, printf("-"); only places "-" in the buffer without actually outputting it. At fork time the buffer is copied into the child's address space together with everything else. In the second iteration, B and AA are each created with a buffer that already holds one "-" from their parent, so two extra "-" appear and the total becomes 8 instead of 6.
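If we change the printf statement above to:
printf("-\n");
or
printf("-");
fflush(stdout);

then there is no problem and the program outputs only 6 "-". Data is flushed out of the buffer when the program outputs "\n" (provided standard output is line buffered, as it is when connected to a terminal), at EOF, when the buffer is full, when the file is closed, or on an explicit flush. Note that when standard output is redirected to a file or a pipe it is fully buffered, so "\n" alone no longer flushes it and only the explicit fflush(stdout) is reliable.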

The complete code is as follows:

/******************************************************************************
 Copyright by Thomas Hu, all rights reserved!
 Filename    : fork02.c
 Author      : Thomas Hu
 Date        : 2012-8-5
 Version     : 1.0
 Description : fork function problem; prints the process IDs so that
               pstree -p can be used to view the process tree
******************************************************************************/
#include <unistd.h>
#include <stdio.h>

int main()
{
    int i = 0;
    for (i = 0; i < 2; i++)
    {
        fork();
        /* Note: the following printf ends with "\n" */
        printf("ppid=%d, pid=%d, i=%d \n", getppid(), getpid(), i);
    }

    sleep(10); /* Keep the processes alive for 10 seconds so that pstree -p can show the process tree */

    return 0;
}

The output of running the program is as follows:


The process tree looks like this:


As the figure shows, the boxes with a shadow and a double border mark the two child processes that copied the contents of the parent's standard output buffer, which is what produced the extra output.


Note: the two process tree pictures above are excerpted from http://coolshell.cn/articles/7965.html. The copyright belongs to the original author, to whom I express my thanks!

IV. Summary
In computer programming there is no such thing as a truly "strange" event: every effect has a cause! When something "strange" happens, it means that in some hidden corner there is something we did not think of or did not understand in essence, and that is what makes the phenomenon seem "incredible".
Only by looking through the phenomenon to its essence can we solve such "strange" problems. In the end we find that the "strange" phenomenon was perfectly natural all along, and that it was our own ignorance that created the "supernatural" event ^_^.
