Unix programming learning notes (19): a closer look at the fork function in process management


Lien000034
2014-10-07

In the process control trilogy, we learned that fork, the first part of the trilogy, creates a new process. We have not yet looked at fork in depth, though. How is the new process related to the process that called fork? What data do the parent and child processes share? Can fork be called without limit, and if not, what is the limit? We will also look at vfork, a variant of fork.

1. Relationship between the new process created by fork and the calling process

The processes in a UNIX operating system form a tree. Except for the processes with process ID 0 (the swapper process) and 1 (the init process), every process has a parent process.

By default, the parent of a process created by fork is the calling process. The order in which the parent and child run after fork, and which of them finishes first, is not specified. If the child terminates before the parent, there is no problem. But if the parent terminates before the child, is the child left without a parent, breaking the process tree? The UNIX system handles this case: when a process terminates, every child of that process that is still running gets the init process as its new parent (the init process never terminates). The procedure is roughly as follows: when a process terminates, the kernel checks all active processes one by one (because Unix does not provide an interface for obtaining all the children of a process), and if one of them is a child of the terminating process, its parent is set to the init process.
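The reparenting described above can be observed with a small experiment. The following is only a sketch: adopted_parent_pid is a hypothetical helper name, not something from APUE. It creates a grandchild, lets the grandchild's parent exit first, and reports the parent PID that the grandchild sees afterwards. On a classic UNIX system that value is 1; on some modern systems another designated process adopts orphans, so the code only assumes that the parent PID changes.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original notes): returns the parent
 * PID seen by a grandchild after its own parent has exited, or -1 on error.
 */
pid_t
adopted_parent_pid(void)
{
    int   fds[2];
    pid_t mid, new_ppid = -1;

    if (pipe(fds) < 0)
        return -1;

    if ((mid = fork()) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    } else if (mid == 0) {
        /* Intermediate process: remember our PID, fork, then exit at once. */
        pid_t self = getpid();
        pid_t gc = fork();

        if (gc == 0) {
            /* Grandchild: poll until the kernel has assigned a new parent. */
            struct timespec ts = { 0, 1000000 };    /* 1 ms */

            while (getppid() == self)
                nanosleep(&ts, NULL);
            new_ppid = getppid();
            if (write(fds[1], &new_ppid, sizeof(new_ppid)) < 0)
                _exit(1);
            _exit(0);
        }
        _exit(gc < 0 ? 1 : 0);
    }

    /* Original process: reap the intermediate, then read the report. */
    close(fds[1]);
    waitpid(mid, NULL, 0);
    if (read(fds[0], &new_ppid, sizeof(new_ppid)) != (ssize_t)sizeof(new_ppid))
        new_ppid = -1;
    close(fds[0]);
    return new_ppid;
}
```

Calling adopted_parent_pid() from main and printing the result shows which process adopted the orphan on the system at hand.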

2. Data Sharing of parent and child Processes

The child process created by fork gets copies of the parent's data space, heap, and stack. In most cases, however, the child calls exec right after fork to run a new program, which replaces everything that was just copied from the parent. The kernel would therefore do a lot of useless work.

For this reason, many implementations now use copy-on-write (COW). After fork, the parent and child share these regions, and the kernel changes their protection to read-only. If either process tries to modify one of these regions, the kernel makes a copy of only the piece being modified (typically a page) for that process.

Here is an example showing how data is handled across fork:

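#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <sys/wait.h>   /* declares wait() */

int glob = 0;           /* global variable in the data segment */

int
main(void)
{
    int var;            /* automatic variable on the stack */
    pid_t pid;

    var = 0;
    if ((pid = fork()) < 0) {
        printf("fork error: %s\n", strerror(errno));
        exit(-1);
    } else if (pid == 0) {
        /* child: modify its own copies of the variables */
        var++;
        glob++;
        printf("child: glob=%d, var=%d\n", glob, var);
        exit(0);
    }
    /* parent: wait for the child, then print its own copies */
    wait(NULL);
    printf("parent: glob=%d, var=%d\n", glob, var);
    exit(0);
}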

After fork, the parent in this program waits for the child to end, and the child increments the integer variables glob and var by 1. Compile the program into forkdemo and run it. The output below shows that the child's changes to glob and var have no effect on the parent.

lienhua34:demo$ gcc -o forkdemo forkdemo.c
lienhua34:demo$ ./forkdemo
child: glob=1, var=1
parent: glob=0, var=0

Because the child works on its own copy of the parent's data, its modifications do not affect the parent. File I/O is a special case, however. fork duplicates all of the parent's open file descriptors into the child, and each pair of corresponding descriptors in the parent and child shares one file table entry (for the relationship between file descriptors and file table entries, see "kernel I/O data structures"). Here is an example:

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <sys/wait.h>   /* declares wait() */

int
main(void)
{
    pid_t pid;

    printf("before fork\n");    /* may still sit in the stdio buffer at fork */
    if ((pid = fork()) < 0) {
        printf("fork error: %s\n", strerror(errno));
        exit(-1);
    } else if (pid == 0) {
        printf("in child process\n");
        exit(0);
    }
    wait(NULL);
    printf("in parent process\n");
    exit(0);
}

Compile the program into forkdemo and run it:

lienhua34:demo$ gcc -o forkdemo forkdemo.c
lienhua34:demo$ ./forkdemo
before fork
in child process
in parent process
lienhua34:demo$ ./forkdemo > foo
lienhua34:demo$ cat foo
before fork
in child process
before fork
in parent process

Nothing looks unusual when forkdemo runs on the terminal. When standard output is redirected to a file (./forkdemo > foo), the string printed by the parent appears after the string printed by the child instead of overwriting it. This is because the standard output of the parent and child shares one file table entry and therefore one file offset: the child's write advances the offset at which the parent then writes.
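The shared file offset can also be checked directly, without going through standard output. The sketch below uses a hypothetical helper, offset_after_child_write, and assumes /tmp is a writable scratch directory. The child writes 5 bytes through the inherited descriptor, and the parent, which never writes anything itself, then asks for its own current offset.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the parent's file offset after the child
 * has written 5 bytes through the inherited descriptor, or -1 on error.
 */
off_t
offset_after_child_write(void)
{
    char  path[] = "/tmp/forkoffsetXXXXXX";   /* assumed scratch location */
    int   fd;
    pid_t pid;
    off_t off;

    if ((fd = mkstemp(path)) < 0)
        return -1;
    unlink(path);               /* the file lives on until fd is closed */

    if ((pid = fork()) < 0) {
        close(fd);
        return -1;
    } else if (pid == 0) {
        /* Child: advance the offset stored in the shared file table entry. */
        _exit(write(fd, "hello", 5) == 5 ? 0 : 1);
    }

    waitpid(pid, NULL, 0);
    off = lseek(fd, 0, SEEK_CUR);   /* parent never wrote, yet the offset moved */
    close(fd);
    return off;
}
```

If each process had its own offset, the parent would still be at 0; with a shared file table entry it sees 5.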

We also notice that the string "before fork" is printed once when standard output is not redirected, but twice when it is redirected to a file. The reason is buffering in the standard I/O library. When standard output is connected to a terminal, printf is line buffered, so the newline flushes "before fork" out of the buffer before fork is called. When standard output is redirected to a file, printf is fully buffered. The string "before fork" is written to the buffer before fork and is still there when fork is called, so the child gets a copy of it. When the parent and the child each call exit, each flushes its own copy of the buffer to the file, so "before fork" appears twice.
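The usual way to avoid the duplicated line is to flush the stream before calling fork. The sketch below reproduces the effect on a temporary file instead of standard output, so that both cases can be compared in one run. copies_of_line is a hypothetical helper, and /tmp is assumed to be a writable scratch directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes one line to a fully buffered stream, forks,
 * and returns how many lines end up in the file (-1 on error).
 */
int
copies_of_line(int flush_before_fork)
{
    char  path[] = "/tmp/forkbufXXXXXX";    /* assumed scratch location */
    int   fd, c, lines = 0;
    FILE *fp;
    pid_t pid;

    if ((fd = mkstemp(path)) < 0)
        return -1;
    if ((fp = fdopen(fd, "w")) == NULL) {
        close(fd);
        unlink(path);
        return -1;
    }
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);  /* fully buffered, like a redirected stdout */

    fprintf(fp, "before fork\n");       /* stays in the user-space buffer */
    if (flush_before_fork)
        fflush(fp);                     /* buffer is empty when fork copies it */

    if ((pid = fork()) < 0) {
        fclose(fp);
        unlink(path);
        return -1;
    } else if (pid == 0) {
        exit(0);                        /* exit() flushes the child's copy */
    }
    waitpid(pid, NULL, 0);
    fclose(fp);                         /* flushes the parent's copy */

    /* Count the lines that actually reached the file. */
    if ((fp = fopen(path, "r")) == NULL) {
        unlink(path);
        return -1;
    }
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            lines++;
    fclose(fp);
    unlink(path);
    return lines;
}
```

copies_of_line(0) yields two lines and copies_of_line(1) yields one, which matches the behavior seen with the redirected forkdemo.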

3. Typical fork application scenarios

Fork has two typical application scenarios:

• Create a new process to execute a new program. That is, after fork returns, the child process immediately calls an exec function to run a new program, as in example 2 of the process control trilogy.

• The parent process duplicates itself so that parent and child can execute different code sections at the same time. This is common in network servers: the parent waits for a service request from a client. When a request arrives, the parent calls fork and lets the child handle the request, while the parent goes back to waiting for the next request. The code framework is as follows:

void
serve(int sockfd)
{
    int clfd;
    pid_t pid;

    for (;;) {
        clfd = accept(sockfd, NULL, NULL);
        if (clfd < 0) {
            /* print error message */
            continue;
        }
        if ((pid = fork()) < 0) {
            /* fork error */
            continue;
        } else if (pid == 0) {
            /* deal with clfd in child process */
            close(clfd);
            exit(0);
        } else {
            /* in parent process: close the accepted socket "clfd",
               then continue to listen for the next connection */
            close(clfd);
        }
    }
}
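For the first scenario, creating a process to run a new program, a minimal sketch could look like this. run_and_wait is a hypothetical helper name. It forks, calls execv in the child, and returns the program's exit status to the caller; 127 is used here, by convention, when execv itself fails.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs the program at `path` with arguments `argv`
 * in a child process and returns its exit status, or -1 on error or
 * abnormal termination.
 */
int
run_and_wait(const char *path, char *const argv[])
{
    pid_t pid;
    int   status;

    if ((pid = fork()) < 0) {
        return -1;
    } else if (pid == 0) {
        /* Child: replace the copied process image with the new program. */
        execv(path, argv);
        _exit(127);             /* reached only if execv failed */
    }

    /* Parent: wait for this particular child and decode its status. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with char *args[] = { "sh", "-c", "exit 7", NULL }, the call run_and_wait("/bin/sh", args) returns 7.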

4. How many processes can fork create?

There is a limit on how many processes each real user ID can have at any one time. CHILD_MAX specifies this maximum number of simultaneous processes per real user ID. Let's look at the following example:

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

int
main(void)
{
    pid_t pid;
    int count;

    printf("CHILD_MAX: %ld\n", sysconf(_SC_CHILD_MAX));
    count = 1;                  /* the calling process itself */
    for (;;) {
        if ((pid = fork()) < 0) {
            printf("fork error: %s\n", strerror(errno));
            break;
        } else if (pid == 0) {
            /* child: stay alive briefly so it counts against the limit */
            sleep(3);
            exit(0);
        }
        count++;
    }
    printf("count: %d\n", count);
    exit(0);
}

Compile the program into forkdemo and run it:

lienhua34:demo$ gcc -o forkdemo forkdemo.c
lienhua34:demo$ ./forkdemo
CHILD_MAX: 15969
fork error: Resource temporarily unavailable
count: 15737

The output shows that on my system each real user ID can have at most 15969 processes at any one time. Once the for loop had brought the count to 15737 processes (including the calling process itself), fork failed with "Resource temporarily unavailable" (EAGAIN) because no more processes were available to this user. The count is lower than CHILD_MAX presumably because the same user already had other processes running, and these count against the limit as well.
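The limit can be inspected without forking at all. The sketch below prints the value reported by sysconf next to the RLIMIT_NPROC resource limit. Note the assumptions: RLIMIT_NPROC is not part of POSIX but is available on Linux and the BSDs, and print_process_limits is a hypothetical helper name. On Linux the two values typically agree, but that is an observation rather than a guarantee.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: prints the per-user process limit as seen through
 * sysconf() and through getrlimit(). Returns 0 on success, -1 on error.
 */
int
print_process_limits(void)
{
    struct rlimit rl;
    long child_max = sysconf(_SC_CHILD_MAX);

    /* RLIMIT_NPROC is a Linux/BSD extension, assumed to exist here. */
    if (getrlimit(RLIMIT_NPROC, &rl) < 0)
        return -1;

    printf("CHILD_MAX: %ld\n", child_max);  /* -1 means no definite limit */
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("RLIMIT_NPROC (soft): unlimited\n");
    else
        printf("RLIMIT_NPROC (soft): %llu\n",
               (unsigned long long)rl.rlim_cur);
    return 0;
}
```

Lowering the soft limit in the shell (for example with ulimit -u in bash) before running forkdemo is a gentler way to reach the fork failure than creating thousands of processes.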

5. The fork variant vfork

The vfork function is a variant of fork. It is called the same way as fork and has the same return values, but its semantics differ. Wikipedia describes vfork as follows (see the article "fork (system call)"):

vfork is a variant of fork with the same calling convention and much the same semantics; it originated in the 3BSD version of Unix, the first Unix to support virtual memory. It was standardized by POSIX, which permitted vfork to have exactly the same behavior as fork, but marked obsolescent in the 2004 edition, and has disappeared from subsequent editions.

So vfork was marked obsolescent in the 2004 edition of POSIX and no longer appears in later editions. Since APUE covers it, however, let's look at how vfork differs from fork.

There are two differences between the vfork function and the fork function:

1. fork copies the address space of the parent process to the child process, but vfork does not. Until it calls exec or exit, the child runs in the address space of the parent.

2. fork makes no guarantee about the order in which the parent and child run. vfork guarantees that the child runs first: the parent is blocked until the child calls exit or exec. (Note: this property of vfork can lead to deadlock. If the child depends on some further action by the parent before it calls exit or exec, while the parent is waiting for the child, the two wait for each other forever.)

Let's compare the differences between vfork and fork in data processing,

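#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

int glob = 0;

int
main(void)
{
    int var;
    pid_t pid;

    var = 0;
    if ((pid = vfork()) < 0) {
        printf("vfork error: %s\n", strerror(errno));
        exit(-1);
    } else if (pid == 0) {
        /* child runs in the parent's address space: these changes
           are made to the parent's variables (for demonstration only) */
        var++;
        glob++;
        printf("child: glob=%d, var=%d\n", glob, var);
        _exit(0);   /* _exit, not exit: do not flush or close the parent's stdio streams */
    }
    /* parent resumes only after the child has called _exit */
    printf("parent: glob=%d, var=%d\n", glob, var);
    exit(0);
}

This program is a copy of the earlier fork example on data handling, with fork changed to vfork and the wait(NULL) statement removed. The child also ends with _exit instead of exit: because it shares the parent's address space, calling exit in the child could flush or close the standard I/O streams that the parent still needs. Save it as vforkdemo.c, compile it into vforkdemo, and run it: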

lienhua34:demo$ gcc -o vforkdemo vforkdemo.c
lienhua34:demo$ ./vforkdemo
child: glob=1, var=1
parent: glob=1, var=1

The output shows that the child created by vfork modified the variables glob and var, and that the parent sees these modifications.

vfork probably came about because fork on early systems did not implement copy-on-write, so every fork call did a lot of useless copying (in most cases exec is called right after fork to run a new program) and was inefficient. vfork was created to avoid that cost. Current implementations generally use copy-on-write, and vfork can deadlock when it is used improperly, so there is little reason left for vfork to exist.
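Where vfork is still available, the pattern that stays within its rules is for the child to do nothing except call exec, or _exit if exec fails. The sketch below illustrates this. vfork_and_exec is a hypothetical helper name, and the code assumes a system whose headers still declare vfork (for example Linux with glibc in its default compilation mode).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs `path` with `argv` through vfork + execv and
 * returns the program's exit status, or -1 on error or abnormal termination.
 */
int
vfork_and_exec(const char *path, char *const argv[])
{
    pid_t pid;
    int   status;

    if ((pid = vfork()) < 0)
        return -1;
    if (pid == 0) {
        /* Child: touch no variables, call no stdio, never return. */
        execv(path, argv);
        _exit(127);             /* reached only if execv failed */
    }

    /* Parent resumes here once the child has called execv or _exit. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike vforkdemo above, the child here never modifies data in the shared address space, so the parent's state is the same as it was before the call.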

(Done)
