How to Prevent Zombie Processes in Multi-Process Concurrent Server Development

Source: Internet
Author: User

In concurrent server design, a common approach is to call fork to create a child process for each connection, so that each client request is handled separately.

The original flowchart is not available here. In this design, the parent process goes straight back to accept to wait for the next connection and never calls wait or waitpid to wait for the child process to finish. What are the consequences? When a child process exits, it is not actually destroyed. Instead, a data structure is kept that records information such as its exit status, because the parent process may want to obtain this information. A child in this state is a zombie process. If the parent never calls wait or waitpid, this information is retained until the parent process exits. Unfortunately, a concurrent server normally runs for a long time and creates a child process for every connection, so zombies accumulate and cause a serious resource leak over time. Therefore, a concurrent server must take measures to prevent zombie processes.
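To illustrate that the exit status is kept until the parent collects it, here is a minimal sketch in C. The helper name exit_status_survives and the one-second delay are assumptions made for this illustration; they are not part of the original article.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits at once with `code`,
 * leave it un-reaped for a moment, then collect its status.
 * Returns the exit code recovered from the zombie, or -1 on failure. */
int exit_status_survives(int code)
{
    int status = 0;
    pid_t pid;

    signal(SIGCHLD, SIG_DFL);   /* assumption: make sure children stay waitable */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    /* Parent: by the time sleep returns, the child is a zombie. */
    sleep(1);

    /* The exit status is still available because nobody has waited yet. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling exit_status_survives(7) should give back 7 even though the child ended about a second before the parent asked for its status. A server that never makes that waitpid call leaves one such record behind per connection.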

When a child process exits, the kernel sends a SIGCHLD signal to the parent process. Before the parent enters its accept loop, we can use signal to register a handler for SIGCHLD, and have that handler wait for the exited child.

The signal function is declared as follows:

    void (*signal(int signo, void (*func)(int)))(int);

This declaration involves function pointers: both the second parameter and the return value are pointers to functions. It can also be written with a typedef:

    typedef void (*sig_t)(int);
    sig_t signal(int signo, sig_t func);

Note: The two declarations are equivalent.

The first parameter of signal is the signal to handle (SIGCHLD here), and the second parameter is a pointer to the handler function that is called when that signal is delivered.
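As a small illustration of how registration works, the following C sketch installs a handler and triggers it. It uses SIGUSR1 and raise instead of SIGCHLD so that no child process is needed; the names handler_fn, note_signal, and register_and_raise are hypothetical and chosen only for this example.

```c
#include <signal.h>

/* Same shape as the sig_t typedef in the text (name is hypothetical). */
typedef void (*handler_fn)(int);

/* Written by the handler, read by normal code. */
static volatile sig_atomic_t last_signo = 0;

/* The handler receives the number of the signal that was delivered. */
static void note_signal(int signo)
{
    last_signo = signo;
}

/* Register note_signal for SIGUSR1, send the signal to ourselves,
 * and return the signal number the handler saw (or -1 on failure). */
int register_and_raise(void)
{
    /* signal returns the previously installed handler, or SIG_ERR. */
    handler_fn previous = signal(SIGUSR1, note_signal);
    if (previous == SIG_ERR)
        return -1;

    last_signo = 0;
    raise(SIGUSR1);             /* the handler runs before raise returns */

    signal(SIGUSR1, previous);  /* put the old disposition back */
    return last_signo;
}
```

The return value of signal is the function pointer described in the declaration above: it lets the caller restore the earlier handler later, as the sketch does.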

The code is as follows:

    #include <sys/types.h>
    #include <sys/wait.h>

    void sid_child(int signo)   /* handler for the SIGCHLD signal */
    {
        pid_t pid;
        int stat;
        /* reap every child that has already exited, without blocking */
        while ((pid = waitpid(-1, &stat, WNOHANG)) > 0)
            ;
        return;
    }

    /* socket and bind are omitted */
    listen(servfd, 10);
    signal(SIGCHLD, sid_child);   /* register sid_child as the SIGCHLD handler */
    while (1)
    {
        if ((cliefd = accept(servfd, (struct sockaddr *)0, 0)) < 0)
        {
            if (errno == EINTR) continue;   /* interrupted by a signal: retry */
            else err_sys("accept call error");
        }
        /* fork is omitted here; a child process is created to handle the request */
    }

Explanation:

1,

    signal(SIGCHLD, sid_child);

After this call, whenever the parent process receives SIGCHLD because a child has exited, its current execution is interrupted and the handler sid_child is called.

2,

    while ((pid = waitpid(-1, &stat, WNOHANG)) > 0);

In the sid_child function, waitpid is called in a loop to wait for child processes.

Why do we need to call waitpid in a loop, and what does waitpid do?

The waitpid function is declared as follows:

    pid_t waitpid(pid_t pid, int *status, int options);

The waitpid function waits for a child process to terminate. The first parameter is the ID of the child process to wait for; passing -1 means waiting for any child process. The second parameter receives the termination status of the child. The third parameter holds additional options. The WNOHANG option makes the call return immediately instead of blocking when no child process has exited yet.

With the WNOHANG option, waitpid behaves as follows:

2.1 If a child process has exited, the ID of that child process is returned;

2.2 If child processes exist but none has exited yet, 0 is returned;

2.3 If the call fails, -1 is returned (for example, when the caller has no child processes left, in which case errno is ECHILD);
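The three cases above can be observed with a short C sketch. The helper wnohang_cases, the pipe used to hold the child back, and the one-second pause are assumptions made for this illustration only.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for illustration.
 * results[0]: waitpid return value while the child is still running
 * results[1]: 1 if waitpid returned the child's pid after it exited
 * results[2]: waitpid return value once no children are left
 * Returns 0 on success, -1 if something unexpected happened. */
int wnohang_cases(int results[3])
{
    int gate[2];
    int stat;
    pid_t pid;

    signal(SIGCHLD, SIG_DFL);   /* assumption: make sure children stay waitable */
    if (pipe(gate) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char c;
        close(gate[1]);
        /* Child: block until the parent closes its write end, then exit. */
        while (read(gate[0], &c, 1) > 0)
            ;
        _exit(0);
    }
    close(gate[0]);

    /* Case 2.2: the child exists but has not exited yet. */
    results[0] = (int)waitpid(-1, &stat, WNOHANG);

    close(gate[1]);             /* releases the child, which then exits */
    sleep(1);                   /* crude: give it time to terminate */

    /* Case 2.1: an exited child is available, so its pid is returned. */
    results[1] = (waitpid(-1, &stat, WNOHANG) == pid) ? 1 : 0;

    /* Case 2.3: nothing left to wait for; -1 with errno == ECHILD. */
    results[2] = (int)waitpid(-1, &stat, WNOHANG);
    if (results[2] == -1 && errno != ECHILD)
        return -1;
    return 0;
}
```

Because 0 and -1 both mean "nothing more to collect right now", a loop that continues only while the return value is greater than 0 stops in either case.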

Standard signals are not queued: if several child processes exit before the handler gets to run, the parent process may receive only one SIGCHLD for all of them. This is why the handler uses while ((pid = waitpid(-1, &stat, WNOHANG)) > 0); each time SIGCHLD is handled, it keeps reaping exited children until waitpid returns 0 or -1, which means no exited child remains. Control then returns to the main loop, which goes back to waiting for connections in accept. The next time a child process exits, the parent process receives SIGCHLD again.
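The effect of the loop can be sketched in C as follows. Here reap_exited_children repeats the loop from sid_child but also counts what it collects, and spawn_then_reap is a hypothetical driver; both names and the one-second pause are assumptions for illustration.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same loop as in sid_child, but returns how many children were reaped.
 * errno is saved and restored so the caller's value is not disturbed. */
int reap_exited_children(void)
{
    int saved_errno = errno;
    int reaped = 0;
    int stat;
    pid_t pid;

    while ((pid = waitpid(-1, &stat, WNOHANG)) > 0)
        reaped++;

    errno = saved_errno;
    return reaped;
}

/* Hypothetical driver: start n children that exit immediately, let them
 * all become zombies, then collect them in a single pass.
 * Returns the number reaped, or -1 if fork failed. */
int spawn_then_reap(int n)
{
    int i;

    signal(SIGCHLD, SIG_DFL);   /* assumption: make sure children stay waitable */

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(0);           /* child: exit at once */
    }

    sleep(1);                   /* crude: give every child time to exit */
    return reap_exited_children();
}
```

A single pass collects all of the exited children, which is what a handler needs when one SIGCHLD stands for several exits. A single waitpid call per signal would leave the rest behind as zombies.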

3,

    if ((cliefd = accept(servfd, (struct sockaddr *)0, 0)) < 0)
    {
        if (errno == EINTR) continue;
        else err_sys("accept call error");
    }

accept, write, and read are all slow system calls, meaning that they can block indefinitely. If the server is blocked in accept when SIGCHLD arrives, the parent process is interrupted and runs the registered handler sid_child. When the handler returns, the slow system call may fail with the EINTR error, which only indicates that the call was interrupted while it was blocked. This is not a real failure, so we use continue to go around the loop and call accept again.
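The retry-on-EINTR pattern can be tried without a network, as in the C sketch below. It is an illustration under several assumptions: it blocks in read on a pipe rather than in accept, and it installs the handler with sigaction and sa_flags set to 0, because on some systems signal installs handlers in a way that restarts interrupted calls automatically, in which case EINTR would not be seen. The names on_sigchld and count_eintr_retries are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* SIGCHLD handler in the style of sid_child. errno is preserved so that
 * waitpid does not overwrite the EINTR of the interrupted call. */
static void on_sigchld(int signo)
{
    int saved_errno = errno;
    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
    errno = saved_errno;
}

/* Hypothetical demo: returns how many times a blocking read failed with
 * EINTR before it finally completed, or -1 if setup failed. */
int count_eintr_retries(void)
{
    struct sigaction sa;
    int fds[2];
    char byte;
    int retries = 0;
    pid_t quitter, writer;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: blocked calls get EINTR */
    if (sigaction(SIGCHLD, &sa, NULL) < 0 || pipe(fds) < 0)
        return -1;

    quitter = fork();           /* exits after 1 s, which raises SIGCHLD */
    if (quitter < 0)
        return -1;
    if (quitter == 0) {
        sleep(1);
        _exit(0);
    }

    writer = fork();            /* delivers one byte after 2 s */
    if (writer < 0)
        return -1;
    if (writer == 0) {
        sleep(2);
        if (write(fds[1], "x", 1) != 1)
            _exit(1);
        _exit(0);
    }
    close(fds[1]);              /* the parent only reads */

    for (;;) {
        ssize_t n = read(fds[0], &byte, 1);
        if (n < 0 && errno == EINTR) {
            retries++;          /* interrupted by SIGCHLD: try again */
            continue;
        }
        break;                  /* data, end of file, or a real error */
    }
    close(fds[0]);
    return retries;
}
```

The read is interrupted when the first child exits, is retried, and then completes when the second child writes its byte. The accept loop in the article follows the same shape, with continue playing the role of the retry.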

This prevents zombie processes.
