Linux signal handling and how to avoid zombie processes

Source: Internet
Author: User

 

What is a zombie process?

 

Here is a brief introduction; the details are easy to find online:

A zombie process is a child process that has exited while its parent is still running and has not called wait on it. Because nobody has collected the child's exit status, the kernel cannot fully release it: its process-table entry (PID and exit status) stays behind. The result is a process like a zombie (a zombie cannot act but still occupies a body; a zombie process cannot run but still occupies a process-table slot).

Such a process is no longer running and does not handle signals. Sending it a signal with kill is useless: it is already dead, so it cannot be killed.
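The zombie state can be observed directly on Linux. The following is a minimal sketch, not part of the original article: `proc_state` and `zombie_demo` are hypothetical helper names, and the code assumes `/proc` is mounted. It uses `waitid` with `WNOWAIT` to block until the child has exited without reaping it, reads the state letter from `/proc/<pid>/stat`, and then reaps the child.

```c
#define _POSIX_C_SOURCE 200809L  /* for waitid() and siginfo_t under strict -std= modes */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the one-letter process state from
 * /proc/<pid>/stat, or '?' if the file cannot be read. */
static char proc_state(pid_t pid) {
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);
    FILE *f = fopen(path, "r");
    if (!f) return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    /* The line looks like "pid (comm) S ..."; the state follows the last ')'. */
    char *p = strrchr(buf, ')');
    return (p && p[1] == ' ' && p[2] != '\0') ? p[2] : '?';
}

/* Hypothetical helper: fork a child that exits at once, then record its
 * state before and after the parent reaps it. Returns 0 on success. */
static int zombie_demo(char *before, char *after) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(0);                 /* child: exit immediately */

    siginfo_t info;
    /* WNOWAIT: wait until the child has exited but leave it unreaped. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0) return -1;
    *before = proc_state(pid);              /* exited, not yet waited for */

    if (waitpid(pid, NULL, 0) != pid) return -1;
    *after = proc_state(pid);               /* entry is gone after reaping */
    return 0;
}
```

Under these assumptions, `before` should be `'Z'` (zombie) and `after` should be `'?'`, because the `/proc` entry disappears once the parent has waited on the child.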

 

 

What does wait do?

 

wait is not just there for show. Here is how the man page describes it:

All of these system calls are used to wait for state changes in a child of the calling process, and obtain information about the child whose state has changed. A state change is considered to be: the child terminated; the child was stopped by a signal; or the child was resumed by a signal. In the case of a terminated child, performing a wait allows the system to release the resources associated with the child; if a wait is not performed, then the terminated child remains in a "zombie" state (see NOTES below).

In other words, wait lets the calling process wait for a state change in one of its children and obtain information about that change. A state change means that the child terminated, was stopped by a signal, or was resumed by a signal. When a child terminates, calling wait lets the system release the resources associated with it. If the parent never calls wait, the terminated child remains a "zombie".

The zombie state lasts until the parent process exits. At that point the zombie is inherited by the init process (PID 1), which is responsible for waiting on it and releasing its resources.
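As a small illustration of the "obtain information" part, the sketch below forks a child that exits with a given code and lets the parent read that code back through `waitpid`. The function name `child_exit_code` is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, wait for it,
 * and return the exit code the parent observed (-1 on any error). */
static int child_exit_code(int code) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(code);              /* child: terminate with the code */

    int status;
    if (waitpid(pid, &status, 0) != pid)    /* reaps the child: no zombie left */
        return -1;
    /* WIFEXITED is true for a normal exit; WEXITSTATUS extracts the code. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, `child_exit_code(42)` should return 42, and the child is fully released by the time the call returns.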

 

Zombie process avoidance

 

Searching online turns up basically three approaches:

  1. Call signal(SIGCHLD, SIG_IGN) to ignore the SIGCHLD signal, so that when a child exits the parent does not need to wait on it to release its resources.
  2. Fork twice. The child created by the first fork exits right after forking again (the parent waits on this short-lived child), so the process created by the second fork no longer has a father (poor thing...). It is automatically adopted by the init process, which takes care of releasing its resources, so no zombie is left behind.
  3. Call wait on the children to release their resources. The parent usually has no time to stand guard and block until a child exits, so this is generally driven by a signal: when SIGCHLD arrives, the handler calls wait to release the child's resources.
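The first approach can be checked with a short sketch, assuming the Linux behaviour documented for wait: when SIGCHLD is set to SIG_IGN, terminated children are released automatically, and `wait` blocks until all children are gone and then fails with ECHILD. The function name is hypothetical, and `sigaction` is used here in place of `signal` only so that the previous disposition can be restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if, with SIGCHLD ignored, wait() finds
 * nothing to reap (ECHILD); 0 if it reaped a child; -1 on setup failure. */
static int ignored_sigchld_leaves_nothing_to_reap(void) {
    struct sigaction ign, old_sa;
    memset(&ign, 0, sizeof ign);
    ign.sa_handler = SIG_IGN;               /* approach 1: ignore SIGCHLD */
    sigemptyset(&ign.sa_mask);
    if (sigaction(SIGCHLD, &ign, &old_sa) < 0) return -1;

    int result = -1;
    pid_t pid = fork();
    if (pid == 0) _exit(7);                 /* child: this status is never seen */
    if (pid > 0) {
        int status;
        /* Blocks until the child is gone, then fails: no status to collect. */
        pid_t r = wait(&status);
        result = (r == -1 && errno == ECHILD) ? 1 : 0;
    }
    sigaction(SIGCHLD, &old_sa, NULL);      /* restore previous disposition */
    return result;
}
```

The call should return 1 on Linux: no zombie is created, but the exit status 7 is lost, which is the trade-off of this approach.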

 

Here are my own brief comments on these three approaches:

  1. Dad does not care whether his son lives or dies. The parent process cannot learn the exit status of the child.
  2. The son kills himself, and Grandpa does not look after the grandson. As with 1, the original parent cannot learn the exit status of the process that does the work.
  3. Dad is a responsible father: when his son dies, he buries him (releases the resources). The parent can learn the child's exit status, but the handling is more complicated than in 1 and 2.
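The double-fork approach can be sketched as follows. This is an illustrative example, not code from the original article: `double_fork_demo` is a made-up name, and the grandchild reports its new parent over a pipe purely so the effect can be observed. The new parent is normally init (PID 1), but it may be another process if a subreaper or a container is involved.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: double-fork, store the intermediate child's PID in
 * *middle, and return the PID that adopted the grandchild (-1 on error). */
static pid_t double_fork_demo(pid_t *middle) {
    int fds[2];
    if (pipe(fds) < 0) return -1;

    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {                       /* first child */
        pid_t me = getpid();
        pid_t grandchild = fork();
        if (grandchild == 0) {              /* second child does the real work */
            close(fds[0]);
            struct timespec tick = {0, 1000000};    /* 1 ms */
            while (getppid() == me)         /* wait until the first child is gone */
                nanosleep(&tick, NULL);
            pid_t adopted_by = getppid();
            if (write(fds[1], &adopted_by, sizeof adopted_by) < 0) _exit(1);
            _exit(0);
        }
        _exit(grandchild < 0 ? 1 : 0);      /* first child exits right away */
    }

    close(fds[1]);
    waitpid(child, NULL, 0);                /* reap the short-lived first child */
    pid_t adopted_by = -1;
    if (read(fds[0], &adopted_by, sizeof adopted_by) != (ssize_t)sizeof adopted_by)
        adopted_by = -1;
    close(fds[0]);
    *middle = child;
    return adopted_by;
}
```

The returned PID should differ from `*middle`: the grandchild is no longer a descendant that the original parent has to wait on, so its exit cannot leave a zombie behind in the original parent.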

 

I personally recommend the third approach, but it raises the following problem.

Section 10.8 of "Advanced Programming in the UNIX Environment" contains this passage:

What happens if a blocked signal is generated more than once before the process unblocks the signal? POSIX.1 allows the system to deliver the signal either once or more than once. If the system delivers the signal more than once, we say that the signals are queued. Most UNIX systems, however, do not queue signals unless they support the real-time extensions to POSIX.1. Instead, the UNIX kernel simply delivers the signal once.

This passage means: if a blocked signal is generated several times before the process unblocks it, most Unix systems do not queue it, and the signal is delivered only once.

 

This is a feature of Linux signal handling as well. While the handler for a signal is running, that signal is blocked. If the same signal is generated several more times in the meantime, the occurrences are merged into a single pending signal instead of being queued. If the handler deals with only one event per delivery, the other events are lost.
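The merging of pending signals can be reproduced without any child processes. The sketch below (the names `count_delivery` and `deliveries_after_block` are made up for illustration) blocks a signal, raises it several times, unblocks it, and counts how often the handler ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t deliveries = 0;

/* Handler: only counts how many times the signal was actually delivered. */
static void count_delivery(int sig) {
    (void)sig;
    deliveries++;
}

/* Hypothetical helper: block `signo`, raise it `times` times, unblock it,
 * and return the number of handler invocations (-1 on setup failure). */
static int deliveries_after_block(int signo, int times) {
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, &old_sa) < 0) return -1;

    sigemptyset(&block);
    sigaddset(&block, signo);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    deliveries = 0;
    for (int i = 0; i < times; i++)
        raise(signo);                       /* generated while blocked */

    /* Unblocking delivers whatever is pending before the call returns. */
    sigprocmask(SIG_UNBLOCK, &block, NULL);

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(signo, &old_sa, NULL);
    return deliveries;
}
```

On Linux, `deliveries_after_block(SIGUSR1, 3)` should return 1, because standard signals are not queued, while `deliveries_after_block(SIGRTMIN, 3)` should return 3, because real-time signals are. SIGCHLD is a standard signal, so it behaves like SIGUSR1 here.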

 

I ran into this problem while writing a file server. Each time the server receives a client request, it forks a process to handle it. To stress-test the server, I generated a large number of file transfer requests on the client side, and to test the server's fault tolerance I pressed Ctrl+C after issuing them to interrupt the transfers. Every time I did this, a large number of zombie processes appeared on the server. After a whole day of studying, reading, and repeated testing, I finally found the cause: the SIGCHLD signals for these zombies never reached the parent's handler! No wait was performed for them, so they became zombie processes!

 

The following is a simple example:

    /*
     * main.cpp
     * Created on: Jun 17, 2011
     * Author: Boyce
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <signal.h>
    #include <sys/wait.h>
    #include <errno.h>

    int num_clients = 0;
    int dead_clients = 0;

    /* Reaps exactly ONE child per delivered signal (this is the flaw). */
    void sig_chld_handler(int sig) {
        pid_t pid;
        if (sig == SIGCHLD) {
            pid = wait(NULL);
            /* printf is not async-signal-safe; used here for demonstration only. */
            printf("A child dead, current child number: %d, id: %d\n", ++dead_clients, (int)pid);
        }
    }

    int main(int argc, char **argv) {
        pid_t pid;
        signal(SIGCHLD, sig_chld_handler);
        for (int i = 0; i < 30; i++) {
            if ((pid = fork()) == 0) {
                exit(0);    /* child: exit immediately */
            } else if (pid > 0) {
                printf("A child created, current child number: %d, id: %d\n", ++num_clients, (int)pid);
            }
        }
        sleep(10);
        return 0;
    }

 

This code creates 30 child processes in quick succession. They exit at almost the same time, and each one sends SIGCHLD to the parent as it exits. As explained above, SIGCHLD signals generated while one is already pending or being handled are merged rather than queued. Because the handler waits for only one child per signal, running the program shows that some children are never waited on by the parent and become zombie processes.

In my test run, nine of the 30 processes really did become zombies! They hold on to their process-table entries and refuse to let go, and not even kill can get rid of them.

 

 

We cannot catch every signal. Does that mean we cannot wait for every child that exits? Of course not. The man page for wait says the following:

Return Value

wait(): on success, returns the process ID of the terminated child; on error, -1 is returned.

...

Errors

ECHILD (for wait()) The calling process does not have any unwaited-for children.

 

When wait fails, it returns -1, which is no surprise. If errno is ECHILD on failure, the calling process has no children left to wait for. What does that tell us? That we do not need to care how many signals we receive. A signal tells us only one thing: at least one child has exited. Which children exited, and how many, we find out by calling wait. That leads to the following code.

    /*
     * main.cpp
     * Created on: Jun 17, 2011
     * Author: Boyce
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <signal.h>
    #include <sys/wait.h>
    #include <errno.h>

    int num_clients = 0;
    int dead_clients = 0;

    void sig_chld_handler(int sig) {
        pid_t pid;
        if (sig == SIGCHLD) {
            /* Keep reaping until wait() reports that no children remain.
             * Note: wait() blocks while live children still exist. */
            while (1) {
                pid = wait(NULL);
                if (pid < 0 && errno == ECHILD) {
                    break;
                }
                /* printf is not async-signal-safe; used here for demonstration only. */
                printf("A child dead, current child number: %d, id: %d\n", ++dead_clients, (int)pid);
            }
        }
    }

    int main(int argc, char **argv) {
        pid_t pid;
        signal(SIGCHLD, sig_chld_handler);
        for (int i = 0; i < 30; i++) {
            if ((pid = fork()) == 0) {
                exit(0);    /* child: exit immediately */
            } else if (pid > 0) {
                printf("A child created, current child number: %d, id: %d\n", ++num_clients, (int)pid);
            }
        }
        sleep(10);
        return 0;
    }
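A common variation on the same idea, shown here as a sketch rather than as the author's method, replaces the blocking `wait` with `waitpid(-1, NULL, WNOHANG)`, so the handler returns as soon as no more terminated children are available, even if other children are still running. The names `on_sigchld` and `spawn_and_reap` are hypothetical. `sigsuspend` is used so the parent sleeps until all children have been reaped instead of relying on `sleep(10)`.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped = 0;

/* Reap every child that has already exited, without blocking. */
static void on_sigchld(int sig) {
    (void)sig;
    int saved_errno = errno;                /* waitpid may overwrite errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical helper: fork n children that exit at once and return how
 * many were reaped by the handler (-1 on setup failure). */
static int spawn_and_reap(int n) {
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0) return -1;

    /* Block SIGCHLD so it is only delivered inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);

    reaped = 0;
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) _exit(0);             /* child: exit immediately */
        if (pid > 0) started++;
    }
    while (reaped < started)
        sigsuspend(&wait_mask);             /* sleep until a SIGCHLD arrives */

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return reaped;
}
```

With 30 children, `spawn_and_reap(30)` should return 30 no matter how many SIGCHLD signals were merged, and a following `waitpid(-1, NULL, WNOHANG)` should fail with ECHILD because nothing is left to reap.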

 

Okay, let's take a look at the test result:

 

As you can see, every son is dead and buried, and Dad is happy.

 
