Why are there so many zombie processes in busybox?

Source: Internet
Author: User

It is common knowledge that busybox systems tend to accumulate zombie processes. Why? To answer that, start with what a zombie process is. A zombie is a process that has exited but that nobody has reaped. Nothing of it remains except the empty shell of its task_struct: everything the task_struct pointed to has been released, but the task_struct itself is still allocated, occupying sizeof(struct task_struct) bytes, and it is still linked into the global task list. It can therefore still be found when the kernel walks all processes, and it still shows up in ps in user space. Why does such a process exist at all? That comes down to how processes are reaped. A process is reaped in one of two ways:
1. The parent process calls the wait system call to wait for the child process;
2. The kernel reaps the child itself because the parent has explicitly ignored the SIGCHLD signal.
In every other case the exited process stays a zombie. How can this be avoided? When a process ends, the kernel normally sends a SIGCHLD signal to its parent, and what happens next depends on how the parent has set up SIGCHLD.
Case 1: the parent has installed its own handler, so SIGCHLD is set to neither SIG_IGN nor SIG_DFL. The parent must then call wait after receiving the signal; if it never does, the child stays a zombie.
Case 2: the parent leaves SIGCHLD at SIG_DFL. The exiting process still signals the parent, but the default action for SIGCHLD is to discard it, so the parent does nothing and the kernel does not reap the child either. Unless the parent calls wait on its own initiative, the terminated child is certain to stay a zombie.
Case 3: the parent explicitly ignores SIGCHLD by setting it to SIG_IGN. The kernel then reaps the child itself, so the child does not become a zombie.
Why is it so complicated? It is a POSIX convention; ask them. We can confirm all of this in the kernel source. When a process exits, do_exit is called:

asmlinkage NORET_TYPE void do_exit(long code)
{
    struct task_struct *tsk = current;

    profile_task_exit(tsk);
    ...
    tsk->flags |= PF_EXITING;
    del_timer_sync(&tsk->real_timer);
    ...
    exit_notify(tsk); // this function explains where zombie processes come from
    schedule();
    BUG();
    /* Avoid "noreturn function does return". */
    for (;;) ; // the exiting process never returns from schedule()
}

static void exit_notify(struct task_struct *tsk)
{
    int state;
    struct task_struct *t;
    struct list_head ptrace_dead, *_p, *_n;

    INIT_LIST_HEAD(&ptrace_dead);
    forget_original_parent(tsk, &ptrace_dead);
    BUG_ON(!list_empty(&tsk->children));
    BUG_ON(!list_empty(&tsk->ptrace_children));
    t = tsk->real_parent;
    ...
    if (tsk->exit_signal != -1 && thread_group_empty(tsk)) {
        int signal = tsk->parent == tsk->real_parent ? tsk->exit_signal : SIGCHLD;
        // tell the parent that this process has exited and, where
        // appropriate, send it the child-exit signal
        do_notify_parent(tsk, signal);
    } else if (tsk->ptrace) {
        // this branch is about tracing and debugging and is not discussed
        // here; see my earlier article "about Linux kernel debugging implementation"
        do_notify_parent(tsk, SIGCHLD);
    }
    ...
    state = TASK_ZOMBIE; // by default an exiting process becomes a zombie
    if (tsk->exit_signal == -1 && tsk->ptrace == 0)
        state = TASK_DEAD; // nobody is going to wait for it, so mark it TASK_DEAD and let the kernel reap it
    tsk->state = state;
    if (state == TASK_DEAD)
        release_task(tsk); // the kernel reaps the TASK_DEAD process
    preempt_disable();
    // Note that release_task does not actually free the memory of the
    // task_struct: do_exit still has to call schedule(), and schedule()
    // needs the task_struct of the exiting process. The memory is freed in
    // finish_task_switch, called from schedule(), which drops the
    // task_struct reference count by one and frees it when it reaches 0.
    tsk->flags |= PF_DEAD;
}

Let's take a look at do_notify_parent:

void do_notify_parent(struct task_struct *tsk, int sig)
{
    struct siginfo info;
    unsigned long flags;
    struct sighand_struct *psig;
    ...
    info.si_signo = sig;
    info.si_errno = 0;
    info.si_pid = tsk->pid;
    info.si_uid = tsk->uid;
    info.si_utime = tsk->utime + tsk->signal->utime;
    info.si_stime = tsk->stime + tsk->signal->stime;
    ...
    psig = tsk->parent->sighand;
    spin_lock_irqsave(&psig->siglock, flags);
    // If the parent has set SIGCHLD to SIG_IGN (or has set SA_NOCLDWAIT),
    // mark the task so that the kernel reaps it; see exit_notify above.
    if (sig == SIGCHLD &&
        (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
         (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
        tsk->exit_signal = -1;
        if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
            sig = 0;
    }
    // Without SIG_IGN the signal is sent to the parent. Under SIG_DFL it is
    // sent as well, but the parent does nothing with it, so without a wait
    // the child is certain to stay a zombie.
    if (sig > 0 && sig <= _NSIG)
        __group_send_sig_info(sig, &info, tsk->parent);
    __wake_up_parent(tsk, tsk->parent);
    spin_unlock_irqrestore(&psig->siglock, flags);
}
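
The SIG_IGN branch of do_notify_parent has a visible effect in user space: when the parent ignores SIGCHLD, there is nothing left to wait for, and waitpid fails with ECHILD. The sketch below compares the two dispositions. The function name child_needs_wait is an assumption made for this example, and the behaviour shown is that of Linux.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits at once while SIGCHLD is set to `disposition`
 * (SIG_DFL or SIG_IGN), then try to wait for it.
 * Returns 1 if waitpid() reaped the child (it had stayed a zombie),
 *         0 if waitpid() failed with ECHILD (the kernel had reaped it),
 *        -1 on any other error. */
int child_needs_wait(void (*disposition)(int))
{
    void (*old)(int) = signal(SIGCHLD, disposition);
    pid_t pid, got;
    int result;

    if (old == SIG_ERR)
        return -1;

    pid = fork();
    if (pid < 0) {
        signal(SIGCHLD, old);
        return -1;
    }
    if (pid == 0)
        _exit(0);                       /* child: exit straight away */

    do {
        got = waitpid(pid, NULL, 0);    /* retry if a signal interrupts us */
    } while (got < 0 && errno == EINTR);

    if (got == pid)
        result = 1;
    else if (errno == ECHILD)
        result = 0;
    else
        result = -1;

    signal(SIGCHLD, old);               /* restore the caller's disposition */
    return result;
}
```

With SIG_DFL the call returns 1, because the child stayed in the process table until the parent waited. With SIG_IGN it returns 0, because the kernel had already released the child.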

do_notify_parent alone is enough to explain where zombie processes come from, but forget_original_parent is interesting as well. It hands the children of the exiting process over to a newly chosen parent. In the normal order of things children outlive their parents and see them off; a zombie process is the reverse tragedy, with the parent left to bury the child. Here it is the parent that is dying, so someone has to adopt its children. Who? Normally another thread in the exiting process's thread group. If there is none, the children go to the global variable child_reaper. During kernel initialization this variable is pointed at process 1, the init process, on the path that starts in rest_init. rest_init is called from start_kernel and creates the kernel thread that becomes process 1; once that thread has finished all the initialization, it execs /sbin/init. The code is clear enough that I will not go through it.
Why does this matter? Because so many processes are handed over to init, init ends up responsible for reaping most zombies. In principle init must therefore deal with its children: either it installs a SIGCHLD handler and calls wait in it, or it sets SIGCHLD to SIG_IGN. If init leaves SIGCHLD at SIG_DFL and never waits, there is trouble: it reaps nothing and zombies pile up. So let's see how busybox implements init. The busybox init starts from the function init_main. Note that it has no main function of its own. That follows from how busybox is built: every command is the same busybox executable, and the arguments it is invoked with decide which one runs. That deserves a study of its own and is not covered here. Here is init_main:

int init_main(int argc, char **argv)
{
    ... // the first step is to parse /etc/inittab and run the initialization scripts; this is essentially the same as System V init, so it is skipped
    while (1) {
        ...
        /* Wait for a child process to exit */
        wpid = wait(NULL); // so can the many zombies on busybox really be blamed on busybox's init?
        while (wpid > 0) {
            /* Find out who died and clean up their corpse */
            for (a = init_action_list; a; a = a->next) {
                if (a->pid == wpid) {
                    /* Set the pid to 0 so that the process gets
                     * restarted by run_actions() */
                    a->pid = 0;
                    message(LOG, "Process '%s' (pid %d) exited.  "
                            "Scheduling it for restart.",
                            a->command, wpid);
                }
            }
            /* See if anyone else is waiting to be reaped */
            // If this is unclear, read the kernel's sys_wait4: the call
            // collects any child in the zombie state. So when the kernel
            // hands a fatherless process to init, busybox has no problem;
            // they are all reaped here.
            wpid = waitpid(-1, NULL, WNOHANG);
        }
    }
}
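
The reaping pattern in this loop, one blocking wait followed by a non-blocking drain, can be tried in isolation. This is only a sketch of the pattern, not busybox code: the function spawn_and_reap and its child count are assumptions made for the example.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork n children that exit at once, then reap them the way the loop
 * above does. Returns the number of children reaped. */
int spawn_and_reap(int n)
{
    int reaped = 0, i;
    pid_t wpid;

    signal(SIGCHLD, SIG_DFL);   /* children must stay until we wait */

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;              /* reap whatever was started so far */
        if (pid == 0)
            _exit(0);
    }

    for (;;) {
        /* Block until some child exits. */
        wpid = wait(NULL);
        if (wpid < 0) {
            if (errno == EINTR)
                continue;
            break;              /* ECHILD: no children are left */
        }
        /* Then collect everyone else who is already dead, without blocking. */
        while (wpid > 0) {
            reaped++;
            wpid = waitpid(-1, NULL, WNOHANG);
        }
    }
    return reaped;
}
```

The point of the inner waitpid(-1, NULL, WNOHANG) is that it accepts any child at all, including ones the caller never forked itself, which is exactly what a process in init's position needs.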

So if init is not to blame, who is? Apart from the kernel and the init process, what else is running on a Linux system? The shell. Once you have a shell, every process started from it is a child of the shell, and if the shell does not wait for them, zombies still appear. So look at the shell, that is, at msh.c. You will not find a wait(-1, ...) call there. There are wait calls, but each one waits for a specific PID, that is, for a child the shell forked itself, never for a process that was handed over to it. So if a process is handed over to msh, do not expect msh to reap it. And being handed to the shell is quite likely; after all, the shell is the parent of a great many processes. (In Linux it is not normal for an orphan to be adopted by its grandfather; the kernel has it adopted by an "uncle", and that is the normal case.)
It follows that most zombies on busybox come from the design of its shell. Not necessarily all of them, but I am confident about most, because that is what I saw when debugging the shell. There may be other culprits; I have been too lazy to hunt for them.
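
The difference between waiting for one known PID and waiting for any child, which the argument about msh turns on, can be sketched as follows. This is an illustration under stated assumptions, not code from msh.c; the function name leftovers_after_targeted_wait is made up.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork two children that exit at once, wait only for the first by PID,
 * then count how many children a catch-all wait still has to collect.
 * Returns that count, or -1 on error. */
int leftovers_after_targeted_wait(void)
{
    pid_t a, b;
    int leftovers = 0;

    signal(SIGCHLD, SIG_DFL);   /* children must stay until we wait */

    a = fork();
    if (a < 0)
        return -1;
    if (a == 0)
        _exit(0);

    b = fork();
    if (b < 0) {
        waitpid(a, NULL, 0);
        return -1;
    }
    if (b == 0)
        _exit(0);

    /* Targeted wait: only the child whose PID we name is reaped. */
    if (waitpid(a, NULL, 0) != a)
        return -1;

    /* Catch-all wait: whatever the targeted wait left behind shows up here. */
    while (waitpid(-1, NULL, 0) > 0)
        leftovers++;

    return leftovers;
}
```

The second child is left for the catch-all loop. A program that only ever issues the targeted form never collects children whose PIDs it does not know about.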
Some readers may want to see the code that hands processes over right away, so here it is, in forget_original_parent:
static inline void forget_original_parent(struct task_struct *father, struct list_head *to_release)
{
    struct task_struct *p, *reaper = father;
    struct list_head *_p, *_n;

    do {
        reaper = next_thread(reaper); // look for a new father in this thread group, that is, find an "uncle"
        if (reaper == father) {
            reaper = child_reaper; // none found, so fall back to the init process
            break;
        }
    } while (reaper->state >= TASK_ZOMBIE);
    // The rest is omitted. It is enough to understand that reaper is now
    // the new father of all children of the exiting process.
}
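
The effect of this hand-over is visible from user space through getppid. The sketch below orphans a grandchild and reports which process adopted it. The function name orphan_new_parent is hypothetical. On current Linux systems the adopter is either PID 1 or a process that has registered itself as a sub-reaper, so the sketch makes no assumption about which one it is.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a child that forks a grandchild and then exits at once.
 * Returns the PID that adopted the orphaned grandchild, or -1 on error.
 * *orig_parent receives the PID of the grandchild's original parent. */
pid_t orphan_new_parent(pid_t *orig_parent)
{
    int fds[2];
    pid_t child, adopter = -1;

    if (pipe(fds) < 0)
        return -1;

    child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        pid_t me = getpid();
        pid_t grandchild = fork();

        if (grandchild == 0) {
            pid_t now;

            close(fds[0]);
            /* Poll until the kernel has handed us to a new parent. */
            while ((now = getppid()) == me)
                usleep(1000);
            if (write(fds[1], &now, sizeof now) < 0)
                _exit(1);
            _exit(0);
        }
        _exit(0);               /* the child dies, orphaning the grandchild */
    }

    close(fds[1]);
    *orig_parent = child;
    waitpid(child, NULL, 0);    /* reap the child ourselves */

    /* The grandchild reports its new parent; EOF means it never ran. */
    if (read(fds[0], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
        adopter = -1;
    close(fds[0]);
    return adopter;
}
```

Whoever the adopter turns out to be is now the only process that can reap the grandchild when it exits.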
By the reasoning above, the busybox shell would have to be in the same thread group as its children for them to be handed to it (it is obviously not the init process). So look at msh.c again: wherever it creates a new process it uses vfork, throughout the file. vfork makes the child share the virtual address space of the current process; sys_vfork explicitly passes the CLONE_VM flag. That does not necessarily put the shell in the same thread group as the child, but it does tie the child closely to the parent. Until it calls exec, a vfork child runs entirely in the parent's address space, which saves the cost of copying it; only at exec does it separate from the parent.
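
For reference, the vfork pattern looks like this from user space. It is a minimal sketch rather than the msh.c code, and the function name run_with_vfork is an assumption. After vfork the child borrows the parent's address space, so it should do nothing except call exec or _exit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments in a vfork child and wait for it.
 * Returns the child's exit status, or -1 on error or abnormal exit. */
int run_with_vfork(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: still inside the parent's address space until exec. */
        execvp(argv[0], argv);
        _exit(127);             /* exec failed; never return from here */
    }

    /* Parent: it resumes only after the child has called exec or _exit. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The waitpid at the end is a targeted wait on a PID the caller created itself, which is the only kind of wait the article found in msh.c.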
