Service 'zygote' killed by signal: why zygote gets killed
I. Problem Description
01-07 21:57:03.228 1690 2829 D ActivityManager: cleanUpApplicationRecord -- 5762
01-07 21:57:03.232 1690 1702 W WindowManager: Attempted to remove non-existing token: android.os.Binder@333a888
01-07 21:57:03.233 1690 2829 W ActivityManager: Scheduling restart of crashed service com.android.statementservice/.DirectStatementService in 60923ms
01-07 21:57:03.234 2892 3105 E WtProcessController: Error pid or pid not exist
01-07 21:57:03.234 1690 3683 D ActivityManager: cleanUpApplicationRecord -- 5324
01-07 21:57:03.235 1690 3683 I AutoStartManagerService: MIUILOG- Reject RestartService packageName :com.android.email uid : 10064
01-07 21:57:03.236 2892 3105 E WtProcessController: Error pid or pid not exist
01-07 21:57:03.236 1690 2876 D ActivityManager: cleanUpApplicationRecord -- 5303
01-07 21:57:03.237 1690 2876 I AutoStartManagerService: MIUILOG- Reject RestartService packageName :com.miui.personalassistant uid : 10040
01-07 21:57:03.303 10500 10500 W init : type=1400 audit(0.0:2272): avc: denied { write } for name="zygote64_pid" dev="debugfs" ino=12729 scontext=u:r:init:s0 tcontext=u:object_r:debugfs_ktrace:s0 tclass=file permissive=0
01-07 21:57:03.311 739 739 I cnss-daemon: RTM_NEWNEIGH message received: 28
01-07 21:57:03.311 739 739 E cnss-daemon: Stale or unreachable neighbors, ndm state: 32
01-07 21:57:03.314 553 553 I ServiceManager: service 'media.camera' died
01-07 21:57:03.314 553 553 I ServiceManager: service 'media.player' died
01-07 21:57:03.314 553 553 I ServiceManager: service 'media.resource_manager' died
01-07 21:57:03.317 730 980 E OMXNodeInstance: !!! Observer died. Quickly, do something, ... anything...
01-07 21:57:03.373 553 553 I ServiceManager: service 'media.radio' died
01-07 21:57:03.373 553 553 I ServiceManager: service 'media.sound_trigger_hw' died
01-07 21:57:03.373 553 553 I ServiceManager: service 'media.audio_flinger' died
01-07 21:57:03.373 553 553 I ServiceManager: service 'media.audio_policy' died
01-07 21:57:03.388 553 553 I ServiceManager: service 'fingerprints_service' died
1690 is the pid of system_server before the restart. The log above shows system_server running normally up to this point. Immediately afterwards, one service after another dies and the system begins to restart, with no error message from system_server in between. In a case like this, we suspect that system_server was restarted, directly or indirectly, because some other service failed; for example, a restart of SurfaceFlinger causes system_server to restart. Checking the log shows that the pid of SurfaceFlinger has not changed, and:
u:r:zygote:s0 root 1375 1 1613000 25752 20 0 0 0 fg poll_sched 0000000000 S zygote
u:r:zygote:s0 root 10500 1 2175000 92232 20 0 0 0 fg poll_sched 0000000000 S zygote64
The pid of the zygote service has changed. It is easy to infer that zygote64 was restarted, and that this in turn restarted system_server. Searching the kernel log turns up:
<13>[ 9907.324247] init: Service 'zygote' (pid 1374) killed by signal 1
<13>[ 9907.324349] init: Service 'zygote' (pid 1374) killing any children in process group
zygote64 was killed by signal 1. What is signal 1? We can look it up with "kill -l":
Signal 1 is SIGHUP.
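The same lookup can be scripted. The sketch below is only an illustration, not part of the original analysis: it pulls the service name, pid, and signal number out of an init log line like the one above and maps the number to a name with Python's standard signal module. The regular expression and the function name describe_kill are assumptions made for this example, and the number-to-name mapping is the one of the machine the script runs on (Linux is assumed).

```python
import re
import signal

# Assumed pattern, written to match the init log line quoted above.
INIT_KILLED = re.compile(
    r"init: Service '(?P<name>[^']+)' \(pid (?P<pid>\d+)\) "
    r"killed by signal (?P<sig>\d+)"
)


def describe_kill(line):
    """Return (service, pid, signal name) for an init 'killed by signal' line,
    or None if the line does not match."""
    match = INIT_KILLED.search(line)
    if match is None:
        return None
    number = int(match.group("sig"))
    try:
        sig_name = signal.Signals(number).name
    except ValueError:
        # Not every number has a named constant (e.g. most real-time signals).
        sig_name = "signal %d" % number
    return match.group("name"), int(match.group("pid")), sig_name


if __name__ == "__main__":
    line = "<13>[ 9907.324247] init: Service 'zygote' (pid 1374) killed by signal 1"
    print(describe_kill(line))
```

Run against the log line above, this reports the zygote service, pid 1374, and SIGHUP.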
II. SIGHUP
From the above analysis, we can see that zygote64 is killed by SIGHUP. Let's take a look at how SIGHUP is generated.
kernel/msm-4.4/kernel/exit.c
/*
 * Check to see if any process groups have become orphaned as
 * a result of our exiting, and if they have any stopped jobs,
 * send them a SIGHUP and then a SIGCONT. (POSIX 3.2.2.2)
 */
static void
kill_orphaned_pgrp(struct task_struct *tsk, struct task_struct *parent)
{
    struct pid *pgrp = task_pgrp(tsk);
    struct task_struct *ignored_task = tsk;

    if (!parent)
        /* exit: our father is in a different pgrp than
         * we are and we were the only connection outside.
         */
        parent = tsk->real_parent;
    else
        /* reparent: our child is in a different pgrp than
         * we are, and it was the only connection outside.
         */
        ignored_task = NULL;

    if (task_pgrp(parent) != pgrp &&
        task_session(parent) == task_session(tsk) &&
        will_become_orphaned_pgrp(pgrp, ignored_task) &&
        has_stopped_jobs(pgrp)) {
        __kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp);
        __kill_pgrp_info(SIGCONT, SEND_SIG_PRIV, pgrp);
    }
}
When a series of conditions is met, __kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp) is called to send a SIGHUP signal to every process in pgrp. How should we interpret these conditions?
The callers of __kill_pgrp_info(...) are shown in the figure:
[Figure: callers of __kill_pgrp_info(...)]
Of the five places, let's look at the scenarios in which kill_orphaned_pgrp is called:
kernel/msm-4.4/kernel/exit.c
/*
 * This does two things:
 *
 * A. Make init inherit all the child processes
 * B. Check to see if any process groups have become orphaned
 *    as a result of our exiting, and if they have any stopped
 *    jobs, send them a SIGHUP and then a SIGCONT. (POSIX 3.2.2.2)
 */
static void forget_original_parent(struct task_struct *father,
                                   struct list_head *dead)
{
    struct task_struct *p, *t, *reaper;

    if (unlikely(!list_empty(&father->ptraced)))
        exit_ptrace(father, dead);

    // find the child reaper for the exiting process
    reaper = find_child_reaper(father);
    // if there are no child processes, return
    if (list_empty(&father->children))
        return;

    // find the new reaper
    reaper = find_new_reaper(father, reaper);
    list_for_each_entry(p, &father->children, sibling) {
        for_each_thread(p, t) {
            t->real_parent = reaper;
            BUG_ON((!t->ptrace) != (t->parent == father));
            if (likely(!t->ptrace))
                t->parent = t->real_parent;
            if (t->pdeath_signal)
                group_send_sig_info(t->pdeath_signal,
                                    SEND_SIG_NOINFO, t);
        }
        /*
         * If this is a threaded reparent there is no need to
         * notify anyone anything has happened.
         */
        if (!same_thread_group(reaper, father))
            reparent_leader(father, p, dead);
    }
    list_splice_tail_init(&father->children, &reaper->children);
}
Note that the parameter father is actually a pointer to the task that is exiting in do_exit, so what this function mainly does is:
- Find a new father (real_parent) for every child process of father (that is, of the exiting task) and for every thread of each child process.
- If the new reaper and father do not belong to the same thread group, call reparent_leader(father, p, dead) for each child process p of father. (Note that father here is not the reaper we just found, but the task that is exiting.) reparent_leader in turn calls kill_orphaned_pgrp(p, father), which is the first place where kill_orphaned_pgrp is called.
kernel/msm-4.4/kernel/exit.c
/*
 * Send signals to all our closest relatives so that they know
 * to properly mourn us..
 */
static void exit_notify(struct task_struct *tsk, int group_dead)
{
    bool autoreap;
    struct task_struct *p, *n;
    LIST_HEAD(dead);

    write_lock_irq(&tasklist_lock);
    forget_original_parent(tsk, &dead);

    if (group_dead)
        kill_orphaned_pgrp(tsk->group_leader, NULL);

    if (unlikely(tsk->ptrace)) {
        int sig = thread_group_leader(tsk) &&
                  thread_group_empty(tsk) &&
                  !ptrace_reparented(tsk) ?
                  tsk->exit_signal : SIGCHLD;
        autoreap = do_notify_parent(tsk, sig);
    } else if (thread_group_leader(tsk)) {
        autoreap = thread_group_empty(tsk) &&
                   do_notify_parent(tsk, tsk->exit_signal);
    } else {
        autoreap = true;
    }

    tsk->exit_state = autoreap ? EXIT_DEAD : EXIT_ZOMBIE;
    if (tsk->exit_state == EXIT_DEAD)
        list_add(&tsk->ptrace_entry, &dead);

    /* mt-exec, de_thread() is waiting for group leader */
    if (unlikely(tsk->signal->notify_count < 0))
        wake_up_process(tsk->signal->group_exit_task);
    write_unlock_irq(&tasklist_lock);

    list_for_each_entry_safe(p, n, &dead, ptrace_entry) {
        list_del_init(&p->ptrace_entry);
        release_task(p);
    }
}
group_dead is a parameter passed in when exit_notify(...) is called; it indicates that the exiting task is the last task of its thread group to exit. tsk->group_leader is the thread group leader, that is, the task whose pid is the tgid. This is the second place where kill_orphaned_pgrp is called.
Let's compare the two calls to kill_orphaned_pgrp. Assume the exiting task is A (A is the group_leader and the last task of its thread group to exit), and B is a child process of A (forked from A). Then the two calls to kill_orphaned_pgrp are:
kill_orphaned_pgrp(B, A)
kill_orphaned_pgrp(A, NULL)
These two calls therefore cover two different scenarios:
Scenario 1, kill_orphaned_pgrp(B, A), is shown in the figure:
[Figure: scenario 1, kill_orphaned_pgrp(B, A)]
With this figure in mind, we can understand the four conditions that must hold before __kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp) is called, where pgrp is B's process group (pgrp2 in the figure):
- task_pgrp(parent) != pgrp: process B and its parent A are not in the same process group.
- task_session(parent) == task_session(tsk): A and B are in the same session.
- will_become_orphaned_pgrp(pgrp, ignored_task) (ignored_task is NULL here): put visually, within pgrp2 process B was the only bridge to other process groups, and no other process can serve as such a bridge (processes whose parent is init do not count).
- has_stopped_jobs(pgrp): a process in pgrp2 is in the stopped state (p->signal->flags & SIGNAL_STOP_STOPPED is true).
Scenario 2, kill_orphaned_pgrp(A, NULL), is shown in the figure:
[Figure: scenario 2, kill_orphaned_pgrp(A, NULL)]
With this figure in mind, we can understand the four conditions that must hold before __kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp) is called, where pgrp is A's process group (pgrp2 in the figure):
- task_pgrp(parent) != pgrp: process A and its parent are not in the same process group.
- task_session(parent) == task_session(tsk): process A and its parent are in the same session.
- will_become_orphaned_pgrp(pgrp, ignored_task) (ignored_task is A here): put visually, within pgrp2 process A is the only bridge to other process groups, and no other process can serve as such a bridge (processes whose parent is init do not count).
- has_stopped_jobs(pgrp): a process in pgrp2 is in the stopped state (p->signal->flags & SIGNAL_STOP_STOPPED is true).
Putting the two scenarios together: when process A exits, the kernel checks whether the exit turns the pgrp of one of A's child processes into an orphaned process group (scenario 1), and whether it turns A's own pgrp into an orphaned process group (scenario 2).
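The four conditions can be exercised in a small user-space model. What follows is a simplified sketch and not kernel code. The Task class, the pids, and the scenario_1 / scenario_2 helpers are hypothetical, and init is modeled as pid 1. The kernel's real will_become_orphaned_pgrp also skips tasks that have already exited, which this model leaves out.

```python
from dataclasses import dataclass
from typing import List, Optional

INIT_PID = 1  # assumption: the global init process is modeled as pid 1


@dataclass(eq=False)
class Task:
    pid: int
    pgid: int
    sid: int
    parent: Optional["Task"] = None  # models real_parent
    stopped: bool = False            # models SIGNAL_STOP_STOPPED


def will_become_orphaned_pgrp(tasks: List[Task], pgid: int,
                              ignored: Optional[Task]) -> bool:
    """True if no member of pgid (other than `ignored`) still has a parent in
    another process group of the same session. Children of init do not count."""
    for p in tasks:
        if p.pgid != pgid or p is ignored:
            continue
        if p.parent is None or p.parent.pid == INIT_PID:
            continue
        if p.parent.pgid != pgid and p.parent.sid == p.sid:
            return False  # p is still a bridge to another process group
    return True


def has_stopped_jobs(tasks: List[Task], pgid: int) -> bool:
    return any(p.stopped for p in tasks if p.pgid == pgid)


def sends_sighup(tasks: List[Task], tsk: Task,
                 parent: Optional[Task] = None) -> bool:
    """Model of the condition checked in kill_orphaned_pgrp(tsk, parent)."""
    ignored: Optional[Task] = tsk
    if parent is None:
        parent = tsk.parent  # exit case: tsk itself is leaving
    else:
        ignored = None       # reparent case: tsk stays in its group
    return (parent.pgid != tsk.pgid
            and parent.sid == tsk.sid
            and will_become_orphaned_pgrp(tasks, tsk.pgid, ignored)
            and has_stopped_jobs(tasks, tsk.pgid))


def scenario_1(child_stopped: bool, extra_bridge: bool = False) -> bool:
    """A exits; its child B lives in a different process group."""
    init = Task(1, 0, 0)
    a = Task(200, 200, 0, parent=init)
    b = Task(300, 300, 0, parent=a, stopped=child_stopped)
    tasks = [init, b]
    if extra_bridge:
        x = Task(400, 400, 0, parent=init)
        tasks += [x, Task(301, 300, 0, parent=x)]  # second bridge into pgrp 300
    b.parent = init  # forget_original_parent() has already reparented B
    return sends_sighup(tasks, b, parent=a)


def scenario_2(peer_stopped: bool) -> bool:
    """A exits; A itself was the only bridge out of its process group."""
    init = Task(1, 0, 0)
    p = Task(100, 100, 0, parent=init)
    a = Task(200, 300, 0, parent=p)
    s = Task(300, 300, 0, parent=init, stopped=peer_stopped)
    return sends_sighup([init, p, a, s], a)


if __name__ == "__main__":
    print(scenario_1(True), scenario_1(False),
          scenario_1(True, extra_bridge=True), scenario_2(True))
```

In this model SIGHUP is sent only when the group really loses its last bridge and also contains a stopped process. Remove the stopped process, or add a second bridge, and nothing is sent.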
III. Practice
The analysis above gives us clear conclusions about the two scenarios that lead to __kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp). Below we use some examples to verify those conclusions, combining theory with practice.
In many of the logs, the crash happens when the system cleans up the com.tencent.tmgp.speedmobile app, so let's look at which processes this app has (we will practice on a 32-bit machine first):
1. 32-bit Machine
There are five processes with the same uid. We can use the commands cat /proc/pid/stat and cat /proc/pid/status to view the pid, ppid, pgid, sid, and other information of each one. Take 5999 as an example:
The first number is the pid, and the three numbers after S (the process state) are the ppid, pgid, and sid.
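Reading these fields by eye is error-prone, so here is a minimal parsing sketch. The function name parse_stat and the sample line are hypothetical. The sample is not real output from the device; it only mirrors the pid, ppid, pgid, and sid values listed for process 5999. The parser assumes the usual Linux layout, in which the command name is wrapped in parentheses.

```python
def parse_stat(line: str) -> dict:
    """Extract pid, comm, state, ppid, pgid and sid from a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the LAST ')' instead of splitting on whitespace.
    left, right = line.rsplit(")", 1)
    pid_text, comm = left.split("(", 1)
    fields = right.split()
    return {
        "pid": int(pid_text),
        "comm": comm,
        "state": fields[0],
        "ppid": int(fields[1]),
        "pgid": int(fields[2]),
        "sid": int(fields[3]),
    }


# Hypothetical sample line; only the first six fields matter here.
SAMPLE = "5999 (some app) S 310 5999 0 0 -1 1077952832"

if __name__ == "__main__":
    print(parse_stat(SAMPLE))
```

On a device, the same function could be fed the contents of /proc/pid/stat for each of the five processes to build the table of relationships.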
Using these two commands, we can list the relationships between the five processes:
Comm | Pid | Ppid | Tgid | Pgid | Sid
com.tencent.tmgp.speedmobile | 5999 | 310 | 5999 | 5999 | 0
xg_service_v2 | 6076 | 310 | 6076 | 310 | 0
libxguardian.so | 6199 | 1 | 6199 | 310 | 0
debugadh | 6274 | 5999 | 6274 | 6274 | 0
debugadh | 6276 | 6274 | 6276 | 310 | 0
By default, the pgid of zygote's child processes and grandchild processes (that is, all Java processes) should equal the pid of the zygote process. Here, processes 5999 and 6274 must have reset their own pgid. The relationship between the five processes is shown in the figure:
[Figure: relationship between the five processes]
From the relationship between the five processes above and the conclusions of the earlier analysis, we can infer:
- pgrp 310 can be considered to have two bridges to other process groups, namely process C and process E.
- Send a stop signal (kill -19) to a process in pgrp 310, then kill process B: this simulates scenario 1 above.
- Send a stop signal to a process in pgrp 310 (other than process C), then kill process C: this simulates scenario 2 above.
- Kill process B or C, send a stop signal to a process in pgrp 310 (other than processes C and E), then kill process E: this simulates scenario 2.
You can verify the three cases above by hand; the results match the conclusions exactly.
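As a side note on the pgid reset assumed for processes 5999 and 6274: the source does not show how the app does it. One way a process can leave its parent's process group is setpgid(0, 0), sketched below with Python's os module on Linux. The function name child_pgid_after_reset is made up for this example.

```python
import os


def child_pgid_after_reset() -> tuple:
    """Fork a child that moves itself into a new process group.

    Returns (parent_pgid, child_pid, child_pgid)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: right after fork it still shares the parent's process group.
        try:
            os.close(read_fd)
            os.setpgid(0, 0)  # become leader of a new group: pgid == own pid
            os.write(write_fd, str(os.getpgid(0)).encode())
        finally:
            os._exit(0)       # never fall back into the caller's code
    # Parent: read the pgid reported by the child, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_pgid = int(reader.read())
    os.waitpid(pid, 0)
    return os.getpgid(0), pid, child_pgid


if __name__ == "__main__":
    parent_pgid, child_pid, child_pgid = child_pgid_after_reset()
    print("parent pgid:", parent_pgid)
    print("child pid:", child_pid, "child pgid:", child_pgid)
```

After the call, the child's pgid equals its own pid and differs from the parent's pgid, which is the same pattern the table shows for 5999 and 6274.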
2. 64-bit Machine
Relationships between seven processes:
Comm | Pid | Ppid | Tgid | Pgid | Sid
zygote | 709 | 1 | 709 | 709 | 0
zygote64 | 708 | 1 | 708 | 708 | 0
com.tencent.tmgp.speedmobile | 5013 | 709 | 5013 | 5013 | 0
xg_service_v2 | 5101 | 709 | 5101 | 708 | 0
libxguardian.so | 5201 | 1 | 5201 | 708 | 0
debugadh | 5232 | 5013 | 5232 | 5232 | 0
debugadh | 5234 | 5232 | 5234 | 708 | 0
By default, the pgid of all Java processes (except the zygote process) should equal the pid of the zygote64 process. The zygote process is in a process group of its own. The relationship between the seven processes above is shown in the figure:
[Figure: relationship between the seven processes]
From the relationship between the processes above and the conclusions of the earlier analysis, we can infer:
- pgrp zygote64 can be considered to have multiple bridges to other process groups, namely process C, process E, and the child processes of the zygote process.
- Kill all child processes of the zygote process (with one exception), send a stop signal (kill -19) to a process in pgrp zygote64, then kill process B: this simulates scenario 1 above.
- Kill all child processes of the zygote process (except A), send a stop signal to a process in pgrp zygote64 (other than process C), then kill process C: this simulates scenario 2 above.
- Kill all child processes of the zygote process, kill process B or C, send a stop signal to a process in pgrp zygote64 (other than processes C and E), then kill process E: this simulates scenario 2. Note that this is only theoretical: in practice, because of how com.tencent.tmgp.speedmobile is set up, killing process D makes process E crash at the same time.
- Kill all child processes of the zygote process (except D or some other process X), kill process B or C, send a stop signal to a process in pgrp zygote64 (other than the process that is about to be killed), then kill process D or X: this simulates scenario 2.
You can verify the cases above by hand; the results match the conclusions exactly.
To sum up, the only real difference between 32-bit and 64-bit machines is the zygote process group. If we kill all child processes of zygote, the process relationships on a 64-bit system become equivalent to those on a 32-bit system.
In fact, on 64-bit machines the zygote process has very few child processes; most Java processes are children of zygote64. It can therefore easily happen that all child processes of the zygote process have exited.