(Repost) Analysis of an unexpected Tomcat process exit

Source: Internet
Author: User


Before the holiday, a department reported that Tomcat in its test environment was exiting unexpectedly. After checking the actual environment, we found that it was not a JVM crash. The log contained records of the process being torn down, from pause to destroy:

    org.apache.coyote.AbstractProtocol pause
    Pausing ProtocolHandler
    org.apache.catalina.core.StandardService stopInternal
    Stopping service Catalina
    org.apache.coyote.AbstractProtocol stop
    Stopping ProtocolHandler
    org.apache.coyote.AbstractProtocol destroy
    Destroying ProtocolHandler

Judging from the above logs:

1) Tomcat was not shut down normally through the script (via port: the shutdown command sent to port 8005).

If the service is shut down normally (via port), there is a log entry before the pause:

    org.apache.catalina.core.StandardServer await
    A valid shutdown command was received via the shutdown port. Stopping the Server instance.

followed by pause -> stop -> destroy.

2) Tomcat's shutdown hook was triggered and ran the destroy logic.

There are two possibilities. One is that application code calls System.exit somewhere to exit the JVM. The other is a signal sent by the system (except kill -9: with SIGKILL the JVM gets no chance to run its shutdown hook).
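The difference between a catchable signal and SIGKILL can be reproduced without a JVM. The sketch below is only an analogy: a bash trap stands in for the JVM shutdown hook, and the names run_with_signal and HOOK_FILE are made up for this illustration. It assumes bash and mktemp are available.

```shell
#!/bin/bash
# Analogy only: a bash trap plays the role of the JVM shutdown hook.
# run_with_signal and HOOK_FILE are illustrative names, not part of Tomcat.
run_with_signal() {
    sig=$1
    hook_file=$(mktemp)
    # Child: install a TERM handler that records that it ran, then idle.
    HOOK_FILE="$hook_file" bash -c '
        trap "echo ran > \"\$HOOK_FILE\"; exit 0" TERM
        while :; do sleep 0.1; done
    ' &
    child=$!
    sleep 0.5                      # give the child time to install its trap
    kill -s "$sig" "$child"
    wait "$child" 2>/dev/null
    if [ -s "$hook_file" ]; then
        hook_result="hook ran"
    else
        hook_result="no hook"
    fi
    rm -f "$hook_file"
}

run_with_signal TERM; term_result=$hook_result
run_with_signal KILL; kill_result=$hook_result
echo "SIGTERM: $term_result"
echo "SIGKILL: $kill_result"
```

With SIGTERM the handler gets a chance to run before the process exits; with SIGKILL the process is removed by the kernel and no handler code runs, which is why a kill -9 would not have left pause/stop/destroy records in the log.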

First, the application team and the middleware team checked the code and ruled out any use of System.exit in this application. That left only the signal case. After some troubleshooting, we found that the time of each unexpected Tomcat exit coincided exactly with the end of an ssh session.

With this clue, Yin Shi immediately looked at the script used in that test environment, which, simplified, was as follows:

    $ cat test.sh
    #!/bin/bash
    cd /data/server/tomcat/bin/
    ./catalina.sh start
    tail -f /data/server/tomcat/logs/catalina.out

After Tomcat starts, the current shell process does not exit; it blocks on the tail process, which prints the log to the terminal. In this state, if you close the ssh terminal window (with the mouse or a shortcut key), the java process exits too. But if you first terminate test.sh with ctrl-c and then close the ssh terminal, the java process does not exit.

This is an interesting phenomenon. When Tomcat is started with catalina.sh start, the java process is reparented to init (process id 1). It is no longer in a parent-child relationship with test.sh and has nothing to do with the ssh process either. So why does closing the ssh terminal window cause the java process to exit?

We guessed that when the ssh window is closed, a signal is sent to the current interactive shell and to child processes such as the running test.sh, and found a machine with systemtap to verify this. The stap script used was copied from Yunquan:

    function time_str: string () {
        return ctime(gettimeofday_s() + 8 * 60 * 60);
    }

    probe begin {
        printdln(" ", time_str(), "BEGIN");
    }

    probe end {
        printdln(" ", time_str(), "END");
    }

    probe signal.send {
        if (sig_name == "SIGHUP" || sig_name == "SIGQUIT" ||
            sig_name == "SIGINT" || sig_name == "SIGKILL" || sig_name == "SIGABRT") {
            printd(" ", time_str(), sig_name, "[", uid(), pid(), cmdline_str(),
                   "] -> [", task_uid(task), sig_pid, pid_name, "], ");
            task = pid2task(pid());
            while (task_pid(task) > 0) {
                printd(" ", "[", task_uid(task), task_pid(task), task_execname(task), "]");
                task = task_parent(task);
            }
            println("");
        }
    }

During the simulation, the process hierarchy (pstree) was roughly as follows. After Tomcat starts, the java process has already detached from test.sh and hangs under init:

|-sshd(1622)-+-sshd(11681)---sshd(11699)---bash(11700)---test.sh(13285)---tail(13299)

With the help of Bai Yu from the kernel team, we found that:

A) When ctrl-c is used to terminate the current test.sh process, the system events process sends SIGINT:

    SIGINT [0 11] -> [0 20629 tail]
    SIGINT [0 11] -> [0 20628 java]
    SIGINT [0 11] -> [0 20615 test.sh]

Note that pid 11 is the events process.

B) When the ssh terminal window is closed, sshd sends SIGHUP to its downstream processes. But why does the java process receive it too?

    SIGHUP [ 0 11681 sshd: hongjiang.wanghj [priv] ] -> [ 57316 11700 bash ]
    SIGHUP [ 57316 11700 -bash ] -> [ 57316 11700 bash ]
    SIGHUP [ 57316 11700 ] -> [ 0 13299 tail ]
    SIGHUP [ 57316 11700 ] -> [ 0 13298 java ]
    SIGHUP [ 57316 11700 ] -> [ 0 13285 test.sh ]

But Bai Yu was very busy and could not keep helping with the analysis (he offered some guesses, but they later proved wrong).

Once it was confirmed that a signal was the cause, my questions became:

1) Why does SIGINT (kill -2) not make the Tomcat process exit?
2) Why does SIGHUP (kill -1) make the Tomcat process exit?

My first thought was that the JVM might handle OS signals differently under certain parameters (or because of some JNI code). I checked the application's JVM parameters and found nothing suspicious. I also ruled out Tomcat's use of apr/tcnative.

To see how a JVM process handles SIGINT and SIGHUP by default, I simulated it with Scala's REPL:

    scala> Runtime.getRuntime().addShutdownHook(
             new Thread() { override def run(): Unit = { println("ok") } })

Sending kill -2 or kill -1 to this java process makes the JVM exit and triggers the shutdown hook. This matches Oracle's description of signal handling in the HotSpot virtual machine (linked from the original post): SIGTERM, SIGINT, and SIGHUP all trigger the shutdown hook.

So the JVM does not seem to be the issue. The next guess: is it related to the process state? The catalina.sh script does not use start-stop-daemon or a similar tool to start the java process. Simplified, what the start parameter executes is equivalent to:

eval '"/pathofjdk/bin/java"' 'params' org.apache.catalina.startup.Bootstrap start '&'

It simply runs java in the background. After the catalina.sh process exits, the ppid of the java process becomes 1.

I spent a lot of time suspecting that the cause was at the OS level, which turned out to be irrelevant. After the Spring Festival, Shao Ming and Li Quan joined the analysis; with their C background they know the low-level details better. After a long stretch of guessing and verifying, we finally confirmed that the cause lay in the shell.

SIGINT (kill -2) does not make the background java process exit

To simplify, we use sleep to stand in for the java process. In interactive mode:

    $ sleep 1000 &
    $ ps -opid,pgid,ppid,stat,cmd -C sleep
      PID  PGID  PPID STAT CMD
     9897  9897  9813 S    sleep 1000

Note that the pid of the sleep 1000 process equals its pgid (process group id). kill -2 does kill the sleep 1000 process.

Now we put the sleep process in a script and run it in the background:

    $ cat a.sh
    #!/bin/sh
    sleep 4400 &
    echo "shell exit"

After running a.sh, the pid of the sleep 4400 process differs from its pgid. The pgid is the id of its parent process, that is, the a.sh process that has already exited.

    $ ps -opid,pgid,ppid,comm -p 63376
      PID  PGID  PPID COMM
    63376 63375     1 sleep

In this case, kill -2 cannot kill the sleep 4400 process.
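The experiment can be scripted end to end. The following is a minimal sketch, assuming bash is installed; the inline bash -c command is a stand-in for a.sh, and the variable name sigint_result is chosen only for this example.

```shell
#!/bin/bash
# A non-interactive child shell starts sleep in the background and prints its PID.
# Output is redirected so the command substitution does not wait for sleep.
pid=$(bash -c 'sleep 30 >/dev/null 2>&1 & echo $!')
sleep 0.3                 # let the child finish setting up
kill -s INT "$pid"        # same as kill -2
sleep 0.3
if kill -0 "$pid" 2>/dev/null; then
    sigint_result="survived"
else
    sigint_result="exited"
fi
kill -s TERM "$pid" 2>/dev/null   # clean up the leftover sleep
echo "after SIGINT: $sigint_result"
```

Run as a script, this should report that the background sleep survived SIGINT, while the final SIGTERM still removes it.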

At this point we were very close to the cause. It had to be something the shell did to the signal handler. Shao Ming wrote a small program with a custom handler to check whether kill -2 works on it:

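    #include <stdio.h>
    #include <signal.h>
    #include <stdlib.h>

    /* Custom SIGINT handler: report and exit. */
    void my_handler(int sig) {
        printf("handler aaa\n");
        exit(0);
    }

    int main() {
        /* Install the custom handler for SIGINT. */
        signal(SIGINT, my_handler);
        for (;;) { }   /* spin until a signal arrives */
        return 0;
    }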

We then run the compiled a.out from the script:

    $ cat a.sh
    #!/bin/sh
    /tmp/a.out &

This time kill -2 does kill the a.out process. This shows that the shell set the signal disposition before the user logic ran, that is, when it forked the child process. With this clue, a Google search revealed that in non-interactive mode the shell sets SIGINT to be ignored (SIG_IGN) for background processes.
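On Linux the inherited disposition can also be read directly instead of inferred from kill. The sketch below assumes a Linux /proc filesystem, where /proc/<pid>/status reports the ignored-signal mask as a hexadecimal SigIgn field (bit n-1 stands for signal n); is_ignored is a helper name invented for this example.

```shell
#!/bin/bash
# Linux-only sketch: /proc/<pid>/status exposes the ignored-signal mask as hex (SigIgn).
# is_ignored is an illustrative helper name.
is_ignored() {   # usage: is_ignored <pid> <signal-number>
    mask=$(sed -n 's/^SigIgn:[[:space:]]*//p' "/proc/$1/status")
    bit=$(( 1 << ($2 - 1) ))
    if [ $(( 0x$mask & bit )) -ne 0 ]; then
        echo "ignored"
    else
        echo "not ignored"
    fi
}

# Background process started by a non-interactive shell (job control off).
pid=$(bash -c 'sleep 30 >/dev/null 2>&1 & echo $!')
sleep 0.3
sigint_state=$(is_ignored "$pid" 2)    # SIGINT
sigquit_state=$(is_ignored "$pid" 3)   # SIGQUIT
sighup_state=$(is_ignored "$pid" 1)    # SIGHUP
kill -s TERM "$pid" 2>/dev/null        # clean up
echo "SIGINT: $sigint_state, SIGQUIT: $sigquit_state, SIGHUP: $sighup_state"
```

SIGINT and SIGQUIT should show up as ignored. SIGHUP normally does not, unless the script itself was started with SIGHUP already ignored (for example under nohup).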

The interactive mode and non-interactive mode have different default job control methods.

Why does the shell not set SIGINT to ignored for background processes in interactive mode, while it does in non-interactive mode? This is fairly easy to understand. In interactive mode, if a foreground process runs too long, you can suspend it with ctrl-z and move it to the background with bg %n. Likewise, a background process started with cmd & can be brought back to the foreground with fg %n and then stopped with ctrl-c. So SIGINT cannot be ignored.

Why do background processes in interactive mode get their own process group id? If a background process used the parent's process group id by default, a keyboard event such as ctrl-c would deliver SIGINT to every member of that process group. Since job control requires that background processes not ignore SIGINT, a casual ctrl-c on the terminal could make all background processes exit, which is obviously unreasonable. To avoid this interference, each background process gets its own pgid.

In non-interactive mode, job control is usually not needed, so it is disabled by default (you can still enable it in a script with set -m). Without job control, background processes started by the script stay in the script's process group, so the shell sets them to ignore SIGINT to shield them from a SIGINT delivered to the group, since that signal is meaningless for them.

Back to the Tomcat example: when catalina.sh is run with the start parameter, java is started in the background by a non-interactive shell, so the shell sets the java process to ignore SIGINT. Therefore, when ctrl-c ends the test.sh process, the SIGINT sent by the system has no effect on java.
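The effect of set -m on process groups can be checked with a short script. This is a sketch under assumptions: it relies on the Linux /proc filesystem (field 5 of /proc/<pid>/stat is the process group id) and on bash, and pgid_of is a helper name made up for the example.

```shell
#!/bin/bash
# Linux-only sketch: field 5 of /proc/<pid>/stat is the process group id.
# pgid_of is an illustrative helper name.
pgid_of() {
    stat=$(cat "/proc/$1/stat")
    rest=${stat##*) }      # drop "pid (comm) " so the remaining fields split cleanly
    set -- $rest           # now $1=state, $2=ppid, $3=pgrp
    echo "$3"
}

# Same background command, started without and with job control.
plain_pid=$(bash -c 'sleep 30 >/dev/null 2>&1 & echo $!')
jobctl_pid=$(bash -c 'set -m; sleep 30 >/dev/null 2>&1 & echo $!')
sleep 0.3
plain_pgid=$(pgid_of "$plain_pid")
jobctl_pgid=$(pgid_of "$jobctl_pid")
kill -s TERM "$plain_pid" "$jobctl_pid" 2>/dev/null   # clean up
echo "without set -m: pid=$plain_pid pgid=$plain_pgid"
echo "with set -m:    pid=$jobctl_pid pgid=$jobctl_pgid"
```

Without set -m the background sleep keeps the process group of whoever launched the script, so its pgid differs from its pid. With set -m it becomes the leader of a new group, so the two numbers match.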

SIGHUP (kill -1) makes the Tomcat process exit

In non-interactive mode, the shell sets SIGINT and SIGQUIT to ignored for the java process, but it does not set SIGHUP to ignored. Look again at the process hierarchy at that point:

|-sshd(1622)-+-sshd(11681)---sshd(11699)---bash(11700)---test.sh(13285)---tail(13299)

After sshd passes SIGHUP to the bash process, bash passes SIGHUP on to its child processes. For its child test.sh, bash sends SIGHUP to all members of the test.sh process group. Because the java background process inherits its pgid from its parent catalina.sh (which in turn inherits it from its parent test.sh), the java process is still a member of the test.sh process group, so it receives SIGHUP and exits.

If we enable job control in test.sh, the java process no longer exits:

    #!/bin/bash
    set -m
    cd /home/admin/tt/tomcat/bin/
    ./catalina.sh start
    tail -f /home/admin/tt/tomcat/logs/catalina.out

Now the java background process still inherits the pgid of its parent catalina.sh, but catalina.sh no longer uses the test.sh process group; it uses its own pid as its pgid. After the catalina.sh process exits, the java process is reparented to init, and java is completely detached from the test.sh process, so bash no longer sends signals to it.

Address: http://hongjiang.info/why-kill-2-cannot-stop-tomcat/
