Tomcat shutdown analysis and thread handling methods


At work, we often run into seemingly inexplicable errors that occur because threads created by the application are not stopped when Tomcat shuts down. This article walks through the Tomcat shutdown process, discusses the causes of these errors, and proposes two feasible solutions.

Tomcat shutdown analysis

A Tomcat process is essentially a JVM process. Its internal structure is made up of the following components:

[Figure not reproduced: Tomcat's internal structure. Picture from the Internet.]

Server, Service, and Connector | Engine, Host, and Context.

In the implementation, Engine and Host are only abstractions; most of the core functionality is implemented in Context. There can be only one Server at the top level. One Server can contain multiple Services, and one Service can contain multiple Connectors and one Container. A Container is an abstraction of an Engine, Host, or Context. Loosely speaking, a Context corresponds to a webapp.

When Tomcat starts, the work of the main thread can be summarized as follows:

    public void start() {
        load();               // configure the server and initialize it
        getServer().start();  // start the server and all containers that belong to it
        Runtime.getRuntime().addShutdownHook(shutdownHook);  // register the shutdown hook
        await();              // wait here until the Tomcat process is told to stop
        stop();
    }

The main thread builds the containers by scanning the configuration file (server.xml by default), from the top-level Server down to the Service, the Connector, and so on (including the Contexts).

It calls the start method of Catalina, which in turn calls the start method of Server. That start method starts the entire container tree.

Containers such as Server, Service, Connector, and Context all implement the Lifecycle interface, and these components form a strict tree from top to bottom. Tomcat manages every other container in the tree by managing the lifecycle of the root node (the Server).

It then blocks in the await() method, which waits for a network connection on the shutdown port. When a client connects to that port and sends the configured string (usually "SHUTDOWN"), await() returns and the main thread continues.

The main thread executes the stop() method, which calls the stop method of every container under the Server. When stop() finishes, the main thread exits. If nothing goes wrong, the Tomcat container stops running at this point.
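The blocking behavior of await() can be illustrated with a plain ServerSocket. This is a minimal sketch under stated assumptions, not Tomcat's implementation: the class name ShutdownPortSketch, the helper methods send and demo, the OS-chosen port, and the use of "SHUTDOWN" as the command are all hypothetical choices made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified illustration only; this is not Tomcat's code and all names are hypothetical.
class ShutdownPortSketch {
    private final ServerSocket serverSocket;
    private final String shutdownCommand;

    ShutdownPortSketch(String shutdownCommand) throws IOException {
        // Port 0 lets the OS pick a free port; listen on loopback only.
        this.serverSocket = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        this.shutdownCommand = shutdownCommand;
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    // Blocks until a client sends the expected command, then returns.
    void await() throws IOException {
        try (ServerSocket ss = serverSocket) {
            while (true) {
                try (Socket client = ss.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(
                             client.getInputStream(), StandardCharsets.US_ASCII))) {
                    if (shutdownCommand.equals(in.readLine())) {
                        return; // the caller would now go on to stop()
                    }
                    // Anything else is ignored and the loop keeps waiting.
                }
            }
        }
    }

    // In essence what a stop command does: connect and send the string.
    static void send(int port, String command) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             OutputStream out = socket.getOutputStream()) {
            out.write((command + "\n").getBytes(StandardCharsets.US_ASCII));
            out.flush();
        }
    }

    // Runs one full exchange; returns true if await() returned normally.
    static boolean demo() {
        try {
            ShutdownPortSketch sketch = new ShutdownPortSketch("SHUTDOWN");
            AtomicBoolean returned = new AtomicBoolean(false);
            Thread mainLike = new Thread(() -> {
                try {
                    sketch.await();
                    returned.set(true);
                } catch (IOException e) {
                    // leave 'returned' false
                }
            });
            mainLike.setDaemon(true); // so a failed demo cannot keep the JVM alive
            mainLike.start();
            send(sketch.port(), "hello");    // ignored by await()
            send(sketch.port(), "SHUTDOWN"); // releases await()
            mainLike.join(5000);
            return returned.get();
        } catch (IOException | InterruptedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("await() returned: " + demo());
    }
}
```

The point of the sketch is only that the waiting thread does nothing until the command arrives, which is why everything after await() in the start method runs at shutdown time.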

Note that from the layer below the Service downward, the stop() method is executed asynchronously: each container submits the stopping of its children to an executor and then waits for them to finish. The code is as follows:

    protected synchronized void stopInternal() throws LifecycleException {
        /* Other code */
        Container children[] = findChildren();
        List<Future<Void>> results = new ArrayList<Future<Void>>();
        for (int i = 0; i < children.length; i++) {
            results.add(startStopExecutor.submit(new StopChild(children[i])));
        }
        boolean fail = false;
        for (Future<Void> result : results) {
            try {
                result.get();
            } catch (Exception e) {
                log.error(sm.getString("containerBase.threadedStopFailed"), e);
                fail = true;
            }
        }
        if (fail) {
            throw new LifecycleException(
                    sm.getString("containerBase.threadedStopFailed"));
        }
        /* Other code */
    }
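The submit-then-wait pattern in stopInternal can be tried out in isolation. The following is a self-contained sketch of the same idea, not Tomcat code: the Child interface, the fixed pool of four threads, and the choice to count failures instead of throwing LifecycleException are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelStopSketch {
    // Hypothetical stand-in for a child container.
    interface Child {
        void stop() throws Exception;
    }

    // Stops all children in parallel and waits for every one of them.
    // Returns the number of children whose stop() failed.
    static int stopAll(List<Child> children) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<Void>> results = new ArrayList<>();
            for (Child child : children) {
                Callable<Void> task = () -> {
                    child.stop();
                    return null;
                };
                results.add(executor.submit(task));
            }
            int failures = 0;
            for (Future<Void> result : results) {
                try {
                    result.get(); // blocks until this child has finished stopping
                } catch (ExecutionException e) {
                    failures++; // a failing child does not prevent waiting for the rest
                }
            }
            return failures;
        } finally {
            executor.shutdown();
        }
    }

    // Three children, one of which fails to stop.
    static String demo() {
        AtomicInteger stopped = new AtomicInteger();
        Child ok = () -> stopped.incrementAndGet();
        Child broken = () -> {
            throw new IllegalStateException("cannot stop");
        };
        try {
            int failed = stopAll(Arrays.asList(ok, broken, ok));
            return "stopped=" + stopped.get() + " failed=" + failed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "stopped=2 failed=1"
    }
}
```

Because the parent waits on every Future before it returns, the children stop concurrently with each other but the parent's own stop still completes only after all of them have finished.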

The children being stopped normally form a layered Engine-Host-Context structure, so the stop() method of Context is called last. The stopInternal method of Context calls these three methods:

    filterStop();
    listenerStop();
    ((Lifecycle) loader).stop();

(Note: this is only a partial list, limited to the calls relevant to this analysis. Methods unrelated to the process are omitted.)

Here, filterStop cleans up the filters registered in web.xml, and listenerStop goes on to call the contextDestroyed method of the listeners registered in web.xml (if several listeners are registered, they are called in the reverse of their registration order). The loader here is WebappClassLoader. Its important operations (trying to stop threads, clearing referenced resources, and unloading classes) are all done in its stop method.
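The teardown order described above can be written down as a small model. This is only an illustration of the ordering, assuming hypothetical names (ContextStopOrderSketch, stopOrder); it does not call any Tomcat API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ContextStopOrderSketch {
    // Returns the teardown order for a context whose listeners were
    // registered in the given order.
    static List<String> stopOrder(List<String> listenersInRegistrationOrder) {
        List<String> order = new ArrayList<>();
        order.add("filterStop");
        // Listeners are notified in reverse registration order.
        for (int i = listenersInRegistrationOrder.size() - 1; i >= 0; i--) {
            order.add("contextDestroyed:" + listenersInRegistrationOrder.get(i));
        }
        // The class loader (and with it any thread cleanup) comes last.
        order.add("loader.stop");
        return order;
    }

    public static void main(String[] args) {
        // prints [filterStop, contextDestroyed:B, contextDestroyed:A, loader.stop]
        System.out.println(stopOrder(Arrays.asList("A", "B")));
    }
}
```

The detail that matters for the rest of the analysis is the last entry: anything the loader does to user threads happens only after every listener has already been destroyed.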

If Spring Web is used, the listener registered in web.xml is:

    org.springframework.web.context.ContextLoaderListener

Looking at the ContextLoaderListener code, it is easy to see that Spring initializes its beans in the listener's contextInitialized method and cleans them up in the contextDestroyed method.

    public class ContextLoaderListener extends ContextLoader implements ServletContextListener {

        public ContextLoaderListener() {
        }

        public ContextLoaderListener(WebApplicationContext context) {
            super(context);
        }

        public void contextInitialized(ServletContextEvent event) {
            this.initWebApplicationContext(event.getServletContext());
        }

        public void contextDestroyed(ServletContextEvent event) {
            this.closeWebApplicationContext(event.getServletContext());
            ContextCleanupListener.cleanupAttributes(event.getServletContext());
        }
    }

An important point follows from this: user threads are stopped in the loader, but the loader's stop method runs after listenerStop. So even if the loader does succeed in terminating the threads started by the user, those threads may still try to use the Spring framework before they are terminated, and by then Spring has already been shut down in the listener. Moreover, during the loader's thread cleanup, user-started threads are forcibly terminated (using Thread.stop()) only if the clearReferencesStopThreads parameter is configured. In most cases this parameter is not configured, in order to protect data integrity. In other words, threads started by the user inside the webapp (including those of Executors) are not terminated just because the container exits.

We know that the JVM exits for two main reasons:

The System.exit() method is called.

All non-daemon threads have exited.

However, Tomcat does not call System.exit() when stop finishes. So if the user starts a non-daemon thread and does not shut it down together with the container, the Tomcat process will not end on its own. We set this problem aside for now and first look at the various problems encountered during shutdown.
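The effect of a forgotten non-daemon thread can be observed in a few lines. This is a minimal sketch with hypothetical names (LingeringThreadSketch, "user-worker"); the "container stop" is simulated by simply not touching the worker, which is an assumption of the example and not Tomcat behavior reproduced in code.

```java
import java.util.concurrent.CountDownLatch;

class LingeringThreadSketch {
    static String demo() {
        CountDownLatch started = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                while (true) {
                    Thread.sleep(20); // stands in for periodic background work
                }
            } catch (InterruptedException e) {
                // interrupted: fall through and let the thread end
            }
        }, "user-worker");
        worker.setDaemon(false); // explicit: a non-daemon thread keeps the JVM alive
        worker.start();
        try {
            started.await();
            // The simulated "container stop" happens here: nothing touches the worker.
            Thread.sleep(100);
            boolean aliveAfterStop = worker.isAlive();
            worker.interrupt(); // what the application should have arranged itself
            worker.join(2000);
            return "aliveAfterStop=" + aliveAfterStop
                    + " daemon=" + worker.isDaemon()
                    + " aliveAfterInterrupt=" + worker.isAlive();
        } catch (InterruptedException e) {
            worker.interrupt();
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // prints "aliveAfterStop=true daemon=false aliveAfterInterrupt=false"
        System.out.println(demo());
    }
}
```

Without the interrupt() call, the main method would return but the JVM would keep running, which is the situation described for Tomcat above.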

Exception analysis during Tomcat shutdown

IllegalStateException

In a webapp that uses the Spring framework, there is a serious synchronization problem between the shutdown of Spring and the end of user threads when Tomcat exits. In the window between the two (after Spring has shut down and before the user threads have ended), many unpredictable problems can occur. The most common is IllegalStateException. Typical code in which this exception occurs looks like this:

    public void run() {
        while (!isInterrupted()) {
            try {
                Thread.sleep(1000);
                GQBean bean = SpringContextHolder.getBean(GQBean.class);
                /* Do something with bean... */
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

This type of error is easy to reproduce and is common.

ClassNotFound/NullPointerException

This type of error is less common and is troublesome to analyze.

The earlier analysis established two things:

Threads created by the user do not stop when the container is destroyed.

The ClassLoader unloads the classes it has loaded while the container is stopping.

From these it is easy to conclude that the errors are caused by threads that have not ended:

After the ClassLoader has been unloaded, a user thread that tries to load a class gets ClassNotFoundException or NoClassDefFoundError.

While the ClassLoader is being unloaded, a user thread that tries to load a class may get NullPointerException, because Tomcat does not strictly synchronize the stopping of the container. The reason is as follows:

    // Part of the class-loading code; may be executed in a user thread
    protected ResourceEntry findResourceInternal(...) {
        if (!started) return null;
        synchronized (jarFiles) {
            if (openJARs()) {
                for (int i = 0; i < jarFiles.length; i++) {
                    jarEntry = jarFiles[i].getJarEntry(path);
                    if (jarEntry != null) {
                        try {
                            entry.manifest = jarFiles[i].getManifest();
                        } catch (IOException ioe) {
                            // Ignore
                        }
                        break;
                    }
                }
            }
        }
        /* Other statements */
    }

The code shows that access to jarFiles is carefully synchronized. Everywhere else that jarFiles is used, the access is synchronized just as carefully, except in stop:

    // loader.stop() must be executed in the stop thread
    public void stop() throws LifecycleException {
        /* Other statements */
        length = jarFiles.length;
        for (int i = 0; i < length; i++) {
            try {
                if (jarFiles[i] != null) {
                    jarFiles[i].close();
                }
            } catch (IOException e) {
                // Ignore
            }
            jarFiles[i] = null;
        }
        /* Other statements */
    }

Taken together, the two snippets allow the following race. The user thread enters the synchronized block (which refreshes its cached view of memory) after started has already changed to false, so the update of jarFiles is skipped; or jarFiles[0] has simply not been cleared yet at that moment. If stop then executes jarFiles[0] = null just after openJARs() returns, the subsequent call on jarFiles[i] throws NullPointerException.
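The interleaving can be reproduced deterministically in a stripped-down model. This is a sketch under stated assumptions: a one-element Object array stands in for jarFiles, and two CountDownLatch objects force the unlucky ordering that in the real code happens only by chance. All names (JarArrayRaceSketch, reader, stopWithoutLock) are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

class JarArrayRaceSketch {
    private final Object[] jarFiles = { new Object() };
    private final CountDownLatch checked = new CountDownLatch(1);
    private final CountDownLatch cleared = new CountDownLatch(1);

    // Plays the user thread: takes the lock, sees a non-null element,
    // and dereferences it a moment later.
    String reader() throws InterruptedException {
        synchronized (jarFiles) {
            if (jarFiles[0] == null) {
                return "skipped";
            }
            checked.countDown(); // the null check has passed
            cleared.await();     // forced pause so the interleaving is deterministic
            try {
                return jarFiles[0].toString();
            } catch (NullPointerException e) {
                return "NullPointerException";
            }
        }
    }

    // Plays stop(): clears the array WITHOUT taking the lock.
    void stopWithoutLock() throws InterruptedException {
        checked.await();
        jarFiles[0] = null;
        cleared.countDown();
    }

    static String demo() {
        JarArrayRaceSketch sketch = new JarArrayRaceSketch();
        Thread stopper = new Thread(() -> {
            try {
                sketch.stopWithoutLock();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        stopper.start();
        try {
            String result = sketch.reader();
            stopper.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "NullPointerException"
    }
}
```

The lock held by the reader gives no protection here because the writer never acquires it. Synchronization on jarFiles only helps if every thread that reads or writes the array uses the same lock.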

This NullPointerException is hard to understand because it is not obvious why a loadClass operation is triggered at all, especially when the code does not appear to use any new class. In fact, loadClass is often triggered by a class initialization check. (Note that class initialization is not the same as instance initialization; the two are different.)

A class initialization check is triggered in the following cases:

An instance of the class is created for the first time in the current thread.

A static method of the class is called for the first time in the current thread.

A static member of the class is read for the first time in the current thread.

A static member of the class is assigned for the first time in the current thread.

(Note: if the class has already been initialized, the check returns immediately. If it has not been initialized yet, class initialization is performed.)

When one of these cases occurs in a thread, the initialization check is triggered (at most once per thread). Before the check can run, the class itself must be obtained, which means the loadClass method is called.
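That a class is initialized lazily, at its first real use, can be checked with a static initializer that records when it runs. This is a minimal sketch; ClassInitSketch, Helper, and the events list are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ClassInitSketch {
    // Shared log of what happened and in which order.
    static final List<String> events = new ArrayList<>();

    static class Helper {
        static {
            // Runs once, when Helper is initialized.
            events.add("Helper initialized");
        }

        static String describe(Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    // Uses Helper twice and returns a copy of the event log so far.
    static List<String> demo() {
        events.add("before first use");
        // First static call: this is the point where Helper gets initialized.
        events.add("described " + Helper.describe(new IllegalStateException()));
        // Second call: Helper is already initialized, nothing extra happens.
        events.add("described " + Helper.describe(new RuntimeException()));
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        for (String event : demo()) {
            System.out.println(event);
        }
    }
}
```

In the log, "Helper initialized" appears directly after "before first use" and only once, even though Helper is used twice. In a webapp whose class loader is already being stopped, that first use is the moment at which loading or initializing the class can fail.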

Code like the following easily triggers the exception described above:

    try {
        /* Do something */
    } catch (Exception e) {
        // ExceptionUtil has never been used in the current thread before
        String trace = ExceptionUtil.getExceptionTrace(e);
        // Or this: ExceptionTracer has never appeared in the current thread before
        System.out.println(new ExceptionTracer(e));
        // Or any other statement that triggers a call to loadClass
        /* Do other things */
    }

Some suggested solutions

According to the analysis above, the main cause of these exceptions is that threads are not terminated in time. The key to a solution is therefore to gracefully terminate user-started threads before the container finishes shutting down.

Create your own Listener as the notification to terminate the thread

User-created threads in a project mainly take four forms:

Thread

Executors

Timer

Scheduler

The most direct idea is therefore to build a management module for these components, in two steps:

Step 1: Create a Listener-based management module and hand the instances of the four types above over to it.

Step 2: When the Listener is notified that Tomcat is shutting down, it triggers the termination method of each instance it manages. For example, it calls interrupt() on a Thread and shutdown() or shutdownNow() on an ExecutorService (depending on the chosen policy).
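The two steps can be sketched as a small registry. This is a hypothetical design, not code from the article: ManagedThreadRegistry and its methods are invented names, only Thread and ExecutorService are covered, and the servlet API is deliberately left out so the example stays self-contained. In a real webapp, a ServletContextListener's contextDestroyed method would be the place to call shutdownAll.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical registry for user-created threads and executors.
class ManagedThreadRegistry {
    private final List<Thread> threads = new CopyOnWriteArrayList<>();
    private final List<ExecutorService> executors = new CopyOnWriteArrayList<>();

    // Step 1: hand instances over to the module.
    Thread register(Thread thread) {
        threads.add(thread);
        return thread;
    }

    ExecutorService register(ExecutorService executor) {
        executors.add(executor);
        return executor;
    }

    // Step 2: signal everything to stop, then wait up to timeoutMillis in total.
    // Returns true if every managed thread and executor finished in time.
    boolean shutdownAll(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        for (Thread thread : threads) {
            thread.interrupt();
        }
        for (ExecutorService executor : executors) {
            executor.shutdownNow(); // interrupts running tasks, drops queued ones
        }
        boolean allDone = true;
        for (Thread thread : threads) {
            long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMillis > 0) {
                thread.join(remainingMillis); // join(0) would wait forever, so guard it
            }
            allDone &= !thread.isAlive();
        }
        for (ExecutorService executor : executors) {
            long remainingNanos = Math.max(0, deadline - System.nanoTime());
            allDone &= executor.awaitTermination(remainingNanos, TimeUnit.NANOSECONDS);
        }
        return allDone;
    }

    // One thread and one executor, both sleeping until interrupted.
    static boolean demo() {
        ManagedThreadRegistry registry = new ManagedThreadRegistry();
        Runnable sleeper = () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // interrupted: end the task
            }
        };
        registry.register(new Thread(sleeper)).start();
        registry.register(Executors.newSingleThreadExecutor()).execute(sleeper);
        try {
            return registry.shutdownAll(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("all stopped in time: " + demo());
    }
}
```

The single deadline shared by all waits is a design choice for the sketch: it bounds the total time the shutdown is blocked, regardless of how many threads are registered.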

Note that threads created by the user must respond to interruption: the thread should exit once isInterrupted() returns true or an InterruptedException is caught. Creating threads that do not respond to interruption is in fact very poor design.
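A worker loop that responds to interruption might look like the following. This is a minimal sketch with hypothetical names (InterruptibleWorker, demo); the 20 ms sleep stands in for real work. The detail worth noticing is that Thread.sleep clears the interrupt flag when it throws InterruptedException, so a loop that catches the exception and only relies on isInterrupted() would keep running.

```java
class InterruptibleWorker extends Thread {
    private volatile int iterations;

    @Override
    public void run() {
        while (!isInterrupted()) {
            try {
                Thread.sleep(20); // stands in for one unit of work
                iterations++;
            } catch (InterruptedException e) {
                // sleep() cleared the interrupt flag when it threw, so leave
                // the loop explicitly instead of relying on isInterrupted().
                break;
            } catch (RuntimeException e) {
                // An unrelated failure: report it and keep going.
                e.printStackTrace();
            }
        }
    }

    int iterations() {
        return iterations;
    }

    // Starts a worker, interrupts it, and reports whether it ended promptly.
    static boolean demo() {
        InterruptibleWorker worker = new InterruptibleWorker();
        worker.start();
        try {
            Thread.sleep(100);
            worker.interrupt();
            worker.join(2000);
            return !worker.isAlive();
        } catch (InterruptedException e) {
            worker.interrupt();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stopped after interrupt: " + demo());
    }
}
```

Catching InterruptedException separately from other exceptions is what distinguishes this loop from one that catches Exception in a single block and then continues.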

The advantage of creating your own Listener is that you can deliberately block the destruction process when the event arrives, buying the user threads some time to clean up. Because Spring has not been destroyed yet at that point, the program is still in a normal state.

The disadvantage is that this approach is invasive and depends on the user's code being written accordingly.

Use TaskExecutor provided by Spring

Spring provides TaskExecutor for managing an application's own threads in a webapp. Among its implementations, ThreadPoolTaskExecutor is very similar to the ThreadPoolExecutor introduced in Java 5, but its lifecycle is managed by Spring. When the Spring framework stops, the executor is stopped as well, and the user threads receive an interrupt. Spring also provides ThreadPoolTaskScheduler, which wraps the JDK's ScheduledThreadPoolExecutor and can be used for scheduled tasks or for running your own threads. Spring offers rich support for thread management. For details, see:

https://docs.spring.io/spring/docs/current/spring-framework-reference/integration.html?scheduling

The advantage of relying on Spring is that the approach is less invasive and depends less on the user's code.

The disadvantage is that Spring does not guarantee the order of thread interruption and bean destruction. That is, if a thread catches InterruptedException and then calls getBean through Spring, it still triggers IllegalStateException. In addition, the user code still has to check the thread's interrupt status or be interrupted in a blocking call such as sleep; otherwise the thread never terminates.

Other reminders

In both solutions above, whether you block the main thread's stop operation in the Listener or ignore the interrupt status under Spring, the thread gains some time to keep working. That time is not unlimited, however. The stop script in catalina.sh shows why (the excerpt is abridged):

    # Excerpt from the Tomcat stop script

    # First, attempt a normal stop
    eval "\"$_RUNJAVA\"" $LOGGING_MANAGER $JAVA_OPTS \
      -Djava.endorsed.dirs="\"$JAVA_ENDORSED_DIRS\"" -classpath "\"$CLASSPATH\"" \
      -Dcatalina.base="\"$CATALINA_BASE\"" \
      -Dcatalina.home="\"$CATALINA_HOME\"" \
      -Djava.io.tmpdir="\"$CATALINA_TMPDIR\"" \
      org.apache.catalina.startup.Bootstrap "$@" stop

    # Use kill -15 if the normal stop fails
    if [ $? != 0 ]; then
      kill -15 `cat "$CATALINA_PID"` >/dev/null 2>&1
    fi

    # Set the wait time
    SLEEP=5

    # Force the stop if the parameters contain -force
    if [ "$1" = "-force" ]; then
      shift
      FORCE=1
    fi

    while [ $SLEEP -gt 0 ]; do
      sleep 1
      SLEEP=`expr $SLEEP - 1`
    done

    # Terminate with kill -9 if forced termination was requested
    if [ $FORCE -eq 1 ]; then
      kill -9 $PID
    fi

The stop script shows that if forced termination is configured (it is by default on our servers), you have only five seconds to block the termination process and finish your own work. Other threads may also be busy during that period, and there is a delay between the start of termination and the moment a thread notices it (for example, the time until its next isInterrupted() call), so the usable blocking time is shorter still.

It also follows that if the service runs important, time-consuming tasks whose consistency must be preserved, the best approach is to record the current progress during those precious five seconds, check the recorded progress when the service restarts, and resume from there.

The execution granularity of each task (the interval between two isInterrupted() checks) should be kept well within the maximum blocking time, so that enough time remains to record progress after termination begins.
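The record-and-resume idea can be sketched as a task that works in small units and checks for interruption between them. The names below (CheckpointedTaskSketch, processUnit, savedProgress) are hypothetical, and an in-memory counter stands in for durable storage, which in practice would be a file or a database row. To keep the demo deterministic, the shutdown signal is simulated by the task interrupting its own thread while it processes unit 2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class CheckpointedTaskSketch {
    // Stand-in for durable storage; in practice a file or database row.
    private final AtomicInteger savedProgress = new AtomicInteger(0);
    private final int totalUnits;
    final List<Integer> processed = new ArrayList<>();

    CheckpointedTaskSketch(int totalUnits) {
        this.totalUnits = totalUnits;
    }

    // One small unit of work; kept short so interruption is noticed quickly.
    void processUnit(int index) {
        processed.add(index);
    }

    // Resumes from the saved progress, checks the interrupt flag between
    // units, and records how far it got before returning.
    int run() {
        int next = savedProgress.get();
        while (next < totalUnits && !Thread.currentThread().isInterrupted()) {
            processUnit(next);
            next++;
        }
        savedProgress.set(next); // the "record progress" step
        return next;
    }

    static String demo() {
        CheckpointedTaskSketch task = new CheckpointedTaskSketch(5) {
            @Override
            void processUnit(int index) {
                super.processUnit(index);
                if (index == 2) {
                    // Simulated shutdown signal arriving during unit 2.
                    Thread.currentThread().interrupt();
                }
            }
        };
        int first = task.run();   // stops after unit 2 and saves progress 3
        Thread.interrupted();     // clear the flag, as a restarted process would begin without it
        int second = task.run();  // resumes at unit 3 and finishes
        return first + " then " + second + " " + task.processed;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "3 then 5 [0, 1, 2, 3, 4]"
    }
}
```

No unit is processed twice and none is skipped, because the saved value always names the next unit that has not been started. The smaller each unit is, the sooner the loop reaches its next interrupt check, which is the granularity concern raised above.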
