Linux: Tomcat downtime solutions for production environments


For small and medium-sized companies using Tomcat as their Java container, a system that has never been tuned makes it easy for Tomcat to go down while running, and the Tomcat logs generally contain no useful information. In this case, the tuning after the outage was done by our company's architect, who has a very deep understanding of JVM tuning; my own understanding of JVM tuning is comparatively shallow, so this article will not explain the tuning principles in depth. It only records the analysis and tuning process, in the hope of offering some ideas to other ops engineers who run into Tomcat outages.

I. Preliminary analysis of the Tomcat outage

In production, Tomcat would go down after a few days of service, with no obvious regularity except for one pattern: the outages happened 10 to 60 minutes after Tomcat was restarted for a version upgrade. Not every restart led to an outage, and if Tomcat stayed up for more than a day it would not go down during that run; after rebooting Tomcat yet again, there would be no further outages. This initially ruled out the newly deployed code as the main cause (the new code was in fact part of the reason, as discussed later).

Looking at Tomcat's catalina.out log:

2015-1-5 13:35:41 org.apache.coyote.http11.Http11NioProtocol pause
INFO: Pausing Coyote HTTP/1.1 on http-8890
2015-1-5 13:35:42 org.apache.catalina.core.StandardService stop
INFO: Stopping service Catalina
2015-1-5 13:35:42 org.apache.coyote.http11.Http11NioProtocol destroy
INFO: Stopping Coyote HTTP/1.1 on http-8890
Exception in thread "Timer-1" java.lang.NullPointerException
    at com.qhfax.invest.balanceAccount.common.util.TaskBalanceAccount.run(TaskBalanceAccount.java:75)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)

As the log shows, Tomcat pauses the HTTP connector on port 8890 and then stops the core Catalina service; there is also a NullPointerException in a Timer thread.
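As an aside, the "Timer-1" stack trace itself points at a classic java.util.Timer pitfall: an uncaught exception in a scheduled task permanently kills the timer's single worker thread. A minimal sketch of this behavior (the class name is hypothetical, not from the real application):

```java
import java.util.Timer;
import java.util.TimerTask;

public class TimerCrashDemo {
    // Schedules a task that throws, then checks whether the timer survives.
    static boolean timerDiesAfterUncaughtException() {
        Timer timer = new Timer("Timer-1");
        timer.schedule(new TimerTask() {
            @Override public void run() {
                // Simulates an NPE like the one thrown in TaskBalanceAccount.run
                throw new NullPointerException("simulated task bug");
            }
        }, 50);
        try {
            Thread.sleep(300); // let the task run and kill the timer thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            timer.schedule(new TimerTask() { @Override public void run() {} }, 50);
            return false; // would mean the timer survived (it does not)
        } catch (IllegalStateException e) {
            return true;  // "Timer already cancelled." -- the worker thread is dead
        }
    }

    public static void main(String[] args) {
        System.out.println("timer dead after uncaught exception: "
                + timerDiesAfterUncaughtException());
    }
}
```

So even before any memory analysis, whatever scheduled job used that timer had silently stopped running after the exception.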

Clearly, this log alone gives us no useful information about the cause. What should be done next?

II. Tomcat memory analysis and tuning: enabling the GC log

Since the cause of the outage cannot be seen from the log, we can start by analyzing Tomcat's memory. Enabling the GC log records the JVM's day-to-day memory collections, as follows:

First, the Tomcat configuration before tuning (the core of Tomcat tuning is the JVM parameter settings):

Original configuration: edit {tomcat_home}/bin/catalina.sh and add the following lines directly after the opening # comment block:

export JRE_HOME=/usr/java/jdk1.6.0_38
export CATALINA_HOME=/home/resin/tomcat
JAVA_OPTS="-Xms1024m -Xmx1024m -XX:PermSize=512m -XX:MaxPermSize=512m"

JVM Startup parameter Description:

-Xms1024m  sets the initial heap size to 1024 MB

-Xmx1024m  sets the maximum heap size to 1024 MB

-XX:PermSize=512m  sets the initial permanent-generation (non-heap) size to 512 MB

-XX:MaxPermSize=512m  sets the maximum permanent-generation size to 512 MB


Modified configuration:

export JRE_HOME=/usr/java/jdk1.6.0_38
export CATALINA_HOME=/home/resin/tomcat
JAVA_OPTS="-server -Xms2048m -Xmx2048m -Xmn512m -XX:+UseParallelOldGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:/home/resin/tomcat/logs/gc.log"

Description of the JVM startup parameters after tuning:

-server  server mode is slower to start, but once running, performance is considerably better

-Xmn512m  sets the young generation to 512 MB. Total heap size = young generation + old generation + permanent generation; the permanent generation has a roughly fixed size (64 MB by default), so enlarging the young generation shrinks the old generation. This value has a large impact on system performance; Sun's official recommendation is 3/8 of the whole heap.

-XX:+UseParallelOldGC  uses the parallel old-generation garbage collector, typically on multi-threaded, multi-processor machines

-XX:+PrintGCDateStamps  adds date stamps to GC log entries; recording the GC log has no significant impact on application performance

-XX:+PrintGCDetails  prints detailed information about each collection

-Xloggc:/****/****/tomcat/logs/gc.log  path of the GC log file
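To double-check that the flags in JAVA_OPTS actually reached the JVM, the standard management API can list them from inside the running process; a generic sketch (not part of the original tuning steps):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.List;

public class JvmFlagCheck {
    public static void main(String[] args) {
        // The -Xms/-Xmx/-XX:... arguments the JVM was actually started with
        List<String> flags = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String f : flags) {
            System.out.println("flag: " + f);
        }

        // The heap limit as the JVM sees it (should match -Xmx)
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.println("heap max: "
                + mem.getHeapMemoryUsage().getMax() / (1024 * 1024) + " MB");
    }
}
```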

2. Tomcat configuration file server.xml

Original configuration:

<Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener" />
<Connector port="8890" protocol="HTTP/1.1"
           URIEncoding="UTF-8"
           connectionUploadTimeout="36000000"
           disableUploadTimeout="false"
           connectionTimeout="60000"
           redirectPort="8443" />

Modified configuration (the Listener line is commented out):

<!-- <Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener" gcDaemonProtection="false" /> -->
<Connector executor="tomcatThreadPool"
           port="8890"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionUploadTimeout="36000000"
           disableUploadTimeout="false"
           connectionTimeout="60000"
           redirectPort="8443" />

Description: the Listener is commented out to reduce the problem of the JVM's frequent, periodic full GCs.
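For context: in the GC log below, the "(System)" tag on a Full GC marks a collection triggered by an explicit System.gc() call rather than by heap pressure. A tiny generic demonstration (run it with -XX:+PrintGCDetails to see the corresponding log line):

```java
public class ExplicitGcDemo {
    // Requests a full collection the same way the "(System)" entries arise.
    static long requestFullGc() {
        long before = Runtime.getRuntime().freeMemory();
        // With -XX:+PrintGCDetails this call is logged as "Full GC (System)".
        // An alternative to removing the caller is starting the JVM with
        // -XX:+DisableExplicitGC, which turns System.gc() into a no-op.
        System.gc();
        return Runtime.getRuntime().freeMemory() - before;
    }

    public static void main(String[] args) {
        requestFullGc();
        System.out.println("explicit full GC requested");
    }
}
```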

After this preliminary tuning, Tomcat went down yet again, so the tuning clearly needed to continue. Looking at gc.log:

2015-01-07T14:29:45.658+0800: 169496.327: [GC [PSYoungGen: 5069K->4064K(514304K)] 40126K->39121K(2087168K), 0.0034770 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
2015-01-07T14:29:45.661+0800: 169496.331: [Full GC (System) [PSYoungGen: 4064K->0K(514304K)] [ParOldGen: 35057K->38587K(1572864K)] 39121K->38587K(2087168K) [PSPermGen: 51269K->51269K(72128K)], 0.3473340 secs] [Times: user=1.42 sys=0.00, real=0.35 secs]
2015-01-07T14:55:31.114+0800: 171041.784: [GC [PSYoungGen: 212923K->9959K(514048K)] 251510K->54259K(2086912K), 0.0073020 secs] [Times: user=0.06 sys=0.00, real=0.00 secs]
2015-01-07T14:55:31.122+0800: 171041.791: [Full GC (System) [PSYoungGen: 9959K->0K(514048K)] [ParOldGen: 44299K->34703K(1572864K)] 54259K->34703K(2086912K) [PSPermGen: 51276K->51274K(69440K)], 0.2653740 secs] [Times: user=0.79 sys=0.01, real=0.27 secs]
2015-01-07T14:55:31.531+0800: 171042.200: [GC [PSYoungGen: 5345K->2304K(507776K)] 40048K->37007K(2080640K), 0.0024170 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2015-01-07T14:55:31.533+0800: 171042.203: [Full GC (System) [PSYoungGen: 2304K->0K(507776K)] [ParOldGen: 34703K->36747K(1572864K)] 37007K->36747K(2080640K) [PSPermGen: 51364K->51363K(67264K)], 0.2703940 secs] [Times: user=0.83 sys=0.00, real=0.28 secs]
2015-01-07T14:55:37.374+0800: 171048.044: [GC [PSYoungGen: 10021K->10878K(508416K)] 46768K->47625K(2081280K), 0.0026770 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2015-01-07T14:55:37.377+0800: 171048.046: [Full GC (System) [PSYoungGen: 10878K->0K(508416K)] [ParOldGen: 36747K->34645K(1572864K)] 47625K->34645K(2081280K) [PSPermGen: 51364K->51363K(65216K)], 0.2716380 secs] [Times: user=0.86 sys=0.00, real=0.27 secs]
2015-01-07T14:55:37.670+0800: 171048.339: [GC [PSYoungGen: 3948K->2880K(510016K)] 38594K->37525K(2082880K), 0.0025790 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
2015-01-07T14:55:37.672+0800: 171048.342: [Full GC (System) [PSYoungGen: 2880K->0K(510016K)] [ParOldGen: 34645K->37317K(1572864K)] 37525K->37317K(2082880K) [PSPermGen: 51363K->51363K(63104K)], 0.2714200 secs] [Times: user=0.84 sys=0.00, real=0.27 secs]
2015-01-07T15:33:44.205+0800: 173334.875: [GC [PSYoungGen: 153420K->14231K(505728K)] 190738K->51548K(2078592K), 0.0048450 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
2015-01-07T15:33:44.210+0800: 173334.880: [Full GC (System) [PSYoungGen: 14231K->0K(505728K)] [ParOldGen: 37317K->34594K(1572864K)] 51548K->34594K(2078592K) [PSPermGen: 51364K->51359K(61440K)], 0.2908710 secs] [Times: user=0.89 sys=0.00, real=0.29 secs]
2015-01-07T15:33:44.514+0800: 173335.183: [GC [PSYoungGen: 3254K->2560K(506816K)] 37849K->37154K(2079680K), 0.0024490 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
2015-01-07T15:33:44.516+0800: 173335.186: [Full GC (System) [PSYoungGen: 2560K->0K(506816K)] [ParOldGen: 34594K->36883K(1572864K)] 37154K->36883K(2079680K) [PSPermGen: 51360K->51360K(60224K)], 0.2721010 secs] [Times: user=0.82 sys=0.00, real=0.27 secs]

The log shows that over certain periods Tomcat performs full GCs frequently and periodically, which indicates that approaching the Tomcat tuning from the direction of JVM memory is right.
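Spotting this pattern by eye is tedious; counting the explicit full GCs in gc.log can be scripted. A rough sketch (the class name and helper are mine, not from the article):

```java
import java.util.List;
import java.util.regex.Pattern;

public class GcLogScan {
    private static final Pattern FULL_GC_SYSTEM =
            Pattern.compile("Full GC\\s*\\(System\\)");

    // Counts System.gc()-triggered full collections among gc.log lines.
    static long countExplicitFullGcs(List<String> lines) {
        return lines.stream()
                .filter(l -> FULL_GC_SYSTEM.matcher(l).find())
                .count();
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "169496.327: [GC [PSYoungGen: 5069K->4064K(514304K)] ...",
                "169496.331: [Full GC (System) [PSYoungGen: 4064K->0K(514304K)] ...");
        System.out.println("explicit full GCs: " + countExplicitFullGcs(sample));
    }
}
```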

III. Tomcat memory analysis and tuning: enabling HeapDump

gc.log alone still cannot reveal the specific cause of the Tomcat outage, or what triggered it. At this point we need to capture Tomcat's memory for analysis at the moment an OutOfMemoryError occurs, so the architect performed a second round of tuning:

Tuning JVM Startup Parameters

Original:

JAVA_OPTS="-server -Xms2048m -Xmx2048m -Xmn512m -XX:+UseParallelOldGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:/home/resin/tomcat/logs/gc.log"

Modified:

JAVA_OPTS="-server -Xms2048m -Xmx2048m -Xmn768m -XX:PermSize=128m -XX:MaxPermSize=256m -XX:+UseParallelOldGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/resin/tomcat/dumpfile/heap.bin -Xloggc:/home/resin/tomcat/logs/gc.log"

Description of the JVM startup parameters after tuning :

-XX:PermSize=128m  raises the initial method-area (permanent generation) size from the default of about 20 MB to 128 MB

-XX:MaxPermSize=256m  caps the method-area size at 256 MB

-XX:+UseParallelOldGC  equivalent to "Parallel Scavenge" + "Parallel Old": both generations are collected by multiple threads in parallel

-XX:+HeapDumpOnOutOfMemoryError  enables a heap dump on OutOfMemoryError

-XX:HeapDumpPath=/*****/****/tomcat/dumpfile/heap.bin  output path of the heap dump
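Besides dumping automatically on OutOfMemoryError, a heap dump in the same format can be taken on demand through HotSpot's diagnostic MBean (or with jmap). A generic sketch, not from the article; note that recent JDKs require the file name to end in .hprof:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    // Writes a heap dump of the current JVM to the given path.
    // liveOnly=true dumps only reachable objects (forces a GC first).
    static void dump(String path, boolean liveOnly) throws Exception {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveOnly);
    }

    public static void main(String[] args) throws Exception {
        dump("heap.hprof", true);
        System.out.println("heap dump written to heap.hprof");
    }
}
```

The resulting file opens in MemoryAnalyzer just like the dump produced by -XX:+HeapDumpOnOutOfMemoryError.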


IV. How to analyze the HeapDump file

After the last round of tuning, Tomcat ran for two months before another service outage. Having retrieved the HeapDump file, we set out to analyze it.

    1. First transfer the HeapDump file from the server to your local machine.

    2. Use Eclipse Memory Analyzer (memoryanalyzer-1.4.0.20140604-win32.win32.x86); the workflow is described below, and I have uploaded the tool to 51CTO.

    3. Open MemoryAnalyzer.exe and import the HeapDump file.

(Screenshots: importing heap.bin into MemoryAnalyzer and running the leak-suspects analysis; the original images were hosted on 51CTO and are not reproduced here.)

At this point, we have roughly located the cause of this Tomcat outage. All that remains is to hand the report to the boss; the follow-up work belongs to development. As you can see, the memory leak was caused by an infinite loop in the application code, so further JVM tuning on its own would have been meaningless.
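For illustration only, here is a hypothetical example of the kind of bug that MemoryAnalyzer's dominator tree typically points at: a loop that keeps appending to a collection that is never cleared. None of these names come from the real application.

```java
import java.util.ArrayList;
import java.util.List;

public class LeakyTask {
    // A static, ever-growing collection: the classic leak that shows up
    // as one huge retained object in MemoryAnalyzer's dominator tree.
    static final List<String> CACHE = new ArrayList<>();

    static void buggyLoop(int iterations) {
        // Stand-in for a faulty scheduled task: every pass adds entries
        // and nothing is ever evicted, so the heap only grows.
        for (int i = 0; i < iterations; i++) {
            CACHE.add("record-" + i);
        }
    }

    public static void main(String[] args) {
        buggyLoop(100_000);
        System.out.println("retained entries: " + CACHE.size());
    }
}
```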

V. Summary

The above is only meant to give ops colleagues who encounter this kind of problem some ideas for tracing such thorny issues to their final cause. Where it touches on JVM tuning, my own knowledge is still shallow and there may be mistakes; corrections from the experts are welcome.

On JVM tuning specifically, I have collected some fairly helpful material and am sharing it here:

JVM principles and tuning: a collection of web links





This article is from the "Ops Road" blog; please be sure to keep this source: http://vekergu.blog.51cto.com/9966832/1619640

