5-Minute Troubleshooting: Rescuing the Tomcat That Took the Blame

Source: Internet
Author: User
Tags: system log


Monitoring detected that the pages of the Tomcat-based back-office management system could not be accessed and sent an alert. When a problem like this appears, the first thing to do is restart the application. That's right: do not start troubleshooting yet, restart the application first. The first principle for production incidents is to restore service as quickly as possible, whatever the cause, and a restart is often the most effective means.

After the restart, the management system worked normally again and I could relax a little. Minor problems are usually cleared this way. But what if a restart does not help? Then operations staff have to start emergency troubleshooting. If you try and fail, or expect that the problem cannot be solved within ten minutes, quickly follow the disaster recovery manual and switch over to restore service. Don't tell me you have no rapid disaster recovery plan...

Back to the subject. Once the business is temporarily back to normal, do not leave it at that; we must find the cause of the problem. A service never goes on strike for no reason; you just have not seen the reason yet.

Here are my troubleshooting steps, for reference only:

1.   Rule out errors caused by code

Checking the Tomcat logs turned up an error message, but it was a Tomcat socket error. From this message, a bug in the application code can basically be ruled out. In practice, application code does quite often run Tomcat to death; a memory overflow is a minor matter in that respect, as long as GC activity and a heap dump are properly recorded for analysis.

ClientAbortException:  java.net.SocketTimeoutException
        at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:369)
        at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:368)
        at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:392)
        at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:381)
        at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:89)
        at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:83)
        at com.qhfax.thrivefa.web.FileController.showImage(FileController.java:60)
        at com.qhfax.thrivefa.common.BaseController.showImage(BaseController.java:54)
        at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:175)
        at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:446)
        at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:434)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:938)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:870)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:961)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:852)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:837)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
        at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:744)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2274)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketTimeoutException
        at org.apache.tomcat.util.net.NioBlockingSelector.write(NioBlockingSelector.java:123)
        at org.apache.tomcat.util.net.NioSelectorPool.write(NioSelectorPool.java:156)
        at org.apache.coyote.http11.InternalNioOutputBuffer.writeToSocket(InternalNioOutputBuffer.java:460)
        at org.apache.coyote.http11.InternalNioOutputBuffer.flushBuffer(InternalNioOutputBuffer.java:800)
        at org.apache.coyote.http11.InternalNioOutputBuffer.addToBB(InternalNioOutputBuffer.java:644)
        at org.apache.coyote.http11.InternalNioOutputBuffer.access$000(InternalNioOutputBuffer.java:46)
        at org.apache.coyote.http11.InternalNioOutputBuffer$SocketOutputBuffer.doWrite(InternalNioOutputBuffer.java:825)
        at org.apache.coyote.http11.filters.IdentityOutputFilter.doWrite(IdentityOutputFilter.java:118)
        at org.apache.coyote.http11.InternalNioOutputBuffer.doWrite(InternalNioOutputBuffer.java:610)
        at org.apache.coyote.Response.doWrite(Response.java:560)
        at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:364)
        ... 38 more

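
The reasoning applied to the trace above (a ClientAbortException wrapping a socket-level exception points away from application code) can be sketched as a small log triage helper. This is only an illustration in Python, not the author's tooling; the function names `exception_chain` and `looks_like_client_disconnect` are hypothetical, and the classification rule is a heuristic assumption rather than a guarantee.

```python
import re

# Matches an exception class at the start of a trace line,
# optionally preceded by "Caused by:".
_EXC = re.compile(r"^(?:Caused by:\s*)?([A-Za-z_][\w.$]*(?:Exception|Error))\b")


def exception_chain(trace):
    """Return exception class names in order: outermost first, root cause last."""
    chain = []
    for line in trace.splitlines():
        line = line.strip()
        if line.startswith("at ") or line.startswith("..."):
            continue  # a stack frame or a "... N more" marker
        match = _EXC.match(line)
        if match:
            chain.append(match.group(1))
    return chain


def looks_like_client_disconnect(trace):
    """Heuristic (an assumption): a ClientAbortException whose root cause is a
    java.net.* exception suggests the write to the client failed, rather than
    the application logic itself."""
    chain = exception_chain(trace)
    if not chain:
        return False
    outer, root = chain[0], chain[-1]
    return outer.endswith("ClientAbortException") and root.startswith("java.net.")
```

Feeding the trace text to `looks_like_client_disconnect` gives a quick first answer about where to look next; anything it rejects, such as an OutOfMemoryError, still deserves a look at the application side.
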
2.   The significance of monitoring

At the same time, I quickly went through all the alert text messages and emails. Setting aside the web monitoring health-check alerts, there was a real find:

Alarm Host: **********

Alarm Time: ********

Alarm level: Warning

Alarm information: Too many processes on ******

Alarm item: proc.num[]

Problem details: Number of processes: 106

Current status: OK: 106

Event ID: 16294
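
The alarm above is a simple threshold check on the number of processes. A minimal Python sketch of the same idea follows; it assumes a Linux host where every process has an all-digit directory under /proc. The helper names and the threshold passed in are hypothetical, since the article does not state the limit configured in the trigger.

```python
import os


def count_processes(proc_entries):
    """Count entries that look like PIDs (all-digit names, as found under /proc)."""
    return sum(1 for name in proc_entries if name.isdigit())


def check_process_count(count, threshold):
    """Return an alert string when count exceeds threshold, otherwise None."""
    if count > threshold:
        return "Too many processes: %d (threshold %d)" % (count, threshold)
    return None


def linux_process_count():
    """Current number of processes on a Linux host (assumes /proc is mounted)."""
    return count_processes(os.listdir("/proc"))
```

On a Linux machine, `check_process_count(linux_process_count(), 100)` would return a message similar to the alarm text once the count passes the chosen limit.
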

Then I looked at the monitoring system:

[Image: 001.PNG, http://s3.51cto.com/wyfs02/M01/70/0C/wKiom1WwVlHg2oOFAAD2Vf_HqK4218.jpg]

That caught the culprit by the tail. With an initial suspect for the outage identified, the next step was to go to the console and pin down exactly what was causing the trouble.

3.   Troubleshooting method

Once too many processes had been identified as the cause, the fix was simple.

Use pstree to find which process has too many instances. One look and, well, it turned out to be a scheduled task causing the trouble...

Use pkill to end the scheduled-task processes in bulk.
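
The "which command has too many instances" step can also be scripted. The sketch below is an assumption-laden illustration, not the procedure used in the incident: it groups process command names and reports the most frequent ones. `running_command_names` relies on the POSIX `ps -e -o comm=` invocation; both function names are hypothetical.

```python
import subprocess
from collections import Counter


def top_commands(command_names, limit=5):
    """Return the most frequent command names as (name, count) pairs."""
    return Counter(command_names).most_common(limit)


def running_command_names():
    """Command name of every running process, one per process, via ps."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

Calling `top_commands(running_command_names())` lists the commands with the most running instances, which is the name you would then verify by hand before handing it to pkill.
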

4.   Accurate analysis of the cause of failure

The alarm only told us that there were too many processes, which has nothing whatsoever to do with Tomcat. For the rest we have to look at the system log, and this is where centralized log management and visualization prove their worth.

[Image: 002.PNG, http://s3.51cto.com/wyfs02/M02/70/09/wKioL1WwWEXRpSCFAAHpN9VjIpI205.jpg]

5.   Knowledge points

The Linux kernel has a mechanism called the OOM killer (out-of-memory killer). It watches for processes that consume too much memory, especially those that consume a large amount very quickly, and kills a process to keep the system from running out of memory. A typical case: one day a machine suddenly cannot be reached over ssh or telnet but still answers ping. That is not a network fault; the sshd process has been killed by the OOM killer (in many cases this looks as if the machine has hung). After restarting the machine, the system log /var/log/messages will contain an error message similar to "Out of memory: Kill process 1865 (sshd)".
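
Looking for that message can be automated. The following Python sketch scans log lines for OOM-killer entries and extracts the PID and process name. It assumes the message wording quoted above; the pattern also accepts the "Killed process" variant printed by some kernels, and the function names are hypothetical.

```python
import re

# Matches e.g. "Out of memory: Kill process 1865 (sshd)" and the
# "Killed process" variant; wording can differ between kernel versions.
_OOM = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)", re.IGNORECASE)


def find_oom_kills(log_lines):
    """Return a list of (pid, process_name) for every OOM-killer line found."""
    kills = []
    for line in log_lines:
        match = _OOM.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


def scan_log(path="/var/log/messages"):
    """Scan a syslog file; the default path is the one named in the article."""
    with open(path, errors="replace") as handle:
        return find_oom_kills(handle)
```

Reading /var/log/messages usually requires root, so `scan_log()` is best run with the same privileges you would use to read the file by hand.
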

It is important to note that:

1. The OOM killer is not a suitable way to solve memory leak problems.

2. Sometimes free shows that there is still plenty of memory, yet the OOM killer is triggered anyway, because the process may be using a special region of the memory address space.

So do not expect the OOM killer to prevent memory overflows, and do not expect free to reveal an OOM problem. Fellow operations engineers, enterprise-grade monitoring and centralized log management are your eyes!


This article is from the "Ops Road" blog; please be sure to keep this source link: http://vekergu.blog.51cto.com/9966832/1677437
