Locating faults in a running online Java system with jstack: case 1

Source: Internet
Author: User
Tags ftp connection socket connect

Problem description:

An online Java web system runs a scheduled FTP upload task. One day we found that a file had been generated normally but was never uploaded.

Preliminary analysis:

1. View the log files

The task logged that FTP processing had started, but never logged that FTP processing had finished.

The FTP upload code handles exceptions carefully: if an exception occurs, it is logged. Yet the log file contains nothing relevant, which is strange. We suspected a problem in the FTP step itself, for example a fault on the other party's FTP server, but could find no evidence.

It is hard to see inside a running Java system from the outside, so it is time to bring out the killer tool: jstack.

2. jstack Analysis

On the running system, run the jps command (or use another method, such as ps) to find the process ID of the running Java program, then export the thread stack information with jstack pid > jstack.log. The jstack.log file contains the following useful information.
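For readers who want to see what kind of data jstack collects, the following is a minimal sketch that builds a similar dump from inside a JVM using the standard Thread.getAllStackTraces() call. The class name ThreadDumpSketch and the method dumpAllThreads are hypothetical names chosen for illustration. Unlike jstack, this only works from within the process itself and omits lock information, so it is not a substitute for the real tool.

```java
import java.util.Map;

public class ThreadDumpSketch {

    /** Builds a simplified, jstack-like text dump of all live threads in this JVM. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            // Header line: thread name plus the state that jstack also reports.
            out.append('"').append(thread.getName()).append('"').append('\n');
            out.append("   java.lang.Thread.State: ").append(thread.getState()).append('\n');
            // One line per stack frame, innermost frame first.
            for (StackTraceElement frame : entry.getValue()) {
                out.append("        at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

In a dump like this, the interesting threads are the ones whose top frames stay the same across several dumps taken a few seconds apart.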

Checking the code confirms that the UploadFtpTask below is indeed the execution code of our file upload task.

According to the stack information, the thread state is RUNNABLE, not BLOCKED. The thread is therefore not waiting for a lock; it is stuck in a network read. (The JVM reports a thread inside a native socket read as RUNNABLE even though it is making no progress.)

"DefaultQuartzScheduler_Worker-5" prio=10 tid=0x00002aaaf4382801 nid=0x1874 runnable [0x000000004133b000..0x000000004133bda0]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
        - locked <0x00002aaac3cdd061> (a java.io.InputStreamReader)
        at sun.nio.cs.StreamDecoder.read0(StreamDecoder.java:107)
        - locked <0x00002aaac3cdd061> (a java.io.InputStreamReader)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:93)
        at java.io.InputStreamReader.read(InputStreamReader.java:151)
        at it.sauronsoftware.ftp4j.NVTASCIIReader.readLine(NVTASCIIReader.java:105)
        at it.sauronsoftware.ftp4j.FTPCommunicationChannel.read(FTPCommunicationChannel.java:142)
        at it.sauronsoftware.ftp4j.FTPCommunicationChannel.readFTPReply(FTPCommunicationChannel.java:187)
        at it.sauronsoftware.ftp4j.FTPClient.connect(FTPClient.java:1034)
        - locked <0x00002aaac3cdd109> (a java.lang.Object)
        at com.xx.FtpClientImpl.connect(FtpClientImpl.java:56)
        at com.xx.UploadFtpTask.execute(UploadFtpTask.java:88)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)

The jar files referenced by the project confirm that the FTP function is implemented with the open-source library ftp4j, version 1.5.1.
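The RUNNABLE state seen in the jstack output above can be reproduced in isolation. The sketch below is an illustration only, not the original system's code: it connects to a local server socket that never sends anything, starts a thread that reads from the connection without a read timeout, and then samples that thread's state. The names BlockedReadStateSketch and stateOfBlockedReader are hypothetical, and the 500 ms sleep is a crude assumption about how long the reader needs to reach the read call.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockedReadStateSketch {

    /** Returns the state of a thread that is stuck reading from a peer that never replies. */
    static Thread.State stateOfBlockedReader() throws Exception {
        // The server socket completes TCP handshakes via its backlog but never writes a byte.
        try (ServerSocket silentServer = new ServerSocket(0)) {
            Socket client = new Socket("127.0.0.1", silentServer.getLocalPort());
            Thread reader = new Thread(() -> {
                try {
                    // No read timeout is set, so this blocks until data arrives or the socket closes.
                    client.getInputStream().read();
                } catch (IOException ignored) {
                    // Closing the socket from the other thread ends the blocked read.
                }
            }, "blocked-reader");
            reader.setDaemon(true);
            reader.start();

            Thread.sleep(500);                      // crude wait for the reader to enter read()
            Thread.State state = reader.getState(); // sampled while the read is still pending

            client.close();                         // release the reader
            reader.join(2000);
            return state;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("State while blocked in read: " + stateOfBlockedReader());
    }
}
```

On a typical HotSpot JVM the sampled state is RUNNABLE, which is why a thread state alone cannot tell a busy thread from one that is waiting on the network; the top stack frames have to be read as well.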

Write a test program to check the call stack during FTP connection:

Socket.connect(SocketAddress) line: 469
Socket.<init>(SocketAddress, SocketAddress, boolean) line: 366
Socket.<init>(String, int) line: 180
DirectConnector.connect(String, int) line: 35
DirectConnector.connectForCommunicationChannel(String, int) line: 40
FTPClient.connect(String, int) line: 1024
FTPClient.connect(String) line: 991
Test.main(String[]) line: 19

What is on line 469 of Socket?

connect(endpoint, 0);

This method is declared as public void connect(SocketAddress endpoint, int timeout), so the call above passes a timeout of 0. A timeout of 0 means no timeout at all: if network packets are lost or the peer service is faulty, the call waits indefinitely. That is the root of the trouble.
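As a minimal sketch of the alternative, the code below opens a socket with an explicit connect timeout and also sets a read timeout with setSoTimeout. The read timeout matters here because the jstack trace shows the thread inside a socket read, which is bounded by the read timeout rather than the connect timeout. The class name SocketTimeoutSketch, the helper names, and the timeout values are assumptions made for illustration; this is plain java.net code, not ftp4j's implementation.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SocketTimeoutSketch {

    /** Opens a socket with explicit connect and read timeouts, both in milliseconds. */
    static Socket openWithTimeouts(String host, int port, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        Socket socket = new Socket(); // unconnected, so the connect timeout can be chosen
        try {
            // Fails with SocketTimeoutException if the connection is not established in time.
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            // Every later read() fails with SocketTimeoutException after this long without data.
            socket.setSoTimeout(readTimeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /** Returns true if reading from a peer that never replies ends with a timeout. */
    static boolean readTimesOut(int readTimeoutMs) throws IOException {
        // The local server socket accepts the connection via its backlog but never writes.
        try (ServerSocket silentServer = new ServerSocket(0);
             Socket socket = openWithTimeouts("127.0.0.1", silentServer.getLocalPort(),
                                              1000, readTimeoutMs)) {
            socket.getInputStream().read(); // would block forever without setSoTimeout
            return false;
        } catch (SocketTimeoutException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Read timed out: " + readTimesOut(200));
    }
}
```

With both timeouts set, a lost packet or a silent peer turns into an exception that the task's existing exception handling can log, instead of a thread that hangs without a trace.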

Has this been fixed in a later release of the open-source project? Download version 1.7.2, run the test again, and view the call stack:

Socket.connect(SocketAddress, int) line: 490
DirectConnector(FTPConnector).tcpConnectForCommunicationChannel(String, int) line: 208
DirectConnector.connectForCommunicationChannel(String, int) line: 39
FTPClient.connect(String, int) line: 1036
FTPClient.connect(String) line: 1003
Test.main(String[]) line: 19

When tcpConnectForCommunicationChannel calls the Socket connect method, it passes a timeout of 10 seconds (10 * 1000 milliseconds). This introduces a timeout mechanism, so if the problem above occurs again, the task will fail with a timeout instead of hanging forever.

Summary:

1. jstack is a powerful tool for locating faults in a running online Java system. It shows the stack of every thread, which is very valuable for analysis, especially when log analysis and code review cannot pin down the problem.

2. Always set a timeout on network connections so that a call can never wait forever. When developing a system, you must consider the abnormal cases; shortcuts taken during development always have to be paid back sooner or later.
