Problem Description:
In a production Java web system, a scheduled task uploads files over FTP. One day we found that a file had been generated normally but was never uploaded.
Preliminary analysis of the problem:
1. Check the log files
The log showed that the task printed the message marking the start of FTP processing but never printed the message marking its completion.
The FTP upload code has thorough exception handling, so any exception would have been logged. Yet the log file contained nothing relevant, which was strange. We suspected the FTP step itself, for example a problem on the remote FTP server, but could not find evidence.
With no way to see inside the running Java process from the logs alone, it was time for the ultimate weapon: jstack.
2. Analyze with jstack
On the server, use the jps command (or another tool such as ps) to find the process ID of the running Java program, then run jstack pid > jstack.log to export the thread stack information to jstack.log. The file contained the following useful information.
Checking against the code confirmed that the UploadFtpTask in the stack below is indeed our file upload task.
According to the stack, the thread state is RUNNABLE rather than BLOCKED, so the thread is not waiting on a lock; it is stuck in a network read.
"DefaultQuartzScheduler_Worker-5" prio=10 tid=0x00002aaaf4382801 nid=0x1874 runnable [0x000000004133b000..0x000000004133bda0]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    - locked <0x00002aaac3cdd061> (a java.io.InputStreamReader)
    at sun.nio.cs.StreamDecoder.read0(StreamDecoder.java:107)
    - locked <0x00002aaac3cdd061> (a java.io.InputStreamReader)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:93)
    at java.io.InputStreamReader.read(InputStreamReader.java:151)
    at it.sauronsoftware.ftp4j.NVTASCIIReader.readLine(NVTASCIIReader.java:105)
    at it.sauronsoftware.ftp4j.FTPCommunicationChannel.read(FTPCommunicationChannel.java:142)
    at it.sauronsoftware.ftp4j.FTPCommunicationChannel.readFTPReply(FTPCommunicationChannel.java:187)
    at it.sauronsoftware.ftp4j.FTPClient.connect(FTPClient.java:1034)
    - locked <0x00002aaac3cdd109> (a java.lang.Object)
    at com.xx.FtpClientImpl.connect(FtpClientImpl.java:56)
    at com.xx.UploadFtpTask.execute(UploadFtpTask.java:88)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
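It can look odd that a thread that is plainly stuck reports RUNNABLE. The JVM counts a thread waiting inside a native socket read as RUNNABLE, because from its point of view the thread is executing native code, not waiting on a monitor. The following is a minimal sketch that reproduces this on the loopback interface. The class name BlockedReadStateDemo, the thread name and the 500 ms pause are illustrative assumptions, and the silent local server only stands in for an unresponsive FTP server.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BlockedReadStateDemo {

    /** Returns the state reported for a thread that is waiting in a socket read. */
    static Thread.State stateWhileBlockedInRead() throws Exception {
        // Loopback server that accepts the connection but never sends a byte
        // (hypothetical stand-in for a server that stops responding).
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread reader = new Thread(() -> {
                try (Socket socket = new Socket(server.getInetAddress(), server.getLocalPort())) {
                    InputStream in = socket.getInputStream();
                    in.read(); // no read timeout set: waits until data or end of stream
                } catch (IOException ignored) {
                    // nothing to do, the demo is over
                }
            }, "blocked-reader");
            reader.setDaemon(true);
            reader.start();

            try (Socket accepted = server.accept()) {
                Thread.sleep(500); // crude wait so the reader has entered read()
                Thread.State state = reader.getState();
                // Print the reader's stack, similar to one entry of a jstack dump.
                for (StackTraceElement frame : reader.getStackTrace()) {
                    System.out.println("    at " + frame);
                }
                return state;
            } // closing the accepted socket ends the reader's read() with end of stream
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("state = " + stateWhileBlockedInRead());
    }
}
```

On a HotSpot JVM this is expected to report RUNNABLE with a socket read frame at the top of the printed stack. The exact frame names depend on the JDK version.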
The referenced jar shows that the FTP feature is implemented with the open-source ftp4j library, version 1.5.1.
A test program shows the call stack when the FTP connection is opened:
Socket.connect(SocketAddress) line: 469
Socket.<init>(SocketAddress, SocketAddress, boolean) line: 366
Socket.<init>(String, int) line: 180
DirectConnector.connect(String, int) line: 35
DirectConnector.connectForCommunicationChannel(String, int) line: 40
FTPClient.connect(String, int) line: 1024
FTPClient.connect(String) line: 991
Test.main(String[]) line: 19
And what is on line 469 of Socket?
connect(endpoint, 0);
This method is declared as public void connect(SocketAddress endpoint, int timeout), so the call above sets a timeout of 0, which means no timeout at all. If packets are lost on the network or the remote server has a problem, the connection attempt can wait indefinitely. This is the tragedy.
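To make the difference concrete, here is a minimal sketch of opening a connection with an explicit connect timeout instead of relying on the implicit 0. The class name ConnectTimeoutSketch, the helper connectWithTimeout and the 10-second value are illustrative assumptions, not code from ftp4j or from the system described here.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class ConnectTimeoutSketch {

    /** Opens a TCP connection, giving up after timeoutMillis instead of waiting without limit. */
    static Socket connectWithTimeout(String host, int port, int timeoutMillis) throws IOException {
        Socket socket = new Socket(); // unconnected, so we can choose the timeout ourselves
        try {
            // new Socket(host, port) ends up in connect(endpoint, 0), i.e. no timeout.
            // Passing a positive value makes connect throw SocketTimeoutException
            // once the limit is exceeded.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the socket when the attempt fails
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Local listener so the sketch runs without any external server.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = connectWithTimeout(
                     server.getInetAddress().getHostAddress(), server.getLocalPort(), 10_000)) {
            System.out.println("connected = " + socket.isConnected());
        }
    }
}
```

A caller that catches the resulting IOException and logs it would leave a trace in the log file, which is exactly what was missing in the incident.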
Has a later release of this open-source project fixed the problem? Download version 1.7.2, run the test again, and look at the call stack:
Socket.connect(SocketAddress, int) line: 490
DirectConnector(FTPConnector).tcpConnectForCommunicationChannel(String, int) line: 208
DirectConnector.connectForCommunicationChannel(String, int) line: 39
FTPClient.connect(String, int) line: 1036
FTPClient.connect(String) line: 1003
Test.main(String[]) line: 19
When Socket.connect is called from tcpConnectForCommunicationChannel, a timeout is now passed: 10 seconds (10*1000 milliseconds). With this timeout in place, the problem above can no longer leave the thread hanging forever.
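The production thread in the jstack dump was waiting in socketRead0, so a limit on reads matters as much as a limit on connect. The call stacks above do not show whether ftp4j 1.7.2 also sets a read timeout. The following is therefore only a plain-socket sketch of the general technique, using Socket.setSoTimeout. The class name ReadTimeoutSketch, the helper readTimesOut and the timeout values are illustrative assumptions.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutSketch {

    /** Returns true if a read from a silent server was abandoned after readTimeoutMillis. */
    static boolean readTimesOut(int connectTimeoutMillis, int readTimeoutMillis) throws IOException {
        // The local listener never replies; it stands in for a remote server that
        // accepts the TCP connection and then goes quiet.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) {
            socket.connect(
                    new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                    connectTimeoutMillis);
            socket.setSoTimeout(readTimeoutMillis); // upper bound for each blocking read
            try {
                socket.getInputStream().read(); // would wait without limit if SO_TIMEOUT were 0
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the read gave up instead of hanging
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out = " + readTimesOut(10_000, 1_000));
    }
}
```

With a read timeout set, a silent server produces an exception that ordinary exception handling can log, rather than a worker thread that never returns.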
Summary:
1. jstack is a powerful tool for diagnosing problems in live Java systems. It shows the stack of every thread, which is invaluable when log analysis and code review cannot pin down the problem.
2. Always set a timeout on network connections; never wait indefinitely. When developing a system, consider every kind of failure. As the saying goes, whatever you skip now, you pay for later.