Thread blocking caused by FileUtils. copyURLToFile in a multi-thread task and Java

Source: Internet
Author: User
Introduction: Java multi-threaded development brings benefits to the program. At the same time, problems caused by multi-threaded programs are also increasing. Finding and analyzing the problems is a headache for cainiao programs. Next, I will give a detailed analysis of the problems encountered during the multi-threaded development process in the project and share my ideas. I am also a cainiao.

Introduction: Java multi-threaded development brings benefits to the program. At the same time, problems caused by multi-threaded programs are also increasing. Finding and analyzing the problems is a headache for cainiao programs. Next, I will give a detailed analysis of the problems encountered during the multi-threaded development process in the project and share my ideas. I am also a cainiao.

Introduction: Java multi-threaded development brings benefits to the program. At the same time, problems caused by multi-threaded programs are also increasing. Finding and analyzing the problems is a headache for cainiao programs. Next, I will give a detailed analysis of the problems encountered during the multi-threaded development process in the project and share my ideas. I am also a cainiao. I forget my advice.

Project Description:

Compile a program to crawl a large number of images on a website. Traverse all crawling tasks from HBase and enable a fixed-size thread pool Executors. newFixedThreadPool (100): Submit the thread. Each thread in the thread uses FileUtils. copyURLToFile: Download the image from the Url and save it to the local device. The Code is as follows:

Main thread:

Public static void getAllRecord (String tableName, String prefix, String dir) {HTable table = null; try {table = new HTable (conf, tableName); Scan s = new Scan (); s. setFilter (new PrefixFilter (prefix. getBytes (); resultprocessing ss = table. getScanner (s); ExecutorService executor = Executors. newFixedThreadPool (100); for (Result r: ss) {try {TimeUnit. SECONDS. sleep (2);} catch (InterruptedException e) {// TODO Auto-generated catch blocke. printStackTrace ();} Thread task = new Thread (new DownLoadPicTask (r, dir, tableName); executor. submit (task);} executor. shutdown ();} catch (IOException e) {} finally {... close resources }}

Task thread:

public static String downloadFromUrl(String url,String dir,String cityName,String id) {          try {            URL httpurl = new URL(url);            String fileName = getFileNameFromUrl(url);            System.out.println(fileName);            File f = new File(dir + File.separator+ cityName+ File.separator+id+File.separator+ fileName);            FileUtils.copyURLToFile(httpurl, f);            FileInputStream fis = new FileInputStream(f);            BufferedImage bufferedImg = ImageIO.read(fis);            int imgWidth = bufferedImg.getWidth();            int imgHeight = bufferedImg.getHeight();            bufferedImg = null;            fis.close();            if(imgWidth<500&&imgHeight<500){            FileUtils.deleteQuietly(f);            return null;            }            return imgWidth + "," + imgHeight;        } catch (Exception e) {          return null;        }     }
Problem:

It has been executed for a long time. Theoretically, if all tasks are completed, the thread pool will be destroyed and the main thread will end, but the result is no. The first idea is that there must be a deadlock. Open Java VisualVM.

We can see that most of the pool-4 (our own thread) is in the waiting state.

Jstack calls up all stack information:

"pool-4-thread-21" prio=6 tid=0x000000000d662800 nid=0x2364 waiting on condition [0x000000001175f000]   java.lang.Thread.State: WAITING (parking)at sun.misc.Unsafe.park(Native Method)- parking to wait for  <0x0000000780faa230> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)at java.lang.Thread.run(Thread.java:745)   Locked ownable synchronizers:- None"pool-4-thread-20" prio=6 tid=0x000000000d662000 nid=0x32f8 waiting on condition [0x00000000114ef000]   java.lang.Thread.State: WAITING (parking)at sun.misc.Unsafe.park(Native Method)- parking to wait for  <0x0000000780faa230> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)at java.lang.Thread.run(Thread.java:745)   Locked ownable synchronizers:- None

The preceding figure shows the stack information of about 100 threads. Only two stacks are listed here.

For details about the stack information analysis of virtual machines, refer to the article: "three instances demonstrate Java Thread Dump log analysis".

Preliminary analysis:

Most threads (main: "most", not all reasons are mentioned here, which will be explained in my analysis) are in the WAITING state. All are waiting for such a resource as <0x0000000780faa230> (a java. util. concurrent. locks. AbstractQueuedSynchronizer $ ConditionObject.

Since threads are waiting for a resource, what is this resource?

With questions, we searched the entire stack information and found that only the pool-4-thread-** thread had this 0x0000000780faa230 thing at the same position. Then what is this thing?

AbstractQueuedSynchronizer provides a FIFO queue-based basic framework for building locks or other related synchronization devices.
Condition is a class of conditional functions. It must be placed in the Lock block, just like wait, where the y method is placed in the synchronized block.
The conditional (await, signal) provides a more general and flexible solution than the object (wait, notify) to allow threads with multiple conditions to communicate with each other.

For more details about AbstractQueuedSynchronizer, refer:
Introduction and Principle Analysis of AbstractQueuedSynchronizer
AbstractQueuedSynchronizer

Having said so much, I have referenced a lot of the above documents, or is it confused?

The thread is in idle State and cannot obtain a certain resource/condition (java. util. concurrent. locks. AbstractQueuedSynchronizer $ ConditionObject), so it must be occupied by a thread.

FileUtils. copyURLToFile

Based on this idea, I checked the status of all the threads and finally found the three running threads I missed in the first thread:
One of them is as follows:

"pool-4-thread-15" prio=6 tid=0x000000000d65e000 nid=0x28e4 runnable [0x00000000109fe000]   java.lang.Thread.State: RUNNABLEat java.net.SocketInputStream.socketRead0(Native Method)at java.net.SocketInputStream.read(SocketInputStream.java:152)at java.net.SocketInputStream.read(SocketInputStream.java:122)at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)at java.io.BufferedInputStream.read(BufferedInputStream.java:334)- locked <0x00000007810fa6c0> (a java.io.BufferedInputStream)at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)- locked <0x00000007810fa770> (a sun.net.www.protocol.http.HttpURLConnection)at java.net.URL.openStream(URL.java:1037)at org.apache.commons.io.FileUtils.copyURLToFile(FileUtils.java:1460)at com.esf.crawler.bootsStrap.DownLoadPicTask.downloadFromUrl(DownLoadPicTask.java:139)at com.esf.crawler.bootsStrap.DownLoadPicTask.run(DownLoadPicTask.java:101)at java.lang.Thread.run(Thread.java:745)at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)at java.util.concurrent.FutureTask.run(FutureTask.java:262)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)at java.lang.Thread.run(Thread.java:745)   Locked ownable synchronizers:- <0x00000007810e2060> (a java.util.concurrent.ThreadPoolExecutor$Worker)

That's right !!! This is the culprit. The three threads are running ).

Based on the stack information, we can find that FileUtils is currently being downloaded. copyURLToFile method, which reads the socket stream but does not end (at java.net. socketInputStream. read (SocketInputStream. java: 152 ))

This is the problem!

So why can't the read end? Network reading should be timed out if it cannot be completed. With this problem, I opened the method for downloading network images (see the above ).

FileUtils. copyURLToFile (httpurl, f );

Let's look at the FileUtils. copyURLToFile api:

public static void copyURLToFile(URL source,File destination)throws IOExceptionpublic static void copyURLToFile(URL source,File destination,int connectionTimeout,int readTimeout)throws IOException

We can see that I am using the first method.
It has a warning:

Warning: This method does not set a connection or read timeout and thus might block forever. UsecopyURLToFile(URL, File, int, int)With reasonable timeouts to prevent this.

That is, if the connection timeout and read timeout are not set, it may be permanently blocked due to exceptions.

Well, change to a function with timeout settings.

Execute again! The result is not what I expected >_<

HBase timeout

After the restart, 100 threads are still all sleep after a period of time. Refer to the debug log and find the following exception:

Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hbase.client.ScannerTimeoutException: 200243ms passed since the last invocation, timeout is currently set to 60000at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:94)at com.esf.crawler.bootsStrap.AjkPicDownload.getAllRecord(AjkPicDownload.java:32)at com.esf.crawler.bootsStrap.AjkPicDownload.main(AjkPicDownload.java:75)Caused by: org.apache.hadoop.hbase.client.ScannerTimeoutException: 200243ms passed since the last invocation, timeout is currently set to 60000at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:370)at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:91)... 2 moreCaused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: 1679, already closed?at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3053)at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)at java.lang.Thread.run(Thread.java:745)at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)at java.lang.reflect.Constructor.newInstance(Constructor.java:526)at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:354)... 3 moreCaused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException): org.apache.hadoop.hbase.UnknownScannerException: Name: 1679, already closed?at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3053)at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)at java.lang.Thread.run(Thread.java:745)at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453)at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1657)at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1715)at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29900)at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)... 7 more

The preceding exception occurs when HBase uses Scan timeout, which exceeds the default 6000 value. Then an exception is thrown and the program ends. That is, records are not traversed from the HBase Scaner to generate Task tasks. As a result, 100 threads are empty without tasks.
It turns out that I added the thread sleep code in the above program, resulting in Scaner timeout. Originally, the sleep code was used to prevent tasks from starting too many tasks too early, resulting in this exception.

The general problem has been found.

Summary:From problem locating to analysis and solution. It may be a false conjecture during the process. We still need to think differently, not only through analysis and reasoning, as much as possible. Step 1: search for clues. The emergence of Multithreading is nothing more than the deadlock and resource competition caused by synchronization and concurrency.

Use the tools provided in JDK to detect and export the corresponding stack information. Analyzes the causes of various states of threads in the dump log and finds the corresponding solutions to this problem.

The thread status in dump is roughly as follows:

  1. Deadlock,Deadlock (important )?
  2. Running, Runnable? ?
  3. Waiting for resources,Waiting on condition (important)?
  4. Waiting to get the monitor,Waiting on monitor entry (important)
  5. Paused, suincluded
  6. The Object is waiting. The Object. wait () or TIMED_WAITING
  7. Blocking,Blocked (important )??
  8. Stop, Parked

After analyzing the cause, locate the corresponding code. Change it !!!

Refer:

Three instances demonstrate Java Thread Dump Log Analysis
Introduction and Principle Analysis of AbstractQueuedSynchronizer
AbstractQueuedSynchronizer
Java thread pool instructions

Original article address: thread blocking caused by FileUtils. copyURLToFile in a multi-thread task and Java VM stac. Thank you for sharing it.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.