Use ThreadPoolExecutor to execute independent single-threaded tasks in parallel

Source: Internet
Author: User

Java SE 5.0 introduced the task execution framework, a major step forward in simplifying the design and development of multi-threaded programs. The framework makes it easy to manage tasks, including their lifecycles and execution policies.

In this article, we use a simple example to demonstrate the flexibility and simplicity brought about by this framework.

Basic

The execution framework introduces the Executor interface to manage task execution. Executor is an interface for submitting Runnable tasks. It decouples task submission from task execution: executors with different execution policies all implement the same submission interface, so changing the execution policy does not affect the submission logic.

If you want to submit a Runnable object for execution, it is very simple:

    Executor exec = …;
    exec.execute(runnable);
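
To illustrate how the submission code stays the same while the execution policy changes, here is a minimal sketch. The two executor classes, the "worker" thread name, and the runOn helper are hypothetical names made up for this illustration; only the Executor interface comes from the framework.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;

public class ExecutorSwapDemo {

    // Policy 1 (illustrative): run the task synchronously in the calling thread.
    static class DirectExecutor implements Executor {
        @Override
        public void execute(Runnable task) {
            task.run();
        }
    }

    // Policy 2 (illustrative): start a new thread named "worker" for every task.
    static class ThreadPerTaskExecutor implements Executor {
        @Override
        public void execute(Runnable task) {
            new Thread(task, "worker").start();
        }
    }

    // Submission logic: identical for every policy.
    // Returns the name of the thread that actually ran the task.
    static String runOn(Executor exec) {
        AtomicReference<String> threadName = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        exec.execute(() -> {
            threadName.set(Thread.currentThread().getName());
            done.countDown();
        });
        try {
            done.await(); // wait until the task has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return threadName.get();
    }

    public static void main(String[] args) {
        // Prints the name of the calling thread ("main" when run from main).
        System.out.println("direct: " + runOn(new DirectExecutor()));
        // Prints "worker".
        System.out.println("thread-per-task: " + runOn(new ThreadPerTaskExecutor()));
    }
}
```

Only the object passed to runOn changes between the two calls; the code that submits the task does not.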

Thread Pool

As described above, the Executor interface does not specify how an executor runs a submitted Runnable task; that depends on the specific type of executor you use. The framework provides several executors, whose execution policies suit different scenarios.

The executor type you will probably use most often is the thread pool executor, an instance of the ThreadPoolExecutor class (or one of its subclasses). ThreadPoolExecutor manages a thread pool and a work queue; the pool holds the worker threads that execute tasks.

You are probably familiar with the concept of a "pool" from other technologies. One of the biggest advantages of pooling is that it reduces resource creation overhead: a resource is reused once it has been released. An indirect benefit is that you can control how many resources are in use. For example, you can tune the thread pool size to reach the desired load without exhausting system resources.

The framework provides a factory class, Executors, for creating thread pools. You can use this factory to create thread pools with different characteristics. Although the underlying implementation is often the same (ThreadPoolExecutor), the factory methods let you set up a thread pool quickly without using complex constructors. The factory methods include:

  • newFixedThreadPool: returns a thread pool with a fixed maximum size. It creates new threads as needed, never more than the configured number. Once that number is reached, the pool keeps its size constant.
  • newCachedThreadPool: returns an unbounded thread pool, that is, a pool with no maximum number of threads. When the load drops, however, this kind of pool tears down idle threads (by default after they have been idle for 60 seconds).
  • newSingleThreadExecutor: returns an executor that guarantees all tasks are executed in a single thread.
  • newScheduledThreadPool: returns a fixed-size thread pool that supports delayed and periodic task execution.
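
The factory methods listed above can be tried out with a short sketch like the following. The pool sizes (4 and 2), the 100 ms delay, and the maxPoolSizeOf helper are arbitrary choices for illustration. The cast inside the helper relies on the fixed and cached pools being ThreadPoolExecutor instances, which holds in the standard JDK but is an implementation detail rather than part of the documented contract.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FactoryDemo {

    // Illustrative helper: reads the configured maximum size of a pool.
    // Throws ClassCastException if the executor is not a ThreadPoolExecutor
    // (for example, the wrapper returned by newSingleThreadExecutor).
    static int maxPoolSizeOf(ExecutorService pool) {
        return ((ThreadPoolExecutor) pool).getMaximumPoolSize();
    }

    public static void main(String[] args) {
        ExecutorService fixed = Executors.newFixedThreadPool(4);
        ExecutorService cached = Executors.newCachedThreadPool();
        ExecutorService single = Executors.newSingleThreadExecutor();
        ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(2);

        System.out.println(maxPoolSizeOf(fixed));                        // 4
        System.out.println(maxPoolSizeOf(cached) == Integer.MAX_VALUE);  // true

        single.execute(() -> System.out.println("runs on the single worker thread"));
        scheduled.schedule(
                () -> System.out.println("runs after a 100 ms delay"),
                100, TimeUnit.MILLISECONDS);

        // Shut every pool down so that the JVM can exit.
        for (ExecutorService pool : Arrays.asList(fixed, cached, single, scheduled)) {
            pool.shutdown();
        }
    }
}
```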

This is just the beginning. Executors have other uses that are beyond the scope of this article. I strongly recommend that you study the following:

  • Lifecycle management methods declared by the ExecutorService interface, such as shutdown() and awaitTermination().
  • CompletionService, which you can use to query task status and retrieve return values, if any.
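
As a small illustration of the second point, the sketch below uses ExecutorCompletionService, the JDK implementation of CompletionService, to collect results in completion order. The sumOfSquares helper and the pool size of 2 are hypothetical choices made for this example.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionDemo {

    // Squares each input on a small pool and sums the results
    // in the order in which the tasks complete.
    static int sumOfSquares(int... inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletionService<Integer> completion = new ExecutorCompletionService<>(pool);
            for (int n : inputs) {
                completion.submit(() -> n * n); // submitted as a Callable<Integer>
            }
            int sum = 0;
            for (int i = 0; i < inputs.length; i++) {
                // take() blocks until the next task has finished; get() returns its value.
                sum += completion.take().get();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1, 2, 3)); // 1 + 4 + 9 = 14
    }
}
```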

The ExecutorService interface is especially important because it provides the methods for shutting down a thread pool and making sure its resources are released once they are no longer needed. Fortunately, the ExecutorService interface is simple and self-explanatory, and I suggest you read its documentation in full.

Generally speaking, after you call shutdown() on an ExecutorService, it stops accepting new tasks, but tasks already in the queue are still processed. You can call isTerminated() to query the termination status, or awaitTermination(...) to wait for the ExecutorService to terminate. Because awaitTermination takes a maximum timeout as a parameter, it does not wait forever.
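
A minimal sketch of this shutdown sequence follows. The pool size, the 10-second timeout, and the runAndShutdown helper are assumptions chosen only to keep the example short.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownDemo {

    // Submits the given number of short tasks, shuts the pool down,
    // waits for termination, and returns how many tasks actually ran.
    static int runAndShutdown(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown(); // no new tasks are accepted; queued tasks still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // timeout elapsed: try to cancel what is left
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndShutdown(5)); // 5: every queued task was processed
    }
}
```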

Warning: a common source of errors and confusion is a JVM process that never exits. The JVM exits only when its last non-daemon thread exits. If you do not shut down your executor services, and thereby tear down their underlying threads, the JVM will not exit.

Configure ThreadPoolExecutor

If you decide not to use the Executors factory class and to create a ThreadPoolExecutor manually, you need to create and configure it through one of its constructors. The following is one of the most widely used constructors of this class:

    public ThreadPoolExecutor(
        int corePoolSize,
        int maximumPoolSize,
        long keepAliveTime,
        TimeUnit unit,
        BlockingQueue<Runnable> workQueue,
        RejectedExecutionHandler handler);

As you can see, you can configure the following:

  • Core pool size (the size the thread pool tries to maintain)
  • Maximum pool size
  • Keep-alive time: idle threads beyond the core size are torn down after this period
  • Work queue for storing tasks that are waiting to be executed
  • Policy to apply when a task is rejected

Limit the number of tasks in a queue

Limiting the number of concurrent tasks, that is, sizing the thread pool, greatly benefits the predictability and stability of your application. Unbounded thread creation eventually exhausts runtime resources, and your application may then suffer serious performance problems or even become unstable.

That solves only part of the problem: it limits the number of tasks executing concurrently, but not the number of tasks submitted and waiting in the queue. If the submission rate stays higher than the execution rate, the application will eventually run short of resources.

Solution:

  • Provide the executor with a bounded blocking queue for the waiting tasks. Once the queue is full, further submitted tasks are "rejected".
  • A RejectedExecutionHandler is invoked when a task is rejected, which is why the verb "rejected" appears in the class name. You can implement your own rejection policy or use one of the framework's built-in policies.

The default rejection policy (ThreadPoolExecutor.AbortPolicy) makes the executor throw a RejectedExecutionException. The other built-in policies are:

  • Silently discard the task (DiscardPolicy)
  • Discard the oldest queued task and retry submitting the new one (DiscardOldestPolicy)
  • Execute the rejected task in the caller's thread (CallerRunsPolicy)
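
To see a bounded queue and a rejection policy at work, consider the sketch below. It assumes a deliberately tiny pool (one thread, queue capacity one) and a hypothetical countRejections helper; the tasks block on a latch so that the pool stays saturated while tasks are being submitted. CallerRunsPolicy is left out on purpose, because with these blocking tasks it would stall the submitting thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {

    // Submits blocking tasks to a pool with 1 thread and a queue of capacity 1,
    // and returns how many submissions threw RejectedExecutionException.
    static int countRejections(RejectedExecutionHandler handler, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1), handler);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        // One task runs, one waits in the queue, the other three are rejected.
        System.out.println(countRejections(new ThreadPoolExecutor.AbortPolicy(), 5));   // 3
        // DiscardPolicy drops the same three tasks without throwing.
        System.out.println(countRejections(new ThreadPoolExecutor.DiscardPolicy(), 5)); // 0
    }
}
```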

When and why do we configure thread pools like this? Let's look at an example.

Example: Execute independent single-threaded tasks in parallel

Recently, I was called in to solve a problem with an old job that my customer has been running for a long time. Basically, the job contains a component that listens for file system events generated in a set of directory trees. Each time an event fires, a file must be processed. File processing is performed by a dedicated single-threaded process. To be honest, given the nature of that process, I would not want to parallelize it internally even if I could. During part of the day the event arrival rate is very high, but files do not need to be processed in real time: they only need to be processed before the next day.

The current implementation is a mix of technologies, including UNIX shell scripts that scan the directory structure and check for changes. When it was implemented, the execution environment was dual-core and the event arrival rate was much lower. Today there are millions of events to handle, for a total of 1 to 2 terabytes of raw data.

The host that runs the processing program is now a 12-core machine, which is a good opportunity to parallelize these old single-threaded tasks. We basically have all the ingredients for the recipe; all we need to do is set up and tune the program. Before writing any code, we must understand the program's load. Here is what I found:

  • A very large number of files must be scanned periodically: each directory contains 1 to 2 million files
  • The scan algorithm is fast and can be parallelized
  • Processing a file takes at least 1 s, and sometimes 2 s or 3 s
  • When processing files, the main performance bottleneck is the CPU
  • CPU utilization must be adjustable: different load configurations are used depending on the time of day

That is the kind of thread pool I need: its size is set from the load configuration when the program runs. I prefer a fixed-size thread pool sized according to the load policy. Because the processing threads are CPU-bound, each using 100% of a core without waiting on other resources, the load policy is easy to compute: multiply the number of CPU cores in the execution environment by a load factor, making sure the result is at least one thread:

    // scaleFactor is the load factor taken from the load configuration
    int cpus = Runtime.getRuntime().availableProcessors();
    int maxThreads = cpus * scaleFactor;
    // never go below one thread
    maxThreads = (maxThreads > 0 ? maxThreads : 1);
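
If the load configuration has to change while the program is running, for example at different times of the day, one possible approach is to resize the existing pool instead of recreating it. The sketch below is only an assumption about how that could be done and is not taken from the original program: the fractional load factor and the threadsFor and resize helpers are hypothetical. It relies on the setCorePoolSize and setMaximumPoolSize methods of ThreadPoolExecutor, which on recent JDKs reject a core size larger than the maximum, hence the ordering of the two calls.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ResizeDemo {

    // Pool size for a given core count and a fractional load factor, never below 1.
    static int threadsFor(int cpus, double scaleFactor) {
        return Math.max(1, (int) (cpus * scaleFactor));
    }

    // Resizes a fixed-size pool (core size == maximum size) to newSize.
    static void resize(ThreadPoolExecutor pool, int newSize) {
        if (newSize > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(newSize); // growing: raise the ceiling first
            pool.setCorePoolSize(newSize);
        } else {
            pool.setCorePoolSize(newSize);    // shrinking: lower the core size first
            pool.setMaximumPoolSize(newSize);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>());

        resize(pool, threadsFor(12, 0.5));
        System.out.println(pool.getCorePoolSize()); // 6

        resize(pool, threadsFor(2, 0.1));
        System.out.println(pool.getCorePoolSize()); // 1

        pool.shutdown();
    }
}
```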

Then I need to create a ThreadPoolExecutor with a bounded blocking queue to limit the number of submitted tasks. Why? Because the scan algorithm runs very fast and quickly produces a large number of files to process. How many? That is hard to predict, because it varies so much. I don't want the executor's internal queue to fill up without limit with task instances waiting to run (these instances hold fairly large file descriptors). I would rather have submissions rejected when the queue is full.

In addition, I use ThreadPoolExecutor.CallerRunsPolicy as the rejection policy. Why? When the queue is full, the pool threads are busy processing files, so I have the thread that submitted the task execute the rejected task itself. The scan thus pauses to process one file, and resumes scanning the directory as soon as that file is done.

The code for creating the executor is as follows:

    ExecutorService executorService =
        new ThreadPoolExecutor(
            maxThreads, // core thread pool size
            maxThreads, // maximum thread pool size
            1, // time to wait before resizing pool
            TimeUnit.MINUTES,
            new ArrayBlockingQueue<Runnable>(maxThreads, true),
            new ThreadPoolExecutor.CallerRunsPolicy());
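
The throttling effect of CallerRunsPolicy can be observed with a sketch like this one. It is an illustration under assumptions, not the original program: the pool is shrunk to one thread with a queue of capacity one, the submitAll helper is hypothetical, and a latch stands in for slow file processing by keeping the worker busy until the submitting thread has run a task itself.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CallerRunsDemo {

    // Submits the given number of tasks and returns
    // {tasks executed in total, tasks executed by the submitting thread}.
    static int[] submitAll(int tasks) {
        Thread submitter = Thread.currentThread();
        AtomicInteger total = new AtomicInteger();
        AtomicInteger ranInCaller = new AtomicInteger();
        CountDownLatch callerRan = new CountDownLatch(1);

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 1, TimeUnit.MINUTES,
                new ArrayBlockingQueue<Runnable>(1, true),
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                total.incrementAndGet();
                if (Thread.currentThread() == submitter) {
                    // Rejected task, executed by the submitting thread itself.
                    ranInCaller.incrementAndGet();
                    callerRan.countDown();
                } else {
                    // Pool thread: stay busy until the submitter has run a task.
                    awaitQuietly(callerRan);
                }
            });
        }
        callerRan.countDown(); // release the worker even if nothing was rejected
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {total.get(), ranInCaller.get()};
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] result = submitAll(5);
        System.out.println("executed: " + result[0]);       // executed: 5
        System.out.println("run by submitter: " + result[1]); // at least 1
    }
}
```

No task is lost: every submission is either run by the pool or run by the submitter, which is exactly what slows the submitter down while the pool is saturated.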

The following is the skeleton of the program (an extremely simplified version):

    // scanning loop: fake scanning
    while (!dirsToProcess.isEmpty()) {
        File currentDir = dirsToProcess.pop();

        // listing children
        File[] children = currentDir.listFiles();
        if (children == null) {
            // listFiles() returns null if the directory cannot be read
            continue;
        }

        // processing children
        for (final File currentFile : children) {
            // if it's a directory, defer processing
            if (currentFile.isDirectory()) {
                dirsToProcess.add(currentFile);
                continue;
            }

            executorService.submit(new Runnable() {
                @Override
                public void run() {
                    try {
                        // if it's a file, process it
                        new ConvertTask(currentFile).perform();
                    } catch (Exception ex) {
                        // error management logic
                    }
                }
            });
        }
    }

    // ...
    // wait for all of the executor threads to finish
    executorService.shutdown();
    try {
        if (!executorService.awaitTermination(60, TimeUnit.SECONDS)) {
            // pool didn't terminate after the first try
            executorService.shutdownNow();
        }

        if (!executorService.awaitTermination(60, TimeUnit.SECONDS)) {
            // pool didn't terminate after the second try
        }
    } catch (InterruptedException ex) {
        executorService.shutdownNow();
        Thread.currentThread().interrupt();
    }

Summary

As you can see, the Java concurrency APIs are easy to use, flexible, and powerful. I wish I had spent more time years ago writing simple programs like this one. With this approach, I was able to solve the scalability problems caused by a legacy single-threaded component within a few hours.

Original article: javacodegeeks. Translation: ImportNew.com - wen xuemin
http://www.importnew.com/12940.html