Use the thread pool for optimal resource utilization. The most common question in the Java Multithreaded Programming Forum is some version of "How do I create a thread pool?" In almost every server application, the question of thread pools and work queues comes up. In this article, Brian Goetz explores the motivation for thread pools, some basic implementation and tuning techniques, and some common misconceptions to avoid.
Why use a thread pool? Many server applications, such as Web servers, database servers, file servers, or mail servers, are oriented toward processing a large number of short tasks that arrive as remote requests. A request may arrive at the server in many ways: over a network protocol (such as HTTP, FTP, or POP), through a JMS queue, or perhaps by polling a database. Regardless of how the request arrives, the situation for the server application is usually the same: each individual task is short-lived, but the number of requests is huge.
A very simple model for building a server application is to create a new thread each time a request arrives and service the request in that new thread. This approach works fine for prototypes, but when you try to deploy it as a server application, serious shortcomings become apparent. One disadvantage of this "thread-per-request" approach is that the overhead of creating a new thread for each request is significant; a server that created a new thread for every request would spend more time and consume more system resources creating and destroying threads than it would processing the actual user requests.
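As a minimal sketch of the thread-per-request pattern described above (with the network plumbing left out and the class name `ThreadPerRequest` chosen just for illustration), each incoming task gets its own brand-new Thread, which is exactly the per-request creation and destruction cost the article warns about:

```java
// Illustrative sketch: every "request" gets a freshly created thread.
public class ThreadPerRequest {
    // Spawn one new thread per request and wait for all of them to finish.
    public static void handleAll(Runnable[] requests) throws InterruptedException {
        Thread[] threads = new Thread[requests.length];
        for (int i = 0; i < requests.length; i++) {
            threads[i] = new Thread(requests[i]); // created per request...
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();                             // ...and discarded afterwards
        }
    }
}
```

For short tasks, the cost of the `new Thread(...)`/`start()` pair dominates the cost of the task itself, which is the performance problem thread pools address.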
In addition to the overhead of creating and destroying threads, active threads consume system resources. Creating too many threads in one JVM can cause the system to run out of memory or thrash due to excessive memory consumption. To prevent resource overload, server applications need some means of limiting how many requests are being processed concurrently at any given time.
A thread pool offers a solution to both the thread life-cycle overhead problem and the resource-overload problem. By reusing threads for multiple tasks, the thread-creation overhead is spread over many tasks. As a bonus, because the thread already exists when a request arrives, the delay introduced by thread creation is eliminated. Thus the request can be serviced immediately, making the application more responsive. Furthermore, by properly tuning the number of threads in the pool, you can prevent resource overload by forcing any requests in excess of a certain threshold to wait until a thread becomes free to handle them.
Alternatives to the thread pool. The thread pool is far from the only way a server application can use multithreading. As mentioned above, in some scenarios it is perfectly sensible to spawn a new thread for each new task. However, if the frequency of task creation is high and the average task duration is short, spawning a new thread for each task will cause performance problems.
Another common threading model is to have a single background thread and task queue for tasks of a certain type. AWT and Swing use this model: there is a GUI event thread, and any work that causes the user interface to change must execute in that thread. However, because there is only one AWT event thread, it is undesirable to use it for tasks that may take a perceptible amount of time to complete. As a result, Swing applications often need additional worker threads for long-running, UI-related tasks.
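The Swing pattern just described can be sketched as follows: run the long task on a worker thread, then hand the UI update back to the single event thread via SwingUtilities.invokeLater(). The helper class name `LongTaskRunner` is hypothetical; only the invokeLater() call is the standard Swing API.

```java
import javax.swing.SwingUtilities;

// Sketch: long work happens off the event thread; only the UI update
// is pushed back onto the AWT/Swing event thread.
public class LongTaskRunner {
    public static void runInBackground(final Runnable longTask, final Runnable uiUpdate) {
        new Thread(new Runnable() {
            public void run() {
                longTask.run();                       // long-running work, off the event thread
                SwingUtilities.invokeLater(uiUpdate); // UI change, on the event thread
            }
        }).start();
    }
}
```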
Both the thread-per-task and single-background-thread approaches can work perfectly well in the right situations. The thread-per-task approach works well with a small number of long-running tasks. The single-background-thread approach works well when scheduling predictability is not important, as is the case with low-priority tasks that run in the background. However, most server applications are oriented toward processing huge numbers of short-lived tasks or subtasks, and want a mechanism for handling these tasks efficiently and with low overhead, along with some measure of resource management and timing predictability. A thread pool provides exactly these advantages.
Work queues. In terms of how thread pools are actually implemented, the term "thread pool" is somewhat misleading, because the "obvious" implementation of a thread pool does not, in most cases, yield exactly the results we want. In fact, the term "thread pool" predates the Java platform, and is probably an artifact of less object-oriented environments. But the term continues to be widely used.
We could easily implement a thread pool class in which a client waits for an available thread, hands the task to that thread for execution, and returns the thread to the pool when the task completes, but this has several potentially undesirable effects. For example, what happens when the pool is empty? Any caller that tries to hand a task to the pool will find it empty, and the caller's thread will block while waiting for the pool to supply an available thread. Yet one of the main reasons we want to use background threads in the first place is to prevent the submitting thread from blocking. Blocking the caller, as in the "obvious" implementation of a thread pool, reintroduces the very problem we set out to solve. (Translator's note: the caller thread hands work to another thread to avoid blocking, but then blocks anyway while waiting for the pool to supply a thread.)
What we usually want instead is a work queue combined with a fixed group of worker threads, using wait() and notify() to signal waiting threads that new work has arrived. The work queue is generally implemented as some sort of linked list with an associated monitor object. The following code shows a simple pooled work queue. This pattern of using a queue of Runnable objects is a common convention for schedulers and work queues, although there is nothing in the Thread API that requires it.
import java.util.LinkedList;

public class WorkQueue {
    private final int nThreads;
    private final PoolWorker[] threads;
    private final LinkedList queue;

    public WorkQueue(int nThreads) {
        this.nThreads = nThreads;
        queue = new LinkedList();
        threads = new PoolWorker[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new PoolWorker();
            threads[i].start();
        }
    }

    public void execute(Runnable r) {
        synchronized (queue) {
            queue.addLast(r);
            queue.notify();
        }
    }

    private class PoolWorker extends Thread {
        public void run() {
            Runnable r;
            while (true) {
                synchronized (queue) {
                    while (queue.isEmpty()) {
                        try {
                            queue.wait();
                        } catch (InterruptedException ignored) {
                        }
                    }
                    r = (Runnable) queue.removeFirst();
                }
                // If we don't catch RuntimeException,
                // the pool could leak threads
                try {
                    r.run();
                } catch (RuntimeException e) {
                    // You might want to log something here
                }
            }
        }
    }
}
You may have noticed that the implementation above uses notify() instead of notifyAll(). Most experts advise using notifyAll() instead of notify(), and with good reason: there are subtle risks associated with using notify(), and it is only appropriate under certain specific conditions. On the other hand, when used properly, notify() has better performance characteristics than notifyAll(); in particular, notify() causes many fewer context switches, which is important in a server application.
The work queue in the code above meets the conditions for safely using notify(), so go ahead and use it in your programs, but exercise great care when using notify() in other situations.
Some pitfalls of using thread pools. Although the thread pool is a powerful mechanism for building multithreaded applications, it is not without risk. Applications built with thread pools are subject to the same concurrency risks as any other multithreaded application, such as synchronization errors and deadlock, and thread pools introduce a few hazards of their own, such as pool-related deadlock, resource thrashing, and thread leakage.
Deadlock. Any multithreaded application carries a risk of deadlock. We say a pair of processes or threads is deadlocked when each is waiting for an event that only the other can provide. The simplest case of deadlock is one where thread A holds an exclusive lock on object X and is waiting for the lock on object Y, while thread B holds an exclusive lock on object Y and is waiting for the lock on object X. Unless there is some way to break out of waiting for the lock (which the Java locking mechanism does not support), the deadlocked threads will wait forever.
While deadlock is a risk in any multithreaded program, thread pools introduce another opportunity for deadlock: one where all pool threads are executing tasks that are blocked waiting for the results of another task on the queue, but that other task cannot execute because there are no free threads available to run it. This situation can arise when thread pools are used to simulate sets of interacting objects that send queries to one another, which then execute as queued tasks, while the querying objects wait synchronously for responses.
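This form of pool deadlock can be reproduced in a few lines. The sketch below uses the java.util.concurrent Executor framework (which postdates this article and is discussed later) rather than the hand-rolled WorkQueue, and adds a timeout purely so the demonstration terminates; the class name `PoolDeadlockDemo` is illustrative. A single-thread pool runs a task that submits a subtask and waits for its result, so the subtask can never get a thread:

```java
import java.util.concurrent.*;

// Demonstration: the only pool thread blocks waiting on a subtask that is
// stuck behind it in the queue. Without the timeout, this would hang forever.
public class PoolDeadlockDemo {
    // Returns true if the outer task timed out waiting for its subtask,
    // i.e. the pool deadlocked.
    public static boolean demonstrateDeadlock() throws Exception {
        final ExecutorService pool = Executors.newFixedThreadPool(1);
        Future<Boolean> outer = pool.submit(new Callable<Boolean>() {
            public Boolean call() {
                Future<String> inner = pool.submit(new Callable<String>() {
                    public String call() { return "done"; }
                });
                try {
                    inner.get(500, TimeUnit.MILLISECONDS); // wait for the queued subtask
                    return false;                          // would mean no deadlock
                } catch (TimeoutException deadlocked) {
                    return true;                           // subtask never got a thread
                } catch (Exception e) {
                    return false;
                }
            }
        });
        boolean deadlocked = outer.get();
        pool.shutdownNow();
        return deadlocked;
    }
}
```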
Resource thrashing. One of the advantages of thread pools is that, in most cases, they perform well relative to the alternative scheduling mechanisms we discussed above. But that is only true if you have configured the thread pool size properly. Threads consume numerous resources, including memory and other system resources. Besides the memory required for the Thread object itself, each thread requires two execution call stacks, which can be large. In addition, the JVM will likely create a native thread for each Java thread, which consumes additional system resources. Finally, while the scheduling overhead of switching between threads is small, with many threads the accumulated context switching can still hurt your application's performance.
If the thread pool is too large, the resources consumed by all those threads can significantly affect system performance. Time is wasted switching between threads, and having more threads than you need can cause resource-starvation problems, because the pool threads are consuming resources that could be used more effectively by other tasks. In addition to the resources used by the threads themselves, the work done while servicing requests may require other resources, such as JDBC connections, sockets, or files. These are limited resources as well, and too many concurrent requests for them can cause failures, such as the inability to allocate a JDBC connection.
Concurrency errors. Thread pools and other queuing mechanisms rely on the wait() and notify() methods, which can be tricky. If coded incorrectly, it is possible for notifications to be lost, with the result that threads sit idle even though there is work in the queue waiting to be processed. Great care must be taken with these tools; even experts often make mistakes with them. Fortunately, there are off-the-shelf implementations that have been tried and tested, such as the util.concurrent package discussed later in this article, which you can use without writing your own.
Thread leakage. A significant risk in all kinds of thread pools is thread leakage, which occurs when a thread is removed from the pool to perform a task but is not returned to the pool when the task completes. One way this happens is when a task throws a RuntimeException or an Error. If the pool class does not catch these, the thread will simply exit and the size of the thread pool will be permanently reduced by one. When this happens enough times, the thread pool will eventually be empty (no threads available), and the system will grind to a halt because there are no threads left to process tasks.
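The standard defense against this kind of leak, the same one used in the WorkQueue example's worker loop, is to wrap each task's run() in a catch block so an unexpected RuntimeException cannot kill the worker thread. A minimal sketch, with the wrapper name `SafeTask` and its logging call chosen purely for illustration:

```java
// Defensive wrapper: an unexpected RuntimeException from the task is caught
// (and could be logged) so the worker thread running it survives.
public class SafeTask implements Runnable {
    private final Runnable task;

    public SafeTask(Runnable task) { this.task = task; }

    public void run() {
        try {
            task.run();
        } catch (RuntimeException e) {
            // Catch the exception instead of letting it kill the thread
            System.err.println("Task failed: " + e);
        }
    }
}
```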
Tasks that stall permanently, such as those that wait indefinitely for resources that are not guaranteed to become available, or for input from a user who may have gone home, can also cause the equivalent of a thread leak. If a thread is permanently consumed by such a task, it has effectively been removed from the pool. Tasks like these should either be given their own thread outside the pool or be made to wait only for a bounded time.
Request overload. It is possible for a server to simply be overwhelmed by requests. In this case, we may not want to queue every incoming request to our work queue, because the tasks waiting in the queue can themselves consume too many system resources and cause resource starvation. What to do in this situation is up to you; for example, you might reject the request with a response indicating that the server is temporarily too busy.
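One way to realize this "too busy" policy is a bounded work queue plus an explicit rejection path. The sketch below uses java.util.concurrent's ThreadPoolExecutor (which postdates this article); the class name `BoundedServer` and the pool and queue sizes are arbitrary examples:

```java
import java.util.concurrent.*;

// Sketch of overload handling: a bounded queue caps memory use, and requests
// beyond capacity are turned into a "server busy" signal instead of queuing.
public class BoundedServer {
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2,                                   // fixed pool of 2 threads
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(10),   // at most 10 queued requests
            new ThreadPoolExecutor.AbortPolicy());  // reject anything beyond that

    // Returns false (a "server too busy" signal) if the request was rejected.
    public boolean tryHandle(Runnable request) {
        try {
            pool.execute(request);
            return true;
        } catch (RejectedExecutionException tooBusy) {
            return false;
        }
    }

    public void shutdown() { pool.shutdown(); }
}
```

With 2 threads busy and 10 requests queued, the 13th concurrent request is rejected rather than consuming further resources.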
Guidelines for effective thread pool use. The thread pool can be an extremely effective way to structure your server application, as long as you follow a few simple guidelines:
- Do not queue tasks that synchronously wait for the results of other tasks. This can cause the form of deadlock described above, in which all the threads in the pool are occupied by tasks that are waiting for the results of queued tasks that cannot execute because all the threads are busy.
- Be careful when putting potentially long-running tasks into the thread pool. If the program must wait for a resource, such as an I/O completion, specify a maximum wait time, after which the task fails or is requeued for execution at a later time. This guarantees that some progress is eventually made, by freeing the thread for a task that might complete successfully.
- Understand your tasks. To tune the thread pool size effectively, you need to understand what the queued tasks are doing. Are they CPU-bound? Do they spend most of their time blocked on I/O? Your answers will affect how you configure your application. If you have different classes of tasks with radically different characteristics, it may make sense to maintain separate work queues for the different task types, so that each pool can be tuned accordingly.
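The "specify a maximum wait time" guideline above can be sketched with a timed wait: bound how long a pooled task may block on a slow resource, and cancel it on timeout so its thread is freed. This sketch uses java.util.concurrent's Future.get timeout (which postdates this article); the helper name `TimedTask` and the time budgets are illustrative assumptions:

```java
import java.util.concurrent.*;

// Bound the wait: on timeout, cancel (interrupt) the stuck task so the
// thread can be reclaimed for work that might actually complete.
public class TimedTask {
    // Returns the result, or null if the work exceeded its time budget.
    public static <T> T runWithTimeout(Callable<T> work, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Future<T> future = pool.submit(work);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);   // interrupt the stalled task, freeing the thread
            return null;
        } catch (ExecutionException e) {
            return null;           // the task itself failed
        } finally {
            pool.shutdownNow();
        }
    }
}
```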
Sizing the pool. Tuning the size of a thread pool is largely a matter of avoiding two mistakes: having too few threads or too many. Fortunately, for most applications the middle ground between too few and too many is fairly wide.
Recall the two primary advantages of using threads in applications: allowing processing to continue while waiting for slow operations such as I/O, and exploiting the availability of multiple processors. In a compute-bound application running on an N-processor machine, adding threads may improve throughput as the thread count approaches N, but adding threads beyond N will do no good. Indeed, too many threads will even degrade performance because of the additional context-switching overhead.
The optimal size of a thread pool depends on the number of processors available and the nature of the tasks on the work queue. On an N-processor system, for a work queue that will hold entirely compute-bound tasks, you will generally achieve maximum CPU utilization with a thread pool of N or N + 1 threads.
For tasks that may wait for I/O to complete, such as a task that reads an HTTP request from a socket, you will want to increase the pool size beyond the number of available processors, because not all threads will be working at all times. Using profiling, you can estimate the ratio of waiting time (WT) to service time (ST) for a typical request. If we call this ratio WT/ST, then for an N-processor system, you will want approximately N * (1 + WT/ST) threads to keep the processors fully utilized.
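A quick worked example of this heuristic, with the class name and the wait/service measurements chosen purely for illustration:

```java
// Worked example of the sizing rule above: threads ~= N * (1 + WT/ST).
public class PoolSizer {
    // processors = N, waitTime = WT, serviceTime = ST (same time units).
    public static int optimalSize(int processors, double waitTime, double serviceTime) {
        return (int) (processors * (1 + waitTime / serviceTime));
    }

    public static void main(String[] args) {
        // e.g. 4 CPUs, tasks that wait 50 ms for I/O per 10 ms of computation:
        // 4 * (1 + 50/10) = 24 threads to keep the processors busy
        System.out.println(optimalSize(4, 50.0, 10.0));
    }
}
```

Note that for pure compute (WT = 0) the formula collapses to N threads, matching the compute-bound rule above.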
Processor utilization is not the only consideration in tuning the thread pool size. As the pool grows, you may run into the limitations of the scheduler, memory availability, or other system resources, such as the number of sockets, open file handles, or database connections.
No need to write your own. Doug Lea has written an excellent open source library of concurrency utilities, util.concurrent, which includes mutexes, semaphores, collection classes such as queues and hash tables that perform well under concurrent access, and several work-queue implementations. The PooledExecutor class in that package is an efficient, widely used, and correct implementation of a thread pool based on a work queue. Rather than trying to write your own, which is easy to get wrong, consider using some of the utilities in the util.concurrent package. See Resources below for more details.
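As a concrete sketch of this advice, here is what replacing a hand-rolled pool looks like with the Executor framework as it eventually shipped in the JDK's java.util.concurrent package (which grew out of Doug Lea's library and postdates this article); the class name `PooledService` is illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// A fixed-size pool from the library replaces the hand-rolled WorkQueue:
// no wait()/notify() code of our own, and no thread-leak handling to write.
public class PooledService {
    // Run all tasks on a fixed-size pool and wait for them to complete.
    public static void runAll(Runnable[] tasks, int nThreads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        for (Runnable task : tasks) {
            pool.execute(task);                    // queue the task for a pool thread
        }
        pool.shutdown();                           // no new tasks; finish queued work
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }
}
```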
Conclusion. The thread pool is a useful tool for organizing server applications. It is quite straightforward in concept, but there are several issues to watch for when implementing and using one, such as deadlock, resource thrashing, and the complexities of wait() and notify(). If you find that your application requires a thread pool, consider using one of the Executor classes from util.concurrent, such as PooledExecutor, rather than writing one from scratch. And if you find yourself creating threads to handle short-lived tasks, you should definitely consider using a thread pool instead.
Resources
- Doug Lea, "Concurrent Programming in Java: Design Principles and Patterns, Second Edition"
- Doug Lea's util.concurrent package
- The java.util.concurrent package, standardized through JCP's JSR 166 and included in the JDK since release 5.0
- Allen Holub, "Taming Java Threads"
- Alex Roetter's "Guidelines for writing thread-safe classes"
Original link: http://www.ibm.com/developerworks/library/j-jtp0730/.
Java Theory and Practice: Thread Pools and Work Queues