Thread pool and Work queue


Why use a thread pool?

Many server applications, such as Web servers, database servers, file servers, or mail servers, are geared toward processing large numbers of short tasks arriving from some remote source. A request arrives at the server in some way, perhaps through a network protocol (such as HTTP, FTP, or POP), through a JMS queue, or perhaps by polling a database. However the request arrives, it is common in server applications for the processing of each individual task to be short-lived and the number of requests to be huge.

A simplistic model for building a server application would be to create a new thread whenever a request arrives and service the request in that new thread. This approach actually works fine for prototyping, but if you tried to deploy a server application that ran this way, its serious shortcomings would become obvious. One of the disadvantages of the thread-per-request approach is that it is expensive to create a new thread for each request: a server that created a new thread for every request would spend more time and consume more system resources creating and destroying threads than it would processing actual user requests.

In addition to the overhead of creating and destroying threads, active threads consume system resources. Creating too many threads in one JVM can cause the system to run out of memory or to thrash because of excessive memory consumption. To prevent resource exhaustion, server applications need some means of limiting how many requests are being processed at any given moment.

The thread pool offers a solution to both the thread life-cycle overhead problem and the resource-exhaustion problem. By reusing threads for multiple tasks, the cost of thread creation is spread over many tasks. As a bonus, because the thread already exists when a request arrives, the delay introduced by thread creation is eliminated. Thus, the request can be serviced immediately, making the application more responsive. Furthermore, by properly tuning the number of threads in the pool, you can prevent resource exhaustion by forcing any requests in excess of a certain threshold to wait until a thread becomes available to process them.


Alternative to thread pool

Thread pools are far from the only way to use multithreading within a server application. As mentioned above, it is sometimes perfectly sensible to spawn a new thread for each new task. However, if tasks are created very frequently and their average processing time is short, spawning a new thread per task will lead to performance problems.

Another common threading model is to have a single background thread paired with a task queue for tasks of a certain type. AWT and Swing use this model: there is a single GUI event thread, and all work that causes changes to the user interface must execute in that thread. However, because there is only one AWT event thread, it is undesirable to perform tasks in it that could take a considerable amount of time to complete. As a result, Swing applications often require additional worker threads for long-running, UI-related tasks.
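The single-background-thread model described above can be sketched as follows, using the standard java.awt.EventQueue API. The class name, the "status" field, and the dummy computation are illustrative inventions for this demo; real Swing code would update an actual component in the same way.

```java
import java.awt.EventQueue;
import java.util.concurrent.atomic.AtomicReference;

public class EventThreadDemo {
    // Shared "UI state"; only ever written from the AWT event thread.
    static final AtomicReference<String> status = new AtomicReference<>("working");

    public static void main(String[] args) throws Exception {
        // Long-running work happens on a separate worker thread so the
        // single event thread stays free to process events.
        Thread worker = new Thread(() -> {
            String result = expensiveComputation();
            // The UI update itself is queued onto the event thread.
            EventQueue.invokeLater(() -> status.set(result));
        });
        worker.start();
        worker.join();
        // Block until the event thread has drained the queued update.
        EventQueue.invokeAndWait(() -> { });
        System.out.println("status = " + status.get());
    }

    static String expensiveComputation() {
        return "done";
    }
}
```

The key design point is that the event thread serializes all UI mutations, so no locking of UI state is needed; the cost is that any slow work parked on that thread freezes the interface.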

Both the thread-per-task and single-background-thread approaches can work very well in certain situations. The thread-per-task approach works quite well with a small number of long-running tasks. The single-background-thread approach works quite well as long as scheduling predictability is not important, as is the case with low-priority background tasks. However, most server applications are geared toward processing large numbers of short-lived tasks or subtasks, and so often want a mechanism for processing these tasks efficiently, with low overhead, and with some measure of resource management and timing predictability. Thread pools offer these advantages.


Work queue

As far as the actual implementation of a thread pool goes, the term "thread pool" is somewhat misleading, because the "obvious" implementation of a thread pool does not, in most cases, necessarily produce the results we want. The term "thread pool" predates the Java platform, so it may be an artifact of a less object-oriented approach. Still, the term continues to be widely used.

While it is easy to implement a thread pool class in which a client class waits for an available thread, hands the task off to that thread for execution, and returns the thread to the pool when the task completes, this approach has several potentially undesirable effects. For example, what happens when the pool is empty? A caller attempting to hand a task to a pool thread would find the pool empty, and its thread would block while waiting for a pool thread to become available. Yet one of the main reasons for using background threads is to prevent the submitting thread from being blocked. Blocking the caller outright, as in the "obvious" thread-pool implementation, can eliminate the very problem we were trying to solve.

What we usually want instead is a work queue combined with a fixed group of worker threads, using wait() and notify() to signal the waiting threads that new work has arrived. The work queue is typically implemented as some kind of linked list with an associated monitor object. Listing 1 shows an example of a simple pooled work queue. Although the Thread API imposes no special requirement to use the Runnable interface, this pattern of a queue of Runnable objects is a common convention for schedulers and work queues.

Listing 1. Work queue with thread pool
public class WorkQueue {
    private final int nThreads;
    private final PoolWorker[] threads;
    private final LinkedList queue;

    public WorkQueue(int nThreads) {
        this.nThreads = nThreads;
        queue = new LinkedList();
        threads = new PoolWorker[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new PoolWorker();
            threads[i].start();
        }
    }

    public void execute(Runnable r) {
        synchronized (queue) {
            queue.addLast(r);
            queue.notify();
        }
    }

    private class PoolWorker extends Thread {
        public void run() {
            Runnable r;
            while (true) {
                synchronized (queue) {
                    while (queue.isEmpty()) {
                        try {
                            queue.wait();
                        } catch (InterruptedException ignored) {
                        }
                    }
                    r = (Runnable) queue.removeFirst();
                }

                // If we don't catch RuntimeException,
                // the pool could leak threads
                try {
                    r.run();
                } catch (RuntimeException e) {
                    // You might want to log something here
                }
            }
        }
    }
}
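A usage sketch of the queue pattern from Listing 1 is shown below. The demo inlines a compact copy of the class; marking its workers as daemon threads is an adaptation for the demo only (so the process can exit when main() returns), and the task count and bodies are illustrative.

```java
import java.util.LinkedList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class WorkQueueDemo {
    static final AtomicInteger completed = new AtomicInteger();

    // Compact copy of the Listing 1 queue; setDaemon(true) is added here
    // only so this demo process can exit once main() returns.
    static class WorkQueue {
        private final LinkedList<Runnable> queue = new LinkedList<>();

        WorkQueue(int nThreads) {
            for (int i = 0; i < nThreads; i++) {
                Thread t = new Thread(() -> {
                    while (true) {
                        Runnable r;
                        synchronized (queue) {
                            while (queue.isEmpty()) {
                                try {
                                    queue.wait();
                                } catch (InterruptedException ignored) {
                                }
                            }
                            r = queue.removeFirst();
                        }
                        try {
                            r.run();
                        } catch (RuntimeException e) {
                            // log here; the worker stays alive
                        }
                    }
                });
                t.setDaemon(true);
                t.start();
            }
        }

        void execute(Runnable r) {
            synchronized (queue) {
                queue.addLast(r);
                queue.notify();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        WorkQueue pool = new WorkQueue(3);            // three worker threads
        CountDownLatch done = new CountDownLatch(10);
        for (int i = 0; i < 10; i++) {
            pool.execute(() -> {                      // ten short-lived tasks
                completed.incrementAndGet();
                done.countDown();
            });
        }
        done.await();                                 // wait for all tasks
        System.out.println(completed.get() + " tasks completed");
    }
}
```

Note that a worker re-checks queue.isEmpty() after finishing each task before waiting again, so work enqueued while all workers were busy is still picked up.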

You may have noticed that the implementation in Listing 1 uses notify() instead of notifyAll(). Most experts advise using notifyAll() rather than notify(), and for good reason: there are subtle risks in using notify(), and it is only appropriate under certain specific conditions. On the other hand, when used properly, notify() has more desirable performance characteristics than notifyAll(); in particular, notify() causes far fewer context switches, which matters in a server application.

The example work queue in Listing 1 meets the conditions for using notify() safely. So go ahead and use it in your program, but exercise great care when using notify() elsewhere.


Risk of using the thread pool

Although the thread pool is a powerful mechanism for building multithreaded applications, using one is not without risk. Applications built with thread pools are subject to all the usual concurrency risks of any multithreaded application, such as synchronization errors and deadlock, as well as a few risks specific to thread pools, such as pool-related deadlock, insufficient resources, and thread leaks.

Deadlock

Any multithreaded application risks deadlock. A set of processes or threads is said to be deadlocked when each is waiting for an event that only another process or thread in the set can cause. The simplest case of deadlock is where thread A holds an exclusive lock on object X and is waiting for the lock on object Y, while thread B holds an exclusive lock on object Y and is waiting for the lock on object X. Unless there is some way to break out of waiting for the lock (which Java locking does not support), the deadlocked threads will wait forever.

While deadlock is a risk in any multithreaded program, thread pools introduce another opportunity for deadlock: one in which all pool threads are executing tasks that are blocked waiting for the results of other tasks sitting in the queue, but those queued tasks cannot run because no thread is free. This can happen when a thread pool is used to implement a simulation involving many interacting objects: the simulated objects send queries to one another, the queries then execute as queued tasks, and each querying object waits synchronously for its response.
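This pool-induced deadlock is easy to reproduce with a pool of one thread. The sketch below uses the java.util.concurrent API (the standardized descendant of util.concurrent discussed later, available from JDK 5 on); the 500 ms timeout and the flag exist only so the demo can report the deadlock and exit instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolDeadlockDemo {
    static volatile boolean deadlockDetected = false;

    public static void main(String[] args) throws Exception {
        // A pool with exactly one thread: the outer task occupies it while
        // synchronously waiting for the inner task, which can never run.
        ExecutorService pool = Executors.newFixedThreadPool(1);

        Future<String> outer = pool.submit(() -> {
            Future<String> inner = pool.submit(() -> "inner result");
            return inner.get();   // blocks forever: no free pool thread
        });

        try {
            System.out.println(outer.get(500, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            deadlockDetected = true;
            System.out.println("pool deadlock: inner task never ran");
        }
        pool.shutdownNow();       // interrupt the stuck worker so we can exit
    }
}
```

With two or more pool threads this particular run would succeed, which is exactly what makes the bug insidious: it appears only under enough load to occupy every thread.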

Insufficient resources

One advantage of thread pools is that they generally perform well relative to the alternative scheduling mechanisms (some of which we've already discussed). But this is only true if the thread pool size is tuned properly. Threads consume substantial resources, including memory and other system resources. Besides the memory required for the Thread object itself, each thread requires two execution call stacks, which can be large. In addition, the JVM will likely create a native thread for each Java thread, consuming additional system resources. Finally, while the scheduling overhead of switching between threads is small, with many threads context switching can significantly affect program performance.

If a thread pool is too large, the resources consumed by those threads can significantly affect system performance. Time will be wasted switching between threads, and having more threads than you actually need can cause resource-starvation problems, because the pool threads are consuming resources that could be used more effectively by other tasks. Besides the resources used by the threads themselves, the work done while servicing requests may require additional resources, such as JDBC connections, sockets, or files. These are limited resources as well, and too many concurrent requests can cause failures, such as being unable to allocate a JDBC connection.

Concurrency errors

Thread pools and other queuing mechanisms rely on the wait() and notify() methods, both of which are tricky to use. If coded incorrectly, notifications can be lost, leaving threads idle even though there is work in the queue. Great care must be taken when using these methods; even experts get them wrong. It is better to use an existing implementation that is already known to work, such as the util.concurrent package discussed below, than to write your own pool.

Thread leaks

A significant risk in all kinds of thread pools is thread leakage, which occurs when a thread is removed from the pool to execute a task but is never returned to the pool when the task completes. One way a thread can leak is when a task throws a RuntimeException or an Error. If the pool class does not catch these, the thread simply exits and the size of the thread pool is permanently reduced by one. When this happens enough times, the thread pool ends up empty and the system stalls because no threads are available to process tasks.

Some tasks may wait forever for a resource or for input from a user, where the resource is not guaranteed to ever become available or the user may have gone home; such tasks stall permanently, and these stalled tasks cause the same problem as leaked threads. If a thread is permanently consumed by such a task, it has effectively been removed from the pool. Tasks like these should either be given their own thread or be allowed to wait only for a limited amount of time.

Request overload

It is possible for a server to simply be overwhelmed by requests. In that case, we may not want to queue every incoming request on our work queue, because the tasks waiting in the queue may consume too many system resources and cause resource starvation. What to do in this situation is up to you; in some cases, you can simply discard the request and rely on a higher-level protocol to retry it later, or you can refuse the request with a response indicating that the server is temporarily busy.
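One way to refuse excess requests is to make the work queue bounded and reject whatever does not fit. A minimal sketch, assuming a capacity of 2 and four incoming requests purely for illustration (a real server would pick the bound from its resource limits and send a "busy" response on rejection):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class OverloadDemo {
    static final List<String> log = new ArrayList<>();

    public static void main(String[] args) {
        // A bounded work queue; capacity 2 stands in for the server's real
        // limit. offer() returns false instead of queuing without bound,
        // so overload can be handled explicitly rather than by starvation.
        BlockingQueue<Runnable> workQueue = new ArrayBlockingQueue<>(2);

        for (int i = 1; i <= 4; i++) {
            Runnable request = () -> { /* handle the request */ };
            String outcome = workQueue.offer(request)
                    ? "request " + i + ": queued"
                    : "request " + i + ": rejected, server busy";
            log.add(outcome);
            System.out.println(outcome);
        }
    }
}
```

Since no worker drains the queue in this demo, the first two requests fill it and the last two are rejected immediately.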


Guidelines for effective use of thread pools

As long as you follow a few simple guidelines, the thread pool can be an extremely effective way to build server applications:

    • Don't queue tasks that wait synchronously for the results of other tasks. This can lead to the form of deadlock described above, in which all the pool threads are occupied by tasks that are waiting for the results of queued tasks that cannot execute because all the threads are busy.
    • Be careful when using pooled threads for potentially long-lived operations. If the program must wait for a resource, such as an I/O completion, specify a maximum wait time, and then either fail the task or re-queue it for execution later. This guarantees that some progress will eventually be made by freeing the thread for a task that might complete successfully.
    • Understand your tasks. To tune the thread pool size effectively, you need to understand the tasks being queued and what they are doing. Are they CPU-bound? Are they I/O-bound? The answers will affect how you tune your application. If you have different classes of tasks with radically different characteristics, it may make sense to have multiple work queues for the different task classes, so each pool can be tuned accordingly.
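The second guideline, bounding the wait, can be sketched with a blocking queue standing in for the slow resource. The 200 ms timeout and the queue itself are illustrative; the point is only that poll-with-timeout returns control to the pooled thread, where wait-forever would not.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedWaitDemo {
    static volatile boolean timedOut = false;

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a slow resource (say, an I/O response) that in this
        // run never arrives at all.
        BlockingQueue<String> responses = new LinkedBlockingQueue<>();

        // Wait a bounded 200 ms rather than forever; on timeout the task
        // can fail or be re-queued, freeing the pooled thread.
        String response = responses.poll(200, TimeUnit.MILLISECONDS);
        if (response == null) {
            timedOut = true;
            System.out.println("timed out; thread freed for other work");
        } else {
            System.out.println("got " + response);
        }
    }
}
```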


Resizing a pool

Tuning the size of a thread pool is largely a matter of avoiding two mistakes: having too few threads or having too many. Fortunately, for most applications the margin between too few and too many is fairly wide.

Recall the two primary advantages of using threads in an application: allowing processing to continue while waiting for slow operations such as I/O, and exploiting multiple processors. In a compute-bound application running on a machine with N processors, adding threads as the count approaches N may improve total processing capacity, while adding threads beyond N does no good. Indeed, too many threads can even degrade performance because of the additional context-switching overhead.

The optimal size of a thread pool depends on the number of processors available and the nature of the tasks on the work queue. On a system with N processors and a single work queue of entirely compute-bound tasks, a thread pool of N or N+1 threads will generally achieve maximum CPU utilization.

For tasks that may wait for I/O to complete (for example, a task that reads an HTTP request from a socket), you will want the pool to be larger than the number of available processors, because not all the threads will be working at all times. Using profiling, you can estimate the ratio of waiting time (WT) to service time (ST) for a typical request. For a system with N processors, you will want to have approximately N*(1+WT/ST) threads to keep the processors fully utilized.
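A quick worked instance of the N*(1+WT/ST) rule; the 4 processors and the 200 ms wait per 100 ms of service time are made-up profiling numbers chosen for illustration.

```java
public class PoolSizing {
    // N * (1 + WT/ST), rounded to the nearest whole thread.
    static int poolSize(int processors, double waitTime, double serviceTime) {
        return (int) Math.round(processors * (1 + waitTime / serviceTime));
    }

    public static void main(String[] args) {
        // 4 processors, requests that wait 200 ms on I/O for every
        // 100 ms of CPU service time, so WT/ST = 2.
        System.out.println(poolSize(4, 200, 100));  // 4 * (1 + 2) = 12
    }
}
```

Note the purely compute-bound case falls out as a special one: with WT = 0 the formula collapses to N threads, matching the N-or-N+1 guidance above.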

Processor utilization is not the only consideration in tuning the thread pool size. As the pool grows, you may run into limitations of the scheduler, of available memory, or of other system resources, such as the number of sockets, open file handles, or database connections.


No need to write your own pool

Doug Lea has written an excellent open-source library of concurrency utilities, util.concurrent, which includes mutexes, semaphores, collection classes such as queues and hash tables that perform well under concurrent access, and several work-queue implementations. The PooledExecutor class in that package is an efficient, widely used, correct implementation of a thread pool based on a work queue. Rather than trying to write your own thread pool, which is easy to get wrong, consider using some of the utilities in util.concurrent. See Resources for links and more information.

The util.concurrent library also inspired JSR 166, a Java Community Process (JCP) working group that is planning to develop a set of Java class libraries under the java.util.concurrent package, which should be included in the Java Development Kit 1.5 release.
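For readers on JDK 5 or later, the java.util.concurrent package that grew out of this work provides ready-made pooled executors. A minimal sketch of the standardized counterpart of PooledExecutor; the pool size and task count here are arbitrary demo values.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorSketch {
    static final AtomicInteger ran = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        // A fixed pool of 4 worker threads sharing one work queue.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        for (int i = 0; i < 8; i++) {
            final int id = i;
            pool.execute(() -> {
                ran.incrementAndGet();
                System.out.println("task " + id + " on "
                        + Thread.currentThread().getName());
            });
        }

        pool.shutdown();                          // stop accepting new tasks
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The execute() method plays the same role as in Listing 1, but queue bounding, thread leakage protection, and shutdown semantics are already handled for you.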


Conclusion

The thread pool is a useful tool for structuring server applications. It is conceptually quite simple, but there are several issues to watch for when implementing and using one, such as deadlock, insufficient resources, and the complexities of wait() and notify(). If you find that your application requires a thread pool, consider using one of the Executor classes from util.concurrent, such as PooledExecutor, rather than writing one from scratch. And if you find yourself creating threads to handle short-lived tasks, you should definitely consider using a thread pool instead.
