Java theory and practice: Thread pools and work queues

Many server applications, such as Web servers, database servers, file servers, or mail servers, are oriented around processing a large number of short tasks that arrive from some remote source. A request arrives at the server in some manner, which might be through a network protocol (such as HTTP, FTP, or POP), through a JMS queue, or perhaps by polling a database. Regardless of how requests arrive, server applications commonly face the situation where each individual task takes only a short time to process and the number of requests is large.
One simplistic model for building a server application would be to create a new thread each time a request arrives and service the request in the new thread. This approach actually works fine for prototyping, but it has significant disadvantages that become apparent if you try to deploy a server application that works this way. One of the disadvantages of the thread-per-request approach is that the overhead of creating a new thread for each request is significant; a server that creates a new thread for each request can spend more time and consume more system resources creating and destroying threads than it does processing actual user requests.
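As a rough illustration of the thread-per-request model described above, the sketch below starts one new thread for every incoming request. The class name ThreadPerRequestServer and its methods are hypothetical, and requests are modeled as plain strings rather than network connections.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: every request gets a brand-new thread.
class ThreadPerRequestServer {
    // Counts requests that have been fully processed.
    final AtomicInteger handled = new AtomicInteger();

    // Creates and starts a new thread for this one request, then returns it.
    Thread handle(final String request) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                // Stand-in for real request processing.
                handled.incrementAndGet();
            }
        });
        t.start();
        return t;
    }

    // Serves each request in its own thread, waits for all of them,
    // and returns the number of threads that had to be created.
    int serveAll(List<String> requests) {
        List<Thread> threads = new ArrayList<Thread>();
        for (String request : requests) {
            threads.add(handle(request));
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return threads.size();
    }
}
```

The number of threads created grows in step with the number of requests, which is exactly the cost the rest of this article is concerned with.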
In addition to the overhead of creating and destroying threads, active threads consume system resources. Creating too many threads in one JVM can cause the system to run out of memory or thrash due to excessive memory consumption. To prevent resource shortages, server applications need some means of limiting how many requests are being processed at any given time.
A thread pool offers a solution to both the problem of thread life-cycle overhead and the problem of resource shortages. By reusing threads for multiple tasks, the thread-creation overhead is spread over many tasks. As a bonus, because the thread already exists when a request arrives, the delay introduced by thread creation is eliminated, so the application can respond to the request immediately. Furthermore, by properly tuning the number of threads in the pool, you can prevent resource shortages by forcing any requests beyond a certain threshold to wait until a thread is available to process them.
Alternatives to thread pools
Thread pools are far from the only way to use multiple threads within a server application. As mentioned above, it is sometimes perfectly sensible to spawn a new thread for each new task. However, if tasks are created too frequently and the average task duration is too short, spawning a new thread for each task will lead to performance problems.
Another common threading model is to dedicate a single background thread and task queue to tasks of a certain type. AWT and Swing use this model: there is a GUI event thread, and all work that changes the user interface must execute in that thread. However, because there is only one AWT thread, it is undesirable to execute tasks in it that may take a perceptible amount of time to complete. As a result, Swing applications often require additional worker threads for long-running, UI-related tasks.
Both the thread-per-task and the single-background-thread approaches can work perfectly well in certain situations. The thread-per-task approach works quite well with a small number of long-running tasks. The single-background-thread approach works quite well so long as scheduling predictability is not important, as is the case with low-priority background tasks. However, most server applications are oriented around processing large numbers of short-lived tasks or subtasks, so it is desirable to have a mechanism for processing these tasks efficiently with low overhead, along with some measure of resource management and timing predictability. Thread pools offer these advantages.
Work queue
In terms of how thread pools are actually implemented, the term "thread pool" is somewhat misleading, because the "obvious" implementation of a thread pool does not produce the result we want in most cases. The term predates the Java platform and is probably an artifact of a less object-oriented approach. Still, the term continues to be widely used.
While we could easily implement a thread pool class in which a client class waits for an available thread, passes the task to that thread for execution, and then returns the thread to the pool when the task finishes, this approach has several potentially undesirable effects. What happens, for example, when the pool is empty? Any caller that attempts to pass a task to a pool thread will find the pool empty, and its thread will block while it waits for an available pool thread. Often, one of the reasons we want to use background threads is to prevent the submitting thread from blocking. Pushing the blocking all the way to the caller, as the "obvious" thread pool implementation does, can bring back the very problem we were trying to solve.
What we usually want is a work queue combined with a fixed group of worker threads, which uses wait() and notify() to signal waiting threads that new work has arrived. The work queue is generally implemented as some sort of linked list with an associated monitor object. Listing 1 shows a simple example of such a work queue. Although the Thread API imposes no requirement to use the Runnable interface, this pattern of queuing Runnable objects is a common convention for schedulers and work queues.

Listing 1. Work queue with thread pool

import java.util.LinkedList;

public class WorkQueue
{
    private final int nThreads;
    private final PoolWorker[] threads;
    // Pending tasks; also serves as the monitor that workers wait on.
    private final LinkedList<Runnable> queue;

    public WorkQueue(int nThreads)
    {
        this.nThreads = nThreads;
        queue = new LinkedList<Runnable>();
        threads = new PoolWorker[nThreads];

        // Start a fixed set of worker threads up front.
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new PoolWorker();
            threads[i].start();
        }
    }

    // Queues a task and wakes one waiting worker.
    public void execute(Runnable r) {
        synchronized (queue) {
            queue.addLast(r);
            queue.notify();
        }
    }

    private class PoolWorker extends Thread {
        public void run() {
            Runnable r;

            while (true) {
                synchronized (queue) {
                    // Re-check the condition in a loop after every wakeup.
                    while (queue.isEmpty()) {
                        try
                        {
                            queue.wait();
                        }
                        catch (InterruptedException ignored)
                        {
                        }
                    }

                    r = queue.removeFirst();
                }

                // If we don't catch RuntimeException,
                // the pool could leak threads
                try {
                    r.run();
                }
                catch (RuntimeException e) {
                    // You might want to log something here
                }
            }
        }
    }
}

You may have noticed that the implementation in Listing 1 uses notify() instead of notifyAll(). Most experts advise using notifyAll() instead of notify(), and with good reason: there are subtle risks associated with notify(), and it is appropriate only under certain specific conditions. On the other hand, when used properly, notify() has more desirable performance characteristics than notifyAll(); in particular, notify() causes far fewer context switches, which is important in a server application.
The example work queue in Listing 1 meets the requirements for using notify() safely. So go ahead and use it in your program, but exercise great care when using notify() in other situations.
Risks of using the thread pool
Although the thread pool is a powerful mechanism for structuring multithreaded applications, it is not without risk. Applications built with thread pools are subject to all the same concurrency risks as any other multithreaded application, such as synchronization errors and deadlock, and to a few other risks specific to thread pools as well, such as pool-related deadlock, insufficient resources, and thread leakage.
Deadlock
With any multithreaded application, there is a risk of deadlock. A set of processes or threads is said to be deadlocked when each one is waiting for an event that only another process or thread in the set can cause. The simplest case of deadlock is where thread A holds an exclusive lock on object X and is waiting for a lock on object Y, while thread B holds an exclusive lock on object Y and is waiting for the lock on object X. Unless there is some way to break out of waiting for the lock (which Java's built-in locking does not support), the deadlocked threads will wait forever.
While deadlock is a risk in any multithreaded program, thread pools introduce another opportunity for deadlock: all pool threads are executing tasks that are blocked waiting for the results of another task in the queue, but that other task cannot run because no thread is free. This can happen when a thread pool is used to implement a simulation of many interacting objects, where the simulated objects send each other queries that execute as queued tasks and the querying object waits synchronously for the response.
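The pool-related deadlock described above can be reproduced in a few lines. The sketch below is an illustration only: it uses the ExecutorService API from java.util.concurrent rather than the WorkQueue class in Listing 1, the class name PoolDeadlockDemo is hypothetical, and the inner wait is given a timeout (an assumption made purely so the demo terminates instead of hanging).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo: an outer task waits for an inner task queued on the same pool.
class PoolDeadlockDemo {
    // Returns what the outer task observed for the given pool size.
    static String run(int poolSize) {
        final ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Future<String> outer = pool.submit(new Callable<String>() {
                public String call() throws Exception {
                    // The inner task is queued on the same pool.
                    Future<String> inner = pool.submit(new Callable<String>() {
                        public String call() {
                            return "inner result";
                        }
                    });
                    try {
                        // Bounded wait so the demo ends; an unbounded get() would hang
                        // forever when every pool thread is busy waiting like this.
                        return inner.get(500, TimeUnit.MILLISECONDS);
                    } catch (TimeoutException e) {
                        return "stalled: no free thread for the inner task";
                    }
                }
            });
            return outer.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With a pool of one thread, the outer task occupies the only worker, so the inner task never starts and the wait times out. With two threads, the inner task gets its own worker and the outer task receives its result.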
Insufficient resources
One benefit of thread pools is that they generally perform well relative to the alternative scheduling mechanisms discussed above. But this is only true if the thread pool size is tuned properly. Threads consume numerous resources, including memory and other system resources. Besides the memory required for the Thread object, each thread requires two execution call stacks, which can be large. In addition, the JVM will likely create a native thread for each Java thread, which consumes additional system resources. Finally, while the scheduling overhead of switching between threads is small, with many threads context switching can become a significant drag on your program's performance.
If a thread pool is too large, the resources consumed by its threads could have a significant impact on system performance. Time is wasted switching between threads, and having more threads than you need may cause resource starvation, because the pool threads consume resources that other tasks could use more effectively. In addition to the resources used by the threads themselves, the work done in servicing requests may require additional resources, such as JDBC connections, sockets, or files. These are limited resources as well, and too many concurrent requests may cause failures, such as being unable to allocate a JDBC connection.
Concurrency errors
Thread pools and other queuing mechanisms rely on the wait() and notify() methods, which can be tricky to use. If coded incorrectly, notifications can be lost, leaving threads idle even though there is work in the queue to be processed. Take great care when using these facilities; even experts make mistakes with them. Better yet, use an existing implementation that is known to work, such as the util.concurrent package discussed below, rather than writing your own pool.
Thread Leakage
A significant risk in all kinds of thread pools is thread leakage, which occurs when a thread is removed from the pool to perform a task but is not returned to the pool when the task completes. One way this happens is when the task throws a RuntimeException or an Error. If the pool class does not catch these, the thread simply exits and the size of the thread pool is permanently reduced by one. When this happens enough times, the thread pool eventually becomes empty, and the system stalls because no threads are available to process tasks.
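One way to guard against this kind of leak is to have the worker catch whatever a task throws before moving on to the next task. The sketch below shows only that idea; the class name LeakProofWorker and the method runAll are hypothetical, and the tasks are run on the calling thread to keep the example small. Whether a real pool should swallow every Error (for example OutOfMemoryError) is a separate design decision that this sketch does not settle.

```java
import java.util.List;

// Hypothetical sketch of a worker loop body that survives failing tasks.
class LeakProofWorker {
    // Runs every task in order on the calling thread.
    // Returns the number of tasks that threw instead of completing normally.
    static int runAll(List<Runnable> tasks) {
        int failures = 0;
        for (Runnable task : tasks) {
            try {
                task.run();
            } catch (Throwable t) {
                // A real pool would log this; the worker keeps going either way,
                // so one bad task does not cost the pool a thread.
                failures++;
            }
        }
        return failures;
    }
}
```

A task that throws is counted as a failure, and the tasks after it still run.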
Tasks that stall permanently, such as those that may wait forever for resources that are not guaranteed to become available or for input from users who may have gone home, can cause the equivalent of thread leakage. If a thread is permanently consumed by such a task, it has effectively been removed from the pool. Such tasks should either be given their own thread or wait only for a limited time.
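Waiting for a limited time can be as simple as using a timed call instead of an unbounded one. In the sketch below, the resource is modeled as a string taken from a BlockingQueue (part of java.util.concurrent); the class name BoundedWait, the method acquire, and the idea of representing resources as strings are all assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a pooled task waits for a resource, but only for so long.
class BoundedWait {
    // Waits at most timeoutMillis for a resource.
    // Returns the resource, or null if none became available in time.
    static String acquire(BlockingQueue<String> resources, long timeoutMillis) {
        try {
            return resources.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and give up on the wait.
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

A caller that receives null can fail the task or requeue it, which frees the pool thread for other work.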
Request overload
It is possible for a server to simply be overwhelmed with requests. In this case, we may not want to queue every incoming request to our work queue, because the tasks queued for execution may consume too many system resources and cause resource shortages. It is up to you to decide what to do in this case. In some situations you can simply discard the request and rely on a higher-level protocol to retry it later; in others you can refuse the request with a response indicating that the server is temporarily busy.
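One way to refuse work under overload is to bound the queue and reject tasks once it is full. The sketch below uses ThreadPoolExecutor from java.util.concurrent with a bounded ArrayBlockingQueue and the AbortPolicy rejection handler; the class name BoundedPoolDemo and the use of a latch to simulate slow requests are assumptions for illustration, not part of the work queue in Listing 1.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a pool that rejects requests once its bounded queue is full.
class BoundedPoolDemo {
    // Submits `requests` tasks that all block until released,
    // and returns how many of them the pool refused.
    static int countRejected(int threads, int queueCapacity, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
        final CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(new Runnable() {
                        public void run() {
                            try {
                                release.await(); // simulate a slow request
                            } catch (InterruptedException e) {
                                Thread.currentThread().interrupt();
                            }
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // This is where a server could reply "temporarily busy".
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }
}
```

With 2 threads and a queue capacity of 3, only 5 of 10 simultaneous slow requests are accepted; the other 5 are rejected immediately rather than consuming memory in an ever-growing queue.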
Guidelines for Effective Use of thread pools
As long as you follow several simple guidelines, the thread pool can be an extremely effective method for building server applications:
Do not queue tasks that wait synchronously for the results of other tasks. This can cause a deadlock of the form described above, where all threads are occupied by tasks that are in turn waiting for results from queued tasks that cannot execute because all threads are busy.
Be careful when using pooled threads for potentially long-lived operations. If the program must wait for a resource, such as an I/O completion, specify a maximum wait time, and then either fail the task or requeue it for execution at a later time. This guarantees that some progress will eventually be made by freeing the thread for a task that might complete successfully.
Understand your tasks. To tune the thread pool size effectively, you need to understand the tasks being queued and what they are doing. Are they CPU-bound? Are they I/O-bound? Your answers will affect how you tune your application. If you have different classes of tasks with radically different characteristics, it may make sense to have multiple work queues for the different task classes, so that each pool can be tuned accordingly.
Adjust the pool size
Tuning the size of a thread pool is largely a matter of avoiding two mistakes: having too few threads or too many threads. Fortunately, for most applications the middle ground between too few and too many is fairly wide.
Recall that there are two primary advantages to using threading in applications: allowing processing to continue while waiting for slow operations such as I/O, and exploiting the availability of multiple processors. In a compute-bound application running on an N-processor machine, adding threads may improve throughput as the number of threads approaches N, but adding threads beyond N will do no good. Indeed, too many threads will even degrade performance because of the additional context-switching overhead.
The optimum size of a thread pool depends on the number of processors available and the nature of the tasks on the work queue. On an N-processor system, for a single work queue that holds entirely compute-bound tasks, you will generally achieve maximum CPU utilization with a thread pool of N or N+1 threads.
For tasks that may wait for I/O to complete (for example, a task that reads an HTTP request from a socket), you will want to increase the pool size beyond the number of available processors, because not all threads will be working at all times. Using profiling, you can estimate the ratio of waiting time (WT) to service time (ST) for a typical request. If we call this ratio WT/ST, then for an N-processor system you will want approximately N * (1 + WT/ST) threads to keep the processors fully utilized.
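The sizing rule above is simple arithmetic, and a small helper makes it concrete. The class name PoolSizing and the method threadsFor are hypothetical, and rounding the result up to a whole thread is an assumption of this sketch; the waiting and service times would come from your own profiling.

```java
// Hypothetical helper that applies N * (1 + WT/ST).
class PoolSizing {
    // processors:  N, the number of available processors
    // waitTime:    WT, typical time a request spends waiting (any unit)
    // serviceTime: ST, typical time a request spends computing (same unit)
    static int threadsFor(int processors, double waitTime, double serviceTime) {
        if (processors < 1 || waitTime < 0 || serviceTime <= 0) {
            throw new IllegalArgumentException("need N >= 1, WT >= 0, ST > 0");
        }
        // Rounding up is an assumption of this sketch, not part of the formula.
        return (int) Math.ceil(processors * (1.0 + waitTime / serviceTime));
    }
}
```

For example, on a 4-processor machine where a typical request waits 30 ms and computes for 10 ms, the estimate is 4 * (1 + 30/10) = 16 threads, while a purely compute-bound workload (WT = 0) gives 4. In a running program, N can be obtained from Runtime.getRuntime().availableProcessors().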
Processor utilization is not the only consideration in tuning the thread pool size. As the thread pool grows, you may run into the limits of the scheduler, available memory, or other system resources, such as the number of sockets, open file handles, or database connections.
Doug Lea has written an excellent open-source library of concurrency utilities, util.concurrent, which includes mutexes, semaphores, collection classes such as queues and hash tables that perform well under concurrent access, and several work queue implementations. The PooledExecutor class from this package is an efficient, widely used implementation of a thread pool based on a work queue. Rather than writing your own thread pool, which is easy to get wrong, you might consider using some of the utilities in util.concurrent. For more information, see references.
The util.concurrent library was also the inspiration for JSR 166, a Java Community Process (JCP) working group that is planning to produce a set of concurrency utilities for inclusion in the Java class library under the java.util.concurrent package. This package should be ready for the Java Development Kit 1.5 release.
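The java.util.concurrent package did ship with Java 5, and its ExecutorService interface covers the work-queue-plus-fixed-threads pattern discussed in this article. The following is a minimal sketch of using it rather than a hand-written pool; the class name ExecutorExample, the method squareAll, and the choice of squaring numbers as the "task" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: short tasks on a fixed-size pool from java.util.concurrent.
class ExecutorExample {
    // Squares each input on a pool of nThreads threads
    // and returns the results in input order.
    static List<Integer> squareAll(List<Integer> inputs, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            // Queue one task per input; the pool's threads pick them up.
            List<Future<Integer>> futures = new ArrayList<Future<Integer>>();
            for (final Integer n : inputs) {
                futures.add(pool.submit(new Callable<Integer>() {
                    public Integer call() {
                        return n * n;
                    }
                }));
            }
            // Collect results in the order the tasks were submitted.
            List<Integer> results = new ArrayList<Integer>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }
}
```

The caller never creates or manages threads directly; it only submits tasks and collects results, while the pool reuses its fixed set of worker threads.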
The thread pool is a useful tool for organizing server applications. It is quite straightforward in concept, but there are several issues to watch for when implementing and using one, such as deadlock, insufficient resources, and the complexities of wait() and notify(). If you find that your application needs a thread pool, consider using one of the Executor classes from util.concurrent, such as PooledExecutor, instead of writing one from scratch. If you find yourself creating threads to handle short-lived tasks, you should definitely consider using a thread pool instead.
