Why use a thread pool?

Many server applications, such as Web servers, database servers, file servers, and mail servers, are oriented toward processing a large number of short tasks arriving from remote sources. A request can reach the server in many ways: through a network protocol (such as HTTP, FTP, or POP), through a JMS queue, or perhaps by polling a database. However the request arrives, the common situation in server applications is that each individual task takes very little time to process, but the number of requests is huge.

A simplistic model for building a server application would be to create a new thread every time a request arrives and service the request in that new thread. This approach actually works fine for prototyping, but if you tried to deploy a server application that ran this way, its serious shortcomings would quickly become obvious. One drawback of the thread-per-request approach is that creating a new thread for each request is expensive; a server that did so would spend more time and consume more system resources creating and destroying threads than it would processing actual user requests.

In addition to the overhead of creating and destroying threads, active threads consume system resources. Creating too many threads in one JVM can cause the system to run out of memory, or to thrash because of excessive memory consumption. To prevent resource exhaustion, a server application needs some means of limiting how many requests are being processed at any given time.

A thread pool offers a solution to both the thread life-cycle overhead problem and the resource-exhaustion problem. By reusing threads for multiple tasks, the cost of thread creation is spread over many tasks. As a bonus, because the thread already exists when a request arrives, the delay introduced by thread creation is eliminated.
This allows the request to be serviced immediately, making the application more responsive. Furthermore, by properly tuning the number of threads in the thread pool, that is, by forcing any requests in excess of a certain threshold to wait until a thread becomes available to process them, you can prevent resource exhaustion.

Alternatives to thread pools

A thread pool is far from the only way to use multiple threads within a server application. As mentioned above, it is sometimes perfectly sensible to spawn a new thread for each new task. However, if tasks are created very frequently and their average processing time is short, spawning a new thread per task causes performance problems.

Another common threading model is to dedicate a single background thread and task queue to a given class of tasks. AWT and Swing use this model: there is a single GUI event thread, and all work that causes changes to the user interface must execute in that thread. Because there is only one AWT thread, however, executing a task on it that takes a considerable amount of time is undesirable; as a result, Swing applications often need additional worker threads for long-running, UI-related tasks.

Both the thread-per-task approach and the single-background-thread approach work best in certain situations. The thread-per-task approach works quite well with a small number of long-running tasks. The single-background-thread approach works quite well as long as scheduling predictability is not important, as is the case with low-priority background tasks. However, most server applications are oriented toward processing large numbers of short-lived tasks or subtasks, and therefore want a mechanism for handling these tasks efficiently and with low overhead, along with some measure of resource management and timing predictability. A thread pool provides these benefits.
Work queues

As far as the actual implementation of a thread pool goes, the term "thread pool" is somewhat misleading, because the "obvious" implementation of a thread pool does not necessarily produce the result we want in most cases. The term "thread pool" predates the Java platform, so it is probably a product of less object-oriented approaches. Still, the term continues to be widely used.

While we could easily implement a thread pool class in which a client class waits for an available thread, hands the task to that thread for execution, and returns the thread to the pool when the task completes, this approach has several potentially undesirable effects. What happens, for example, when the pool is empty? A caller trying to hand a task to a pool thread would find the pool empty, and its thread would block while waiting for an available pool thread. One of the reasons we often want to use background threads in the first place is to prevent the submitting thread from blocking; allowing the caller to block completely, as in the "obvious" implementation of a thread pool, can negate the very problem we are trying to solve.

What we usually want instead is a work queue combined with a fixed group of worker threads, which uses wait() and notify() to signal waiting threads that new work has arrived. The work queue is generally implemented as some sort of linked list with an associated monitor object. Listing 1 shows an example of a simple pooled work queue. Although the Thread API imposes no special requirement to use the Runnable interface, this pattern of a queue of Runnable objects is a common convention for schedulers and work queues.

Listing 1. A work queue with a thread pool
import java.util.LinkedList;

public class WorkQueue {
    private final int nThreads;
    private final PoolWorker[] threads;
    private final LinkedList queue;

    public WorkQueue(int nThreads) {
        this.nThreads = nThreads;
        queue = new LinkedList();
        threads = new PoolWorker[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new PoolWorker();
            threads[i].start();
        }
    }

    public void execute(Runnable r) {
        synchronized (queue) {
            queue.addLast(r);
            queue.notify();
        }
    }

    private class PoolWorker extends Thread {
        public void run() {
            Runnable r;
            while (true) {
                synchronized (queue) {
                    while (queue.isEmpty()) {
                        try {
                            queue.wait();
                        } catch (InterruptedException ignored) {
                        }
                    }
                    r = (Runnable) queue.removeFirst();
                }

                // If we don't catch RuntimeException,
                // the pool could leak threads
                try {
                    r.run();
                } catch (RuntimeException e) {
                    // You might want to log something here
                }
            }
        }
    }
}
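To see the pattern in action, here is a condensed, self-contained variant of Listing 1 together with a small driver that submits some short tasks. The class name SimpleWorkQueue, the daemon-thread flag, and the fixed sleep are additions for this sketch, not part of the original listing:

```java
import java.util.LinkedList;
import java.util.concurrent.atomic.AtomicInteger;

// Condensed variant of Listing 1 so this sketch compiles on its own.
class SimpleWorkQueue {
    private final LinkedList<Runnable> queue = new LinkedList<>();

    SimpleWorkQueue(int nThreads) {
        for (int i = 0; i < nThreads; i++) {
            Thread t = new Thread(() -> {
                while (true) {
                    Runnable r;
                    synchronized (queue) {
                        while (queue.isEmpty()) {
                            try {
                                queue.wait();
                            } catch (InterruptedException e) {
                                return;   // demo only: let interrupted workers exit
                            }
                        }
                        r = queue.removeFirst();
                    }
                    try {
                        r.run();
                    } catch (RuntimeException e) {
                        // a real pool would log this
                    }
                }
            });
            t.setDaemon(true);            // demo only: let the JVM exit afterwards
            t.start();
        }
    }

    void execute(Runnable r) {
        synchronized (queue) {
            queue.addLast(r);
            queue.notify();
        }
    }
}

public class WorkQueueDemo {
    static int runDemo() throws InterruptedException {
        SimpleWorkQueue pool = new SimpleWorkQueue(4);   // four worker threads
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 20; i++) {
            pool.execute(done::incrementAndGet);         // short, independent tasks
        }
        Thread.sleep(500);   // crude wait; plenty for 20 trivial tasks
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo());   // prints 20
    }
}
```

Note that the caller returns from execute() immediately; the worker threads pick tasks off the queue as they become free, which is exactly the non-blocking behavior the "obvious" pool implementation lacks.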
You may have noticed that the implementation in Listing 1 uses notify() instead of notifyAll(). Most experts advise using notifyAll() rather than notify(), and for good reason: there are subtle risks in using notify(), and it is only appropriate under certain specific conditions. On the other hand, when used properly, notify() has more desirable performance characteristics than notifyAll(); in particular, notify() causes far fewer context switches, which matters in a server application. The example work queue in Listing 1 meets the conditions for the safe use of notify(), so go ahead and use it in your programs, but exercise great care when using notify() in other situations.

Risks of using a thread pool

While the thread pool is a powerful mechanism for structuring multithreaded applications, it is not without risk. Applications built with thread pools are subject to all the usual concurrency hazards of any multithreaded application, such as synchronization errors and deadlock, and to a few other hazards specific to thread pools, such as pool-related deadlock, resource thrashing, and thread leakage.

Deadlock

Any multithreaded application carries a risk of deadlock. A set of processes or threads is deadlocked when each of them is waiting for an event that only another process or thread in the set can cause. The simplest case of deadlock is where thread A holds an exclusive lock on object X and is waiting for the lock on object Y, while thread B holds an exclusive lock on object Y and is waiting for the lock on object X. Unless there is some way to break out of waiting for the lock (which Java locking does not support), the deadlocked threads will wait forever.
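To illustrate why notifyAll() is the safer default outside of situations like Listing 1, consider a hypothetical one-shot "gate" class (not from the article). Every waiter may proceed once the condition becomes true, so all of them must be woken; notify() would release only one and strand the rest:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical one-shot gate: threads block in await() until open() is called.
class Gate {
    private boolean open = false;

    synchronized void await() throws InterruptedException {
        while (!open) {        // always re-test the condition after waking
            wait();
        }
    }

    synchronized void open() {
        open = true;
        notifyAll();           // wake every waiting thread, not just one
    }
}

public class GateDemo {
    static int runDemo() throws InterruptedException {
        Gate gate = new Gate();
        AtomicInteger released = new AtomicInteger();
        Thread[] waiters = new Thread[3];
        for (int i = 0; i < waiters.length; i++) {
            waiters[i] = new Thread(() -> {
                try {
                    gate.await();
                    released.incrementAndGet();
                } catch (InterruptedException ignored) {
                }
            });
            waiters[i].start();
        }
        Thread.sleep(100);     // give the waiters time to block
        gate.open();
        for (Thread t : waiters) {
            t.join();
        }
        return released.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo());   // prints 3
    }
}
```

Listing 1 can safely use notify() because its waiters are interchangeable (any worker can take any task) and each notification corresponds to exactly one enqueued item; the gate satisfies neither condition.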
While there is a risk of deadlock in any multithreaded program, thread pools introduce another opportunity for deadlock: one in which all the pool threads are executing tasks that are blocked waiting for the results of another task on the queue, but that task cannot run because no thread is available. This can happen when thread pools are used to implement simulations involving many interacting objects, where the simulated objects send queries to one another that are then executed as queued tasks, and the querying object waits synchronously for the response.

Resource thrashing

One benefit of thread pools is that they generally perform well relative to the alternative scheduling mechanisms, some of which we have already discussed. But that is only true if the thread pool size is tuned properly. Threads consume significant resources, including memory and other system resources. Besides the memory required for the Thread object itself, each thread requires two execution call stacks, which can be large. In addition, the JVM will likely create a native thread for each Java thread, which consumes additional system resources. Finally, while the scheduling overhead of switching between threads is small, with many threads context switching can seriously affect program performance.

If a thread pool is too large, the resources consumed by those threads can significantly affect system performance. Time is wasted switching between threads, and having more threads than you actually need may cause resource-starvation problems, because the pool threads are consuming resources that could be used more effectively by other tasks. In addition to the resources used by the threads themselves, the work done in servicing requests may require additional resources, such as JDBC connections, sockets, or files. These are limited resources as well, and too many concurrent requests can cause failures, such as the inability to allocate a JDBC connection.
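The pool-induced deadlock described above can be demonstrated with the java.util.concurrent executors that later grew out of util.concurrent. In this sketch, the one pool thread is occupied by an outer task that waits synchronously for an inner task, which can never start; the 500 ms timeout stands in for "waits forever":

```java
import java.util.concurrent.*;

public class PoolDeadlockDemo {
    static String runDemo() throws InterruptedException {
        // A pool with exactly one thread.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Future<String> outer = pool.submit(() -> {
            // The outer task queues a subtask and waits for its result,
            // but no thread is free to run it: a pool-induced deadlock.
            Future<String> inner = pool.submit(() -> "inner result");
            return inner.get();
        });
        String result;
        try {
            result = outer.get(500, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result = "deadlocked";      // the wait never completes
        } catch (ExecutionException e) {
            result = "failed";
        }
        pool.shutdownNow();             // interrupt the stuck worker
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo());  // prints "deadlocked"
    }
}
```

With a pool of two or more threads the same code completes normally, which is what makes this failure mode so insidious: it may only appear under load, when all the pool threads happen to be occupied at once.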
Concurrency errors

Thread pools and other queuing mechanisms rely on the wait() and notify() methods, both of which are tricky to use. If coded incorrectly, notifications can be lost, resulting in threads remaining idle even though there is work in the queue to be done. Great care must be taken when using these methods; even experts make mistakes with them. Better yet, use an existing implementation that is already known to work, such as the util.concurrent package discussed below, rather than writing your own.

Thread leakage

A significant risk in all kinds of thread pools is thread leakage, which occurs when a thread is removed from the pool to perform a task but does not return to the pool when the task completes. One way this happens is when the task throws a RuntimeException or an Error. If the pool class does not catch these, the thread simply exits and the size of the thread pool is permanently reduced by one. When this happens enough times, the thread pool eventually ends up empty, and the system grinds to a halt because no threads are available to process tasks.

Tasks that stall permanently, such as those that wait forever for a resource or for input from a user that is not guaranteed to become available (the user may have gone home), can cause the same problem as a thread leak: a thread permanently consumed by such a task has effectively been removed from the pool. Tasks like these should either be given their own thread or be made to wait only a limited amount of time.

Request overload

It is possible for a server to simply be overwhelmed by requests. In that case, we may not want to queue every incoming request on our work queue, because the tasks queued for execution may themselves consume too many system resources and cause resource starvation.
What to do in this situation is up to you; in some cases you can simply discard the request, relying on a higher-level protocol to retry it later, or you can refuse the request with a response indicating that the server is temporarily busy.

Guidelines for effective use of thread pools

A thread pool can be an extremely effective way to structure a server application, as long as you follow a few simple guidelines:

- Do not queue tasks that wait synchronously for the results of other tasks. This can lead to the form of deadlock described above, in which all the threads are occupied by tasks waiting for the results of queued tasks that cannot execute because all the threads are busy.

- Be careful when using pooled threads for potentially long-lived operations. If the program must wait for a resource, such as an I/O completion, specify a maximum wait time, and then fail the task or requeue it for execution later. This guarantees that some progress will eventually be made by freeing the thread for a task that might complete successfully.

- Understand your tasks. To tune the thread pool size effectively, you need to understand the tasks being queued and what they are doing. Are they CPU-bound? Are they I/O-bound? Your answers will affect how you tune your application. If you have different classes of tasks with radically different characteristics, it may make sense to have multiple work queues for the different task types, so each pool can be tuned accordingly.

Sizing the pool

Tuning the size of a thread pool is largely a matter of avoiding two mistakes: having too few threads or having too many. Fortunately, for most applications the middle ground between too few and too many is fairly wide.

Recall that there are two primary advantages to using threads in applications: they allow processing to continue while waiting for slow operations such as I/O, and they exploit multiple processors.
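Returning to the request-overload problem for a moment: one way to refuse excess work is to bound the backlog in the work queue itself. The sketch below is a hypothetical variant of Listing 1 (the class name, capacity parameter, and boolean return are all additions) whose execute() rejects tasks once the queue is full, letting the caller send a "server busy" response instead:

```java
import java.util.LinkedList;

// A work queue with a bounded backlog: when the queue is full,
// execute() rejects the task instead of letting the backlog grow.
class BoundedWorkQueue {
    private final LinkedList<Runnable> queue = new LinkedList<>();
    private final int capacity;

    BoundedWorkQueue(int nThreads, int capacity) {
        this.capacity = capacity;
        for (int i = 0; i < nThreads; i++) {
            Thread t = new Thread(this::workerLoop);
            t.setDaemon(true);
            t.start();
        }
    }

    /** Returns false if the queue was full and the task was rejected. */
    boolean execute(Runnable r) {
        synchronized (queue) {
            if (queue.size() >= capacity) {
                return false;            // caller can retry or reply "busy"
            }
            queue.addLast(r);
            queue.notify();
            return true;
        }
    }

    private void workerLoop() {
        while (true) {
            Runnable r;
            synchronized (queue) {
                while (queue.isEmpty()) {
                    try {
                        queue.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                r = queue.removeFirst();
            }
            try {
                r.run();
            } catch (RuntimeException e) {
                // a real pool would log this
            }
        }
    }
}

public class OverloadDemo {
    static int runDemo() {
        // Zero worker threads so nothing drains the queue, which makes
        // the rejection behavior observable deterministically.
        BoundedWorkQueue q = new BoundedWorkQueue(0, 2);
        int accepted = 0;
        for (int i = 0; i < 5; i++) {
            if (q.execute(() -> {})) accepted++;
        }
        return accepted;                 // 2 of the 5 requests fit
    }

    public static void main(String[] args) {
        System.out.println(runDemo());   // prints 2
    }
}
```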
For a compute-bound application running on an N-processor machine, adding threads as the thread count approaches N is likely to improve total throughput, while adding threads beyond N will not. Indeed, too many threads can even degrade performance because of the additional context-switching overhead.

The optimal size of a thread pool depends on the number of processors available and the nature of the tasks on the work queue. On a system with N processors and a single work queue of entirely compute-bound tasks, maximum CPU utilization is generally achieved with a thread pool of N or N+1 threads. For tasks that may need to wait for I/O to complete (such as tasks that read HTTP requests from a socket), you will want the pool to be larger than the number of available processors, because not all of the threads will be working at any given time. Using profiling, you can estimate the ratio of wait time (WT) to service time (ST) for a typical request; for an N-processor system, you would then want approximately N*(1+WT/ST) threads to keep the processors fully utilized.

Processor utilization is not the only consideration in tuning thread pool size. As the pool grows, you may run into limitations of the scheduler, available memory, or other system resources, such as the number of sockets, open file handles, or database connections.

No need to write your own pool

Doug Lea has written an excellent open source library of concurrency utilities, util.concurrent, which includes mutexes, semaphores, collection classes such as queues and hash tables that perform well under concurrent access, and several work-queue implementations. The PooledExecutor class in that package is an efficient, widely used, and correct implementation of a thread pool based on a work queue.
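The N*(1+WT/ST) sizing rule above is easy to check with a few lines of arithmetic. The numbers below are illustrative, not measurements from the article:

```java
public class PoolSizing {
    // Threads needed to keep 'processors' CPUs busy when a typical task
    // waits for waitTime and computes for serviceTime: N * (1 + WT/ST).
    static int poolSize(int processors, double waitTime, double serviceTime) {
        return (int) Math.round(processors * (1 + waitTime / serviceTime));
    }

    public static void main(String[] args) {
        // Purely compute-bound tasks (WT = 0): one thread per processor.
        System.out.println(poolSize(4, 0, 5));    // prints 4
        // I/O-heavy tasks that wait 45 ms for every 5 ms of CPU time:
        System.out.println(poolSize(4, 45, 5));   // prints 40
    }
}
```

The second case shows why I/O-bound servers want pools far larger than the processor count: while 39 of the 40 threads are blocked waiting, the remaining ones keep the four CPUs busy.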
Rather than trying to write your own pool, which is easy to get wrong, consider using some of the utilities in util.concurrent. See Resources for links and further information. The util.concurrent library also inspired JSR 166, a Java Community Process (JCP) working group that is planning to develop a set of concurrency utilities for inclusion in the Java class library under the java.util.concurrent package, which should be available in the Java Development Kit 1.5 release.

A thread pool is a useful tool for organizing server applications. It is quite simple in concept, but there are several issues to watch for when implementing and using one, such as deadlock, resource thrashing, and the complexities of wait() and notify(). If you find that your application requires a thread pool, consider using one of the Executor classes from util.concurrent, such as PooledExecutor, instead of writing one from scratch. And if you find yourself creating threads to handle many short-lived tasks, you should definitely consider using a thread pool instead.
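As a closing sketch, here is what using a library-provided pool looks like with the java.util.concurrent executors that were standardized out of util.concurrent (Executors.newFixedThreadPool is the standard-library descendant of the PooledExecutor idea; the squaring tasks are purely illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LibraryPoolDemo {
    static int runDemo() throws Exception {
        // A fixed-size pool of four worker threads from the library:
        // no hand-written wait()/notify() code to get wrong.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            final int n = i;
            results.add(pool.submit(() -> n * n));  // short, independent tasks
        }
        int sum = 0;
        for (Future<Integer> f : results) {
            sum += f.get();                          // collect each result
        }
        pool.shutdown();
        return sum;                                  // 0 + 1 + 4 + ... + 81
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDemo());               // prints 285
    }
}
```

The library pool handles queuing, notification, and exception containment internally, sidestepping the thread-leak and lost-notification hazards discussed above.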