Brian Goetz (brian@quiotix.com)
Chief Consultant, Quiotix Corp
April 2003
As with many other application infrastructure services, there is usually no need to rewrite concurrency utility classes (such as work queues and thread pools) from scratch for each project. This month, Brian Goetz introduces Doug Lea's util.concurrent package, an open source package of high-quality, widely used concurrency utilities.
When a project requires an XML parser, a text indexer and search engine, a regular expression compiler, an XSL processor, or a PDF generator, most of us would never consider writing these utilities ourselves. Whenever we need these facilities, we use a commercial or open source implementation, for a simple reason: the existing implementations work well and are readily available, and writing our own would be a lot of work for little or no gain. As software engineers, we like to believe that we share Isaac Newton's enthusiasm for standing on the shoulders of giants, and sometimes, but not always, that is true. (In his Turing Award lecture, Richard Hamming suggested that computer scientists tend to be rather more "self-reliant.")
Why the wheel keeps being reinvented
Some low-level application framework services, such as logging, database connection pooling, caching, and task scheduling, are required by almost every server application, and yet we see these basic infrastructure services rewritten over and over again. Why does this happen? It is not necessarily because the existing options are inadequate, or because the custom version is better or better suited to the application at hand. In fact, a custom version developed for one application is often no better suited to that application than a widely available, general-purpose implementation, and it may well be worse. For example, you may not like log4j, but it gets the job done. A homegrown logging system may have specific features that log4j lacks, but for most applications it is hard to argue that a full custom logging package is worth the cost of writing it from scratch instead of using an existing, general-purpose implementation. And yet many project teams end up writing their own logging, connection pooling, or thread scheduling packages, over and over again.
Deceptively simple
One of the reasons we would not consider writing our own XSL processor is that it would be a tremendous amount of work. These low-level framework services, however, look simple on the surface, so writing our own does not seem that hard. In practice, they are harder to get right than they first appear. The main reason these particular wheels keep getting reinvented is that the need for these facilities in a given application often starts out small, and then grows as you run into the same problems that countless other projects have already encountered. The argument usually goes like this: "We don't need a full-blown logging/scheduling/caching package, we just need something simple, so we'll write something that does just that, tailored to our specific needs." But often you quickly outgrow the simple facility you have written, and you are tempted to add a few more features, and then a few more, until you have written a full-blown infrastructure service. At that point, you are generally wedded to what you have written, whether it is better or not. You have already paid the full cost of building your own, so in addition to the actual cost of migrating to a general-purpose implementation, you would also have to overcome the "sunk cost" barrier.
The value of concurrency building blocks
Writing scheduling and concurrency infrastructure classes is definitely harder than it looks. The Java language provides a useful set of low-level synchronization primitives: wait(), notify(), and synchronized. However, the details of using these primitives are tricky, and there are many performance, deadlock, fairness, resource management, and thread-safety hazards to avoid. Concurrent code is hard to write and even harder to test, and even the experts sometimes get it wrong the first time. Doug Lea, author of Concurrent Programming in Java (see Resources), has written an excellent, free package of concurrency utilities, including locks, mutexes, queues, thread pools, lightweight tasks, efficient concurrent collections, atomic arithmetic operations, and other basic building blocks of concurrent applications. This package, generally referred to as util.concurrent (because the real package name is so long), will form the basis of the java.util.concurrent package in JDK 1.5, which is being standardized under Java Community Process JSR 166. In the meantime, util.concurrent is well tested and is used in many server applications, including the JBoss J2EE application server.
Filling a void
A useful set of high-level synchronization tools, such as mutexes, semaphores, barriers, and thread-safe collection classes, was omitted from the core Java class libraries. The Java language's concurrency primitives (synchronized, wait(), and notify()) are too low-level for the needs of most server applications. What if you need to try to acquire a lock, but give up if you have not obtained it within a certain period of time? Abandon an attempt to acquire a lock if the thread is interrupted? Create a lock that at most N threads can hold? Support multi-mode locking, such as concurrent reads with exclusive writes? Or acquire a lock in one method and release it in another? The built-in locking does not support any of these cases directly, but all of them can be built on the basic concurrency primitives that the Java language provides. Doing so, however, is tricky and easy to get wrong.
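To make two of these questions concrete, here is a minimal sketch of a timed, interruptible lock attempt and of a "lock" that a limited number of threads can hold. It is written against java.util.concurrent, the standardized successor that this article mentions, not against the util.concurrent classes themselves, and it uses modern Java syntax. The class name LockingSketch, its method names, and the permit count of 3 are assumptions made for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; all names here are illustrative only.
class LockingSketch {
    private final ReentrantLock lock = new ReentrantLock();
    // A "lock" that at most 3 threads can hold at the same time.
    private final Semaphore permits = new Semaphore(3);

    // Tries to acquire the lock, giving up after the timeout.
    // Returns true if the action ran, false if the lock was not obtained
    // in time or the thread was interrupted while waiting.
    boolean runWithTimeout(Runnable action, long timeoutMillis) {
        try {
            if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // timed out
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;                       // abandon the attempt
        }
        try {
            action.run();
            return true;
        } finally {
            lock.unlock(); // always release, even if the action throws
        }
    }

    // Runs the action only if one of the permits is free right now.
    boolean runIfPermitFree(Runnable action) {
        if (!permits.tryAcquire()) {
            return false;
        }
        try {
            action.run();
            return true;
        } finally {
            permits.release();
        }
    }
}
```

A read-write lock for the "concurrent reads with exclusive writes" case follows the same acquire/try/finally/release shape.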
Server application developers need simple facilities to enforce mutual exclusion, synchronize responses to events, communicate data across activities, and schedule tasks asynchronously. The low-level primitives that the Java language provides for these tasks are difficult to use and error-prone. The util.concurrent package aims to fill this void by providing a set of classes for locking, blocking queues, and task scheduling, which give you the ability to deal with common error cases and to bound the resources consumed by queued and executing tasks.
Scheduling asynchronous tasks
The most widely used classes in util.concurrent are those that deal with the scheduling of asynchronous events. In July's installment of this column, we looked at thread pools and work queues, and at how many Java applications use the "queue of Runnable" pattern to schedule small units of work.
It is very tempting to fork a background thread to execute a task by simply creating a new thread for the task:
new Thread(new Runnable() { ... }).start();
While this notation is nice and compact, it has two significant disadvantages. First, creating a new thread has a certain resource cost, so spawning many threads, each of which performs a short task and then exits, means that the JVM may spend more work and more resources creating and destroying threads than it spends on useful work. Even if the creation and teardown overhead were zero, this execution pattern has a second, more subtle disadvantage: how do you bound the resources used in executing tasks of a certain type? If a flood of requests suddenly arrives, what prevents you from spawning a thousand threads at once? Real-world server applications need to manage their resources more carefully than this. You need to limit the number of asynchronous tasks executing at once.
Thread pools solve both of these problems: they improve scheduling efficiency and bound resource utilization at the same time. While it is easy to write a work queue and a thread pool that executes Runnables in pool threads (the example code in July's column does just that), there is a lot more to writing an effective task scheduler than simply synchronizing access to a shared queue. A real-world task scheduler should deal with threads that die, kill excess pool threads so that they do not consume resources unnecessarily, manage the pool size dynamically based on load, and bound the number of queued tasks. The last item, bounding the number of queued tasks, is important for keeping server applications from crashing with out-of-memory errors when they become overloaded.
Bounding the task queue requires a policy decision: if the work queue overflows, what do you do with the overflow? Throw away the newest task? Throw away the oldest task? Block the submitting thread until space is available on the queue? Execute the new task in the submitting thread? There are a variety of viable overflow-management policies, each of which is appropriate in some situations and inappropriate in others.
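The effect of an overflow policy is easiest to see on a deliberately tiny pool. The sketch below uses java.util.concurrent.ThreadPoolExecutor (the JSR 166 class, not a util.concurrent class) with one worker thread and a queue that holds one task, so that a third submission overflows. The class name OverflowPolicySketch, its methods, and the outcome strings are assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; all names here are illustrative only.
class OverflowPolicySketch {
    // One worker thread and room for one queued task.
    static ThreadPoolExecutor boundedPool(RejectedExecutionHandler policy) {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1), policy);
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Submits three tasks while the worker is busy. Returns the name of the
    // thread that ran the overflowing task, "rejected" if the submission
    // threw, or "not run" if the task had not run by the time we returned.
    static String overflowOutcome(RejectedExecutionHandler policy) {
        ThreadPoolExecutor pool = boundedPool(policy);
        CountDownLatch release = new CountDownLatch(1);
        String[] outcome = {"not run"};
        try {
            pool.execute(() -> awaitQuietly(release)); // occupies the only worker
            pool.execute(() -> { });                   // fills the queue
            // The queue is full and no more threads are allowed: overflow.
            pool.execute(() -> outcome[0] = Thread.currentThread().getName());
        } catch (RejectedExecutionException e) {
            outcome[0] = "rejected";
        } finally {
            release.countDown(); // let the worker finish
            pool.shutdown();
        }
        return outcome[0];
    }
}
```

Passing `new ThreadPoolExecutor.CallerRunsPolicy()` makes the submitting thread run the overflowing task, `AbortPolicy` throws, and `DiscardPolicy` silently drops it; which of these is right depends on whether the application can afford to lose or delay work.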
Executor
util.concurrent defines an interface, Executor, for executing Runnables asynchronously, and it defines several implementations of Executor that offer different scheduling characteristics. Queuing a task to an executor is quite simple:
Executor executor = new QueuedExecutor();
...
Runnable runnable = ... ;
executor.execute(runnable);
The simplest implementation, ThreadedExecutor, creates a new thread for each Runnable. It provides no resource management, much like the new Thread(new Runnable() {}).start() idiom. But ThreadedExecutor has one significant advantage: by changing only the construction of your executor, you can move to a different execution model without having to crawl through your entire application source to find all the places where new threads are created. QueuedExecutor uses a single background thread to process all tasks, much like the event thread in AWT and Swing. QueuedExecutor has a nice property: tasks are executed in the order in which they were queued, and because they are all executed within a single thread, tasks do not necessarily need to synchronize all accesses to shared data.
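The single-thread, in-order behavior described for QueuedExecutor can be sketched with its java.util.concurrent counterpart, Executors.newSingleThreadExecutor(). This is an analogy under that assumption, not the util.concurrent API itself, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; all names here are illustrative only.
class SingleThreadOrderSketch {
    // Queues taskCount tasks and returns the order in which they ran.
    static List<Integer> runInOrder(int taskCount) {
        // Deliberately unsynchronized: only the single worker thread
        // touches this list while tasks are running.
        List<Integer> seen = new ArrayList<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            executor.execute(() -> seen.add(id));
        }
        executor.shutdown(); // accept no new tasks, finish the queued ones
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen;
    }
}
```

Swapping the factory call for a pooled executor would change the execution model without touching the submission loop, but the unsynchronized list would then no longer be safe.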
PooledExecutor is a sophisticated thread pool implementation. It not only schedules tasks across a pool of worker threads, but also allows flexible tuning of the pool size, provides thread lifecycle management, can bound the number of tasks on the work queue to prevent queued tasks from consuming all available memory, and offers a variety of shutdown and saturation policies (block, discard, throw, discard-oldest, and run-in-caller). All of the Executor implementations manage thread creation and teardown for you, including shutting down all threads when the executor is shut down, and they provide hooks into the thread creation process so that your application can manage thread instantiation if it wants to. This allows you to, for example, place all worker threads in a particular ThreadGroup or give them descriptive names.
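As an illustration of the thread-creation hook, the following sketch gives worker threads descriptive names. It uses java.util.concurrent.ThreadFactory from the standardized package rather than the util.concurrent hook, and the class name, the naming scheme, and the choice of daemon threads are assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative only.
class NamedThreadFactorySketch {
    // Creates threads named prefix-1, prefix-2, and so on.
    static ThreadFactory named(String prefix) {
        AtomicInteger counter = new AtomicInteger(1);
        return task -> {
            Thread thread = new Thread(task, prefix + "-" + counter.getAndIncrement());
            thread.setDaemon(true); // assumption: workers should not keep the JVM alive
            return thread;
        };
    }

    // Runs one task in a pool built with the factory and reports the worker's name.
    static String nameOfFirstWorker(String prefix) {
        ExecutorService pool = Executors.newFixedThreadPool(2, named(prefix));
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            return pool.submit(whoAmI).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Descriptive names of this kind make thread dumps and log output much easier to read when several pools are active.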
FutureResult
Sometimes you want to start a process asynchronously, in the hope that its result will be available when you need it later. The FutureResult utility class makes this easy. A FutureResult represents a task that may take some time to execute and that can execute in another thread, and the FutureResult object serves as a handle to that execution process. Through it, you can find out whether the task has completed, wait for it to complete, and retrieve its result. FutureResult can be combined with Executor: you can create a FutureResult and queue it to an executor, retaining a reference to the FutureResult. Listing 1 shows a simple example of FutureResult and Executor together, which starts the rendering of an image asynchronously and continues with other processing:
Listing 1. Using FutureResult with Executor
Executor executor = ...
ImageRenderer renderer = ...
FutureResult futureImage = new FutureResult();
Runnable command = futureImage.setter(new Callable() {
    public Object call() { return renderer.render(rawImage); }
});
// Start the rendering process
executor.execute(command);
// Do other things while the rendering executes
drawBorders();
drawCaption();
// Retrieve the future result, blocking if necessary
drawImage((Image) futureImage.get()); // use future
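For comparison, the same start-now, collect-later pattern can be sketched with the Future type from java.util.concurrent, the package that util.concurrent is feeding into. This is not the FutureResult API shown in Listing 1; the render method is a hypothetical stand-in for real image rendering, and the class name is an assumption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; all names here are illustrative only.
class FutureSketch {
    // Stand-in for an expensive computation such as rendering an image.
    static String render(String rawImage) {
        return "rendered:" + rawImage;
    }

    static String renderWhileDoingOtherWork(String rawImage) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> job = () -> render(rawImage);
            // Start the rendering process in the executor's thread.
            Future<String> futureImage = executor.submit(job);
            // ... other work (drawing borders, captions) would happen here ...
            // Retrieve the result, blocking if it is not ready yet.
            return futureImage.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("rendering failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```

In this form, submit() both creates the handle and queues the work, so there is no separate setter step.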
FutureResult and caching
You can also use FutureResult to improve the concurrency of load-on-demand caches. By placing a FutureResult in the cache, instead of the result of the computation itself, you can reduce the time for which you hold the write lock on the cache. While this will not speed up the first thread that places a given item in the cache, it will reduce the time for which that first thread blocks other threads from accessing the cache. It also makes the result available earlier to other threads, because they can retrieve the FutureResult from the cache. Listing 2 shows an example of using FutureResult for caching:
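Listing 2. Using FutureResult to improve caching
import java.lang.reflect.InvocationTargetException;
import java.util.HashMap;
import java.util.Map;
import EDU.oswego.cs.dl.util.concurrent.*;

public class FileCache {
    private Map cache = new HashMap();
    private Executor executor = new PooledExecutor();

    // execute() and FutureResult.get() can throw these checked exceptions
    public Object get(final String name)
            throws InterruptedException, InvocationTargetException {
        FutureResult result;
        synchronized (cache) {
            result = (FutureResult) cache.get(name);
            if (result == null) {
                result = new FutureResult();
                executor.execute(result.setter(new Callable() {
                    public Object call() { return loadFile(name); }
                }));
                // Store the handle, not the value, under the file name
                cache.put(name, result);
            }
        }
        // Wait for the result outside the synchronized block
        return result.get();
    }
}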
This approach allows the first thread to get in and out of the synchronized block quickly, and it allows other threads to obtain the result of the first thread's computation as quickly as the first thread does, with no chance that two threads will both try to compute the same object.
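The claim that an item is never computed twice can be checked with a small, self-contained variant of the cache. The sketch below swaps FutureResult for java.util.concurrent.Future so that it runs on a standard JDK; the class name, the load method, and the load counter are hypothetical additions for the demonstration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative only.
class FutureCacheSketch {
    private final Map<String, Future<String>> cache = new HashMap<>();
    private final ExecutorService executor = Executors.newFixedThreadPool(2);
    private final AtomicInteger loads = new AtomicInteger();

    // Stand-in for a slow load; counts how often it is really invoked.
    private String load(String name) {
        loads.incrementAndGet();
        return "contents of " + name;
    }

    String get(String name) {
        Future<String> result;
        synchronized (cache) {
            result = cache.get(name);
            if (result == null) {
                Callable<String> loader = () -> load(name);
                result = executor.submit(loader); // start loading, do not wait
                cache.put(name, result);          // publish the handle at once
            }
        }
        try {
            return result.get(); // wait outside the lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("load failed", e.getCause());
        }
    }

    int loadCount() {
        return loads.get();
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Because the handle is stored while the lock is held, a second caller asking for the same name always finds it and waits on the same computation instead of starting another.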
Conclusion
The util.concurrent package contains many useful classes, some of which you may recognize as better versions of classes you have already written, perhaps more than once. They are well-tested, high-performance implementations of many of the basic building blocks of multithreaded applications. util.concurrent is the starting point for JSR 166, which will produce a set of concurrency utilities that will become the java.util.concurrent package in JDK 1.5, but you do not have to wait until then to take advantage of it. In a future article, I will look at some of the custom synchronization classes in util.concurrent and explore some of the ways in which the util.concurrent and java.util.concurrent APIs differ.