The threading model may be the weakest part of the Java programming language. It is wholly inadequate for real programs of any complexity, and it is not object oriented at all. This article proposes significant changes and additions to the Java language to address these problems.
The Java language's threading model is one of the most difficult and least satisfying parts of the language. Although it is a good thing that the language supports threading at all, the syntax and class libraries that support it are too thin, and they suit only very small applications.
Most books on Java threading point out the flaws in the Java threading model and provide a band-aid class library to work around them. I call these libraries band-aids because the problems they solve should have been addressed by the syntax of the Java language itself. In the long run, a syntax-based solution will produce more efficient code than a class library, because the compiler and the Java virtual machine (JVM) can optimize the code together, and such optimizations are difficult or impossible for code in a class library.
In my book Taming Java Threads (Allen Holub; see Resources) and in this article, I go further and recommend changes to the Java programming language itself so that it can really solve these threading problems. The main difference between this article and the book is that I thought more about the problem while writing the article, so I have improved on the book's proposals. These suggestions are only tentative: they are my personal thoughts on the issues, and realizing them will require a lot of work and peer review. But it is a start. I intend to set up a dedicated working group to address these problems; if you are interested, please send e-mail to threading@holub.com, and I will notify you once things actually get started.
The proposals put forward here are very bold. Some people recommend subtle, minimal changes to the Java Language Specification (JLS) (see Resources) to fix JVM behavior that is currently ambiguous, but I want a more thorough improvement.
In practical terms, many of my proposals involve adding new keywords to the language. Although the requirement not to break existing code is a valid one, a language must be able to introduce new keywords if it is not to stagnate and become obsolete. So that the new keywords do not conflict with existing identifiers, I have deliberately used the dollar sign ($), which the JLS reserves by convention for mechanically generated code and which should therefore not appear in existing hand-written identifiers (for example, $task rather than task). A compiler command-line switch could enable variants of these keywords that omit the dollar sign.
The concept of a task
The fundamental problem with the Java threading model is that it is not object oriented at all. Object-oriented (OO) designers do not think in terms of threads; they think in terms of synchronous and asynchronous messages. A synchronous message is handled immediately, and the message handler does not return until processing is complete. An asynchronous message is processed in the background some time after it is received, and the message handler returns long before processing finishes. The Toolkit.getImage() method in the Java programming language is a good example of an asynchronous message: getImage() returns immediately, without waiting for the background thread to retrieve the whole image.
This is the object-oriented (OO) approach. As mentioned earlier, however, the Java threading model is not object oriented. A Java thread is really just a run() procedure that calls other procedures. There are no objects, no synchronous or asynchronous messages, and none of the other OO concepts here at all.
One solution, discussed in depth in my book, is to use an active object. An active object is an object that can receive asynchronous requests and that processes each request in the background some time after receiving it. In the Java programming language, a request can be encapsulated in an object. For example, you can pass an instance of a Runnable implementation to the active object, with the run() method encapsulating the work to be done. The active object places the Runnable on a queue and, when its turn comes, executes it on a background thread.
Asynchronous messages sent to an active object are effectively synchronized with one another, because a single service thread takes them off the queue and executes them one at a time. Using active objects therefore eliminates most of the synchronization problems found in a more procedural model.
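As a rough illustration of the queue-plus-service-thread idea, an active object can be sketched in today's Java as shown below. The names ActiveObject, ActiveObjectDemo, dispatch(), and close() are assumptions made for this sketch (the listing from the book is not reproduced here), and the shutdown sentinel is just one possible convention.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Minimal active object: requests are queued and run one at a time on one thread. */
class ActiveObject {
    // Unique sentinel request that tells the service thread to exit (illustrative convention).
    private static final Runnable SHUTDOWN = new Runnable() {
        @Override public void run() { }
    };

    private final BlockingQueue<Runnable> requests = new LinkedBlockingQueue<>();
    private final Thread service = new Thread(this::serviceLoop, "active-object");

    private ActiveObject() { }

    /** Factory method, so the service thread starts only after construction. */
    static ActiveObject create() {
        ActiveObject ao = new ActiveObject();
        ao.service.start();
        return ao;
    }

    /** Asynchronous request: returns at once; the work runs later on the service thread. */
    void dispatch(Runnable request) {
        requests.add(request);
    }

    /** Queues a shutdown request, then waits until every earlier request has run. */
    void close() {
        requests.add(SHUTDOWN);
        try {
            service.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void serviceLoop() {
        try {
            for (Runnable r = requests.take(); r != SHUTDOWN; r = requests.take()) {
                r.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class ActiveObjectDemo {
    // Deliberately unsynchronized: only the single service thread ever updates it.
    static int counter;

    static int demo() {
        counter = 0;
        ActiveObject ao = ActiveObject.create();
        for (int i = 0; i < 1000; i++) {
            ao.dispatch(() -> counter++);
        }
        ao.close();
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because every request runs on the same service thread, the counter needs no lock of its own, which is the sense in which the queued requests are serialized.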
In a sense, the entire Swing/AWT subsystem of the Java programming language is an active object. The only safe way to send a message to Swing's queue is to call a method such as SwingUtilities.invokeLater(), which posts a Runnable object to the Swing event queue; when its turn comes, the Swing event-handling thread processes it.
So my first proposal is to add the concept of a task to the Java programming language, building the active object into the language itself. (The task concept is borrowed from Intel's RMX operating system and the Ada programming language; most real-time operating systems support something similar.)
A task has a built-in active-object dispatcher and automatically manages all the mechanics of handling asynchronous messages.
Defining a task is essentially the same as defining a class, except that you add an asynchronous modifier to the task's methods to indicate that the active object's dispatcher should handle those methods in the background.
In the class-based version, all write requests are queued on the active object's input queue by a dispatch() call. Any exception that occurs while these asynchronous messages are processed in the background is handled by the Exception_handler object that is passed to the File_io_task constructor.
The main problem with this class-based approach is that it is too complicated: there is far too much clutter in the code for such a simple operation. With the $task and $asynchronous keywords added to the Java language, you could rewrite the previous code as follows:
Note that an asynchronous method does not specify a return value, because control returns to the caller immediately rather than when the requested operation completes; no meaningful return value exists at that point. With respect to derivation, the $task keyword behaves like class: a $task can implement interfaces, extend classes, and extend other tasks. Methods marked with the $asynchronous keyword are handled by the $task in the background. Other methods run synchronously, just as they would in a class.
The $task keyword can take an optional $error clause, as shown above, which specifies a default handler for any exception that the asynchronous methods do not catch themselves. I use $ to represent the exception object that was thrown. If you do not specify an $error clause, a reasonable error message (probably a stack trace) is printed.
Note that, to ensure thread safety, the arguments to an asynchronous method must be immutable. The runtime system should guarantee this immutability through appropriate semantics (a simple copy is usually not enough).
All task objects must also support a few pseudo-messages.
In addition to the usual modifiers (public and so on), the $task keyword should accept a $pooled(n) modifier, which causes the task to use a thread pool rather than a single thread to run asynchronous requests. The value n specifies the desired pool size; the pool can grow if necessary, but it should shrink back to its original size when the extra threads are no longer needed. The pseudo-field $pool_size returns the original n specified in $pooled(n).
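The sizing behavior described for $pooled(n) (keep n threads, grow under load, shrink back when idle) can be approximated with the standard ThreadPoolExecutor class. This is only a sketch under stated assumptions: the class name PooledTaskDemo, the pooled() helper, and the one-second idle timeout are illustrative choices, not part of the proposal.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PooledTaskDemo {
    /** Builds a pool that keeps n threads, grows under load, and shrinks back when idle. */
    static ThreadPoolExecutor pooled(int n) {
        return new ThreadPoolExecutor(
                n,                          // core size: plays the role of n in $pooled(n)
                Integer.MAX_VALUE,          // the pool may grow when all threads are busy
                1, TimeUnit.SECONDS,        // extra threads exit after one idle second (assumed value)
                new SynchronousQueue<>());  // hand each request directly to a thread
    }

    static int demo() {
        ThreadPoolExecutor pool = pooled(2);
        AtomicInteger handled = new AtomicInteger();
        for (int i = 0; i < 10; i++) {
            pool.execute(handled::incrementAndGet);   // each request may run on its own thread
        }
        pool.shutdown();                              // accept no new work, finish what is queued
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch, getCorePoolSize() on the returned executor plays roughly the role that the $pool_size pseudo-field plays in the proposal.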
In Chapter 8 of Taming Java Threads, I give a server-side socket handler as an example of a thread pool. It is a good example of a task that uses a thread pool. The basic idea is to create a separate object whose job is to monitor a server-side socket. Whenever a client connects to the server, the server-side object grabs a pre-created, sleeping thread from the pool and sets it to work serving that client connection. The socket server creates additional client-service threads if it needs them, but these extra threads are discarded when their connections close.
The Socket_server object uses a single background thread to handle the asynchronous listen() request, which encapsulates the socket's "accept" loop. As each client connects, listen() asks a Client_handler to process the request by calling handle(). Each handle() request executes on its own thread (because this is a $pooled task).
Note that each asynchronous message sent to a $pooled $task is actually handled on its own thread. Because a $pooled $task is typically used to implement an autonomous operation, the best way to avoid synchronization problems when accessing state variables is to have this, inside the $asynchronous method, refer to a private copy of the object. That is, when an asynchronous request is sent to a $pooled $task, a clone() operation is performed, and the method's this pointer refers to the clone. Threads can communicate with one another through synchronized access to static fields.
Improvements to synchronized
Although $task eliminates the need for synchronization in most cases, not every multithreaded system can be implemented with tasks, so the existing threading model needs to be improved as well. The synchronized keyword has the following shortcomings: you cannot specify a timeout; you cannot interrupt a thread that is waiting to acquire a lock; and you cannot safely acquire multiple locks. (Multiple locks should always be acquired in the same order.)
The solution to these problems is to extend the syntax of synchronized so that it accepts multiple arguments and a timeout specification (given in the brackets below). Here is the syntax I would like:
synchronized(x && y && z) acquires the locks on the x, y, and z objects.
synchronized(x || y || z) acquires the lock on the x, y, or z object.
synchronized((x && y) || z) combines the two previous forms in the obvious way.
synchronized(...)[1000] acquires the locks with a one-second timeout.
synchronized[1000] f() {...} acquires the lock on this on entry to f(), with a one-second timeout.
TimeoutException is a RuntimeException derivative that is thrown when the wait times out.
Timeouts are necessary, but they are not sufficient to make code robust. You also need to be able to terminate a wait for a lock from outside the waiting thread. So when interrupt() is sent to a thread that is waiting to acquire a lock, the wait should be broken by a SynchronizationException thrown in the waiting thread. This exception should derive from RuntimeException so that you do not have to handle it explicitly.
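The proposed syntax does not exist, but the two behaviors it asks for (a timeout on lock acquisition and a wait that interrupt() can break) can be illustrated with the standard library's ReentrantLock, whose tryLock(timeout, unit) method returns false on timeout and throws InterruptedException if the waiting thread is interrupted. The class and method names TimedLockDemo, runWithTimeout(), and timesOutWhenContended() are assumptions for this sketch, as is the 100 ms timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedLockDemo {
    /** Runs the action under the lock; returns false if the lock is not obtained in time. */
    static boolean runWithTimeout(ReentrantLock lock, long millis, Runnable action) {
        try {
            if (!lock.tryLock(millis, TimeUnit.MILLISECONDS)) {
                return false;                       // timed out waiting for the lock
            }
        } catch (InterruptedException e) {          // interrupt() broke the wait
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            action.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Another thread holds the lock, so a 100 ms attempt should time out. */
    static boolean timesOutWhenContended() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await();                    // keep the lock until told to let go
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        holder.start();
        boolean acquired = true;
        try {
            held.await();
            acquired = runWithTimeout(lock, 100, () -> { });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();                    // always let the holder finish
        }
        return !acquired;
    }

    public static void main(String[] args) {
        System.out.println(timesOutWhenContended());
    }
}
```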
The main problem with these proposed changes to the synchronized syntax is that they require changes at the byte-code level. The code currently implements synchronized with the enter-monitor and exit-monitor instructions (monitorenter and monitorexit). These instructions take no arguments, so the byte-code definition would have to be extended to support multiple lock requests. This change is no more drastic than the changes made to the Java virtual machine for Java 2, however, and it would be backward compatible with existing Java code.
Another problem that could be solved is the most common deadlock scenario, in which two threads each wait for the other to do something.
Imagine that one thread calls a() but is preempted after acquiring lock1 and before acquiring lock2. A second thread then runs, calls b(), and acquires lock2, but it cannot acquire lock1 because the first thread holds it, so it blocks. The first thread now wakes up and tries to acquire lock2, but it cannot, because the second thread holds it. The two threads are deadlocked.
With the multiple-lock syntax, the compiler (or virtual machine) can rearrange the order in which the locks are requested so that lock1 is always acquired first, which eliminates the deadlock.
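The reordering idea can be imitated by hand in current Java. The sketch below always locks two monitors in a fixed global order, chosen here (as an assumption of this sketch, not something the proposal specifies) by System.identityHashCode, with a tie-breaker lock for the rare case of equal hashes. The names OrderedLocks, withBoth(), and demo() are illustrative.

```java
class OrderedLocks {
    private static final Object TIE = new Object();   // used only when the hashes collide
    static int shared;                                 // guarded by holding both locks

    /** Runs the action while holding both monitors, always acquired in one global order. */
    static void withBoth(Object a, Object b, Runnable action) {
        int ha = System.identityHashCode(a);
        int hb = System.identityHashCode(b);
        if (ha < hb) {
            lockInOrder(a, b, action);
        } else if (ha > hb) {
            lockInOrder(b, a, action);
        } else {
            synchronized (TIE) {                       // serialize the ambiguous case
                lockInOrder(a, b, action);
            }
        }
    }

    private static void lockInOrder(Object first, Object second, Runnable action) {
        synchronized (first) {
            synchronized (second) {
                action.run();
            }
        }
    }

    /** Two threads name the locks in opposite orders, yet cannot deadlock. */
    static int demo() {
        shared = 0;
        Object lock1 = new Object();
        Object lock2 = new Object();
        Thread a = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) withBoth(lock1, lock2, () -> shared++);
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) withBoth(lock2, lock1, () -> shared++);
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```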
Reordering lock requests does not always work in multithreaded code, however, so some way of breaking deadlocks automatically is also needed. A simple strategy is to release the locks already held from time to time while waiting for the second lock.
If each thread that is waiting for a lock uses a different timeout value, the deadlock is broken and one of the threads gets to run. I propose the following syntax in place of the preceding code:
The synchronized statement waits forever, but it gives up the locks it has acquired from time to time in order to break a potential deadlock. Ideally, the timeout for each successive wait would differ from the previous one by a random amount.
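A hand-written approximation of this "wait forever, but periodically give up what you hold" behavior is sketched below with the standard Lock interface. The names BackoffLocks, withBoth(), and repeat(), the 1 to 19 ms timeout range, and the iteration count are assumptions for illustration only.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class BackoffLocks {
    static int shared;   // guarded by holding both locks

    /** Acquires both locks, releasing the first whenever the second is not free in time. */
    static void withBoth(Lock first, Lock second, Runnable action) throws InterruptedException {
        while (true) {
            first.lockInterruptibly();
            try {
                // A different random timeout on each attempt (range is an assumed value).
                long timeout = ThreadLocalRandom.current().nextLong(1, 20);
                if (second.tryLock(timeout, TimeUnit.MILLISECONDS)) {
                    try {
                        action.run();
                        return;
                    } finally {
                        second.unlock();
                    }
                }
            } finally {
                first.unlock();   // give up the first lock before retrying
            }
        }
    }

    private static void repeat(Lock first, Lock second) {
        try {
            for (int i = 0; i < 200; i++) {
                withBoth(first, second, () -> shared++);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** The two threads take the locks in opposite orders on purpose. */
    static int demo() {
        shared = 0;
        Lock lock1 = new ReentrantLock();
        Lock lock2 = new ReentrantLock();
        Thread a = new Thread(() -> repeat(lock1, lock2));
        Thread b = new Thread(() -> repeat(lock2, lock1));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The random timeout is what keeps the two threads from retrying in lockstep; with identical timeouts they could, in principle, keep colliding.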
Improvements to wait() and notify()
The wait()/notify() system also has problems: there is no way to tell whether wait() returned normally or because of a timeout; there is no way to implement a traditional condition variable that remains in a "signaled" state; and nested-monitor lockout happens far too easily.
The timeout-detection problem can be solved by redefining wait() to return a boolean rather than void. A return value of true indicates a normal return, and false indicates a timeout.
The notion of a state-based condition variable is important. While the variable is in the false state, waiting threads block until it enters the true state; when it does, every thread waiting for it is released automatically. (If the variable is already true, the wait() call does not block.)
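Both ideas (a wait that reports whether it timed out, and a condition that stays signaled once set) can be sketched on top of the existing wait()/notifyAll() primitives. The class name StateCondition and the methods setTrue(), setFalse(), and awaitTrue() are hypothetical names chosen for this sketch.

```java
import java.util.concurrent.TimeUnit;

/** A condition variable with state: once set to true it stays signaled until reset. */
class StateCondition {
    private boolean state;

    StateCondition(boolean initial) {
        state = initial;
    }

    synchronized void setTrue() {
        state = true;
        notifyAll();                 // release every waiting thread
    }

    synchronized void setFalse() {
        state = false;
    }

    /** Waits until the state is true; returns false if the timeout elapsed first. */
    synchronized boolean awaitTrue(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!state) {                                  // loop guards against spurious wakeups
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;                             // timed out
            }
            TimeUnit.NANOSECONDS.timedWait(this, remaining);
        }
        return true;                                      // normal return
    }
}

class StateConditionDemo {
    static boolean[] demo() {
        boolean[] results = new boolean[3];
        try {
            StateCondition c = new StateCondition(false);
            results[0] = c.awaitTrue(50);       // nobody signals, so this times out
            c.setTrue();
            results[1] = c.awaitTrue(50);       // already true, so this does not block
            c.setFalse();
            new Thread(c::setTrue).start();
            results[2] = c.awaitTrue(5_000);    // released by the other thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```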
The nested-monitor lockout problem is a nasty one, and I have no simple solution for it. Nested-monitor lockout is a form of deadlock that occurs when a thread that holds a lock suspends itself without first releasing that lock.
In this example, two locks are involved in every operation on the stack: one on the Stack object and one on the LinkedList object. Consider what happens when a thread tries to pop an empty stack. The thread acquires both locks and then calls wait(), which releases the lock on the Stack object but not the lock on the list. If a second thread now tries to push an object onto the stack, it hangs forever at the synchronized(list) statement and is never allowed to push the object. Because the first thread is waiting for the stack to become non-empty, the result is deadlock: the first thread can never return from wait(), because the lock it holds prevents the second thread from ever reaching the notify() statement.
In this example there are many obvious fixes: for example, rely on method-level synchronization alone. In the real world, though, the solution is usually not so simple.
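For the simple stack case, the method-level fix can be sketched as follows. Only the monitor of the stack object itself is ever held, so wait() releases everything a pushing thread needs. BlockingStack and BlockingStackDemo are names invented for this sketch; the listing discussed above is not reproduced here.

```java
import java.util.LinkedList;

/** Stack guarded by a single lock, so wait() releases everything a pusher needs. */
class BlockingStack<T> {
    private final LinkedList<T> list = new LinkedList<>();   // never locked separately

    synchronized void push(T item) {
        list.addLast(item);
        notifyAll();                   // wake any thread blocked in pop()
    }

    synchronized T pop() throws InterruptedException {
        while (list.isEmpty()) {
            wait();                    // releases the only lock involved
        }
        return list.removeLast();
    }
}

class BlockingStackDemo {
    static String demo() {
        BlockingStack<String> stack = new BlockingStack<>();
        String[] popped = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                popped[0] = stack.pop();           // may block until the push below
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        stack.push("hello");
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return popped[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```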
One possible approach is for wait() to release all the locks that the current thread holds, in the reverse order of acquisition, and then to reacquire them in the original order when the awaited condition is satisfied. I can imagine, however, that code written this way would be all but incomprehensible, so I do not think it is a truly viable approach. If you have a good solution, please send me e-mail.
I would also like to be able to wait for complex conditions to become true. For example:
where a, b, and c are arbitrary objects.
Modifying the Thread class
The ability to support both preemptive and cooperative threads is essential in some server applications, especially if you want to get maximum performance from the system. I think the Java programming language goes too far in simplifying the threading model, and it should support the Posix/Solaris notions of "green threads" and "lightweight processes" (discussed in the first chapter of Taming Java Threads). This means that some Java virtual machine implementations, such as those on NT, would have to simulate cooperative threads internally, and others would have to simulate preemptive threads. These extensions are easy to add to the Java virtual machine.
A Java Thread should always be preemptive. That is, a thread in the Java programming language should work like a Solaris lightweight process. The Runnable interface can be used to define a Solaris-style "green thread," which must explicitly yield control to other green threads running in the same lightweight process.
For example, passing a Runnable object to a Thread constructor would effectively create a green thread for the Runnable object and bind it to the lightweight process represented by the Thread object. This implementation is transparent to existing code, because its effect is the same as the current behavior.
By treating Runnable objects as green threads, you can extend the existing syntax of the Java programming language to support multiple green threads in a single lightweight process, simply by passing several Runnable objects to the Thread constructor. (The green threads cooperate with one another, but they can be preempted by green threads (Runnable objects) running on other lightweight processes (Thread objects).) For example, the following code would create a green thread for each Runnable object, and these green threads would share the lightweight process represented by the Thread object.
The existing practice of overriding Thread and implementing run() should continue to work, but it should map to a green thread bound to a lightweight process. (The default run() method in the Thread class would effectively create a second Runnable object internally.)
Collaboration between threads
More facilities should be added to the language to support communication between threads. Currently, the PipedInputStream and PipedOutputStream classes can be used for this purpose, but they are too weak for most applications. I propose the following additions to the Thread class:
Add a wait_for_start() method that blocks until the thread's run() method starts. (It is not a problem if the waiting thread is released just before run() is actually called.) In this way, one thread can create one or more worker threads and be sure that they are running before the creating thread continues.
Add $send(Object o) and Object = $receive() methods (to the Object class) that use an internal blocking queue to pass objects between threads. The blocking queue should be created automatically as a side effect of the first $send() call. The $send() call enqueues the object. The $receive() call blocks until an object is enqueued and then returns that object. Variants of these methods should support timeouts on both the enqueue and dequeue operations: $send(Object o, long timeout) and $receive(long timeout).
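The proposed facilities have rough library analogues. The sketch below models $send()/$receive() with a standard BlockingQueue (offer and poll with timeouts) and wait_for_start() with a CountDownLatch. The Mailbox and MailboxDemo classes, their method names, and the five-second timeouts are assumptions for illustration; the proposal itself would attach these operations to Thread and Object.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Library-level stand-in for the proposed $send/$receive and wait_for_start(). */
class Mailbox {
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private final CountDownLatch started = new CountDownLatch(1);

    /** Called by the receiving thread at the top of its run() method. */
    void markStarted() {
        started.countDown();
    }

    /** Blocks until markStarted() has been called; false means the wait timed out. */
    boolean waitForStart(long timeoutMillis) throws InterruptedException {
        return started.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Enqueues a message; false means the timeout elapsed first. */
    boolean send(Object message, long timeoutMillis) throws InterruptedException {
        return queue.offer(message, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Dequeues a message, blocking up to the timeout; null means it timed out. */
    Object receive(long timeoutMillis) throws InterruptedException {
        return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}

class MailboxDemo {
    static Object demo() {
        Mailbox toWorker = new Mailbox();
        Mailbox toMain = new Mailbox();
        Thread worker = new Thread(() -> {
            toWorker.markStarted();
            try {
                Object message = toWorker.receive(5_000);
                toMain.send("echo: " + message, 5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            if (!toWorker.waitForStart(5_000)) {
                return null;                       // worker never started
            }
            toWorker.send("ping", 5_000);
            return toMain.receive(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```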
Built-in support for reader/writer locks
The notion of a reader/writer lock should be built into the Java programming language. Reader/writer locks are discussed in detail in Taming Java Threads (and elsewhere). In a nutshell: a reader/writer lock lets multiple threads access an object simultaneously, but only one thread at a time can modify the object, and no modification can occur while any access is in progress.
For a given object, multiple threads should be allowed into $reading blocks only when no thread is in a $writing block. A thread that tries to enter a $writing block while reads are in progress blocks until the reading threads leave their $reading blocks. When another thread is in a $writing block, a thread that tries to enter either a $reading or a $writing block blocks until the writing thread leaves the $writing block.
If both readers and writers are waiting, readers go first by default. You can change this default, however, by using the $writer_priority attribute in the class definition.
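The reader/writer discipline described here can be illustrated with the standard ReentrantReadWriteLock class, where the read lock plays the role of a $reading block and the write lock the role of a $writing block. The Account and AccountDemo classes are hypothetical, and this sketch does not attempt to reproduce the reader-first or $writer_priority policies.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical class whose state is guarded by a reader/writer lock. */
class Account {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long balance;

    /** Many threads may read at once, as long as no thread is writing. */
    long balance() {
        lock.readLock().lock();
        try {
            return balance;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Only one thread may write, and only when no thread is reading. */
    void deposit(long amount) {
        lock.writeLock().lock();
        try {
            balance += amount;
        } finally {
            lock.writeLock().unlock();
        }
    }
}

class AccountDemo {
    static long demo() {
        Account account = new Account();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) {
                    account.deposit(1);
                    account.balance();        // concurrent reads interleave with writes
                }
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```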
Access to partially constructed objects should be illegal
The JLS currently permits access to partially constructed objects. For example, a thread created in a constructor can access the object under construction, even though that object is not yet fully constructed.
The thread that sets x to 1 can run concurrently with the thread that sets x to 0, so the value of x is unpredictable.
One solution to this problem is to prevent the run() method of a thread created in a constructor from executing until that constructor returns, even if the new thread has a higher priority than the thread that called new.
That is, the start() request must be deferred until the constructor returns.
Alternatively, the Java programming language could permit synchronized constructors. In other words, the following code (which is currently illegal) would work as expected:
I think the first approach is cleaner than the second, but it is harder to implement.
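Until the language defers start() automatically, the same effect can be had by convention: create the thread in the constructor but start it from a factory method after the constructor has returned. The sketch below uses a hypothetical Worker class with a field x, mirroring the x = 0 and x = 1 assignments discussed above.

```java
/** Hypothetical class whose thread is created in the constructor but started later. */
class Worker {
    private int x;
    private final Thread thread;

    private Worker() {
        thread = new Thread(() -> x = 1);   // created here, but NOT started here
        x = 0;                              // the constructor finishes its own initialization
    }

    /** Factory method: start() is issued only after the constructor has returned. */
    static Worker create() {
        Worker worker = new Worker();
        worker.thread.start();
        return worker;
    }

    /** Waits for the thread to finish and returns the final value of x. */
    int awaitX() {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(Worker.create().awaitX());
    }
}
```

Because the thread's assignment always happens after the constructor's, the final value of x no longer depends on scheduling.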
The volatile keyword should work as expected
The JLS requires that the ordering of volatile operations be preserved. Most Java virtual machines simply ignore this part of the specification, which they should not. The problem shows up on many multiprocessor hosts, and the JLS should have resolved it. If you are interested in this issue, Bill Pugh of the University of Maryland is working on it (see Resources).
Access issues
The lack of good access control makes threading harder than it needs to be. In most cases, a method does not have to be thread safe if you can guarantee that it is called only from within a synchronized subsystem. I recommend the following restrictions on the notion of access privilege in the Java programming language:
Require explicit use of the package keyword to specify package access. I think the very existence of default behavior is a flaw in any computer language, and I am mystified that this default exists at all (and that it defaults to "package" rather than "private"). Nowhere else does the Java programming language provide an equivalent default. Although requiring an explicit package specifier would break existing code, it would make the code much more readable and would eliminate a whole class of potential bugs (for example, when an access specifier is omitted by mistake rather than deliberately).
Reintroduce private protected, which works like the current protected but does not permit package-level access.
Permit the syntax private private to specify "implementation access": private to all outside objects, even objects of the same class as the current object. The only reference (implicit or explicit) permitted to the left of the "." is this.
Extend the syntax of public to grant access to specific classes. For example, the following code should permit objects of class Fred to call some_method(), while the method remains private with respect to objects of all other classes.
This proposal differs from the "friend" mechanism of C++, which grants a class access to all the private parts of another class. Here, I am proposing tightly controlled access to a limited set of methods. In this way, one class can define an interface to another class that is invisible to the rest of the system.
Require all fields to be private unless they reference truly immutable objects or are static final primitive types. Directly accessing the fields of a class violates two basic principles of OO design: abstraction and encapsulation. From a threading perspective, allowing direct access to a field just makes it easier to access that field without synchronization.
Add the $property keyword. Objects tagged with this keyword are accessible to a "bean box" application that uses the introspection APIs defined in the Class class, but otherwise they behave like private private. The $property attribute should apply to both fields and methods, so that existing JavaBean getter/setter methods can easily be defined as properties.
Immutability
Because access to immutable objects does not require synchronization, the notion of immutability (an object's value cannot change after it is created) is invaluable in multithreaded code. In the Java programming language, immutability is not enforced strictly enough, for two reasons. First, an immutable object can be accessed before it is fully constructed, and this access can yield incorrect values for some fields. Second, the definition of immutable (a class all of whose fields are final) is too loose: an object addressed by a final reference can still change state, even though the reference itself cannot change.
The first problem can be solved by not allowing a thread to begin executing in a constructor (or by deferring the start request until the constructor returns).
The second problem can be solved by restricting final references to point at immutable objects. That is, an object is truly immutable only if all of its fields are final and all of the fields of every object it references are final as well. So as not to break existing code, the compiler could enforce this stricter definition selectively: a class would be treated as immutable only when it is explicitly marked as immutable.
With the $immutable modifier present, the final modifier on field definitions becomes optional.
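The looseness of final can be shown in a few lines of current Java. In the sketch below, LeakyRoster has a final field that still changes underneath it, while Roster approximates the stricter, deep notion of immutability by hand with a defensive copy and an unmodifiable view. All three class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** A final reference to a mutable target: the list can still change after construction. */
class LeakyRoster {
    final List<String> names;

    LeakyRoster(List<String> names) {
        this.names = names;            // shares the caller's mutable list
    }
}

/** Deeply immutable by hand: final class, final field, defensive copy, unmodifiable view. */
final class Roster {
    private final List<String> names;

    Roster(List<String> names) {
        this.names = Collections.unmodifiableList(new ArrayList<>(names));
    }

    List<String> names() {
        return names;
    }
}

class RosterDemo {
    /** Returns the sizes seen through each class after the source list is mutated. */
    static int[] demo() {
        List<String> source = new ArrayList<>();
        source.add("ann");
        LeakyRoster leaky = new LeakyRoster(source);
        Roster safe = new Roster(source);
        source.add("bob");             // mutate the original list after construction
        return new int[] { leaky.names.size(), safe.names().size() };
    }

    public static void main(String[] args) {
        int[] sizes = demo();
        System.out.println(sizes[0] + " " + sizes[1]);
    }
}
```

The element type here is String, which is itself immutable; with mutable elements the defensive copy alone would not be enough.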
Finally, a bug in the Java compiler makes it impossible to create immutable objects reliably when inner classes are used.
The error message appears even though the blank final is initialized in every constructor. This bug has been in the compiler since inner classes were introduced in version 1.1, and it persists in the current version (three years later). It is time to fix it.
Instance-level access to class-level fields
In addition to access privileges, there is the problem that both class-level (static) methods and instance (non-static) methods can directly access class-level (static) fields. This access is very dangerous, because synchronizing an instance method does not acquire the class-level lock, so a synchronized static method and a synchronized instance method can still access the class's fields at the same time. The obvious fix is to require that instance methods access non-immutable static fields only through static accessor methods. Of course, this requirement needs both compile-time and run-time checks.
Because f() and g() can run in parallel, they can change the value of x at the same time (with undefined results). Remember that two locks are involved: the synchronized static method acquires the lock that belongs to the Class object, and the synchronized non-static method acquires the lock that belongs to an instance of the class.
Alternatively, the compiler should require the use of a reader/writer lock:
Another approach, and the ideal one, is for the compiler to synchronize access to non-immutable static fields automatically with a reader/writer lock, so that the programmer would not have to worry about the problem.
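The static-accessor remedy discussed in this section can be sketched in current Java as follows. Both f() (static) and g() (a synchronized instance method) update the static field only through a static synchronized accessor, so every update takes the class-level lock. If g() instead executed x++ directly, it would hold only the instance lock and could race with f(). The Counter class and its method names are illustrative assumptions.

```java
class Counter {
    private static int x;

    /** Class-level accessor: every update holds the lock on Counter.class. */
    private static synchronized void addToX(int delta) {
        x += delta;
    }

    static synchronized int x() {
        return x;
    }

    static synchronized void reset() {
        x = 0;
    }

    /** Static method: updates x through the class-level accessor. */
    static void f() {
        addToX(1);
    }

    /** Instance method: holds the instance lock, then also takes the class lock. */
    synchronized void g() {
        addToX(1);
    }

    static int demo() {
        reset();
        Counter instance = new Counter();
        Thread a = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) f();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) instance.g();
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return x();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```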
Abrupt shutdown of daemon threads
When all non-daemon threads terminate, daemon (background) threads are shut down abruptly. If a daemon thread has created some global resource (such as a database connection or a temporary file), that resource is not closed or deleted when the daemon thread is shut down, and this causes problems.
To address this, I recommend a rule that the Java virtual machine does not shut down the application while any non-daemon thread is running, or while any daemon thread is executing a synchronized method or synchronized block of code.
A daemon thread can be shut down as soon as it leaves the synchronized block or synchronized method.
Bring back the stop(), suspend(), and resume() methods
It may not be feasible for practical reasons, but I would like stop() not to be deprecated (in both Thread and ThreadGroup). I would, however, change the semantics of stop() so that calling it does not break your code. The problem with stop(), remember, is that it releases all of the thread's locks when the thread terminates, potentially leaving the objects the thread was working on in an unstable (partially modified) state. Because the stopped thread has released all its locks on these objects, other threads can nonetheless access them.
To solve this problem, the behavior of stop() could be redefined so that the thread terminates immediately only if it holds no locks. If it does hold locks, I would like the thread to terminate as soon as it releases the last one. This behavior could be implemented with a mechanism similar to throwing an exception. The thread being stopped would have a flag set, and it would test that flag as soon as it exits all synchronized blocks. If the flag is set, an implicit exception is thrown, but this exception should not be catchable and should produce no output when the thread terminates. Note that Microsoft's NT operating system does not handle abrupt, externally imposed stops well. (It does not notify dynamic-link libraries of the stop, so system-wide resource leaks can occur.) That is why I recommend an exception-like approach that simply causes run() to return.
The practical problem with this exception-like approach is that code must be inserted to test the "stopped" flag after every synchronized block, and this extra code would slow the program and increase its size. Another approach that comes to mind is to have stop() implement a "lazy" stop, in which the thread terminates the next time it calls wait() or yield(). I would also add isStopped() and stopped() methods to Thread (they would work like isInterrupted() and interrupted(), but would detect the "stop-requested" state). This approach is not as general as the first, but it is workable and adds no overhead.
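The flag-testing scheme can be written out by hand in current Java, which also shows why it is tedious to do without language support. In this sketch the worker tests a volatile "stop-requested" flag only after it has left its synchronized block, so it never terminates while holding a lock or with a half-finished update. StoppableWorker, requestStop(), and the other names are assumptions for illustration, not the proposed Thread methods.

```java
/** Worker that honors a stop request only when it holds no lock. */
class StoppableWorker implements Runnable {
    private final Object lock = new Object();
    private volatile boolean stopRequested;    // the "stop-requested" flag
    private long completedUpdates;             // guarded by lock

    void requestStop() {
        stopRequested = true;
    }

    boolean isStopRequested() {
        return stopRequested;
    }

    long completedUpdates() {
        synchronized (lock) {
            return completedUpdates;
        }
    }

    @Override
    public void run() {
        while (true) {
            synchronized (lock) {
                completedUpdates++;            // each update always finishes under the lock
            }
            if (stopRequested) {
                return;                        // tested only after leaving the synchronized block
            }
            Thread.yield();
        }
    }
}

class StoppableWorkerDemo {
    static boolean demo() {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.requestStop();
        try {
            thread.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // The thread has exited, and at least one complete update was made.
        return !thread.isAlive() && worker.completedUpdates() >= 1;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```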
The suspend() and resume() methods should be put back into the Java programming language. They are useful, and I do not like being treated like a kindergartner. It makes no sense to remove them merely because they are potentially dangerous (a thread could be holding a lock when it is suspended). Let me decide for myself whether to use them. If the receiving thread holds locks, Sun could make the call to suspend() throw a run-time exception or, better yet, defer the actual suspension until the thread releases all its locks.
Blocking I/O should work correctly
You should be able to interrupt any blocking operation, not just wait() and sleep(). I discuss this issue in the sockets section of Chapter 2 of Taming Java Threads. At present, the only way to interrupt a blocked socket I/O operation is to close the socket, and there is no way at all to interrupt a blocked file I/O operation. For example, once a read request is issued and the thread blocks, it stays blocked until it actually reads something. Even closing the file handle does not break the thread out of the read.
Your program should also be able to time out of blocking I/O operations. All objects on which blocking operations can occur (such as InputStream objects) should support a method for setting such a timeout.
This is the equivalent of the setSoTimeout(time) method of the Socket class. Similarly, you should be able to pass a timeout as an argument to the blocking call itself.
The ThreadGroup class
ThreadGroup should implement all the methods of Thread that can change a thread's state. I particularly want it to implement the join() method so that I can wait for all the threads in the group to terminate.
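A group-wide join can be approximated today with the existing ThreadGroup.activeCount() and ThreadGroup.enumerate() methods. The helper below is a sketch: ThreadGroups, joinAll(), and demo() are hypothetical names, and the helper sees only threads that have already been started, since unstarted threads are not enumerated.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadGroups {
    /** Waits until every live thread in the group (other than the caller) has terminated. */
    static void joinAll(ThreadGroup group) throws InterruptedException {
        while (true) {
            // activeCount() is only an estimate, so leave some slack and re-check in a loop.
            Thread[] threads = new Thread[group.activeCount() + 8];
            int count = group.enumerate(threads);
            boolean joinedAny = false;
            for (int i = 0; i < count; i++) {
                if (threads[i] != Thread.currentThread()) {
                    threads[i].join();
                    joinedAny = true;
                }
            }
            if (!joinedAny) {
                return;                 // no other live threads remain in the group
            }
        }
    }

    static int demo() {
        ThreadGroup group = new ThreadGroup("workers");
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 3; i++) {
            new Thread(group, () -> {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.incrementAndGet();
            }).start();
        }
        try {
            joinAll(group);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```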
Summary
Those are my proposals. As the title says, if I were king... (eh). I hope that these changes (or their equivalents) will eventually find their way into the Java language. I really do think that Java is a great programming language, but I also think that its threading model was not well designed, which is a pity. The Java programming language is still evolving, however, so there is hope for improvement.