Reprinted from http://www.cnblogs.com/fanzhidongyzby/p/4098546.html
Server-side programming often requires building a high-performance IO model. There are four common IO models:
(1) Synchronous blocking IO (blocking IO): the traditional IO model.
(2) Synchronous non-blocking IO (non-blocking IO): sockets are created in blocking mode by default; non-blocking IO requires the socket to be set to non-blocking. Note that the NIO mentioned here is not the Java NIO (New IO) library.
(3) IO multiplexing (IO multiplexing): the classic reactor design pattern, sometimes called asynchronous blocking IO; Java's Selector and epoll on Linux belong to this model.
(4) Asynchronous IO (asynchronous IO): the classic proactor design pattern, also known as asynchronous non-blocking IO.
Synchronous and asynchronous describe how a user thread interacts with the kernel: synchronous means that after initiating an IO request, the user thread must wait for, or poll, the kernel IO operation until it completes; asynchronous means that the user thread continues executing after initiating the IO request, and when the kernel IO operation completes, the kernel notifies the user thread or calls a callback function the user thread registered.
Blocking and non-blocking describe how the user thread invokes a kernel IO operation: blocking means the IO call does not return to user space until the operation has fully completed; non-blocking means the IO call returns to the user immediately, without waiting for the operation to finish.
In addition, the signal-driven IO (signal-driven IO) model described in Richard Stevens's UNIX Network Programming, Volume 1 is not covered in this article, because it is rarely used in practice. Next, we analyze the implementation principles of the four common IO models in detail. To simplify the description, we use the IO read operation as the example.
I. Synchronous blocking IO
The synchronous blocking IO model is the simplest IO model: the user thread is blocked while the kernel performs the IO operation.
Figure 1 Synchronous blocking IO
As shown in Figure 1, the user thread initiates an IO read operation through the read system call, transferring control from user space to kernel space. The kernel waits until a packet arrives, then copies the received data into user space, completing the read operation.
In pseudo-code, a user thread using the synchronous blocking IO model looks like this:
{
    read(socket, buffer);
    process(buffer);
}
That is, the user thread waits until read has copied the data from the socket into buffer, and only then processes the received data. Throughout the IO request, the user thread is blocked: it can do nothing else while the request is in flight, so CPU utilization is poor.
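The blocking read/process flow above can be made concrete in Python. This is a minimal sketch under one assumption: a local socketpair stands in for a real network connection, and `process` is a hypothetical stand-in for whatever work follows the read.

```python
import socket

# A local socketpair stands in for a network connection (demonstration only).
peer, sock = socket.socketpair()
peer.sendall(b"hello")      # the remote side sends a packet

buffer = sock.recv(1024)    # blocks until data arrives, then returns it

def process(buf):           # hypothetical stand-in for the process(buffer) step
    return buf.upper()

result = process(buffer)
peer.close()
sock.close()
```

Had the peer not sent anything, `recv` would simply block the calling thread, which is exactly the behavior the model describes.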
II. Synchronous non-blocking IO
Synchronous non-blocking IO builds on synchronous blocking IO by setting the socket to non-blocking. This lets the user thread return immediately after initiating an IO request.
Figure 2 Synchronous non-blocking IO
As shown in Figure 2, because the socket is non-blocking, the user thread returns as soon as it initiates the IO request. However, no data has been read, so the user thread must repeatedly reissue the IO request until the data arrives and is actually read, before it can continue.
In pseudo-code, a user thread using the synchronous non-blocking IO model looks like this:
{
    while (read(socket, buffer) != SUCCESS)
        ;
    process(buffer);
}
That is, the user thread repeatedly calls read, attempting to read data from the socket, and processes the received data only once a read succeeds. Although each individual IO request returns immediately, the thread must poll continuously while waiting for data, issuing request after request, which wastes a great deal of CPU. This model is rarely used on its own, but non-blocking IO is a building block of the other IO models.
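The polling loop above can be sketched in Python, again using a socketpair in place of a network connection (an assumption for demonstration). On a non-blocking socket, `recv` raises `BlockingIOError` (the EAGAIN case) instead of blocking when no data is available:

```python
import socket

peer, sock = socket.socketpair()
sock.setblocking(False)     # equivalent of setting the socket to non-blocking

# First attempt: nothing has been sent yet, so the call fails immediately
# instead of blocking -- this is the "read != SUCCESS" branch.
got_eagain = False
try:
    sock.recv(1024)
except BlockingIOError:
    got_eagain = True

peer.sendall(b"data")       # now the peer sends something

# Poll until the read succeeds, as in the while loop above.
while True:
    try:
        buffer = sock.recv(1024)
        break               # read succeeded; go on to process the data
    except BlockingIOError:
        pass                # no data yet, keep polling
peer.close()
sock.close()
```

The busy loop is deliberate here: it demonstrates exactly the CPU waste the text describes.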
III. IO multiplexing
The IO multiplexing model is built on the select function provided by the kernel; using select avoids the polling wait of the synchronous non-blocking IO model.
Figure 3 The select function
As shown in Figure 3, the user first adds the sockets that need IO operations to select, then blocks waiting for the select system call to return. When data arrives, the corresponding socket is activated and select returns. The user thread then formally issues the read request, reads the data, and continues execution.
From the perspective of a single IO request, this flow is not much different from the synchronous blocking model; in fact it adds extra work, since the socket must be monitored and select must be called. The big advantage of select, however, is that it lets the user handle multiple socket IO requests simultaneously in a single thread: the user registers multiple sockets, then repeatedly calls select to read whichever sockets have been activated, processing multiple IO requests in the same thread. Achieving the same goal in the synchronous blocking model requires multithreading.
In pseudo-code, a user thread using the select function looks like this:
{
    select(socket);
    while (1) {
        sockets = select();
        for (socket in sockets) {
            if (can_read(socket)) {
                read(socket, buffer);
                process(buffer);
            }
        }
    }
}
Here the socket is added to the select monitor before the while loop; inside the loop, select is called to obtain the activated sockets, and once a socket is readable, read is called to fetch the data from it.
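The select loop above can be sketched in Python with the standard `select` module. One thread watches two sockets and reads only from those select reports as readable; the socketpairs stand in for network connections (an assumption for demonstration):

```python
import select
import socket

peer1, sock1 = socket.socketpair()
peer2, sock2 = socket.socketpair()
peer2.sendall(b"ping")      # only the second connection has data

# select blocks until at least one registered socket is activated.
readable, _, _ = select.select([sock1, sock2], [], [])

received = {}
for sock in readable:       # only activated sockets are returned
    received[sock] = sock.recv(1024)

for s in (peer1, sock1, peer2, sock2):
    s.close()
```

Only `sock2` comes back readable, so a single thread services exactly the connections that have work, without polling each one.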
The advantages of select do not end there. Although the approach above processes multiple IO requests within a single thread, each individual request can still block (on the select call), and its average latency is even longer than in the synchronous blocking IO model. CPU utilization improves further if the user thread merely registers the sockets or IO requests it is interested in, goes off to do other work, and handles the data only when it arrives.
The IO multiplexing model implements this mechanism using the reactor design pattern.
Figure 4 Reactor design pattern
As shown in Figure 4, the EventHandler abstract class represents an IO event handler: it holds an IO file handle, handle (obtained via get_handle), and provides handle_event for operations on that handle (read, write, and so on). Subclasses of EventHandler customize the event handler's behavior. The Reactor class manages EventHandlers (registration, removal, and so on) and runs the event loop in handle_events, which repeatedly calls the synchronous event demultiplexer (typically the kernel's select). Select blocks until a file handle is activated (becomes readable or writable), then returns, and handle_events calls the handle_event of the event handler associated with that file handle.
Figure 5 IO Multiplexing
As shown in Figure 5, through the Reactor, the user thread delegates the polling of IO status to the handle_events event loop. After registering its event handler, the user thread can continue with other work (asynchronously), while the Reactor thread calls the kernel's select function to check socket status. When a socket is activated, the Reactor notifies the corresponding user thread (or executes its callback function) to run handle_event, which reads and processes the data. Because the select function blocks, the IO multiplexing model is also known as the asynchronous blocking IO model. Note that the blocking here refers to the thread blocking in the select call, not on the socket. In practice, sockets are usually set to non-blocking when using the IO multiplexing model, but this makes no difference: by the time the user thread issues the IO request the data has already arrived, so the user thread will not block.
In pseudo-code, a user thread using the IO multiplexing model looks like this:
void UserEventHandler::handle_event() {
    if (can_read(socket)) {
        read(socket, buffer);
        process(buffer);
    }
}

{
    Reactor.register(new UserEventHandler(socket));
}
The user overrides EventHandler's handle_event function to read and process the data, and the user thread simply registers its EventHandler with the Reactor. The pseudo-code for the Reactor's handle_events event loop is roughly as follows.
Reactor::handle_events() {
    while (1) {
        sockets = select();
        for (socket in sockets) {
            get_event_handler(socket).handle_event();
        }
    }
}
The event loop repeatedly calls select to obtain the activated sockets, then invokes the handle_event function of the EventHandler associated with each socket.
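A toy reactor following the pseudocode above can be written in a few lines of Python. This is an illustrative sketch, not a production event loop: the names mirror the pseudocode, a socketpair stands in for a network connection, and the `max_events` parameter exists only so the demonstration loop terminates.

```python
import select
import socket

class Reactor:
    def __init__(self):
        self._handlers = {}

    def register(self, sock, handler):
        # Associate a socket with its event handler.
        self._handlers[sock] = handler

    def handle_events(self, max_events=1):
        handled = 0
        while handled < max_events:          # a real reactor loops forever
            readable, _, _ = select.select(list(self._handlers), [], [])
            for sock in readable:
                self._handlers[sock](sock)   # handle_event for this socket
                handled += 1

received = []

def user_event_handler(sock):   # plays the role of UserEventHandler.handle_event
    received.append(sock.recv(1024))

peer, sock = socket.socketpair()
reactor = Reactor()
reactor.register(sock, user_event_handler)
peer.sendall(b"event")          # activating the socket wakes the event loop
reactor.handle_events()
peer.close()
sock.close()
```

Note how the registering code never touches select itself: polling has been moved entirely into the reactor, as the text describes.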
IO multiplexing is the most commonly used IO model, but its asynchrony is not "thorough", because it relies on a select system call that blocks the thread. For that reason IO multiplexing can only be called asynchronous blocking IO, not true asynchronous IO.
IV. Asynchronous IO
"Real" asynchronous IO requires stronger support from the operating system. In the IO multiplexing model, the event loop notifies the user thread of the status of the file handle, and the user thread reads the data and processes the data itself. In the asynchronous IO model, when the user thread receives the notification, the data has been read by the kernel and placed in the buffer area specified by the user thread, and the kernel notifies the user thread to use it directly after the IO is completed.
The asynchronous IO model implements this mechanism using the proactor design pattern.
Figure 6 Proactor Design pattern
As shown in Figure 6, the proactor pattern is structurally similar to the reactor pattern, but the two differ greatly in how the user (client) works with them. In reactor mode, the user thread registers with the Reactor an event handler for the events it is interested in, and the Reactor invokes that handler when the event fires. In proactor mode, the user thread registers an AsynchronousOperation (read, write, and so on), a Proactor, and a CompletionHandler for operation completion with the AsynchronousOperationProcessor. The AsynchronousOperationProcessor uses the facade pattern to expose a set of asynchronous operation APIs (read, write, and so on); when the user thread calls one of these APIs, it continues executing its own task, while the AsynchronousOperationProcessor starts a separate kernel thread to perform the operation, achieving true asynchrony. When the asynchronous IO operation completes, the AsynchronousOperationProcessor retrieves the Proactor and CompletionHandler that the user thread registered with the AsynchronousOperation, and forwards the CompletionHandler, together with the result data, to the Proactor. The Proactor is responsible for calling back each asynchronous operation's completion handler, handle_event. Although in the proactor pattern each asynchronous operation can bind to its own Proactor object, operating systems typically implement the Proactor as a singleton, to centralize the dispatch of operation-completion events.
Figure 7 Asynchronous IO
As shown in Figure 7, in the asynchronous IO model the user thread directly issues a read request through the asynchronous IO API provided by the kernel, returns immediately after issuing it, and continues executing its own code. At this point the user thread has already registered the invoked AsynchronousOperation and its CompletionHandler with the kernel, and the operating system starts a separate kernel thread to handle the IO operation. When the requested data arrives, the kernel reads the data from the socket and writes it into the user-specified buffer. Finally, the kernel passes the read data and the user thread's registered CompletionHandler to the kernel-internal Proactor, which notifies the user thread that the IO has completed (typically by invoking the completion event handler the user thread registered), finishing the asynchronous IO.
In pseudo-code, a user thread using the asynchronous IO model looks like this:
void UserCompletionHandler::handle_event(buffer) {
    process(buffer);
}

{
    aio_read(socket, new UserCompletionHandler);
}
The user overrides CompletionHandler's handle_event function to process the data; its buffer parameter holds the data that the Proactor has already prepared. The user thread then simply calls the asynchronous IO API provided by the kernel, registering the overridden CompletionHandler.
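The completion-handler style above can be sketched with Python's asyncio. One caveat, hedged up front: on most platforms asyncio's default event loop is readiness-based, so it simulates asynchronous IO on top of IO multiplexing (exactly the simulation described later in this article) rather than using kernel proactor support. The programming model, however, matches the pseudocode: initiate the read, and a completion handler receives the already-filled buffer. The socketpair and handler names are assumptions for demonstration.

```python
import asyncio
import socket

results = []

def completion_handler(buffer):   # plays the role of UserCompletionHandler
    results.append(buffer)

async def main():
    peer, sock = socket.socketpair()
    sock.setblocking(False)       # required by loop.sock_recv
    loop = asyncio.get_running_loop()
    peer.sendall(b"done")         # the remote side sends data

    # The analogue of aio_read: the coroutine suspends (the thread is free
    # to run other tasks) until the buffer has been filled.
    buffer = await loop.sock_recv(sock, 1024)
    completion_handler(buffer)    # invoked with the completed result

    peer.close()
    sock.close()

asyncio.run(main())
```

While the coroutine is suspended at `await`, the event loop can service any number of other sockets, which is what makes the style attractive for servers.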
Compared with the IO multiplexing model, asynchronous IO is not widely used; many high-performance concurrent server programs meet their requirements with an IO multiplexing model plus a multithreaded task-processing architecture. Moreover, current operating system support for asynchronous IO is not especially mature, so it is more common to simulate asynchronous IO with the IO multiplexing model (on an IO event, rather than notifying the user thread directly, the data is first read or written into the user-specified buffer). Java has supported asynchronous IO since Java 7; interested readers can try it out.
This article has briefly described the structure and principles of the four high-performance IO models at three levels: basic concepts, workflow, and pseudo-code, and clarified the easily confused concepts of synchronous, asynchronous, blocking, and non-blocking. With an understanding of these high-performance IO models, we can choose the model that best fits the actual business characteristics when developing server-side programs, and thereby improve service quality. I hope this article is of some help to you.