What is the difference between synchronous (synchronous) IO and asynchronous (asynchronous) Io, what is blocking (blocking) IO and non-blocking (non-blocking) IO respectively? The problem is that different people may give different answers, such as wikis, that asynchronous IO and non-blocking io are a thing. This is because different people have different backgrounds, and the context is not the same when discussing this issue. Therefore, in order to better answer this question, I first limit the context of this article.
This article discusses the background of network IO in a Linux environment.
The most important reference in this document is Richard Stevens'sUNIX? Network Programming Volume 1, third edition:the Sockets Networking ", 6.2 section"I/O Models ", Stevens in this section detailing the various I o The characteristics and differences, if the English is good enough, it is recommended to read directly. The style of Stevens is famous, so don't worry about it. The flowchart in this paper is also intercepted from the reference literature.
Stevens compared five IO Model in the article:
Blocking IO
nonblocking IO
IO multiplexing
Signal Driven IO
Asynchronous IO
Since signal driven IO is not commonly used in practice, I only refer to the remaining four IO Model.
Again, the objects and steps involved in the IO occur.
For a network IO (here we read for example), it involves two system objects, one that calls the IO process (or thread), and the other is the system kernel (kernel). When a read operation occurs, it goes through two stages:
1 Waiting for data preparation (waiting for the
2 copying data from the kernel to the process (Copying the data from the kernel to the)
It is important to remember these two points because the difference between these IO model is that there are different situations in both phases.
Blocking IO
In Linux, all sockets are blocking by default, and a typical read operation flow is probably this:
When the user process invokes the RECVFROM system call, Kernel begins the first phase of IO: Preparing the data. For network IO, there are times when the data has not arrived at the beginning (for example, a full UDP packet has not been received), and kernel waits for enough data to arrive. On this side of the user process, the entire process is blocked. When kernel waits until the data is ready, it copies the data from the kernel to the user's memory, and then kernel returns the result, and the user process removes the block state and re-runs it.
Therefore, the blocking IO is characterized by block in both phases of IO execution.
Non-blocking IO
Under Linux, you can make it non-blocking by setting the socket. When you perform a read operation on a non-blocking socket, the process looks like this:
As you can see, when the user process issues a read operation, if the data in kernel is not ready, it does not block the user process, but returns an error immediately. From the user process point of view, it initiates a read operation and does not need to wait, but immediately gets a result. When the user process determines that the result is an error, it knows that the data is not ready, so it can send the read operation again. Once the data in the kernel is ready and again receives the system call of the user process, it immediately copies the data to the user's memory and then returns.
Therefore, the user process is in fact need to constantly actively ask kernel data well no.
IO multiplexing
The word IO multiplexing may be a bit unfamiliar, but if I say select,epoll, I'll probably get it. Some places also call this IO mode for event driven IO. As we all know, the benefit of Select/epoll is that a single process can simultaneously handle multiple network connections of IO. The basic principle of the select/epoll is that the function will constantly poll all sockets that are responsible, and when a socket has data arrives, notifies the user of the process. It's process
When the user process invokes select, the entire process is blocked, and at the same time, kernel "monitors" all select-responsible sockets, and when the data in any one socket is ready, select returns. This time the user process then invokes the read operation, copying the data from the kernel to the user process.
This figure is not much different from the blocking IO diagram, in fact, it's even worse. Because two system calls (select and Recvfrom) are required, blocking IO only invokes one system call (Recvfrom). However, the advantage of using select is that it can handle multiple connection at the same time. (Say one more word.) Therefore, if the number of connections processed is not high, Web server using Select/epoll does not necessarily perform better than the Web server using multi-threading + blocking IO, and may be more delayed. The advantage of Select/epoll is not that a single connection can be processed faster, but that it can handle more connections. )
In the IO multiplexing model, the actual, for each socket, is generally set to become non-blocking, but, as shown, the entire user's process is actually always block. Only the process is the block of the Select function, not the socket IO.
asynchronous I/O
The asynchronous IO under Linux is actually used very little. Let's take a look at its process:
After the user process initiates the read operation, you can begin to do other things immediately. On the other hand, from the perspective of kernel, when it receives a asynchronous read, first it returns immediately, so no block is generated for the user process. Then, kernel waits for the data to be ready and then copies the data to the user's memory, and when all this is done, kernel sends a signal to the user process to tell it that the read operation is complete.
So far, four IO model has been introduced. Now back to the first few questions: what is the difference between blocking and non-blocking, and what is the difference between synchronous IO and asynchronous IO?
First answer the simplest of this: blocking vs non-blocking. The difference between the two is clearly explained in the previous introduction. Calling blocking IO will block the corresponding process until the operation is complete, and non-blocking IO will return immediately when the kernel is ready for the data.
Before explaining the difference between synchronous IO and asynchronous IO, you need to give a definition of both. The definition given by Stevens (in fact, the definition of POSIX) is this:
A Synchronous I/O operation causes the requesting process to being blocked until that I/O operation completes;
An asynchronous I/O operation does not cause the requesting process to be blocked;
The difference is that synchronous IO will block the process when it does "IO operation". According to this definition, the blocking io,non-blocking Io,io Multiplexing described previously are synchronous IO. One might say that non-blocking io is not block. Here is a very "tricky" place, defined in the "IO operation" refers to the real IO operation, is the example of recvfrom this system call. Non-blocking IO does not block the process when it executes recvfrom this system call if the kernel data is not ready. However, when the data in the kernel is ready, recvfrom copies the data from the kernel to the user's memory, at which point the process is blocked, during which time the process is block. The asynchronous IO is not the same, and when the process initiates an IO operation, the direct return is ignored until the kernel sends a signal telling the process that IO is complete. Throughout this process, the process has not been blocked at all.
Comparison of each IO model:
As described above, the difference between non-blocking io and asynchronous io is obvious. In non-blocking io, although the process will not be blocked for most of the time, it still requires the process to go to the active check, and when the data is ready, it is also necessary for the process to proactively call Recvfrom to copy the data to the user's memory. and asynchronous Io is completely different. It's like a user process handing over an entire IO operation to someone else (kernel) and then sending a signal notification when someone finishes it. During this time, the user process does not need to check the status of the IO operation, nor does it need to actively copy the data.
LINUX5 IO Model and synchronous asynchronous, blocking non-blocking