A summary of the various IO models

What is the difference between synchronous IO and asynchronous IO, and between blocking IO and non-blocking IO? Different people may give different answers to this question; some even think that asynchronous IO and non-blocking IO are the same thing. That is because people come from different backgrounds and discuss the question in different contexts. So, to answer it properly, I will first limit the context of this article.

This article discusses network IO in a Linux environment.

The most important reference for this article is Richard Stevens's "UNIX Network Programming, Volume 1, Third Edition: The Sockets Networking API", Section 6.2, "I/O Models". In that section Stevens describes the characteristics of and differences between the various IO models in detail; if your English is good enough, I recommend reading it directly. Stevens's writing style is famously clear, so don't worry about that. The flowcharts in this article are also taken from that reference.

Stevens compares five IO models in the book:

blocking IO

nonblocking IO

IO multiplexing

signal driven IO

asynchronous IO

Since signal driven IO is rarely used in practice, I will only cover the remaining four IO models.

First, let me restate the objects and steps involved when an IO operation occurs.

For a network IO operation (take read as an example), two system objects are involved: the process (or thread) that calls the IO, and the system kernel. When a read operation occurs, it goes through two stages:

1. Waiting for the data to be ready

2. Copying the data from the kernel to the process

It is important to keep these two stages in mind, because the differences between the IO models come down to how they behave in each of them.

Blocking IO

In Linux, all sockets are blocking by default. A typical read operation flow looks like this:

When the user process calls the recvfrom system call, the kernel begins the first phase of IO: preparing the data. For network IO, the data often has not arrived yet (for example, a complete UDP packet has not been received), so the kernel has to wait for enough data to arrive. On the user process side, the whole process is blocked. When the kernel has waited until the data is ready, it copies the data from the kernel to user memory and then returns the result; only then does the user process leave the blocked state and run again.

Therefore, blocking IO is characterized by the process being blocked in both phases of the IO operation.
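To make this concrete, here is a minimal C sketch of the blocking flow described above, using a UDP receiver (the port number and buffer size are arbitrary choices for illustration):

```c
/* Minimal sketch of blocking IO: a UDP receiver.
 * recvfrom() spans both phases: the process sleeps until a datagram
 * arrives, and returns only after the data has been copied into buf. */
#include <arpa/inet.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);              /* arbitrary example port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }

    char buf[2048];
    /* Phase 1 (wait for data) and phase 2 (copy to user memory) both
     * happen inside this single call; the process is blocked throughout. */
    ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
    if (n < 0) { perror("recvfrom"); return 1; }
    printf("received %zd bytes\n", n);

    close(fd);
    return 0;
}
```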

Non-blocking IO

Under Linux, a socket can be made non-blocking by setting it accordingly. When you perform a read operation on a non-blocking socket, the flow looks like this:

As the figure shows, when the user process issues a read operation and the data in the kernel is not ready yet, the kernel does not block the user process; it immediately returns an error instead. From the user process's point of view, it initiates a read and gets a result right away without waiting. When the user process sees that the result is an error, it knows the data is not ready yet, so it can issue the read operation again. Once the data in the kernel is ready and the kernel receives the user process's system call again, it immediately copies the data to user memory and returns.

Therefore, the user process actually has to keep asking the kernel whether the data is ready.
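A minimal sketch of that polling loop in C might look like the following: the socket is switched to non-blocking mode with fcntl(), and recvfrom() then fails immediately with EAGAIN/EWOULDBLOCK whenever the data is not ready (port, buffer size, and sleep interval are arbitrary):

```c
/* Minimal sketch of non-blocking IO: poll recvfrom() until data is ready. */
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);                  /* arbitrary example port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }

    /* Switch the socket to non-blocking mode. */
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    char buf[2048];
    for (;;) {
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
        if (n >= 0) {                             /* data was ready and copied */
            printf("received %zd bytes\n", n);
            break;
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Data not ready: the process could do other work here,
             * then ask the kernel again. */
            usleep(100 * 1000);
            continue;
        }
        perror("recvfrom");                       /* a real error */
        break;
    }

    close(fd);
    return 0;
}
```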

IO Multiplexing

The term IO multiplexing may sound a bit unfamiliar, but if I say select or epoll, you will probably understand. Some places also call this IO model event-driven IO. As we all know, the benefit of select/epoll is that a single process can handle the IO of multiple network connections at the same time. The basic principle is that the select/epoll function continuously polls all the sockets it is responsible for, and when data arrives on any socket, it notifies the user process. Its flowchart is as follows:

When the user process calls select, the whole process is blocked. At the same time, the kernel "monitors" all the sockets that select is responsible for, and as soon as the data in any one of them is ready, select returns. The user process then calls the read operation to copy the data from the kernel to the user process.

This figure is not very different from the blocking IO diagram; in fact, it looks even worse, because two system calls (select and recvfrom) are needed, whereas blocking IO only involves one (recvfrom). The advantage of select, however, is that it can handle multiple connections at the same time. (One more note: if the number of connections to handle is not very high, a web server using select/epoll does not necessarily perform better than one using multi-threading plus blocking IO, and may even have higher latency. The advantage of select/epoll is not that it handles a single connection faster, but that it can handle more connections.)

In the IO multiplexing model, each socket is in practice usually set to non-blocking. However, as shown above, the whole user process is in fact blocked the entire time; it is just blocked by the select function rather than by socket IO.
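Here is a minimal C sketch of that pattern with select(); the helper below assumes the sockets in fds were created elsewhere (for example, bound UDP sockets or accepted TCP connections) and have already been set to non-blocking. epoll_create1()/epoll_ctl()/epoll_wait() follow the same pattern and scale to far more descriptors:

```c
/* Minimal sketch of IO multiplexing: block in select(), then read
 * only the sockets that are reported as readable. */
#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>

void serve(const int *fds, int count)       /* hypothetical helper */
{
    char buf[2048];
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        int maxfd = -1;
        for (int i = 0; i < count; i++) {
            FD_SET(fds[i], &readfds);
            if (fds[i] > maxfd) maxfd = fds[i];
        }

        /* The process blocks here, inside select(), not in socket IO. */
        int ready = select(maxfd + 1, &readfds, NULL, NULL, NULL);
        if (ready < 0) { perror("select"); return; }

        for (int i = 0; i < count; i++) {
            if (FD_ISSET(fds[i], &readfds)) {
                /* Second system call: copy the data that is now ready. */
                ssize_t n = recvfrom(fds[i], buf, sizeof(buf), 0, NULL, NULL);
                if (n > 0) printf("fd %d: %zd bytes\n", fds[i], n);
            }
        }
    }
}
```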

Asynchronous IO

Asynchronous IO is actually used very little under Linux. Let's take a look at its flow:

After the user process initiates the read operation, it can immediately start doing other things. On the other side, from the kernel's perspective, when it receives an asynchronous read it returns right away, so the user process is not blocked at all. The kernel then waits for the data to be ready and copies it to user memory, and when all of that is done, the kernel sends a signal to the user process to tell it that the read operation is complete.
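As a rough illustration only, here is a C sketch of this pattern using the POSIX AIO interface (<aio.h>): the read is submitted, the process keeps working, and a signal arrives only after the data has already been copied into the buffer. Note that glibc implements POSIX AIO with helper threads and it is not well suited to sockets, which is part of why asynchronous IO is rarely used this way on Linux (newer kernel interfaces such as io_uring are the modern alternative); the example therefore reads a regular file, and the file path and signal number are arbitrary choices:

```c
/* Minimal sketch of asynchronous IO with POSIX AIO: submit the read,
 * do other work, and get a signal when both phases are already done.
 * Link with -lrt on older glibc versions. */
#include <aio.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t done = 0;

static void on_io_done(int sig) { (void)sig; done = 1; }

int main(void)
{
    signal(SIGUSR1, on_io_done);

    int fd = open("/etc/hostname", O_RDONLY);    /* arbitrary example file */
    if (fd < 0) { perror("open"); return 1; }

    static char buf[4096];
    struct aiocb cb = {0};
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;
    cb.aio_sigevent.sigev_notify = SIGEV_SIGNAL; /* notify by signal */
    cb.aio_sigevent.sigev_signo = SIGUSR1;

    if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

    /* The process is free to do other work here; it never blocks
     * waiting for the data or for the copy. */
    while (!done)
        sleep(1);                                /* stand-in for real work */

    if (aio_error(&cb) == 0) {
        ssize_t n = aio_return(&cb);             /* both phases already done */
        printf("read %zd bytes asynchronously\n", n);
    }

    close(fd);
    return 0;
}
```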

So far, all four IO models have been introduced. Now let's go back to the questions raised at the beginning: what is the difference between blocking and non-blocking, and what is the difference between synchronous IO and asynchronous IO?

First, the simpler one: blocking vs non-blocking. The difference between the two was explained clearly in the introduction above. Calling a blocking IO blocks the corresponding process until the operation completes, whereas a non-blocking IO returns immediately even while the kernel is still preparing the data.

Before explaining the difference between synchronous IO and asynchronous IO, we need definitions of both. The definitions given by Stevens (which are, in fact, the POSIX definitions) are:

A synchronous I/O operation causes the requesting process to be blocked until that I/O operation completes;

An asynchronous I/O operation does not cause the requesting process to be blocked;

The difference is that synchronous IO blocks the process while the "IO operation" is performed. By this definition, the blocking IO, non-blocking IO, and IO multiplexing described above are all synchronous IO. Someone might object that non-blocking IO is not blocked. Here is a very "tricky" point: the "IO operation" in the definition refers to the real IO operation, which in our example is the recvfrom system call. With non-blocking IO, if the kernel data is not ready, executing the recvfrom system call does not block the process. However, when the data in the kernel is ready, recvfrom copies the data from the kernel to user memory, and during that time the process is blocked. Asynchronous IO is different: when the process initiates the IO operation, the call returns directly and the process can ignore it until the kernel sends a signal telling it that the IO is complete. Throughout the whole process, the process is never blocked at all.

A comparison of the IO models is shown in the figure:

As described above, the difference between non-blocking IO and asynchronous IO is still quite clear. With non-blocking IO, although the process is not blocked most of the time, it still has to check actively, and once the data is ready it must also call recvfrom itself to copy the data to user memory. Asynchronous IO is completely different: it is as if the user process hands the entire IO operation over to someone else (the kernel), who then sends a signal notification when it is finished. During that time, the user process neither needs to check the status of the IO operation nor needs to copy the data itself.

Finally, a few examples that are not entirely appropriate but may help illustrate these four IO models:

Four people, A, B, C, and D, go fishing:

A uses the most old-fashioned fishing rod, so he has to keep watching it and lift the rod only when a fish bites;

B's fishing rod has a feature that shows whether a fish has taken the bait, so B chats with the girl next to him and glances over from time to time; if a fish has bitten, he quickly lifts the rod;

C's fishing rod is about the same as B's, but he thinks of a good method: he puts out several rods at the same time and stands by; as soon as one of them shows that a fish has bitten, he lifts the corresponding rod;

D is rich, so he simply hires someone to fish for him; as soon as that person catches a fish, he sends D a message.
