Python IO Model

Last Update:2017-07-21 Source: Internet

Author: User

Tags epoll


A read operation goes through two phases:

Waiting for the data to be ready

Copying the data from the kernel to the process

Stevens compares five IO models in his article:

Blocking IO

Non-blocking IO

IO multiplexing

Signal-driven IO

Asynchronous IO

Since signal-driven IO is not commonly used in practice, I only cover the remaining four IO models.

Blocking IO

In Linux, all sockets are blocking by default. A typical read operation proceeds as follows:

When the user process invokes the recvfrom system call, the kernel begins the first phase of IO: preparing the data. For network IO, the data often has not yet arrived (for example, a complete UDP packet has not been received), so the kernel waits for enough data to arrive. Meanwhile, on the user side, the entire process is blocked. Once the data is ready, the kernel copies it from kernel space into user memory and returns the result; only then does the user process leave the blocked state and resume running.
Blocking IO is therefore characterized by the process being blocked in both phases of IO execution.
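The blocking behaviour described above can be observed with a small experiment. This is only a sketch under assumptions that are not in the original article: a UDP socket on the loopback interface, a helper thread that sends a datagram after 0.2 seconds, and the hypothetical function name blocking_receive.

```python
import socket
import threading
import time

def blocking_receive():
    """Receive one UDP datagram with a blocking recvfrom.

    Returns (payload, seconds the call was blocked).
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    address = receiver.getsockname()

    def send_later():
        time.sleep(0.2)                   # the datagram "arrives" about 0.2 s later
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(b"hello", address)

    threading.Thread(target=send_later, daemon=True).start()

    start = time.monotonic()
    # Sockets are blocking by default, so this call does not return until
    # the data has arrived AND has been copied into user memory.
    data, _ = receiver.recvfrom(1024)
    waited = time.monotonic() - start
    receiver.close()
    return data, waited

if __name__ == "__main__":
    payload, waited = blocking_receive()
    print(payload, f"blocked for about {waited:.2f}s")
```

The measured time should be close to the sender's delay, since the calling thread can do nothing else while recvfrom is pending.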

Non-blocking IO

Under Linux, you can make a socket non-blocking by setting an option on it. A read operation on a non-blocking socket proceeds as follows:

When the user process issues a read operation and the data in the kernel is not ready, the kernel does not block the process but returns an error immediately. From the user process's point of view, it initiates a read and gets a result right away instead of waiting. When the result is an error, the process knows the data is not ready and can issue the read again. Once the data is ready and the kernel receives another system call from the process, it immediately copies the data into user memory and returns. With non-blocking IO, therefore, the user process must keep actively asking the kernel whether the data is ready.

Pros: The process can do other work while waiting for the task to complete (including submitting other tasks, so several tasks can be in progress "in the background" at the same time).

Cons: The response latency for task completion increases, because the process only polls the read operation every so often and the task may complete at any moment between two polls. This can reduce overall data throughput.
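The polling pattern and its trade-off can be sketched as follows. The function name poll_until_ready, the loopback UDP socket, and the delay and interval values are assumptions made for illustration. In Python, the "error" returned by a non-blocking read with no data surfaces as a BlockingIOError exception.

```python
import socket
import time

def poll_until_ready(delay=0.2, interval=0.05):
    """Poll a non-blocking UDP socket until a datagram arrives.

    Returns (payload, number of polls that found no data).
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    receiver.setblocking(False)           # reads now return at once instead of blocking
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    send_at = time.monotonic() + delay
    sent = False
    empty_polls = 0
    while True:
        try:
            data, _ = receiver.recvfrom(1024)   # returns or raises immediately
            break
        except BlockingIOError:                 # data not ready yet
            empty_polls += 1
        # Between polls the process is free to do other work. Here that work
        # is sending the datagram once `delay` seconds have passed.
        if not sent and time.monotonic() >= send_at:
            sender.sendto(b"hello", receiver.getsockname())
            sent = True
        time.sleep(interval)                    # data arriving now waits for the next poll

    sender.close()
    receiver.close()
    return data, empty_polls

if __name__ == "__main__":
    payload, empty_polls = poll_until_ready()
    print(payload, "after", empty_polls, "empty polls")
```

A larger interval wastes fewer system calls but lengthens the gap between the data arriving and the process noticing it, which is the latency cost mentioned above.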

IO multiplexing

The term IO multiplexing may be unfamiliar, but if I say select and epoll, you will probably recognize it. Some sources also call this model event-driven IO. The benefit of select/epoll is that a single process can handle the IO of multiple network connections at the same time. The basic principle is that the select/epoll call keeps watching all the sockets it is responsible for and notifies the user process when data arrives on any of them. Its flow is as follows:

When the user process invokes select, the entire process is blocked. Meanwhile, the kernel "monitors" all the sockets that select is responsible for, and select returns as soon as the data in any one of them is ready. The user process then invokes the read operation to copy the data from the kernel to the user process.
This flow is not much different from blocking IO; in fact, it is slightly worse, because it requires two system calls (select and recvfrom), whereas blocking IO needs only one (recvfrom). The advantage of select is that it can handle multiple connections at the same time. (One more remark: if the number of connections is not high, a web server using select/epoll does not necessarily perform better than one using multi-threading plus blocking IO, and its latency may even be higher. The advantage of select/epoll is not that it processes a single connection faster, but that it can handle more connections.)
In the IO multiplexing model, each socket is in practice generally set to non-blocking, yet as described above the user process is still blocked the whole time. The difference is that the process is blocked by the select call, not by socket IO.

Conclusion: The advantage of select is that it can handle multiple connections; it does not make a single connection faster.
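The two-step flow (select first, then the read) can be sketched with Python's select module. The function name wait_for_ready, the use of three loopback UDP sockets, and the one-second timeout are assumptions for illustration only.

```python
import select
import socket

def wait_for_ready(count=3, ready_index=1):
    """Watch several sockets with one select call and read from the ready ones.

    Returns a list of (socket index, payload) pairs.
    """
    receivers = []
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
        sock.setblocking(False)
        receivers.append(sock)

    # Make exactly one of the watched sockets readable.
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(b"ping", receivers[ready_index].getsockname())

    # First system call: block here (for at most 1 second) until any socket is ready.
    readable, _, _ = select.select(receivers, [], [], 1.0)

    # Second system call: copy the data from the kernel for each ready socket.
    results = [(receivers.index(sock), sock.recvfrom(1024)[0]) for sock in readable]

    sender.close()
    for sock in receivers:
        sock.close()
    return results

if __name__ == "__main__":
    print(wait_for_ready())
```

The process blocks in select rather than in recvfrom, and a single blocked call covers all three sockets, which is where the benefit for many connections comes from.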

Asynchronous IO

Asynchronous IO is actually used very little under Linux. Its flow is as follows:

After the user process initiates the read operation, it can immediately start doing other things. From the kernel's perspective, when it receives an asynchronous read it returns right away, so the user process is never blocked. The kernel then waits for the data to be ready and copies it into user memory; when all of this is done, it sends a signal to the user process to tell it that the read operation is complete.
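Python's standard library does not expose kernel-level asynchronous reads directly, so the following is only a loose illustration of the programming model (start the read, carry on with other work, pick up the result once it is complete). It uses asyncio, which on Linux is typically built on IO multiplexing rather than on kernel asynchronous IO, so it is not an instance of the kernel mechanism described above. The names main, handle and fetch and the loopback server are assumptions.

```python
import asyncio

async def main():
    """Start a read, do unrelated work, then collect the completed result."""

    async def handle(reader, writer):
        # Tiny loopback server: send a fixed reply and close the connection.
        writer.write(b"done")
        await writer.drain()
        writer.close()

    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def fetch():
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        data = await reader.read()        # read until the server closes the connection
        writer.close()
        await writer.wait_closed()
        return data

    task = asyncio.create_task(fetch())   # initiating the read returns immediately
    other_work = sum(range(1000))         # the caller carries on with other things
    data = await task                     # resumes once the read has completed
    server.close()
    return data, other_work

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The caller never polls for readiness and never issues the read call itself at the point where it needs the data; it only receives the finished result.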

So far, four IO models have been introduced. Now back to the opening questions: what is the difference between blocking and non-blocking, and what is the difference between synchronous IO and asynchronous IO?
Start with the simpler one: blocking vs non-blocking. The difference has already been made clear above. A blocking IO call blocks the calling process until the operation completes, whereas a non-blocking IO call returns immediately even while the kernel is still preparing the data.

Before explaining the difference between synchronous IO and asynchronous IO, both need a definition. The definitions given by Stevens (which are in fact the POSIX definitions) are:
A synchronous I/O operation causes the requesting process to be blocked until that I/O operation completes;
An asynchronous I/O operation does not cause the requesting process to be blocked;
The difference is that synchronous IO blocks the process while it performs the "IO operation". By this definition, blocking IO, non-blocking IO, and IO multiplexing as described above are all synchronous IO. One might object that non-blocking IO does not block. This is the tricky part: the "IO operation" in the definition refers to the real IO operation, which in this example is the recvfrom system call. With non-blocking IO, recvfrom does not block the process if the kernel data is not ready. However, once the data is ready, recvfrom copies it from the kernel into user memory, and the process is blocked for the duration of that copy. Asynchronous IO is different: when the process initiates the IO operation, the call returns immediately and the process pays it no further attention until the kernel sends a signal saying that the IO is complete. Throughout, the process is never blocked.

Comparison of the IO models:

As described above, the difference between non-blocking IO and asynchronous IO is clear. With non-blocking IO, although the process is not blocked most of the time, it still has to check actively, and once the data is ready it must itself call recvfrom to copy the data into user memory. Asynchronous IO is completely different. It is as if the user process hands the entire IO operation over to someone else (the kernel), who sends a signal when it is finished. In the meantime, the user process neither needs to check the status of the IO operation nor copy the data itself.

Selectors module

import selectors
import socket

# Picks the most efficient mechanism available on the platform (e.g. epoll on Linux)
sel = selectors.DefaultSelector()

def accept(sock, mask):
    # Callback for the listening socket: accept and start watching the new connection
    conn, addr = sock.accept()  # should be ready
    print('accepted', conn, 'from', addr)
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, read)

def read(conn, mask):
    # Callback for a client connection: echo the data back, or clean up on close
    data = conn.recv(1000)  # should be ready; the 1000-byte buffer size is assumed
    if data:
        print('echoing', repr(data), 'to', conn)
        conn.send(data)  # hope it won't block
    else:
        print('closing', conn)
        sel.unregister(conn)
        conn.close()

sock = socket.socket()
sock.bind(('localhost', 1234))
sock.listen(100)
sock.setblocking(False)
sel.register(sock, selectors.EVENT_READ, accept)

# Event loop: block in select(), then dispatch each ready socket to its callback
while True:
    events = sel.select()
    for key, mask in events:
        callback = key.data
        callback(key.fileobj, mask)
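To try the echo server end to end without a second terminal, the same event loop can be run in a background thread and exercised by a client in the same script. This is a sketch, not part of the original listing: the function name echo_roundtrip, the OS-assigned port, the 0.1-second select timeout and the stop flag are all assumptions added so that the loop can terminate.

```python
import selectors
import socket
import threading

def echo_roundtrip(message=b"ping"):
    """Run a selectors-based echo loop in a thread and send one message through it."""
    sel = selectors.DefaultSelector()
    finished = threading.Event()

    def accept(sock, mask):
        conn, _ = sock.accept()
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, read)

    def read(conn, mask):
        data = conn.recv(1000)
        if data:
            conn.send(data)               # echo back what was received
        else:                             # empty read: the client closed the connection
            sel.unregister(conn)
            conn.close()
            finished.set()

    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, accept)

    def serve():
        # Same dispatch loop as the server, but with a timeout so that the
        # stop flag is checked regularly.
        while not finished.is_set():
            for key, mask in sel.select(timeout=0.1):
                key.data(key.fileobj, mask)

    server = threading.Thread(target=serve)
    server.start()

    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(message)
        reply = client.recv(1000)

    server.join(timeout=2)
    sel.unregister(listener)
    listener.close()
    sel.close()
    return reply

if __name__ == "__main__":
    print(echo_roundtrip())
```

The callbacks are stored as the data argument of register and retrieved from key.data, which is the same dispatch scheme the server uses.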


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
