May 2 Python Learning Summary: IO Models

Source: Internet
Author: User
Tags: epoll

IO model

1. Blocking IO
2. Non-blocking IO
3. Multiplexing IO
4. Asynchronous IO

1. Blocking IO

Blocking IO is characterized by the process being blocked during both phases of IO execution: waiting for the data and copying the data.

Virtually all IO interfaces (including the socket interface) are blocking unless otherwise specified.

A blocking interface is a system call (typically an IO interface) that does not return until the call obtains a result or times out, keeping the current thread blocked in the meantime.

In Linux, all sockets are blocking by default, and a typical read operation flows roughly like this:
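As a concrete illustration (a hypothetical demo using socket.socketpair(), not from the original article), recv() blocks the calling thread through both phases, waiting for the data and then copying it out:

```python
import socket
import threading
import time

a, b = socket.socketpair()

def delayed_sender():
    time.sleep(0.2)          # phase 1: no data yet, so the reader stays blocked
    b.sendall(b'hello')      # data finally arrives in the kernel buffer

start = time.time()
threading.Thread(target=delayed_sender).start()

data = a.recv(1024)          # blocks until the data is ready and copied out
elapsed = time.time() - start

print(data)                  # b'hello'
print(elapsed >= 0.2)        # True: the whole wait happened inside recv()
```

The call to recv() does not return until the sender has produced data, which is exactly the two-phase block the text describes.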

2. Non-blocking IO (not recommended): keep asking

Under Linux, you can make it non-blocking by setting the socket. When you perform a read operation on a non-blocking socket, the process looks like this:

As you can see, when the user process issues a read operation and the data in the kernel is not yet ready, the kernel does not block the user process but returns an error immediately. From the user process's point of view, it initiates a read and gets a result at once, without waiting. When the user process sees that the result is an error, it knows the data is not ready yet, so it can do something else in the interval before the next read, or simply issue the read again. Once the data in the kernel is ready and the kernel receives another system call from the user process, it copies the data to user memory (this phase still blocks) and returns.

In non-blocking IO, the user process must in fact constantly and proactively ask the kernel whether the data is ready.
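A small sketch (an assumed socketpair() setup, not from the article) of what "returns an error immediately" means in Python: on a non-blocking socket with no data ready, recv() raises BlockingIOError instead of waiting:

```python
import socket
import time

a, b = socket.socketpair()
a.setblocking(False)         # make the socket non-blocking

try:
    a.recv(1024)             # no data in the kernel buffer yet
    got_error = False
except BlockingIOError:
    got_error = True         # the call came back at once with an error

print(got_error)             # True

b.sendall(b'hi')
time.sleep(0.1)              # give the data time to land in the buffer
data = a.recv(1024)          # data is ready now; only the copy phase remains
print(data)                  # b'hi'
```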

# Server side
import socket
import time

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('127.0.0.1', 8083))
server.listen(5)
server.setblocking(False)

r_list = []
w_list = {}
while 1:
    try:
        conn, addr = server.accept()
        r_list.append(conn)
    except BlockingIOError:
        # Emphasis: the essence of non-blocking IO is no blocking at all!!!
        # time.sleep(0.5)  # uncomment purely for the convenience of viewing the effect
        print('doing other things')
        print('rlist:', len(r_list))
        print('wlist:', len(w_list))

        # traverse the read list and receive content from each socket
        del_rlist = []
        for conn in r_list:
            try:
                data = conn.recv(1024)
                if not data:
                    conn.close()
                    del_rlist.append(conn)
                    continue
                w_list[conn] = data.upper()
            except BlockingIOError:
                # recv did not succeed; move on to the next socket
                continue
            except ConnectionResetError:
                # this socket raised an exception: close it and mark it
                # for removal, waiting to be cleaned up
                conn.close()
                del_rlist.append(conn)

        # traverse the write list and send each socket its data
        del_wlist = []
        for conn, data in w_list.items():
            try:
                conn.send(data)
                del_wlist.append(conn)
            except BlockingIOError:
                continue

        # clean up useless sockets; no need to monitor their IO any more
        for conn in del_rlist:
            r_list.remove(conn)
        for conn in del_wlist:
            w_list.pop(conn)

    The essence of non-blocking IO is no blocking at all!!!

Advantages:

It can do other work while waiting for the task to complete (including submitting other tasks; that is, multiple tasks can run "in the background").

Disadvantages:

1. Cyclically calling recv() significantly pushes up CPU usage; this is why a time.sleep() is left in the code, otherwise CPU usage easily spikes on a low-end host.

2. The response latency for task completion increases, because a read is only attempted at each poll, and the task may complete at any moment between two polls. This reduces overall data throughput.

In addition, in this scheme recv() mostly serves to test whether the operation has completed; the operating system provides more efficient interfaces for this "is it done yet" detection, such as the select() multiplexing mode, which can check multiple connections for activity at once.

3. Multiplexing IO (IO multiplexing): let select do the asking, then process centrally
rl, wl, xl = select.select(rlist, wlist, [], 0.5)
"""
rlist -- wait until ready for reading
wlist -- wait until ready for writing
xlist -- wait for an 'exceptional condition'
If only one kind of condition is required, pass [] for the other lists.
"""

    

The term IO multiplexing may be unfamiliar, but if I say select/epoll you will probably get it. Some places also call this IO mode event-driven IO. As we all know, the benefit of select/epoll is that a single process can handle the IO of multiple network connections simultaneously. The basic principle is that select/epoll constantly polls all the sockets it is responsible for, and when data arrives on some socket, it notifies the user process. Its flow:

When the user process invokes select, the entire process is blocked; at the same time, the kernel "monitors" all the sockets select is responsible for, and when the data in any one of them is ready, select returns. The user process then invokes the read operation to copy the data from the kernel to the user process.
This flow is not much different from the blocking IO diagram; in fact it is slightly worse, because it needs two system calls (select and recvfrom) where blocking IO needs only one (recvfrom). The advantage of select, however, is that it can handle multiple connections at the same time.
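The two-call pattern, select first and then the actual read, can be sketched with a socketpair() (a hypothetical demo, not from the original article):

```python
import select
import socket

a, b = socket.socketpair()

# first call: nothing readable yet; a timeout of 0 makes this a pure poll
rl0, _, _ = select.select([a], [], [], 0)
print(rl0)                    # []

b.sendall(b'ping')            # now there is data waiting on a

# first system call: select blocks (up to 1s) until a is readable
rl, wl, xl = select.select([a], [], [], 1)
print(a in rl)                # True

# second system call: copy the data out of the kernel
data = a.recv(1024)
print(data)                   # b'ping'
```

Here select tells us *which* socket is ready, and recv() then does the actual copy, which is exactly the two system calls the paragraph above counts.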

Emphasize:

1. If the number of connections handled is not high, a web server using select/epoll does not necessarily perform better than one using multithreading + blocking IO, and its latency may even be greater. The advantage of select/epoll is not that it handles a single connection faster, but that it can handle more connections.

2. In the multiplexing model, each socket is generally set to non-blocking; however, as shown above, the whole user process is in fact always blocked. It is just blocked in the select function rather than in socket IO.

Conclusion: the advantage of select is that it can handle many connections, not that it is faster for a single connection.

  

# Server side
from socket import *
import select

server = socket(AF_INET, SOCK_STREAM)
server.bind(('127.0.0.1', 8093))
server.listen(5)
server.setblocking(False)
print('starting...')

rlist = [server, ]
wlist = []
wdata = {}
while True:
    rl, wl, xl = select.select(rlist, wlist, [], 0.5)
    print(wl)
    for sock in rl:
        if sock == server:
            conn, addr = sock.accept()
            rlist.append(conn)
        else:
            try:
                data = sock.recv(1024)
                if not data:
                    sock.close()
                    rlist.remove(sock)
                    continue
                wlist.append(sock)
                wdata[sock] = data.upper()
            except Exception:
                sock.close()
                rlist.remove(sock)
    for sock in wl:
        sock.send(wdata[sock])
        wlist.remove(sock)
        wdata.pop(sock)

    

Analysis of how select monitors fd changes:

# The user process creates a socket object, and the fds it monitors are copied into kernel space; each fd corresponds to an entry in the system file table. When data arrives for an fd in kernel space, the kernel signals the user process that data has arrived.
# The user process then issues another system call (e.g. accept) to copy the data from kernel space to user space and to clear the received data from kernel space, so that the monitored fd can respond to new data again (on the sending side, because TCP is a reliable protocol, the data is cleared only after a reply is received).

    

Advantages:

Compared with other models, the event-driven model using select() runs in a single thread (process), consumes fewer resources, does not consume much CPU, and can serve multiple clients.

If you try to build a simple event-driven server program, this model has some reference value.

Disadvantages:

1. First, the select() interface is not the best choice for implementing "event driven". When the number of handles to probe is large, select() itself consumes a lot of time polling each handle.

Many operating systems provide more efficient interfaces, such as epoll on Linux and kqueue on BSD.

2. Second, this model couples event detection and event response together; once an event response takes a long time, it is catastrophic for the whole model.
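In Python, the stdlib selectors module (my addition, not mentioned in the original article) wraps these per-OS interfaces behind a uniform API, picking the most efficient one available (epoll on Linux, kqueue on BSD, select as a fallback). A minimal sketch:

```python
import selectors
import socket

sel = selectors.DefaultSelector()     # epoll/kqueue/select chosen automatically
a, b = socket.socketpair()
a.setblocking(False)

# register fd `a` for read events, with an arbitrary data tag attached
sel.register(a, selectors.EVENT_READ, data='reader')

b.sendall(b'event')
events = sel.select(timeout=1)        # block until some registered fd is ready
for key, mask in events:
    received = key.fileobj.recv(1024)
    print(key.data, received)

sel.unregister(a)
sel.close()
```

The loop shape is the same as the select() server above, but the kernel-side probing scales much better when the number of monitored fds is large.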

4. Asynchronous IO (asynchronous I/O)

Asynchronous IO in Linux is not used much and was only introduced in kernel version 2.6. Let's take a look at its flow:

After the user process initiates the read operation, it can immediately start doing other things. From the kernel's perspective, when it receives an asynchronous read it returns immediately, so the user process is never blocked. The kernel then waits for the data to be ready and copies it to user memory, and when all this is done, the kernel sends a signal to the user process to tell it that the read is complete.

Pure asynchronous IO hands the kernel the task of copying data from kernel space to the process; the process is merely notified when the data has arrived.
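Python's asyncio (my addition; the article does not mention it, and it is a user-space event loop rather than true kernel AIO) follows the same "hand the IO to someone else, get notified when it's done" programming model. A sketch, with asyncio.sleep() standing in for the kernel-side wait:

```python
import asyncio

async def fake_read():
    # stands in for the kernel doing the wait-and-copy on our behalf
    await asyncio.sleep(0.1)
    return b'DONE'

async def main():
    task = asyncio.create_task(fake_read())   # initiate the "IO" ...
    other_work = 'doing other things'         # ... and immediately move on
    result = await task                       # "signal": the result is ready
    return other_work, result

other_work, result = asyncio.run(main())
print(result)                                 # b'DONE'
```

The caller never polls: it initiates the operation, does other work, and is resumed when the result is delivered.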

5. IO model comparison and analysis

So far, four IO models have been introduced. Now back to the earlier questions: what is the difference between blocking and non-blocking, and what is the difference between synchronous IO and asynchronous IO?
First the simpler one: blocking vs. non-blocking. The previous introduction explains the difference clearly. Calling blocking IO blocks the corresponding process until the operation completes, while non-blocking IO returns immediately even when the kernel is still preparing the data.

Before describing the difference between synchronous IO and asynchronous IO, we need definitions of both. Stevens's definition (in fact, the POSIX definition) is this:
A synchronous I/O operation causes the requesting process to be blocked until that I/O operation completes;
An asynchronous I/O operation does not cause the requesting process to be blocked;
The difference is that synchronous IO blocks the process while it performs the "IO operation". By this definition, the four IO models fall into two categories: blocking IO, non-blocking IO, and IO multiplexing described earlier all belong to synchronous IO, while asynchronous I/O is the latter category.

One might object that non-blocking IO is not blocked. Here is a very "tricky" point: the "IO operation" in the definition refers to the real IO operation, the recvfrom system call in the example. If the kernel data is not ready, non-blocking IO does not block the process when it executes the recvfrom system call. However, when the data in the kernel is ready, recvfrom copies the data from the kernel to user memory, and at that point the process is blocked. Asynchronous IO is different: when the process initiates the IO operation, it returns directly and pays no further attention until the kernel sends a signal telling it that IO is complete. Throughout this process, the process is never blocked at all.

Comparison of each IO model:

As described above, the difference between non-blocking IO and asynchronous IO is clear. In non-blocking IO, although the process is not blocked most of the time, it still has to check proactively, and when the data is ready it must proactively call recvfrom to copy the data into user memory. Asynchronous IO is completely different: it is as if the user process hands the entire IO operation to someone else (the kernel), who sends a signal when it is done. During that time the user process neither needs to check the status of the IO operation nor copy the data itself.
