Python Learning---IO model 1227

Source: Internet
Author: User
Tags: epoll

1.1. Event-Driven

Event-driven programming is a paradigm for dealing with events whose timing is not known in advance: a handler is bound to an event, and when an external trigger fires the event, the handler is activated and performs the desired operation. The browser's onclick() event is a typical example.
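To make the idea concrete, here is a minimal, hypothetical Python sketch (not from the original article) that binds handler functions to event names and invokes them when an event fires:

# Minimal event-driven sketch (hypothetical): bind handlers to event
# names, then dispatch whenever an external trigger fires the event.
handlers = {}

def bind(event, func):
    handlers.setdefault(event, []).append(func)

def fire(event, *args):
    for func in handlers.get(event, []):
        func(*args)

bind("click", lambda pos: print("clicked at", pos))
fire("click", (10, 20))   # prints: clicked at (10, 20)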

1.2. IO Model Basics

Before discussing the IO models, a few concepts need to be explained:

User space and kernel space

Process switching

Blocking of processes

File descriptor

Cache I/O

User space and kernel space

Modern operating systems use virtual memory; on a 32-bit operating system the addressing space (virtual address space) is 4 GB (2^32 bytes).
The core of the operating system is the kernel. It is independent of ordinary applications, can access the protected memory space, and has full permission to access the underlying hardware devices.
To ensure that user processes cannot manipulate the kernel directly and to guarantee kernel security, the operating system divides the virtual address space into two parts: kernel space and user space.
On Linux, the highest 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is reserved for the kernel and is called kernel space, while the lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by each process and is called user space.

Process switching

To control the execution of a process, the kernel must have the ability to suspend a process that is running on the CPU and resume execution of a previously suspended process. This behavior is referred to as process switching, which is done by the operating system. So it can be said that any process that runs under the support of the operating system kernel is closely related to the kernel.
Switching from one running process to another involves the following steps:

Save the processor context, including program counters and other registers.

Update PCB information.

Move the PCB of the process into the appropriate queue, such as the ready queue or the blocking queue for the relevant event.

Select another process to execute and update its PCB.

Update the data structure of memory management.

Restore the processor context.
Note: overall, process switching is very resource-intensive.

Blocking of processes

A running process blocks when some expected event has not yet occurred: a request for a system resource fails, it is waiting for an operation to complete, new data has not arrived, or it has no new work to do. In these cases the system executes the blocking primitive (block), and the process changes itself from the running state to the blocked state. Blocking is therefore an active behavior of the process itself, and only a process in the running state (holding the CPU) can move into the blocked state. A blocked process consumes no CPU resources; socket.accept() is a typical example.

File descriptor fd

A file descriptor is a term in computer science: an abstraction used to refer to a file.
Formally, a file descriptor is a non-negative integer. In practice it is an index into a per-process table, maintained by the kernel, of the files that the process has opened. When a program opens an existing file or creates a new file, the kernel returns a file descriptor to the process. Low-level programming often revolves around file descriptors, although the concept applies mainly to operating systems such as UNIX and Linux.
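As a quick, added illustration (the file path here is just an example), a Python socket or file object exposes its underlying descriptor through fileno(), and the value is simply a small non-negative integer:

# Illustrative sketch: file descriptors are small non-negative integers
# indexing the kernel's per-process table of open files.
import socket

sk = socket.socket()
f = open("/etc/hostname", "rb")   # any existing file on a Linux system

print(sk.fileno())   # e.g. 3
print(f.fileno())    # e.g. 4

f.close()
sk.close()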

Cache I/O

Cache I/O is also known as standard I/O, and the default I/O operation of most file systems is cache I/O. In the Linux cache I/O mechanism, the operating system caches I/O data in the file system's page cache: the data is first copied into a buffer in the operating system kernel and only then copied from the kernel buffer into the application's address space. Because user space cannot access kernel space directly, every read involves this copy from kernel state to user state.

1.3. IO Model

IO model

Blocking IO

Non-blocking IO

Multiplexed IO

Signal-driven IO

Asynchronous IO

Blocking IO

In Linux, all sockets are blocking by default.

For example, socket.accept() involves two blocking phases: 1. waiting for the data to be ready (waiting for a client connection); 2. copying the data from the kernel to the process.

Features of blocking IO:

Both phases of IO execution (waiting for the data and copying the data) are blocked.
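For reference, a minimal blocking-server sketch (an added example with an arbitrary address and port, not one of the article's numbered examples) shows both blocking phases: the process sleeps in accept() until a client connects and then in recv() until data arrives:

# Blocking IO sketch: the process sleeps inside accept() and recv(),
# covering both the wait-for-data and copy-data phases.
import socket

sk = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sk.bind(('127.0.0.1', 6666))
sk.listen(5)

while True:
    conn, addr = sk.accept()           # blocks until a client connects
    data = conn.recv(1024)             # blocks until the client sends data
    print(addr, data.decode('utf8'))
    conn.close()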

Non-blocking IO

A socket can be set to non-blocking. Non-blocking IO continues to execute the following statements without waiting, regardless of whether data has arrived: if data is ready it is retrieved, and if not, the call returns an error instead of blocking.

Q: Does non-blocking IO simply fall straight through to the next statement?

A: Certainly not. The process issues the recvfrom system call in a loop: if data has arrived it is received and the program moves on; if not, the call is repeated. The user process must keep actively asking the kernel whether the data is ready; the CPU is released between polls and can execute other code. The data-copy phase is still blocking, because the operating system must copy the content from the kernel area to the user area.

The disadvantage: data may not be received promptly. Non-blocking IO splits the blocking time into N slices; if a message arrives in the middle of a slice, the process still waits until the next polling interval (for example, a fixed sleep of a few seconds, as in Example 1 below) before receiving it, so data reception is delayed.

Multiplexed IO (IO event-driven)

Multiplexed IO is also called event-driven IO. select (used by Apache) and epoll (used by Nginx) are both forms of multiplexed IO.

The benefit of select/epoll is that a single process can handle the IO of multiple network connections at the same time. The basic principle is that select/epoll constantly polls all of the sockets it is responsible for, and when data arrives on any socket it notifies the user process.

When the user process invokes select, the entire process is blocked; at the same time the kernel "monitors" all of the sockets select is responsible for, and as soon as the data in any one socket is ready, select returns. The user process then invokes a read operation to copy the data from the kernel to the user process. Note that the process is blocked by the select call itself, not by socket IO.

The advantage of select/epoll is not that a single connection is processed faster, but that more connections can be handled.

Note 1: If a file descriptor is reported readable in the result returned by select, the process can call accept() or recv() to have the kernel copy the data from kernel space to user space.

Note 2: The advantage of select is that multiple connections can be handled; it does nothing extra for a single connection.

Asynchronous IO

After the user process initiates the read operation, it can immediately start doing other things. From the kernel's perspective, when it receives an asynchronous read it returns immediately, so the user process is never blocked. The kernel then waits for the data to be ready, copies the data into the user's memory, and when all of this is done sends a signal to the user process to tell it that the read operation is complete.
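As an added modern illustration (using Python's asyncio, which the original article does not cover; the port is arbitrary), the coroutine suspends at each await instead of blocking the whole process, and the event loop resumes it when the read has completed:

# Asynchronous-style echo server sketch using asyncio (added illustration).
import asyncio

async def handle(reader, writer):
    data = await reader.read(1024)     # suspends here; the loop resumes us when data is ready
    writer.write(data.upper())
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, '127.0.0.1', 8888)
    async with server:
        await server.serve_forever()

asyncio.run(main())

Note that on Linux asyncio's default event loop is itself built on epoll, which is consistent with the remark later in the article that much "asynchronous IO" is really IO multiplexing.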

1.4. IO FAQ

Q: What is the difference between blocking and non-blocking?

A: Calling blocking IO blocks the corresponding process until the operation completes; non-blocking IO returns immediately even while the kernel is still preparing the data.

Q: What is the difference between synchronous IO and asynchronous IO?

A: A synchronous I/O operation causes the requesting process to be blocked until that I/O operation completes;

an asynchronous I/O operation does not cause the requesting process to be blocked.

Note: Blocking IO, non-blocking IO and IO multiplexing all belong to synchronous IO.

select, poll and epoll are all forms of IO multiplexing, and IO multiplexing is a category of synchronous IO, so epoll is only pseudo-asynchronous.

The differences between select, poll and epoll:

Select

select first appeared in 4.2BSD in 1983. The select() system call monitors an array of multiple file descriptors; when select() returns, the kernel has set flag bits on the file descriptors in the array that are ready, so the process can find them and perform the subsequent read and write operations.
select is currently supported on almost all platforms.
One disadvantage of select is that the maximum number of file descriptors a single process can monitor is limited, to 1024 on Linux, although the limit can be raised by modifying the macro definition or even recompiling the kernel.
In addition, the data structure maintained by select() holds a large number of file descriptors, and the cost of copying it grows linearly as the number of file descriptors increases. At the same time, network latency leaves many TCP connections inactive, yet each call to select() still performs a linear scan of all the sockets, which wastes further overhead.

Poll

poll is not substantially different from select, but it has no limit on the maximum number of file descriptors.
It is rarely used in practice and is essentially a transitional stage between select and epoll; a minimal sketch of its interface follows.
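The sketch below is an added example (Python exposes poll as select.poll on Unix-like systems, and the port here is arbitrary): descriptors are registered with the events of interest, then poll() waits for any of them to become ready:

# poll sketch (Unix-only): no fixed 1024-descriptor limit, but the kernel
# still scans every registered descriptor on each call.
import select
import socket

sk = socket.socket()
sk.bind(("127.0.0.1", 9905))
sk.listen(5)

poller = select.poll()
poller.register(sk.fileno(), select.POLLIN)

while True:
    for fd, event in poller.poll(5000):          # timeout in milliseconds
        if fd == sk.fileno() and event & select.POLLIN:
            conn, addr = sk.accept()
            print("new connection from", addr)
            conn.close()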

Epoll

It was not until Linux 2.6 that the kernel directly supported this implementation: epoll, a multiplexed I/O readiness notification method that is considered the best-performing option on Linux 2.6. It is not supported on Windows.
There is no limit on the maximum number of file descriptors.
For example, with 100 connections of which two are active, epoll tells the user process exactly which two are active so they can be handled directly, whereas select has to loop over all of them.

(For understanding) epoll supports both level triggering and edge triggering. With edge triggering, the kernel only tells the process which file descriptors have just become ready; it says so only once, and if the process takes no action it will not be told again. Edge triggering gives theoretically higher performance, but the code implementation is considerably more complex.
Another essential improvement is that epoll uses an event-based readiness notification method. With select/poll, the kernel scans all monitored file descriptors only after the call is made; with epoll, a file descriptor is registered beforehand with epoll_ctl(), and once it becomes ready the kernel uses a callback-like mechanism to activate it quickly, so the process is notified when it calls epoll_wait().
So the so-called asynchronous IO of servers such as Nginx and Tornado is actually IO multiplexing.
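To make the register/notify flow concrete, here is a minimal epoll echo sketch (an added example; select.epoll is Linux-only and the port is arbitrary). Registration is level-triggered by default; passing select.EPOLLET when registering would switch a descriptor to edge-triggered mode:

# epoll sketch (Linux-only): descriptors are registered once with
# ep.register(); ep.poll() then reports only the ready ones.
import select
import socket

sk = socket.socket()
sk.bind(("127.0.0.1", 9906))
sk.listen(5)
sk.setblocking(False)

ep = select.epoll()
ep.register(sk.fileno(), select.EPOLLIN)          # level-triggered by default
connections = {}

while True:
    for fd, event in ep.poll(5):                  # timeout in seconds
        if fd == sk.fileno():
            conn, addr = sk.accept()
            conn.setblocking(False)
            connections[conn.fileno()] = conn
            ep.register(conn.fileno(), select.EPOLLIN)
        elif event & select.EPOLLIN:
            conn = connections[fd]
            data = conn.recv(1024)
            if data:
                conn.sendall(data)                # echo the data back
            else:
                ep.unregister(fd)                 # client closed the connection
                conn.close()
                del connections[fd]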

Example 1 (non-blocking IO):

Server

import time
import socket

sk = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# sk.setsockopt
sk.bind(('127.0.0.1', 6667))
sk.listen(5)
sk.setblocking(False)

while True:
    try:
        print('waiting for client connection ...')
        connection, address = sk.accept()   # process actively polls
        print("+++", address)
        client_message = connection.recv(1024)
        print(str(client_message, 'utf8'))
        connection.close()
    except Exception as e:
        print(e)
        time.sleep(4)

Client

import time
import socket

sk = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

while True:
    sk.connect(('127.0.0.1', 6667))
    print("hello")
    sk.sendall(bytes("hello", "utf8"))
    time.sleep(2)
    break

Example 2 (IO multiplexing):

The task of listening is handed to the kernel by invoking a function such as select. IO multiplexing has dedicated system calls: the select, poll and epoll functions. The select call is kernel-level; the difference between select polling and the non-blocking polling above is that select can wait on multiple sockets at once, listening on multiple IO ports at the same time, and returns any socket as readable as soon as its data is ready. The process then makes the recvfrom system call to copy the data from the kernel to the user process; that copy phase is, of course, still blocking.

Server

import socket
import select

sk = socket.socket()
sk.bind(("127.0.0.1", 9904))
sk.listen(5)

while True:
    # only the listening socket is monitored here; per-connection data
    # is handled in the chat example below
    r, w, e = select.select([sk, ], [], [], 5)
    for i in r:
        conn, add = i.accept()
        print(conn)
        conn.sendall(bytes("hello", "utf8"))   # reply so the client's first recv() returns
    print('>>>>>>')

Client

import socket

sk = socket.socket()
sk.connect(("127.0.0.1", 9904))

while 1:
    inp = input(">>").strip()
    sk.send(inp.encode("utf8"))
    data = sk.recv(1024)
    print(data.decode("utf8"))

Multi-person Concurrent Chat

Server

import socket
import select

sk = socket.socket()
sk.bind(("127.0.0.1", 8801))
sk.listen(5)
inputs = [sk, ]

while True:
    r, w, e = select.select(inputs, [], [], 5)
    print(len(r))
    for obj in r:
        if obj == sk:
            # the listening socket is readable: a new client is connecting
            conn, add = obj.accept()
            print(conn)
            inputs.append(conn)
        else:
            # an accepted connection is readable: receive and answer it
            data_byte = obj.recv(1024)
            print(str(data_byte, 'utf8'))
            inp = input('answer client %s >>> ' % inputs.index(obj))
            obj.sendall(bytes(inp, 'utf8'))
    print('>>', r)

Client:

import socket

sk = socket.socket()
sk.connect(('127.0.0.1', 8801))

while True:
    inp = input(">>>>")
    sk.sendall(bytes(inp, "utf8"))
    data = sk.recv(1024)
    print(str(data, 'utf8'))

"More References" http://www.cnblogs.com/yuanchenqi/articles/5722574.html
