I/O multiplexing
I/O multiplexing improves efficiency by letting a single process monitor the I/O of many network connections at the same time.
I/O stands for input/output.
With I/O multiplexing, one mechanism watches multiple file descriptors; as soon as a descriptor becomes ready (for reading or for writing), the program is notified and can perform the corresponding read or write.
I/O multiplexing avoids blocking on any single I/O operation: work that would otherwise need multiple processes or threads, one per connection, is handled by one process or thread that holds many sockets and polls them.
Select
select is a system call that monitors sets of file descriptors. When select() returns, the kernel has marked the descriptors that are ready, and the process can then read from or write to them accordingly.
select works as follows:
The caller passes in the set of descriptors to monitor, which is copied from user space to kernel space
The kernel scans the set in a linear loop, traversing the whole set on every call
When the kernel finds descriptors whose state matches the requested operation, it returns them
The monitored sockets should therefore be set to non-blocking; only then is the program guaranteed never to block on an individual socket
Advantages
Supported on essentially all platforms
Disadvantages
Every call to select copies the FD set from user space to kernel space, which becomes expensive when there are many FDs.
The number of FDs a single process can monitor has an upper limit, because select uses a fixed-size data structure (FD_SETSIZE, commonly 1024).
Every call to select traverses the whole set linearly, so the traversal cost is high when there are many FDs.
Using select in Python
r, w, e = select.select(rlist, wlist, errlist[, timeout])
rlist, wlist and errlist are all lists of waitable objects: each item is either an integer file descriptor or an object with a fileno() method that returns one.
rlist: file descriptors to wait on until they are ready for reading
wlist: file descriptors to wait on until they are ready for writing
errlist: file descriptors to watch for exceptional conditions
On Linux all three lists may be empty at once, but not on Windows
When a file descriptor in rlist becomes readable (a call to accept or read would not block), it is added to the r list.
When a file descriptor in wlist becomes writable, it is added to the w list.
When an error occurs on a file descriptor in errlist, it is added to the e list.
If no timeout is given and none of the monitored file descriptors changes, select blocks until one does.
With the timeout set to 1, select blocks for up to 1 second if nothing changes and then returns three empty lists. If a descriptor becomes ready first, select returns immediately.
The three lists accept Python file objects such as sys.stdin and the objects returned by open(), the integer descriptors returned by os.open(), socket objects returned by socket.socket(), and instances of custom classes, as long as the class has a suitable fileno() method that returns a real file descriptor.
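The timeout and fileno() rules above can be checked without a network. The following is a minimal sketch, not part of the original example: the Wrapped class and the demo() function are hypothetical names, and socket.socketpair() is used only to get two connected sockets locally.

```python
import select
import socket


class Wrapped:
    """Hypothetical wrapper: select() only needs a working fileno()."""

    def __init__(self, sock):
        self.sock = sock

    def fileno(self):
        # Must return the real descriptor of the underlying object
        return self.sock.fileno()


def demo():
    a, b = socket.socketpair()  # two connected sockets, no network needed
    watched = Wrapped(a)
    try:
        # Nothing has been sent yet, so select() gives up after 0.1 s
        idle = select.select([watched], [], [], 0.1)
        b.send(b"ping")
        # Now the wrapped socket is readable and comes back in the first list
        r, w, e = select.select([watched], [], [], 0.1)
        return idle, r[0] is watched, a.recv(4)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

Note that select returns the object that was passed in (here the wrapper), not the raw descriptor, which is why the server code can compare returned items directly with its listening socket.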
# -*- coding: utf-8 -*-
import select
import socket
import datetime

response = b"Hello, world!"

sock = socket.socket()
# Socket-level options are set at level SOL_SOCKET (SOL = socket option level)
# SO_REUSEADDR allows the local address to be reused
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("localhost", 10000))
sock.listen(5)
sock.setblocking(0)

inputs = [sock, ]

while True:
    print(datetime.datetime.now())
    # Wait up to 10 seconds (illustrative timeout value)
    rlist, wlist, errlist = select.select(inputs, [], [], 10)
    print(">>>", rlist, wlist, errlist)
    for s in rlist:
        if s == sock:
            con, addr = s.accept()
            # Add the new connection to the monitored list
            inputs.append(con)
        else:
            # Receive data from a client socket and send the reply
            try:
                data = s.recv(1024)
                if data:
                    s.send(response)
            finally:
                s.close()
                inputs.remove(s)
Poll
poll is essentially the same as select, except that it has no fixed cap on the number of monitored connections: poll keeps the descriptors in a variable-length list (a linked list inside the Linux kernel), whereas select uses a fixed-size array whose length is set at initialization and cannot change.
How poll works
The FD list is copied from user space to kernel space
The kernel traverses the list and, once it finds FDs that are ready, returns the list of ready FDs
Poll events
POLLIN: there is data to read
POLLPRI: there is urgent data to read
POLLOUT: ready for output; writing will not block
POLLERR: an error condition occurred
POLLHUP: hung up
POLLNVAL: invalid request; the descriptor is not open
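The event value reported for a descriptor is a bitmask, so several of the flags above can be set at once, and it is safer to test it with & than with ==. A small sketch of decoding such a mask follows; describe() and FLAG_NAMES are hypothetical helper names, not part of the select module.

```python
import select

# Flag names as listed above; getattr() skips any the platform lacks
FLAG_NAMES = ["POLLIN", "POLLPRI", "POLLOUT", "POLLERR", "POLLHUP", "POLLNVAL"]


def describe(mask):
    """Return the names of all poll flags set in an event mask."""
    names = []
    for name in FLAG_NAMES:
        flag = getattr(select, name, None)
        if flag is not None and mask & flag:
            names.append(name)
    return names


if __name__ == "__main__":
    # A peer that sent data and then closed can produce both flags together
    print(describe(select.POLLIN | select.POLLHUP))
```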
Advantages
Usable across platforms (in Python, select.poll() is not available on Windows)
Disadvantages
Every call to poll copies the FD collection from user space to kernel space, which becomes expensive when there are many FDs.
Every call to poll traverses the whole list linearly, so the traversal cost is high when there are many FDs.
Using poll in Python
poll methods
register: registers a file descriptor to be monitored, together with the event types to watch for
unregister: stops monitoring a file descriptor
modify: changes the event types monitored for a file descriptor
poll([timeout]): polls the registered file descriptors and returns a list of tuples, each holding a file descriptor and the event that occurred on it (POLLIN, POLLOUT, and so on). If timeout is set, the call blocks for up to timeout milliseconds and returns an empty list if nothing happened; if no timeout is set, it blocks until there is something to return.
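The four methods above can be tried in isolation on a connected socket pair. This is a minimal sketch under that assumption; poll_roundtrip() is a hypothetical name, and the 100 passed to poll() is a timeout in milliseconds.

```python
import select
import socket


def poll_roundtrip():
    """Exercise register, poll, modify and unregister on one socket."""
    a, b = socket.socketpair()
    poller = select.poll()
    seen = []
    try:
        poller.register(a, select.POLLIN)   # watch a for readable data
        seen.append(poller.poll(100))       # nothing sent yet: times out
        b.send(b"x")
        seen.append(poller.poll(100))       # a is now reported as readable
        poller.modify(a, select.POLLOUT)    # watch for writability instead
        seen.append(poller.poll(100))       # a is reported as writable
        poller.unregister(a)                # stop monitoring a
        seen.append(poller.poll(100))       # nothing registered: times out
        return a.fileno(), seen
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(poll_roundtrip())
```

Unlike select, poll reports integer descriptors rather than the registered objects, which is why server code typically keeps a dictionary mapping descriptors back to sockets.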
# -*- coding: utf-8 -*-
import select
import socket
import datetime

sock = socket.socket()
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("localhost", 10000))
sock.listen(5)
# Set the listening socket to non-blocking
sock.setblocking(0)

poll = select.poll()
poll.register(sock, select.POLLIN)
connections = {}

while True:
    # Poll the monitored file descriptors (timeout is in milliseconds)
    print(datetime.datetime.now())
    for fd, event in poll.poll(10000):
        if event & select.POLLIN:
            if fd == sock.fileno():
                # The listening socket is readable: accept the new connection
                con, addr = sock.accept()
                poll.register(con.fileno(), select.POLLIN)
                connections[con.fileno()] = con
            else:
                # A client socket is readable: read it, then wait until it is writable
                con = connections[fd]
                data = con.recv(1024)
                if data:
                    print("%s accept %s" % (fd, data))
                    poll.modify(fd, select.POLLOUT)
                else:
                    # Empty read: the peer closed the connection
                    poll.unregister(fd)
                    connections.pop(fd)
                    con.close()
        else:
            con = connections[fd]
            try:
                con.send(b"Hello, %d" % fd)
                print("con >>>", con)
            finally:
                poll.unregister(con)
                connections.pop(fd)
                con.close()
Epoll
epoll is a mechanism provided by the Linux kernel, designed mainly to fix several shortcomings of select and poll:
Array length limit
Solution: the FD limit is the maximum number of files that can be opened. The exact number can be read from /proc/sys/fs/file-max and generally depends on the amount of system memory.
The whole array must be copied to kernel space on every poll
Solution: the FD is copied to kernel space once, when its event is registered, rather than on every poll, so each FD only needs to be copied a single time.
Every call requires a linear traversal of the list
Solution: traversal is no longer used. Each FD is given a callback function; when the FD becomes ready, the callback runs and adds the FD to a ready list, so epoll only needs to look at the ready list.
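The descriptor limit mentioned in the first solution above can be inspected from Python. The sketch below reads /proc/sys/fs/file-max as the text describes; it also reports the per-process limit RLIMIT_NOFILE, which is an addition not mentioned in the original and is usually the limit a single process reaches first. descriptor_limits() is a hypothetical helper name.

```python
import resource


def descriptor_limits(path="/proc/sys/fs/file-max"):
    """Return (system_wide_max, per_process_soft, per_process_hard)."""
    try:
        with open(path) as f:
            system_wide = int(f.read().strip())
    except (OSError, ValueError):
        system_wide = None  # not Linux, or the file is unreadable
    # Per-process limit on open descriptors (assumption: relevant in practice)
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return system_wide, soft, hard


if __name__ == "__main__":
    print(descriptor_limits())
```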
The two event models in epoll
Level-triggered: this is epoll's default event model for an FD. Whenever the FD is readable or writable, epoll reports it. For example, if an FD is readable but recv does not read everything, the FD is reported again on the next poll. Relatively speaking, this is the safer model.
Edge-triggered: epoll can also monitor an FD in edge-triggered mode, in which each event is reported only once. Even if only half of the data has been processed, the FD is not reported again until another event occurs on it.
Usage example: epoll.register(serversocket.fileno(), select.EPOLLIN | select.EPOLLET)
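The difference between the two models shows up as soon as a read leaves data behind. The following Linux-only sketch writes four bytes, reads two, and counts how many of two polls report the socket; wakeups() is a hypothetical helper name, and a local socket pair stands in for a real connection.

```python
import select
import socket


def wakeups(extra_flags):
    """Count how many of two polls report a half-read socket as readable."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(a.fileno(), select.EPOLLIN | extra_flags)
        b.send(b"abcd")
        count = 0
        if ep.poll(0.1):   # first notification: data has arrived
            count += 1
            a.recv(2)      # read only half of it
        if ep.poll(0.1):   # is the leftover data reported again?
            count += 1
        return count
    finally:
        ep.close()
        a.close()
        b.close()


if __name__ == "__main__":
    print("level-triggered:", wakeups(0))
    print("edge-triggered: ", wakeups(select.EPOLLET))
```

In level-triggered mode the leftover bytes trigger a second report; in edge-triggered mode they do not, which is why edge-triggered handlers normally keep reading until recv would block.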
Using epoll in Python
# -*- coding: utf-8 -*-
import select
import socket
import datetime

EOL1 = b'\n\n'
EOL2 = b'\n\r\n'
response = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
response += b'Hello, world!'

sock = socket.socket()
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("localhost", 10000))
sock.listen(5)
sock.setblocking(0)

epoll = select.epoll()
epoll.register(sock, select.EPOLLIN)

# Per-connection request and response buffers, needed for long-lived connections
connections = {}
requests = {}
responses = {}

try:
    while True:
        print(datetime.datetime.now())
        events = epoll.poll(1)
        print(events)
        for fd, event in events:
            if fd == sock.fileno():
                # Accept the new connection and register it edge-triggered
                con, addr = sock.accept()
                con.setblocking(0)
                epoll.register(con, select.EPOLLIN | select.EPOLLET)
                connections[con.fileno()] = con
                requests[con.fileno()] = b''
                responses[con.fileno()] = response
            elif event & select.EPOLLIN:
                con = connections[fd]
                requests[fd] += con.recv(1024)
                # Check whether the client has finished sending its request
                if EOL1 in requests[fd] or EOL2 in requests[fd]:
                    epoll.modify(fd, select.EPOLLOUT)
                    print('-' * 40 + '\n' + requests[fd].decode()[:-2])
            elif event & select.EPOLLOUT:
                con = connections[fd]
                byteswritten = con.send(responses[fd])
                # Drop the bytes already sent; once everything has been sent,
                # stop watching the fd and shut the connection down
                responses[fd] = responses[fd][byteswritten:]
                if len(responses[fd]) == 0:
                    epoll.modify(fd, 0)
                    con.shutdown(socket.SHUT_RDWR)
            elif event & select.EPOLLHUP:
                # Hang-up: unregister the fd, close the socket, forget the connection
                epoll.unregister(fd)
                connections[fd].close()
                del connections[fd]
finally:
    epoll.unregister(sock)
    sock.close()