Overview
Before explaining I/O multiplexing, a few concepts need to be described first:
-User space and kernel space
-Process Switching
-Blocking of the process
-File descriptor
-Cache I/O
User space and kernel space
Modern operating systems use virtual memory. For a 32-bit operating system, the addressing space (virtual address space) is 4 GB (2^32 bytes). The core of the operating system is the kernel, which is independent of ordinary applications; it can access the protected memory space and has full permission to access the underlying hardware devices. To guarantee that user processes cannot operate on the kernel directly and to keep the kernel safe, the operating system divides the virtual address space into two parts: kernel space and user space. For the Linux operating system, the highest 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is reserved for the kernel and is called kernel space, while the lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by each process and is called user space.
Process Switching
To control process execution, the kernel must be able to suspend a process running on the CPU and resume the execution of a previously suspended process. This behavior is called process switching. It can therefore be said that any process runs with the support of the operating system kernel and is closely tied to the kernel.
Switching from one running process to another involves the following steps:
1. Save the processor context, including the program counter and other registers.
2. Update the PCB information.
3. Move the process's PCB into the appropriate queue, such as the ready queue or the blocked queue for the event it is waiting on.
4. Select another process to execute and update its PCB.
5. Update the memory-management data structures.
6. Restore the processor context.
All of this is very resource-intensive; for details, refer to this article: process switching
Note: The process control block (PCB) is a data structure in the operating system kernel that mainly represents the state of a process. Its purpose is to turn a program (including its data), which cannot run on its own in a multiprogramming environment, into a basic unit that can run independently and execute concurrently with other processes. In other words, the OS controls and manages concurrent processes based on their PCBs. The PCB is usually a contiguous area of system memory that stores all the information the operating system needs to describe the process and control its execution.
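As a purely illustrative toy sketch (a real PCB, such as Linux's task_struct, is a C structure inside the kernel), the kind of information a PCB records might be pictured like this:

# Toy illustration only: the fields a PCB typically records, written as a plain
# Python class. A real PCB lives in kernel memory, not in user space.
class ProcessControlBlock(object):
    def __init__(self, pid):
        self.pid = pid                  # process identifier
        self.state = 'ready'            # running / ready / blocked ...
        self.program_counter = None     # saved program counter
        self.registers = {}             # saved CPU register values
        self.priority = 0               # scheduling information
        self.memory_info = {}           # memory-management information (page tables, ...)
        self.open_files = []            # I/O status: open file descriptors

# The kernel keeps PCBs in queues (ready queue, blocked queues); on a process
# switch it saves the old context into one PCB and restores the next process's
# context from another.
pcb = ProcessControlBlock(pid=1234)
pcb.state = 'blocked'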
Blocking of the process
When a running process is waiting for some expected event that has not yet occurred, such as a failed request for a system resource, the completion of an operation, data that has not yet arrived, or new work to do, the system automatically executes the blocking primitive (block), so that the process changes itself from the running state to the blocked state. Blocking is therefore an active behavior of the process itself, and only a running process (one that holds the CPU) can put itself into the blocked state. A process in the blocked state consumes no CPU resources.
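From the application's point of view, blocking is easy to observe with an ordinary socket call. A minimal sketch, assuming a server is reachable at localhost:12300 (the address used by the example later in this post): recv() simply does not return until data arrives, and the process consumes no CPU while it waits.

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('localhost', 12300))   # assumes the example server below is running
# The socket is in blocking mode by default: recv() puts this process into the
# blocked state until the kernel has data for it (or the peer closes).
data = s.recv(1024)
print('woke up with %r' % (data,))
s.close()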
File Descriptor fd
A file descriptor is a term in computer science; it is an abstract concept used to refer to a file.
Formally, a file descriptor is a non-negative integer. In fact, it is an index into a per-process table maintained by the kernel that records the files opened by that process. When a program opens an existing file or creates a new file, the kernel returns a file descriptor to the process. In programming, low-level code is often written around file descriptors. The concept of a file descriptor, however, generally applies only to UNIX-like operating systems such as Linux.
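As a small illustration in Python (the file paths below are placeholders), the descriptor behind an open file object can be inspected with fileno(), and descriptors 0, 1 and 2 conventionally refer to standard input, output and error:

import os
import sys

print(sys.stdin.fileno())    # usually 0
print(sys.stdout.fileno())   # usually 1
print(sys.stderr.fileno())   # usually 2

f = open('/etc/hostname', 'rb')      # placeholder path; any existing file works
print(f.fileno())                    # a small non-negative integer handed out by the kernel
f.close()

fd = os.open('/tmp/example.txt', os.O_WRONLY | os.O_CREAT)  # os.open returns the raw fd
os.write(fd, b'hello')
os.close(fd)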
Cache I/O
Cache I/O is also known as standard I/O, and the default I/O operations of most file systems are cache I/O. In the Linux cache I/O mechanism, the operating system caches I/O data in the file system's page cache: data is first copied into the operating system's kernel buffer and only then copied from the kernel buffer into the application's address space.
Disadvantage of cache I/O:
During a transfer the data must be copied between the kernel and the application's address space, and the CPU and memory overhead of these copy operations can be significant.
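A minimal sketch of where the copies happen during an ordinary buffered read (the file path is a placeholder); the Python code itself is unremarkable, the point is the data path described in the comments:

# Ordinary (cache/standard) I/O: the kernel first reads the block from disk into
# the page cache (copy 1: device -> kernel buffer), then read() copies it from
# the page cache into this process's buffer (copy 2: kernel buffer -> user space).
with open('/var/log/syslog', 'rb') as f:   # placeholder path
    chunk = f.read(4096)
print(len(chunk))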
I/O multiplexing scenarios
Consider this situation: a TCP client has to handle two inputs at the same time, standard input and a TCP socket. The problem arises when the server process is killed while the client is blocked in an fgets call on standard input. The server's TCP may correctly send a FIN to the client's TCP, but because the client process is blocked reading from standard input, it will not see the EOF until it reads from the socket (possibly much later). Such a process needs the ability to tell the kernel in advance which I/O conditions it is interested in, so that the kernel notifies it as soon as one or more of those conditions are ready (that is, input is ready to be read, or a descriptor can accept more output). This capability is called I/O multiplexing and is provided by both the select and poll functions.
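A hedged sketch of exactly that situation, assuming a TCP server is already listening on localhost:12300 (the port used by the example later in this post). Instead of blocking in a read on standard input, the client asks select to watch both standard input and the socket, so a FIN from the server is noticed immediately. This works on Unix, where sys.stdin can be passed to select:

import select
import socket
import sys

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('localhost', 12300))   # assumed server address

while True:
    # Wait until either standard input or the socket is readable.
    readable, _, _ = select.select([sys.stdin, sock], [], [])
    if sock in readable:
        data = sock.recv(1024)
        if not data:                  # EOF: the server sent a FIN
            print('server closed the connection')
            break
        print('from server: %r' % (data,))
    if sys.stdin in readable:
        line = sys.stdin.readline()
        if not line:                  # EOF on standard input
            break
        sock.send(line.encode())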
I/O multiplexing is typically used in the following situations:
(1) When a client handles multiple descriptors (typically interactive input and a network socket), I/O multiplexing must be used.
(2) When a client handles multiple sockets at the same time; this is possible, but rarely occurs.
(3) When a TCP server handles both a listening socket and connected sockets, I/O multiplexing is generally used.
(4) When a server handles both TCP and UDP, I/O multiplexing is generally used.
(5) When a server handles multiple services or multiple protocols, I/O multiplexing is generally used.
Compared with multi-process and multi-threaded techniques, the biggest advantage of I/O multiplexing is its low system overhead: the system does not have to create processes/threads or maintain them, which greatly reduces overhead. select lets a single process handle several non-blocking socket connections at the same time.
I/O models
There are 5 I/O models available under UNIX: blocking I/O (blocking IO), non-blocking I/O (non-blocking IO), I/O multiplexing (IO multiplexing, select and poll), signal-driven I/O (signal-driven IO), and asynchronous I/O (asynchronous IO).
A detailed explanation of each model can be found in Chapter 6 of UNIX Network Programming; here we only briefly describe I/O multiplexing:
With I/O multiplexing, we can call select or poll and block on one of these two system calls instead of blocking on the actual I/O system call.
We block in a call to select, waiting for the datagram socket to become readable. When select returns that the socket is readable, we call recvfrom to copy the datagram into the application process's buffer.
Another I/O model closely related to I/O multiplexing is using blocking I/O in multiple threads. That model is very similar to the one described above, but instead of using select to block on multiple file descriptors, it uses multiple threads (one thread per file descriptor), so each thread is free to call blocking I/O system calls such as recvfrom.
I/O multiplexing is therefore characterized by a mechanism in which a process can wait on multiple file descriptors at the same time, and select() returns as soon as any one of these file descriptors (socket descriptors) enters the read-ready state.
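As a minimal sketch of the datagram case described above (the port number and timeout are made up for illustration): the process blocks in select rather than in recvfrom, and calls recvfrom only after select reports the socket readable.

import select
import socket

udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.bind(('0.0.0.0', 9999))          # hypothetical port

# Block in select (not in recvfrom) until the datagram socket is readable
# or 5 seconds pass.
readable, _, _ = select.select([udp], [], [], 5.0)
if udp in readable:
    # select says a datagram is waiting, so this recvfrom will not block.
    data, addr = udp.recvfrom(65535)
    print('got %d bytes from %s' % (len(data), addr))
else:
    print('timed out, no datagram arrived')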
Select function
select lets a single process handle several non-blocking socket connections at the same time. The function allows a process to instruct the kernel to wait for any one of multiple events to occur, and to wake the process only when one or more of those events occur or when a specified amount of time has passed.
select(rlist, wlist, xlist, timeout=None)
The select function monitors three categories of file descriptors: readfds, writefds, and exceptfds. After it is called, select blocks until a descriptor becomes ready (readable, writable, or with an exceptional condition) or until the timeout expires (timeout specifies the maximum wait time; if it is None, the call blocks until a descriptor is ready). When select returns, the ready descriptors can be found by iterating over the fd sets (in Python, the three returned lists).
As an example, we can call select and tell the kernel to return only when one of the following occurs: any descriptor in the set {1, 4, 5} is ready for reading, any descriptor in the set {2, 7} is ready for writing, any descriptor in the set {1, 4} has an exceptional condition pending, or 10 seconds have elapsed.
In other words, the call to select tells the kernel which descriptors we are interested in (for reading, writing, or exceptional conditions) and how long to wait.
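To make the three lists and the timeout concrete, here is a small runnable sketch on Unix that mirrors the example above, using two freshly created pipe descriptors instead of the literal numbers (select.select also accepts raw integer descriptors, as long as they are open):

import os
import select

r1, w1 = os.pipe()    # first pipe: we will watch r1 for reading
r2, w2 = os.pipe()    # second pipe: we will watch w2 for writing

os.write(w1, b'x')    # make r1 readable so the call returns immediately

# "Return when r1 is readable, or w2 is writable, or 10 seconds have elapsed."
readable, writable, exceptional = select.select([r1], [w2], [], 10)
print(readable)       # [r1]  -> r1 has data to read
print(writable)       # [w2]  -> an empty pipe buffer is writable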
select is currently supported on almost all platforms, and its good cross-platform support is one of its advantages. One disadvantage is that the number of file descriptors a single process can monitor is limited, typically to 1024 on Linux; the limit can be raised by changing the macro definition and even recompiling the kernel, but this also lowers efficiency. The reason is that select finds the ready sockets by traversing all monitored file descriptors, and of a large number of simultaneously connected clients only a few may be ready at any moment, so efficiency drops linearly as the number of monitored descriptors grows.
An example using select
Python's select() calls the operating system's I/O interface directly. It monitors sockets, open files, and pipes (anything with a fileno() method) until they become readable or writable, or a communication error occurs. select() makes it easy to monitor multiple connections at the same time, and it is more efficient than writing a long loop to wait on and monitor multiple client connections, because select works through the C network interface provided by the operating system rather than through the Python interpreter.
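Because select only needs a fileno() method, any wrapper object that exposes one can be monitored directly. A minimal sketch (the Connection wrapper class here is invented for illustration):

import select
import socket

class Connection(object):
    """Hypothetical wrapper: select() calls fileno() to get the descriptor."""
    def __init__(self, sock):
        self.sock = sock

    def fileno(self):
        return self.sock.fileno()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('localhost', 0))
server.listen(1)

conn = Connection(server)
# The wrapper object itself appears in the returned lists, not the raw descriptor.
readable, _, _ = select.select([conn], [], [], 0.1)
print(readable)        # [] here, since nobody has connected within 0.1 s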
My personal summary of the general flow is as follows:
Server:
(1) Create a TCP/IP socket
(2) Create the communication lists
(3) In the main loop, handle inputs, handle outputs, and handle "exceptional conditions"
Client:
(1) Create the connection sockets
(2) Connect to the server concurrently
(3) Send and receive data
For reference, see this English document: https://pymotw.com/2/select/
Server
# -*- coding: utf-8 -*-
import select
import socket
import sys
import Queue

# Use this demo to understand the asynchronous (event-driven) design pattern.

# --------------------------------------------------- Create a TCP/IP socket
# AF_INET (also called PF_INET) is the socket family for the IPv4 protocol,
# AF_INET6 is for IPv6, and AF_UNIX is for UNIX domain sockets. Two socket
# types are commonly used: stream sockets (SOCK_STREAM), which are
# connection-oriented and used for the TCP service, and datagram sockets
# (SOCK_DGRAM), which are connectionless and used for the UDP service.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# setblocking(0) puts the socket into non-blocking mode; setblocking(1) is the
# blocking default. In non-blocking mode, if recv() finds no data, or send()
# cannot send the data immediately, a socket.error exception is raised.
server.setblocking(0)

# Bind the socket to the port
server_address = ('localhost', 12300)
print >> sys.stderr, 'starting up on %s port %s' % server_address
server.bind(server_address)

# Listen for incoming connections
server.listen(5)

# --------------------------------------------------- Communication lists
# select() receives and monitors 3 lists: the first is everything we want to
# read (data sent in from outside), the second is everything with outgoing
# data to send, and the third monitors errors.
# The first list holds the server socket and every incoming client connection;
# the server's main loop processes the data from this list.
# Sockets from which we expect to read. At the start we only watch the server
# socket; it becomes active when a client connects.
inputs = [server]

# The second list: sockets to which we expect to write
outputs = []

# Outgoing message queues (socket: Queue)
message_queues = {}

# --------------------------------------------------- Main loop
while inputs:
    # Wait for at least one of the sockets to be ready for processing.
    # While none of the monitored handles has activity, select blocks here.
    print >> sys.stderr, '\nwaiting for the next event'
    # select checks whether any descriptor in these lists is ready and returns
    # three lists of ready objects, which we then loop over.
    readable, writable, exceptional = select.select(inputs, outputs, inputs)

    # --------------------------------------------------- Handle inputs
    for s in readable:
        if s is server:
            # First case: the main "server" socket, which listens for client
            # connections, is readable -- it is ready to accept a new connection.
            connection, client_address = s.accept()
            print >> sys.stderr, 'new connection from', client_address
            # To handle several connections at once, the new socket is also
            # made non-blocking.
            connection.setblocking(0)
            # Put the new connection into the inputs list; do not recv() right
            # after accept(), to avoid blocking when no data has arrived yet.
            inputs.append(connection)
            # The dictionary uses the socket object as the key; every connection
            # gets its own queue for the data we want to send back.
            message_queues[connection] = Queue.Queue()
        else:
            # Second case: an established connection has sent data; read it
            # with recv() and put it on the queue so it can be sent back later.
            data = s.recv(1024)
            if data:
                # A readable client socket has data
                print >> sys.stderr, 'received "%s" from %s' % (data, s.getpeername())
                # getpeername() returns the remote address the socket is connected to
                message_queues[s].put(data)
                # Add output channel for response: put the connection in outputs
                if s not in outputs:
                    outputs.append(s)
            else:
                # No data: interpret the empty result as a closed connection and
                # remove it from the lists, otherwise the while loop keeps
                # printing "waiting for the next event" for it.
                print >> sys.stderr, 'closing', client_address, 'after reading no data'
                if s in outputs:
                    outputs.remove(s)      # the client is gone, nothing to send back
                inputs.remove(s)           # stop listening for input on the connection
                s.close()
                # Remove the message queue. Because Python uses references and
                # has a GC mechanism, del acts on the variable (the name), not
                # on the data object itself.
                del message_queues[s]

    # --------------------------------------------------- Handle outputs
    # Loop over the connections that are ready to be written to.
    for s in writable:
        try:
            # Queue.get_nowait() fetches a task from the queue without blocking
            # (just as Queue.put_nowait() adds one without blocking).
            next_msg = message_queues[s].get_nowait()
        except Queue.Empty:
            # No messages waiting, so stop checking for writability.
            print >> sys.stderr, 'output queue for', s.getpeername(), 'is empty'
            outputs.remove(s)
        else:
            # Runs only if no exception was raised in the try block.
            print >> sys.stderr, 'sending "%s" to %s' % (next_msg, s.getpeername())
            s.send(next_msg.upper())   # send the data taken from the queue

    # --------------------------------------------------- Handle "exceptional conditions"
    # A connection with a communication error is removed from inputs and outputs.
    for s in exceptional:
        print >> sys.stderr, 'handling exceptional condition for', s.getpeername()
        # Stop listening for input on the connection
        inputs.remove(s)
        if s in outputs:
            outputs.remove(s)
        s.close()
        # Remove the message queue
        del message_queues[s]
Client
# -*- coding: utf-8 -*-
# Connect to the server concurrently: this example sends messages to the server
# over two connections without blocking, giving an effect similar to using
# multiple threads for network communication.
import socket
import sys

messages = ['This is the message. ',
            'It will be sent ',
            'in parts.',
            ]
server_address = ('localhost', 12300)

# Create the TCP/IP sockets. The client opens two connections and keeps them in
# a list so both can be driven from the same loops below.
socks = [socket.socket(socket.AF_INET, socket.SOCK_STREAM),
         socket.socket(socket.AF_INET, socket.SOCK_STREAM),
         ]

# Connect the sockets to the port where the server is listening
print >> sys.stderr, 'connecting to %s port %s' % server_address
for s in socks:
    s.connect(server_address)

for message in messages:
    # Send messages on both sockets
    for s in socks:
        print >> sys.stderr, '%s: sending "%s"' % (s.getsockname(), message)
        s.send(message)

    # Read responses on both sockets
    for s in socks:
        data = s.recv(1024)
        print >> sys.stderr, '%s: received "%s"' % (s.getsockname(), data)
        if not data:
            print >> sys.stderr, 'closing socket', s.getsockname()
            s.close()
Run Result:
Server:
Client:
Poll
I have not yet studied poll thoroughly; the following references and demos may help: http://www.cnblogs.com/alex3714/articles/5876749.html https://www.zhihu.com/question/20122137/answer/14049112
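For reference only, a minimal hedged sketch of the standard-library select.poll interface (available on Linux and most Unix systems; the port number is made up) looks roughly like this, echoing data back in upper case like the select demo above:

import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(0)
server.bind(('localhost', 12301))    # hypothetical port
server.listen(5)

poller = select.poll()
poller.register(server, select.POLLIN)   # interested in "readable" events
fd_to_socket = {server.fileno(): server}

while True:
    # poll() returns (fd, event) pairs; the timeout is in milliseconds.
    events = poller.poll(1000)
    for fd, flag in events:
        s = fd_to_socket[fd]
        if s is server:
            # The listening socket is readable: accept the new connection.
            conn, addr = server.accept()
            conn.setblocking(0)
            fd_to_socket[conn.fileno()] = conn
            poller.register(conn, select.POLLIN)
        elif flag & (select.POLLHUP | select.POLLERR):
            # The peer hung up or an error occurred: stop watching the socket.
            poller.unregister(s)
            del fd_to_socket[fd]
            s.close()
        elif flag & select.POLLIN:
            data = s.recv(1024)
            if data:
                s.send(data.upper())     # echo back, upper-cased like the select demo
            else:
                poller.unregister(s)
                del fd_to_socket[fd]
                s.close()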