Python Concurrency Learning Summary


Contents

    • 1. Understanding the Operating System
    • 2. Types of Tasks
    • 3. The socket Module
    • 4. A Simple Client/Server Program
    • 5. Using Blocking IO for Concurrency
      • Scenario One: Blocking IO + multiprocessing
      • Scenario Two: Blocking IO + multithreading
      • Thoughts on and summary of the blocking IO model
    • 6. Using Non-blocking IO for Concurrency
      • Scenario One: Non-blocking IO + try + polling
      • Scenario Two: Non-blocking IO + select proxy polling
        • The select function interface
        • Thoughts on the efficiency of polling
      • Scenario Three: Non-blocking IO + selectors + callbacks + event loop (to be added later)
      • Scenario Four: Non-blocking IO + coroutines + callbacks + event loop (to be added later)
      • Thoughts on and summary of non-blocking IO (to be added later)
    • 7. Synchronous/Asynchronous vs. Blocking/Non-blocking IO
1. Understanding the Operating System

The operating system (OS) governs all of the computer's hardware and is responsible for allocating and reclaiming hardware resources for applications.
Hardware resources are always limited, while applications' appetite for resources is greedy.
When multiple applications contend for hardware, the OS is responsible for scheduling, allocating resources among tasks so that the system keeps running stably.
Code can only execute on a CPU, so before an application (task) runs, it must be granted CPU time; at any given moment, one CPU can execute only one task's code.
A computer's CPUs (the supply side) are far fewer than the tasks to be performed (the demand side), so the OS divides CPU time into slices and assigns them according to task type; tasks take turns using the CPU.
CPU execution and switching are very fast, so to the user, multiple tasks appear to run at the same time. This is called concurrency.

The following figure compares serial and concurrent execution (image not preserved in this copy):

The computer's memory, hard disk, network card, screen, keyboard, and other hardware provide the places where data is exchanged.
The OS provides IO interfaces to enable data exchange, and the data exchange itself generally does not require the CPU to participate.
There are two types of IO interface:
1. Blocking IO
While IO (data exchange) is in progress, the calling thread cannot execute the code that follows; it intends to occupy the CPU yet executes nothing. Single-threaded blocking IO by itself cannot support concurrency.
2. Non-blocking IO
While IO (data exchange) is in progress, the calling thread can continue executing the code that follows; single-threaded non-blocking IO by itself can support concurrency.

The following figure compares blocking IO and non-blocking IO (image not preserved in this copy):

2. Types of Tasks

Tasks fall into two types based on the proportion of CPU time they consume during execution:
1. CPU-intensive
Most of the time is spent occupying the CPU and executing code, e.g. scientific computing tasks.
2. IO-intensive
Most of the time is spent not occupying the CPU but waiting on IO operations, e.g. network services.

3. The socket Module

The OS provides both interface types, blocking IO and non-blocking IO, and the application can choose between them.
The socket module wraps both interfaces; the functions it provides default to blocking IO.
The user can switch to non-blocking mode manually by calling socketobj.setblocking(False).
The sections below use a simple example program to record thoughts and conclusions from studying concurrency.
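Before the example program, here is a minimal sketch of what setblocking(False) changes. It uses a connected socket pair so it runs without any network setup; the variable names are illustrative only. In blocking mode, recv() would wait for data; in non-blocking mode, it raises BlockingIOError immediately when nothing is ready.

```python
import select
import socket

# A socket in its default (blocking) mode waits inside recv() until data
# arrives.  After setblocking(False), the same recv() raises BlockingIOError
# immediately when nothing is ready.
a, b = socket.socketpair()
a.setblocking(False)

try:
    a.recv(1024)                   # nothing has been sent yet
except BlockingIOError:
    print('recv would block')      # control returned at once instead of waiting

b.send(b'hello')
select.select([a], [], [], 1)      # wait until the data has actually arrived
data = a.recv(1024)                # now recv succeeds
print(data)
a.close()
b.close()
```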

4. A Simple Client/Server Program

Client: reads the user's input in a loop and sends it to the server, then receives the server's reply and prints it to the screen.
Server: converts the received input to uppercase and returns it to the client.

The client code is fixed; the main focus is the server-side code.
A straightforward version of the server code looks like this:

# Server
import socket

addr = ('127.0.0.1', 8080)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(addr)
server.listen(5)
print('Listening...')

while True:  # connection loop
    conn, client = server.accept()
    print(f'A client connected -> {client}')
    while True:  # message loop
        try:
            request = conn.recv(1024)
            if not request:
                break
            print(f"request: {request.decode('utf-8')}")
            conn.send(request.upper())
        except ConnectionResetError as why:
            print(f'Client lost, reason: {why}')
            break
    conn.close()

The client code remains the same:

# Client
import socket

addr = ('127.0.0.1', 8080)
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(addr)
print(f'Connected to server {addr}')

while True:  # message loop
    inp = input('>>>').strip()
    if not inp:
        continue
    try:
        client.send(inp.encode('utf-8'))
        response = client.recv(1024)
        print(response.decode('utf-8'))
    except ConnectionResetError as why:
        print(f'Server lost, reason: {why}')
        break

client.close()

I call this form of coding single-threaded + blocking IO + serial loop. Its characteristics:
1. Simple coding, a concise model, and strong readability.
2. Serial service: users must queue to use the server one at a time.

A single-threaded blocking IO model cannot support concurrency. To support concurrency, there are two categories of solutions.

5. Using Blocking IO for Concurrency

Single-threaded blocking IO inherently cannot achieve concurrency: as soon as IO blocks, the thread blocks and the code below it does not run. To achieve concurrency with blocking IO, you need to increase the number of threads or processes; when one thread/process blocks, the OS schedules another one to run.

Scenario One: Blocking IO + multiprocessing
# Server
import socket
from multiprocessing import Process

def task(conn):
    """Handler for one client's message loop"""
    while True:
        try:
            request = conn.recv(1024)
            if not request:
                break
            print(f"request: {request.decode('utf-8')}")
            conn.send(request.upper())
        except ConnectionResetError as why:
            print(f'Client lost, reason: {why}')
            break

if __name__ == '__main__':  # on Windows, new processes must be started under main, otherwise it errors
    addr = ('127.0.0.1', 8080)
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(addr)
    server.listen(5)
    print('Listening...')
    while True:
        conn, client = server.accept()
        print(f'A client connected -> {client}')
        p = Process(target=task, args=(conn,))  # start a child process to run this user's message loop
        p.start()

The server's message loop for each user is encapsulated in a process; each individual process still blocks on IO.
Scheduling between processes is delegated to the OS (important).
Processes are heavyweight: creating and destroying them carries considerable overhead, and the number of processes a single machine can sustain is quite limited (typically on the order of hundreds).
The switching overhead between processes is also not small.
When the number of processes is less than or equal to the number of CPU cores, true parallelism can be achieved; when it exceeds the number of cores, execution is still merely concurrent.

Scenario Two: Blocking IO + multithreading
# Server
import socket
from threading import Thread

def task(conn):
    """Handler for one client's message loop"""
    while True:
        try:
            request = conn.recv(1024)
            if not request:
                break
            print(f"request: {request.decode('utf-8')}")
            conn.send(request.upper())
        except ConnectionResetError as why:
            print(f'Client lost, reason: {why}')
            break

if __name__ == '__main__':
    addr = ('127.0.0.1', 8080)
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(addr)
    server.listen(5)
    print('Listening...')
    while True:
        conn, client = server.accept()
        print(f'A client connected -> {client}')
        t = Thread(target=task, args=(conn,))  # start a thread to run this user's message loop
        t.start()

The server's per-user work is encapsulated in a thread; IO still blocks within each individual thread.
Scheduling between threads is handled by the OS (important).
Threads are lighter, so the cost of creating and destroying them is small; still, the thread count cannot grow without bound, and a single machine can typically hold hundreds to thousands of threads.
Note: because of CPython's GIL, multithreaded code run under CPython can use only one CPU core at a time. In other words, Python multithreaded code executed by the official interpreter cannot run in parallel (within a single process).
Switching overhead between threads is relatively small.
In fact, the biggest problem with multithreading is not the limit on concurrency but data safety.
Threads share their process's data, and with frequent IO operations it is hard to avoid modifying shared data. That requires extra handling, and as the thread count grows, handling data safety correctly becomes a major challenge.
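The shared-data problem above can be sketched in a few lines. This is an illustrative example (the names worker and count are mine, not from the server code): four threads increment one counter, and because "count += 1" is a read, an add, and a write, the update is guarded with a Lock so no increment is lost.

```python
import threading

count = 0
lock = threading.Lock()

def worker(n):
    global count
    for _ in range(n):
        with lock:          # without the lock, interleaved updates could be lost
            count += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # 400000: the lock serializes every increment
```

Remove the `with lock:` line and the final count can come up short, which is exactly the "extra handling" the text refers to.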

Thoughts on and summary of the blocking IO model

1. Both multithreading and multiprocessing provide concurrency on top of the blocking IO model; the programming model is relatively simple and very readable.
2. If you use threads/processes to provide concurrency, system stability declines as their number grows. A thread/process pool offers some optimization, but beyond a certain scale the pool becomes less effective. Neither approach can support ultra-large-scale concurrency (e.g. C10M and beyond).
3. Thread/process switching is scheduled by the OS according to its own algorithms; the application cannot control it actively and cannot tune scheduling to the characteristics of its tasks.
4. The coding style is direct and easy to understand, with a gentle learning curve.
5. Multithreaded/multiprocess programs can be understood as simply adding resources. For ultra-large-scale concurrency, simply adding resources is not reasonable: resources are not unlimited, cost and efficiency always matter, and the larger the numbers, the more prominent the original drawbacks become.
6. The core idea of the other class of solutions is: change the IO model.

6. Using Non-blocking IO for Concurrency

Why does the single-threaded non-blocking IO model directly support concurrency by itself? Look back at the process diagrams for blocking IO and non-blocking IO.
The core of the non-blocking IO interface is that the OS returns a result immediately to the thread that initiated the IO call, so the calling thread is not blocked and can continue executing the code below. However, precisely because the call does not block, the calling thread cannot assume that the immediately returned result is the one it wants, so it must add extra work to check the returned result. This makes programming considerably harder (not just a little harder).

There are two approaches for checking the immediately returned result:

    1. Polling
      The thread proactively initiates queries and checks at regular or irregular intervals
    2. Callbacks + event loop
      The thread registers a callback when it initiates the IO, and an event loop then processes completions uniformly

Note: non-blocking IO can be implemented in several ways; the programming models read poorly, and some of the ideas are obscure, hard to understand, and hard to code.

Scenario One: Non-blocking IO + try + polling
# Server
import socket

addr = ('127.0.0.1', 8080)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(addr)
server.setblocking(False)
server.listen(5)
print('Listening...')

# conn objects that need a recv performed go into this list
recv_list = []
# (conn, data-to-send) pairs go into this list
send_list = []

# connection loop
while True:
    try:
        conn, client = server.accept()
        # accept succeeded, so the return value is conn, client
        print(f'A client connected -> {client}')
        # put the successfully accepted conn into the list; the recv on it
        # happens on a pass where accept raises instead
        recv_list.append(conn)
    except BlockingIOError:
        # accept did not succeed, meaning there is no pending connection.
        # Other work (the message handling) can be done before the next accept.

        # We cannot remove from recv_list while iterating over it,
        # so stash the conn objects to delete in a temporary list.
        del_recv_list = []
        # perform a receive on every successfully accepted conn
        for conn in recv_list:
            try:
                # recv is also non-blocking
                request = conn.recv(1024)
                # recv succeeded; process the request
                if not request:
                    # this conn link is no longer valid
                    conn.close()
                    # stop receiving on it: add it to the delete list
                    del_recv_list.append(conn)
                    # this conn is done; move on to the next
                    continue
                # the request holds a message; process it, then queue the reply.
                response = request.upper()
                # the send list holds tuples: the conn and the data to send
                send_list.append((conn, response))
            except BlockingIOError:
                # this conn's data is not ready yet; handle the next conn
                continue
            except ConnectionResetError:
                # this conn is invalid; stop receiving on it
                conn.close()
                del_recv_list.append(conn)

        # likewise, we cannot remove during send-list traversal, so use a temporary list
        del_send_list = []
        # the receive list is fully processed; now process the send list
        for item in send_list:
            conn = item[0]
            response = item[1]
            try:
                conn.send(response)
                # sent successfully; this item should be removed from the send list
                del_send_list.append(item)
            except BlockingIOError:
                # the send buffer may be full; leave it for the next pass
                continue
            except ConnectionResetError:
                # the link expired
                conn.close()
                del_recv_list.append(conn)
                del_send_list.append(item)

        # delete the invalidated conn objects from the receive list
        for conn in del_recv_list:
            recv_list.remove(conn)
        # delete items that were sent (or no longer need sending) from the send list
        for item in del_send_list:
            send_list.remove(item)

The server achieves concurrency with a single thread.
The multiple conn objects returned by accept are added to a list, and multi-user access is provided by traversing the receive list and the send list.

Within the single thread, the IO functions provided by the socket module are set to the non-blocking type.
The extra work added: the immediately returned result of each non-blocking call is checked with try/except to see whether it is the expected one.
Since we cannot know when the returned result will be the expected one, we have to keep reissuing the call and judging it via try/except; that is polling.
Between two polls, the thread could perform other work. But this model just keeps polling and makes no use of that time.

The coding model is complex and hard to understand.

Optimization: in this model, the active polling is the program's own responsibility, but it can in fact be handed over to the OS. Then the application does not need to write the polling part and can focus on the business logic (the upper() part). Python provides the select module to take over the application's polling work.

Scenario Two: Non-blocking IO + select proxy polling
# Server
import socket
import select

addr = ('127.0.0.1', 8080)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(addr)
server.setblocking(False)
server.listen(5)
print('Listening...')

# The server object itself must be monitored first: once it is readable, accept can be performed.
read_list = [server]
# The write list to monitor; once a writable object from wl has had its send handled,
# it should be removed from this list.
write_list = []
# Used to temporarily hold the data to send for each sock object.
data_dic = {}

# repeatedly issue select queries
while True:
    # issue a select query to obtain the socket objects that can be operated on
    rl, wl, xl = select.select(read_list, write_list, [], 1)

    # handle the readable list
    for sock in rl:
        # if the readable object is the server, a connection is pending,
        # so the server can perform accept
        if sock is server:
            # this accept cannot fail, so no try is needed
            conn, client = sock.accept()
            # once we have the conn, add it to the read list
            read_list.append(conn)
        else:
            # the readable object is an ordinary conn; handle link failure when recv runs
            try:
                request = sock.recv(1024)
            except (ConnectionResetError, ConnectionAbortedError):
                # this link failed
                sock.close()
                read_list.remove(sock)
            else:
                # the request content still needs checking
                if not request:
                    # this conn link is invalid
                    sock.close()
                    # stop monitoring this conn
                    read_list.remove(sock)
                    continue
                # process the request
                response = request.upper()
                # join the send list
                write_list.append(sock)
                # save the data to send
                data_dic[sock] = response

    # handle the writable list
    for sock in wl:
        # perform the send; send can also fail
        try:
            sock.send(data_dic[sock])
            # after sending, remove from the send list
            write_list.remove(sock)
            # and discard the sent data
            data_dic.pop(sock)
        except (ConnectionResetError, ConnectionAbortedError):
            # this link is invalid
            sock.close()
            read_list.remove(sock)
            write_list.remove(sock)

The server achieves concurrency with a single thread.
Once the select module is used, the application no longer has to write the active-polling code; that part of the work is handed to the select function of the select module.
The application only needs to traverse the actionable socket lists returned by select and handle the related business logic.
Although the application hands the polling work to select and no longer writes that code, the low-level interface behind the select function is inefficient; the epoll interface can be used to improve efficiency, and it is wrapped by the selectors module.
In addition, the select call itself is blocking IO, and when concurrency is low, the thread spends most of its time blocked inside select. So select is best suited to scenes where sockets are ready at every moment, i.e. large-scale concurrency.
Coding is difficult, and the model is hard to understand.

The select function interface
def select(rlist, wlist, xlist, timeout=None):  # real signature unknown; restored from __doc__
    """
    select(rlist, wlist, xlist[, timeout]) -> (rlist, wlist, xlist)

    Wait until one or more file descriptors are ready for some kind of I/O.
    The first three arguments are sequences of file descriptors to be waited for:
    rlist -- wait until ready for reading
    wlist -- wait until ready for writing
    xlist -- wait for an "exceptional condition"
    If only one kind of condition is required, pass [] for the other lists.

    A file descriptor is either a socket or file object, or a small integer
    gotten from a fileno() method call on one of those.

    The optional 4th argument specifies a timeout in seconds; it may be a
    floating point number to specify fractions of seconds.  If it is absent
    or None, the call will never time out.

    The return value is a tuple of three lists corresponding to the first
    three arguments; each contains the subset of the corresponding file
    descriptors that are ready.

    *** IMPORTANT NOTICE ***
    On Windows, only sockets are supported; on Unix, all file descriptors
    can be used.
    """
    pass
    1. It takes 4 parameters (3 positional, 1 with a default) and returns 3 values.
    2. The select function is blocking IO; it does not return until at least 1 file descriptor is ready (or the timeout expires).
    3. The positional parameters rlist/wlist/xlist are the read list / write list / exception list to be monitored (I have not fully understood the 3rd parameter).
    4. On Windows, the lists may only contain socket objects; on Unix, any file descriptor may be used.
    5. If the 4th parameter is None (the default), the call blocks indefinitely; otherwise it is a timeout in seconds, and decimals such as 0.5 are allowed.
    6. The return value is a tuple of 3 lists containing the file-descriptor objects that can be operated on.
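The points above can be seen in a minimal run of select.select(). A socket pair stands in for a client connection (no network setup needed; names are illustrative): the read end only appears in the returned read list after the other end has written to it, and the timeout argument bounds the blocking.

```python
import select
import socket

r, w = socket.socketpair()

# Nothing has been written yet: select blocks for at most 0.1 s,
# then returns three empty lists.
before, _, _ = select.select([r], [], [], 0.1)
print(before)        # []

w.send(b'ping')
# Now select returns as soon as r is readable, well before the 1 s timeout.
after, _, _ = select.select([r], [], [], 1)
print(r in after)    # True
r.close()
w.close()
```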
Thoughts on the efficiency of polling

Polling is not efficient.
Polling works like this: the initiator issues queries at regular or irregular intervals; if the data is not ready, it keeps asking, and if the data is ready, it processes the data.
Suppose the caller finds the data ready on the 35th poll; then the first 34 polls produced no benefit at all.
If the caller wants to know whether the data is ready, it has to keep asking on its own initiative, yet unsolicited asking is relatively inefficient.
The crux of this paradox is: how do you know the data is ready?

Use callbacks + an event loop.
In this approach, the caller does not actively poll; instead it waits passively, and when the IO operation completes, the OS notifies it with a ready event.

Scenario Three: Non-blocking IO + selectors + callbacks + event loop (to be added later)

Pass
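Since this section is still pending, here is only a minimal sketch of the idea (my own, not the original author's code): with the selectors module, each socket is registered together with a callback, and the event loop dispatches whatever is ready instead of the program polling every socket itself. A socket pair stands in for a client; the names on_readable and results are illustrative.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()
a.setblocking(False)

results = []

def on_readable(sock):
    """Callback invoked by the loop when sock is readable."""
    data = sock.recv(1024)
    results.append(data.upper())   # the same business logic: uppercase
    sel.unregister(sock)
    sock.close()

# register the socket; the callback rides along in the data slot
sel.register(a, selectors.EVENT_READ, on_readable)

b.send(b'hello')
b.close()

# event loop: wait for ready events and invoke the stored callbacks
while sel.get_map():
    for key, mask in sel.select(timeout=1):
        key.data(key.fileobj)      # key.data holds the callback

print(results)  # [b'HELLO']
```

Note the loop never touches the sockets directly; all the per-connection logic lives in the registered callbacks, which is the structural shift this scenario is about.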

Scenario Four: Non-blocking IO + coroutines + callbacks + event loop (to be added later)

Pass
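Again, this section is pending; as a placeholder, here is a hedged sketch of the coroutine approach using asyncio (my own, not the original author's code). Each "await" marks a point where the event loop may suspend this client's handler and resume another, so a single thread serves many connections. The handler mirrors the upper() server from earlier; a client inside the same loop exercises it.

```python
import asyncio

async def handle(reader, writer):
    while True:
        request = await reader.read(1024)   # suspends here instead of blocking
        if not request:
            break
        writer.write(request.upper())       # the same business logic: uppercase
        await writer.drain()
    writer.close()

async def main():
    # port 0 lets the OS pick a free port for this self-contained demo
    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]

    # exercise the server with one client inside the same event loop
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    writer.write(b'hello')
    await writer.drain()
    reply = await reader.read(1024)
    writer.close()
    server.close()
    await server.wait_closed()
    return reply

print(asyncio.run(main()))  # b'HELLO'
```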

Thoughts on and summary of non-blocking IO (to be added later)
    1. If an IO-intensive task's IO model is set to non-blocking, the task gradually shifts from IO-intensive toward CPU-intensive.
    2. The non-blocking IO programming model is difficult, reads poorly, and is hard to comprehend.
      pass
7. Synchronous/Asynchronous vs. Blocking/Non-blocking IO
    1. Blocking IO and non-blocking IO refer to the two IO interfaces provided by the OS; the difference is whether the call returns immediately.
    2. Synchronous and asynchronous refer to the execution model between two tasks:
      Synchronous: the two tasks are interrelated and depend on each other, and there are requirements on their execution order.
      Asynchronous: the two tasks are only loosely related, can be isolated from each other, and their execution order does not matter.
    3. There are many differing interpretations of synchronous-blocking, synchronous-non-blocking, asynchronous-blocking, and asynchronous-non-blocking. In my view, synchronous/asynchronous should be used to describe the task execution model, and blocking/non-blocking IO to describe the IO invocation model.

The following simple synchronous/asynchronous examples are based on various interpretations found online, combined with my own thoughts:

    1. Synchronous
      On the first day, at dinner time, you are hungry, so you go to your wife and say: "Honey, I'm hungry, please cook!" Your wife replies: OK, I'll go and cook.
      You follow your wife to the kitchen, and she spends 30 minutes cooking for you. The whole time you stand by, doing nothing, just watching her. Your wife asks: why are you standing here? You say: I'll wait until the meal is done before I go. After 30 minutes, you have dinner.

    2. Asynchronous + polling
      On the second day, at dinner time, you are hungry, so you call out: honey, I'm hungry, hurry up and cook! Your wife replies: OK, I'll go and cook.
      Your wife spends 30 minutes cooking, but this time you don't follow her into the kitchen. You watch TV in the living room, but you are really hungry, so every 5 minutes you walk to the kitchen and ask: honey, is the meal ready? Your wife replies: it will be a while. After 30 minutes, you have dinner.

    3. Asynchronous + event notification
      On the third day, at dinner time, you are hungry, so you call out: honey, I'm hungry, hurry up and cook! Your wife replies: OK, I'll go and cook.
      Your wife spends 30 minutes cooking, and you don't follow her into the kitchen. You watch TV in the living room; you know she is cooking, so you don't rush her and concentrate on the TV. After 30 minutes, your wife calls you: the meal is ready. At last, you have dinner.
