Summary of Python multi-process and multi-thread concurrent programming, with examples
This article briefly summarizes the concurrency methods that Python supports, with working examples for reference. The details are as follows.
The concurrency supported by Python falls into multi-process concurrency and multi-thread concurrency (asynchronous IO is not covered in this article). Conceptually, multi-process concurrency means running multiple independent programs: the advantage is that the concurrently executing tasks are all managed by the operating system; the disadvantage is that communication and data sharing between the processes are inconvenient. With multi-thread concurrency, the programmer manages the concurrent tasks, and data can easily be shared between threads (provided that mutual exclusion on the shared data is handled correctly). Python's support for multithreading and multiprocessing is higher-level than in most programming languages, minimizing the work we have to do ourselves.
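For instance, sharing a mutable counter between threads requires a lock; here is a minimal sketch (the counter, the worker function, and the thread count are illustrative, not part of the examples below):

import threading

counter = 0
lock = threading.Lock()  # protects the shared counter

def worker():
    global counter
    for _ in range(100000):
        with lock:       # mutual exclusion: only one thread updates at a time
            counter += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 400000 with the lock; may be less without it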
I. Multi-process concurrency
Mark Summerfield points out that for computing-intensive programs, multi-process concurrency beats multi-thread concurrency. A computing-intensive (CPU-bound) program spends most of its running time on CPU computation, with very little time spent reading and writing the hard disk or memory. Conversely, an IO-intensive (IO-bound) program spends most of its running time on hard disk and memory reads and writes, with very little CPU computation.
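As a rough illustration of the distinction, both functions below are hypothetical stand-ins:

def cpu_bound(n):
    # computing-intensive: nearly all time goes to CPU arithmetic
    return sum(i * i for i in range(n))

def io_bound(path):
    # IO-intensive: nearly all time goes to waiting on the disk
    with open(path) as f:
        return f.read()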
For multi-process concurrency, Python supports two implementation approaches. The first uses multiprocessing.JoinableQueue, a data structure that manages locking by itself, so programmers do not have to worry about deadlocks. Python also provides a more elegant, higher-level approach: using a process pool. Both are introduced below.
1. Queue implementation -- use multiprocessing.JoinableQueue
multiprocessing is the module in the Python standard library that supports multi-process concurrency. Here we use one of its data structures, JoinableQueue, which is essentially a FIFO queue. What distinguishes it from an ordinary queue (such as queue.Queue) is that it is multiprocess-safe, meaning we do not have to worry about mutexes or deadlocks ourselves. JoinableQueue is used to hold the tasks to be executed and, where needed, to collect their results. For example:
import multiprocessing
import random
import time

def read(q):
    while True:
        try:
            value = q.get()
            print('Get %s from queue.' % value)
            time.sleep(random.random())
        finally:
            q.task_done()

def main():
    q = multiprocessing.JoinableQueue()
    pw1 = multiprocessing.Process(target=read, args=(q,))
    pw2 = multiprocessing.Process(target=read, args=(q,))
    pw1.daemon = True
    pw2.daemon = True
    pw1.start()
    pw2.start()
    for c in [chr(ord('A')+i) for i in range(26)]:
        q.put(c)
    try:
        q.join()
    except KeyboardInterrupt:
        print("stopped by hand")

if __name__ == '__main__':
    main()
For multi-process concurrency on Windows, the program file must contain an entry point (such as a main function), and it must be invoked under the entry-point guard at the end:

if __name__ == '__main__':
    main()
In this simplest multi-process example, we use multiple processes to print 26 letters. We define a JoinableQueue object to hold the tasks and instantiate two Process objects (each corresponding to one child process). To instantiate a Process object, the target and args parameters must be passed: target is the function that actually performs each task, and args holds the arguments for that function.
pw1.daemon = True
pw2.daemon = True
These two statements mark the child processes as daemon processes: they are terminated automatically when the main process ends.
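To see the effect in isolation, here is a hypothetical snippet (not part of the example above): with daemon set to True, the child is killed as soon as main() returns, so 'done' is never printed; left at False, the interpreter would wait the full ten seconds.

import multiprocessing
import time

def slow():
    time.sleep(10)
    print('done')        # never printed when the process is a daemon

def main():
    p = multiprocessing.Process(target=slow)
    p.daemon = True      # child dies with the main process
    p.start()
    # main() returns immediately, so the daemon child is terminated

if __name__ == '__main__':
    main()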
pw1.start()
pw2.start()
Once these two statements run, the child processes begin to execute independently of the parent process. Each child calls the function referenced by target in a separate process; here that is the read function, an infinite loop that reads the letters from the q parameter and prints them one by one.
value = q.get()
This is the key point of multi-process concurrency. q is a JoinableQueue object and supports the get method for reading its first element. If q contains no elements, the process blocks until a new element is put into q.
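If blocking indefinitely is undesirable, get() also accepts a timeout; a small sketch (the one-second value is arbitrary):

import multiprocessing
import queue

q = multiprocessing.JoinableQueue()
try:
    value = q.get(timeout=1.0)  # wait at most one second
except queue.Empty:             # raised when nothing arrives in time
    print('queue is still empty')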
Because of this blocking behavior, after

pw1.start()
pw2.start()

the child processes start running, but they block almost immediately, since the queue is still empty.
for c in [chr(ord('A')+i) for i in range(26)]:
    q.put(c)
This puts the 26 letters into the JoinableQueue object one by one. At this point the two child processes are no longer blocked and begin to actually execute the tasks. Both children read data with value = q.get(), so both are modifying the same q object, yet we do not need to worry about synchronization. This is the advantage of multiprocessing.JoinableQueue: it is multiprocess-safe and handles the locking automatically.
try:
    q.join()

The q.join() method checks whether all the data in q has been consumed -- in other words, whether every task has completed. If not, the program blocks and waits until everything in q has been read (you can press Ctrl+C to force it to stop).
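Daemon processes are one way to shut the workers down; another common pattern, sketched below rather than taken from the original example, is to send each worker a sentinel such as None so it can exit its loop cleanly:

import multiprocessing

def read(q):
    while True:
        value = q.get()
        try:
            if value is None:          # sentinel: time to stop
                break
            print('Get %s from queue.' % value)
        finally:
            q.task_done()              # count the sentinel too, so join() returns

def main():
    q = multiprocessing.JoinableQueue()
    workers = [multiprocessing.Process(target=read, args=(q,)) for _ in range(2)]
    for w in workers:
        w.start()
    for c in [chr(ord('A')+i) for i in range(26)]:
        q.put(c)
    for _ in workers:
        q.put(None)                    # one sentinel per worker
    q.join()
    for w in workers:
        w.join()                       # workers exit on their own; no daemon flag needed

if __name__ == '__main__':
    main()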
On Windows, you can open the Task Manager and see the multiple child processes running.
2. Process pool implementation -- use concurrent.futures.ProcessPoolExecutor
Python also supports a more elegant style of multi-process concurrency. Let's look at an example:
import concurrent.futures
import random
import time

def read(q):
    print('Get %s from queue.' % q)
    time.sleep(random.random())

def main():
    futures = set()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for q in (chr(ord('A')+i) for i in range(26)):
            future = executor.submit(read, q)
            futures.add(future)
        try:
            for future in concurrent.futures.as_completed(futures):
                err = future.exception()
                if err is not None:
                    raise err
        except KeyboardInterrupt:
            print("stopped by hand")

if __name__ == '__main__':
    main()
Here we use a concurrent.futures.ProcessPoolExecutor object. We can think of it as a process pool that we fill with child tasks. The submit method returns a Future object for each scheduled call, and we collect these futures in a set. As soon as a future is in the pool, its task can start executing. The read function here is simpler: it just prints a character and sleeps for a while.
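When results are wanted in input order and per-future handling is unnecessary, executor.map offers an even shorter equivalent; a sketch under the same assumptions (read now returns the string instead of printing it):

import concurrent.futures
import random
import time

def read(q):
    time.sleep(random.random())
    return 'Get %s from queue.' % q

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # map schedules read over all letters and yields results in input order
        for line in executor.map(read, (chr(ord('A')+i) for i in range(26))):
            print(line)

if __name__ == '__main__':
    main()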
Back in the example above:

for future in concurrent.futures.as_completed(futures):
    err = future.exception()

Iterating over as_completed(futures) waits for all the child tasks to complete. A task may raise an exception while executing; future.exception() retrieves that exception (or returns None), so the exceptions can be collected and handled later.
As you can see, using Future objects to handle multi-process concurrency is more concise: both writing the target function and starting the child processes become simpler, and a Future can report its status to you, as well as its execution result or any exception raised during execution.
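For example, a Future can be queried directly; a minimal sketch with an illustrative square function:

import concurrent.futures

def square(x):
    return x * x

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        future = executor.submit(square, 7)
        print(future.done())       # status: may still be False at this point
        print(future.result())     # blocks, then returns 49 (or re-raises)
        print(future.exception())  # None here, since no exception was raised

if __name__ == '__main__':
    main()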
II. Multi-thread concurrency
For IO-intensive programs, multi-thread concurrency may do better than multi-process concurrency. For IO-intensive tasks such as network communication, network latency is the main factor determining program efficiency; in that situation it matters little whether you use processes or threads.
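A typical IO-bound case is fetching several URLs at once; a sketch in which the URL list and worker count are placeholders:

import concurrent.futures
import urllib.request

URLS = ['https://www.python.org', 'https://docs.python.org']  # placeholders

def fetch(url):
    # the thread spends almost all its time waiting on the network
    with urllib.request.urlopen(url, timeout=10) as resp:
        return url, len(resp.read())

def main():
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
        for url, size in executor.map(fetch, URLS):
            print('%s: %d bytes' % (url, size))

if __name__ == '__main__':
    main()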
1. Queue implementation -- use queue.Queue
The program is essentially the same as the multi-process version, except that multiprocessing.JoinableQueue is no longer needed here; queue.Queue from the standard-library queue module meets the requirement:
import queue
import random
import threading
import time

def read(q):
    while True:
        try:
            value = q.get()
            print('Get %s from queue.' % value)
            time.sleep(random.random())
        finally:
            q.task_done()

def main():
    q = queue.Queue()
    pw1 = threading.Thread(target=read, args=(q,))
    pw2 = threading.Thread(target=read, args=(q,))
    pw1.daemon = True
    pw2.daemon = True
    pw1.start()
    pw2.start()
    for c in [chr(ord('A')+i) for i in range(26)]:
        q.put(c)
    try:
        q.join()
    except KeyboardInterrupt:
        print("stopped by hand")

if __name__ == '__main__':
    main()
Here we instantiate Thread objects instead of Process objects; the rest of the program reads just like the multi-process version.
2. Thread pool implementation -- use concurrent.futures.ThreadPoolExecutor
Let's look at the example:
import concurrent.futures
import multiprocessing
import random
import time

def read(q):
    print('Get %s from queue.' % q)
    time.sleep(random.random())

def main():
    futures = set()
    with concurrent.futures.ThreadPoolExecutor(multiprocessing.cpu_count()*4) as executor:
        for q in (chr(ord('A')+i) for i in range(26)):
            future = executor.submit(read, q)
            futures.add(future)
        try:
            for future in concurrent.futures.as_completed(futures):
                err = future.exception()
                if err is not None:
                    raise err
        except KeyboardInterrupt:
            print("stopped by hand")

if __name__ == '__main__':
    main()
The ThreadPoolExecutor version is almost identical to the ProcessPoolExecutor version; only the class name changes (plus, here, an explicit worker count of multiprocessing.cpu_count()*4).
It is not hard to see that converting from multi-process to multi-thread is very easy, whether you use a queue or a process/thread pool: only a few names change. Of course, the internal mechanisms are completely different, but Python wraps them so well that we do not need to care about those details, which is exactly the elegance of Python.
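The symmetry can even be made explicit by parameterizing over the executor class; a sketch in which run_all is a hypothetical helper:

import concurrent.futures
import random
import time

def read(q):
    print('Get %s from queue.' % q)
    time.sleep(random.random())

def run_all(executor_cls):
    # the same code drives either a process pool or a thread pool
    with executor_cls() as executor:
        futures = [executor.submit(read, chr(ord('A')+i)) for i in range(26)]
        concurrent.futures.wait(futures)

if __name__ == '__main__':
    run_all(concurrent.futures.ThreadPoolExecutor)
    run_all(concurrent.futures.ProcessPoolExecutor)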