An introduction to multi-process programming in Python


This article gives a preliminary introduction to multi-process programming in Python. Multi-process programming has always been both an important and a difficult part of Python programming; readers who need it can use this article as a reference.

To implement multiple processes (multiprocessing) in a Python program, we first need some background knowledge about the operating system.

The Unix/Linux operating system provides a fork() system call, which is quite special. An ordinary function is called once and returns once, but fork() is called once and returns twice, because the operating system automatically makes a copy of the current process (called the parent process); the copy is called the child process. fork() then returns in both the parent process and the child process.

In the child process, fork() always returns 0, while in the parent process it returns the ID of the child process. The reason is that a parent process can fork many child processes, so the parent has to record the ID of each child, whereas a child only needs to call getppid() to get the ID of its parent.

Python's os module wraps common system calls, including fork, which makes it easy to create child processes in a Python program:

    # multiprocessing.py
    import os

    print('Process (%s) start...' % os.getpid())
    pid = os.fork()
    if pid == 0:
        print('I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid()))
    else:
        print('I (%s) just created a child process (%s).' % (os.getpid(), pid))

Running it produces the following output:

    Process (876) start...
    I (876) just created a child process (877).
    I am child process (877) and my parent is 876.

The code above cannot run on Windows, because Windows has no fork call. Because the Mac system is built on a BSD (Unix) kernel, the code runs on a Mac without problems, so a Mac is recommended for learning Python!

With the fork call, a process that receives a new task can copy itself into a child process to handle it. The common Apache server is an example: the parent process listens on the port, and whenever a new HTTP request arrives, it forks a child process to handle that request.
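The pattern described above can be sketched roughly as follows. This is only an illustration that assumes a Unix-like system; handle_in_child and the numeric task IDs are hypothetical names, and the loop stands in for requests arriving at a real server.

```python
import os

def handle_in_child(task_id):
    """Fork a child to handle one task; the parent returns (child_pid, exit_code)."""
    pid = os.fork()
    if pid == 0:
        # Child process: do the work, then leave without returning to the caller.
        print('Child %s handling task %s' % (os.getpid(), task_id), flush=True)
        os._exit(0)
    # Parent process: it knows the child's ID and waits for it, so no zombie is left.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)

if __name__ == '__main__':
    for task_id in range(3):  # hypothetical stand-in for incoming requests
        child_pid, code = handle_in_child(task_id)
        print('Parent %s: child %s exited with code %s'
              % (os.getpid(), child_pid, code), flush=True)
```

Waiting for each child immediately keeps the sketch short but handles tasks one at a time; a real server would normally keep accepting requests and collect finished children later.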

Multiprocessing

If you plan to write a multi-process service program, Unix/Linux is definitely the right choice. But since Windows has no fork call, does that mean you cannot write multi-process programs in Python on Windows?

Since Python is cross-platform, it naturally provides cross-platform support for multiple processes as well. The multiprocessing module is the cross-platform multi-process module.

The multiprocessing module provides a Process class to represent a process object. The following example starts a child process and waits for it to finish:

    from multiprocessing import Process
    import os

    # Code to be executed by the child process
    def run_proc(name):
        print('Run child process %s (%s)...' % (name, os.getpid()))

    if __name__ == '__main__':
        print('Parent process %s.' % os.getpid())
        p = Process(target=run_proc, args=('test',))
        print('Process will start.')
        p.start()
        p.join()
        print('Process end.')

Running it produces the following output:

    Parent process 928.
    Process will start.
    Run child process test (929)...
    Process end.

To create a child process, you only need to pass in the function to execute and its arguments, create a Process instance, and start it with the start() method. This is simpler than creating a process with fork().

The join() method waits for the child process to finish before the program continues, and is typically used for synchronization between processes.
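As a rough sketch of using join() for synchronization, the example below starts several child processes and waits for all of them before reading their exit codes. The worker and run_and_wait functions and the sleep durations are hypothetical and chosen only for illustration.

```python
from multiprocessing import Process
import os
import time

def worker(seconds):
    """Hypothetical task: sleep for a while to simulate work."""
    time.sleep(seconds)
    print('Worker %s slept %.1f s' % (os.getpid(), seconds), flush=True)

def run_and_wait(delays):
    """Start one child per delay, join them all, and return their exit codes."""
    procs = [Process(target=worker, args=(d,)) for d in delays]
    for p in procs:
        p.start()
    for p in procs:
        p.join()  # blocks until this child has finished
    return [p.exitcode for p in procs]

if __name__ == '__main__':
    print('Exit codes:', run_and_wait([0.2, 0.1, 0.3]))
```

Starting every child before joining any of them lets the children run concurrently; joining inside the first loop would run them one after another.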

Pool

If you want to start a large number of child processes, you can create them in batches with a process pool:

    from multiprocessing import Pool
    import os, time, random

    def long_time_task(name):
        print('Run task %s (%s)...' % (name, os.getpid()))
        start = time.time()
        time.sleep(random.random() * 3)
        end = time.time()
        print('Task %s runs %0.2f seconds.' % (name, (end - start)))

    if __name__ == '__main__':
        print('Parent process %s.' % os.getpid())
        p = Pool()
        for i in range(5):
            p.apply_async(long_time_task, args=(i,))
        print('Waiting for all subprocesses done...')
        p.close()
        p.join()
        print('All subprocesses done.')

Running it produces the following output:

    Parent process 669.
    Waiting for all subprocesses done...
    Run task 0 (671)...
    Run task 1 (672)...
    Run task 2 (673)...
    Run task 3 (674)...
    Task 2 runs 0.14 seconds.
    Run task 4 (673)...
    Task 1 runs 0.27 seconds.
    Task 3 runs 0.86 seconds.
    Task 0 runs 1.41 seconds.
    Task 4 runs 1.91 seconds.
    All subprocesses done.

Code interpretation:

Calling join() on a Pool object waits for all child processes to finish. You must call close() before calling join(), and once close() has been called, no new tasks can be added to the pool.

Note the output: tasks 0, 1, 2, and 3 start immediately, while task 4 has to wait for an earlier task to finish. This is because the default size of the Pool is 4 on my computer, so at most 4 processes run at the same time. This is a deliberate design limit of Pool, not a limit imposed by the operating system. If you change the code to:
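The close()/join() ordering can be illustrated with a small sketch. The square task and the run_pool helper are hypothetical; the example also collects the values returned by apply_async and checks that a closed pool refuses further work (in current Python 3 releases this is reported as a ValueError).

```python
from multiprocessing import Pool

def square(x):
    """Hypothetical task used only for this illustration."""
    return x * x

def run_pool(values, processes=2):
    """Submit tasks, close the pool, wait, and then collect the results."""
    pool = Pool(processes)
    pending = [pool.apply_async(square, args=(v,)) for v in values]
    pool.close()  # no more tasks may be submitted after this
    pool.join()   # wait for every worker process to exit
    results = [r.get() for r in pending]
    try:
        pool.apply_async(square, args=(0,))
        rejected = False
    except ValueError:
        # A closed pool refuses new work.
        rejected = True
    return results, rejected

if __name__ == '__main__':
    print(run_pool([1, 2, 3]))
```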

    p = Pool(5)

You can run 5 processes at a time.

Because the default size of the Pool is the number of CPU cores, on an 8-core CPU you would have to submit at least 9 tasks to see the waiting effect described above.
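To see the effect of the pool size on your own machine, a sketch like the following may help. It assumes cpu_count() reports the core count that the default pool size is based on; report_pid and distinct_workers are hypothetical helpers that record which worker process ran each task.

```python
from multiprocessing import Pool, cpu_count
import os
import time

def report_pid(_):
    """Hypothetical task: pause briefly, then report which worker ran it."""
    time.sleep(0.1)
    return os.getpid()

def distinct_workers(pool_size, n_tasks):
    """Run n_tasks on a pool of pool_size workers; return the set of worker PIDs."""
    with Pool(pool_size) as pool:
        return set(pool.map(report_pid, range(n_tasks)))

if __name__ == '__main__':
    print('CPU cores reported:', cpu_count())
    print('Workers used with Pool(2):', len(distinct_workers(2, 6)))
```

However many tasks are submitted, the number of distinct worker PIDs never exceeds the pool size, which is the limit the text describes.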

Inter-process communication

Processes certainly need to communicate with each other, and the operating system provides many mechanisms for inter-process communication. Python's multiprocessing module wraps these underlying mechanisms and provides Queue, Pipes, and several other ways to exchange data.

Taking Queue as an example, we create two child processes in the parent process: one writes data to the Queue and the other reads data from it:
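As a brief illustration of the Pipes mentioned above, the sketch below sends one message to a child process and reads back its reply. The child and echo_upper functions are hypothetical names used only for this example.

```python
from multiprocessing import Process, Pipe

def child(conn):
    """Hypothetical child: receive one message, reply in upper case, close."""
    msg = conn.recv()
    conn.send(msg.upper())
    conn.close()

def echo_upper(text):
    """Send text to a child over a Pipe and return the child's reply."""
    parent_conn, child_conn = Pipe()  # two connected endpoints, duplex by default
    p = Process(target=child, args=(child_conn,))
    p.start()
    parent_conn.send(text)
    reply = parent_conn.recv()
    p.join()
    return reply

if __name__ == '__main__':
    print(echo_upper('hello'))
```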

    from multiprocessing import Process, Queue
    import os, time, random

    # Code executed by the writer process:
    def write(q):
        for value in ['A', 'B', 'C']:
            print('Put %s to queue...' % value)
            q.put(value)
            time.sleep(random.random())

    # Code executed by the reader process:
    def read(q):
        while True:
            value = q.get(True)
            print('Get %s from queue.' % value)

    if __name__ == '__main__':
        # The parent process creates the Queue and passes it to each child process:
        q = Queue()
        pw = Process(target=write, args=(q,))
        pr = Process(target=read, args=(q,))
        # Start child process pw, which writes:
        pw.start()
        # Start child process pr, which reads:
        pr.start()
        # Wait for pw to finish:
        pw.join()
        # pr runs an infinite loop, so we cannot wait for it to end and must terminate it:
        pr.terminate()

Running it produces the following output:

    Put A to queue...
    Get A from queue.
    Put B to queue...
    Get B from queue.
    Put C to queue...
    Get C from queue.
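The example above stops the reader with terminate(). One alternative, sketched here under the assumption that the writer knows when it has finished, is to send a sentinel value so that the reader leaves its loop by itself. STOP, transfer, and the second queue named out are hypothetical additions that are not part of the original example.

```python
from multiprocessing import Process, Queue

STOP = None  # hypothetical sentinel value that tells the reader to quit

def write(q, values):
    for value in values:
        q.put(value)
    q.put(STOP)  # signal that nothing more will be written

def read(q, out):
    while True:
        value = q.get()
        if value is STOP:
            break
        out.put('got %s' % value)

def transfer(values):
    """Pass values from a writer child to a reader child; return what was read."""
    q, out = Queue(), Queue()
    pw = Process(target=write, args=(q, values))
    pr = Process(target=read, args=(q, out))
    pw.start()
    pr.start()
    received = [out.get() for _ in values]  # drain the results before joining
    pw.join()
    pr.join()  # the reader ends on its own, so no terminate() is needed
    return received

if __name__ == '__main__':
    print(transfer(['A', 'B', 'C']))
```

With this approach the reader is never interrupted in the middle of handling an item, at the cost of reserving one value (here None) that can no longer be sent as ordinary data.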

On Unix/Linux, the multiprocessing module wraps the fork() call, so we do not need to care about the details of fork(). Because Windows has no fork call, multiprocessing has to "simulate" the effect of fork: the Python objects that the parent process hands to the child process must be serialized with pickle and sent over. So if a multiprocessing call fails on Windows, first consider whether pickling failed.
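One way to check in advance whether an object could be handed to a child process on Windows is to try pickling it yourself. The can_pickle helper below is a hypothetical convenience function, not part of multiprocessing; it only shows that a module-level function pickles while a lambda does not.

```python
import pickle

def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if pickling fails."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

def square(x):
    """Hypothetical module-level function, which pickle stores by name."""
    return x * x

if __name__ == '__main__':
    print('module-level function:', can_pickle(square))
    print('lambda:', can_pickle(lambda x: x))
```

This suggests defining the target functions of child processes at module level and passing only simple data as arguments when the code must also run on Windows.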

Summary

On Unix/Linux, you can use the fork() call to implement multiple processes.

To implement multiple processes across platforms, use the multiprocessing module.

Inter-process communication is implemented through Queue, Pipes, and so on.
