1 Basic overview of the process pool (Pool)
When using Python for system administration, especially when operating on many file directories at once or remotely controlling many hosts, parallel execution can save a great deal of time. If the number of target objects is small, you can simply use the Process class to spawn one process per object; a few dozen is still manageable. With hundreds of objects or more, however, limiting the number of processes by hand becomes tedious, and this is where the process pool becomes important.
The Pool class provides a specified number of processes for the user to call on. When a new request is submitted to the pool, a process executes it if the pool is not yet full; if the number of processes in the pool has already reached the specified maximum, the request waits until a process in the pool finishes and becomes available to handle it.
2 Syntax of the Pool class
Pool([processes[, initializer[, initargs[, maxtasksperchild[, context]]]]])
processes: the number of worker processes to use; if processes is None, the number returned by os.cpu_count() is used.
initializer: if initializer is not None, each worker process calls initializer(*initargs) when it starts.
maxtasksperchild: the number of tasks a worker process can complete before it exits and is replaced by a fresh worker process, so that idle resources are released. maxtasksperchild defaults to None, which means worker processes live as long as the pool does.
context: can be used to specify the context for starting the worker processes. A pool is usually created with multiprocessing.Pool() or with the Pool() method of a context object; both set the context appropriately.
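A minimal sketch of these constructor parameters (the names init_worker, square, and the 'demo' tag passed through initargs are illustrative, not from the library):

```python
import multiprocessing as mp
import os

def init_worker(tag):
    # Runs once in every worker process as it starts.
    print('worker %d initialized with %r' % (os.getpid(), tag))

def square(x):
    return x * x

# 2 workers; each worker is retired and replaced after completing 4 tasks.
pool = mp.Pool(processes=2, initializer=init_worker,
               initargs=('demo',), maxtasksperchild=4)
results = pool.map(square, range(8))
pool.close()
pool.join()
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```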
Common methods (p is a process pool object):
p.apply():
apply(func[, args=()[, kwds={}]])
This method passes variable arguments to func. The main process is blocked until the function finishes executing, so it is in effect synchronous execution.
With synchronous execution, events run in the order in which they were added to the pool, one finishing before the next begins, and the function's return value is returned directly.
p.apply_async():
apply_async(func[, args=()[, kwds={}[, callback=None]]])
Usage is the same as apply(), but it is non-blocking and supports passing the result to a callback; in effect, asynchronous execution.
With asynchronous execution, multiple events in the pool run at the same time; the call returns a handle such as <multiprocessing.pool.ApplyResult object at 0x7f7f6e4357f0>, from which the event's return value can be retrieved.
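The callback parameter mentioned above can be sketched as follows (the names cube and on_done are ours): the callback is invoked in the main process with each task's return value.

```python
import multiprocessing as mp

def cube(x):
    return x ** 3

collected = []

def on_done(result):
    # Invoked in the main process once a task completes.
    collected.append(result)

p = mp.Pool(2)
for i in range(5):
    p.apply_async(cube, (i,), callback=on_done)
p.close()
p.join()  # also waits for the pool's internal result-handler thread
print(sorted(collected))  # [0, 1, 8, 27, 64]
```

Because the tasks finish in no fixed order, the results are sorted before printing.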
p.map():
map(func, iterable[, chunksize=None])
The map() method of the Pool class behaves much like the built-in map() function, combining its behaviour with that of apply_async(); it blocks the calling process until all results have been returned.
Note: although the second parameter is an iterable, in practice the whole iterable is consumed before the child processes start working through it.
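A minimal map() sketch (double is an illustrative function): despite the parallel execution, the results come back in the order of the input iterable.

```python
import multiprocessing as mp

def double(x):
    return x * 2

p = mp.Pool(3)
# map() blocks until every element has been processed and
# returns the results in input order.
out = p.map(double, [1, 2, 3, 4, 5])
p.close()
p.join()
print(out)  # [2, 4, 6, 8, 10]
```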
p.close(): closes the process pool so that no more tasks can be submitted; the worker processes exit once the outstanding tasks are completed.
p.terminate(): ends the worker processes immediately, without processing unfinished tasks.
p.join(): waits for the worker processes to exit. It must be called after close() or terminate(), because terminated processes need to be waited on by the parent process (join is equivalent to wait); otherwise they become zombie processes.
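The close()/join() ordering described above can be sketched as follows. (Note that in CPython, using the pool as a context manager, `with mp.Pool(...) as p:`, calls terminate() on exit rather than close(), so an explicit close() followed by join() is the graceful shutdown; slow is an illustrative function.)

```python
import multiprocessing as mp
import time

def slow(x):
    time.sleep(0.1)
    return x

p = mp.Pool(2)
r = p.apply_async(slow, (42,))
p.close()       # no further tasks may be submitted
p.join()        # wait for the workers to finish and exit
print(r.get())  # 42
```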
Attention:
(1) When a process pool object is created with Pool, the processes in the pool are started at the same time.
(2) Events added to the pool object are queued for execution.
(3) If the main process exits, all processes in the pool exit with it.
3 Examples
3.1 Basic example
```python
import multiprocessing as mp

def test():
    pass

p = mp.Pool(5)            # create 5 processes
for i in range(10):
    p.apply_async(test)   # add a task to the process pool
p.close()                 # close the pool; accept no more requests
p.join()                  # wait for all child processes to end
```
Notes:
(1) After the pool is created, the p.apply_async(test) statement executes ten times in quick succession, which is equivalent to submitting 10 requests to the pool; they are placed in a queue.
(2) After p = mp.Pool(5) executes, 5 processes are created, but they have not yet been assigned tasks. No matter how many tasks there are, the actual number of processes is only 5, so at most 5 run concurrently at any time.
(3) When a task in the pool finishes, the process's resources are freed, and the pool hands the next queued request to the idle process, following the FIFO principle.
(4) When all the pool's tasks are completed, 5 zombie processes remain. If the main process/thread does not end, the system will not reclaim their resources automatically; the join() function must be called to reap them.
(5) When creating a pool, if the maximum number of processes is not specified, the default is the number of CPU cores on the system.
(6) If tasks are added with the blocking p.apply(test), only one task can be added to the pool at a time; the for loop then blocks until that task has finished, so the 5 processes in the pool take turns executing one task at a time, which is effectively single-process execution.
Reference: Python's multiprocessing module: process creation and resource recycling (Process, Pool)
3.2 Adding tasks with apply()
```python
import multiprocessing as mp
import os
from time import sleep

def worker(msg):
    print(os.getpid())
    sleep(2)
    print(msg)
    return msg

# Create a process pool object with 4 processes
p = mp.Pool(processes=4)
pool_result = []
for i in range(10):
    msg = 'hello-%d' % i
    r = p.apply(worker, (msg,))  # add an event to the process pool
    pool_result.append(r)        # collect the event's return value

for r in pool_result:
    print('return:', r)

p.close()  # close the pool; accept no more requests
p.join()   # wait for the events in the pool to finish, reclaim the pool
```
Output:
```
8419
hello-0
8418
hello-1
8420
hello-2
8421
hello-3
8419
hello-4
8418
hello-5
8420
hello-6
8421
hello-7
8419
hello-8
8418
hello-9
return: hello-0
return: hello-1
return: hello-2
return: hello-3
return: hello-4
return: hello-5
return: hello-6
return: hello-7
return: hello-8
return: hello-9
```
This code runs slowly, which is due to the blocking behaviour of apply(): it is effectively single-process execution.
If print('return:', r) in the code (line 22) is changed to print('return:', r.get()):
```
8670
hello-0
8671
hello-1
8672
hello-2
8673
hello-3
8670
hello-4
8671
hello-5
8672
hello-6
8673
hello-7
8670
hello-8
8671
hello-9
Traceback (most recent call last):
  File "test1.py", line 22, in <module>
    print('return:', r.get())
AttributeError: 'str' object has no attribute 'get'
```
The final error is AttributeError: 'str' object has no attribute 'get': apply() returns the function's result directly (here a str), not an AsyncResult object, so there is no get() method to call.
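The difference can be shown side by side (echo is a hypothetical helper): apply() hands back the value itself, while apply_async() hands back an AsyncResult whose get() yields the value.

```python
import multiprocessing as mp
from multiprocessing.pool import AsyncResult

def echo(msg):
    return msg

p = mp.Pool(2)
r1 = p.apply(echo, ('sync',))         # blocks; returns the value itself
r2 = p.apply_async(echo, ('async',))  # returns an AsyncResult at once
print(type(r1).__name__)              # str
print(isinstance(r2, AsyncResult))    # True
print(r2.get())                       # get() exists only on AsyncResult
p.close()
p.join()
```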
3.3 Adding tasks with apply_async()
```python
import multiprocessing as mp
import os
from time import sleep

def worker(msg):
    print(os.getpid())
    sleep(2)
    print(msg)
    return msg

# Create a process pool object with 4 processes
p = mp.Pool(processes=4)
pool_result = []
for i in range(10):
    msg = 'hello-%d' % i
    r = p.apply_async(worker, (msg,))  # add an event to the process pool
    pool_result.append(r)              # collect the event's return value

for r in pool_result:
    print('return:', r)

p.close()  # close the pool; accept no more requests
p.join()   # wait for the events in the pool to finish, reclaim the pool
```
Output:
```
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e37d68>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e37e80>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e37f98>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e410f0>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e41208>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e41320>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e41438>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e41550>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e41668>
return: <multiprocessing.pool.ApplyResult object at 0x7f66d0e41780>
8739
8740
8742
8741
hello-0
hello-3
8742
hello-1
8739
8740
hello-2
8741
hello-5
8739
hello-6
8740
hello-7
hello-4
hello-8
hello-9
```
Attention:
(1) Because tasks are added asynchronously, this version runs much faster.
(2) The for loop that prints the return values runs immediately, before the workers have finished, so the first 10 lines of output come from that loop.
(3) r = p.apply_async(worker, (msg,)) returns an AsyncResult object, not the result itself.
(4) Because the tasks execute asynchronously, the output is "out of order", rather than printed sequentially as with apply().
Similarly, if print('return:', r) in the code (line 22) is changed to print('return:', r.get()):
Output:
```
8839
8840
8841
8842
hello-0
hello-1
hello-3
8839
hello-2
8842
8841
8840
return: hello-0
return: hello-1
return: hello-2
return: hello-3
hello-4
hello-5
8839
hello-6
8842
hello-7
return: hello-4
return: hello-5
return: hello-6
return: hello-7
hello-9
hello-8
return: hello-8
return: hello-9
```
Reference: Python learning notes: the Pool component of the multiprocessing module