This article walks through Python processes and process pools (the multiprocessing library) in detail. It should be of practical value; readers who need it can use it as a reference.
Environment: Windows 7 + Python 2.7
I had long wanted to learn multiprocessing and multithreading, but until now I only had some basic knowledge from simple introductions and could not see how to apply it. Recently I came across a crawler project on GitHub that used multiple processes and threads, so I looked up the relevant material on Baidu as I read it. Here I write down what I learned, along with some applications, as a record.
First, what is a process? A process is one execution of a program on a computer: whenever a program is run, a process is started. Processes are divided into system processes and user processes. Processes that carry out the operating system's own functions are system processes; they are the operating system itself in its running state. All the processes you start yourself are user processes. The process is the unit in which the operating system allocates resources.
Intuitively, in Task Manager the processes whose user name is SYSTEM are system processes and those marked Administrator are user processes; NETWORK SERVICE is the network service and LOCAL SERVICE is the local service. For more detail on processes, see Wikipedia. I will save myself the effort here, otherwise I would never find my way back.
I. Simple use of multiple processes
multiprocessing has a large number of functions, many of which I have not studied yet; this article covers only what I know so far.
Creating a process: Process(target=function to run, name=custom process name (optional), args=(arguments,))
Methods:
is_alive(): checks whether the process is still alive.
join([timeout]): waits for the child process to finish before moving on to the next step. timeout is the maximum time to wait; a process sometimes gets stuck, and setting a timeout lets the program carry on.
run(): the method that is executed in the child process. By default it calls target; if no target was specified when the process object was created, the default run() does nothing.
start(): starts the process; not to be confused with run().
terminate(): terminates the process. Terminating a process is not quite that simple, and the psutil package seems to handle it better; I will write more when I have had a chance to learn about it.
Note that a Process is started with start().
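The combination of join([timeout]) and terminate() described above can be sketched as follows. This is only a minimal illustration, not code from the crawler project: slow_task and run_with_timeout are hypothetical names, and the sleep simply stands in for a stuck job. It uses single-argument print(...) calls so that it should run on both Python 2.7 and Python 3.

```python
import time
from multiprocessing import Process


def slow_task(seconds):
    # Hypothetical worker: sleeping stands in for a job that may hang.
    time.sleep(seconds)


def run_with_timeout(seconds, timeout):
    """Run slow_task in a child process, waiting at most `timeout` seconds.

    Returns True if the child finished by itself, False if it had to be
    terminated.
    """
    p = Process(target=slow_task, args=(seconds,))
    p.start()
    p.join(timeout)      # returns after `timeout` even if p is still running
    if p.is_alive():
        p.terminate()    # forcibly stop the child
        p.join()         # wait for it to exit so that exitcode is set
        return False
    return True


if __name__ == '__main__':
    print('short task finished by itself: %s' % run_with_timeout(0.1, 10))
    print('long task finished by itself: %s' % run_with_timeout(30, 0.5))
```

Calling join() again after terminate() is a deliberate choice here: terminate() only requests the stop, and the second join() waits until the child has actually gone.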
Attributes:
authkey: the documentation says only "Set authorization key of the process". I have not found a worked example, so how is this key actually used? The documentation I read does not say.
daemon: if True, the process is terminated automatically when the parent process exits, and it cannot create child processes of its own. It must be set before start().
exitcode: None while the process is running; a value of -N means the process was ended by signal N.
name: the name of the process; can be customized.
pid: the process ID. Each process has a unique pid.
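To see how these attributes change over the life of a process, a small sketch along the following lines may help. The names nap and describe are hypothetical helpers introduced only for this illustration, and the code is written to run on both Python 2.7 and Python 3.

```python
import time
from multiprocessing import Process


def nap(seconds):
    # Hypothetical worker that just sleeps.
    time.sleep(seconds)


def describe(p):
    # Collect the attributes discussed above into a plain dict.
    return {'name': p.name, 'pid': p.pid, 'alive': p.is_alive(),
            'exitcode': p.exitcode, 'daemon': p.daemon}


if __name__ == '__main__':
    p = Process(target=nap, name='nap process', args=(0.2,))
    p.daemon = True                          # must be set before start()
    print('before start: %s' % describe(p))  # pid and exitcode are still None
    p.start()
    print('running:      %s' % describe(p))  # pid assigned, exitcode still None
    p.join()
    print('after join:   %s' % describe(p))  # no longer alive, exitcode is set
```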
1. Process(), start(), join()
# -*- coding: utf-8 -*-
from multiprocessing import Process
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    p1 = Process(target=fun1, args=(4,))
    p2 = Process(target=fun2, args=(6,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    b = time.time()
    print 'Finish', b - a
Two processes are created here, p1 and p2. In args=(4,), the 4 is the argument passed to fun1; args must be a tuple, so with two or more arguments it becomes args=(arg1, arg2, ...). The processes are then started with start(), and the two join() calls make the main program wait for both p1 and p2 to finish before going on. As the results below show, fun2 and fun1 start running at almost the same time, and only once both have finished (fun1 sleeps for 4 seconds, fun2 for 6) is the statement print 'Finish', b - a executed.
This is fun2 Mon Jun 05 13:48:04 2017
This is fun1 Mon Jun 05 13:48:04 2017
fun1 finish Mon Jun 05 13:48:08 2017
fun2 finish Mon Jun 05 13:48:10 2017
Finish 6.20300006866
Process finished with exit code 0
Now let's see what happens when start() and join() are placed in a different order.
# -*- coding: utf-8 -*-
from multiprocessing import Process
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    p1 = Process(target=fun1, args=(4,))
    p2 = Process(target=fun2, args=(6,))
    p1.start()
    p1.join()
    p2.start()
    p2.join()
    b = time.time()
    print 'Finish', b - a
Results:
This is fun1 Mon Jun 05 14:19:28 2017
fun1 finish Mon Jun 05 14:19:32 2017
This is fun2 Mon Jun 05 14:19:32 2017
fun2 finish Mon Jun 05 14:19:38 2017
Finish 10.1229999065
Process finished with exit code 0
This time fun1 runs first, fun2 runs only after fun1 has finished, and 'Finish' is printed last. In other words, process p1 runs to completion before process p2 is even started, because p1.join() comes before p2.start(). That is the power of join(). Now try commenting out one of the join() calls and see what happens.
# -*- coding: utf-8 -*-
from multiprocessing import Process
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    p1 = Process(target=fun1, args=(4,))
    p2 = Process(target=fun2, args=(6,))
    p1.start()
    p2.start()
    p1.join()
    # p2.join()
    b = time.time()
    print 'Finish', b - a
Results:
This is fun1 Mon Jun 05 14:23:57 2017
This is fun2 Mon Jun 05 14:23:58 2017
fun1 finish Mon Jun 05 14:24:01 2017
Finish 4.05900001526
fun2 finish Mon Jun 05 14:24:04 2017
Process finished with exit code 0
This time the main program waits for fun1 (because join() was called on p1, the main program does not go on until p1 has finished), then continues with print 'Finish', b - a in the main process, and fun2 finishes last.
2. name, daemon, is_alive()
# -*- coding: utf-8 -*-
from multiprocessing import Process
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    p1 = Process(name='fun1 process', target=fun1, args=(4,))
    p2 = Process(name='fun2 process', target=fun2, args=(6,))
    p1.daemon = True
    p2.daemon = True
    p1.start()
    p2.start()
    p1.join()
    print p1, p2
    print 'Process 1:', p1.is_alive(), 'Process 2:', p2.is_alive()
    # p2.join()
    b = time.time()
    print 'Finish', b - a
Results:
This is fun2 Mon Jun 05 14:43:49 2017
This is fun1 Mon Jun 05 14:43:49 2017
fun1 finish Mon Jun 05 14:43:53 2017
<Process(fun1 process, stopped daemon)> <Process(fun2 process, started daemon)>
Process 1: False Process 2: True
Finish 4.06500005722
Process finished with exit code 0
As you can see, name simply gives the process a name. By the time the line print 'Process 1:', p1.is_alive(), 'Process 2:', p2.is_alive() runs, p1 has already ended (is_alive() returns False) while p2 is still running (returns True). Because join() is not called on p2, the main process carries straight on, and because daemon = True was set, p2 is terminated automatically when the parent process exits: the whole program ends without p2 ever finishing.
3. run()
run() is the method a process executes once it has been started. If the process was created without a target function, the default run() is still called, but it has nothing to do.
# -*- coding: utf-8 -*-
from multiprocessing import Process
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    p = Process()
    p.start()
    p.join()
    b = time.time()
    print 'Finish', b - a
Results:
Finish 0.0840001106262
As the results show, process p did nothing. To make the process do some work, we can write it like this:
The target function takes no arguments:
# -*- coding: utf-8 -*-
from multiprocessing import Process
import time

def fun1():
    print 'This is fun1', time.ctime()
    time.sleep(2)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    p = Process()
    p.run = fun1
    p.start()
    p.join()
    b = time.time()
    print 'Finish', b - a
Results:
This is fun1 Mon Jun 05 16:34:41 2017
fun1 finish Mon Jun 05 16:34:43 2017
Finish 2.11500000954
Process finished with exit code 0
The target function has parameters:
# -*- coding: utf-8 -*-
from multiprocessing import Process
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    p = Process()
    p.run = fun1(2)
    p.start()
    p.join()
    b = time.time()
    print 'Finish', b - a
Results:
This is fun1 Mon Jun 05 16:36:27 2017
fun1 finish Mon Jun 05 16:36:29 2017
Process Process-1:
Traceback (most recent call last):
  File "E:\Anaconda2\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
TypeError: 'NoneType' object is not callable
Finish 2.0529999733
Process finished with exit code 0
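When the target function takes an argument, this approach raises an exception. Why? I could not find the reason at first, but the traceback points to it: the expression fun1(2) is evaluated immediately in the main process (which is why the fun1 output still appears), and its return value, None, is what gets assigned to p.run. When the child process then calls self.run(), it is calling None, hence the TypeError. Assigning to run only works with a function that needs no arguments; to pass arguments, use target and args when creating the Process.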
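One common way to give run() its arguments is to subclass Process and store them on the instance. The following is a minimal sketch of that idea, not something the original code does: SleepWorker is a hypothetical class name, and the code is written to run on both Python 2.7 and Python 3.

```python
import time
from multiprocessing import Process


class SleepWorker(Process):
    """Hypothetical Process subclass whose run() reads its argument from self."""

    def __init__(self, seconds):
        super(SleepWorker, self).__init__()
        self.seconds = seconds

    def run(self):
        # Executed in the child process once start() has been called.
        print('worker sleeping for %s s' % self.seconds)
        time.sleep(self.seconds)
        print('worker done')


if __name__ == '__main__':
    p = SleepWorker(0.2)
    p.start()
    p.join()
    print('exit code: %s' % p.exitcode)
```

For a plain function, Process(target=fun1, args=(2,)) achieves the same thing without a subclass.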
II. Process pool
Process is easy enough to use when we need a handful of processes, or even a dozen or so, but creating hundreds or thousands of them by hand with Process would obviously be too clumsy. For this, multiprocessing provides the Pool class, the process pool discussed here. It lets you queue up a large number of tasks and set a cap on how many processes run at once: only that many run at a time, and as each one finishes, the next task is started.
Pool(processes=num): sets the number of processes that run at the same time. When one finishes its task, it takes on the next waiting task.
apply_async(function, (arguments,)): non-blocking; the arguments are passed as a tuple.
apply(function, (arguments,)): blocking.
close(): closes the pool; no new tasks can be added.
terminate(): stops the running processes immediately; unfinished tasks are not processed.
join(): works as described for Process, but must be called after close() or terminate().
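The usual order of these calls (submit, close(), join()) can be sketched as below. The sketch also shows something the list above does not cover: apply_async() returns a result handle whose get() method gives back the function's return value. The function square is a hypothetical example, and the code is written to run on both Python 2.7 and Python 3.

```python
from multiprocessing import Pool


def square(x):
    # Hypothetical task: return a value so there is something to collect.
    return x * x


if __name__ == '__main__':
    pool = Pool(processes=3)
    # apply_async returns at once with a handle for the pending result.
    handles = [pool.apply_async(square, (i,)) for i in range(5)]
    pool.close()   # no more tasks will be submitted
    pool.join()    # wait until the queued tasks have all been processed
    # get() returns the task's return value (or re-raises its exception).
    print([h.get(timeout=10) for h in handles])   # prints [0, 1, 4, 9, 16]
```

The timeout passed to get() is only a safety net so that the script cannot wait forever if a task never completes.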
1. Single process pool
# -*- coding: utf-8 -*-
from multiprocessing import Pool
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    pool = Pool(processes=3)  # at most 3 processes run at a time
    for i in range(3, 8):
        pool.apply_async(fun1, (i,))
    pool.close()
    pool.join()
    b = time.time()
    print 'Finish', b - a
Results:
This is fun1 Mon Jun 05 15:15:38 2017
This is fun1 Mon Jun 05 15:15:38 2017
This is fun1 Mon Jun 05 15:15:38 2017
fun1 finish Mon Jun 05 15:15:41 2017
This is fun1 Mon Jun 05 15:15:41 2017
fun1 finish Mon Jun 05 15:15:42 2017
This is fun1 Mon Jun 05 15:15:42 2017
fun1 finish Mon Jun 05 15:15:43 2017
fun1 finish Mon Jun 05 15:15:47 2017
fun1 finish Mon Jun 05 15:15:49 2017
Finish 11.1370000839
Process finished with exit code 0
As the results show, with the cap set to 3 running processes, three tasks start at 15:15:38. When the first one ends (the one whose argument is 3 seconds), a new task is started, and so on until the pool has run everything; only then does the main process execute b = time.time() and print 'Finish', b - a. That was the non-blocking apply_async(); now compare it with the blocking apply().
# -*- coding: utf-8 -*-
from multiprocessing import Pool
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    pool = Pool(processes=3)  # at most 3 processes run at a time
    for i in range(3, 8):
        pool.apply(fun1, (i,))
    pool.close()
    pool.join()
    b = time.time()
    print 'Finish', b - a
Results:
This is fun1 Mon Jun 05 15:59:26 2017
fun1 finish Mon Jun 05 15:59:29 2017
This is fun1 Mon Jun 05 15:59:29 2017
fun1 finish Mon Jun 05 15:59:33 2017
This is fun1 Mon Jun 05 15:59:33 2017
fun1 finish Mon Jun 05 15:59:38 2017
This is fun1 Mon Jun 05 15:59:38 2017
fun1 finish Mon Jun 05 15:59:44 2017
This is fun1 Mon Jun 05 15:59:44 2017
fun1 finish Mon Jun 05 15:59:51 2017
Finish 25.1610000134
Process finished with exit code 0
As you can see, with blocking calls each task must finish before the next one starts. In general we use the non-blocking apply_async().
2. Multiple process pools
The above used the pool for a single function. To run several functions through the pool, we can use a for loop; see the code directly.
# -*- coding: utf-8 -*-
from multiprocessing import Pool
import time

def fun1(t):
    print 'This is fun1', time.ctime()
    time.sleep(t)
    print 'fun1 finish', time.ctime()

def fun2(t):
    print 'This is fun2', time.ctime()
    time.sleep(t)
    print 'fun2 finish', time.ctime()

if __name__ == '__main__':
    a = time.time()
    pool = Pool(processes=3)  # at most 3 processes run at the same time
    for fun in [fun1, fun2]:
        for i in range(3, 8):
            pool.apply_async(fun, (i,))
    pool.close()
    pool.join()
    b = time.time()
    print 'Finish', b - a
Results:
This is fun1 Mon Jun 05 16:04:38 2017
This is fun1 Mon Jun 05 16:04:38 2017
This is fun1 Mon Jun 05 16:04:38 2017
fun1 finish Mon Jun 05 16:04:41 2017
This is fun1 Mon Jun 05 16:04:41 2017
fun1 finish Mon Jun 05 16:04:42 2017
This is fun1 Mon Jun 05 16:04:42 2017
fun1 finish Mon Jun 05 16:04:43 2017
This is fun2 Mon Jun 05 16:04:43 2017
fun2 finish Mon Jun 05 16:04:46 2017
This is fun2 Mon Jun 05 16:04:46 2017
fun1 finish Mon Jun 05 16:04:47 2017
This is fun2 Mon Jun 05 16:04:47 2017
fun1 finish Mon Jun 05 16:04:49 2017
This is fun2 Mon Jun 05 16:04:49 2017
fun2 finish Mon Jun 05 16:04:50 2017
This is fun2 Mon Jun 05 16:04:50 2017
fun2 finish Mon Jun 05 16:04:52 2017
fun2 finish Mon Jun 05 16:04:55 2017
fun2 finish Mon Jun 05 16:04:57 2017
Finish 19.1670000553
Process finished with exit code 0
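As you can see, the fun2 tasks are queued behind the fun1 tasks and start running as processes become free.
In addition, for a function that takes no arguments, simply call pool.apply_async(function); there is no need to pass an argument tuple.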
While learning to write these programs, I ran the code directly without if __name__ == '__main__': and the results were wrong. After looking into it, I found that to use the multiprocessing module on Windows, the code that creates processes must be placed under an if __name__ == '__main__': statement in the current .py file; only then does it work normally on Windows. On Unix/Linux this is not required. The reason is that on Windows, when a child process is started, the .py file you wrote is read in and executed as a module, so you must check whether it is being run as __main__. That is:
if __name__ == '__main__':
    # do something
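A small sketch can make the effect of the guard visible. The names work and main are hypothetical, and the code is written to run on both Python 2.7 and Python 3. The module-level print runs once in the parent; on Windows, where each child has to import the file again, it should appear once more per child, while the guarded block runs only in the parent.

```python
import os
from multiprocessing import Process

# Module-level code: runs in the parent, and again in any child process that
# has to import this file (as happens on Windows).
print('importing module in pid %s' % os.getpid())


def work(n):
    # Hypothetical task executed in the child process.
    print('child pid %s got %s' % (os.getpid(), n))


def main():
    procs = [Process(target=work, args=(i,)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]


if __name__ == '__main__':
    # Only the process that was launched as the script reaches this point.
    print('exit codes: %s' % main())
```

Without the guard, each child that imports the file would call main() itself and try to create processes of its own, which is the kind of wrong result described above.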
I am not sure I have explained this correctly; I look forward to corrections from anyone who understands it better.