Process pools and callback functions

Source: Internet
Author: User

I. Process pool (key topic)

When using Python for system administration, especially when operating on many file directories at once or controlling many remote hosts, running tasks in parallel can save a lot of time. Multiprocessing is one way to achieve concurrency. Points to note:

1. The number of tasks that need to run concurrently is usually much larger than the number of cores.

2. An operating system cannot open an unlimited number of processes; a common rule is to open about as many processes as there are cores.

3. Opening too many processes reduces efficiency (each process consumes system resources, and processes beyond the number of cores cannot truly run in parallel).

For example, when there are only a few tasks, you can spawn processes dynamically with multiprocessing.Process; a dozen or so is fine. But with hundreds or thousands of tasks, limiting the number of processes by hand is too cumbersome, and this is where a process pool comes in.
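As a rough illustration of the point above, the sketch below pushes 20 tasks through a pool of 3 workers and reports how many distinct worker PIDs actually ran them. The names work and run_tasks, the task count, and the sleep time are assumptions made for this example, not part of the original article.

```python
import os
import time
from multiprocessing import Pool


def work(n):
    # Simulate a short task and report which worker process ran it.
    time.sleep(0.05)
    return os.getpid()


def run_tasks(task_count=20, pool_size=3):
    # The pool caps concurrency at pool_size no matter how many tasks are queued.
    with Pool(pool_size) as pool:
        pids = pool.map(work, range(task_count))
    return set(pids)


if __name__ == "__main__":
    pids = run_tasks()
    print(f"{len(pids)} worker process(es) handled 20 tasks: {sorted(pids)}")
```

However many tasks are queued, the set of PIDs should never hold more than pool_size entries.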

1. The process pool class: if you specify numprocess as 3, the pool creates three processes from scratch and then reuses those three processes for all tasks, without starting any others.

Pool([numprocess [, initializer [, initargs]]])

2. Parameters:

1) numprocess: the number of worker processes to create (this parameter is named processes in multiprocessing.Pool). If omitted, the value of os.cpu_count() is used, i.e. the number of CPUs reported by the os module.
2) initializer: a callable executed when each worker process starts. The default is None.
3) initargs: the tuple of arguments passed to initializer.
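The following minimal sketch shows how initializer and initargs might be used to give every worker some per-process state. The names init_worker, label, and the "job" prefix are hypothetical and chosen only for illustration.

```python
from multiprocessing import Pool

_prefix = None  # set separately inside each worker process by init_worker


def init_worker(prefix):
    # Runs once in every worker process when it starts.
    global _prefix
    _prefix = prefix


def label(n):
    # Uses the state that init_worker stored in this worker process.
    return f"{_prefix}-{n}"


def main():
    with Pool(processes=2, initializer=init_worker, initargs=("job",)) as pool:
        return pool.map(label, range(4))


if __name__ == "__main__":
    print(main())
```

Note that initargs is a tuple, so a single argument needs a trailing comma.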

3. Main methods:

p.apply(func [, args [, kwargs]]): Executes func(*args, **kwargs) in one pool worker process and returns the result (in the actual signature the keyword parameter is named kwds). Note that this does not run func in all pool workers at once. To run func concurrently with different arguments, call p.apply() from different threads or use p.apply_async().

p.apply_async(func [, args [, kwargs [, callback]]]): Executes func(*args, **kwargs) in a pool worker process and returns immediately with an instance of the AsyncResult class rather than the result itself. callback is a callable that takes a single argument; when the result of func becomes available, it is passed to callback. callback must not perform blocking operations, otherwise it delays the handling of results from other asynchronous operations.

p.close(): Closes the process pool so that no further tasks can be submitted. Tasks that are still pending are completed before the worker processes exit.

p.join(): Waits for all worker processes to exit. This method can only be called after close() or terminate().
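The close() and join() ordering described above can be sketched as follows. The function square and the numbers used are assumptions for the example; the sketch relies on the pool raising ValueError when a task is submitted after close().

```python
from multiprocessing import Pool


def square(n):
    return n * n


def main():
    pool = Pool(2)
    result = pool.apply_async(square, (7,))  # returns an AsyncResult at once
    pool.close()                             # no new tasks from here on
    try:
        pool.apply_async(square, (8,))       # submitting after close() fails
        rejected = False
    except ValueError:
        rejected = True
    pool.join()                              # wait for the workers to exit
    return result.get(), rejected


if __name__ == "__main__":
    print(main())
```

The task submitted before close() still completes, which matches the description of close() letting pending work finish.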

4. Other methods (for reference)

The return values of apply_async() and map_async() are instances of AsyncResult. An instance obj has the following methods:

obj.get([timeout]): Returns the result, waiting for it to arrive if necessary. timeout is optional; if the result has not arrived within the specified time, multiprocessing.TimeoutError is raised. If the remote call raised an exception, it is raised again when this method is called.

obj.ready(): Returns True if the call has completed.

obj.successful(): Returns True if the call completed without raising an exception. Raises an exception if called before the result is ready.

obj.wait([timeout]): Waits for the result to become available.

p.terminate(): A method of the pool, not of AsyncResult. Immediately terminates all worker processes without any cleanup and without finishing pending work. If p is garbage collected, this function is called automatically.
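To make the AsyncResult methods concrete, here is a small sketch that exercises get() with a timeout, wait(), successful(), and the re-raising of a worker exception. The functions slow_square and fail and the timing values are hypothetical.

```python
import time
import multiprocessing
from multiprocessing import Pool


def slow_square(n):
    time.sleep(0.5)
    return n * n


def fail(_):
    raise ValueError("boom")


def main():
    with Pool(2) as pool:
        ok = pool.apply_async(slow_square, (4,))
        bad = pool.apply_async(fail, (0,))
        try:
            ok.get(timeout=0.01)   # far too short for a 0.5 s task
            timed_out = False
        except multiprocessing.TimeoutError:
            timed_out = True
        ok.wait()                  # block until each call has completed
        bad.wait()
        value = ok.get()
        try:
            bad.get()              # the worker's exception is raised again here
            reraised = False
        except ValueError:
            reraised = True
        return timed_out, value, ok.successful(), bad.successful(), reraised


if __name__ == "__main__":
    print(main())
```

successful() is only called after wait(), because calling it before the result is ready raises an exception.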

5. Application

1) apply: synchronous execution (blocking)

    from multiprocessing import Pool
    import os, time

    def work(n):
        print('%s run' % os.getpid())
        time.sleep(3)
        return n ** 2

    if __name__ == '__main__':
        p = Pool(3)  # create three processes from scratch; these same three run every task
        res_l = []
        for i in range(10):
            # Synchronous: blocks until this task finishes, then returns its result
            res = p.apply(work, args=(i,))
            res_l.append(res)
        print(res_l)

2) apply_async: asynchronous execution (non-blocking)

    from multiprocessing import Pool
    import os, time

    def work(n):
        print('%s run' % os.getpid())
        time.sleep(3)
        return n ** 2

    if __name__ == '__main__':
        p = Pool(3)  # create three processes from scratch; these same three run every task
        res_l = []
        for i in range(10):
            # Asynchronous: returns an AsyncResult immediately without waiting for the task
            res = p.apply_async(work, args=(i,))
            res_l.append(res)

        # With asynchronously submitted tasks, the main process must call join() to
        # wait for the pool to finish and then collect the results with get().
        # Otherwise the main process may end before the pool has run the tasks,
        # and everything ends together.
        p.close()  # no more tasks may be added to the pool
        p.join()
        for res in res_l:
            # get() fetches the result of apply_async. apply has no get() because it
            # runs synchronously and returns the result directly.
            print(res.get())

3) In detail: apply_async versus apply

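One way to see the difference between the two calls is to time them on the same tasks. The sketch below is an illustration under assumed values (three tasks of 0.3 seconds each on a pool of three workers); the helper names run_sync and run_async are hypothetical.

```python
import time
from multiprocessing import Pool


def work(n):
    time.sleep(0.3)
    return n * n


def run_sync(pool, items):
    # apply() blocks on every call, so the tasks run one after another.
    return [pool.apply(work, (i,)) for i in items]


def run_async(pool, items):
    # apply_async() queues every task first; the workers run them concurrently.
    pending = [pool.apply_async(work, (i,)) for i in items]
    return [r.get() for r in pending]


def main():
    items = range(3)
    with Pool(3) as pool:
        start = time.perf_counter()
        sync_results = run_sync(pool, items)
        sync_time = time.perf_counter() - start

        start = time.perf_counter()
        async_results = run_async(pool, items)
        async_time = time.perf_counter() - start
    return sync_results, async_results, sync_time, async_time


if __name__ == "__main__":
    sync_results, async_results, sync_time, async_time = main()
    print(f"apply:       {sync_results} in {sync_time:.2f}s")
    print(f"apply_async: {async_results} in {async_time:.2f}s")
```

Both variants return the same results; the synchronous run takes roughly the sum of the task times, while the asynchronous run takes roughly the time of one task.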

4) Improving the earlier connection loop

Start several clients concurrently: the server shows only 3 distinct PIDs at any time. When one client ends, a waiting client is accepted and handled by one of the 3 processes.

II. Callback functions

Scenario where a callback function is needed: as soon as any task in the process pool finishes, it informs the main process: I'm done, you can handle my result. The main process then calls a function to process that result; that function is the callback.

We can put the time-consuming (blocking) tasks into the process pool and specify a callback function (executed by the main process). The main process then skips the I/O wait when it runs the callback and receives the task's result directly.
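The claim that the callback runs in the main process can be checked with a small sketch that records PIDs on both sides. The names fetch and on_done are hypothetical, and no network access is involved; the callback is invoked in the main process by the pool's result-handling thread.

```python
import os
from multiprocessing import Pool


def fetch(n):
    # Runs in a worker process; returns its own PID along with the input.
    return {"n": n, "worker_pid": os.getpid()}


def main():
    seen = []

    def on_done(res):
        # Runs in the main process, so os.getpid() here is the parent's PID.
        seen.append((res["n"], res["worker_pid"], os.getpid()))

    pool = Pool(2)
    for n in range(4):
        pool.apply_async(fetch, (n,), callback=on_done)
    pool.close()
    pool.join()
    return seen


if __name__ == "__main__":
    for n, worker_pid, callback_pid in main():
        print(f"task {n}: worker {worker_pid}, callback ran in {callback_pid}")
```

Because the callback is never sent to a worker, it can be a nested function that appends to a local list.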

    # Install the requests module first (run in cmd): pip3 install requests
    from multiprocessing import Pool
    import requests
    import os
    import time

    def get_page(url):
        # Runs in a pool worker: download the page
        print('<%s> is getting [%s]' % (os.getpid(), url))
        response = requests.get(url)
        time.sleep(2)
        print('<%s> is done [%s]' % (os.getpid(), url))
        return {'url': url, 'text': response.text}

    def parse_page(res):
        # Callback, runs in the main process: record the page size
        print('<%s> parse [%s]' % (os.getpid(), res['url']))
        with open('db.txt', 'a') as f:
            parse_res = 'url:%s size:%s\n' % (res['url'], len(res['text']))
            f.write(parse_res)

    if __name__ == '__main__':
        p = Pool(4)
        urls = [
            'https://www.baidu.com',
            'http://www.openstack.org',
            'https://www.python.org',
            'https://help.github.com/',
            'http://www.sina.com.cn/'
        ]
        for url in urls:
            p.apply_async(get_page, args=(url,), callback=parse_page)
        p.close()
        p.join()
        print('Master', os.getpid())

If the main process waits for all tasks in the pool to finish and then processes the results together, no callback function is needed.
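A minimal sketch of that callback-free variant is shown below. To keep it runnable without network access, page_size is a hypothetical stand-in that measures the URL string instead of downloading the page; only the overall shape (collect everything, then process in the main process) is the point.

```python
from multiprocessing import Pool

URLS = [
    "https://www.python.org",
    "https://help.github.com/",
]


def page_size(url):
    # Stand-in for a real download: returns the length of the URL string.
    return {"url": url, "size": len(url)}


def main():
    with Pool(2) as pool:
        results = pool.map(page_size, URLS)  # blocks until every task is done
    # All results are available now, so they are processed together here.
    return ["url:%s size:%s" % (r["url"], r["size"]) for r in results]


if __name__ == "__main__":
    for line in main():
        print(line)
```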

