Python multi-process usage summary
Multi-process programming in Python mainly uses the multiprocessing library. Older Python versions may have problems with multiprocessing.Manager().Queue(); upgrading Python to a later version, for example 2.7.11, is recommended. For details, refer to "python version upgrade".
For details about how to use a thread pool in Python, refer to "python thread pool implementation".
1. Multi-process usage
1. On Linux, the os.fork function can be used.
#!/bin/env python
import os

print 'Process (%s) start...' % os.getpid()
pid = os.fork()
if pid == 0:
    print 'I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid())
    os._exit(1)
else:
    print 'I (%s) just created a child process (%s).' % (os.getpid(), pid)
Output
Process (22246) start...
I (22246) just created a child process (22247).
I am child process (22247) and my parent is 22246.
2. Use multiprocessing
#!/bin/env python
from multiprocessing import Process
import os
import time

def run_proc(name):
    time.sleep(3)
    print 'Run child process %s (%s)...' % (name, os.getpid())

if __name__ == '__main__':
    print 'Parent process %s.' % os.getpid()
    processes = list()
    for i in range(5):
        p = Process(target=run_proc, args=('test',))
        print 'Process will start.'
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    print 'Process end.'
Output
Parent process 38140.
Process will start.
Process will start.
Process will start.
Process will start.
Process will start.
Run child process test (38141)...
Run child process test (38142)...
Run child process test (38143)...
Run child process test (38145)...
Run child process test (38144)...
Process end.

real 0m3.028s
user 0m0.021s
sys 0m0.004s
2. Process pool
1. Use multiprocessing.Pool, non-blocking version
#!/bin/env python
import multiprocessing
import time

def func(msg):
    print "msg:", msg
    time.sleep(3)
    print "end"

if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=3)
    for i in xrange(3):
        msg = "hello %d" % (i)
        pool.apply_async(func, (msg, ))
    print "Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~"
    pool.close()
    pool.join()    # join() must come after close() or terminate()
    print "Sub-process(es) done."
Running result
Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
msg: hello 0
msg: hello 1
msg: hello 2
end
end
end
Sub-process(es) done.

real 0m3.493s
user 0m0.056s
sys 0m0.022s
2. Use multiprocessing.Pool, blocking version
#!/bin/env python
import multiprocessing
import time

def func(msg):
    print "msg:", msg
    time.sleep(3)
    print "end"

if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=3)
    for i in xrange(3):
        msg = "hello %d" % (i)
        pool.apply(func, (msg, ))
    print "Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~"
    pool.close()
    pool.join()    # join() must come after close() or terminate()
    print "Sub-process(es) done."
Running result
msg: hello 0
end
msg: hello 1
end
msg: hello 2
end
Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
Sub-process(es) done.

real 0m9.061s
user 0m0.036s
sys 0m0.019s
The only difference between the two examples is apply_async versus apply: the former is non-blocking, while the latter blocks until each task finishes. As the timings show, the blocking version is slower by roughly a factor of the pool size (three 3-second tasks run one after another take about 9 s, versus about 3 s when they run in parallel).
3. Use multiprocessing.Pool and collect the results
import multiprocessing
import time

def func(msg):
    print "msg:", msg
    time.sleep(3)
    print "end"
    return "done" + msg

if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=4)
    result = []
    for i in xrange(3):
        msg = "hello %d" % (i)
        result.append(pool.apply_async(func, (msg, )))
    pool.close()
    pool.join()
    for res in result:
        print ":::", res.get()
    print "Sub-process(es) done."
Running result
msg: hello 0
msg: hello 1
msg: hello 2
end
end
end
::: donehello 0
::: donehello 1
::: donehello 2
Sub-process(es) done.

real 0m3.526s
user 0m0.054s
sys 0m0.024s
4. Use multiprocessing.Pool inside a class
Errors may occur when you use the process pool inside a class:
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
This error occurs because multiprocessing.Pool communicates with its worker processes through a Queue, and everything placed on that Queue must be serializable (picklable). In Python 2, bound instance methods are not picklable, so passing self.f to the pool fails. For example:
#!/bin/env python
import multiprocessing

class SomeClass(object):
    def __init__(self):
        pass

    def f(self, x):
        return x*x

    def go(self):
        pool = multiprocessing.Pool(processes=4)
        #result = pool.apply_async(self.f, [10])
        #print result.get(timeout=1)
        print pool.map(self.f, range(10))

SomeClass().go()
Running result
Traceback (most recent call last):
  File "4.py", line 18, in <module>
    SomeClass().go()
  File "4.py", line 16, in go
    print pool.map(self.f, range(10))
  File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
Solution (1): pass an ordinary module-level function into the class:
#!/bin/env python
import multiprocessing

def func(x):
    return x*x

class SomeClass(object):
    def __init__(self, func):
        self.f = func

    def go(self):
        pool = multiprocessing.Pool(processes=4)
        #result = pool.apply_async(self.f, [10])
        #print result.get(timeout=1)
        print pool.map(self.f, range(10))

SomeClass(func).go()
Output result:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Solution (2): in general, if the processing logic lives in the class and we want to minimize code changes, we can use a module-level wrapper function that receives the instance as an argument:
#!/bin/env python
import multiprocessing

class SomeClass(object):
    def __init__(self):
        pass

    def f(self, x):
        return x*x

    def go(self):
        result = list()
        pool = multiprocessing.Pool(processes=4)
        for i in range(10):
            result.append(pool.apply_async(func, [self, i]))
        pool.close()
        pool.join()
        for res in result:
            print res.get(timeout=1)

def func(client, x):
    return client.f(x)

SomeClass().go()
Output result:
0
1
4
9
16
25
36
49
64
81
Note the following when using solution (2): if the SomeClass instance contains any non-serializable data, an error is raised, usually when res.get() is called. In that case, check whether the class holds any unpicklable members (locks, file handles, database connections, and so on); if it does, move them to a module-level global variable so they are not pickled along with the instance. A rough sketch of this fix is shown below.
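As an illustration only (the log file and its use here are hypothetical, not taken from the code above): the unpicklable resource is kept at module level instead of on self, so pickling the SomeClass instance no longer touches it.

#!/bin/env python
import multiprocessing

# Hypothetical unpicklable resource: kept as a module-level global instead of
# an attribute on self, so pickling the SomeClass instance never touches it.
log_file = open("/tmp/worker.log", "a")

class SomeClass(object):
    def f(self, x):
        log_file.write("handling %d\n" % x)   # workers use the global resource
        log_file.flush()
        return x * x

    def go(self):
        pool = multiprocessing.Pool(processes=4)
        result = [pool.apply_async(func, [self, i]) for i in range(10)]
        pool.close()
        pool.join()
        for res in result:
            print res.get(timeout=1)

def func(client, x):
    return client.f(x)

if __name__ == "__main__":
    SomeClass().go()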
3. Use the thread pool in multiple processes
One scenario requires combining multi-process and multi-thread. The task is CPU-intensive: processing a single IP address takes about 0.04 s. Running single-threaded takes about 3m32s, with the one process at 100% CPU. A process pool (size = 10) takes about 6m50s, with only one process at around 90% CPU and the others around 30%; a thread pool (size = 10) takes about 4m39s, again with a single CPU at 100%.
So plain multi-processing brings no advantage here and is actually slower: each IP takes only 0.04 s to handle, so the cost of dispatching work to and switching between processes dominates. The thread pool, limited by the GIL to a single core, cannot be made faster by adding more threads either. The solution is to combine the two: multiple processes, each running its own thread pool, as in the code below.
def run(self):
    self.getData()
    ipNums = len(self.ipInfo)
    step = ipNums / multiprocessing.cpu_count()   # IPs handled per process
    ipList = list()
    i = 0
    j = 1
    processList = list()
    for ip in self.ipInfo:
        ipList.append(ip)
        i += 1
        if i == step * j or i == ipNums:          # one chunk is full
            j += 1

            def innerRun():
                # each process runs a thread pool over its own chunk of IPs
                wm = Pool.ThreadPool(CONF.POOL_SIZE)
                for myIp in ipList:
                    wm.addJob(self.handleOne, myIp)
                wm.waitForComplete()

            process = multiprocessing.Process(target=innerRun)
            process.start()
            processList.append(process)
            ipList = list()
    for process in processList:
        process.join()
On a machine with 8 CPUs, using 8 processes each running its own thread pool brings the time down to about 35 s, with each of the 8 CPUs at around 50% utilization and the machine's average CPU usage around 75%.
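The run() snippet above depends on the custom Pool.ThreadPool from the thread-pool article. For reference, here is a self-contained sketch of the same processes-plus-threads pattern using only the standard library (multiprocessing.dummy provides a thread-based Pool); handle_one and the IP list are placeholders, not the original processing logic.

#!/bin/env python
import multiprocessing
from multiprocessing.dummy import Pool as ThreadPool   # thread-based pool

def handle_one(ip):
    # placeholder for the real per-IP processing (~0.04 s of work each)
    return sum(int(part) for part in ip.split('.'))

def worker(chunk):
    # each process runs a thread pool over its own chunk of IPs
    tp = ThreadPool(10)
    tp.map(handle_one, chunk)
    tp.close()
    tp.join()

if __name__ == '__main__':
    ips = ['10.0.%d.%d' % (i / 256, i % 256) for i in range(1024)]
    nproc = multiprocessing.cpu_count()
    chunk_size = (len(ips) + nproc - 1) / nproc         # ceiling division
    chunks = [ips[k:k + chunk_size] for k in range(0, len(ips), chunk_size)]
    procs = [multiprocessing.Process(target=worker, args=(c,)) for c in chunks]
    for p in procs:
        p.start()
    for p in procs:
        p.join()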
4. Multi-process communication
Personally I mostly use Manager for inter-process communication. For the other mechanisms, and for distributed multi-process in particular, see Liao Xuefeng's site: http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/001386832973658c780d8bfa4c6406f83b2b3097aed5df6000
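As a minimal sketch of Manager-based communication (an assumed usage example, not code from the original post): a Manager().Queue() proxy can be handed to Pool workers as an argument, which a plain multiprocessing.Queue() cannot.

#!/bin/env python
import multiprocessing

def worker(queue, n):
    # the queue argument is a Manager proxy, so it pickles cleanly
    queue.put(n * n)

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    queue = manager.Queue()               # proxy shared across processes
    pool = multiprocessing.Pool(processes=4)
    for i in range(10):
        pool.apply_async(worker, (queue, i))
    pool.close()
    pool.join()
    while not queue.empty():
        print queue.get()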