The question of how Python's multiprocessing really works

Last Update:2016-04-16 Source: Internet

Author: User

It is well known that, because of the Global Interpreter Lock (GIL) in Python (CPython), threads cannot execute Python code in parallel. The multiprocessing module achieves parallelism with multiple processes instead of multiple threads, which sidesteps the GIL and eases the problem to a certain extent.
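As a rough illustration of this point, the sketch below runs the same CPU-bound function through a thread pool and a process pool from the standard concurrent.futures module. The function names cpu_bound and timed_run, the workload size, and the worker count of 4 are assumptions chosen for illustration. On a standard CPython build with the GIL, the threaded run would usually take about as long as running the tasks one after another, while the process run can use several cores. No timings are shown because they depend on the machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound(n):
    """Pure-Python arithmetic loop: sum of the squares of 0 .. n-1."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(executor_cls, tasks, workers=4):
    """Map cpu_bound over tasks with the given executor; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as executor:
        results = list(executor.map(cpu_bound, tasks))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    tasks = [2_000_000] * 4  # hypothetical workload size
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        _, seconds = timed_run(cls, tasks)
        print("%s: %.3f seconds" % (cls.__name__, seconds))
```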

However, multiprocessing has bottlenecks of its own. An important one is that processes do not share memory by default, whereas threads do. This means that data exchanged between processes must be packaged, passed, and unpacked. In Python terms, this means:

"Pickle in the main process and send to the subprocess;

Unpickle in the subprocess into an object in memory;

Pickle the result and return it to the main process;

Unpickle in the main process and put the result back in memory."

(For details, see the comments under this question.)
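The four steps can be imitated inside a single process with the standard pickle module. This is only a sketch of the data flow: the function names worker_side and round_trip are hypothetical, and the real multiprocessing module additionally moves the bytes between processes. It also shows a consequence that is easy to miss: the worker operates on a copy, so the list in the main process is left untouched.

```python
import pickle


def worker_side(payload):
    """Imitate the subprocess: unpickle the argument, work on it, pickle the result."""
    data = pickle.loads(payload)   # step 2: unpickle into an object in memory
    data.reverse()                 # the actual work, done on the copy
    return pickle.dumps(data)      # step 3: pickle the result for the trip back


def round_trip(data):
    """Imitate the main process: pickle the argument, then unpickle the reply."""
    payload = pickle.dumps(data)   # step 1: pickle in the main process
    reply = worker_side(payload)
    return pickle.loads(reply)     # step 4: unpickle in the main process


if __name__ == "__main__":
    original = list(range(10))
    print(round_trip(original))    # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
    print(original)                # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```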

Therefore, when large amounts of data have to be passed between processes, multiple processes may perform worse than a single process.
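To get a feel for this overhead, one can compare the time spent on the work itself with the time spent serializing the data. The sketch below assumes the work is an in-place list reversal, as in the test later in this article. The helper name measure and the keys it returns are hypothetical, and the actual numbers depend on the machine and the Python version.

```python
import pickle
import random
import time


def measure(data):
    """Time an in-place reversal and a pickle round trip of the same list."""
    start = time.perf_counter()
    data.reverse()
    reverse_seconds = time.perf_counter() - start

    start = time.perf_counter()
    payload = pickle.dumps(data)
    pickle.loads(payload)
    pickle_seconds = time.perf_counter() - start

    return {
        "reverse_seconds": reverse_seconds,
        "pickle_seconds": pickle_seconds,
        "payload_bytes": len(payload),
    }


if __name__ == "__main__":
    numbers = [random.randrange(0, 100000) for _ in range(100000)]
    for key, value in measure(numbers).items():
        print(key, value)
```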

In addition, when the program is I/O-intensive rather than computationally intensive, the extra reads and writes introduced by multiple processes can offset any gain in speed. If the task is too simple to need parallelism at all, the time spent setting up the processes (or the pool) is likely to exceed the running time of the task itself. Finally, if the n chosen for the process pool multiprocessing.Pool(n) exceeds the number of cores of the current CPU (multiprocessing.cpu_count()), the cost of switching between processes can greatly reduce efficiency.
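One simple way to avoid oversubscribing the CPU is to cap the pool size at both the number of cores and the number of tasks. The helper choose_pool_size below is a hypothetical name, and the policy is only a rule of thumb for CPU-bound work, not something the multiprocessing module does on its own.

```python
import multiprocessing


def choose_pool_size(n_tasks, requested=None):
    """Return a worker count no larger than the core count or the task count."""
    cores = multiprocessing.cpu_count()
    upper = min(cores, max(1, n_tasks))
    if requested is None:
        return upper
    return max(1, min(requested, upper))


if __name__ == "__main__":
    print("cores:", multiprocessing.cpu_count())
    print("workers for 7 tasks:", choose_pool_size(7))
```

The result could then be passed to the pool constructor, for example multiprocessing.Pool(choose_pool_size(7)).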

For a visual impression of how threads and processes relate, you can refer to this article.

For a quick and complete picture of Python's Global Interpreter Lock (GIL) problem, refer to this nice blog post.

To understand how to use multiprocessing, I ran some tests on a 4-core MacBook Air, as follows:

    import random
    import time
    from multiprocessing import Process, Manager, Pool

    def f(l):
        # Reverse the list in place; nothing is returned.
        l.reverse()

    def main():
        # Seven lists of 100,000 random integers each.
        L1 = [random.randrange(0, 100000, 1) for i in range(0, 100000)]
        L2 = [random.randrange(0, 100000, 1) for i in range(0, 100000)]
        L3 = [random.randrange(0, 100000, 1) for i in range(0, 100000)]
        L4 = [random.randrange(0, 100000, 1) for i in range(0, 100000)]
        L5 = [random.randrange(0, 100000, 1) for i in range(0, 100000)]
        L6 = [random.randrange(0, 100000, 1) for i in range(0, 100000)]
        L7 = [random.randrange(0, 100000, 1) for i in range(0, 100000)]
        lists = [L1, L2, L3, L4, L5, L6, L7]

        # 1. Plain for loop.
        s = time.time()
        for l in lists:
            f(l)
        print("%s seconds" % (time.time() - s))

        # 2. Built-in map(); list() forces evaluation on Python 3, where map() is lazy.
        s = time.time()
        list(map(f, lists))
        print("%s seconds" % (time.time() - s))

        # 3. Process pool with 4 workers; pool creation is not included in the timing.
        p = Pool(4)
        s = time.time()
        p.map(f, lists)
        print("%s seconds" % (time.time() - s))
        p.close()
        p.join()

    if __name__ == "__main__":
        main()

That is, the test times f() on the seven lists L1 through L7 in three ways: first with a plain loop, then with Python's very useful built-in map() function, and finally with multiprocessing.Pool.map(), using a pool of 4 worker processes among which the tasks are distributed.
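To see how a pool spreads tasks over its workers, each task can report the ID of the process that ran it. This is a sketch: the function name tag_with_pid is hypothetical, and which worker picks up which task is not fixed, so the grouping will differ from run to run.

```python
import os
from multiprocessing import Pool


def tag_with_pid(task_id):
    """Return the task ID together with the ID of the process that handled it."""
    return task_id, os.getpid()


if __name__ == "__main__":
    with Pool(4) as pool:
        tagged = pool.map(tag_with_pid, range(7))
    print("main process:", os.getpid())
    for task_id, pid in tagged:
        print("task %d ran in process %d" % (task_id, pid))
```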

The timer is reset before each approach. The results are as follows:

    >>> main()
    0.00250101089478 seconds
    0.000663995742798 seconds
    0.907639980316 seconds

The multi-process version is surprisingly slow, while map() is considerably faster than the explicit loop.

So not every task is suited to multiple processes (I do not yet understand why list reversal in particular is such a poor fit). My recent experiments require me to find ways to improve efficiency, and I will post further tests later, so stay tuned.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
