[Reprint] Why is it recommended to use multiple processes instead of multithreading in Python?

Source: Internet
Author: User

Recently, while studying multithreading in Python, I kept hearing veterans say: "Multithreading in Python is of little use; use multiple processes instead!" But why do they say so?

It is not enough to know the rule; we should also know the reason behind it. Hence the following closer look.

First, some background:
1. What is the GIL?
GIL stands for global interpreter lock. It dates back to Python's original design, where it was adopted to keep the interpreter's data safe.
2. Each CPU core can execute only one thread at any given moment. (On a single-core CPU, multithreading gives only concurrency, not parallelism. Both terms describe handling several requests "simultaneously" at the macro level, but parallelism means that two or more events happen at the same instant, while concurrency means that two or more events happen within the same time interval.)

In Python multithreading, each thread executes as follows:
1. Acquire the GIL.
2. Execute code until the thread sleeps or the Python virtual machine suspends it.
3. Release the GIL.

So a thread that wants to run must first acquire the GIL. We can think of the GIL as a "pass", and each Python process has only one. A thread without the pass is not allowed onto the CPU.
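The "pass" idea can be sketched with an ordinary lock. This is only a toy model and not how CPython implements the GIL: `pass_lock`, `worker`, and the counters are hypothetical names, and a `threading.Lock` merely stands in for the single pass that a thread must hold before it runs.

```python
import threading

pass_lock = threading.Lock()  # stands in for the one "pass" (toy model, not the real GIL)
holders = 0                   # threads holding the pass right now
max_holders = 0               # highest number of simultaneous holders observed
executed = []                 # (thread name, slice index) in execution order


def worker(name, slices):
    global holders, max_holders
    for index in range(slices):
        with pass_lock:                       # 1. acquire the pass
            holders += 1
            max_holders = max(max_holders, holders)
            executed.append((name, index))    # 2. run one slice of work
            holders -= 1
        # 3. the pass is released here, so another thread may take it


threads = [threading.Thread(target=worker, args=(f"T{i}", 3)) for i in range(3)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"slices executed: {len(executed)}, max simultaneous holders: {max_holders}")
```

However the threads interleave, only one of them is ever inside the `with` block, which mirrors the point that only the holder of the pass executes.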

In Python 2.x, the current thread releases the GIL when it hits an IO operation or when the ticks count reaches its threshold. (ticks can be seen as a counter that Python maintains specifically for the GIL. It is reset to zero after each release, and the threshold can be adjusted with sys.setcheckinterval.)

Each time the GIL is released, the threads compete for the lock and a thread switch may occur, which consumes resources. And because of the GIL, a Python process can execute only one thread at any given moment (the thread holding the GIL), which is why Python multithreading is inefficient on multicore CPUs.
So is Python's multithreading completely useless?
We consider two cases:
1. CPU-intensive code (loops, counting, and so on). Because there is a lot of computation, the ticks count reaches the threshold quickly, which triggers a release of the GIL and a new round of competition, and switching back and forth between threads consumes resources. So multithreading in Python is unfriendly to CPU-intensive code.

2. IO-intensive code (file processing, web crawlers, and so on). Here multithreading effectively improves efficiency. A single thread doing IO spends time waiting, and that time is wasted. With multiple threads, when thread A waits, execution switches automatically to thread B, so CPU time is not wasted and the program runs more efficiently. So multithreading in Python is friendlier to IO-intensive code.
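The two cases above can be compared with a small timing sketch. All names here (`cpu_task`, `io_task`, `run_sequential`, `run_threaded`) are hypothetical, and `time.sleep` is only a stand-in for real IO such as a network request. Actual timings depend on the machine and the Python build, so none are claimed here.

```python
import threading
import time


def cpu_task(n):
    """CPU-bound: a pure-Python loop that needs the GIL the whole time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def io_task(delay):
    """Stand-in for IO: time.sleep releases the GIL while it waits."""
    time.sleep(delay)
    return delay


def run_sequential(task, arg, repeats):
    """Run the task `repeats` times in the calling thread."""
    start = time.perf_counter()
    results = [task(arg) for _ in range(repeats)]
    return results, time.perf_counter() - start


def run_threaded(task, arg, repeats):
    """Run the task once in each of `repeats` threads."""
    results = [None] * repeats

    def wrapper(slot):
        results[slot] = task(arg)

    threads = [threading.Thread(target=wrapper, args=(i,)) for i in range(repeats)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    cases = [("CPU-bound", cpu_task, 1_000_000), ("IO-bound", io_task, 0.2)]
    for label, task, arg in cases:
        _, sequential = run_sequential(task, arg, 4)
        _, threaded = run_threaded(task, arg, 4)
        print(f"{label}: sequential {sequential:.2f}s, threaded {threaded:.2f}s")
```

On a standard CPython build, the threaded IO-bound run should finish in roughly the time of a single wait, while the threaded CPU-bound run should show little or no gain over the sequential one.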

In Python 3.x, the GIL no longer uses the ticks count. It uses a timer instead: the current thread releases the GIL once its execution time reaches a threshold. This is friendlier to CPU-intensive programs, but it still does not solve the problem that the GIL lets only one thread execute at a time, so efficiency is still unsatisfactory.
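In CPython 3, the timer threshold described above is exposed as the "switch interval". A minimal sketch of reading and adjusting it follows; the value 0.001 is an arbitrary choice for illustration and not a tuning recommendation.

```python
import sys

# Read the current threshold, in seconds, after which the running
# thread is asked to give up the GIL.
default_interval = sys.getswitchinterval()
print(f"default switch interval: {default_interval} s")

# Ask for more frequent switches (arbitrary value, for illustration only).
sys.setswitchinterval(0.001)
shortened_interval = sys.getswitchinterval()
print(f"shortened switch interval: {shortened_interval} s")

# Restore the original setting so the rest of the program is unaffected.
sys.setswitchinterval(default_interval)
```

A shorter interval makes threads take turns more often, but it does not let two threads run Python code at the same time.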
Please note: multithreading on multiple cores can be worse than on a single core. On a single core, each time the GIL is released, the thread that is woken up can acquire it, so execution continues seamlessly. On multiple cores, after the thread on CPU0 releases the GIL, the threads on the other CPUs compete for it, but the thread on CPU0 may reacquire it immediately. The threads woken on the other CPUs then wait until the next switch and go back to waiting to be scheduled. This thrashing lowers efficiency further.
Back to the opening question: we often hear veterans say, "To take full advantage of multicore CPUs in Python, use multiple processes." What is the reason?

The reason is that each process has its own GIL, independent of the others, so processes can run truly in parallel. In Python, therefore, multi-process execution is more efficient than multithreading (on multicore CPUs only).
So here is the conclusion: on multicore machines, the more common way to gain efficiency through parallelism is to use multiple processes, which effectively improves execution efficiency.
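A minimal sketch of the multi-process approach uses the standard `multiprocessing` module. The worker `count_primes`, the helper `run_with_processes`, and the chosen limits are hypothetical, picked only to give each process some CPU-bound work. The actual speedup depends on the number of cores and on the cost of starting processes and passing data between them.

```python
import multiprocessing
import time


def count_primes(limit):
    """CPU-bound worker: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_with_processes(limits, workers):
    """Spread the work over a pool of processes, each with its own interpreter and GIL."""
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard keeps child processes from re-running this section
    # on platforms that start workers by importing the main module.
    limits = [50_000] * 4

    start = time.perf_counter()
    sequential = [count_primes(limit) for limit in limits]
    sequential_time = time.perf_counter() - start

    start = time.perf_counter()
    parallel = run_with_processes(limits, workers=4)
    parallel_time = time.perf_counter() - start

    print(f"results match: {sequential == parallel}")
    print(f"sequential {sequential_time:.2f}s, processes {parallel_time:.2f}s")
```

Because the workers are separate processes, arguments and results are pickled and sent between them, so this pays off mainly when each task does enough computation to outweigh that overhead.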

Reprint Address: http://bbs.51cto.com/thread-1349105-1.html
