Python multi-process fast or multithreaded fast?

Source: Internet
Author: User
The following small series for everyone to bring a Python multi-process and multi-threaded who is faster (detailed). Small series feel very good, now share to everyone, also for everyone to make a reference. Let's take a look at it with a little knitting.

python3.6

Threading and multiprocessing

Quad core + Samsung 250G-850-SSD

Since programming with multi-process and multi-threading, it is not clear who is faster. Many on the web say that Python is more process faster because of the Gil (Global interpreter Lock). But when I was writing the code, the test time was multithreaded faster, so what was going on? Recently do Word segmentation work, the original code speed is too slow, want to speed up, so to explore the effective method (text at the end of the code and)

Here comes a diagram of the program's results, which shows the threads and processes who are faster

Some definitions

Parallelism means that two or more events occur at the same time. Concurrency refers to two or more events that occur at the same time interval

A thread is the smallest unit that the operating system can perform operations on. It is included in the process and is the actual operating unit of the process. The execution instance of a program is a process.

Implementation process

And the multithreading inside the python obviously got to get Gil, execute code, and finally release Gil. So because Gil, multi-threaded time does not get, in fact, it is a concurrent implementation, that is, multiple events, within the same time interval occurs.

But the process has an independent Gil, so it can be implemented in parallel. Therefore, for multi-core CPUs, it is theoretically more efficient to use multi-process resources.

Real problems

Python's multi-threaded figure is often seen in tutorials on the web. such as the web crawler tutorial, port scanning tutorial.

Here with the port scan, you can implement the following script with multiple processes, you will find Python multi-process faster. So, isn't that a contradiction to our analysis?

Import sys,threadingfrom Socket Import *host = "127.0.0.1" If Len (sys.argv) ==1 Else sys.argv[1]portlist = [I for I in rang E (1,1000)]scanlist = []lock = Threading. Lock () print (' Please waiting ... From ', host) def Scanport (port):  try:    TCP = socket (af_inet,sock_stream)    tcp.connect ((host,port)  ) Except:    pass  else:    if Lock.acquire ():      print (' [+]port ', Port, ' open ')      lock.release ()  Finally:    tcp.close () for P in portlist:  t = Threading. Thread (target=scanport,args= (P,))  Scanlist.append (t) for I in Range (len (portlist)):  Scanlist[i].start () For I in range (len (portlist)):  Scanlist[i].join ()

Who's faster?

Because of the problem with Python locks, threads compete for locks, switch threads, and consume resources. So, take a bold guess:

In CPU-intensive tasks, multiple processes are faster or better, while IO-intensive, multi-threading can effectively improve efficiency.

Let's take a look at the following code:

Import timeimport Threadingimport multiprocessingmax_process = 4max_thread = Max_processdef Fun (n,n2): #cpu密集型 for I in Range (0,n): for J in range (0, (int.) (N*N*N*N2)): T = i*jdef Thread_main (n2): Thread_list = [] for I in range (0,max _thread): T = Threading. Thread (target=fun,args= (50,N2)) Thread_list.append (t) start = Time.time () print (' [+] much thread start ') for I in t Hread_list:i.start () for I in Thread_list:i.join () print (' [-] much thread use ', Time.time ()-start, ' s ') def Proce Ss_main (n2): p = multiprocessing. Pool (max_process) for I in Range (0,max_process): P.apply_async (func = fun,args= (50,n2)) start = Time.time () print (' [+] Much process start ') p.close () #关闭进程池 p.join () #等待所有子进程完毕 print (' [-] much process use ', time.time ()-start, ' s ') if Nam e== ' main ': Print ("[++]when n=50,n2=0.1:") Thread_main (0.1) Process_main (0.1) print ("[++]when n=50,n2=1:") thread_main (1) Process_main (1) print ("[++]when n=50,n2=10:") Thread_main (10) Process_main) 

The results are as follows:

As you can see, the gap is getting bigger when CPU usage is getting higher (the more code loops). Verify our Guess

CPU and IO-intensive

1, CPU-intensive code (various cycle processing, counting, etc.)

2, IO-intensive code (file processing, web crawler, etc.)

Judging method:

1, directly see CPU utilization, hard disk IO read and write speed

2, calculate more->cpu; time waiting for more (such as web crawler)->io

3, please self-Baidu

"Recommended"

1. Multi-process and multithreaded instances in Python (i)

2. Is it recommended to use multiple processes instead of multithreading in Python? Reasons to share recommended multi-process

3. Python multi-process and multithreaded instances (ii) Programming methods

4. Detailed description of Python processes, threads, and schedules

5. Python thread pool/process pool for concurrent programming

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.