Python multi-process vs. multi-thread: which is faster? (detailed description)

Source: Internet
Author: User

Environment: Python 3.6, the threading and multiprocessing modules, a quad-core CPU, and a Samsung 850 250 GB SSD.

Ever since I started writing multi-process and multi-thread programs, I have never been sure which is faster. Many people on the Internet say that Python multi-process is faster because of the GIL (global interpreter lock). But when I wrote and timed my own code, multithreading came out faster, so what is going on? I have recently been doing word segmentation work, and the original code was too slow, so I wanted to explore effective ways to speed it up (the code is included later in this article).

Here is a chart of program results showing which is faster, threads or processes.

Some definitions

Parallelism means that two or more events occur at the same instant. Concurrency means that two or more events occur within the same time interval.

A thread is the smallest unit of execution that the operating system can schedule. It is contained in a process and is the unit that actually does the work of the process. A running instance of a program is a process.
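To make the relationship between threads and processes concrete, here is a minimal sketch. The helper names `record_pid` and `pids_seen_by_threads` are made up for illustration. Each thread writes the ID of the process it runs in into a list that all threads share.

```python
import os
import threading


def record_pid(results, index):
    # Every thread stores the ID of the process it is running inside.
    results[index] = os.getpid()


def pids_seen_by_threads(count=4):
    # One list shared by all threads: threads share their process's memory.
    results = [None] * count
    threads = [
        threading.Thread(target=record_pid, args=(results, i))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("main process ID:", os.getpid())
    print("IDs seen by threads:", pids_seen_by_threads())
```

All the threads should report the same process ID as the main program, and their writes land in the same list, because they live inside one process.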

Implementation Process

In Python, each thread must acquire the GIL, execute code, and then release the GIL. Because only one thread can hold the GIL at a time, the other threads cannot obtain it while that thread is running. Python multithreading is therefore really concurrency: multiple events occur within the same time interval, not at the same instant.

Each process, however, has its own GIL, so processes can run in parallel. On a multi-core CPU, multiple processes can therefore, in theory, make better use of resources.

Practical problems

Python multithreading often appears in online tutorials, for example web crawler tutorials and port scanning tutorials.

Take port scanning as an example. The script below uses multiple threads. If you implement the same scan with multiple processes and compare, you will find that the multithreaded version is faster. Isn't that the opposite of our analysis?

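    import sys
    import threading
    from socket import socket, AF_INET, SOCK_STREAM

    # Scan the host given on the command line, or localhost by default.
    host = "127.0.0.1" if len(sys.argv) == 1 else sys.argv[1]
    portList = [i for i in range(1, 1000)]
    scanList = []
    lock = threading.Lock()  # keeps output lines from interleaving
    print('Please wait... From', host)

    def scanPort(port):
        # Create the socket outside the try block so that tcp always
        # exists when the finally clause runs.
        tcp = socket(AF_INET, SOCK_STREAM)
        try:
            tcp.connect((host, port))
        except OSError:
            pass  # closed or unreachable port
        else:
            with lock:
                print('[+]port', port, 'open')
        finally:
            tcp.close()

    # One thread per port: create them all, start them all, then wait.
    for p in portList:
        t = threading.Thread(target=scanPort, args=(p,))
        scanList.append(t)
    for t in scanList:
        t.start()
    for t in scanList:
        t.join()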
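The scan above starts one thread per port. As a variation that is not from the original article, the same idea can be sketched with a bounded thread pool from `concurrent.futures`. The function names `is_open` and `scan`, the 0.5 second timeout, and the limit of 100 workers are all assumptions chosen for illustration.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def is_open(host, port, timeout=0.5):
    # connect_ex returns 0 on success and an error code otherwise,
    # so a closed port needs no exception handling.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp:
        tcp.settimeout(timeout)
        return tcp.connect_ex((host, port)) == 0


def scan(host, ports, max_workers=100):
    ports = list(ports)
    # The pool caps how many threads are alive at once; map keeps
    # results in the same order as the input ports.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        flags = pool.map(lambda p: is_open(host, p), ports)
        return [p for p, ok in zip(ports, flags) if ok]


if __name__ == "__main__":
    print("open ports:", scan("127.0.0.1", range(1, 1000)))
```

Each worker spends almost all of its time waiting on the network, which is the kind of work where threads do well despite the GIL.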

Which is faster?

Because of the GIL, threads spend resources competing for the lock and switching between threads. So let's make a bold guess:

For CPU-intensive tasks, multiple processes are faster or give better results. For IO-intensive tasks, multiple threads can effectively improve efficiency.

Let's take a look at the following code:

    import time
    import threading
    import multiprocessing

    max_process = 4
    max_thread = max_process

    def fun(n, n2):
        # cpu-intensive
        for i in range(0, n):
            for j in range(0, int(n * n2)):
                t = i * j

    def thread_main(n2):
        thread_list = []
        for i in range(0, max_thread):
            t = threading.Thread(target=fun, args=(50, n2))
            thread_list.append(t)
        start = time.time()
        print('[+] much thread start')
        for i in thread_list:
            i.start()
        for i in thread_list:
            i.join()
        print('[-] much thread use', time.time() - start, 's')

    def process_main(n2):
        p = multiprocessing.Pool(max_process)
        for i in range(0, max_process):
            p.apply_async(func=fun, args=(50, n2))
        # Note: the timer starts after the pool has been created and
        # the tasks have been submitted.
        start = time.time()
        print('[+] much process start')
        p.close()  # close the process pool
        p.join()   # wait for all sub-processes to finish
        print('[-] much process use', time.time() - start, 's')

    if __name__ == '__main__':
        print("[++] When n = 50, n2 = 0.1:")
        thread_main(0.1)
        process_main(0.1)
        print("[++] When n = 50, n2 = 1:")
        thread_main(1)
        process_main(1)
        print("[++] When n = 50, n2 = 10:")
        thread_main(10)
        process_main(10)

The result is as follows:

As you can see, as the CPU load increases (more loop iterations), the gap grows wider and wider. This confirms our conjecture.
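The test above exercises only the CPU-intensive half of the conjecture. For the IO-intensive half, here is a minimal sketch that is not from the original article. It assumes that `time.sleep` is a fair stand-in for a blocking network or disk wait, and the names `fake_io`, `run_sequential`, and `run_threaded` are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    # Assumption: time.sleep stands in for a blocking IO wait.
    # A sleeping thread does not hold the GIL.
    time.sleep(delay)
    return delay


def run_sequential(delays):
    # Perform the waits one after another and return the elapsed time.
    start = time.perf_counter()
    for d in delays:
        fake_io(d)
    return time.perf_counter() - start


def run_threaded(delays):
    # Perform the waits in parallel threads and return the elapsed time.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        list(pool.map(fake_io, delays))
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.5] * 4
    print("sequential:", run_sequential(delays), "s")
    print("threaded:  ", run_threaded(delays), "s")
```

The sequential run should take roughly the sum of the delays, while the threaded run should take roughly the longest single delay, because the waits overlap.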

CPU and IO-intensive

1. CPU-intensive code (loops, counting, and other computation)

2. IO-intensive code (file processing, web crawlers, etc.)

How to tell which kind you have:

1. Check the CPU usage and the hard disk I/O read/write speed.

2. Mostly computing -> CPU-intensive; mostly waiting (such as a web crawler) -> IO-intensive

3. Search online (for example, on Baidu)
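The first two checks can also be approximated from inside Python by comparing the CPU time a call uses with the wall-clock time it takes. This is a rough sketch, not a method from the original article. The helper `cpu_fraction` and the two sample workloads `busy_loop` and `waiting` are hypothetical names.

```python
import time


def cpu_fraction(func, *args):
    """Return CPU time divided by wall-clock time for one call of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time used by this process
    func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy_loop(n):
    # Sample CPU-intensive workload: pure computation.
    total = 0
    for i in range(n):
        total += i * i
    return total


def waiting(seconds):
    # Sample IO-like workload: almost all the time is spent waiting.
    time.sleep(seconds)


if __name__ == "__main__":
    print("busy_loop:", cpu_fraction(busy_loop, 3_000_000))
    print("waiting:  ", cpu_fraction(waiting, 0.3))
```

A ratio close to 1 suggests the task is CPU-intensive, and a ratio close to 0 suggests it spends most of its time waiting. Note that `time.process_time` counts CPU time for the whole process, so the ratio is only meaningful when nothing else is running in other threads.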
