A while ago I worked on a project that was not difficult in itself, but its data came from a third-party data interface provider. That means serving a page takes at least two HTTP requests: first the front end's request to our backend, then the backend's request to the data interface. Sometimes the backend has to call several interfaces for a single front-end request, and doing those calls serially in the traditional way gives a poor user experience and wastes computing resources. Having recently studied processes and threads, I spent some time testing and comparing several feasible approaches. At first I tested with real network IO, but the results were heavily affected by the network environment, so to be fair I replaced the network IO with sleep(0.02). The approaches and test results follow.
Basic scenario: Serial execution
```python
import time

def wget(flag):
    time.sleep(0.02)  # simulate network IO
    print(flag)

count = 100  # make 100 requests
start = time.time()  # start time
for i in range(count):
    wget(i)
end = time.time()  # end time
cost = end - start  # elapsed time
print('cost:' + str(cost))
```
It takes a little over 2s (100 × 0.02s). There is no doubt that this is the least efficient approach; it serves only as a baseline.
Improved Scenario: Multithreading
Timing the multithreaded version is less obvious than the serial one: if each thread's join() is called right after its start(), the main thread blocks on every thread in turn and the test degenerates into serial execution; if join() is never called at all, the timing finishes before the threads do and the result is meaningless. So I used a crude method: each thread prints the current timestamp when it finishes, and the last timestamp on the console minus the timestamp at which the threads began executing is the elapsed time.
```python
import time
import threading

mutex = threading.Lock()  # lock so the print output is not interleaved

def wget():
    time.sleep(0.02)  # simulate network IO
    mutex.acquire()  # acquire the lock
    print('endtime:' + str(time.time()))  # finish time of this thread
    mutex.release()  # release the lock

count = 100  # make 100 requests
start = time.time()  # start time
print('starttime:' + str(start))
for i in range(count):
    t = threading.Thread(target=wget)
    t.start()
```
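A cleaner way to time this, avoiding the timestamp trick, is to start all the threads first and only then join them: the joins then overlap with the threads' work instead of serializing it. A minimal sketch of that approach:

```python
import time
import threading

def wget():
    time.sleep(0.02)  # simulate network IO

count = 100  # make 100 requests
threads = [threading.Thread(target=wget) for _ in range(count)]

start = time.time()
for t in threads:
    t.start()  # start every thread first...
for t in threads:
    t.join()   # ...then wait; the joins overlap with the running threads
cost = time.time() - start
print('cost:' + str(cost))  # roughly 0.02s plus thread-creation overhead
```

Joining inside the start loop would block on each thread in turn and reproduce serial execution, which is exactly the pitfall described above.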
The final result is about 0.08s, obviously much better than serial execution. But the program's efficiency does not grow linearly with the number of threads, because in extreme cases the overhead of creating and switching threads eats up the gains. This approach should be used with caution when the number of threads is not bounded.
Improvement Scenario: Thread pool
To address the shortcoming of unbounded thread creation described above, a thread pool is used here.
```python
import time
import threading
from concurrent.futures import ThreadPoolExecutor, wait, ALL_COMPLETED

mutex = threading.Lock()  # lock so the print output is not interleaved

def wget():
    time.sleep(0.02)  # simulate network IO
    mutex.acquire()  # acquire the lock
    print(threading.current_thread())
    mutex.release()  # release the lock

size = 40    # thread pool size
count = 100  # make 100 requests
start = time.time()  # start time
pool = ThreadPoolExecutor(max_workers=size)  # thread pool object
tasks = [pool.submit(wget) for i in range(count)]
wait(tasks, return_when=ALL_COMPLETED)  # wait for all tasks to finish
end = time.time()  # end time
print('spend:' + str(end - start))
```
Testing showed that the elapsed time was minimal, about 0.08s, when the pool size was set to 40, on par with the previous approach. But as the number of tasks keeps growing, the thread pool's stability shows its value, since the number of worker threads stays bounded.
Improved scenario: Coroutines (asyncio)
```python
import asyncio
import time

async def wget(flag):  # the async keyword marks this as a coroutine
    await asyncio.sleep(0.02)  # simulate network IO; await is like yield from
    print(flag)

count = 100  # make 100 requests
start = time.time()  # start time
loop = asyncio.get_event_loop()  # event loop object
tasks = [wget(i) for i in range(count)]
loop.run_until_complete(asyncio.wait(tasks))  # wait for all coroutines
loop.close()
end = time.time()  # end time
print('spend:' + str(end - start))
```
The result of this program was quite a shock to me: without any multithreading at all, everything runs on a single thread, and it takes 0.06s, the least of all the approaches. The more exciting advantage is that the cost of creating a coroutine is completely negligible compared to a thread, which means far more concurrent tasks can be handled with coroutines than with threads.
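On Python 3.7 and later, the same test is usually written with `asyncio.run` and `asyncio.gather` instead of managing the event loop by hand. A sketch of the equivalent code:

```python
import asyncio
import time

async def wget(flag):
    await asyncio.sleep(0.02)  # simulate network IO
    return flag

async def main(count):
    # gather schedules all coroutines concurrently and preserves order
    return await asyncio.gather(*(wget(i) for i in range(count)))

start = time.time()
results = asyncio.run(main(100))  # creates and closes the loop for us
spend = time.time() - start
print('spend:' + str(spend))
```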
Summary
Clearly, coroutines come out ahead in these tests. If I find the time in the future, I will also test multiprocessing and a combination of multiprocessing with coroutines.
Playing with Python (7): Comparing Python multiprocessing and multithreading