Python Thread Pool Usage

Source: Internet
Author: User

Traditional multithreaded programs use a "create on demand, destroy when done" strategy. Creating a thread is much cheaper than creating a process, but if the tasks submitted to threads are short and arrive very frequently, the server ends up constantly creating and destroying threads.

The lifetime of a thread can be divided into 3 parts: the time to start the thread, the run time of the thread body, and the time to destroy the thread. In a multithreaded program, if threads cannot be reused, every task pays for all 3 phases: startup, execution, and destruction. This inevitably increases the system's response time and reduces efficiency.
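As a rough illustration of the one-thread-per-task pattern described above, the sketch below starts a fresh thread for every task and counts how many threads were created. The helper name `run_one_thread_per_task` and the squaring tasks are made up for this example; they are not part of the original article.

```python
import threading


def run_one_thread_per_task(tasks):
    """Run every task in its own short-lived thread.

    Returns (number of threads created, list of task results).
    Hypothetical helper, written only to illustrate the pattern.
    """
    results = []
    lock = threading.Lock()

    def body(task):
        value = task()          # phase 2: the thread body runs
        with lock:
            results.append(value)

    threads = []
    for task in tasks:
        t = threading.Thread(target=body, args=(task,))
        t.start()               # phase 1: a new thread is started
        threads.append(t)
    for t in threads:
        t.join()                # phase 3: the thread has finished and is never reused
    return len(threads), results


tasks = [lambda i=i: i * i for i in range(5)]
created, results = run_one_thread_per_task(tasks)
print(created, sorted(results))
```

Five tasks cost five thread startups and five teardowns here. With short tasks, that fixed cost can become a noticeable share of the total run time.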

Using a thread pool:
The threads are created in advance and kept in the pool. When a thread finishes its current task it is not destroyed but is scheduled to process the next one, so threads are not created over and over. This saves the overhead of thread creation and destruction, which improves performance and system stability.
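To make the reuse idea concrete, here is a minimal sketch of a pool built from `queue.Queue` and `threading.Thread`. The class `TinyPool` and the `record` task are hypothetical names chosen for this illustration; a real pool library does more (error reporting, result handling), so treat this as a sketch under those assumptions rather than a reference implementation.

```python
import queue
import threading


class TinyPool:
    """Hypothetical minimal pool: a fixed set of workers pulling tasks from a queue."""

    def __init__(self, size):
        self._tasks = queue.Queue()
        # The workers are created once, up front, and then reused for every task.
        self._workers = [threading.Thread(target=self._worker, daemon=True)
                         for _ in range(size)]
        for w in self._workers:
            w.start()

    def _worker(self):
        while True:
            item = self._tasks.get()
            if item is None:            # sentinel: shut this worker down
                self._tasks.task_done()
                break
            func, args = item
            try:
                # Simplification: a task that raises ends this worker.
                func(*args)
            finally:
                self._tasks.task_done()

    def submit(self, func, *args):
        self._tasks.put((func, args))

    def wait(self):
        """Block until every submitted task has been processed."""
        self._tasks.join()

    def close(self):
        for _ in self._workers:
            self._tasks.put(None)
        for w in self._workers:
            w.join()


seen_threads = set()
results = []
lock = threading.Lock()


def record(n):
    with lock:
        seen_threads.add(threading.get_ident())
        results.append(n * 2)


pool = TinyPool(3)
for n in range(20):
    pool.submit(record, n)
pool.wait()
pool.close()
print("tasks:", len(results), "threads used:", len(seen_threads))
```

Twenty tasks are handled by at most three threads, so the startup and teardown cost is paid three times instead of twenty.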

Let's try using a thread pool to implement a crawler.

The threadpool library needs to be installed before use:

pip install threadpool

#!/usr/bin/env python
# coding: utf-8
# @Time: 2018/4/19 16:06
# @Author: Chenjisheng
# @File: 17zwd_sample.py
# @Mail: [email protected]
import datetime
import threading

import requests
import threadpool
from bs4 import BeautifulSoup

baseurl = "http://hz.17zwd.com/sks.htm?cateid=0&page="


# Crawler function: fetch one listing page and read the image URLs from it
def getResponse(url):
    target = baseurl + url
    content = requests.get(target).text
    soup = BeautifulSoup(content, 'lxml')
    tags = soup.find_all('div', attrs={"class": "huohao-img-container"})
    for tag in tags:
        imgurl = tag.find('img').get('data-original')
        # print(imgurl)


if __name__ == "__main__":
    # Define a thread pool with 10 threads
    starttime = datetime.datetime.now()
    pool = threadpool.ThreadPool(10)
    # Define the tasks for the thread pool
    tasks = threadpool.makeRequests(getResponse, [str(x) for x in range(1, 11)])
    # Submit the tasks to the thread pool and wait for them to finish
    for task in tasks:
        pool.putRequest(task)
    pool.wait()
    endtime = datetime.datetime.now()
    alltime = (endtime - starttime).seconds
    print("Total thread pool time is: {} seconds".format(alltime))

    # Traditional threading: one new thread per page
    starttime1 = datetime.datetime.now()
    # Pass the function and its argument separately. The original passed
    # target=getResponse(str(x)), which calls the function immediately in the
    # main thread, so the requests ran one after another.
    tasklist = [threading.Thread(target=getResponse, args=(str(x),))
                for x in range(1, 11)]
    for i in tasklist:
        i.start()
    for i in tasklist:
        i.join()
    endtime1 = datetime.datetime.now()
    alltime1 = (endtime1 - starttime1).seconds
    print("Total traditional threading time is: {} seconds".format(alltime1))
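The same pattern can also be written with the standard library's `concurrent.futures.ThreadPoolExecutor`, which needs no extra install. The sketch below is an assumption-laden variant, not the author's code: `fetch_page` is a hypothetical stand-in for `getResponse` that sleeps instead of making an HTTP request, so the example runs without network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_page(page):
    """Hypothetical stand-in for getResponse: sleeps instead of fetching a URL."""
    time.sleep(0.05)
    return "page {} done".format(page)


def crawl(pages, workers=10):
    # The executor creates up to `workers` threads and reuses them for all pages.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # map() yields results in the same order as the inputs.
        return list(executor.map(fetch_page, pages))


pages = [str(x) for x in range(1, 11)]
start = time.perf_counter()
results = crawl(pages)
elapsed = time.perf_counter() - start
print("Fetched {} pages in {:.2f} seconds".format(len(results), elapsed))
```

To use it for the real crawler, `fetch_page` would be replaced by a function that requests and parses the page. The `with` block waits for all submitted work to finish, which plays the role of `pool.wait()`.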

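Final result: the thread pool took 3 seconds and the traditional threads took 9 seconds.

The difference is quite large. Note that part of this gap comes from how the traditional version was timed: with `target=getResponse(str(x))`, each request is made in the main thread while the thread list is being built, so those ten requests run one after another rather than concurrently.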
