# Multi-threaded crawler: use of the map function
# from multiprocessing.dummy import Pool
# pool = Pool(4)
# results = pool.map(crawl_function, url_list)

# Example demo:
from multiprocessing.dummy import Pool as ThreadPool
import requests
import time

def getsource(url):
    html = requests.get(url)

urls = []
for i in range(1, 21):
    newpage = 'http://tieba.baidu.com/p/3522395718?pn=' + str(i)
    urls.append(newpage)

# Single-threaded baseline
time1 = time.time()
for i in urls:
    print(i)
    getsource(i)
time2 = time.time()
print('single-thread time consumed: ' + str(time2 - time1))

# Enable multithreading
pool = ThreadPool(4)
time3 = time.time()
results = pool.map(getsource, urls)
pool.close()
pool.join()
time4 = time.time()
print('parallel time consumed: ' + str(time4 - time3))

# Output result:
# single-thread time consumed: 20.18715476989746
# parallel time consumed: 5.100291728973389
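Because the demo above depends on the network (and on a live Baidu Tieba thread), here is a self-contained sketch of the same pattern that can be run offline. The `fetch` function and the URL list are placeholders: `time.sleep` stands in for the blocking `requests.get` call, which is what lets the thread pool overlap the waits.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed Pool

def fetch(url):
    # Placeholder for an I/O-bound download; a real crawler
    # would call requests.get(url) here instead of sleeping.
    time.sleep(0.2)
    return url.upper()

# Hypothetical URLs, analogous to the paginated tieba links above
urls = ['http://example.com/page%d' % i for i in range(8)]

# Single-threaded baseline: 8 fetches run back to back (~1.6 s)
t0 = time.time()
serial = [fetch(u) for u in urls]
serial_time = time.time() - t0

# Four worker threads share the fetch function via map();
# waits overlap, so 8 fetches take ~2 rounds of 0.2 s
pool = ThreadPool(4)
t0 = time.time()
parallel = pool.map(fetch, urls)
pool.close()
pool.join()
parallel_time = time.time() - t0

print('serial: %.2fs, parallel: %.2fs' % (serial_time, parallel_time))
```

`pool.map` keeps results in the same order as the input list, so `parallel` matches the serial run; only the wall-clock time differs.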