import requests
from lxml import etree
from multiprocessing import Pool  # imported in the original; the loop below runs sequentially

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'
}


def getlinks(url):
    # Fetch one search-result page and follow the link of every job card on it.
    wb_data = requests.get(url, headers=headers)
    soup = etree.HTML(wb_data.text)
    links = soup.xpath('//div[@class="job-info"]')
    for link in links:
        hrefs = link.xpath('h3/a/@href')
        if hrefs:  # skip cards that carry no link
            getinfo(hrefs[0])


def getinfo(url):
    # Fetch one job page and print the text of the third span in its title block.
    try:
        wb_data = requests.get(url, headers=headers)
        soup = etree.HTML(wb_data.text)
        requires = soup.xpath('//div[@class="job-title-left"]')
        for require in requires:
            a = require.xpath('div[1]/span[3]/text()')
            print(a)
    except requests.exceptions.MissingSchema:
        # The href was relative (no scheme), so prepend the domain and retry.
        # Catching only this exception keeps other failures from recursing forever.
        href = 'https://www.liepin.com' + url
        getinfo(href)


if __name__ == '__main__':
    # Search URL as given in the source; only the final page parameter varies.
    base_url = (
        'https://www.liepin.com/zhaopin/?pubtime=&ckid=e49b6fdfb6a698ed'
        '&fromsearchbtn=2&compkind=&isanalysis=&init=-1&searchtype=1&dqs='
        '&industrytype=&jobkind=&sortflag=15&degradeflag=0&industries='
        '&salary=&compscale=&clean_condition=&key=python%e7%88%ac%e8%99%ab'
        '&headckid=e49b6fdfb6a698ed&d_pagesize=40'
        '&sitag=tf7woi2y6f2s2rnhw3wfuw~fa9rxquzc5ikjpxc-ycixw'
        '&d_headid=ca30a54749a469a7967ac218f6204031'
        '&d_ckid=ca30a54749a469a7967ac218f6204031'
        '&d_sfrom=search_prime&d_curpage=2&curpage={0}'
    )
    urls = [base_url.format(str(i)) for i in range(0, 4)]
    for url in urls:
        getlinks(url)
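First refinement: the XPath query on Liepin sometimes returns incomplete (relative) href values, so the script prepends the site's domain to them before requesting the job page.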
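As an alternative to retrying after a failed request, the incomplete hrefs could be completed before any request is made. The following is a minimal sketch, assuming the relative links should resolve against `https://www.liepin.com`. The helper name `normalize_href` and the sample paths are hypothetical and do not come from the original script.

```python
from urllib.parse import urljoin

# Site root that the original script prepends to relative links.
BASE_URL = 'https://www.liepin.com'


def normalize_href(href, base=BASE_URL):
    """Return an absolute URL for an href that may be relative or absolute.

    urljoin leaves absolute URLs unchanged and resolves relative ones
    against the base.
    """
    return urljoin(base, href.strip())


if __name__ == '__main__':
    # Hypothetical href values, for illustration only.
    for sample in ['/job/123.shtml', 'https://www.liepin.com/job/456.shtml']:
        print(normalize_href(sample))
```

Calling `getinfo(normalize_href(href))` inside `getlinks` would then make the except branch unnecessary for relative links, although keeping it does no harm.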