Environment: Windows 7 64-bit; Python 2.7; IDE: PyCharm 2016.1
Function:
Batch-download all the pictures from all the posts listed on one page of a Baidu Tieba forum
How to use:
1. Install Python 2.7. The re, urllib and urllib2 modules used by the script are part of the standard library, so nothing else needs to be installed
2. Copy the following source code to save as tbimgidownloader.py file
3. Open a Tieba forum page (a page that lists posts) and copy its URL
4. Open tbimgidownloader.py, paste the URL between the single quotation marks in the urllib.urlopen('') call on line 37, and save the file
5. Double-click tbimgidownloader.py to run it
Notes:
1. The program downloads the pictures from about 50 posts
2. Each image is automatically named with the current time plus a sequence number
3. If the program does not run, please contact me
4. If your copy of the source code includes line numbers, remove them before running it (I've made this mistake before. -_-|||)
5. If you find this useful, don't forget to recommend it!
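To make note 2 concrete, here is a minimal sketch of the naming scheme: a timestamp, then a literal 0, then the sequence number. The helper name image_name and its parameters are hypothetical and exist only for this illustration. A fixed time is passed in so the result is reproducible, and the __future__ import is there only so the snippet also runs on Python 3.

```python
from __future__ import print_function  # lets the same snippet run on Python 2.7 and 3
import time


def image_name(index, now=None):
    """Build a file name as timestamp + '0' + sequence number + '.jpg'.

    image_name, index and now are hypothetical names used only in this sketch.
    """
    if now is None:
        now = time.localtime()  # default to the current local time
    timestr = time.strftime('%Y%m%d%H%M%S', now)
    return timestr + '0%d.jpg' % index


# A fixed time makes the example reproducible.
fixed = time.strptime('2016-05-01 12:30:45', '%Y-%m-%d %H:%M:%S')
print(image_name(3, fixed))  # 2016050112304503.jpg
```

Because the sequence number starts again at 1 for every post, two posts handled within the same second could in principle produce the same name. Adding the post ID to the name would be one way around that, if it ever turns out to be a problem.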
Source code:
#!/usr/bin/env python
# coding=utf-8

import re, time
import urllib2, urllib


def tiebaimgidownloader(url):
    """
    Tieba jpg-format picture downloader
    Parameter url: the URL of a single post
    Pictures are saved to the current directory after running
    """

    pattern = r'img class="BDE_Image".*?src="(.*?jpg)"'  # regular expression that captures image links
    fstr = urllib2.urlopen(url).read()  # read the post's page source into fstr
    urllist = re.findall(pattern, fstr)  # collect every jpg link that matches the regular expression
    urllist = list(set(urllist))  # remove duplicate image links

    print 'Found %d image links in total' % len(urllist), '\n'

    i = 1
    for furl in urllist:
        timestr = time.strftime('%Y%m%d%H%M%S')
        urllib.urlretrieve(furl, timestr + '0%d.jpg' % i)  # download each picture, named current time + sequence number
        print 'Saved picture', timestr + '0%d.jpg\n' % i
        i += 1

    print 'Picture download complete!\n\n\n'

    return True


def __main__():
    print '\n\t\t\tWelcome to the Tieba jpg picture downloader!\n'

    html = urllib.urlopen('').read()  # read the source of a forum page; paste the URL between the quotes
    """URL examples
    1. http://tieba.baidu.com/f?kw=%be%cf%e6%ba%b5t&fr=ala0&loc=rec  (Small Bow)
    2. http://tieba.baidu.com/f?kw=%e9%9e%a0%e5%a9%a7%e7%a5%8e&ie=utf-8&pn=200  (Small Bow)
    3. http://tieba.baidu.com/f?kw=%e5%a3%81%e7%ba%b8&ie=utf-8&tab=good  (Boutique Wallpaper)
    """
    pattern = r'a href="(/p/[0-9]+)"'  # regular expression that captures second-level (post) page paths
    urllist = re.findall(pattern, html)  # collect all second-level page paths into a list
    urllist = list(set(urllist))  # remove duplicate second-level page paths
    preurl = r'http://tieba.baidu.com'  # prefix that turns a path into a full URL
    print 'Found %d second-level pages\n' % len(urllist)

    for urlone in urllist:
        tiebaimgidownloader(preurl + urlone)  # download the images from each second-level page

    return 0


if __name__ == '__main__':
    __main__()
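If you want to see what the two regular expressions pick up without touching the network, here is a small offline sketch. The HTML fragments are made up for illustration and the live Tieba markup may differ. The patterns assume that post links look like /p/ followed by digits and that post images carry class="BDE_Image". All of the upper-case names are hypothetical.

```python
from __future__ import print_function  # lets the same snippet run on Python 2.7 and 3
import re

# Made-up fragments standing in for a forum page and a post page.
LIST_HTML = '''
<a href="/p/1234567890" title="first">first</a>
<a href="/p/1234567890" title="first again">duplicate link</a>
<a href="/p/987654321" title="second">second</a>
<a href="/f?kw=test">not a post</a>
'''
POST_HTML = '''
<img class="BDE_Image" width="560" src="http://example.com/a.jpg">
<img class="BDE_Image" src="http://example.com/b.jpg" height="300">
<img class="other" src="http://example.com/c.jpg">
'''

POST_LINK = r'a href="(/p/[0-9]+)"'                      # assumed shape of post links
IMAGE_LINK = r'img class="BDE_Image".*?src="(.*?jpg)"'   # assumed class of post images
PREFIX = 'http://tieba.baidu.com'

# Deduplicate with set(), then sort so the output order is stable.
post_urls = sorted(PREFIX + path for path in set(re.findall(POST_LINK, LIST_HTML)))
image_urls = sorted(set(re.findall(IMAGE_LINK, POST_HTML)))

print(post_urls)
print(image_urls)
```

The duplicate post link collapses to one entry, and the image with a different class is skipped. Sorting is only there to make the printed order predictable, since set() does not preserve order.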
Postscript: This article is my original work. If you repost it, please credit the source. Thank you for your cooperation.
Batch-downloading Tieba images with Python (source code attached)