Still tinkering with crawlers. Today I'm posting a script that crawls the pictures under the "Beauty" (美女) tag on Diandian (diandian.com) and downloads the original-size images.
# -*- coding: utf-8 -*-
#---------------------------------------
#   Program:     Diandian "Beauty" picture crawler
#   Version:     0.2
#   Author:      Zippera
#   Date:        2013-07-26
#   Language:    Python 2.7
#   Description: the number of pages to download can be set
#---------------------------------------

import urllib2
import urllib
import re

# Captures the original image URL from each "feed-big-img" block.
pat = re.compile('<div class="feed-big-img">\n.*?imgsrc="(ht.*?)\".*?')
nexturl1 = "http://www.diandian.com/tag/%E7%BE%8E%E5%A5%B3?page="
count = 1

# Raise the upper bound to download more pages (count < 2 fetches page 1 only).
while count < 2:
    print "Page " + str(count) + "\n"
    myurl = nexturl1 + str(count)
    myres = urllib2.urlopen(myurl)
    mypage = myres.read()
    ucpage = mypage.decode("utf-8")  # decode the page bytes to unicode

    mat = pat.findall(ucpage)
    if len(mat):
        cnt = 1
        for item in mat:
            print "Page " + str(count) + " No." + str(cnt) + " url: " + item + "\n"
            cnt += 1
            # The file name is the last 10 word characters plus the extension.
            fnp = re.compile('(\w{10}\.\w+)$')
            fnr = fnp.findall(item)
            if fnr:
                fname = fnr[0]
                urllib.urlretrieve(item, fname)
    else:
        print "No data"
    count += 1
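To see what the two regular expressions in the script actually pick out, here is a minimal offline sketch. It assumes Python 3 (the script itself is Python 2.7), and the HTML snippet, the example.com URL, and the helper name extract_images are all made up for illustration; the real Diandian markup may differ.

```python
import re

# Same two patterns as in the script above.
IMG_PAT = re.compile(r'<div class="feed-big-img">\n.*?imgsrc="(ht.*?)".*?')
NAME_PAT = re.compile(r'(\w{10}\.\w+)$')

# Hypothetical markup, invented for this example; the real page may differ.
SAMPLE_HTML = (
    '<div class="feed-big-img">\n'
    '<a href="#" imgsrc="http://example.com/img/abcde12345.jpg"></a>\n'
    '</div>\n'
)


def extract_images(html):
    """Return (url, filename) pairs found in html.

    filename is None when the URL does not end in ten word
    characters followed by a dot and an extension.
    """
    results = []
    for url in IMG_PAT.findall(html):
        match = NAME_PAT.search(url)
        results.append((url, match.group(1) if match else None))
    return results


if __name__ == "__main__":
    for url, fname in extract_images(SAMPLE_HTML):
        print(url, "->", fname)
```

Note that a URL whose file name has fewer than ten word characters before the extension gets no file name here, which is the case the script silently skips.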
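How to use: create a new folder, save the code in it as name.py, and run python name.py from inside that folder. The pictures are downloaded into the same folder, because the script saves each file under its bare file name in the current working directory.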
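If you would rather not depend on the current working directory, one possible variation is to build the destination path explicitly. The sketch below is not part of the original script: the helper target_path and the folder name beauty_pics are hypothetical, and the URL is a placeholder.

```python
import os
import re
import tempfile

# Same file-name pattern as in the script.
NAME_PAT = re.compile(r'(\w{10}\.\w+)$')


def target_path(url, folder):
    """Return the local path for an image URL inside folder.

    Creates folder if it does not exist yet. Returns None when the
    URL does not match the file-name pattern.
    """
    match = NAME_PAT.search(url)
    if match is None:
        return None
    if not os.path.isdir(folder):
        os.makedirs(folder)
    return os.path.join(folder, match.group(1))


if __name__ == "__main__":
    # Hypothetical folder name, placed under the system temp directory.
    demo_dir = os.path.join(tempfile.gettempdir(), "beauty_pics")
    print(target_path("http://example.com/img/abcde12345.jpg", demo_dir))
```

In the script, the returned path could then be passed as the second argument to urllib.urlretrieve in place of fname.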