Getting Started with Python Crawlers: Sharing Code for a "Beauty" Image Crawler
Continuing the crawler series. Today I am posting code that crawls the images, and their source images, under the "beauty" tag of diandian.com.
# -*- coding: utf-8 -*-
# ---------------------------------------
#   program: dianmei image crawler
#   version: 0.2
#   Author: zippera
#   Date:
#   language: Python 2.7
#   description: the number of pages to download can be set
# ---------------------------------------
import urllib2
import urllib
import re

# Capture the image URL that follows each "feed-big-img" container
pat = re.compile('<div class="feed-big-img">\n.*?imgsrc="(ht.*?)\".*?')
nexturl1 = "http://www.diandian.com/tag/%E7%BE%8E%E5%A5%B3?page="
count = 1
while count < 2:  # raise the limit to download more pages
    print "Page" + str(count) + "\n"
    myurl = nexturl1 + str(count)
    myres = urllib2.urlopen(myurl)
    mypage = myres.read()
    ucpage = mypage.decode("utf-8")  # transcoding
    mat = pat.findall(ucpage)
    if len(mat):
        cnt = 1
        for item in mat:
            print "Page" + str(count) + " No." + str(cnt) + " url: " + item + "\n"
            cnt += 1
            # Use the last 10 word characters plus the extension as the file name
            fnp = re.compile('(\w{10}\.\w+)$')
            fnr = fnp.findall(item)
            if fnr:
                fname = fnr[0]
                urllib.urlretrieve(item, fname)
    else:
        print "no data"
    count += 1
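The two regular expressions are the core of the script, so here is a small sketch that exercises them without touching the network. It is written for Python 3 rather than the Python 2.7 of the original. The sample HTML and the helper names extract_image_urls and file_name_for are made up for illustration, and the real page markup may differ.

```python
import re

# Same patterns as in the script, written as raw strings.
IMG_PATTERN = re.compile(r'<div class="feed-big-img">\n.*?imgsrc="(ht.*?)"')
NAME_PATTERN = re.compile(r'(\w{10}\.\w+)$')

# Hypothetical markup, only shaped like what the pattern expects.
SAMPLE_HTML = (
    '<div class="feed-big-img">\n'
    '<a href="#"><img imgsrc="http://example.com/pic/abcdefghij.jpg" alt=""></a>\n'
    '</div>\n'
    '<div class="feed-big-img">\n'
    '<a href="#"><img imgsrc="http://example.com/pic/short.png" alt=""></a>\n'
    '</div>\n'
)


def extract_image_urls(html):
    """Return every image URL captured by IMG_PATTERN, in page order."""
    return IMG_PATTERN.findall(html)


def file_name_for(url):
    """Return the trailing 10-character name plus extension, or None."""
    match = NAME_PATTERN.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    for url in extract_image_urls(SAMPLE_HTML):
        print(url, "->", file_name_for(url))
```

Note that a URL whose file name has fewer than 10 word characters before the extension yields no name, so the script silently skips that image.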
Usage: Create a folder, save the code in it as a .py file (for example name.py), and run python name.py. The images are downloaded into that folder.
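The script writes each file to the current working directory, which is why it should be run from inside a new folder. As a possible variant, the sketch below (Python 3, not the author's code) takes the target folder as an argument and creates it if needed. The function name save_images and its fetch parameter are assumptions for illustration. fetch defaults to urllib.request.urlretrieve and can be swapped for a stub when trying the logic offline.

```python
import os
import re
from urllib.request import urlretrieve

# Same file-name rule as the script: last 10 word characters plus extension.
NAME_PATTERN = re.compile(r'(\w{10}\.\w+)$')


def save_images(image_urls, folder, fetch=urlretrieve):
    """Download each URL into folder and return the paths that were written.

    URLs whose tail does not match NAME_PATTERN are skipped, as in the script.
    """
    os.makedirs(folder, exist_ok=True)
    saved = []
    for url in image_urls:
        match = NAME_PATTERN.search(url)
        if not match:
            continue
        path = os.path.join(folder, match.group(1))
        fetch(url, path)  # urlretrieve(url, filename) by default
        saved.append(path)
    return saved
```

Calling save_images(urls, "beauty_images") would then place the files in a beauty_images folder next to the script, wherever it is run from.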