This article is a getting-started tutorial for Python crawlers and shares the code of a small picture crawler. It uses collecting the girl pictures on Diandian (diandian.com) as its example; readers who need something similar can use it as a reference.
Continuing with crawlers: today I am posting code that crawls the pictures, downloading the original images, under the "beauty" tag on Diandian.
# -*- coding: utf-8 -*-
# ---------------------------------------
#   Program: dianmei Image crawler
#   Version: 0.2
#   Author: zippera
#   Date:
#   Language: Python 2.7
#   Description: the number of pages to download can be set
# ---------------------------------------

import urllib2
import urllib
import re

# Capture the address stored in the imgsrc attribute of each picture
pat = re.compile('\n.*?imgsrc="(ht.*?)\".*?')
nexturl1 = "http://www.diandian.com/tag/%E7%BE%8E%E5%A5%B3?page="

count = 1
while count < 2:  # raise the upper bound to download more pages
    print "Page" + str(count) + "\n"
    myurl = nexturl1 + str(count)
    myres = urllib2.urlopen(myurl)
    mypage = myres.read()
    ucpage = mypage.decode("utf-8")  # transcoding
    mat = pat.findall(ucpage)
    if len(mat):
        cnt = 1
        for item in mat:
            print "Page" + str(count) + " No." + str(cnt) + " url: " + item + "\n"
            cnt += 1
            # Use the last 10 word characters plus the extension as the file name
            fnp = re.compile('(\w{10}\.\w+)$')
            fnr = fnp.findall(item)
            if fnr:
                fname = fnr[0]
                urllib.urlretrieve(item, fname)
    else:
        print "no data"
    count += 1
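To see what the two regular expressions in the script actually pick up, they can be tried on a small piece of text without touching the network. The following is only a sketch: the page fragment, the example.com addresses, and the helper names extract_image_urls and filename_for are made up for illustration, and the real markup on the site may differ.

```python
import re

# The same two patterns as in the crawler, written as raw strings.
IMG_PATTERN = re.compile(r'\n.*?imgsrc="(ht.*?)".*?')
NAME_PATTERN = re.compile(r'(\w{10}\.\w+)$')

# Hypothetical page fragment, invented for this example.
SAMPLE_PAGE = (
    '<div>\n'
    '  <img alt="a" imgsrc="http://example.com/pics/abcde12345.jpg" />\n'
    '</div>\n'
    '<div>\n'
    '  <img alt="b" imgsrc="http://example.com/pics/short.png" />\n'
    '</div>\n'
)


def extract_image_urls(page):
    """Return every imgsrc address that follows a line break in the page."""
    return IMG_PATTERN.findall(page)


def filename_for(url):
    """Return the last 10 word characters plus extension, or None if absent."""
    match = NAME_PATTERN.search(url)
    return match.group(1) if match else None


urls = extract_image_urls(SAMPLE_PAGE)
names = [filename_for(url) for url in urls]
print(urls)
print(names)
```

Both addresses are found, but only the first one yields a file name. An address whose final path segment has fewer than 10 word characters before the extension gives None, which is why the script skips such pictures without printing anything.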
Usage: create a folder, save the code in it as a .py file (for example name.py), and run python name.py from inside that folder. The images are downloaded into the same folder.
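The script is written for Python 2.7. On Python 3 the urllib2 module and the print statement no longer exist, so it will not run unchanged. Below is a rough sketch of how the same loop might look on Python 3, using urllib.request from the standard library. The function names fetch_page and crawl, and the fetch and save parameters, are my own additions so the loop can be tried with stand-in functions; whether the site still serves the same pages is not something this sketch can confirm.

```python
import re
import urllib.request

BASE_URL = "http://www.diandian.com/tag/%E7%BE%8E%E5%A5%B3?page="
IMG_PATTERN = re.compile(r'\n.*?imgsrc="(ht.*?)".*?')
NAME_PATTERN = re.compile(r'(\w{10}\.\w+)$')


def fetch_page(url):
    """Download one listing page and decode it as UTF-8."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def crawl(max_pages, fetch=fetch_page, save=urllib.request.urlretrieve):
    """Visit pages 1..max_pages and save each picture that has a usable name.

    fetch and save are hypothetical hooks added for this sketch; by default
    they perform the real download. Returns the file names passed to save().
    """
    saved = []
    for page_number in range(1, max_pages + 1):
        page = fetch(BASE_URL + str(page_number))
        for image_url in IMG_PATTERN.findall(page):
            match = NAME_PATTERN.search(image_url)
            if match:
                # Written to the current working directory, as in the original
                save(image_url, match.group(1))
                saved.append(match.group(1))
    return saved
```

Calling crawl(1) would correspond to the original `while count < 2` loop, which only visits page 1.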