1. Overview
This post is based on http://www.cnblogs.com/abelsu/p/4540711.html, which shows how to capture the images of a single web page with Python. Python has since been upgraded to version 3, so the referenced code no longer works and is largely unusable. I modified it and re-implemented the web image capture.
2. Code
# coding=utf-8
# The urllib module provides an interface for reading web page data
import urllib
# The re module mainly contains regular expressions
import re
import urllib.parse
import urllib.request


# Define a gethtml() function
def gethtml(url):
    # urllib.request.urlopen() opens a URL address
    page = urllib.request.urlopen(url)
    # read() reads the data at the URL (as bytes)
    html = page.read()
    html = html.decode('utf8')
    # print(html)
    return html


def getimg(html):
    # Regular expression that captures the picture address
    # (non-greedy, so several images on one line are all matched)
    reg = r'img.*?src="(.+?\.jpg)"'
    # re.compile() compiles a regular expression into a regular expression object
    imgre = re.compile(reg)
    # re.findall() returns every match of imgre (the regular expression) in html
    imglist = re.findall(imgre, html)
    # Pass the filtered image addresses through the for loop and save them locally.
    # The core is urllib.request.urlretrieve(), which downloads the remote data
    # directly to a local file; the images are named by incrementing x in turn.
    x = 0
    for imgurl in imglist:
        # Raw string so the backslashes in the Windows path are kept literally
        urllib.request.urlretrieve(imgurl, r'E:\Raumrot\%s.jpg' % x)
        x += 1
        print(imgurl)


html = gethtml("http://raumrot.com/photo-set-landing/")
getimg(html)
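The script above imports urllib.parse but never uses it, and it assumes that every src value is already an absolute URL and that the target folder exists. As a rough sketch only (not the original author's code), the variant below separates the pure extraction step from the download step, resolves relative addresses with urllib.parse.urljoin(), and creates the output folder first. The names IMG_SRC_RE, extract_image_urls and download_images, as well as the stricter pattern, are hypothetical and chosen for illustration.

```python
import os
import re
import urllib.parse
import urllib.request

# Hypothetical pattern (an assumption, not from the original post):
# matches the src attribute of an <img> tag ending in .jpg, with either quote style.
IMG_SRC_RE = re.compile(r'<img[^>]*?\ssrc=["\']([^"\']+?\.jpg)["\']', re.IGNORECASE)


def extract_image_urls(html, base_url):
    """Return absolute .jpg URLs from img tags, in document order, without duplicates."""
    urls = []
    for src in IMG_SRC_RE.findall(html):
        # urljoin() turns relative addresses such as "/img/a.jpg" into full URLs
        absolute = urllib.parse.urljoin(base_url, src)
        if absolute not in urls:
            urls.append(absolute)
    return urls


def download_images(urls, out_dir):
    """Save each URL as 0.jpg, 1.jpg, ... inside out_dir and return the local paths."""
    # Create the target folder if it does not exist yet
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, url in enumerate(urls):
        path = os.path.join(out_dir, "%s.jpg" % index)
        urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths
```

With the page text returned by gethtml(), a call such as download_images(extract_image_urls(html, "http://raumrot.com/photo-set-landing/"), "Raumrot") would save the files into a Raumrot folder under the current directory. Keeping extraction free of network access also makes it easy to try the pattern on a small HTML string before downloading anything.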
3. Result
[Image: Python crawler web images]