# First, fetch the entire page so that the pictures in it can be downloaded.
#coding=utf-8
# The urllib module provides an interface for reading web page data; we can
# read data over HTTP and FTP as if it were a local file.
import urllib
import re

# First, we define a gethtml() function:
def gethtml(url):
    # urllib.urlopen() opens a URL
    page = urllib.urlopen(url)
    # read() reads the data at that URL. Pass a URL to gethtml() and it
    # downloads the whole page; printing the result shows the page source.
    html = page.read()
    return html

# The getimg() function filters the wanted picture links out of the page.
def getimg(html):
    # Use a regular expression to pick the picture URLs out of the page
    reg = r'src="(.+?\.jpg)" pic_ext'
    # re.compile() compiles the pattern into a regular expression object
    imgre = re.compile(reg)
    # re.findall() returns every match of imgre in html
    imglist = re.findall(imgre, html)
    # Loop over the picture links. To keep the file names tidy, each file is
    # renamed after a counter x that goes up by 1 per picture.
    x = 0
    for imgurl in imglist:
        # urllib.urlretrieve() downloads remote data straight to a local file
        urllib.urlretrieve(imgurl, '%s.jpg' % x)
        x += 1
    # Return the links so that the print below shows something other than None
    return imglist

# The pages we want to crawl differ from site to site, so different
# regular expressions are needed for each of them.
html = gethtml("http://tieba.baidu.com/p/2460150866")
print getimg(html)

http://www.cnblogs.com/fnng/p/3576154.html#Top
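To see what the regular expression in getimg() picks up, it can be tried on a small piece of markup without touching the network. This is only a sketch, written in Python 3 syntax: the sample HTML and the name find_image_urls are made up for the illustration, and only the pattern itself comes from the code above.

```python
import re

# Same pattern as in getimg(): capture the .jpg URL inside src="..."
# when that attribute is followed by pic_ext.
IMG_PATTERN = re.compile(r'src="(.+?\.jpg)" pic_ext')


def find_image_urls(html):
    """Return every .jpg URL matched by IMG_PATTERN, in page order."""
    return IMG_PATTERN.findall(html)


# Hypothetical markup, one tag per line, invented for this example.
sample_html = "\n".join([
    '<img class="BDE_Image" src="http://example.com/a.jpg" pic_ext="jpeg">',
    '<img src="http://example.com/logo.png">',
    '<img class="BDE_Image" src="http://example.com/b.jpg" pic_ext="jpeg">',
])

print(find_image_urls(sample_html))
```

The .png line is skipped because it never reaches `.jpg" pic_ext`, and `.` does not match a newline. If several tags sat on one line, `.+?` could run across them, so `[^"]+?` in its place would probably be a stricter choice.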
Update
If you want to save the pictures in a folder that you specify, you only need to modify this call:
urllib.urlretrieve(imgurl, '%s.jpg' % x)
Its form is
urlretrieve(url, path)
so change path to the location you want. For example, I want to put the pictures in F:\pic:
urllib.urlretrieve(imgurl, 'f:/pic/%s.jpg' % x)
That is all it takes.
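One thing to watch for: as far as I know, urlretrieve() does not create a missing folder, so the target folder has to exist before the download starts. Below is a minimal sketch of a helper that takes care of this, written for Python 3. The name build_target_path is hypothetical, and a temporary folder stands in for F:\pic so that the example runs anywhere.

```python
import os
import tempfile


def build_target_path(folder, index):
    """Create folder if it is missing and return the path for picture number index."""
    os.makedirs(folder, exist_ok=True)
    return os.path.join(folder, '%s.jpg' % index)


# A temporary folder is used here in place of f:/pic (an assumption for the demo).
demo_folder = os.path.join(tempfile.mkdtemp(), 'pic')
targets = [build_target_path(demo_folder, x) for x in range(3)]
print(targets)
```

In the download loop the returned path would take the place of the hard-coded string, for example build_target_path('f:/pic', x) as the second argument of urlretrieve(). In Python 3 that function lives in urllib.request rather than directly in urllib.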
Implementing a simple crawler in Python