Writing a simple web crawler using Python (i)

Source: Internet
Author: User

I finally had time to put the Python I have learned to use by writing a simple web crawler. This example uses a Python crawler to download pictures of beautiful women from the Baidu image gallery and save them locally. Without further ado, here is the code:

-------------------------------------------------------------------------------------------

#coding=utf-8
# Python 2 code: it relies on urllib.urlopen and the print statement.

# Import the urllib and re modules
import urllib
import re

# Class that fetches the HTML of the Baidu image gallery page at the given URL
class gethtml:
    def __init__(self, url):
        self.url = url

    def gethtml(self):
        page = urllib.urlopen(self.url)
        html = page.read()
        return html

# Class that processes the return value of gethtml.gethtml() (the page source,
# which contains the link addresses of the pictures in the Baidu gallery).
# It extracts the image link addresses and downloads the corresponding
# pictures, which are saved directly to the local directory.
class getimg:
    def __init__(self, html):
        self.html = html

    def getimg(self):
        # Key name assumed to be spelled "thumbLargeUrl" in the page source
        reg = r'"thumbLargeUrl":"(.+?\.jpg)"'
        imgre = re.compile(reg, re.S | re.M)
        imglist = re.findall(imgre, self.html)
        # print imglist
        x = 1
        for imgurl in imglist:
            urllib.urlretrieve(imgurl, '%s.jpg' % x)
            y = x + 1
            print 'Picture %s downloaded, now downloading picture %s, please wait...' % (x, y)
            x += 1
        x -= 1
        print '-------- Download finished, %s pictures downloaded in total --------' % x

# Main entry point of the program
if __name__ == '__main__':
    url = 'http://image.baidu.com/channel?c=%E7%BE%8E%E5%A5%B3#%E7%BE%8E%E5%A5%B3'
    test = gethtml(url)
    p = test.gethtml()
    m = getimg(p)
    m.getimg()
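
The listing above is written for Python 2 (urllib.urlopen and the print statement). As a rough sketch only, the same three steps (fetch the page, pull out the .jpg links with a regular expression, save each picture under a numbered file name) could look like this in Python 3. The function names fetch_html, extract_image_urls and download_images, the retrieve parameter, the UTF-8 decoding and the example.com sample data are all assumptions made for illustration and are not part of the original post; the key name "thumbLargeUrl" is likewise assumed, and the current Baidu page may no longer contain it.

```python
import re
import urllib.request

# Variant of the crawler's pattern. Assumptions: the key is spelled
# "thumbLargeUrl"; whitespace around the colon is tolerated; [^"] keeps a
# match from running past the closing quote of the value.
IMG_PATTERN = re.compile(r'"thumbLargeUrl"\s*:\s*"([^"]+?\.jpg)"')


def fetch_html(url):
    """Download a page and decode it to text (UTF-8 is assumed here)."""
    with urllib.request.urlopen(url) as page:
        return page.read().decode("utf-8", errors="replace")


def extract_image_urls(html):
    """Return every .jpg link captured by the pattern, in page order."""
    return IMG_PATTERN.findall(html)


def download_images(urls, retrieve=urllib.request.urlretrieve):
    """Save the images as 1.jpg, 2.jpg, ... and return how many were saved.

    `retrieve` is a hypothetical hook so the loop can be tried without
    touching the network; by default it is urllib.request.urlretrieve.
    """
    count = 0
    for count, img_url in enumerate(urls, start=1):
        retrieve(img_url, "%s.jpg" % count)
        print("Picture %s downloaded" % count)
    return count


# Offline check of the extraction step on a made-up snippet (no network).
sample = ('{"thumbLargeUrl":"http://example.com/a.jpg"},'
          '{"thumbLargeUrl" : "http://example.com/b.jpg"}')
print(extract_image_urls(sample))
# prints ['http://example.com/a.jpg', 'http://example.com/b.jpg']
```

To run it against a real page, one would call download_images(extract_image_urls(fetch_html(url))) with the url from the listing above. Keeping the extraction step separate from the download step makes it possible to check the regular expression on a saved copy of the page before any pictures are fetched.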

This article is from the "Simple New Life" blog; please be sure to keep this source link when reposting: http://857768.blog.51cto.com/847768/1641193
