Downloading Images from a Web Page with Python

Source: Internet
Author: User

Today I needed to download a large number of images, which is tedious to do by hand, so I wrote a small program. I ran into quite a few problems along the way.

The most important one: some web pages return 403 Forbidden unless the request carries header information. Adding a User-Agent header resolves it. I'm recording the fix here.
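The header fix can be tried in isolation. The following is a minimal sketch in Python 3, assuming the standard `urllib.request` module rather than the `urllib2` module used in this post. The helper names `build_request` and `fetch` and the `example.com` URL are illustrative, not part of the original program.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that carries a browser-like User-Agent header.

    Hypothetical helper: some servers answer 403 Forbidden when the
    default urllib User-Agent is sent.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Download the raw bytes of a page (this performs a network request)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()


if __name__ == "__main__":
    # Inspect the request without touching the network.
    req = build_request("http://example.com/")
    print(req.get_header("User-agent"))  # prints: Mozilla/5.0
```

`urllib.request` stores header names in capitalized form, which is why the lookup uses `"User-agent"`. Whether a given site accepts this particular User-Agent string is an assumption; some sites check further headers as well.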

The program draws on regular expressions, web programming with urllib, and a few other things I haven't used in a long time, so it also served as a review.

Code

# -*- coding: utf-8 -*-

import re, urllib, urllib2

def getpage(url):
    '''Download the page's HTML and return the core markup of the first post.'''
    opener = urllib2.build_opener()
    # Without header information the server returns 403 and the content is garbled
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    htmlall = opener.open(url).read()
    reg1floor = '<div class="msgfont">(.*?)</div>'
    html = re.search(reg1floor, htmlall)
    html = html.group()
    # The page and this source file are both UTF-8, so decode once.
    # Skipping this makes printed text look garbled, but the downloads are unaffected.
    return html.decode('utf-8')

def getimg(url):
    '''Extract the image addresses from the core markup, then download, save, and name them.'''
    # Image-address pattern (left empty in the source)
    regimg = ''
    # Backslashes are escaped so that '\t' in '\temp' is not read as a tab
    dir = 'f:\\my_document\\Desktop\\temp\\'
    pagehtml = getpage(url)
    # Find all image addresses
    imglist = re.findall(regimg, pagehtml)
    # print imglist
    for index in xrange(1, len(imglist) + 1):
        filename = dir + str(index) + '.jpg'
        urllib.urlretrieve(imglist[index - 1], filename)
        print filename + ' OK!'


if __name__ == '__main__':
    getimg('http://response')
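The extraction and naming steps can also be sketched without any network access. This is a rough Python 3 illustration, not the author's exact code: the image pattern `IMG_SRC_RE` is an assumption (the pattern in the listing above is empty), and `SAMPLE`, `extract_image_urls`, and `plan_downloads` are hypothetical names introduced for the example.

```python
import re

# Pattern for the first post's container, following the listing above.
FIRST_POST_RE = re.compile(r'<div class="msgfont">(.*?)</div>', re.S)
# Assumed pattern for image addresses; the original pattern is not shown.
IMG_SRC_RE = re.compile(r'<img[^>]+src="([^"]+)"', re.I)

# Made-up HTML used only to exercise the functions below.
SAMPLE = (
    '<div class="msgfont">'
    '<img src="http://example.com/a.jpg">'
    '<img src="http://example.com/b.jpg">'
    '</div>'
    '<div class="other"><img src="http://example.com/c.jpg"></div>'
)


def extract_image_urls(html):
    """Return the image addresses found inside the first matching block."""
    match = FIRST_POST_RE.search(html)
    if match is None:
        # No matching block: return an empty list instead of raising.
        return []
    return IMG_SRC_RE.findall(match.group(1))


def plan_downloads(urls, ext=".jpg"):
    """Pair each address with a numbered file name: 1.jpg, 2.jpg, ..."""
    return [(str(i) + ext, url) for i, url in enumerate(urls, 1)]


if __name__ == "__main__":
    for name, url in plan_downloads(extract_image_urls(SAMPLE)):
        print(name, url)
```

Using `enumerate(urls, 1)` gives the same 1-based numbering as the index loop in the listing. The non-greedy `(.*?)` stops at the first `</div>`, so a post that contains nested `div` elements would be cut short; an HTML parser would be a more robust choice in that case.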

 
