Python example: capturing Douban images and saving them automatically

Source: Internet
Author: User
This example, intended for learning, uses Python to capture images from Douban and save them automatically. It relies on the BeautifulSoup library to parse the HTML; BeautifulSoup is an HTML/XML parser that is well suited to web crawlers. Environment: Python 2.7.6 with BS4. The script can be run from PowerShell or the command line. Make sure the bs4 module is installed before running it.
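To show what the parser does before the full script, here is a minimal sketch of pulling image addresses out of HTML with BeautifulSoup. The HTML fragment and the helper name extract_image_sources are made up for illustration and are not part of the original script; the sketch assumes bs4 is installed and uses the parser from the standard library.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <img src="pics/a.jpg">
  <img src="pics/b.jpg">
  <img alt="no source attribute">
</body></html>
"""


def extract_image_sources(html):
    """Return the src attribute of every <img> tag that has one."""
    soup = BeautifulSoup(html, 'html.parser')
    sources = []
    for tag in soup.find_all('img'):
        src = tag.get('src')  # None when the attribute is missing
        if src:
            sources.append(src)
    return sources


if __name__ == '__main__':
    print(extract_image_sources(SAMPLE_HTML))
```

Skipping tags without a src attribute avoids passing None to later steps such as building a download address.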

The code is as follows:


# -*- coding: utf8 -*-
# 2013.12.36 19:41 wnlo-c209
# Capture the images from dbmeizi.com.

from bs4 import BeautifulSoup
import os, sys, urllib2

# Create the folder for the downloaded images (just learned this yesterday)
path = os.getcwd()  # current working directory
new_path = os.path.join(path, u'doubanque')
if not os.path.isdir(new_path):
    os.mkdir(new_path)


def page_loop(page=0):
    url = 'http://www.dbmeizi.com/?p=%s' % page
    content = urllib2.urlopen(url)

    # Name the parser explicitly so the result does not depend on what is installed
    soup = BeautifulSoup(content, 'html.parser')

    my_girl = soup.find_all('img')

    # Added end detection, which was hard to write ....
    if my_girl == []:
        print u'All pages have been captured'
        sys.exit(0)

    print u'Start grabbing'
    for girl in my_girl:
        link = girl.get('src')
        flink = 'http://www.dbmeizi.com/' + link

        print flink
        content2 = urllib2.urlopen(flink).read()
        # Save into the folder created above; the last 11 characters of the URL serve as the file name
        with open(os.path.join(new_path, flink[-11:]), 'wb') as code:  # learned on OSC
            code.write(content2)
    page = int(page) + 1
    print u'Start capturing the next page'
    print u'Page %s' % page
    page_loop(page)

page_loop()
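The script builds each image address by concatenating the site root with the src value and names the file after the last 11 characters of the address. That works only when every src is a relative path whose file name has exactly that length. A more tolerant approach, sketched below under those assumptions, uses the standard library to resolve the address and derive the file name. The helper names absolute_url and local_path, and the fallback name 'unnamed', are hypothetical and not part of the original script.

```python
import os
import posixpath

try:  # Python 3
    from urllib.parse import urljoin, urlparse
except ImportError:  # Python 2.7
    from urlparse import urljoin, urlparse

BASE_URL = 'http://www.dbmeizi.com/'  # site root used by the script


def absolute_url(src, base=BASE_URL):
    """Resolve a src value against the site root.

    Relative paths are joined to the base; absolute URLs pass through.
    """
    return urljoin(base, src)


def local_path(image_url, folder):
    """Build the path to save under, using the last segment of the URL path."""
    name = posixpath.basename(urlparse(image_url).path)
    # Fall back to a placeholder when the URL path ends in a slash.
    return os.path.join(folder, name or 'unnamed')


if __name__ == '__main__':
    url = absolute_url('pics/a.jpg')
    print(url)
    print(local_path(url, 'doubanque'))
```

In the script, flink could then be computed with absolute_url(link) and the open() call could take local_path(flink, new_path).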
