[Python] Web crawler (6): A simple Baidu Tieba crawler
# -*- coding: utf-8 -*-
#---------------------------------------
#   Program:  Baidu Tieba crawler
#   Version:  0.1
#   Author:   why
#   Date:     2013-05-14
#   Language: Python 2.7
#   Usage:    Enter the paginated thread URL with the trailing page number removed, then set the start and end pages.
#   Function: Download every page in the given page range and save each one as an HTML file.
#---------------------------------------

import string, urllib2

# Define the Baidu Tieba download function
def baidu_tieba(url, begin_page, end_page):
    for i in range(begin_page, end_page + 1):
        sName = string.zfill(i, 5) + '.html'  # zero-pad to a five-digit file name
        print 'Downloading page ' + str(i) + ' and saving it as ' + sName + '...'
        f = open(sName, 'w+')
        m = urllib2.urlopen(url + str(i)).read()
        f.write(m)
        f.close()

#-------- Enter the parameters here ------------------
# This is the address of a thread in the Shandong University Baidu Tieba forum
#bdurl = 'http://tieba.baidu.com/p/2296017831?pn='
#iPostBegin = 1
#iPostEnd =

bdurl = str(raw_input(u'Enter the Tieba thread URL, without the number after pn=:\n'))
begin_page = int(raw_input(u'Enter the start page number:\n'))
end_page = int(raw_input(u'Enter the end page number:\n'))
#-------- Enter the parameters here ------------------

# Call the function
baidu_tieba(bdurl, begin_page, end_page)
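The script above targets Python 2.7, where `urllib2` and `string.zfill` are available. For readers on Python 3, the same idea could be sketched roughly as follows. This is an adaptation under stated assumptions, not the original author's code: the helper names `page_filename` and `page_url` and the optional `fetch` parameter are hypothetical additions that make the logic easier to try without network access.

```python
# Python 3 sketch of the same crawler (an adaptation, not the original code).
import urllib.request


def page_filename(page, width=5):
    """Return a zero-padded file name such as '00001.html'."""
    return str(page).zfill(width) + '.html'


def page_url(base_url, page):
    """Append the page number to a URL that ends in 'pn='."""
    return base_url + str(page)


def baidu_tieba(base_url, begin_page, end_page, fetch=None):
    """Download pages begin_page..end_page (inclusive) and save each as HTML.

    `fetch` is a hypothetical hook that takes a URL and returns bytes.
    When omitted, pages are fetched with urllib.request.urlopen.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as response:
                return response.read()

    saved = []
    for page in range(begin_page, end_page + 1):
        name = page_filename(page)
        print('Downloading page %d and saving it as %s...' % (page, name))
        data = fetch(page_url(base_url, page))
        # Write bytes as received so the page encoding is left untouched.
        with open(name, 'wb') as f:
            f.write(data)
        saved.append(name)
    return saved
```

Calling `baidu_tieba('http://tieba.baidu.com/p/2296017831?pn=', 1, 3)` would then attempt to download pages 1 through 3 of the example thread into the current directory. Writing in binary mode is a design choice of this sketch; it avoids guessing the page's character encoding.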