A Simple Python Crawler Example for Meizitu (mzitu.com)

Last updated: 2016-06-10 Source: Internet

Uploaded by: User


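This article describes how to implement a simple Python crawler for the Meizitu image site (mzitu.com). It is shared here for reference. The details are as follows: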

#!/usr/bin/env python
# coding: utf-8
import urllib
import os
import re

# Show download progress (used as the report hook of urllib.urlretrieve)
def schedule(a, b, c):
  '''
  a: number of data blocks downloaded so far
  b: size of each data block
  c: size of the remote file
  '''
  per = 100.0 * a * b / c
  if per > 100:
    per = 100
  print '%.2f%%' % per

# Fetch the HTML source of a page
def getHtml(url):
  page = urllib.urlopen(url)
  html = page.read()
  return html

# Download the image found on one page
def downloadImg(html, num, foldername):
  picpath = '%s' % (foldername)  # local download directory
  if not os.path.exists(picpath):  # create the directory if it does not exist
    os.makedirs(picpath)
  target = picpath + '/%s.jpg' % num
  # The image-URL pattern is missing from the source text.
  myItems = re.findall('', html, re.S)
  print 'Downloading image to location: ' + target
  urllib.urlretrieve(myItems[0], target, schedule)

# Regex match for pagination: the last match is the total page count
def findPage(html):
  # The HTML tags around (\d*) are missing from the source text.
  myItems = re.findall('(\d*)', html, re.S)
  return myItems.pop()

# Regex match for the gallery list: each item is used as (id, title)
def findList(html):
  # Only the tail of this pattern survives in the source text.
  myItems = re.findall('.*?', html, re.S)
  return myItems

# Download every gallery listed on one list page
def totalDownload(modelUrl):
  listHtml = getHtml(modelUrl)  # parse this page, not the global first page
  listContent = findList(listHtml)
  for item in listContent:
    baseUrl = 'http://www.mzitu.com/' + str(item[0])
    html = getHtml(baseUrl)
    totalNum = findPage(html)
    for num in range(1, int(totalNum) + 1):
      # Page 1 is the gallery URL itself; later pages append '/<num>'
      if num == 1:
        url = baseUrl
      else:
        url = baseUrl + '/' + str(num)
      html5 = getHtml(url)
      downloadImg(html5, str(num), str(item[1]))

if __name__ == '__main__':
  listHtml = getHtml('http://www.mzitu.com/model')
  # This is the URL of one section of the site; add the URLs of other
  # sections to crawl the whole site.
  for model in range(1, int(findPage(listHtml)) + 1):
    if model == 1:
      modelUrl = 'http://www.mzitu.com/model'
    else:
      modelUrl = 'http://www.mzitu.com/model/page/' + str(model)
    totalDownload(modelUrl)
  print "Download has finished."
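The listing above targets Python 2 (`print` statements, `urllib.urlopen`). As a rough illustration of the same building blocks in Python 3, the sketch below isolates the logic that does not touch the network: the progress percentage computed in `schedule`, the per-page URL rule used in `totalDownload`, and regex extraction of an image URL. The names `progress_percent`, `page_url`, `first_image_url` and `IMG_PATTERN`, the sample HTML, and the `<img src="...">` pattern are assumptions made for illustration. They are not the patterns of the original script, and the real site's markup may differ.

```python
import re

BASE_URL = 'http://www.mzitu.com/'  # site root used in the listing above


def progress_percent(blocks_done, block_size, total_size):
    """Percentage downloaded, capped at 100 (same arithmetic as schedule)."""
    if total_size <= 0:
        # The report hook may receive -1 when the file size is unknown.
        return 0.0
    return min(100.0, 100.0 * blocks_done * block_size / total_size)


def page_url(gallery_id, num):
    """Page 1 is the gallery URL itself; later pages append '/<num>'."""
    url = BASE_URL + str(gallery_id)
    return url if num == 1 else url + '/' + str(num)


# Hypothetical pattern: capture the src attribute of the first <img> tag.
IMG_PATTERN = re.compile(r'<img[^>]*\bsrc="([^"]+)"', re.S)


def first_image_url(html):
    """Return the first image URL found in html, or None if nothing matches."""
    match = IMG_PATTERN.search(html)
    return match.group(1) if match else None


if __name__ == '__main__':
    # Made-up markup, only to exercise the functions.
    sample = '<p><a href="#"><img src="http://example.com/1.jpg" alt="x" /></a></p>'
    print(first_image_url(sample))
    print(page_url(12345, 1), page_url(12345, 3))
    print('%.2f%%' % progress_percent(3, 8192, 100000))
```

Returning `None` when nothing matches lets the caller skip a page, whereas `myItems[0]` in the listing raises an `IndexError` as soon as the page layout changes.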

I hope this article is helpful for your Python programming.


