A Simple Python Crawler Example for Mzitu (妹子圖)

Source: Internet
Uploaded by: User
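This example shows how to implement a simple Python crawler for the Mzitu (妹子圖) image site. It is shared here for reference. The details are as follows: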

#!/usr/bin/env python
# coding: utf-8
# Python 2 script (print statement, urllib.urlopen).
import urllib
import os
import re

# NOTE: the HTML tags inside the three regex patterns below were stripped
# when this page was extracted; only the surviving fragments are shown.
# Restore them from the target pages' markup before running.


# Show download progress
def schedule(a, b, c):
    '''
    a: number of blocks downloaded so far
    b: size of each block
    c: size of the remote file
    '''
    per = 100.0 * a * b / c
    if per > 100:
        per = 100
    print '%.2f%%' % per


# Fetch the HTML source of a page
def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html


# Download the image found on one page
def downloadImg(html, num, foldername):
    picpath = '%s' % (foldername)  # local download directory
    if not os.path.exists(picpath):  # create the directory if it does not exist
        os.makedirs(picpath)
    target = picpath + '/%s.jpg' % num
    myItems = re.findall('', html, re.S)  # pattern lost in extraction
    print 'Downloading image to location: ' + target
    urllib.urlretrieve(myItems[0], target, schedule)


# Match the pagination links; the last match is taken as the page count
def findPage(html):
    myItems = re.findall('(\d*)', html, re.S)  # surrounding tags lost in extraction
    return myItems.pop()


# Match the gallery list; each match is used below as (path, title)
def findList(html):
    myItems = re.findall('.*?', html, re.S)  # capture groups lost in extraction
    return myItems


# Download every gallery listed on one index page
def totalDownload(modelUrl):
    listHtml = getHtml(modelUrl)  # fetch this index page
    listContent = findList(listHtml)
    for item in listContent:
        galleryUrl = 'http://www.mzitu.com/' + str(item[0])
        totalNum = findPage(getHtml(galleryUrl))
        for num in range(1, int(totalNum) + 1):
            # Page 1 is the gallery URL itself; later pages append /<num>
            if num == 1:
                url = galleryUrl
            else:
                url = galleryUrl + '/' + str(num)
            html5 = getHtml(url)
            downloadImg(html5, str(num), str(item[1]))


if __name__ == '__main__':
    # URL of one section of the site; add other section URLs to crawl the whole site.
    listHtml = getHtml('http://www.mzitu.com/model')
    for model in range(1, int(findPage(listHtml)) + 1):
        if model == 1:
            modelUrl = 'http://www.mzitu.com/model'
        else:
            modelUrl = 'http://www.mzitu.com/model/page/' + str(model)
        totalDownload(modelUrl)
    print "Download has finished."
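The progress callback and the page-URL rule in the listing above can be tried out without touching the network. The following is a minimal sketch, written for Python 3 rather than the Python 2 of the listing. The helper names `percent_done`, `report`, `page_url` and `download` are hypothetical and do not appear in the original. It also assumes the remote size may be reported as zero or negative when the server does not send a length, a case the original `schedule` does not guard against.

```python
from urllib.request import urlretrieve  # Python 3 location of urlretrieve

BASE_URL = 'http://www.mzitu.com/'  # site root used in the listing


def percent_done(block_num, block_size, total_size):
    """Return progress as a percentage capped at 100, or None if the size is unknown."""
    if total_size <= 0:  # assumption: no usable size reported by the server
        return None
    return min(100.0, 100.0 * block_num * block_size / total_size)


def report(block_num, block_size, total_size):
    """Report hook taking (blocks so far, block size, total size), as urlretrieve passes them."""
    per = percent_done(block_num, block_size, total_size)
    print('size unknown' if per is None else '%.2f%%' % per)


def page_url(path, num):
    """Build the URL of page `num` of a gallery; page 1 has no numeric suffix."""
    url = BASE_URL + str(path)
    return url if num == 1 else url + '/' + str(num)


def download(url, target):
    """Fetch `url` into the file `target`, printing progress (performs a network request)."""
    urlretrieve(url, target, report)
```

For instance, with a made-up gallery path `'123'`, `page_url('123', 3)` gives `'http://www.mzitu.com/123/3'`. Keeping the percentage calculation separate from the printing makes it possible to check the capping behaviour without downloading anything.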

I hope this article is helpful for your Python programming.
