I recently started learning Python. The first program most people write when picking up a new language is Hello world!
Here I did the same, and then used Python to write a small crawler that downloads the pictures from the Meizitu ("girl pictures") site. It feels slow, which is probably a problem with my code.
It crawls painfully slowly!!!
Two libraries need to be installed first, requests and BeautifulSoup. Installation is very simple!
Yes, that's it.
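Both libraries are usually installed with pip (`pip install requests beautifulsoup4`; the BeautifulSoup package is imported as `bs4`). As a quick check that the installation worked, a minimal sketch like the following could be used. The helper name `missing_modules` is made up for this example, and `importlib.util.find_spec` requires Python 3.4 or later, so this check does not run on Python 2.7.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


# The crawler needs the "requests" and "bs4" modules
missing = missing_modules(["requests", "bs4"])
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("requests and bs4 are both available")
```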
The code has been tested and runs under both Python 2.7.8 and Python 3.4.1!
# coding: utf-8
import requests
from bs4 import BeautifulSoup

# Directory where the pictures are saved (it must already exist)
downpath = "/jiaoben/python/meizitu/pic/"
head = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
timeout = 5
photoname = 0
c = '.jpeg'

session = requests.session()
for x in range(1, 4):
    site = "http://www.meizitu.com/a/qingchun_3_%d.html" % x
    page = session.get(site, headers=head, timeout=timeout)
    content = page.content
    contentsoup = BeautifulSoup(content, 'html.parser')
    # Lazy-loaded images keep their real URL in the data-original attribute
    jpg = contentsoup.find_all('img', {'class': 'scrollLoading'})
    for photo in jpg:
        photoadd = photo.get('data-original')
        photoname += 1
        name = str(photoname) + c
        r = requests.get(photoadd, stream=True, timeout=timeout)
        with open(downpath + name, 'wb') as fd:
            # iter_content() defaults to 1-byte chunks, so pass a larger size
            for chunk in r.iter_content(chunk_size=8192):
                fd.write(chunk)
print("You have downloaded %d photos" % photoname)
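About the slowness mentioned at the top: one plausible cause is the chunk size used when streaming. In requests, `iter_content()` has a default `chunk_size` of 1 byte, so calling it without arguments means one loop iteration and one `write` call per byte. The sketch below shows the difference without touching the network. `split_bytes` is a made-up stand-in for a streamed response and `save_chunks` is a hypothetical helper, neither is part of requests.

```python
import os
import tempfile


def split_bytes(data, chunk_size):
    """Stand-in for a streamed response: yield data in chunk_size pieces."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path.

    Returns (total bytes written, number of write calls).
    """
    total = 0
    writes = 0
    with open(path, "wb") as fd:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fd.write(chunk)
                total += len(chunk)
                writes += 1
    return total, writes


payload = b"\xff" * 20000  # fake image data, 20000 bytes
tmpdir = tempfile.mkdtemp()
for size in (1, 8192):
    path = os.path.join(tmpdir, "demo-%d.jpeg" % size)
    total, writes = save_chunks(split_bytes(payload, size), path)
    print("chunk_size=%d: %d bytes in %d writes" % (size, total, writes))
# chunk_size=1: 20000 bytes in 20000 writes
# chunk_size=8192: 20000 bytes in 3 writes
```

With a real response object `r` obtained from `requests.get(url, stream=True)`, the same helper could be called as `save_chunks(r.iter_content(chunk_size=8192), path)`.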
Here is another script, one I wrote earlier in shell, that downloads the girl pictures from Jandan (煎蛋)!
To download more pictures, wrap the page number in a for loop yourself. I was too lazy to do it!
#!/bin/bash
PITCURE_ADDRESS="/jiaoben/python/meizitu/pic"
BROWSER="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0"
read -p "down number: " page
WEBSITE="http://jandan.net/ooxx/page-${page}"
SOURCE_WEBSITE="http://jandan.net/ooxx"
# Fetch the page and pull out every .jpg address (the part after the second "://")
photo=$(curl -A "$BROWSER" -m 10 -e "$SOURCE_WEBSITE" "$WEBSITE" | awk -F "://" '/.jpg/ {print $3}' | awk -F "\"" '{print $1}')
number=0
for i in $photo
do
    wget -q -T 10 -P "$PITCURE_ADDRESS" "$i" >/dev/null
    number=$((number+1))
    # Rename the download to <page>-<number>.jpg
    /bin/mv "${PITCURE_ADDRESS}/${i##*/}" "${PITCURE_ADDRESS}/${page}-${number}.jpg"
done
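The outer loop over page numbers that I was too lazy to write could look roughly like this in Python. This is only a sketch: the URL prefix and page number 1210 come from the script above, while `page_urls`, `extract_jpg_urls`, and the regular expression are made up for this example and are a simplification of the awk pipeline, not a copy of it.

```python
import re

BASE = "http://jandan.net/ooxx/page-"  # URL prefix taken from the script above

# Assumption: picture links are plain http(s) URLs ending in .jpg
JPG_RE = re.compile(r'https?://[^\s"\'<>]+?\.jpg', re.IGNORECASE)


def page_urls(first, last):
    """Build the page URLs for page numbers first..last (inclusive)."""
    return [BASE + str(n) for n in range(first, last + 1)]


def extract_jpg_urls(html):
    """Return the distinct .jpg URLs found in html, in order of appearance."""
    seen = []
    for url in JPG_RE.findall(html):
        if url not in seen:
            seen.append(url)
    return seen


for url in page_urls(1210, 1212):
    # Each page would be fetched here and passed to extract_jpg_urls()
    print(url)
```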
This article is from the "Statby blog" blog. Please keep this source link when reposting: http://statby.blog.51cto.com/7588140/1569640