Tested successfully with Python 2.7.9 on Windows 8. The crawl is a bit slow; I originally wanted to use multithreading, but something came up and I left it as is. The page parameter in the Template Home site's URLs does not match the actual page number, and I was too lazy to work out the pattern, so just change the URL in the code yourself. Experts, please go easy on me!
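Since the URL has to be changed by hand, one option is to keep the index numbers in a list and build the list-page URLs from it. This is only a minimal sketch: it assumes the other list pages follow the same index_N.shtml naming as the one URL used in the script, and `list_page_urls` is a helper name made up for this example. The actual numbers still have to be looked up on the site.

```python
# Base address of the template list pages (same value as theme_url in the script)
THEME_URL = 'http://www.cssmoban.com/cssthemes/'


def list_page_urls(index_numbers):
    """Build one list-page URL per index number, in the order given.

    Hypothetical helper: it only assumes the index_N.shtml naming seen
    in the single URL used by the script.
    """
    return [THEME_URL + 'index_%d.shtml' % n for n in index_numbers]


# 80 is the number used in the script; add the others you find on the site
for page_url in list_page_urls([80]):
    print(page_url)
```

Each URL produced this way can be used in place of the single hard-coded url in the script.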
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# by Ustcwq
# 2015-03-15
import urllib, urllib2, os, time
from bs4 import BeautifulSoup

start = time.clock()

# Save everything into a folder under the current working directory
path = os.getcwd() + u'/template Home crawl template/'
if not os.path.isdir(path):
    os.mkdir(path)

url = "http://www.cssmoban.com/cssthemes/index_80.shtml"  # How are the numbers after "index" arranged on the source site?
theme_url = 'http://www.cssmoban.com/cssthemes/'

# Fetch the list page and collect the link to each template's detail page
response = urllib2.urlopen(url)
soup = BeautifulSoup(response)
result = soup.select('p[class="title"] a')
print result

for item in result:
    link = item['href']
    # down_name = item.text  # file name
    new_url = theme_url + link.split('/')[-1]
    # Open the detail page and find its download buttons
    response = urllib2.urlopen(new_url)
    soup = BeautifulSoup(response)
    buttons = soup.select('.btn a')
    down_url = buttons[1]['href']  # file link
    # Name the archive after the current timestamp
    local = path + time.strftime('%Y%m%d%H%M%S', time.localtime(time.time())) + '.zip'
    urllib.urlretrieve(down_url, local)  # download the remote file to the local path

end = time.clock()
print u'Template crawl complete!'
print u'Total time: ', end - start, u's'
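The script downloads one template at a time, which is the main reason it is slow. Below is a rough sketch of how the multithreaded version could look, using the thread-based `Pool` from `multiprocessing.dummy` in the standard library. It is not part of the tested script: `save_one` and `download_all` are names made up for this example, and the list of (download URL, local path) pairs is assumed to have been collected first.

```python
from multiprocessing.dummy import Pool  # thread-based Pool with the multiprocessing.Pool API

try:
    from urllib import urlretrieve          # Python 2
except ImportError:
    from urllib.request import urlretrieve  # Python 3


def save_one(job):
    """Download one (down_url, local_path) pair and return the local path."""
    down_url, local = job
    urlretrieve(down_url, local)
    return local


def download_all(jobs, worker=save_one, threads=4):
    """Run worker over all jobs on a pool of threads; results keep the job order."""
    pool = Pool(threads)
    try:
        return pool.map(worker, jobs)
    finally:
        pool.close()
        pool.join()
```

To use it, the loop in the script would append `(down_url, local)` to a list instead of calling `urllib.urlretrieve` directly, and the list would be passed to `download_all` at the end. Note that timestamp-based file names can collide when several downloads start within the same second, so each job should get its own distinct local path.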
That is all for this article; I hope you like it.