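I previously shared a post on fetching web pages with multiple threads in Python, which I think gives a good idea of how page fetching works in Python. That approach only retrieves a page's source code, though, so it may not suit you if you want to download files with Python. I ran into exactly this problem recently while writing a file downloader, and eventually solved it. So that you have a ready solution if you hit the same issue, here is the code (Python 2, since it relies on urllib2 and urlparse):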
import urllib2
from os.path import basename
from urlparse import urlsplit

def url2name(url):
    # Use the last component of the URL path as the file name
    return basename(urlsplit(url)[2])

def download(url, localFileName = None):
    localName = url2name(url)
    req = urllib2.Request(url)
    r = urllib2.urlopen(req)
    if r.info().has_key('Content-Disposition'):
        # If the response has Content-Disposition, take the file name from it
        localName = r.info()['Content-Disposition'].split('filename=')[1]
        if localName[0] == '"' or localName[0] == "'":
            localName = localName[1:-1]
    elif r.url != url:
        # If we were redirected, take the real file name from the final URL
        localName = url2name(r.url)
    if localFileName:
        # The caller can force the file to be saved under a specific name
        localName = localFileName
    f = open(localName, 'wb')
    f.write(r.read())
    f.close()

download(r'URL of the file you want to download')
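The code above targets Python 2. If you are on Python 3, the same idea can be sketched roughly as follows. This is only a minimal sketch, not a drop-in replacement: the helper name `pick_filename` is my own invention for illustration, I let the standard library's `email.message.Message` parse the Content-Disposition header instead of splitting the string by hand, and I copy the response to disk with `shutil.copyfileobj` rather than reading the whole file into memory.

```python
import shutil
import urllib.request
from email.message import Message
from os.path import basename
from urllib.parse import urlsplit


def url2name(url):
    """Return the last component of the URL path."""
    return basename(urlsplit(url).path)


def pick_filename(url, final_url, content_disposition=None, override=None):
    """Hypothetical helper: choose the local file name.

    Priority: explicit override, then the Content-Disposition file name,
    then the final URL after redirects, then the original URL.
    """
    if override:
        return override
    if content_disposition:
        # Let the standard library parse the header, including quoted values.
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # basename() drops any directory part the server may have sent.
            return basename(name)
    if final_url != url:
        return url2name(final_url)
    return url2name(url)


def download(url, local_file_name=None):
    """Download url and return the name of the file it was saved as."""
    with urllib.request.urlopen(url) as r:
        name = pick_filename(
            url, r.url, r.headers.get("Content-Disposition"), local_file_name
        )
        with open(name, "wb") as f:
            # Stream the body to disk in chunks.
            shutil.copyfileobj(r, f)
    return name
```

You would call it the same way as before, for example `download('URL of the file you want to download')`. Unlike the original, this sketch falls back to the URL when the Content-Disposition header carries no file name instead of raising an error.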
Go ahead and give it a try: you can run Python locally to download whatever PDF files you want.
Article link: http://www.cnpythoner.com/post/pythonurldown.html Please keep this link when reposting, thanks!