A simple Python program for downloading web resources such as PDF, TXT, and PPT files.
# Python 2 script (uses urllib2 and the print statement)
import urllib
import urllib2
import re
import socket

######################
# You may change here
######################
BaseURL = '##########'  # add the URL of the page to download from
Format = '(pdf|txt|cc|ppt|pptx)'  # file formats to download; you can add your own
######################


def downfunc(blocknum, blocksize, totalsize):
    '''Callback function passed to urlretrieve.
    @blocknum: number of data blocks transferred so far
    @blocksize: size of each data block
    @totalsize: total size of the file being downloaded
    '''
    percent = 100.0 * blocknum * blocksize / totalsize
    if percent >= 100:
        percent = 100
        print "Download Complete ^^~"
    else:
        print "Downloaded %.2f%% ..." % percent


def download(downurl, localfilename=None):
    # m = re.search(r'(\w+\.pdf)', downurl)
    m = re.search(r'(\w+\.%s)' % Format, downurl)
    if localfilename is None:
        # Use the file name found in the URL
        localfilename = m.group(0)
    print("Downloading " + localfilename)
    try:
        urllib.urlretrieve(downurl, localfilename, downfunc)
    except socket.timeout:
        print "Download Timeout"


socket.setdefaulttimeout(30)  # seconds; assumed value, adjust as needed

# Open the page
page = urllib2.urlopen(BaseURL)
# Read the page content, which contains the HTML source
page_inform = page.read()
# Get the list of resources
# list_of_res = re.findall(r'href=.*"(.*\.pdf)"', page_inform)
list_of_res = re.findall(r'href=.*"(.*\.%s)"' % Format, page_inform)
# Remove duplicate resources
list_of_res = list(set(list_of_res))
# Download the resources one by one
for res in list_of_res:
    # Each match is a tuple (path, extension) because the pattern has two groups
    downurl = res[0]
    if downurl[0:4] != 'http':
        downurl = BaseURL + downurl
    download(downurl)
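The link-extraction step in the script above relies on a greedy regular expression and plain string concatenation to build absolute URLs. As a rough sketch only, and not the author's method, the same step could be written in Python 3 with the standard library's `html.parser` and `urllib.parse.urljoin`. The names `LinkCollector`, `find_resource_links`, and `local_name`, the sample HTML, and the `example.com` URLs are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_resource_links(html, base_url, extensions):
    """Return sorted, de-duplicated absolute URLs whose path ends
    with one of the given extensions (hypothetical helper)."""
    parser = LinkCollector()
    parser.feed(html)
    wanted = tuple("." + ext.lower() for ext in extensions)
    found = set()
    for href in parser.hrefs:
        # urljoin resolves relative links and leaves absolute ones unchanged
        absolute = urljoin(base_url, href)
        if urlparse(absolute).path.lower().endswith(wanted):
            found.add(absolute)
    return sorted(found)


def local_name(url):
    """Derive a local file name from the last path segment of the URL."""
    return posixpath.basename(urlparse(url).path)


# Made-up sample page and base URL, used only to exercise the helpers
sample = """
<a href="notes/lecture1.pdf">Lecture 1</a>
<a href="http://example.com/files/slides.pptx">Slides</a>
<a href="notes/lecture1.pdf">Lecture 1 again</a>
<a href="index.html">Home</a>
"""
links = find_resource_links(
    sample, "http://example.com/course/", ["pdf", "txt", "cc", "ppt", "pptx"]
)
print(links)
```

Each URL in `links` could then be passed to a download function, with `local_name(url)` supplying the file name to save under.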
Program Download:
Web Resource Downloader
Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.