Use Python+xpath to get the download link for https://pypi.python.org/pypi/lxml/2.3/:
After using requests to get the HTML, analyze the tags found in HTML to find the link in <table class= "list" >...</table>
They were then given<trclass="Odd"> and <tr class=" even"> content, which can be written as XPath when using XPath ('//table[@class = ' list ']/ tr[@class = "even" or "odd"]/td/span/a[1]/@href ')
Import re
Import requests
Import Urllib2
From lxml import etree
Url= ' https://pypi.python.org/pypi/lxml/2.3/'
head={' user-agent ': ' mozilla/5.0 (Windows NT 6.1; WOW64) applewebkit/537.36 (khtml, like Gecko) chrome/31.0.1650.63 safari/537.36 '}
def gethtml (URL, *args):
html = requests.get (URL, *args). Content
return HTML
def writfile (cont):
Try
FD = open (' X.txt ', ' W ')
Try
Fd.write (cont)
Finally
Fd.close ()
Except IOError:
Print "File not existing!"
def ReadFile ():
Try
FD = open (' X.txt ', ' R ')
Try
All_the_text = Fd.read ()
Finally
Fd.close ()
Except IOError:
Print "File Open Error!"
Return All_the_text
html = gethtml (URL, head)
Writfile (HTML)
All_text = ReadFile ()
Dom = etree. HTML (All_text)
Url_list = Dom.xpath ('//table[@class = "List"]/tr[@class = "even" or "odd"]/td/span/a[1]/@href ')
For URL in url_list:
Print URL
After testing, the corresponding download link can be obtained normally. As a beginner, the code has a lot of inappropriate places, but also ask Daniel to correct them after reviewing.
This article is from the "Ignorance" blog, please be sure to keep this source http://huangfuligang.blog.51cto.com/9181639/1706837
Use Python+xpath to get download links