Use XPath to extract the text of all tags, even when the tag names differ
# -*- coding: utf-8 -*-
from lxml import etree

# Sample page: each "useful" list mixes <li> with a non-standard <ml> tag.
html = """
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Testing-General usage</title>
</head>
<body>
<div id="content">
    <ul id="useful">
        <li>me</li>
        <ml>is</ml>
        <li>who</li>
    </ul>
    <ul id="useless">
        <li>who </li>
        <li>am </li>
        <li>i! </li>
    </ul>
</div>
<div id="content">
    <ul id="useful"><li>you</li><ml>is</ml><li>who!</li>
    </ul>
    <ul id="useless"><li>who </li><li>you </li><li>are! </li>
    </ul>
</div>

</body>
</html>
"""

selector = etree.HTML(html)

# XPath positions start at 1, so k selects the first and then the second div.
for k in range(1, 3):
    # //text() collects every text node under the list, whatever the tag name.
    chinese = selector.xpath('//div[@id="content"][%s]/ul[@id="useful"]//text()' % k)
    chinese_text = "".join(chinese)
    english = selector.xpath('//div[@id="content"][%s]/ul[@id="useless"]//text()' % k)
    english_text = "".join(english)
    print(chinese_text)
    print(english_text)
Results:
To extract the text of every tag under a node, regardless of tag name, end the XPath expression with the recursive //text() step.
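
The difference between /text() and //text() can be sketched with a smaller page. This is only an illustration: the markup, the ids "box" and "mixed", and the variable names are hypothetical and not part of the example above. It assumes lxml is installed.

```python
from lxml import etree

# Hypothetical sample markup (assumption, not taken from the page above).
SAMPLE = """
<div id="box">
  <ul id="mixed"><li>alpha</li><ml>beta</ml><li>gamma</li></ul>
</div>
"""

root = etree.HTML(SAMPLE)

# /text() returns only text nodes that are direct children of <ul>.
direct = root.xpath('//ul[@id="mixed"]/text()')

# //text() returns every text node below <ul>, whatever the tag name.
nested = root.xpath('//ul[@id="mixed"]//text()')

# li/text() is tied to one tag name, so the text inside <ml> is skipped.
li_only = root.xpath('//ul[@id="mixed"]/li/text()')

print(direct)
print("".join(nested))
print(li_only)
```

When the child tags are not known in advance, //text() avoids listing every tag name. The trade-off is that it also returns whitespace-only text nodes between tags, so the joined string may need stripping.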