XML instance:
Version One:
<?xml version="1.0" encoding="UTF-8"?><Country name="Chain"><Provinces><Heilongjiang name="Citys"><Haerbin/><Daqing/></Heilongjiang><Guangdong name="Citys"><Guangzhou/><Shenzhen/><Huhai/></Guangdong><Taiwan name="Citys"><Taibei/><Gaoxiong/></Taiwan><Xinjiang name="Citys"><WuLuMuQi Waith="Tianqi">Clear</WuLuMuQi></Xinjiang></Provinces></Country>
This version of the file contains no spaces or line breaks between tags.
Python example:
# Python 2 syntax (print statement)
from lxml import etree


class r_xpath_xml(object):
    def __init__(self):
        self.xmetrpa = etree.parse('Info.xml')  # read the XML data

    def xpxm(self):
        xpxlm = self.xmetrpa
        print etree.tostring(xpxlm)  # print the XML data
        root = xpxlm.getroot()  # get the root of the tree
        print root.tag, ' ',  # print the root tag name
        print root.items()  # get the attribute names and values
        for a in root:  # iterate over the first level of tags under the root
            # print the tag name, tag attributes and tag data
            print a.tag, a.items(), a.text, 'the printed type is:', type(a)
            for b in a:
                print b.tag, b.items(), b.text  # , b
                for c in b:
                    print c.tag, c.items(), c.text  # , c
                    for d in c:
                        print d.tag, d.items(), d.text, d
        print xpxlm.xpath('//node()')  # .items() #.tag
        print '====================================================================================================='
        xa = xpxlm.xpath('//Heilongjiang/*')
        print xa
        for xb in xa:
            print xb.tag, xb.items(), xb.text
        xc = xpxlm.xpath('//Xinjiang/*')
        print xc
        for xd in xc:
            print xd.tag, xd.items(), xd.text


if __name__ == '__main__':
    xpx = r_xpath_xml()
    xpx.xpxm()
The nested for loops walk the tag hierarchy level by level. For each element, tag returns the tag name, items() returns the attributes as a list of ('attribute name', 'attribute value') pairs (like the dictionary method of the same name), and text returns the data between the opening and closing tags. The objects that expose tag, items() and text are of type <type 'lxml.etree._Element'>.
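The fixed-depth loops can also be written as a recursive walk, so the depth of the document does not need to be known in advance. The following is only a minimal sketch under a few assumptions: it targets Python 3, the helper name walk is hypothetical, and the XML is an abbreviated inline copy of Version One so that no Info.xml file is needed.

```python
from lxml import etree

# Abbreviated inline copy of the Version One document (assumption: used
# here only so the sketch runs without an Info.xml file on disk).
XML = (
    '<Country name="Chain"><Provinces>'
    '<Heilongjiang name="Citys"><Haerbin/><Daqing/></Heilongjiang>'
    '<Xinjiang name="Citys"><WuLuMuQi Waith="Tianqi">Clear</WuLuMuQi></Xinjiang>'
    '</Provinces></Country>'
)


def walk(element, depth=0):
    """Return (depth, tag, attributes, text) for element and all descendants."""
    rows = [(depth, element.tag, element.items(), element.text)]
    for child in element:  # iterating an element yields its direct children
        rows.extend(walk(child, depth + 1))
    return rows


root = etree.fromstring(XML)
rows = walk(root)
for depth, tag, attrs, text in rows:
    # Indent each line by its depth in the tree.
    print('  ' * depth + tag, attrs, text)
```

Elements without text, such as Haerbin, report None for text, which matches the printed output of the original example.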
Printing results:
<Country name="Chain"><Provinces>
Country [('name', 'Chain')]
Provinces [] None the printed type is: <type 'lxml.etree._Element'>
Heilongjiang [('name', 'Citys')] None
Haerbin [] None
Daqing [] None
Guangdong [('name', 'Citys')] None
Guangzhou [] None
Shenzhen [] None
Huhai [] None
Taiwan [('name', 'Citys')] None
Taibei [] None
Gaoxiong [] None
Xinjiang [('name', 'Citys')] None
WuLuMuQi [('Waith', 'Tianqi')] Clear
[<Element Country at 0x2d47b20>, <Element Provinces at 0x2d47990>, <Element Heilongjiang at 0x2d479b8>, <Element Haerbin at 0x2d47558>, <Element Daqing at 0x2d47328>, <Element Guangdong at 0x2d47300>, <Element Guangzhou at 0x2d476e8>, <Element Shenzhen at 0x2d47530>, <Element Huhai at 0x2d472d8>, <Element Taiwan at 0x2d47260>, <Element Taibei at 0x2d47238>, <Element Gaoxiong at 0x2d47080>, <Element Xinjiang at 0x2d47710>, <Element WuLuMuQi at 0x2d47968>, 'Clear']
=====================================================================================================
[<Element Haerbin at 0x2d479b8>, <Element Daqing at 0x2d47148>]
Haerbin [] None
Daqing [] None
[<Element WuLuMuQi at 0x2d47968>] Type: <type 'list'>
WuLuMuQi [('Waith', 'Tianqi')] Clear
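The XPath calls in the example can be tried in isolation. The sketch below is an illustration, not the author's code: it assumes Python 3 and an abbreviated inline copy of Version One, and it adds one query of its own, '//WuLuMuQi/@Waith', to show that an attribute value can be selected directly.

```python
from lxml import etree

# Abbreviated inline copy of the Version One document (assumption).
XML = (
    '<Country name="Chain"><Provinces>'
    '<Heilongjiang name="Citys"><Haerbin/><Daqing/></Heilongjiang>'
    '<Xinjiang name="Citys"><WuLuMuQi Waith="Tianqi">Clear</WuLuMuQi></Xinjiang>'
    '</Provinces></Country>'
)

root = etree.fromstring(XML)

# '//Heilongjiang/*' selects every child element of any Heilongjiang element.
children = root.xpath('//Heilongjiang/*')
child_tags = [el.tag for el in children]

# '//node()' selects elements and text nodes alike; in Python 3 the text
# nodes come back as str subclasses, so isinstance() tells them apart.
nodes = root.xpath('//node()')
text_nodes = [n for n in nodes if isinstance(n, str)]

# Extra query (not in the original example): select an attribute value.
weather_attr = root.xpath('//WuLuMuQi/@Waith')

print(child_tags)
print(text_nodes)
print(weather_attr)
```

XPath names are case-sensitive, so the element names in the query must match the document exactly.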
XML instance:
Version Two:
<?xml version="1.0" encoding="UTF-8"?>
<Country name="Chain">
<Provinces>
<city:table xmlns:city="http://www.w3school.com.cn/furniture">
<Heilongjiang name="Citys"><city:haerbin/><city:daqing/></Heilongjiang>
<Guangdong name="Citys"><city:guangzhou/><city:shenzhen/><city:zhuhai/></Guangdong>
<Taiwan name="Citys"><city:taibei/><city:gaoxiong/></Taiwan>
<Xinjiang name="Citys"><city:wulumuqi>Clear</city:wulumuqi></Xinjiang>
</city:table>
</Provinces>
</Country>
Example:
print xpxlm.xpath('//node()')
Printing results:
The result now contains the line-break characters between tags as text nodes, and the tag names of the city elements carry the namespace.
[<Element Country at 0x2e79b20>, '\n', <Element Provinces at 0x2e79990>, '\n', <Element {http://www.w3school.com.cn/furniture}table at 0x2e79710>, '\n', <Element Heilongjiang at 0x2e799b8>, <Element {http://www.w3school.com.cn/furniture}haerbin at 0x2e79328>, <Element {http://www.w3school.com.cn/furniture}daqing at 0x2e79968>, '\n', <Element Guangdong at 0x2e79530>, <Element {http://www.w3school.com.cn/furniture}guangzhou at 0x2e79300>, <Element {http://www.w3school.com.cn/furniture}shenzhen at 0x2e792d8>, <Element {http://www.w3school.com.cn/furniture}zhuhai at 0x2e79260>, '\n', <Element Taiwan at 0x2e79238>, <Element {http://www.w3school.com.cn/furniture}taibei at 0x2e79080>, <Element {http://www.w3school.com.cn/furniture}gaoxiong at 0x2e79058>, '\n', <Element Xinjiang at 0x2e796e8>, <Element {http://www.w3school.com.cn/furniture}wulumuqi at 0x2e79558>, 'Clear', '\n', '\n', '\n']
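Because lxml reports namespaced tags in the form {namespace URI}local-name, an XPath query for those elements needs a prefix mapping. The following sketch shows one way to do this, assuming Python 3 and an abbreviated inline copy of Version Two; the dictionary name NS is hypothetical, while the namespaces argument of xpath() and etree.QName are part of lxml.

```python
from lxml import etree

# Abbreviated inline copy of the Version Two document (assumption).
XML = """<Country name="Chain">
<Provinces>
<city:table xmlns:city="http://www.w3school.com.cn/furniture">
<Heilongjiang name="Citys"><city:haerbin/><city:daqing/></Heilongjiang>
<Xinjiang name="Citys"><city:wulumuqi>Clear</city:wulumuqi></Xinjiang>
</city:table>
</Provinces>
</Country>"""

# Prefix-to-URI mapping used only by the XPath engine; the prefix chosen
# here does not have to match the one in the document, the URI does.
NS = {'city': 'http://www.w3school.com.cn/furniture'}

root = etree.fromstring(XML)

# 'city:*' matches any child element in the city namespace.
city_elements = root.xpath('//Heilongjiang/city:*', namespaces=NS)

# .tag holds the fully qualified name; QName splits off the local part.
full_tags = [el.tag for el in city_elements]
local_names = [etree.QName(el).localname for el in city_elements]

print(full_tags)
print(local_names)
```

Heilongjiang itself has no prefix and no default namespace applies, so it is matched by its plain name.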
Removing the whitespace nodes:
xp = xpxlm.xpath('//node()')
print xp  # .items() #.tag
for i in xp:
    if ' ' in i or '\n' in i:  # skip space and line-break text nodes
        continue
    else:
        print i.tag
The space and line-break nodes are filtered out with a conditional check.
Output Result:
Provinces
{city}table
Heilongjiang
{city}haerbin
{city}daqing
Guangdong
{city}guangzhou
{city}shenzhen
{city}zhuhai
Taiwan
{city}taibei
{city}gaoxiong
Xinjiang
{city}wulumuqi
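A check for spaces or line breaks only skips whitespace strings; a text node such as 'Clear' is also a plain string and has no tag attribute. The sketch below shows two alternative ways to get the element tags only. It assumes Python 3 and an abbreviated inline copy of Version Two, and the helper name is_blank_text is hypothetical.

```python
from lxml import etree

# Abbreviated inline copy of the Version Two document (assumption).
XML = """<Country name="Chain">
<Provinces>
<city:table xmlns:city="http://www.w3school.com.cn/furniture">
<Heilongjiang name="Citys"><city:haerbin/><city:daqing/></Heilongjiang>
<Xinjiang name="Citys"><city:wulumuqi>Clear</city:wulumuqi></Xinjiang>
</city:table>
</Provinces>
</Country>"""

root = etree.fromstring(XML)


def is_blank_text(node):
    """True for text nodes that contain nothing but whitespace."""
    return isinstance(node, str) and node.strip() == ''


# Variant 1: keep '//node()' and drop the whitespace-only text nodes.
nodes = [n for n in root.xpath('//node()') if not is_blank_text(n)]
# Remaining strings (real text such as 'Clear') have no .tag, so skip them.
tags = [n.tag for n in nodes if not isinstance(n, str)]

# Variant 2: '//*' selects elements only, so no filtering is needed.
element_tags = [el.tag for el in root.xpath('//*')]

for tag in tags:
    print(tag)
```

Both variants yield the same list here; the second is shorter when the text nodes are not needed at all.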
Traversing XML in Python with the lxml library and XPath queries (tag names, attribute names, attribute values, and tag text)