Python offers three common ways to parse XML: SAX, DOM, and ElementTree. Of these, ElementTree is the easiest to use: its API is convenient and friendly, the code is easy to work with, and it is fast with low memory consumption.
An XML element mainly consists of three parts: a tag, a value (its text), and attributes.
A simple example of parsing XML with Python follows.
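To make these three parts concrete, here is a minimal sketch that parses a small inline string rather than a file. The XML snippet and the `describe` helper are made up for illustration; only `fromstring`, `tag`, `text`, and `attrib` come from ElementTree.

```python
import xml.etree.ElementTree as et

# Hypothetical one-element document, used only for illustration.
SNIPPET = '<item id="001" width="60">hello</item>'


def describe(xml_text):
    """Return the tag, text value, and attribute dict of the root element."""
    elem = et.fromstring(xml_text)
    return elem.tag, elem.text, dict(elem.attrib)


tag, value, attributes = describe(SNIPPET)
print("tag: %s" % tag)
print("value: %s" % value)
print("attributes: %s" % attributes)
```

Note that attribute values always come back as strings, so a numeric attribute such as `width` has to be converted explicitly if you need a number.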
The XML file is:

<?xml version="1.0" encoding="utf-8"?>
<info>
  <list id='001' price="" width="60">
The Python code is:
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as et


def print_node(node):
    """Print the tag, text, and attributes of a node."""
    print("===================================")
    print("node.tag: %s" % node.tag)
    print("node.text: %s" % node.text)
    print("node.attribute: %s" % node.attrib)


# Read the XML file
def load_xml_file(filename):
    tree = et.parse(filename)
    # iter() replaces getiterator(), which was removed in Python 3.9
    nodes = tree.iter("list")
    for node in nodes:
        print_node(node)


if __name__ == '__main__':
    load_xml_file(r'sample.xml')

As the output shows, a node's attributes (node.attrib) are stored in a dictionary, so you can iterate over them like any other dictionary to get the value of each attribute:
def print_node(node):
    print("===================================")
    print("node.tag: %s" % node.tag)
    print("node.text: %s" % node.text)
    # print("node.attr: %s" % node.attrib)
    mapAttrib = node.attrib
    for key in mapAttrib:
        print("%s=%s" % (key, mapAttrib[key]))

If you know the name of a specific attribute, you can also get its value directly with the get method:
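Because `attrib` behaves like a regular dictionary, the usual dictionary idioms apply. The following is a small sketch under that assumption; the element and the `format_attributes` helper are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as et

# Hypothetical element; attribute names are chosen for illustration only.
node = et.fromstring('<list id="001" width="60"/>')


def format_attributes(elem):
    """Return 'key=value' strings for every attribute, sorted by key."""
    return ["%s=%s" % (key, value) for key, value in sorted(elem.attrib.items())]


lines = format_attributes(node)
for line in lines:
    print(line)

# Membership tests work as they do on a regular dict.
has_width = "width" in node.attrib
has_height = "height" in node.attrib
```

Sorting the items is optional; it is used here only to make the printed order predictable.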
def print_node(node):
    print("===================================")
    print("node.tag: %s" % node.tag)
    print("node.text: %s" % node.text)
    print("id: %s" % node.get("id"))

How do you traverse the elements of an XML document? ElementTree provides four methods for this: getiterator, getchildren, find, and findall. Note that getiterator and getchildren were removed in Python 3.9; iter() and list(node) are their replacements. The methods are used as follows:
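One detail worth knowing about `get` is how it treats a missing attribute. The sketch below, which uses a hypothetical element with no `height` attribute, contrasts `get` with direct indexing of `attrib`.

```python
import xml.etree.ElementTree as et

# Hypothetical element that has an "id" attribute but no "height".
node = et.fromstring('<list id="001" width="60"/>')

node_id = node.get("id")                     # attribute exists
height = node.get("height")                  # missing: returns None
height_or_default = node.get("height", "0")  # missing: returns the default

# Indexing the dictionary directly raises KeyError for a missing attribute.
try:
    node.attrib["height"]
    raised = False
except KeyError:
    raised = True

print(node_id, height, height_or_default, raised)
```

This makes `get` the more forgiving choice when an attribute may be absent, while `attrib[...]` is useful when a missing attribute should be treated as an error.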
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as et


def print_node(node):
    print("===================================")
    print("node.tag: %s" % node.tag)
    print("node.text: %s" % node.text)
    print("id: %s" % node.get("id"))


# Read the XML file
def load_xml_file(filename):
    tree = et.parse(filename)

    # Use iter(), the replacement for getiterator() (removed in Python 3.9)
    nodes = list(tree.iter("list"))
    for node in nodes:
        print_node(node)

    # Use list(node), the replacement for getchildren() (removed in Python 3.9)
    children = list(nodes[0])
    for child in children:
        print_node(child)

    # Use the find method; it returns the first match or None
    specnode = tree.find("list")
    if specnode is not None:
        print_node(specnode)

    # Use the findall method; [1] selects the second match
    node_findall = tree.findall("list/name")[1]
    print_node(node_findall)


if __name__ == '__main__':
    load_xml_file(r'F:\python\Advance\xml\sample.xml')
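The traversal methods can also be tried without a file on disk. The following is a self-contained sketch that parses an inline string; the sample data, including the `name` children and the `id` values, is an assumption made for this example and is not the article's sample.xml.

```python
import xml.etree.ElementTree as et

# Hypothetical sample data; the element and attribute names below are
# assumptions made for this sketch, not the article's sample.xml.
SAMPLE = """
<info>
  <list id="001">
    <name>first</name>
    <name>second</name>
  </list>
  <list id="002">
    <name>third</name>
  </list>
</info>
"""

root = et.fromstring(SAMPLE)

# iter() walks the whole subtree and yields every matching element.
list_ids = [node.get("id") for node in root.iter("list")]

# find() returns the first match or None.
first_list = root.find("list")

# Iterating an element yields its direct children only.
child_texts = [child.text for child in first_list]

# findall() returns a list of all elements matching the path.
all_names = [name.text for name in root.findall("list/name")]

print(list_ids)
print(child_texts)
print(all_names)
```

In short, `iter` searches the entire subtree, iterating an element visits only its direct children, `find` returns a single element, and `findall` returns every match for a path.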