ElementTree is a Python module for processing XML that provides a lightweight object model. It has been part of the Python standard library since Python 2.5; on Python 2.4 and earlier it must be installed separately. To use the module, import xml.etree.ElementTree. An ElementTree object represents the entire XML node tree, and an Element object represents a single node in that tree.
Building an XML file
ElementTree(element) creates an ElementTree object, where element is the root node of the tree. Element(tag, attrib={}, **extra) constructs a node such as the root of the XML document, where tag is the name of the node and attrib is an optional dictionary of the node's attributes. SubElement(parent, tag, attrib={}, **extra) constructs a child node under an already existing node. Element.text holds the text content of an element (nodes created with SubElement have it too), while Element.tag and Element.attrib hold the tag name and the attributes of the element, respectively. ElementTree.write(file, encoding='us-ascii', xml_declaration=None, default_namespace=None, method='xml') writes the node tree to an XML file, creating the file if it does not exist.
# encoding=utf-8
import xml.etree.ElementTree as ET

# Build a new XML file
def buildnewsxmlfile():
    # Create a new node and set its tag to "root"
    root = ET.Element("root")
    # Create two child nodes under root, named sina and chinabyte
    sina = ET.SubElement(root, "sina")
    chinabyte = ET.SubElement(root, "chinabyte")
    # Create two child nodes under sina, named number and first
    sina_number = ET.SubElement(sina, "number")
    sina_number.text = "1"
    sina_first = ET.SubElement(sina, "first")
    sina_first.text = "http://roll.tech.sina.com.cn/internet_all/index_1.shtml"
    # Create two child nodes under chinabyte, named number and first
    chinabyte_number = ET.SubElement(chinabyte, "number")
    chinabyte_number.text = "1"
    chinabyte_first = ET.SubElement(chinabyte, "first")
    chinabyte_first.text = "http://www.chinabyte.com/more/124566.shtml"
    # Store the node tree in an ElementTree and save it as an XML file
    tree = ET.ElementTree(root)
    tree.write("urlfile.xml")
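The attrib and **extra parameters and the optional arguments of write() can be sketched briefly as follows. This is a minimal illustration, not part of the original example: the tag name "site" and the attribute names "version" and "name" are made up, and an in-memory buffer stands in for a file on disk.

```python
import io
import xml.etree.ElementTree as ET

# Attributes can be passed as a dict (attrib) or as keyword arguments (**extra).
root = ET.Element("root", attrib={"version": "1"})
site = ET.SubElement(root, "site", name="sina")  # "site" is an illustrative tag
site.text = "http://roll.tech.sina.com.cn/internet_all/index_1.shtml"

# tag, attrib and text are plain attributes on every Element.
print(site.tag, site.attrib, site.text)

# write() accepts a file name or a file object; a binary buffer is used here
# so that the sketch does not touch the disk.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
xml_bytes = buffer.getvalue()
print(xml_bytes.decode("utf-8"))
```

Passing xml_declaration=True forces the XML declaration to be written; with the default of None it is emitted only for encodings other than US-ASCII, UTF-8 or Unicode.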
Parsing and modifying XML files
ElementTree.parse(source, parser=None) loads the XML file and returns an ElementTree object. parser is an optional parameter; if it is omitted, the standard XMLParser parser is used by default.
ElementTree.getroot() returns the Element object of the root node.
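A minimal sketch of parse() and getroot(), assuming a small in-memory document in place of a real file (parse() accepts either a file name or a file object):

```python
import io
import xml.etree.ElementTree as ET

# A small in-memory document stands in for a file on disk.
source = io.StringIO(
    "<data><country name='Panama'><rank>68</rank></country></data>"
)

tree = ET.parse(source)  # returns an ElementTree
root = tree.getroot()    # returns the root Element

print(type(tree).__name__, root.tag)

# Iterating over an Element yields its direct children.
for child in root:
    print(child.tag, child.attrib)
```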
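Element.remove(subelement) removes the child node subelement from the element. The argument is the Element object itself rather than a tag name, so the child has to be looked up first, for example with find().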
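Because remove() expects the child object, a typical pattern is to look the children up and then remove the ones that qualify. The sketch below assumes a trimmed two-country document, and the threshold of 50 is arbitrary.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Liechtenstein'><rank>1</rank></country>"
    "<country name='Panama'><rank>68</rank></country>"
    "</data>"
)

# findall() returns a list, so removing children does not disturb the loop.
for country in root.findall("country"):
    if int(country.findtext("rank")) > 50:  # 50 is an arbitrary threshold
        root.remove(country)

remaining = [c.get("name") for c in root]
print(remaining)
```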
The following methods are available on both ElementTree and Element objects.
find(match) gets the first child node that matches match, which can be either a tag name or a path. It returns an Element, or None if nothing matches.
findtext(match, default=None) gets the text content of the first element that matches match, or default if nothing matches.
findall(match) gets all child nodes that match match, which can be either a tag name or a path. It returns a list of the matching elements.
iter(tag=None) creates an iterator with the current node as its root; it walks the node and all of its descendants, optionally restricted to elements named tag.
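The lookup methods can be compared side by side in a short sketch. It assumes a trimmed two-country document, and the "country/area" path is deliberately one that does not exist, to show the default value.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Liechtenstein'><rank>1</rank><neighbor name='Austria'/></country>"
    "<country name='Singapore'><rank>4</rank></country>"
    "</data>"
)

# find(): first matching child, or None when nothing matches.
first = root.find("country")
# findtext(): text of the first match; a path can reach into children.
first_rank = root.findtext("country/rank")
# findtext() falls back to default when the path matches nothing.
missing = root.findtext("country/area", default="n/a")
# findall(): list of all matching children.
names = [c.get("name") for c in root.findall("country")]
# iter(): walks the whole subtree, so nested rank elements are found.
ranks = [r.text for r in root.iter("rank")]

print(first.get("name"), first_rank, missing, names, ranks)
```

Note that find() and findall() with a bare tag name only look at direct children, whereas iter() descends to any depth.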
Here is a sample XML file:
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
The following code parses and then modifies the XML file:
import xml.etree.ElementTree as ET

# Parse an XML file
def parsexmlfile(xml_name):
    # Load the XML file and return an ElementTree object
    tree = ET.parse(xml_name)
    # Get the first Element object that matches the country tag
    first_country = tree.find("country")
    # Print the text of each child node of that country
    for sub_tag in first_country:
        print(sub_tag.text)
    # Get a list of all Element objects that match the country tag
    list_country = tree.findall("country")
    for country in list_country:
        for sub_tag in country:
            print(sub_tag.text)
    # Modify the XML file: increment every rank and mark it as updated
    for rank in tree.iter('rank'):
        new_rank = int(rank.text) + 1
        rank.text = str(new_rank)
        rank.set('updated', 'yes')
    tree.write(xml_name)
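The first loop prints 1, 2008, 141100, followed by None for each of the two neighbor nodes, which have no text. The second loop prints 1, 2008, 141100, 4, 2011, 59900, 68, 2011, 13600, again with None printed for every neighbor node. The modified XML file is as follows: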
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>