Python uses the ElementTree module to process XML, and pythonelementtree
Preface
Recently, due to work requirements, we are using Python to send SOAP requests to test the performance of Web Services. Because SOAP is based on XML, it is inevitable to use python to process XML data. After comparing several solutions, we finally choose to usexml.etree.ElementTree
Module.
This article records the usagexml.etree.ElementTree
Several common operations of the module can be summarized to avoid forgetting them later. If you want to share your ideas and methods, you can refer to them for more information. Let's take a look at the detailed introduction.
Overview
Compared with other Python solutions for XML processing,xml.etree.ElementTree
The module (represented by ET below) is relatively simple and the interface is friendly.
The official document details the ET module. In general, the ET module can be divided into three parts: ElementTree class, Element class, and some XML functions.
XML can be viewed as a tree structure. ET uses the ElementTree class to represent the entire XML document and the Element class to represent a node of XML. Operations on XML documents are generally performed on ElementTree objects, while operations on XML nodes are generally performed on Element objects.
Parse XML files
The ET module supports constructing ElementTree objects from an XML file. For example, the content of example. XML in our xml file is as follows (this XML document will be used later ):
<?xml version="1.0" encoding="utf-8"?><data> <country name="Liechtenstein"> <rank>1</rank> <year>2008</year> <gdppc>141100</gdppc> <neighbor name="Austria" direction="E"/> <neighbor name="Switzerland" direction="W"/> </country> <country name="Singapore"> <rank>4</rank> <year>2011</year> <gdppc>59900</gdppc> <neighbor name="Malaysia" direction="N"/> </country></data>
You can useparse()
Function to construct an ElementTree object from the specified XML file:
Import xml. etree. elementTree as ET # obtain the XML Document Object ElementTreetree = ET. parse ('example. xml ') # obtain the root node Elementroot = tree of the XML document object. getroot () # print the root node name print root. tag
After the ElementTree object is constructed from the XML file, you can also obtain its node, or continue to perform further operations on the node.
Parse XML strings
The fromstring () function of the ET module provides the ability to construct an Element object from an XML string.
xml_str = ET.tostring(root)print xml_strroot = ET.fromstring(xml_str)print root.tag
Then, we usetostring()
Function to convert the above constructed root object into a string, and then usefromstring()
The function re-constructs an Element object and assigns it to the root variable. In this case, root represents the root node of the entire XML file.
Construct XML
If we need to construct an XML document, we can use the Element class of the ET module andSubElement()
Function.
You can use the Element class to generate an Element object as the root node, and then useET.SubElement()
Function generation subnode.
a = ET.Element('a')b = ET.SubElement(a, 'b')b.text = 'leehao.me'c = ET.SubElement(a, 'c')c.attrib['greeting'] = 'hello'd = ET.SubElement(a, 'd')d.text = 'www.leehao.me'xml_str = ET.tostring(a, encoding='UTF-8')print xml_str
Output:
<?xml version='1.0' encoding='UTF-8'?><a><b>leehao.me</b><c greeting="hello" /><d>www.leehao.me</d></a>
If you want to output data to a file, you can continue to useElementTree.write()
Method to handle:
# Construct an ElementTree to use its write method tree = ET. ElementTree (a) tree. write ('A. xml', encoding = 'utf-8 ')
After execution, an XML file a. xml is generated:
<?xml version='1.0' encoding='UTF-8'?><a><b>leehao.me</b><c greeting="hello" /><d>www.leehao.me</d></a>
Search and update XML nodes
1. Search for XML nodes
The Element class providesElement.iter()
To find the specified node.Element.iter()
Recursively searches for all child nodes to find all nodes that meet the conditions.
# Obtain the XML Document Object ElementTreetree = ET. parse ('example. xml ') # obtain the root node Elementroot = tree of the XML document object. getroot () # Recursively search for all the neighbor subnodes for neighbor in root. iter ('neighbor '): print neighbor. attrib
Output:
{'direction': 'E', 'name': 'Austria'}{'direction': 'W', 'name': 'Switzerland'}{'direction': 'N', 'name': 'Malaysia'}
If you useElement.findall()
OrElement.find()
Method, it will only be searched from the direct subnode of the node, and will not be recursively searched.
for country in root.findall('country'): rank = country.find('rank').text name = country.get('name') print name, rank
Output:
Liechtenstein 1Singapore 4
2. Update the node
To update the node text, you can directly modifyElement.text
. To update node attributes, you can directly modifyElement.attrib
.
After updating the node, you can useElementTree.write()
Method To write the updated XML document to a file.
# Obtain the XML Document Object ElementTreetree = ET. parse ('example. xml ') # obtain the root node Elementroot = tree of the XML document object. getroot () for rank in root. iter ('rank '): new_rank = int (rank. text) + 1 rank. text = str (new_rank) rank. attrib ['updated'] = 'yes' tree. write ('output. xml', encoding = 'utf-8 ')
The newly generated output. xml file is as follows:
<?xml version='1.0' encoding='UTF-8'?><data> <country name="Liechtenstein"> <rank updated="yes">2</rank> <year>2008</year> <gdppc>141100</gdppc> <neighbor direction="E" name="Austria" /> <neighbor direction="W" name="Switzerland" /> </country> <country name="Singapore"> <rank updated="yes">5</rank> <year>2011</year> <gdppc>59900</gdppc> <neighbor direction="N" name="Malaysia" /> </country></data>
Compare the example. xml file to see that the output. xml file has been updated.
Summary
The above is all the content of this article. I hope the content of this article will help you in your study or work. If you have any questions, please leave a message, thank you for your support.
References
- Https://docs.python.org/2/library/xml.html#xml-vulnerabilities
- Https://stackoverflow.com/questions/1912434/how-do-i-parse-xml-in-python