Parsing an XML file with Python: an example

Last Update:2017-04-23 Source: Internet

Author: User


Use the xml.etree.ElementTree module to parse the XML file as follows. The module provides two classes for this purpose:

ElementTree represents the entire XML file (a tree structure)
Element represents one of the elements in the tree (node)
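
The relationship between the two classes can be sketched as follows. This is only an illustration: the document in SAMPLE is made up and is not the article's migapp.xml.

```python
import xml.etree.ElementTree as ET

# Made-up document for illustration; the element names are assumptions.
SAMPLE = '<migration><component type="Application"/></migration>'

root = ET.fromstring(SAMPLE)   # fromstring() returns the root Element
tree = ET.ElementTree(root)    # wrap the root to get an object for the whole document

print(isinstance(tree, ET.ElementTree))  # the tree object
print(isinstance(root, ET.Element))      # a single node
print(tree.getroot() is root)            # the tree hands back the same root node
```

In short, an ElementTree is what gets read from or written to a file, while Element objects are what you navigate and modify.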

We work with the following XML file: migapp.xml

We can import the ElementTree module as follows: import xml.etree.ElementTree as ET

Alternatively, you can import only the parse function: from xml.etree.ElementTree import parse

First, open the XML file. For a local file, use the built-in open() function; for a file on the Internet, use urlopen():

f = open('migapp.xml', 'rt', encoding='utf-8')
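
The two ways of opening a source could be wrapped roughly like this. It is only a sketch: the helper names parse_local and parse_remote and the file name sample.xml are made up for illustration, and parse_remote is defined but not called because it needs a network connection.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def parse_local(path):
    # A with-block closes the file even if parsing raises an error.
    with open(path, 'rt', encoding='utf-8') as f:
        return ET.parse(f)


def parse_remote(url):
    # urlopen() returns a file-like object that ET.parse() can read directly.
    with urlopen(url) as response:
        return ET.parse(response)


# Demo on a throwaway local file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.xml')
    with open(path, 'w', encoding='utf-8') as out:
        out.write('<migration><component/></migration>')
    tree = parse_local(path)

print(tree.getroot().tag)
```

ET.parse() also accepts a file name directly, so the explicit open() is optional for local files.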

The XML is then parsed.

1 Parsing XML files

1.1 Parsing the root element

tree = ET.parse(f)
root = tree.getroot()
print('root.tag =', root.tag)
print('root.attrib =', root.attrib)
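
Since the contents of migapp.xml are not reproduced here, the following sketch runs the same steps against a small made-up document held in memory. The element name migration and the attribute urlid are assumptions chosen for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Made-up stand-in for migapp.xml.
SAMPLE = '<migration urlid="http://example.com/sample"><component type="Application"/></migration>'

tree = ET.parse(io.StringIO(SAMPLE))  # parse() accepts any file-like object
root = tree.getroot()

print('root.tag =', root.tag)         # root.tag = migration
print('root.attrib =', root.attrib)   # root.attrib = {'urlid': 'http://example.com/sample'}
```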

1.2 Parsing the children of the root

for child in root:  # visits only the direct children of root, not deeper descendants
    print(child.tag)
    print(child.attrib)  # attrib is a dict

1.3 Parsing descendants of the root by index

print(root[1][1].tag)
print(root[1][1].text)
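
Index access treats each element as a sequence of its children, so root[1][1] means "the second child of the second child of the root". A minimal sketch on a made-up document (the element names are assumptions):

```python
import xml.etree.ElementTree as ET

# Made-up document for illustration.
SAMPLE = """
<migration>
  <component type="System"/>
  <component type="Application">
    <displayName>First</displayName>
    <role>Settings</role>
  </component>
</migration>
"""
root = ET.fromstring(SAMPLE)

print(len(root))           # 2  (number of direct children)
print(root[1][1].tag)      # role
print(root[1][1].text)     # Settings
```

As with a list, an index that is out of range raises IndexError, so this style suits documents whose layout is known in advance.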

1.4 Iterating over all elements with a given tag

for element in root.iter('environment'):
    print(element.attrib)
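
The difference between looping over an element directly and calling iter() can be seen in a small sketch. The document below is made up; only the tag name environment is taken from the example above.

```python
import xml.etree.ElementTree as ET

# Made-up document with one top-level and one nested <environment>.
SAMPLE = """
<migration>
  <environment name="top"/>
  <component>
    <environment name="nested"/>
  </component>
</migration>
"""
root = ET.fromstring(SAMPLE)

direct = [child.tag for child in root]                          # direct children only
found = [env.get('name') for env in root.iter('environment')]   # whole subtree, document order

print(direct)   # ['environment', 'component']
print(found)    # ['top', 'nested']
```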

1.5 A few useful methods

# Element.findall() returns all direct children with the given tag
# Element.find() returns the first direct child with the given tag
# Element.get() returns the value of the given attribute
for environment in root.findall('environment'):
    first_variable = environment.find('variable')
    print(first_variable.get('name'))
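
One thing to watch for with these methods is that find() returns None when nothing matches, so calling get() on the result can fail. A hedged sketch on a made-up document (element and attribute names are assumptions):

```python
import xml.etree.ElementTree as ET

# Made-up document: one <environment> with variables, one empty, one nested deeper.
SAMPLE = """
<migration>
  <environment>
    <variable name="A"/>
    <variable name="B"/>
  </environment>
  <environment/>
  <component>
    <environment><variable name="C"/></environment>
  </component>
</migration>
"""
root = ET.fromstring(SAMPLE)

names = []
for environment in root.findall('environment'):      # direct children of root only
    first_variable = environment.find('variable')    # None if there is no such child
    if first_variable is not None:
        names.append(first_variable.get('name'))

print(names)                                          # ['A']
print(root.find('component').get('missing', 'n/a'))   # n/a (default for an absent attribute)
```

The nested environment under component is not reported, because findall('environment') with a plain tag name only looks at direct children.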

2 Modifying an XML file

Suppose we need to add an attribute size="50" to each text element, change its text to "Benxin Tuzi", and add a child element date containing "2016/01/16".

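for text in list(root.iter('text')):  # take a snapshot first, because the loop adds children
    text.set('size', '50')
    text.text = 'Benxin Tuzi'
    date = ET.Element('date')
    date.text = '2016/01/16'  # set .text; passing text=... to ET.Element() would create an attribute named "text"
    text.append(date)
tree.write('output.xml')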
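
The same modification can be tried end to end on a small made-up document. This is a sketch under stated assumptions: the input in SAMPLE is invented, and the result is written to an in-memory buffer rather than to output.xml. It also shows that keyword arguments passed to ET.Element() become attributes rather than text content.

```python
import io
import xml.etree.ElementTree as ET

# Made-up input; the element names are assumptions for illustration.
SAMPLE = '<migration><variable name="A"><text>old</text></variable></migration>'
tree = ET.ElementTree(ET.fromstring(SAMPLE))

# Keyword arguments to Element() become attributes, not text content.
as_attribute = ET.Element('date', text='2016/01/16')
print(ET.tostring(as_attribute, encoding='unicode'))

# list() takes a snapshot, so adding children inside the loop is safe.
for text in list(tree.getroot().iter('text')):
    text.set('size', '50')
    text.text = 'Benxin Tuzi'
    date = ET.SubElement(text, 'date')  # create and attach the child in one step
    date.text = '2016/01/16'            # text content is set through .text

# Write to an in-memory buffer so nothing touches the disk.
buffer = io.BytesIO()
tree.write(buffer, encoding='utf-8', xml_declaration=True)
result = buffer.getvalue().decode('utf-8')
print(result)
```

ET.SubElement() is used here as a shorter alternative to creating an Element and calling append(); both produce the same tree.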

Part of migapp.xml:

The corresponding part of output.xml:

3 Notes

Do not name your script xml.py; otherwise the following error occurs:

ImportError: No module named 'xml.etree'; 'xml' is not a package

Analysis:

This happens because the import system searches the current directory first, finds the xml.py module there, and uses it in place of the standard library's xml package. The xml.py we wrote is, of course, not a package.

Attention:

After deleting xml.py, the import may still fail, because a compiled xml.pyc may have been generated in the current directory and the interpreter can still import it. That file must be deleted as well to resolve the problem. This applies to interpreters that write .pyc files next to the source, as Python 2 does; Python 3 stores them in a __pycache__ directory and ignores them once the source file is gone.

Conclusion:

Try not to give a file the same name as an existing package or module. Even if your script does not use that package or module itself, strange errors can occur.
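
One way to check which file Python would actually load for a given name is to ask the import system for its module spec. The helper name describe_module below is made up for illustration; importlib.util.find_spec is part of the standard library.

```python
import importlib.util


def describe_module(name):
    """Return (origin, is_package) for the module that `import name` would load."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None, False
    # Packages have a list of locations to search for submodules; plain modules do not.
    return spec.origin, spec.submodule_search_locations is not None


origin, is_package = describe_module('xml')
print(origin)       # path of the file Python would import
print(is_package)   # False would suggest a local xml.py is shadowing the package
```

If the printed path points into your project directory instead of the Python installation, a local file is shadowing the standard library.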

Many of the parsing functions in the ElementTree module read the entire XML document into memory before returning. That is a poor fit for large documents, and especially for XML arriving over a network connection or a pipe, where non-blocking parsing matters. In that case we can use the XMLPullParser class from the ElementTree module. Alternatively, we can use the module's iterparse() function, which yields elements as they are parsed, so elements that have been processed can be discarded instead of keeping the whole document in memory.
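
Both approaches can be sketched on a small in-memory document. The sample data, the split into two chunks, and the helper name drain are assumptions made for illustration; in practice the chunks would come from a socket or pipe.

```python
import io
import xml.etree.ElementTree as ET

# Made-up document; in practice this would be a large file or a network stream.
SAMPLE = b'<migration><environment name="a"/><environment name="b"/></migration>'

# iterparse(): reads the source itself and yields elements as they are completed.
names = []
for event, elem in ET.iterparse(io.BytesIO(SAMPLE), events=('end',)):
    if elem.tag == 'environment':
        names.append(elem.get('name'))
        elem.clear()  # discard content that is no longer needed


def drain(parser, out):
    # Collect the names of all <environment> elements completed so far.
    for event, elem in parser.read_events():
        if elem.tag == 'environment':
            out.append(elem.get('name'))


# XMLPullParser: the caller feeds data whenever it arrives, so nothing blocks on reading.
parser = ET.XMLPullParser(events=('end',))
pulled = []
for chunk in (SAMPLE[:20], SAMPLE[20:]):
    parser.feed(chunk)
    drain(parser, pulled)
parser.close()
drain(parser, pulled)  # pick up any events the parser was still holding back

print(names)    # ['a', 'b']
print(pulled)   # ['a', 'b']
```

The call to elem.clear() is what keeps memory use down with iterparse(); without it the full tree is still built as parsing proceeds.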

