This article shows how to parse XML with ElementTree in Python. It introduces the basic concepts of XML, compares several ways of parsing XML, and walks through ElementTree parsing examples.
"Introduction to Basic XML Concepts"
XML stands for eXtensible Markup Language.
XML is designed to transmit and store data.
Concept One:
The code is as follows:
<foo>     # start tag of the foo element
</foo>    # end tag of the foo element
          # Note: every start tag must have a matching end tag to close it; an empty element can also be written as <foo/>
Concept two:
The code is as follows:
<foo>             # elements can be nested to any depth
    <bar></bar>   # the bar element is a child of the foo element
</foo>            # end tag of the parent element foo
Concept three:
The code is as follows:
<foo lang='EN'>                      # foo has a lang attribute whose value is EN; attributes correspond to the name-value pairs of a Python dictionary
    <bar lang="CH" id='001'></bar>   # bar has a lang attribute whose value is CH and an id attribute whose value is 001; values are enclosed in '' or ""
</foo>                               # the lang attribute of bar does not conflict with the one on foo; each element has its own set of attributes
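The dictionary correspondence mentioned above can be checked with a short sketch. It uses the standard xml.etree.ElementTree module; the XML string is an illustrative assumption modelled on the foo/bar example, not part of the original article.

```python
import xml.etree.ElementTree as ET

# Illustrative document (an assumption): foo and bar each carry their own lang attribute.
xml_text = "<foo lang='EN'><bar lang='CH' id='001'></bar></foo>"

foo = ET.fromstring(xml_text)   # root element <foo>
bar = foo.find('bar')           # first child element named bar

# attrib is a plain dict that maps attribute names to string values.
print(foo.attrib)
print(bar.attrib['lang'])
print(bar.attrib['id'])
```

Each element keeps its own dictionary, so reading lang on bar never touches the lang attribute of foo.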
Concept four:
The code is as follows:
<title>Dive into Mark</title>   # an element can have text content
                                # Note: an element with no text content and no child elements is an empty element.
Concept Five:
The code is as follows:
<info>                # the info element is the root node
    <list>A</list>    # each list element is a child node
    <list>B</list>
    <list>C</list>
</info>
Concept Six:
The code is as follows:
<feed xmlns='http://www.w3.org/2005/Atom'>   # declaring xmlns defines a default namespace; the feed element is in the http://www.w3.org/2005/Atom namespace
    <title>Dive into Mark</title>            # so is the title element: a namespace declaration applies to the element that declares it and to all of its child elements
</feed>
You can also define a namespace with an xmlns:prefix declaration, which binds the namespace to the name prefix.
Every element in that namespace must then be written explicitly with this prefix.
<atom:feed xmlns:atom='http://www.w3.org/2005/Atom'>   # feed belongs to the namespace bound to the prefix atom
    <atom:title>Dive into Mark</atom:title>            # the title element also belongs to this namespace
</atom:feed>                                           # xmlns stands for XML namespace
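To see how the two namespace spellings look from Python, here is a minimal sketch using xml.etree.ElementTree. The two XML strings and the constant name ATOM are illustrative assumptions based on the feed/title example.

```python
import xml.etree.ElementTree as ET

ATOM = 'http://www.w3.org/2005/Atom'

# The same document written with a default namespace and with an explicit prefix.
default_ns = "<feed xmlns='%s'><title>dive into mark</title></feed>" % ATOM
prefixed = ("<atom:feed xmlns:atom='%s'>"
            "<atom:title>dive into mark</atom:title></atom:feed>" % ATOM)

feed_a = ET.fromstring(default_ns)
feed_b = ET.fromstring(prefixed)

# ElementTree stores a namespaced name as '{namespace-uri}local-name',
# so both spellings produce the same tag.
print(feed_a.tag)
print(feed_a.tag == feed_b.tag)

# Child elements must be looked up by their qualified name as well.
title = feed_b.find('{%s}title' % ATOM)
print(title.text)
```

A lookup with the bare name, such as feed_a.find('title'), returns None here because the title element lives in the Atom namespace.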
"Several XML Parsing Methods"
The common XML programming interfaces are DOM and SAX. The two handle XML files in different ways, so they suit different situations.
Python offers three ways to parse XML: SAX, DOM, and ElementTree:
1. SAX (Simple API for XML)
The Python standard library includes a SAX parser. SAX uses an event-driven model: while parsing the XML it fires events and invokes user-defined callback functions to process the document. Parsing an XML document with SAX involves two parts: the parser and the event handler.
The parser reads the XML document and sends events, such as element start and element end events, to the event handler, and the event handler responds to them.
Advantage: SAX reads the XML file as a stream, so it is fast and uses little memory.
Disadvantage: the user has to implement the callback functions (the handler).
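The parser/handler split described above can be sketched with the standard xml.sax module. The handler class TagCollector and the sample document are hypothetical names chosen for illustration.

```python
import xml.sax


class TagCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records element names and text as events arrive."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.started = []   # names seen in startElement events
        self.ended = []     # names seen in endElement events
        self.text = []      # non-blank character data

    def startElement(self, name, attrs):
        self.started.append(name)

    def endElement(self, name):
        self.ended.append(name)

    def characters(self, content):
        if content.strip():
            self.text.append(content.strip())


handler = TagCollector()
# The parser reads the document and calls the handler's methods as it goes.
xml.sax.parseString(b"<info><list>A</list><list>B</list></info>", handler)

print(handler.started)
print(handler.ended)
print(handler.text)
```

The handler has to keep its own state (here three lists), which is the bookkeeping cost that the disadvantage above refers to.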
2. DOM (Document Object Model)
DOM parses the XML data into a tree in memory, and you work with the XML by manipulating that tree. A DOM parser reads the entire document in one pass and keeps all of its elements in a tree structure in memory. You can then use the functions DOM provides to read or modify the content and structure of the document, and you can write the modified content back to an XML file.
Advantage: you do not need to track state yourself, because every node knows its parent and its children.
Disadvantage: DOM has to map the whole XML document to a tree in memory, so it is slower and uses more memory, and its API is more cumbersome to use.
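For comparison, a minimal DOM sketch with the standard xml.dom.minidom module follows. The sample document is an illustrative assumption; the point is that each node can reach its parent and children, and that the in-memory tree can be modified and serialized again.

```python
from xml.dom import minidom

# The whole document is parsed into an in-memory tree first.
doc = minidom.parseString("<info><list>A</list><list>B</list></info>")

root = doc.documentElement
items = doc.getElementsByTagName('list')
first = items[0]

print(first.parentNode.tagName)   # every node knows its parent
print(first.firstChild.data)      # the text node inside the first list element

# Modify the tree in memory, then serialize it back to XML text.
first.firstChild.data = 'Z'
print(root.toxml())
```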
3. ElementTree
ElementTree is like a lightweight DOM with a convenient, friendly API. The code is easy to work with, and it is fast and uses little memory.
By comparison, the third approach is both convenient and fast, and it is the one we have been using. The rest of this article describes how to parse XML with ElementTree:
"Parsing with ElementTree"
Two implementations
ElementTree was designed for processing XML, and the Python standard library contains two implementations of it.
One is the pure Python implementation: xml.etree.ElementTree
The other is a faster C implementation: xml.etree.cElementTree
Prefer the C implementation when it is available, because it is faster and uses less memory. (Since Python 3.3, xml.etree.ElementTree uses the C accelerator automatically, and xml.etree.cElementTree was removed in Python 3.9; the fallback import below still works in both cases.) You can write this in your program:
The code is as follows:
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET
Common methods
The code is as follows:
# To get an element's attribute values, use its attrib attribute.
# To get an element's text, use its text attribute.
# To get an element's tag name, use its tag attribute.
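A short sketch of the three attributes listed above, using xml.etree.ElementTree. The one-element documents are illustrative assumptions, not the article's sample file.

```python
import xml.etree.ElementTree as ET

# Hypothetical element used only to show the three attributes.
node = ET.fromstring("<list id='001'>bookone</list>")

print(node.tag)      # the element name
print(node.attrib)   # dict of attribute names and values
print(node.text)     # the text content

# get() reads a single attribute and accepts a default for missing ones.
print(node.get('id'))
print(node.get('lang', 'n/a'))

# An element without text content has text set to None, not to ''.
empty = ET.fromstring("<page/>")
print(empty.text is None)
```

Because text can be None, code that concatenates it into a string should guard against empty elements.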
Sample XML
The code is as follows:
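<info>
    <intro>Book message</intro>
    <list>
        <head>bookone</head>
        <name>python check</name>
        <number>001</number>
        <page>200</page>
    </list>
    <list>
        <head>booktwo</head>
        <name>python learn</name>
        <number>002</number>
        <page>300</page>
    </list>
</info>
Note: the element names head, name, number and page and the page values follow the code and output of Example 01; the root element info and the element name intro are assumptions.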
###########
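## Load XML
###########
Method One: Load a file
The code is as follows:
root = ET.parse('book.xml')   # returns an ElementTree object for the whole document
Method Two: Load a string
The code is as follows:
root = ET.fromstring(xmltext)   # returns the root Element; xmltext holds the XML as a string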
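The two loading methods return different objects, which the following sketch makes visible. It writes an illustrative XML string to a temporary file so that it runs without book.xml; the string and the variable names are assumptions.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

xmltext = "<info><list><head>bookone</head></list></info>"

# Write the text to a temporary file so that ET.parse() has something to read.
with tempfile.NamedTemporaryFile('w', suffix='.xml', delete=False) as handle:
    handle.write(xmltext)
    path = handle.name

try:
    tree = ET.parse(path)             # an ElementTree object wrapping the document
    root_from_file = tree.getroot()   # the root Element of that document
finally:
    os.remove(path)

root_from_string = ET.fromstring(xmltext)   # the root Element directly

print(type(tree).__name__)
print(root_from_file.tag, root_from_string.tag)
```

Calling getroot() after parse() gives the same kind of object that fromstring() returns, which keeps later code independent of how the XML was loaded.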
###########
## Get Nodes
###########
Method One: Get the specified nodes -> getiterator() method
The code is as follows:
book_node = root.getiterator('list')   # every list element in the tree; removed in Python 3.9, where root.iter('list') replaces it
Method Two: Get the specified nodes -> findall() method
The code is as follows:
book_node = root.findall('list')   # a list of all matching child elements
Method Three: Get the specified node -> find() method
The code is as follows:
book_node = root.find('list')   # the first matching child element, or None
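The difference between the three lookups is easiest to see on a document with a nested match. In this sketch the shelf element is a hypothetical addition used only to create that nesting, and iter() is used as the current name for getiterator().

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<info>"
    "<list><head>bookone</head></list>"
    "<list><head>booktwo</head></list>"
    "<shelf><list><head>nested</head></list></shelf>"   # hypothetical nesting
    "</info>"
)

all_lists = list(root.iter('list'))   # every list element anywhere below root
direct = root.findall('list')         # only list elements that are direct children
first = root.find('list')             # the first direct child named list
missing = root.find('magazine')       # no match, so the result is None

print(len(all_lists), len(direct))
print(first.find('head').text)
print(missing)
```

find() returning None for a missing element is worth checking before accessing attributes on the result.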
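Method Four: Get child nodes -> getchildren()
The code is as follows:
for node in book_node:   # book_node is the list returned by findall() above
    book_node_child = node.getchildren()[0]   # first child; getchildren() was removed in Python 3.9, where list(node)[0] is the equivalent
    print(book_node_child.tag + ' => ' + book_node_child.text)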
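On current Python versions the children of an element are obtained by iterating over it. The sketch below assumes a small two-book document modelled on the sample XML and collects each list element's children into a dictionary; the name records is illustrative.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<info>"
    "<list><head>bookone</head><name>python check</name></list>"
    "<list><head>booktwo</head><name>python learn</name></list>"
    "</info>"
)

for node in root.findall('list'):
    children = list(node)        # the same elements getchildren() used to return
    first_child = children[0]
    print(first_child.tag + ' => ' + first_child.text)

# Iterating over an element yields its direct children, so a dict per book is one line.
records = [{child.tag: child.text for child in node}
           for node in root.findall('list')]
print(records)
```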
###########
## Example 01
###########
The code is as follows:
# coding=utf-8
try:                                             # import the module
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET
root = ET.parse('book.xml')                      # parse the XML file
books = root.findall('list')                     # find all list elements directly under the root
for book_list in books:                          # iterate over the results
    print("=" * 30)                              # output separator
    for book in book_list:                       # iterate over each child node and read its attributes and text
        if 'id' in book.attrib:                  # check whether the element has an id attribute
            print("ID: " + book.attrib['id'])    # print the value of the id attribute
        print(book.tag + ' => ' + book.text)     # print the tag and the text content
print("=" * 30)
Output results:
The code is as follows:
==============================
head => bookone
name => python check
number => 001
page => 200
==============================
head => booktwo
name => python learn
number => 002
page => 300
==============================
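To reproduce this output without a book.xml file, the following self-contained sketch embeds a sample document and returns the lines instead of printing them directly. The function name format_books, the constant SAMPLE, and the info and intro element names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Embedded sample (info and intro are assumed element names).
SAMPLE = """<info>
    <intro>Book message</intro>
    <list>
        <head>bookone</head>
        <name>python check</name>
        <number>001</number>
        <page>200</page>
    </list>
    <list>
        <head>booktwo</head>
        <name>python learn</name>
        <number>002</number>
        <page>300</page>
    </list>
</info>"""


def format_books(xml_text):
    """Return the report lines for every list element under the root."""
    root = ET.fromstring(xml_text)
    lines = []
    for book_list in root.findall('list'):
        lines.append('=' * 30)
        for book in book_list:
            if 'id' in book.attrib:
                lines.append('ID: ' + book.attrib['id'])
            # text is None for empty elements, so fall back to ''
            lines.append(book.tag + ' => ' + (book.text or ''))
    lines.append('=' * 30)
    return lines


print('\n'.join(format_books(SAMPLE)))
```

Returning a list of lines rather than printing inside the loop makes the result easy to compare against the expected output.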