Python class library 31 [using minidom to read and write XML]

Source: Internet
Author: User

One: XML support provided by Python
There are two industry-standard XML parsing approaches: SAX and DOM. SAX (Simple API for XML) is event-based: the XML document is read sequentially, and each element encountered triggers the corresponding event handler. DOM (Document Object Model) builds a tree structure that represents the entire XML document; once the tree is built, you can traverse it through the DOM interface and extract the data you need.
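
To make the difference between the two approaches concrete, here is a minimal sketch that extracts the same data both ways. The inline sample document, the NameHandler class, and the two helper functions are hypothetical names introduced only for this illustration; the library calls are the standard xml.sax and xml.dom.minidom ones.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, inlined so the sketch is self-contained.
XML = b"<employees><employee><name>Linux</name><age>30</age></employee></employees>"


class NameHandler(xml.sax.ContentHandler):
    """Collects the text of every <name> element as SAX events arrive."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False
        self._buffer = []

    def startElement(self, tag, attrs):
        if tag == "name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # characters() may be called more than once for a single text run
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, tag):
        if tag == "name":
            self.names.append("".join(self._buffer))
            self._in_name = False


def names_with_sax(data):
    # SAX: nothing is kept in memory except what the handler stores
    handler = NameHandler()
    xml.sax.parseString(data, handler)
    return handler.names


def names_with_dom(data):
    # DOM: the whole tree is built first, then queried
    doc = minidom.parseString(data)
    return [node.firstChild.data for node in doc.getElementsByTagName("name")]


print(names_with_sax(XML))
print(names_with_dom(XML))
```

The SAX version needs explicit state to know which element it is inside, while the DOM version can simply query the finished tree; the trade-off is that the DOM version holds the whole document in memory.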

Python also provides its own XML parsing API, ElementTree, which is generally easier to use than SAX and DOM and is often faster.
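
As a rough illustration of the ElementTree style, the sketch below reads employee records from an inline string. The sample data and the read_employees helper are assumptions made for this example, and it makes no claim about relative speed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for this illustration.
XML = (
    "<employees>"
    "<employee><name>Linux</name><age>30</age></employee>"
    "<employee><name>windows</name><age>20</age></employee>"
    "</employees>"
)


def read_employees(text):
    """Return a list of (name, age) tuples from an <employees> document."""
    root = ET.fromstring(text)
    return [
        (employee.findtext("name"), int(employee.findtext("age")))
        for employee in root.findall("employee")
    ]


print(read_employees(XML))
```

findtext() returns the text of the first matching child directly, so there is no need to step through text nodes as with the DOM.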

Python's XML modules are:

1) xml.dom.minidom

2) xml.etree.ElementTree

3) xml.sax + xml.dom

Two: XML example (employees.xml)

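<?xml version="1.0" encoding="UTF-8"?>
<employees>
<employee>
<name>Linux</name>
<age>30</age>
</employee>
<employee>
<name>windows</name>
<age>20</age>
</employee>
</employees>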

Three: Using xml.dom.minidom to read and write XML

1 Use xml.dom.minidom to parse XML:

def testminidom():
    from xml.dom import minidom
    doc = minidom.parse("employees.xml")

    # get root element: <employees/>
    root = doc.documentElement

    # get all <employee/> elements under the root
    employees = root.getElementsByTagName("employee")

    for employee in employees:
        print("-------------------------------------------")
        # element name: employee
        print(employee.nodeName)
        # element XML content: <employee><name>windows</name><age>20</age></employee>
        # similar to toprettyxml(), which additionally indents the output
        print(employee.toxml())

        name_node = employee.getElementsByTagName("name")[0]
        print(name_node.childNodes)
        print(name_node.nodeName + ":" + name_node.childNodes[0].nodeValue)
        age_node = employee.getElementsByTagName("age")[0]
        print(age_node.childNodes)
        print(age_node.nodeName + ":" + age_node.childNodes[0].nodeValue)

        print("-------------------------------------------")
        # child nodes: each '\n' between elements is a text node
        # [
        #  <DOM Text node "'\n'">,
        #  <DOM Element: name at 0xc9e490>,
        #  <DOM Text node "'\n'">,
        #  <DOM Element: age at 0xc9e4f0>,
        #  <DOM Text node "'\n'">
        # ]
        for n in employee.childNodes:
            print(n)

testminidom()

Run result:

[<DOM Text node "'Linux'">]
[<DOM Text node "'30'">]
<DOM Text node "'\n'">
<DOM Element: name at 0xc9f590>
<DOM Text node "'\n'">
<DOM Element: age at 0xc9f5f0>
<DOM Text node "'\n'">
[<DOM Text node "'windows'">]
[<DOM Text node "'20'">]
<DOM Text node "'\n'">
<DOM Element: name at 0xc9f6b0>
<DOM Text node "'\n'">
<DOM Element: age at 0xc9f710>
<DOM Text node "'\n'">
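
The whitespace text nodes in the result above are often not wanted when walking the tree. One possible way to skip them is to filter on nodeType, as in the sketch below. The element_children helper and the inline sample are hypothetical names introduced for this example; Node.ELEMENT_NODE is the standard constant from xml.dom.

```python
from xml.dom import minidom, Node

# Hypothetical sample with line breaks between the child elements.
SAMPLE = "<employee>\n<name>Linux</name>\n<age>30</age>\n</employee>"


def element_children(node):
    """Return only the element children of node, skipping text nodes."""
    return [child for child in node.childNodes
            if child.nodeType == Node.ELEMENT_NODE]


root = minidom.parseString(SAMPLE).documentElement
print(len(root.childNodes))                          # text nodes included
print([e.tagName for e in element_children(root)])   # elements only
```

Filtering by nodeType keeps the original tree untouched, which matters if the document is written back out later.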

2 Use xml.dom.minidom to generate XML:

def generatexml():
    import xml.dom.minidom
    impl = xml.dom.minidom.getDOMImplementation()
    # create a document whose root element is <employees/>
    dom = impl.createDocument(None, 'employees', None)
    root = dom.documentElement
    employee = dom.createElement('employee')
    root.appendChild(employee)

    # <name>Linux</name>
    name_elem = dom.createElement('name')
    name_text = dom.createTextNode('Linux')
    name_elem.appendChild(name_text)
    employee.appendChild(name_elem)

    # <age>30</age>
    age_elem = dom.createElement('age')
    age_text = dom.createTextNode('30')
    age_elem.appendChild(age_text)
    employee.appendChild(age_elem)

    # the file encoding should match the encoding declared by writexml()
    with open('employees2.xml', 'w', encoding='utf-8') as f:
        dom.writexml(f, addindent='  ', newl='\n', encoding='utf-8')

generatexml()

Run result:

<?xml version="1.0" encoding="utf-8"?>
<employees>
  <employee>
    <name>Linux</name>
    <age>30</age>
  </employee>
</employees>
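
The same construction steps can be wrapped in a loop when there are several records. The following is a minimal sketch under the assumption that each record is a dict with 'name' and 'age' keys; the build_employees_xml function and the sample records are hypothetical, and toprettyxml() is used here instead of writexml() so that the result comes back as a string.

```python
from xml.dom import minidom


def build_employees_xml(records):
    """Build an <employees> document from dicts with 'name' and 'age' keys."""
    impl = minidom.getDOMImplementation()
    dom = impl.createDocument(None, "employees", None)
    root = dom.documentElement
    for record in records:
        employee = dom.createElement("employee")
        root.appendChild(employee)
        for tag in ("name", "age"):
            child = dom.createElement(tag)
            # text nodes hold strings, so convert numbers explicitly
            child.appendChild(dom.createTextNode(str(record[tag])))
            employee.appendChild(child)
    return dom.toprettyxml(indent="  ", newl="\n")


# Hypothetical sample records
text = build_employees_xml([
    {"name": "Linux", "age": 30},
    {"name": "windows", "age": 20},
])
print(text)
```

Returning a string keeps the sketch free of file I/O; writing it to disk would follow the same open() pattern as in generatexml().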

3 Points to note when using xml.dom.minidom

* parse() and createDocument() both return the DOM (Document) object;
* The root element can be obtained through the documentElement property of the DOM object;
* The DOM is a tree structure that contains many nodes. An Element is one kind of node and can contain child elements; a Text node is also a node and is always a leaf;
* Every node has the nodeName, nodeValue, and nodeType attributes. nodeValue is the value of the node; it is meaningful mainly for Text nodes and is None for elements. For a Text node, the text content can also be read through the .data property;
* nodeType is the type of the node. The following constants are defined:

'ATTRIBUTE_NODE' 'CDATA_SECTION_NODE' 'COMMENT_NODE' 'DOCUMENT_FRAGMENT_NODE'

'DOCUMENT_NODE' 'DOCUMENT_TYPE_NODE' 'ELEMENT_NODE' 'ENTITY_NODE' 'ENTITY_REFERENCE_NODE'

'NOTATION_NODE' 'PROCESSING_INSTRUCTION_NODE' 'TEXT_NODE'
* getElementsByTagName() finds descendant elements by tag name;
* childNodes returns all child nodes. All text is represented as Text nodes, including the line breaks ('\n', '\r') and spaces between elements;
* In writexml(), addindent='  ' sets the indentation of child elements, newl='\n' sets the newline written between elements, and encoding='utf-8' sets the encoding declared in the generated XML header (<?xml version="1.0" encoding="utf-8"?>). It does not encode the output itself, so the file must be opened with the matching encoding.
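
The node attributes described in the notes above can be checked with a short sketch. The inline sample document and the node_summary helper are hypothetical names used only for this illustration.

```python
from xml.dom import minidom, Node

# Hypothetical sample document
doc = minidom.parseString("<employee><name>Linux</name></employee>")
name_elem = doc.documentElement.getElementsByTagName("name")[0]
text_node = name_elem.firstChild


def node_summary(node):
    """Return (nodeName, nodeType, nodeValue) for a DOM node."""
    return (node.nodeName, node.nodeType, node.nodeValue)


# Element: nodeValue is None, nodeType is ELEMENT_NODE
print(node_summary(name_elem))
# Text node: nodeName is '#text', nodeValue and .data both hold the text
print(node_summary(text_node))
print(text_node.data)
```

Comparing nodeType against the constants on xml.dom.Node is usually clearer than comparing against the raw integer values.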





