Parsing XML in Python with the xml.dom module

Last Update:2017-05-30 Source: Internet

Author: User

1. What is XML, and what are its features?

XML (Extensible Markup Language) is used to mark up data and define data types. It is a metalanguage that lets you define your own markup languages.

Example: del.xml

<?xml version="1.0" encoding="utf-8"?>
<catalog>
 <maxid>4</maxid>
 <login username="pytest" passwd='123456'>
  <caption>Python</caption>
  <item id="4">
   <caption>test</caption>
  </item>
 </login>
 <item id="2">
  <caption>Zope</caption>
 </item>
</catalog>

The structure is similar to that of HTML (HyperText Markup Language), but the two are designed for different purposes. HTML is designed to display data, and its focus is on the appearance of the data. XML is designed to transmit and store data, and its focus is on the data content.

It has the following features:

• It is composed of tag pairs: <aa></aa>

• Tags can have attributes: <aa id='123'></aa>

• Tag pairs can enclose data: <aa>abc</aa>

• Tags can contain sub-tags (hierarchical nesting)
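The four features above can be observed on a small parsed document. The following is a minimal sketch, assuming Python 3 and using `xml.dom.minidom.parseString` on an inline string rather than a file. The tag name `aa` follows the bullets; the sub-tag `bb` and the variable names are made up for illustration.

```python
import xml.dom.minidom

# One tag pair with an attribute, embedded data, and a nested sub-tag.
SAMPLE = "<aa id='123'>abc<bb>child</bb></aa>"

dom = xml.dom.minidom.parseString(SAMPLE)
root = dom.documentElement

tag_name = root.tagName                        # the tag pair <aa></aa>
attr_value = root.getAttribute("id")           # the attribute id='123'
text_data = root.firstChild.data               # the data embedded in the tag pair
sub_tag = root.getElementsByTagName("bb")[0]   # the nested sub-tag

print(tag_name, attr_value, text_data, sub_tag.tagName)
# prints: aa 123 abc bb
```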

2. Obtain tag attributes

# coding: utf-8
import xml.dom.minidom

dom = xml.dom.minidom.parse("del.xml")  # open and parse the XML document
root = dom.documentElement  # get the root element of the document
print("nodeName:", root.nodeName)  # every node has nodeName, nodeValue, and nodeType attributes
print("nodeValue:", root.nodeValue)  # nodeValue is the node's value; it is only meaningful for text nodes
print("nodeType:", root.nodeType)
print("ELEMENT_NODE:", root.ELEMENT_NODE)

nodeType is the node type. The catalog node is of the ELEMENT_NODE type.

There are currently the following types:

ATTRIBUTE_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_FRAGMENT_NODE, DOCUMENT_NODE, DOCUMENT_TYPE_NODE, ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, NOTATION_NODE, PROCESSING_INSTRUCTION_NODE, TEXT_NODE

Running result

nodeName: catalog
nodeValue: None
nodeType: 1
ELEMENT_NODE: 1
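To see a few of the other node types in practice, the sketch below parses a short inline document and maps each numeric nodeType back to its constant name. It assumes Python 3; the inline sample and the helper `type_name` are illustrative and not part of the original example.

```python
import xml.dom.minidom
from xml.dom import Node

# Inline sample: a root element holding a comment and one child element.
SAMPLE = "<catalog><!-- note --><maxid>4</maxid></catalog>"

dom = xml.dom.minidom.parseString(SAMPLE)
root = dom.documentElement
comment = root.childNodes[0]
maxid = root.childNodes[1]
text = maxid.firstChild


def type_name(node):
    """Hypothetical helper: map a numeric nodeType to its constant name."""
    for name in dir(Node):
        if name.endswith("_NODE") and getattr(Node, name) == node.nodeType:
            return name
    return "UNKNOWN"


for node in (dom, root, comment, maxid, text):
    print(node.nodeName, node.nodeType, type_name(node))
```

Only the text node carries the string "4" in its nodeValue; the element nodes report None, which matches the result shown above.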

3. Obtain sub-tags

# coding: utf-8
import xml.dom.minidom

dom = xml.dom.minidom.parse("del.xml")
root = dom.documentElement
bb = root.getElementsByTagName('maxid')  # returns a list of matching nodes
print(type(bb))
print(bb)
b = bb[0]
print(b.nodeName)
print(b.nodeValue)

Running result

<class 'xml.dom.minicompat.NodeList'>
[<DOM Element: maxid at 0x2707a48>]
maxid
None
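One point worth illustrating is that getElementsByTagName searches the whole subtree, not just the direct children. The sketch below assumes Python 3 and inlines the content of del.xml as a string so that it is self-contained. The list comprehension that filters childNodes is one possible way to restrict the result to direct children, not the only one.

```python
import xml.dom.minidom

# Same structure as del.xml, inlined so the sketch is self-contained.
SAMPLE = """<catalog>
 <maxid>4</maxid>
 <login username="pytest" passwd="123456">
  <caption>Python</caption>
  <item id="4"><caption>test</caption></item>
 </login>
 <item id="2"><caption>Zope</caption></item>
</catalog>"""

root = xml.dom.minidom.parseString(SAMPLE).documentElement

# Searches all descendants, so the <item> nested inside <login> is found too.
all_items = root.getElementsByTagName("item")
all_ids = [item.getAttribute("id") for item in all_items]

# Direct children only: filter childNodes by node type and tag name.
direct_items = [
    child for child in root.childNodes
    if child.nodeType == child.ELEMENT_NODE and child.tagName == "item"
]
direct_ids = [item.getAttribute("id") for item in direct_items]

print(all_ids)     # ['4', '2']
print(direct_ids)  # ['2']
```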

4. Obtain tag attribute values

# coding: utf-8
import xml.dom.minidom

dom = xml.dom.minidom.parse("del.xml")
root = dom.documentElement
itemlist = root.getElementsByTagName('login')
item = itemlist[0]
print(item.getAttribute("username"))
print(item.getAttribute("passwd"))
itemlist = root.getElementsByTagName("item")
item = itemlist[0]  # items are distinguished by their position in itemlist
print(item.getAttribute("id"))
item2 = itemlist[1]  # items are distinguished by their position in itemlist
print(item2.getAttribute("id"))

Running result

pytest
123456
4
2
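When an attribute is absent, minidom's getAttribute returns an empty string rather than raising an error, so a missing attribute and an empty one look the same. The sketch below, assuming Python 3, uses hasAttribute to tell them apart. The inline sample and the helper `attr_or_default` are hypothetical and only serve the illustration.

```python
import xml.dom.minidom

# Two <item> elements: one with an id attribute, one without.
SAMPLE = '<catalog><item id="4"/><item/></catalog>'

root = xml.dom.minidom.parseString(SAMPLE).documentElement
first, second = root.getElementsByTagName("item")


def attr_or_default(node, name, default=None):
    """Hypothetical helper: attribute value, or `default` if it is absent."""
    return node.getAttribute(name) if node.hasAttribute(name) else default


present = attr_or_default(first, "id")
missing_raw = second.getAttribute("id")            # empty string, no exception
missing = attr_or_default(second, "id", default="n/a")

print(repr(present), repr(missing_raw), repr(missing))
# prints: '4' '' 'n/a'
```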

5. Obtain data between tag pairs

# coding: utf-8
import xml.dom.minidom

dom = xml.dom.minidom.parse("del.xml")
root = dom.documentElement
itemlist = root.getElementsByTagName('caption')
item = itemlist[0]
print(item.firstChild.data)  # the text is stored in the element's first child (a text node)
item2 = itemlist[1]
print(item2.firstChild.data)

Running result

Python
test
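Reading firstChild.data works when the element contains exactly one text node. For an empty element, firstChild is None and the attribute access fails. The following sketch, assuming Python 3, shows a slightly more defensive way to collect the text. The helper `get_text` and the inline sample are illustrative and not part of the original example.

```python
import xml.dom.minidom

# Three captions: plain text, empty, and text interrupted by a sub-tag.
SAMPLE = (
    "<catalog>"
    "<caption>Python</caption>"
    "<caption></caption>"
    "<caption>a<b>x</b>c</caption>"
    "</catalog>"
)

root = xml.dom.minidom.parseString(SAMPLE).documentElement


def get_text(node):
    """Hypothetical helper: join the data of the node's direct text children."""
    return "".join(
        child.data
        for child in node.childNodes
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
    )


captions = root.getElementsByTagName("caption")
texts = [get_text(caption) for caption in captions]

print(captions[1].firstChild)  # None: the empty element has no child nodes
print(texts)                   # ['Python', '', 'ac']
```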

6. Example

<?xml version="1.0" encoding="UTF-8" ?>
<users>
 <user id="1000001">
  <username>Admin</username>
  <email>admin@live.cn</email>
  <age>23</age>
  <sex>boy</sex>
 </user>
 <user id="1000002">
  <username>Admin2</username>
  <email>admin2@live.cn</email>
  <age>22</age>
  <sex>boy</sex>
 </user>
 <user id="1000003">
  <username>Admin3</username>
  <email>admin3@live.cn</email>
  <age>27</age>
  <sex>boy</sex>
 </user>
 <user id="1000004">
  <username>Admin4</username>
  <email>admin4@live.cn</email>
  <age>25</age>
  <sex>girl</sex>
 </user>
 <user id="1000005">
  <username>Admin5</username>
  <email>admin5@live.cn</email>
  <age>20</age>
  <sex>boy</sex>
 </user>
 <user id="1000006">
  <username>Admin6</username>
  <email>admin6@live.cn</email>
  <age>23</age>
  <sex>girl</sex>
 </user>
</users>

Task: output the ID, name, email, age, and sex of each user.

Reference Code

# -*- coding: utf-8 -*-
from xml.dom import minidom


def get_attrvalue(node, attrname):
    # Value of the named attribute, or '' if there is no node
    return node.getAttribute(attrname) if node else ''


def get_nodevalue(node, index=0):
    # Value of the child node at the given index (the text node by default)
    return node.childNodes[index].nodeValue if node else ''


def get_xmlnode(node, name):
    # All descendant elements with the given tag name
    return node.getElementsByTagName(name) if node else []


def get_xml_data(filename='user.xml'):
    doc = minidom.parse(filename)
    root = doc.documentElement
    user_nodes = get_xmlnode(root, 'user')
    print("user_nodes:", user_nodes)
    user_list = []
    for node in user_nodes:
        user_id = get_attrvalue(node, 'id')
        node_name = get_xmlnode(node, 'username')
        node_email = get_xmlnode(node, 'email')
        node_age = get_xmlnode(node, 'age')
        node_sex = get_xmlnode(node, 'sex')
        user_name = get_nodevalue(node_name[0])
        user_email = get_nodevalue(node_email[0])
        user_age = int(get_nodevalue(node_age[0]))
        user_sex = get_nodevalue(node_sex[0])
        user = {
            'id': int(user_id),
            'username': user_name,
            'email': user_email,
            'age': user_age,
            'sex': user_sex,
        }
        user_list.append(user)
    return user_list


def test_load_xml():
    user_list = get_xml_data()
    for user in user_list:
        print('-----------------------------------------------------')
        if user:
            user_str = 'No.:\t%d\nname:\t%s\nsex:\t%s\nage:\t%s\nEmail:\t%s' % (
                int(user['id']), user['username'], user['sex'], user['age'], user['email'])
            print(user_str)


if __name__ == "__main__":
    test_load_xml()

Result

C:\Users\wzh94434\Desktop\xml>python user.py
user_nodes: [<DOM Element: user at 0x2758c48>, <DOM Element: user at 0x2756288>, <DOM Element: user at 0x2756888>, <DOM Element: user at 0x2756e88>, <DOM Element: user at 0x275e4c8>, <DOM Element: user at 0x275eac8>]
-----------------------------------------------------
No.:    1000001
name:   Admin
sex:    boy
age:    23
Email:  admin@live.cn
-----------------------------------------------------
No.:    1000002
name:   Admin2
sex:    boy
age:    22
Email:  admin2@live.cn
-----------------------------------------------------
No.:    1000003
name:   Admin3
sex:    boy
age:    27
Email:  admin3@live.cn
-----------------------------------------------------
No.:    1000004
name:   Admin4
sex:    girl
age:    25
Email:  admin4@live.cn
-----------------------------------------------------
No.:    1000005
name:   Admin5
sex:    boy
age:    20
Email:  admin5@live.cn
-----------------------------------------------------
No.:    1000006
name:   Admin6
sex:    girl
age:    23
Email:  admin6@live.cn
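For comparison, the same extraction can be written more compactly by looping over the field names. This is only a sketch under a few assumptions: Python 3, an inline two-user excerpt of the data instead of the user.xml file, and every field being present and non-empty. The names `FIELDS` and `load_users` are made up for illustration.

```python
from xml.dom import minidom

# Two-user excerpt of the sample data, inlined so the sketch is self-contained.
SAMPLE = """<users>
 <user id="1000001">
  <username>Admin</username>
  <email>admin@live.cn</email>
  <age>23</age>
  <sex>boy</sex>
 </user>
 <user id="1000004">
  <username>Admin4</username>
  <email>admin4@live.cn</email>
  <age>25</age>
  <sex>girl</sex>
 </user>
</users>"""

FIELDS = ("username", "email", "age", "sex")


def load_users(xml_text):
    """Return one dict per <user> element found in xml_text."""
    root = minidom.parseString(xml_text).documentElement
    users = []
    for node in root.getElementsByTagName("user"):
        user = {"id": int(node.getAttribute("id"))}
        for field in FIELDS:
            # Assumes each field element exists and holds a single text node.
            user[field] = node.getElementsByTagName(field)[0].firstChild.data
        user["age"] = int(user["age"])
        users.append(user)
    return users


users = load_users(SAMPLE)
for user in users:
    print(user)
```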

7. Summary

• minidom.parse(filename): loads and parses the XML file

• doc.documentElement: gets the root element of the XML document

• node.getAttribute(AttributeName): gets the value of an attribute of the XML node

• node.getElementsByTagName(TagName): gets the set of XML node objects with the given tag name

• node.childNodes: returns the list of child nodes

• node.childNodes[index].nodeValue: gets the value of the XML node

• node.firstChild: accesses the first child node; equivalent to node.childNodes[0]

• doc = minidom.parse(filename), then doc.toxml('utf-8'): returns the text of the node in XML format

• Node.attributes["id"]: accesses an attribute of an element; its .name is "id" as above, and its .value is the attribute value
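The last few calls in the summary do not appear in the earlier examples, so here is a brief sketch of them. It assumes Python 3, where toxml returns a str by default and bytes when an encoding is passed. The inline sample document is made up for illustration.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<catalog><item id="2"><caption>Zope</caption></item></catalog>'
)
item = doc.documentElement.getElementsByTagName("item")[0]

# attributes behaves like a mapping from attribute name to attribute node.
id_attr = item.attributes["id"]
attr_name, attr_value = id_attr.name, id_attr.value

# firstChild is the same node as childNodes[0].
same_node = item.firstChild is item.childNodes[0]

as_text = item.toxml()          # str: the serialized <item> element
as_bytes = doc.toxml("utf-8")   # bytes: whole document, with an XML declaration

print(attr_name, attr_value, same_node)
print(as_text)
```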

That is all for this article. I hope it helps you in your study or work. If you have any questions, please leave a message. Thank you for your support.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
