Translating dictionaries into XML and related namespace parsing in Python

Last Update:2016-06-10 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Although the Xml.etree.ElementTree library is often used for parsing work, it can also create XML documents. For example, consider the following function:

From Xml.etree.ElementTree import elementdef dict_to_xml (tag, D): "Turn a simple dict of key/value pairs into XML" Elem = Element (tag) for key, Val in D.items (): Child  = Element (key)  child.text = str (val)  elem.append (child) return Elem

Here is an example of use:

>>> s = {' name ': ' GOOG ', ' shares ': +, ' price ': 490.1}>>> e = dict_to_xml (' stock ', s) >>> E
  
  
   
  >>>

The conversion result is an Element instance. For I/O operations, it is easy to convert it to a byte string using the ToString () function in Xml.etree.ElementTree. For example:

>>> from Xml.etree.ElementTree import tostring>>> tostring (e) B '
 
  
  
   
    
   490.1
  
   
  
   
    
   GOOG
  
   ' >>>

If you want to add attribute values to an element, you can use the set () method:

>>> e.set (' _id ', ' 1234 ') >>> ToString (e) B '
 
  
  
   
    
   490.1
  
   
  
   
    
   GOOG
  
   ' >>>

If you still want to maintain the order of the elements, consider constructing a ordereddict instead of a common dictionary. When creating XML, you are restricted to constructing values of string types only. For example:

def dict_to_xml_str (tag, D):  '  Turn a simple dict of key/value pairs into XML ' parts  = [' <{}&G t; '. Format (TAG)]  for Key, Val in D.items ():    parts.append (' <{0}>{1}
 
   '. Format (key,val))  Parts.append ("
 
   . Format (tag))  return". Join (Parts)

The problem is that if you do it manually, you may encounter some trouble. For example, what happens when a dictionary contains some special characters in its value?

>>> d = {' name ': '
 
  
   
  }>>> # String creation>>> dict_to_xml_str (' item ', d) '
  c4> ' >>>
  # Proper XML creation>>> e = Dict_to_xml (' Item
   
    
    
     '
    
   
    , d) >>> ToString (e) B '
  
   
   
    
    
     
   
    
  
   >>>

Note that the characters ' < ' and ' > ' are replaced by < and > in the following example of the program

For reference only, if you need to convert these characters manually, you can use the Escape () and unescape () functions in Xml.sax.saxutils. For example:

>>> from xml.sax.saxutils import Escape, unescape>>> Escape ('
 
  
   
  ) '
  
   
    
   ' >>> unescape (_) '
   
    
     
    >>>

In addition to creating the correct output, there is another reason to recommend that you create an Element instance instead of a string, which is not so easy to construct a larger document using a string combination. The Element instance can be processed in many ways without considering parsing the XML text. That is, you can do all of your work on a high-level data structure and output it as a string at the end.

Parsing XML documents with namespaces
If you parse this document and execute a normal query, you will find that this is not so easy, because all the steps have become quite cumbersome.

>>> # Some queries that work>>> doc.findtext (' author ') ' David Beazley ' >>> doc.find (' content ' )
 
  
   
  >>> # A query involving a namespace (doesn ' t work) >>> doc.find (' content/html ') >> > # Works if fully qualified>>> doc.find (' content/{http://www.w3.org/1999/xhtml}html ')
  
   
    
   > >> # doesn ' t work>>> doc.findtext (' Content/{http://www.w3.org/1999/xhtml}html/head/title ') >> > # Fully qualified>>> doc.findtext (' content/{http://www.w3.org/1999/xhtml}html/' ... ' {http://www.w3.org/1999/xhtml}head/{http://www.w3.org/1999/xhtml}title ') ' Hello World ' >>>

You can simplify this process by wrapping the namespace processing logic as a tool class:

Class XmlNamespaces:  def __init__ (self, **kwargs):    self.namespaces = {}    for name, Uri in Kwargs.items ():      self.register (name, URI)  def register (self, Name, URI):    self.namespaces[name] = ' {' +uri+ '} '  def __ call__ (self, Path):    return Path.format_map (self.namespaces)

Use this class in the following way:

>>> ns = xmlnamespaces (html= ' http://www.w3.org/1999/xhtml ') >>> doc.find (ns (' content/{html}html ') )
 
  
   
  >>> doc.findtext (ns (' Content/{html}html/{html}head/{html}title ')) ' Hello World ' >> >

Discuss
Parsing XML documents that contain namespaces can be tedious. The xmlnamespaces above simply allows you to use the abbreviated name instead of the full URI to make it a little more concise.

Unfortunately, there is no way to get the namespace information in the basic ElementTree parsing. However, if you use the Iterparse () function, you can get more information about the scope of the namespace processing. For example:

>>> from Xml.etree.ElementTree import iterparse>>> to evt, Elem in Iterparse (' Ns2.xml ', (' End ', ' Start -ns ', ' End-ns '): ... print (evt, elem) ... end
 
  
   start-ns (', ' http://www.w3.org/1999/xhtml ') End End End end 
    
     
      
       
       
        end-ns noneend 
        
         end 
         
          >>> Elem # This is the topmost element 
          
           >>>

Finally, if you want to work with XML text in addition to other advanced XML features, use the namespace, it is recommended that you use the Lxml function library instead of ElementTree. For example, lxml provides better support for validating documents with DTDs, better XPath support, and some other advanced XML features. This section really just teaches you how to make XML parsing a little bit simpler.



This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Translating dictionaries into XML and related namespace parsing in Python

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Translating dictionaries into XML and related namespace parsing in Python

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support