Translating dictionaries into XML and related namespace parsing in Python

Source: Internet
Author: User
Although the Xml.etree.ElementTree library is often used for parsing work, it can also create XML documents. For example, consider the following function:

From Xml.etree.ElementTree import elementdef dict_to_xml (tag, D): "Turn a simple dict of key/value pairs into XML" Elem = Element (tag) for key, Val in D.items (): Child  = Element (key)  child.text = str (val)  elem.append (child) return Elem

Here is an example of use:

>>> s = {' name ': ' GOOG ', ' shares ': +, ' price ': 490.1}>>> e = dict_to_xml (' stock ', s) >>> E
  
  
   
  >>>
 
  

The conversion result is an Element instance. For I/O operations, it is easy to convert it to a byte string using the ToString () function in Xml.etree.ElementTree. For example:

>>> from Xml.etree.ElementTree import tostring>>> tostring (e) B '
 
  
  
   
    
   490.1
  
   
  
   
    
   GOOG
  
   ' >>>
 
   
  
   
  

If you want to add attribute values to an element, you can use the set () method:

>>> e.set (' _id ', ' 1234 ') >>> ToString (e) B '
 
  
  
   
    
   490.1
  
   
  
   
    
   GOOG
  
   ' >>>
 
   
  
   
  

If you still want to maintain the order of the elements, consider constructing a ordereddict instead of a common dictionary. When creating XML, you are restricted to constructing values of string types only. For example:

def dict_to_xml_str (tag, D):  '  Turn a simple dict of key/value pairs into XML ' parts  = [' <{}&G t; '. Format (TAG)]  for Key, Val in D.items ():    parts.append (' <{0}>{1}
 
   '. Format (key,val))  Parts.append ("
 
   . Format (tag))  return". Join (Parts)

The problem is that if you do it manually, you may encounter some trouble. For example, what happens when a dictionary contains some special characters in its value?

>>> d = {' name ': '
 
  
   
  }>>> # String creation>>> dict_to_xml_str (' item ', d) '
  c4> ' >>>
  # Proper XML creation>>> e = Dict_to_xml (' Item
   
    
    
     '
    
   
    , d) >>> ToString (e) B '
  
   
   
    
    
     
   
    
  
   >>>
 
  

Note that the characters ' < ' and ' > ' are replaced by < and > in the following example of the program

For reference only, if you need to convert these characters manually, you can use the Escape () and unescape () functions in Xml.sax.saxutils. For example:

>>> from xml.sax.saxutils import Escape, unescape>>> Escape ('
 
  
   
  ) '
  
   
    
   ' >>> unescape (_) '
   
    
     
    >>>
   
    
  
   
 
  

In addition to creating the correct output, there is another reason to recommend that you create an Element instance instead of a string, which is not so easy to construct a larger document using a string combination. The Element instance can be processed in many ways without considering parsing the XML text. That is, you can do all of your work on a high-level data structure and output it as a string at the end.

Parsing XML documents with namespaces
If you parse this document and execute a normal query, you will find that this is not so easy, because all the steps have become quite cumbersome.

>>> # Some queries that work>>> doc.findtext (' author ') ' David Beazley ' >>> doc.find (' content ' )
 
  
   
  >>> # A query involving a namespace (doesn ' t work) >>> doc.find (' content/html ') >> > # Works if fully qualified>>> doc.find (' content/{http://www.w3.org/1999/xhtml}html ')
  
   
    
   > >> # doesn ' t work>>> doc.findtext (' Content/{http://www.w3.org/1999/xhtml}html/head/title ') >> > # Fully qualified>>> doc.findtext (' content/{http://www.w3.org/1999/xhtml}html/' ... ' {http://www.w3.org/1999/xhtml}head/{http://www.w3.org/1999/xhtml}title ') ' Hello World ' >>>
  
   
 
  

You can simplify this process by wrapping the namespace processing logic as a tool class:

Class XmlNamespaces:  def __init__ (self, **kwargs):    self.namespaces = {}    for name, Uri in Kwargs.items ():      self.register (name, URI)  def register (self, Name, URI):    self.namespaces[name] = ' {' +uri+ '} '  def __ call__ (self, Path):    return Path.format_map (self.namespaces)

Use this class in the following way:

>>> ns = xmlnamespaces (html= ' http://www.w3.org/1999/xhtml ') >>> doc.find (ns (' content/{html}html ') )
 
  
   
  >>> doc.findtext (ns (' Content/{html}html/{html}head/{html}title ')) ' Hello World ' >> >
 
  

Discuss
Parsing XML documents that contain namespaces can be tedious. The xmlnamespaces above simply allows you to use the abbreviated name instead of the full URI to make it a little more concise.

Unfortunately, there is no way to get the namespace information in the basic ElementTree parsing. However, if you use the Iterparse () function, you can get more information about the scope of the namespace processing. For example:

>>> from Xml.etree.ElementTree import iterparse>>> to evt, Elem in Iterparse (' Ns2.xml ', (' End ', ' Start -ns ', ' End-ns '): ... print (evt, elem) ... end
 
  
   start-ns (', ' http://www.w3.org/1999/xhtml ') End End End end 
    
     
      
       
       
        end-ns noneend 
        
         end 
         
          >>> Elem # This is the topmost element 
          
           >>>
           
          
         
        
       
   
      
  
     
 
    

  
 

Finally, if you want to work with XML text in addition to other advanced XML features, use the namespace, it is recommended that you use the Lxml function library instead of ElementTree. For example, lxml provides better support for validating documents with DTDs, better XPath support, and some other advanced XML features. This section really just teaches you how to make XML parsing a little bit simpler.

  • Contact Us

    The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

    If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

    A Free Trial That Lets You Build Big!

    Start building with 50+ products and up to 12 months usage for Elastic Compute Service

    • Sales Support

      1 on 1 presale consultation

    • After-Sales Support

      24/7 Technical Support 6 Free Tickets per Quarter Faster Response

    • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.