Tutorials for working with XML in Python

Source: Internet
Author: User
XML is more complex than JSON and is used less on the web than it once was, but it still appears in many places, so it is worth knowing how to work with it.
DOM vs SAX

There are two ways to work with XML: DOM and SAX. DOM reads the entire XML document into memory and parses it into a tree, so it uses a lot of memory and parses slowly; the advantage is that you can traverse the tree's nodes freely. SAX is a streaming model: it parses while it reads, uses little memory, and is fast; the disadvantage is that you have to handle the events yourself.

In most cases SAX is preferred, because DOM uses too much memory.
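
For comparison, the DOM style can be sketched with the standard library's xml.dom.minidom. The sample document and the list_items helper below are assumptions made for illustration, not part of the original article; the point is only that the whole tree sits in memory and can be walked in any order.

```python
from xml.dom import minidom

# Hypothetical sample document, small enough to hold in memory.
SAMPLE = '<ol><li>Python</li><li>Ruby</li></ol>'

def list_items(xml_text):
    """Parse the whole document into a tree, then walk it freely."""
    doc = minidom.parseString(xml_text)
    items = []
    for li in doc.getElementsByTagName('li'):
        # Each <li> in this sample holds a single text node.
        items.append(li.firstChild.data)
    return items

print(list_items(SAMPLE))
```

This is convenient for small documents, but the memory cost grows with document size, which is why the streaming approach is usually the better default.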

Parsing XML with SAX in Python is concise. The events we usually care about are start_element, end_element, and char_data; prepare these three handler functions, and then parse the XML.

For example, when the SAX parser reads a node:

<a href="/">Python</a>

it produces 3 events:

    1. A start_element event, when reading <a href="/">;
    2. A char_data event, when reading Python;
    3. An end_element event, when reading </a>.

Experiment with the code:

from xml.parsers.expat import ParserCreate

class DefaultSaxHandler(object):
    # Called for each opening tag; attrs is a dict of its attributes.
    def start_element(self, name, attrs):
        print('sax:start_element: %s, attrs: %s' % (name, str(attrs)))

    # Called for each closing tag.
    def end_element(self, name):
        print('sax:end_element: %s' % name)

    # Called for text between tags (possibly more than once per run).
    def char_data(self, text):
        print('sax:char_data: %s' % text)

xml = r'''<?xml version="1.0"?>
<ol>
    <li><a href="/python">Python</a></li>
    <li><a href="/ruby">Ruby</a></li>
</ol>
'''

handler = DefaultSaxHandler()
parser = ParserCreate()
parser.returns_unicode = True  # Python 2 setting; omit on Python 3
parser.StartElementHandler = handler.start_element
parser.EndElementHandler = handler.end_element
parser.CharacterDataHandler = handler.char_data
parser.Parse(xml)

When returns_unicode is set to True, all element names and char_data values are returned as Unicode, which makes internationalized text easier to handle. (This is a Python 2 setting; in Python 3, strings are already Unicode.)

Note that when reading a long run of text, CharacterDataHandler may be called multiple times, so you need to save the pieces and merge them in EndElementHandler.
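
The save-and-merge step can be sketched as follows. The TextCollector class and the collect_text helper are hypothetical names introduced for this illustration; the sketch assumes simple elements that contain only text, since the buffer is reset at every start tag.

```python
from xml.parsers.expat import ParserCreate

class TextCollector(object):
    """Buffers character data and joins it when the element ends."""

    def __init__(self):
        self.chunks = []   # pieces of text seen since the last start tag
        self.texts = []    # (element name, merged text) pairs

    def start_element(self, name, attrs):
        self.chunks = []

    def char_data(self, text):
        # May fire several times for a single run of text.
        self.chunks.append(text)

    def end_element(self, name):
        merged = ''.join(self.chunks).strip()
        if merged:
            self.texts.append((name, merged))
        self.chunks = []

def collect_text(xml_text):
    handler = TextCollector()
    parser = ParserCreate()
    parser.StartElementHandler = handler.start_element
    parser.EndElementHandler = handler.end_element
    parser.CharacterDataHandler = handler.char_data
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return handler.texts

print(collect_text('<ol><li>Py&amp;thon</li><li>Ruby</li></ol>'))
```

However the parser happens to split the text, the merged result for each element is the same, because the pieces are only joined once the end tag arrives.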

Besides parsing XML, how do you generate it? In 99% of cases the XML to be generated has a very simple structure, so the simplest and most efficient way is to concatenate strings:

def build_xml():
    L = []
    L.append(r'<?xml version="1.0"?>')
    L.append(r'<root>')
    # encode() stands for an XML-escaping helper; it is not defined here.
    L.append(encode('some & data'))
    L.append(r'</root>')
    return ''.join(L)
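
A runnable variant of the string-joining approach might look like this. It assumes the standard library's xml.sax.saxutils.escape as the escaping helper, and the root element name and the build_document function are chosen here only for illustration.

```python
from xml.sax.saxutils import escape

def build_document(text):
    """Build a tiny XML document by joining string fragments."""
    parts = []
    parts.append('<?xml version="1.0"?>')
    parts.append('<root>')
    # escape() replaces &, < and > with their XML entities.
    parts.append(escape(text))
    parts.append('</root>')
    return ''.join(parts)

print(build_document('some & data'))
# prints: <?xml version="1.0"?><root>some &amp; data</root>
```

Escaping matters because a bare & or < in the data would make the output invalid XML; everything else is plain string concatenation.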

What if you need to generate complex XML? The advice is to not use XML at all and switch to JSON instead.
Summary

When parsing XML, work out which nodes you are interested in and save their data as the events arrive. Once parsing is complete, the data is ready to process.
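
As a sketch of this pattern, the handler below keeps only the elements it cares about and ignores everything else. The LinkCollector class, the parse_links helper, and the sample document are hypothetical and exist only for this illustration.

```python
from xml.parsers.expat import ParserCreate

class LinkCollector(object):
    """Saves the href and text of every <a> element, ignoring the rest."""

    def __init__(self):
        self.links = []      # finished (href, text) pairs
        self.inside = False  # True while reading inside an <a> element
        self.href = None
        self.chunks = []

    def start_element(self, name, attrs):
        if name == 'a':
            self.inside = True
            self.href = attrs.get('href')
            self.chunks = []

    def char_data(self, text):
        if self.inside:
            self.chunks.append(text)

    def end_element(self, name):
        if name == 'a' and self.inside:
            self.links.append((self.href, ''.join(self.chunks)))
            self.inside = False

def parse_links(xml_text):
    handler = LinkCollector()
    parser = ParserCreate()
    parser.StartElementHandler = handler.start_element
    parser.EndElementHandler = handler.end_element
    parser.CharacterDataHandler = handler.char_data
    parser.Parse(xml_text, True)
    return handler.links

DOC = ('<ol><li><a href="/python">Python</a></li>'
       '<li><a href="/ruby">Ruby</a></li></ol>')
print(parse_links(DOC))
```

After Parse returns, handler.links holds plain Python data that can be processed without touching the XML again.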

Exercise: parse the Yahoo! weather forecast, which is in XML format, to get the weather for today and the next few days:

http://weather.yahooapis.com/forecastrss?u=c&w=2151330

The parameter w is the city code. To look up a city code, search for the city on weather.yahoo.com; the URL in the browser's address bar contains the city code.
