XML is more complex than JSON, and is not as much used in the web as it used to be, but there are still many places to use, so it is necessary to understand how to manipulate XML.
DOM vs SAX
There are two ways to manipulate XML: Dom and sax. Dom will read the entire XML into memory, parse into a tree, so take up memory, parsing slow, the advantage is that you can arbitrarily traverse the tree node. Sax is a stream pattern, side-reading side parsing, memory-intensive, parsing fast, the disadvantage is that we need to handle the event ourselves.
In normal situations, sax is preferred because the DOM is too much memory.
Parsing xml with sax in Python is very concise, and usually the events we care about are start_element,end_element and char_data, ready with these 3 functions, and then parse the XML.
As an example, when the SAX parser reads a node:
Python
will produce 3 events:
- Start_element events, while reading;
- Char_data event, when reading Python;
- The End_element event, when read.
Experiment with the code:
From Xml.parsers.expat import Parsercreateclass Defaultsaxhandler (object): def start_element (self, Name, attrs): print (' sax:start_element:%s, attrs:%s '% (name, str (attrs))) def end_element (self, name): print (' Sax: End_element:%s '% name) def char_data (self, text): print (' sax:char_data:%s '% text) XML = R ' "<?xml version=" 1.0 "?>
- Python
- Ruby
"' Handler = Defaultsaxhandler () parser = Parsercreate () Parser.returns_unicode = Trueparser.startelementhandler = Handler.start_elementparser. Endelementhandler = Handler.end_elementparser. Characterdatahandler = Handler.char_dataparser. Parse (XML)
When setting Returns_unicode to True, all element names and Char_data returned are Unicode, which is more convenient to handle internationalization.
It is important to note that when reading a large string of strings, Characterdatahandler may be called multiple times, so you need to save them and merge them in Endelementhandler.
In addition to parsing XML, how to generate XML? In the case of 99%, the XML structure that needs to be generated is very simple, so the simplest and most efficient way to generate XML is to stitch the strings together:
L = []l.append (R ' <?xml version= "1.0"?> ') l.append (R '
) l.append (Encode (' Some & Data ')) L.append (R
") return". Join (L)
What if you want to generate complex XML? It is recommended that you do not use XML to change to JSON.
Summary
When parsing XML, be careful to find out which node you are interested in, and save the node data when responding to the event. Once the parsing is complete, the data can be processed.
Practice the weather forecast in the XML format of Yahoo! To get the day and the last few days:
http://weather.yahooapis.com/forecastrss?u=c&w=2151330
Parameter w is the city code, to query a city code, you can search the city in weather.yahoo.com, the URL of the browser address bar contains the city code.