Python ships with several XML parsing libraries in its standard library. This article introduces one of them: expat, available as the xml.parsers.expat module.
Expat supports dynamic (incremental) parsing of XML. What does dynamic mean? The XML string does not have to be handed to expat in full. Even if expat receives only part of it, it can still recognize what it has seen so far and fire the corresponding events. What is an event? For example, expat detects the start of a new element (in essence, it encounters a '<'), or it detects that an element has ended (a closing tag or '/>'). In short, expat does not need the complete XML document in order to work.
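As a minimal sketch of this behavior (assuming Python 3; the handler names on_start and on_end, the events list, and the sample XML are made up for illustration), the parser below is first given only the beginning of a document. The events for the elements it has already seen are recorded before the rest of the document arrives.

```python
import xml.parsers.expat

events = []  # hypothetical log of the events expat reports


def on_start(name, attrs):
    events.append(("start", name))


def on_end(name):
    events.append(("end", name))


parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = on_start
parser.EndElementHandler = on_end

# Feed only the first part of a document; False means more data will follow.
parser.Parse("<roster><item/>", False)
seen_after_fragment = list(events)

# Feed the rest; True tells expat that the document is now complete.
parser.Parse("</roster>", True)

print(seen_after_fragment)  # events reported from the fragment alone
print(events)               # all events, including the closing </roster>
```

The second argument of Parse is what distinguishes a partial chunk from the final one.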
Here is a complete example program:
 1  import xml.parsers.expat
 2
 3  class ExParser(object):
 4      '''Parse roster XML'''
 5      def __init__(self, xml_raw):
 6          '''Init parser and set up handlers'''
 7          self.parser = xml.parsers.expat.ParserCreate()
 8
 9          # Connect handlers
10          self.parser.StartElementHandler = self.start_element
11          self.parser.EndElementHandler = self.end_element
12          self.parser.CharacterDataHandler = self.char_data
13          self.parser.Parse(xml_raw, True)  # True: xml_raw is the whole document
14          del xml_raw  # drop the local reference to the raw string
15
16      def start_element(self, name, attrs):
17          '''Start XML element handler'''
18          print('Start: ' + name)
19
20      def end_element(self, name):
21          '''End XML element handler'''
22          print('End: ' + name)
23
24      def char_data(self, data):
25          '''Character data handler'''
26          print('Data is ' + data)
Analysis:
Line 7, in the class constructor, creates an expat parser.
Lines 10-12 register the callback functions (handlers) we are interested in on the parser.
Line 13 begins parsing our XML
From then on, we simply wait for expat to do the parsing. Whenever the parser encounters the start of an XML element, the end of an element, or an element's character data, it calls start_element, end_element, or char_data respectively.
On line 16, the parameters name and attrs are the node name and the node attributes (a dictionary).
On line 20, the parameter name is the node name.
On line 24, the parameter data is the node's character data.
Once you have these values, you can process them however you like.
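For instance, the handlers can store the values instead of printing them. The sketch below assumes Python 3; the class name RosterCollector, the jid attribute, and the sample XML are hypothetical and chosen only for illustration. It collects the attributes and text of every item element into a list of dictionaries.

```python
import xml.parsers.expat


class RosterCollector(object):
    """Hypothetical handler class: collects <item> attributes and text."""

    def __init__(self):
        self.items = []   # one dict per <item> element
        self._text = []   # text pieces of the element being read
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self.start_element
        self.parser.EndElementHandler = self.end_element
        self.parser.CharacterDataHandler = self.char_data

    def start_element(self, name, attrs):
        if name == 'item':
            # attrs maps attribute names to values; copy it into our record
            self.items.append(dict(attrs))
            self._text = []

    def end_element(self, name):
        if name == 'item':
            # expat may deliver one text node in several pieces, so join them
            self.items[-1]['text'] = ''.join(self._text).strip()

    def char_data(self, data):
        self._text.append(data)


collector = RosterCollector()
collector.parser.Parse(
    '<roster><item jid="a@example.com">Alice</item>'
    '<item jid="b@example.com">Bob</item></roster>', True)

for item in collector.items:
    print(item['jid'], item['text'])
```

Joining the text pieces when the element ends matters because the character data handler is not guaranteed to receive a text node in a single call.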
In particular, when dealing with a large block of XML data, we can take advantage of expat's dynamic parsing and feed it the data piece by piece instead of loading everything at once.
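A rough sketch of that idea, assuming Python 3: the function name count_elements, the chunk size, and the generated sample data are illustrative assumptions. The stream is read in fixed-size chunks, each chunk is passed to the parser, and only the element counts are kept in memory.

```python
import io
import xml.parsers.expat


def count_elements(stream, chunk_size=64 * 1024):
    """Count start tags in an XML stream, reading one chunk at a time."""
    counts = {}

    def on_start(name, attrs):
        counts[name] = counts.get(name, 0) + 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start

    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)  # more data may follow
    parser.Parse(b'', True)         # signal the end of the document
    return counts


# Stand-in for a large file opened in binary mode, e.g. open(path, 'rb').
sample = io.BytesIO(b'<roster>' + b'<item/>' * 1000 + b'</roster>')
result = count_elements(sample, chunk_size=100)
print(result)
```

A chunk boundary may fall in the middle of a tag; expat keeps the incomplete part and continues when the next chunk arrives. The parser object also offers a ParseFile method that reads from a file object by itself, which may be enough when no custom chunking is needed.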