Java provides two built-in ways to parse XML documents: DOM parsing and SAX parsing.
DOM parsing is powerful and supports adding, deleting, modifying, and querying nodes. To do this, it reads the whole XML document into memory as a Document object, so it is best suited to small documents.
SAX parsing reads the content sequentially from start to end, which makes modification inconvenient. However, it is well suited to reading large documents that do not need to be changed.
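For contrast with SAX, here is a minimal sketch of the DOM style described above: the whole document is loaded into memory, changed, and then queried. The class name DomContrastDemo, the method addBookAndCount, and the inline XML string are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: loads a whole document into memory and edits it.
class DomContrastDemo {

    // Parses the XML into a DOM tree, appends a <book> child to the root,
    // and returns how many <book> elements the tree holds afterwards.
    static int addBookAndCount(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element book = doc.createElement("book");
            book.setAttribute("id", "003");
            doc.getDocumentElement().appendChild(book);
            return doc.getElementsByTagName("book").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addBookAndCount("<books><book id=\"001\"/></books>"));
    }
}
```

The tree stays available after parsing, which is what makes editing easy and also what makes memory use grow with the size of the document.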
This article focuses on SAX parsing.
SAX parses documents in an event-driven manner. Simply put, it is like watching a movie in a cinema: you watch it once from start to finish and cannot rewind (with DOM you can move back and forth).
While watching, every plot twist, tearful scene, or fleeting moment prompts your brain to take in and process the information.
Similarly, during SAX parsing, callback methods are triggered as the start and end of the document and of each element are read. You handle the corresponding events in these callback methods.
The four methods are: startDocument(), endDocument(), startElement(), and endElement().
Reading the nodes alone is not enough, however. We also need the characters() method to process the text content contained in an element.
These callback methods are combined into one class, which is the handler (the "trigger") we need.
Typically, the main method opens the file, but the content is processed in the handler. This is what event-driven parsing means.
For example, the handler is notified when the document starts and then when each element starts. The text content of each element is passed to the characters() method.
Next, the end of the element is reported. After all elements have been read, the end of the document is reported and parsing finishes.
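The order of these callbacks can be observed directly. The following is a minimal sketch that records each event in a list instead of printing it. The class name EventOrderDemo, the method events, and the inline XML string are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the order in which SAX fires its callbacks.
class EventOrderDemo {

    // Parses the given XML string and returns the callbacks in the order they fired.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                log.add("startElement " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("endElement " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        events("<book><title>Learning XML</title></book>").forEach(System.out::println);
    }
}
```

Each element produces a start event and an end event, and nested elements are reported between the start and end of their parent.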
Now we create the handler (trigger) class. This class must extend DefaultHandler.
Create a SaxHandler class and override the corresponding methods:
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandler extends DefaultHandler {
    /* This method has three parameters.
       arg0 is a character array that contains the element content.
       arg1 is the start position in the array and arg2 is the number of characters to read */
    @Override
    public void characters(char[] arg0, int arg1, int arg2) throws SAXException {
        String content = new String(arg0, arg1, arg2);
        System.out.println(content);
        super.characters(arg0, arg1, arg2);
    }
    @Override
    public void endDocument() throws SAXException {
        System.out.println("\n............ End parsing document ............");
        super.endDocument();
    }
    /* arg0 is the namespace URI.
       arg1 is the local name (without prefix); it is empty when namespace processing is not enabled.
       arg2 is the qualified name (with prefix, if any) */
    @Override
    public void endElement(String arg0, String arg1, String arg2)
            throws SAXException {
        System.out.println("End parsing element " + arg2);
        super.endElement(arg0, arg1, arg2);
    }
    @Override
    public void startDocument() throws SAXException {
        System.out.println("............ Start parsing documents ............\n");
        super.startDocument();
    }
    /* arg0 is the namespace URI.
       arg1 is the local name (without prefix); it is empty when namespace processing is not enabled.
       arg2 is the qualified name (with prefix, if any).
       arg3 is the set of attributes of the element */
    @Override
    public void startElement(String arg0, String arg1, String arg2,
            Attributes arg3) throws SAXException {
        System.out.println("Start parsing element " + arg2);
        if (arg3 != null) {
            for (int i = 0; i < arg3.getLength(); i++) {
                // getQName() returns the attribute name
                System.out.print(arg3.getQName(i) + "=\"" + arg3.getValue(i) + "\"");
            }
        }
        System.out.print(arg2 + ":");
        super.startElement(arg0, arg1, arg2, arg3);
    }
}
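One detail of characters() deserves care: a SAX parser is allowed to deliver the text of one element in several chunks, and the whitespace between elements is reported through the same callback. A common way to handle this is to buffer the chunks and read the buffer when the element ends. The sketch below assumes elements that contain only text; TextCollectingHandler and collect are hypothetical names used for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers text per element instead of printing each chunk.
class TextCollectingHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> texts = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        buffer.setLength(0); // start with an empty buffer for each element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The third argument is a length, not an end index.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) { // skip elements that held only whitespace
            texts.add(qName + "=" + text);
        }
        buffer.setLength(0);
    }

    // Parses the XML string and returns "name=text" for every element with text.
    static List<String> collect(String xml) {
        TextCollectingHandler handler = new TextCollectingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.texts;
    }

    public static void main(String[] args) {
        System.out.println(collect("<book>\n  <title>Learning XML</title>\n</book>"));
    }
}
```

Mixed content, where text and child elements are interleaved inside one element, would need a stack of buffers rather than the single buffer used here.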
XML document:
<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book id="001">
        <title>Harry Potter</title>
        <author>J K. Rowling</author>
    </book>
    <book id="002">
        <title>Learning XML</title>
        <author>Erik T. Ray</author>
    </book>
</books>
TestDemo test class:
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class TestDemo {
    public static void main(String[] args) throws Exception {
        // 1. Instantiate the SAXParserFactory object
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // 2. Create a parser
        SAXParser parser = factory.newSAXParser();
        // 3. Obtain the document to be parsed, create the handler, and finally parse the document
        File f = new File("books.xml");
        SaxHandler dh = new SaxHandler();
        parser.parse(f, dh);
    }
}
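The factory created in step 1 is not namespace-aware by default, which is why the handler works with the third name argument (the qualified name). The sketch below shows, under that assumption, how to switch namespace processing on with setNamespaceAware and inspect the three name arguments. NamespaceArgsDemo, describe, and the urn:example:books namespace are hypothetical and used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: shows what startElement receives for its name arguments.
class NamespaceArgsDemo {

    // Parses the XML string and returns one description line per element start.
    static List<String> describe(String xml, boolean namespaceAware) {
        List<String> seen = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                seen.add("uri=" + uri + " localName=" + localName + " qName=" + qName);
            }
        };
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<b:book xmlns:b=\"urn:example:books\"/>";
        System.out.println(describe(xml, true));
        System.out.println(describe(xml, false));
    }
}
```

Running main with both settings makes the difference visible: with namespace processing enabled, the URI and local name are filled in, while the default setting leaves the handler relying on the qualified name.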
Output result:
............ Start parsing documents ............
Start parsing element books
books:
Start parsing element book
id="001"book:
Start parsing element title
title:Harry Potter
End parsing element title
Start parsing element author
author:J K. Rowling
End parsing element author
End parsing element book
Start parsing element book
id="002"book:
Start parsing element title
title:Learning XML
End parsing element title
Start parsing element author
author:Erik T. Ray
End parsing element author
End parsing element book
End parsing element books
............ End parsing document ............
Although the output above shows the execution process correctly, it is messy.
To show the process more clearly, we can rewrite SaxHandler so that it reproduces the original XML document.
The rewritten SaxHandler class:
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandler extends DefaultHandler {
    // Print the text content exactly as it is delivered
    @Override
    public void characters(char[] arg0, int arg1, int arg2) throws SAXException {
        System.out.print(new String(arg0, arg1, arg2));
        super.characters(arg0, arg1, arg2);
    }
    @Override
    public void endDocument() throws SAXException {
        System.out.println("\nEnd parsing");
        super.endDocument();
    }
    // Print the closing tag, for example </book>
    @Override
    public void endElement(String arg0, String arg1, String arg2)
            throws SAXException {
        System.out.print("</");
        System.out.print(arg2);
        System.out.print(">");
        super.endElement(arg0, arg1, arg2);
    }
    // Print the XML declaration before any element
    @Override
    public void startDocument() throws SAXException {
        System.out.println("Start parsing");
        String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        System.out.println(s);
        super.startDocument();
    }
    // Print the opening tag together with its attributes
    @Override
    public void startElement(String arg0, String arg1, String arg2,
            Attributes arg3) throws SAXException {
        System.out.print("<");
        System.out.print(arg2);
        if (arg3 != null) {
            for (int i = 0; i < arg3.getLength(); i++) {
                System.out.print(" " + arg3.getQName(i) + "=\"" + arg3.getValue(i) + "\"");
            }
        }
        System.out.print(">");
        super.startElement(arg0, arg1, arg2, arg3);
    }
}
Execution result:
It looks much better now. Reproducing the original document illustrates the parsing process more clearly.