Solutions for parsing XML using SAX in Java

Source: Internet
Author: User

In Java, there are two native ways to parse XML documents: DOM parsing and SAX parsing.

DOM parsing is powerful and supports adding, deleting, modifying, and querying nodes. During the operation, the whole XML document is read into memory as a Document object, so it is best suited to small documents.
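As a rough illustration of the DOM style, the sketch below loads a small document into memory, queries it, and then modifies it in place. It uses the JDK's javax.xml.parsers.DocumentBuilderFactory; the class name DomSketch and its helper methods are hypothetical names chosen for this example only.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original article.
public class DomSketch {

    // Parses the whole document into an in-memory tree.
    static Document load(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Query: count the elements with the given tag name.
    static int count(Document doc, String tag) {
        return doc.getElementsByTagName(tag).getLength();
    }

    // Modify: replace the text of the first element with the given tag name.
    static void setFirstText(Document doc, String tag, String text) {
        Element first = (Element) doc.getElementsByTagName(tag).item(0);
        first.setTextContent(text);
    }

    public static void main(String[] args) {
        Document doc = load("<books><book><title>Harry Potter</title></book></books>");
        System.out.println(count(doc, "book"));
        setFirstText(doc, "title", "Learning XML");
        System.out.println(doc.getElementsByTagName("title").item(0).getTextContent());
    }
}
```

Because the tree stays in memory, any node can be revisited or changed at any time, which is exactly what makes this approach costly for large files.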

SAX parsing reads the content sequentially from start to end, which makes modification inconvenient. However, it is well suited to reading large documents that do not need to be changed.

This article focuses on SAX parsing.

SAX parses documents in an event-driven manner. Simply put, it is like watching a movie in a cinema: you watch it from start to end and cannot rewind (DOM, by contrast, can move back and forth through the document).

When watching a movie, every time you encounter a plot point or a moving scene, your brain and nerves are mobilized to receive and process the information.

Similarly, during SAX parsing, callback methods are triggered when the start and end of the document and of each element are read. You can process the corresponding events in these callback methods.

The four methods are: startDocument(), endDocument(), startElement(), and endElement().

In addition, reading the nodes alone is not enough. We also need the characters() method to process the content contained in each element.

These callback methods are combined into one class, which is the handler (the "trigger") we need.

Generally, the file is opened from the main method, but its content is processed in the handler. This is the so-called event-driven parsing method.

For example, the handler is notified when the document starts and then as each element is read, one by one. The content of each element is passed to the characters() method, after which the end of the element is reported.

After all elements have been read, parsing of the file ends.
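To make the order of these callbacks concrete, here is a minimal sketch that records each event name in a list instead of printing it. It relies on the JDK's DefaultHandler base class; the class name EventOrderSketch and the trace method are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original article.
public class EventOrderSketch {

    // Parses the XML text and returns the callbacks in the order they fired.
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                events.add("startElement:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may report one run of text in several calls.
                events.add("characters");
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        trace("<books><book><title>Harry Potter</title></book></books>")
                .forEach(System.out::println);
    }
}
```

Every startElement is eventually matched by an endElement for the same name, and the two document events bracket everything else.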

Now we start to create the handler class. To create this class, we must first extend DefaultHandler.

Create a SaxHandler and override the corresponding methods:

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandler extends DefaultHandler {

    /* This method has three parameters.
       arg0 is a character array that contains the element content.
       arg1 is the start position in the array and arg2 is the number of characters to read. */
    @Override
    public void characters(char[] arg0, int arg1, int arg2) throws SAXException {
        String content = new String(arg0, arg1, arg2);
        System.out.println(content);
        super.characters(arg0, arg1, arg2);
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("\n............ End parsing document ............");
        super.endDocument();
    }

    /* arg0 is the namespace URI.
       arg1 is the local name (without prefix). It is empty when namespace processing is off.
       arg2 is the qualified name, including any namespace prefix. */
    @Override
    public void endElement(String arg0, String arg1, String arg2)
            throws SAXException {
        System.out.println("End parsing element " + arg2);
        super.endElement(arg0, arg1, arg2);
    }

    @Override
    public void startDocument() throws SAXException {
        System.out.println("............ Start parsing documents ............\n");
        super.startDocument();
    }

    /* arg0 is the namespace URI.
       arg1 is the local name (without prefix). It is empty when namespace processing is off.
       arg2 is the qualified name, including any namespace prefix.
       arg3 is the set of attributes of the element. */
    @Override
    public void startElement(String arg0, String arg1, String arg2,
            Attributes arg3) throws SAXException {
        System.out.println("Start parsing element " + arg2);
        if (arg3 != null) {
            for (int i = 0; i < arg3.getLength(); i++) {
                // getQName() returns the attribute name, getValue() its value
                System.out.print(arg3.getQName(i) + "=\"" + arg3.getValue(i) + "\"");
            }
        }
        System.out.print(arg2 + ":");
        super.startElement(arg0, arg1, arg2, arg3);
    }
}
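The three name parameters of startElement() and endElement() are easy to mix up, and what they contain depends on whether the factory is namespace-aware (it is not by default). The following sketch, under the assumption of a made-up namespace URI urn:example:books and a hypothetical class NamespaceNamesSketch, reports the values seen for the root element in both modes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original article.
public class NamespaceNamesSketch {

    // Returns the three name arguments reported for the root element.
    static String describeRoot(String xml, boolean namespaceAware) {
        String[] result = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (result[0] == null) { // only record the first (root) element
                    result[0] = "uri=" + uri + " localName=" + localName + " qName=" + qName;
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // The namespace URI below is an invented example value.
        String xml = "<bk:book xmlns:bk=\"urn:example:books\"/>";
        System.out.println(describeRoot(xml, true));
        System.out.println(describeRoot(xml, false));
    }
}
```

Since the article's TestDemo uses the default factory settings, the qualified name (arg2) is the reliable one there, which is why SaxHandler prints arg2.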

XML document:
<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book id="001">
        <title>Harry Potter</title>
        <author>J K. Rowling</author>
    </book>
    <book id="002">
        <title>Learning XML</title>
        <author>Erik T. Ray</author>
    </book>
</books>

TestDemo test class:
import java.io.File;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class TestDemo {

    public static void main(String[] args) throws Exception {
        // 1. Instantiate the SAXParserFactory object
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // 2. Create a parser
        SAXParser parser = factory.newSAXParser();
        // 3. Obtain the document to be parsed, create the handler, and finally parse the document
        File f = new File("books.xml");
        SaxHandler dh = new SaxHandler();
        parser.parse(f, dh);
    }
}
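The main method above simply declares throws Exception. If you want to report a malformed document more precisely, one possible approach is to catch SAXParseException, which carries the line and column of the problem. The sketch below assumes a hypothetical class ParseErrorSketch and parses from a string rather than a file so that it is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original article.
public class ParseErrorSketch {

    // Returns null when the document is well-formed,
    // otherwise a short description of the first fatal error.
    static String firstError(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or input could not be read", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstError("<books><book></books>"));
    }
}
```

Note that callbacks fired before the error have already run by the time the exception is thrown, so a handler that builds up state should treat that state as incomplete in this case.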

Output result:
............ Start parsing documents ............

Start parsing element books
books:

Start parsing element book
id="001"book:

Start parsing element title
title:Harry Potter
End parsing element title

Start parsing element author
author:J K. Rowling
End parsing element author

End parsing element book

Start parsing element book
id="002"book:

Start parsing element title
title:Learning XML
End parsing element title

Start parsing element author
author:Erik T. Ray
End parsing element author

End parsing element book

End parsing element books

............ End parsing document ............

Although the output above shows the execution process correctly, it is messy.
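Part of the mess comes from the whitespace between elements, which the parser also delivers to characters(), and a parser is allowed to deliver one run of text in several calls. A common way to deal with both is to buffer the text and emit it when the element ends. The sketch below shows that idea; TextCollectorSketch and collect are hypothetical names, and trimming the text is an assumption that suits this sample document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original article.
public class TextCollectorSketch {

    // Returns one "name: text" line per element that has non-blank text.
    static List<String> collect(String xml) {
        List<String> lines = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // drop whitespace seen before this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per text run
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    lines.add(qName + ": " + value);
                }
                text.setLength(0);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        String xml = "<books>\n  <book id=\"001\">\n    <title>Harry Potter</title>\n"
                + "    <author>J K. Rowling</author>\n  </book>\n</books>";
        collect(xml).forEach(System.out::println);
    }
}
```

This simple buffer is enough for documents like books.xml, where text appears only in leaf elements. Mixed content would need a stack of buffers instead.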

To show this process more clearly, we can rewrite SaxHandler so that it reproduces the original XML document.

The rewritten SaxHandler class:

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandler extends DefaultHandler {

    @Override
    public void characters(char[] arg0, int arg1, int arg2) throws SAXException {
        System.out.print(new String(arg0, arg1, arg2));
        super.characters(arg0, arg1, arg2);
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("\nEnd parsing");
        super.endDocument();
    }

    @Override
    public void endElement(String arg0, String arg1, String arg2)
            throws SAXException {
        System.out.print("</");
        System.out.print(arg2);
        System.out.print(">");
        super.endElement(arg0, arg1, arg2);
    }

    @Override
    public void startDocument() throws SAXException {
        System.out.println("Start parsing");
        String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        System.out.println(s);
        super.startDocument();
    }

    @Override
    public void startElement(String arg0, String arg1, String arg2,
            Attributes arg3) throws SAXException {

        System.out.print("<");
        System.out.print(arg2);

        if (arg3 != null) {
            for (int i = 0; i < arg3.getLength(); i++) {
                System.out.print(" " + arg3.getQName(i) + "=\"" + arg3.getValue(i) + "\"");
            }
        }
        System.out.print(">");
        super.startElement(arg0, arg1, arg2, arg3);
    }

}
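One caveat when printing the document back this way: the parser hands text and attribute values to the handler with entities already decoded, so a value containing & or < would be printed raw and the result would no longer be well-formed XML. The sample books.xml has no such characters, but a small escaping helper along the following lines could be applied before printing. XmlEscapeSketch and escape are hypothetical names, and the helper covers only the four characters shown.

```java
// Hypothetical helper class, not part of the original article.
public class XmlEscapeSketch {

    // Replaces the characters that must not appear raw in XML text
    // or in double-quoted attribute values.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Tom & \"Jerry\" <1>"));
        // prints: Tom &amp; &quot;Jerry&quot; &lt;1&gt;
    }
}
```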

Execution result:

It looks much better now. Reproducing the document in this way illustrates the parsing process more clearly.
