Examples of parsing XML files with DOM and SAX in Java

Source: Internet
Author: User
Tags event listener flush stub xml parser stringbuffer

DOM4J Introduction
DOM4J's project address: http://sourceforge.net/projects/dom4j/?source=directory

dom4j is a simple open source library for working with XML, XPath, and XSLT. It is based on the Java platform, uses the Java Collections framework, and fully integrates with DOM, SAX, and JAXP.

The use of dom4j
After downloading the dom4j project, unzip it and add its jar (the version I am using is dom4j-1.6.1.jar) to the classpath

(Properties -> Java Build Path -> Add External JARs...).

You can then program against the APIs it provides.

Program Instance 1
The first program generates an XML document from Java code:

package com.example.xml.dom4j;

import java.io.FileOutputStream;
import java.io.FileWriter;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

/**
 * Learning the dom4j framework: use dom4j to create XML documents and write them out
 */
public class Dom4jTest1 {

    public static void main(String[] args) throws Exception {
        // First way: create a document, then create the root element
        // Create the document using the helper class
        Document document = DocumentHelper.createDocument();
        // Create the root node and add it to the document
        Element root = DocumentHelper.createElement("student");
        document.setRootElement(root);

        // Second way: create the document and set its root element in one step
        Element root2 = DocumentHelper.createElement("student");
        Document document2 = DocumentHelper.createDocument(root2);

        // Add an attribute
        root2.addAttribute("name", "Zhangsan");
        // Add child nodes: addElement returns the new element
        Element helloElement = root2.addElement("Hello");
        Element worldElement = root2.addElement("World");

        helloElement.setText("Hello Text");
        worldElement.setText("World text");

        // Output
        // Output to the console
        XMLWriter xmlWriter = new XMLWriter();
        xmlWriter.write(document);

        // Output to a file
        // Format: indent with 4 spaces, and set newlines to true
        OutputFormat format = new OutputFormat("    ", true);
        XMLWriter xmlWriter2 = new XMLWriter(new FileOutputStream("Student.xml"), format);
        xmlWriter2.write(document2);

        // Another way to output. Remember to call flush(), otherwise the file comes out empty
        XMLWriter xmlWriter3 = new XMLWriter(new FileWriter("Student2.xml"), format);
        xmlWriter3.write(document2);
        xmlWriter3.flush();
        // Calling close() also works
    }
}

Console output of the program:

<?xml version="1.0" encoding="UTF-8"?>
<student/>

The generated XML document:

<?xml version="1.0" encoding="UTF-8"?>
<student name="Zhangsan">
    <Hello>Hello Text</Hello>
    <World>World text</World>
</student>


Program Instance 2
The second program reads an XML document, parses it, and prints its content.

First, the document to be parsed is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<students name="Zhangsan">


Output after the code runs:

Root: students
Total child count: 6
Hello child: 3
World attr: name=wangwu
Iterative output-----------------------
Lisi
lisi2
lisi3
wangwu
wangwu2
null
With DOMReader-----------------------
Root: students


Parsing XML with SAX
Here are the steps for parsing a document with SAX.
The first approach uses XMLReader:
(1) Step one: create a new factory, SAXParserFactory:
SAXParserFactory factory = SAXParserFactory.newInstance();
(2) Step two: have the factory produce a SAX parser, SAXParser:
SAXParser parser = factory.newSAXParser();
(3) Step three: obtain an XMLReader instance from the SAXParser:
XMLReader reader = parser.getXMLReader();
(4) Step four: register your own handler with the XMLReader. The most important one is usually ContentHandler:
reader.setContentHandler(this);
(5) Step five: once the XML document or resource is available as a Java InputStream, start parsing:
reader.parse(new InputSource(is));
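The five steps above can be put together in one small program. This is only a sketch: the class XmlReaderSteps, the CountingHandler, and the inline XML string are made up for illustration, and the handler simply counts start tags instead of doing real work.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class XmlReaderSteps {

    /** Hypothetical handler that only counts start tags. */
    static class CountingHandler extends DefaultHandler {
        int elementCount = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elementCount++;
        }
    }

    /** Runs the five steps against an in-memory XML string and returns the element count. */
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance(); // step 1
        SAXParser parser = factory.newSAXParser();                 // step 2
        XMLReader reader = parser.getXMLReader();                  // step 3
        CountingHandler handler = new CountingHandler();
        reader.setContentHandler(handler);                         // step 4
        InputStream is = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        reader.parse(new InputSource(is));                         // step 5
        return handler.elementCount;
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption, not taken from the article)
        String xml = "<persons><person id=\"1\"><name>a</name></person></persons>";
        System.out.println(countElements(xml));
    }
}
```

In the article the handler is the class itself (hence "this"); a separate handler object is used here only to keep the sketch short.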


The second approach uses SAXParser directly:
(1) Step one: create a new factory, SAXParserFactory:
SAXParserFactory factory = SAXParserFactory.newInstance();
(2) Step two: have the factory produce a SAX parser, SAXParser:
SAXParser parser = factory.newSAXParser();
(3) Step three: once the XML document or resource is available as a Java InputStream, start parsing:
parser.parse(is, this);
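The shorter SAXParser route might look like the sketch below. The class name SaxParserSteps and the sample XML are assumptions for illustration; the handler is passed straight to parse(), so no XMLReader is needed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxParserSteps {

    /** Collects the qualified names of all elements in document order. */
    static List<String> elementNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance(); // step 1
        SAXParser parser = factory.newSAXParser();                 // step 2
        // step 3: the InputStream and the handler go to parse() together
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption, not taken from the article)
        System.out.println(elementNames("<persons><person><name>x</name></person></persons>"));
    }
}
```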
You have probably noticed ContentHandler above; here are the details.
Before parsing begins, you register a ContentHandler with the XMLReader or SAXParser. It acts as an event listener, and it defines a number of methods:
Sets a locator object that can report where in the document a content event occurred
public void setDocumentLocator(Locator locator)

Handles the start-of-document event
public void startDocument() throws SAXException

Handles the element start event. Its parameters give the element's namespace URI, the element name, the attribute list, and so on.
public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException

Handles the element end event. Its parameters give the element's namespace URI, the element name, and so on.
public void endElement(String namespaceURI, String localName, String qName) throws SAXException

Handles the character content of an element. Its parameters give the text.
public void characters(char[] ch, int start, int length) throws SAXException
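To see when each of these callbacks fires, a handler can simply record them. The sketch below is a hypothetical helper (CallbackTrace is not part of the article's code) that logs the five methods listed above; it assumes the parser supplies a Locator, which the JDK's bundled parser normally does.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

class CallbackTrace {

    /** Parses the XML string and returns the callbacks in the order they fired. */
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator; // supplied by the parser, if it supports locators

            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator;
                events.add("setDocumentLocator");
            }

            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                String where = (locator == null) ? "" : " (line " + locator.getLineNumber() + ")";
                events.add("startElement " + qName + where);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("characters \"" + new String(ch, start, length) + "\"");
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement " + qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<name>Tom</name>")) {
            System.out.println(event);
        }
    }
}
```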
While we are here, these are the relevant XMLReader methods.
Registers the ContentHandler that handles XML document parsing events
public void setContentHandler(ContentHandler handler)

Starts parsing an XML document
public void parse(InputSource input) throws IOException, SAXException

That covers the basics. Next, the parsing steps in detail.
We continue with the code from the previous chapter.
First, create a Person class to store the user's information:

package com.example.demo;

import java.io.Serializable;

public class Person implements Serializable {

    private static final long serialVersionUID = 1L;

    private String _id;
    private String _name;
    private String _age;

    public String get_id() {
        return _id;
    }

    public void set_id(String _id) {
        this._id = _id;
    }

    public String get_name() {
        return _name;
    }

    public void set_name(String _name) {
        this._name = _name;
    }

    public String get_age() {
        return _age;
    }

    public void set_age(String _age) {
        this._age = _age;
    }
}

Next, we implement a ContentHandler to parse the XML.
Implementing a ContentHandler typically involves the following steps:
1. Declare a class that extends DefaultHandler. DefaultHandler is a base class that provides an empty implementation of ContentHandler; we only need to override its methods.
2. Override startDocument() and endDocument(). Initialization before the actual parsing usually goes in startDocument(), and cleanup goes in endDocument().
3. Override startElement(). The XML parser calls this method whenever it encounters a start tag. Typically you check the value of localName here and act on the data accordingly.
4. Override characters(), which is a callback method. After startElement(), the parser reads the node's content and passes it in the ch[] array, delimited by start and length.
5. Override endElement(). This method is the counterpart of startElement(): it runs when the parser finishes a tag, and it is the place to restore state and clear related information.
First, create a new class that extends DefaultHandler and overrides the following methods:

public class SAX_parserXML extends DefaultHandler {

    /**
     * Triggered when the parser starts on the declaration of the XML file;
     * initialization work can go here
     */
    @Override
    public void startDocument() throws SAXException {
        // TODO Auto-generated method stub
        super.startDocument();
    }

    /**
     * Triggered when the parser reaches the start tag of an element
     */
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        // TODO Auto-generated method stub
        super.startElement(uri, localName, qName, attributes);
    }

    /**
     * Triggered when text content is read
     */
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // TODO Auto-generated method stub
        super.characters(ch, start, length);
    }

    /**
     * Triggered when an end tag is read
     */
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // TODO Auto-generated method stub
        super.endElement(uri, localName, qName);
    }

}

First we declare a list to hold the parsed Person data:

List<Person> persons;

But where do we initialize it? We can do so in startDocument(), because that event fires when the parser begins the XML document, which makes it the right place.

/**
 * Triggered when the parser starts on the declaration of the XML file;
 * initialization work can go here
 */
@Override
public void startDocument() throws SAXException {
    // TODO Auto-generated method stub
    super.startDocument();
    // Initialize the list
    persons = new ArrayList<Person>();
}
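One practical effect of initializing in startDocument() is that the same handler object can be reused for several documents, because each parse starts with a fresh list. A minimal sketch of that idea follows; ReusableHandler and the sample XML strings are hypothetical names and data, not part of the article's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class ReusableHandler extends DefaultHandler {

    private List<String> names;

    @Override
    public void startDocument() {
        // Fires once at the beginning of every parse, so state is reset here
        names = new ArrayList<>();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        names.add(qName);
    }

    /** Parses one document with this handler and returns the element names it saw. */
    List<String> parse(String xml) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), this);
        return names;
    }

    public static void main(String[] args) throws Exception {
        ReusableHandler handler = new ReusableHandler();
        System.out.println(handler.parse("<a><b/></a>"));
        // The second result does not contain leftovers from the first parse
        System.out.println(handler.parse("<c/>"));
    }
}
```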

Then we start parsing.

/**
 * Triggered when the parser reaches the start tag of an element
 */
@Override
public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
    // TODO Auto-generated method stub
    super.startElement(uri, localName, qName, attributes);

    // If the tag just read is person, start storing
    if (localName.equals("person")) {
        person = new Person();
        person.set_id(attributes.getValue("id"));
    }
    curNode = localName;
}

In the code above, localName is the name of the element currently being parsed.

Steps:
1. Check whether the element is person.
2. Create a new Person object.
3. Read the id attribute and set it on the Person object.
curNode saves the current element name so that characters() can use it.
/**
 * Triggered when text content is read
 */
@Override
public void characters(char[] ch, int start, int length)
        throws SAXException {
    // TODO Auto-generated method stub
    super.characters(ch, start, length);

    if (person != null) {
        // Extract the value of the current element
        String txt = new String(ch, start, length);
        // Check whether the element is name
        if (curNode.equals("name")) {
            // Store the extracted value in the Person object
            person.set_name(txt);
        } else if (curNode.equals("age")) {
            person.set_age(txt);
        }
    }
}
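One thing worth knowing about characters(): SAX parsers are allowed to deliver the text of a single element in several chunks, so a handler that stores only the latest chunk can end up with a partial value. A common defensive pattern is to accumulate the chunks and read them in endElement(). The sketch below shows that pattern under stated assumptions: TextAccumulator and the sample XML are hypothetical, and the sketch only handles leaf elements (it clears the buffer at every start tag).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class TextAccumulator extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called more than once per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("name".equals(qName)) {
            names.add(text.toString().trim()); // the full text is only known here
        }
        text.setLength(0);
    }

    /** Returns the text of every name element in the given XML string. */
    static List<String> readNames(String xml) throws Exception {
        TextAccumulator handler = new TextAccumulator();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        // An entity reference is a typical place where a parser may split the text
        System.out.println(readNames("<persons><person><name>Tom &amp; Jerry</name></person></persons>"));
    }
}
```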
   

Next, here is what needs to happen when a tag ends.

/**
 * Triggered when an end tag is read
 */
@Override
public void endElement(String uri, String localName, String qName)
        throws SAXException {
    // TODO Auto-generated method stub
    super.endElement(uri, localName, qName);

    // If the person element has ended and person is not null, add it to the list
    if (localName.equals("person") && person != null) {
        persons.add(person);
        person = null;
    }

    curNode = "";
}

That completes the parsing logic. The process is:
1. When an element starts, the startElement method is called.
2. Next, the characters method is called, which provides the element's text value.
3. When the element ends, the endElement method is called.
After parsing finishes, we need a method that returns the list built during parsing:

public List<Person> readXML(InputStream is) {

    SAXParserFactory factory = SAXParserFactory.newInstance();
    try {
        SAXParser parser = factory.newSAXParser();

        // The first method
        // parser.parse(is, this);

        // The second method
        XMLReader reader = parser.getXMLReader();
        reader.setContentHandler(this);
        reader.parse(new InputSource(is));

    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }

    return persons;
}

The code above needs little explanation: pass in an InputStream and its content is parsed.
Having walked through the pieces, here is the complete code:

package com.example.demo.utils;

import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

import com.example.demo.Person;

public class SAX_parserXML extends DefaultHandler {

    List<Person> persons;
    Person person;
    // Current node
    String curNode;

    public List<Person> readXML(InputStream is) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser parser = factory.newSAXParser();

            // The first method
            // parser.parse(is, this);

            // The second method
            XMLReader reader = parser.getXMLReader();
            reader.setContentHandler(this);
            reader.parse(new InputSource(is));
        } catch (Exception e) {
            // TODO: handle exception
            e.printStackTrace();
        }
        return persons;
    }

    /**
     * Triggered when the parser starts on the declaration of the XML file;
     * initialization work can go here
     */
    @Override
    public void startDocument() throws SAXException {
        // TODO Auto-generated method stub
        super.startDocument();
        // Initialize the list
        persons = new ArrayList<Person>();
    }

    /**
     * Triggered when the parser reaches the start tag of an element
     */
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        // TODO Auto-generated method stub
        super.startElement(uri, localName, qName, attributes);
        // If the tag just read is person, start storing
        if (localName.equals("person")) {
            person = new Person();
            person.set_id(attributes.getValue("id"));
        }
        curNode = localName;
    }

    /**
     * Triggered when text content is read
     */
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // TODO Auto-generated method stub
        super.characters(ch, start, length);
        if (person != null) {
            // Extract the value of the current element
            String txt = new String(ch, start, length);
            // Check whether the element is name
            if (curNode.equals("name")) {
                // Store the extracted value in the Person object
                person.set_name(txt);
            } else if (curNode.equals("age")) {
                person.set_age(txt);
            }
        }
    }

    /**
     * Triggered when an end tag is read
     */
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // TODO Auto-generated method stub
        super.endElement(uri, localName, qName);
        // If the person element has ended and person is not null, add it to the list
        if (localName.equals("person") && person != null) {
            persons.add(person);
            person = null;
        }
        curNode = "";
    }

}

Write a method that calls this class:

List<Person> persons = new SAX_parserXML().readXML(is);
StringBuffer buffer = new StringBuffer();
for (int i = 0; i < persons.size(); i++) {
    Person person = persons.get(i);
    buffer.append("ID:" + person.get_id() + "  ");
    buffer.append("Name:" + person.get_name() + "  ");
    buffer.append("Age:" + person.get_age() + "\n");
}
Toast.makeText(activity, buffer, Toast.LENGTH_LONG).show();

If the toast lists each person's ID, name, and age, the parsing succeeded.
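To try the same flow on a desktop JVM instead of Android, something like the sketch below could be used. It is not the article's class: DesktopSaxDemo, the String[] stand-in for Person, and the sample XML layout (a persons root with person elements carrying an id attribute and name and age children) are all assumptions. One detail matters here: a SAXParserFactory from the JDK is not namespace-aware by default, and in that mode localName can be empty, so the sketch calls setNamespaceAware(true) before creating the parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class DesktopSaxDemo extends DefaultHandler {

    private final List<String[]> persons = new ArrayList<>(); // each entry: {id, name, age}
    private String[] current;
    private String curNode = "";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (localName.equals("person")) {
            current = new String[] {atts.getValue("id"), null, null};
        }
        curNode = localName;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current == null) {
            return;
        }
        String txt = new String(ch, start, length);
        // Append rather than overwrite, in case the parser delivers the text in chunks
        if (curNode.equals("name")) {
            current[1] = (current[1] == null) ? txt : current[1] + txt;
        } else if (curNode.equals("age")) {
            current[2] = (current[2] == null) ? txt : current[2] + txt;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (localName.equals("person") && current != null) {
            persons.add(current);
            current = null;
        }
        curNode = "";
    }

    /** Parses the stream and formats the result the same way as the Toast text. */
    static String describe(InputStream is) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so that localName is populated
        DesktopSaxDemo handler = new DesktopSaxDemo();
        factory.newSAXParser().parse(is, handler);
        StringBuilder buffer = new StringBuilder();
        for (String[] p : handler.persons) {
            buffer.append("ID: ").append(p[0]).append("  ")
                  .append("Name: ").append(p[1]).append("  ")
                  .append("Age: ").append(p[2]).append("\n");
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption about the XML layout)
        String xml = "<persons><person id=\"1\"><name>Tom</name><age>20</age></person></persons>";
        System.out.print(describe(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
    }
}
```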

Summary:

DOM (Document Object Model) parsing: the parser reads the entire document and builds a tree structure that resides in memory; code can then manipulate that tree through the DOM interface.

Advantage: the whole document is in memory, which makes it easy to work with; modification, deletion, rearrangement, and similar operations are all supported.

Disadvantage: the whole document is read into memory, including many nodes you may not need, which wastes memory.

Use case: after reading the document once, you need to operate on it multiple times, and hardware resources (memory, CPU) are sufficient.
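As a rough illustration of the DOM style, the sketch below uses the JDK's built-in DOM parser (javax.xml.parsers.DocumentBuilder) rather than dom4j, so it runs without extra jars. DomSketch and the sample XML are hypothetical. The point is that the tree stays in memory, so it can be queried and modified repeatedly after a single read.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomSketch {

    /** Reads the whole XML string into an in-memory tree. */
    static Document load(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Removes the first person element, if any, and returns how many remain. */
    static int removeFirstPerson(Document doc) {
        NodeList persons = doc.getElementsByTagName("person");
        if (persons.getLength() > 0) {
            Element first = (Element) persons.item(0);
            first.getParentNode().removeChild(first);
        }
        // The tree is still in memory, so it can be queried again
        return doc.getElementsByTagName("person").getLength();
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption, not taken from the article)
        Document doc = load("<persons><person id=\"1\"/><person id=\"2\"/></persons>");
        System.out.println(removeFirstPerson(doc));
    }
}
```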
 
SAX parsing emerged to address the problems of DOM parsing. Its characteristics:

Advantage: the entire document does not have to be loaded up front, so it uses fewer resources. In embedded environments such as Android, SAX parsing is highly recommended.

Disadvantage: unlike DOM parsing, the document does not stay in memory, so the data is not persistent. If you do not save the data when its event fires, it is lost.

Use case: the machine has performance constraints.
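The trade-off can be seen in a small streaming sketch: the handler below keeps only a running total and discards everything else as the events pass by. AgeSum and the sample XML (person elements with an age child) are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class AgeSum extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private int total = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("age".equals(qName)) {
            // Save what we need now; once the event has passed, the text is gone
            total += Integer.parseInt(text.toString().trim());
        }
        text.setLength(0);
    }

    /** Streams through the document and returns the sum of all age values. */
    static int sumAges(String xml) throws Exception {
        AgeSum handler = new AgeSum();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.total;
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption, not taken from the article)
        String xml = "<persons><person><age>20</age></person><person><age>22</age></person></persons>";
        System.out.println(sumAges(xml));
    }
}
```

Memory use here stays roughly constant regardless of how many person elements the document contains, which is the reason SAX suits constrained devices.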
