Four ways to generate and parse XML documents in Java (Introduction + advantages and disadvantages comparison + example)

Last Update:2017-01-18 Source: Internet

Author: User

There are many ways to parse XML in Java, but four approaches dominate: DOM, SAX, JDOM, and dom4j.

The jar packages for these four methods can be obtained from the following addresses:

DOM: now included in the Java JDK; as a separate download it is packaged in xml-apis.jar.

SAX: http://sourceforge.net/projects/sax/

JDOM: http://jdom.org/downloads/index.html

dom4j: http://sourceforge.net/projects/dom4j/

First, introduction and analysis of advantages and disadvantages

1. DOM (Document Object Model)

DOM is the official W3C standard for representing XML documents in a platform- and language-independent way. A DOM is a collection of nodes, or pieces of information, organized in a hierarchy. This hierarchy lets developers search the tree for specific information. Analyzing the structure usually requires loading the entire document and constructing the hierarchy before any work can be done. Because it is based on a hierarchy of information, DOM is considered tree-based or object-based.

Advantages

① Allows applications to change both the data and the structure.

② Access is bidirectional: you can navigate up and down the tree at any time to obtain and manipulate any part of the data.

Shortcomings

① The entire XML document usually has to be loaded to construct the hierarchy, which consumes a lot of resources.
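
The points above can be illustrated with a minimal sketch that uses only the DOM API bundled with the JDK. The class name DomNavigationSketch and the inline sample string are assumptions made for this illustration; they are not part of the original article's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DomNavigationSketch {

    // Hypothetical sample data, written without whitespace between elements.
    static final String XML =
            "<users><user id=\"0\"><name>Alexia</name><age>23</age></user></users>";

    /** Parses the sample, walks down and back up the tree, then edits it. */
    static String describe() {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The whole document is loaded into a tree before any work starts.
            Document doc = db.parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));

            // Downward: root -> first <user> -> its <name>.
            Element user = (Element) doc.getDocumentElement().getElementsByTagName("user").item(0);
            Node name = user.getElementsByTagName("name").item(0);

            // Upward: from <name> back to its parent <user>.
            Element parent = (Element) name.getParentNode();

            // Modification: change existing data and add a new child element.
            name.setTextContent("Alice");
            Element sex = doc.createElement("sex");
            sex.setTextContent("Female");
            parent.appendChild(sex);

            return parent.getTagName() + "#" + parent.getAttribute("id")
                    + " name=" + name.getTextContent()
                    + " children=" + parent.getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not process the sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints: user#0 name=Alice children=3
    }
}
```

Both directions of navigation and the in-place edit work only because the full tree is held in memory, which is also the source of the resource cost listed under shortcomings.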

2. SAX (Simple API for XML)

The advantages of SAX processing are very similar to those of streaming media. Analysis can begin immediately rather than waiting for all of the data to be processed. And because the application only examines data as it is read, it does not need to store the data in memory, which is a great advantage for large documents. In fact, the application does not even have to parse the entire document; it can stop parsing once a condition is satisfied. In general, SAX is also much faster than its alternative, DOM.

DOM or SAX? Choosing between the DOM and SAX parsing models is a very important design decision for developers who need to write their own code to process XML documents. DOM accesses an XML document through a tree structure, while SAX uses an event model.

A DOM parser converts an XML document into a tree containing its contents, and the tree can then be traversed. The advantage of the DOM parsing model is that it is easy to program against: the developer only needs to call the instruction that builds the tree and then use the navigation APIs to reach the required tree nodes. Elements in the tree can easily be added and modified. However, because a DOM parser has to process the entire XML document, its performance and memory requirements are high, especially for large XML files. Because of its traversal capabilities, DOM parsers are often used in services where XML documents change frequently.

A SAX parser uses an event-based model: it fires a series of events as it parses an XML document, and when a given tag is found it invokes a callback method to report that the tag has been found. SAX usually needs less memory because it lets developers decide for themselves which tags to process. SAX scales better, especially when developers only need part of the data contained in the document. But coding against a SAX parser is harder, and it is difficult to access several different pieces of data in the same document at the same time.

Advantages

① Analysis can begin immediately, without waiting for all of the data to be processed.

② Data is examined only as it is read and does not need to be kept in memory.

③ Parsing can stop as soon as a condition is satisfied, without parsing the entire document.

④ High efficiency and performance; it can parse documents larger than system memory.

Shortcomings

① The application itself is responsible for the tag-processing logic (such as maintaining parent/child relationships); the more complex the document, the more complex the program.

② Navigation is one-way and the position in the document hierarchy is not available, so it is difficult to access different parts of the same document at the same time, and XPath is not supported.
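
The following sketch, written against the SAX API bundled with the JDK, shows two of the points above: the handler keeps its own stack of open elements because SAX does not report parent/child relationships, and it aborts parsing as soon as the wanted value has been read. The names SaxEarlyStopSketch, FirstNameHandler, and StopParsing are hypothetical, and throwing a marker exception is only one common way to stop early, not something the article prescribes.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEarlyStopSketch {

    // Hypothetical sample data for the sketch.
    static final String XML = "<users><user id=\"0\"><name>Alexia</name><age>23</age></user>"
            + "<user id=\"1\"><name>Edward</name><age>24</age></user></users>";

    /** Marker exception used only to abort parsing on purpose. */
    static class StopParsing extends SAXException {
        StopParsing() { super("wanted value found"); }
    }

    static class FirstNameHandler extends DefaultHandler {
        final Deque<String> path = new ArrayDeque<>(); // open elements, innermost first
        final StringBuilder text = new StringBuilder();
        String firstName;
        int elementsSeen;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            path.push(qName);
            elementsSeen++;
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called several times per text node
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            path.pop();
            // After the pop, the top of the stack is the parent of the element that just closed.
            if (qName.equals("name") && "user".equals(path.peek())) {
                firstName = text.toString();
                throw new StopParsing(); // the rest of the document is never read
            }
        }
    }

    static FirstNameHandler findFirstName(String xml) {
        FirstNameHandler handler = new FirstNameHandler();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing expected) {
            // Condition met: parsing was stopped deliberately.
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        FirstNameHandler h = findFirstName(XML);
        System.out.println(h.firstName + " found after " + h.elementsSeen + " start tags");
    }
}
```

Even this small task needs explicit state (the stack and the text buffer), which is the extra program complexity that the first shortcoming describes.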

3. JDOM (Java-based Document Object Model)

The goal of JDOM is to be a Java-specific document model that simplifies interaction with XML and is faster than using DOM. Because it was the first Java-specific model, JDOM has been heavily publicized and promoted. It is being considered for eventual use as a "Java Standard Extension" through Java Specification Request JSR-102. JDOM development started in early 2000.

JDOM differs from DOM in two main ways. First, JDOM uses only concrete classes rather than interfaces. This simplifies the API in some respects but also limits flexibility. Second, the API makes extensive use of the Collections classes, which simplifies things for Java developers who are already familiar with them.

The JDOM documentation states that its purpose is to "solve 80% (or more) of Java/XML problems with 20% (or less) of the effort" (with the 20% presumably measured by the learning curve). JDOM is certainly usable for most Java/XML applications, and most developers find the API much easier to understand than DOM. JDOM also includes fairly extensive checks on program behavior to prevent users from doing anything that makes no sense in XML. However, it still requires you to understand XML thoroughly in order to do anything beyond the basics (or even to understand the errors in some cases). That is probably a more meaningful effort than learning the DOM or JDOM interfaces.

JDOM itself does not contain a parser. It normally uses a SAX2 parser to parse and validate the input XML document (although it can also take a previously constructed DOM representation as input). It includes converters that output the JDOM representation as a SAX2 event stream, a DOM model, or an XML text document. JDOM is open source, released under a variant of the Apache license.

Advantages

① Uses concrete classes rather than interfaces, which simplifies the API compared with DOM.

② Makes extensive use of the Java Collections classes, which is convenient for Java developers.

Shortcomings

① Less flexible.

② Poor performance.

4. DOM4J (Document Object Model for Java)

Although dom4j is a completely independent development effort, it started out as a kind of intelligent fork of JDOM. It incorporates many features beyond basic XML document representation, including integrated XPath support, XML Schema support, and event-based processing for large or streamed documents. It also offers options for building a document representation that can be accessed in parallel through the dom4j API and the standard DOM interfaces. It has been under development since the second half of 2000.

To support all of these features, dom4j uses interfaces and abstract base classes. dom4j makes heavy use of the Collections classes in its API, but in many cases it also provides alternatives that allow better performance or a more direct coding style. The direct benefit is that, although dom4j pays the price of a more complex API, it offers much greater flexibility than JDOM.

While adding flexibility, XPath integration, and large-document processing as goals, dom4j shares JDOM's aims: ease of use and intuitive operation for Java developers. It also aims to be a more complete solution than JDOM, with the goal of handling essentially all Java/XML problems. In pursuing that goal, it puts less emphasis than JDOM on preventing incorrect application behavior.

dom4j is a very good Java XML API, with excellent performance, powerful features, and great ease of use, and it is open-source software. More and more Java software now uses dom4j to read and write XML; notably, even Sun's JAXM uses dom4j.

Advantages

① Makes extensive use of the Java Collections classes, which is convenient for Java developers, while also providing alternative methods that improve performance.

② Supports XPath.

③ Very good performance.

Shortcomings

① Uses a large number of interfaces, so the API is more complex.
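
To show what XPath support buys in practice, here is a minimal sketch of an XPath query. It deliberately uses the javax.xml.xpath API bundled with the JDK as a stand-in, so that it runs without extra jars; it is not dom4j's own XPath API. The class name XPathSketch and the method namesOfAge are hypothetical, and the data mirrors the sample document used in the examples later in this article.

```java
import java.io.StringReader;
import java.util.StringJoiner;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {

    // Same users as the article's sample document, written compactly.
    static final String XML = "<users>"
            + "<user id=\"0\"><name>Alexia</name><age>23</age><sex>Female</sex></user>"
            + "<user id=\"1\"><name>Edward</name><age>24</age><sex>Male</sex></user>"
            + "<user id=\"2\"><name>wjm</name><age>23</age><sex>Female</sex></user>"
            + "<user id=\"3\"><name>wh</name><age>24</age><sex>Male</sex></user>"
            + "</users>";

    /** Returns the names of all users with the given age, comma-separated. */
    static String namesOfAge(String xml, int age) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // One expression replaces the nested loops of a manual tree walk.
            NodeList names = (NodeList) xpath.evaluate(
                    "/users/user[age=" + age + "]/name",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            StringJoiner joined = new StringJoiner(",");
            for (int i = 0; i < names.getLength(); i++) {
                joined.add(names.item(i).getTextContent());
            }
            return joined.toString();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("invalid XPath expression or XML input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namesOfAge(XML, 23)); // prints: Alexia,wjm
    }
}
```

A query like this needs random access to the document tree, which is why it is available for tree models such as DOM and dom4j but not for plain SAX.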

Second, the comparison

1. dom4j has the best performance; even Sun's JAXM uses dom4j. Many open-source projects currently use dom4j; for example, the well-known Hibernate uses dom4j to read its XML configuration files. If portability is not a concern, use dom4j.

2. JDOM and DOM performed poorly in performance testing, running out of memory on a 10 MB document, but they are portable. DOM and JDOM are still worth considering for small documents. Although the JDOM developers have said that they expect to focus on performance before the official release, from a performance standpoint it has little to recommend it. DOM, on the other hand, is still a very good choice. DOM implementations are widely used in many programming languages. It is also the basis for many other XML-related standards, and because it is an official W3C Recommendation (as opposed to a non-standard Java-based model), it may be required in some types of projects (such as using DOM in JavaScript).

3. SAX performs well, thanks to its particular parsing mode: event-driven. SAX inspects the incoming XML stream but does not load it into memory (of course, while the XML stream is being read, parts of the document are temporarily held in memory).

My opinion: if the XML document is large and portability is not a concern, dom4j is recommended; if the XML document is small, JDOM is recommended; and if the data needs to be processed as it arrives without being stored, consider SAX. In any case, the same advice applies: the best choice is the one that suits you. If time permits, try all four methods and then pick the one that fits.

Third, the example

To save space, the code for building XML documents with the four methods and the comparison between them are not given for now. Only the code for parsing XML documents is shown here, not the complete project (building the XML document + parsing XML + test comparison).

The following XML content is parsed here:

<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user id="0">
    <name>Alexia</name>
    <age>23</age>
    <sex>Female</sex>
  </user>
  <user id="1">
    <name>Edward</name>
    <age>24</age>
    <sex>Male</sex>
  </user>
  <user id="2">
    <name>wjm</name>
    <age>23</age>
    <sex>Female</sex>
  </user>
  <user id="3">
    <name>wh</name>
    <age>24</age>
    <sex>Male</sex>
  </user>
</users>

First, define an interface for parsing XML documents:

package com.xml;

/**
 * @author Alexia
 *
 * Defines an interface for parsing XML documents
 */
public interface XmlDocument {

    /**
     * Parses an XML document
     *
     * @param fileName
     *            full path of the file
     */
    public void parserXml(String fileName);
}

1. DOM Example

package com.xml;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/**
 * @author Alexia
 *
 * Parses an XML document with DOM
 */
public class DomDemo implements XmlDocument {

    public void parserXml(String fileName) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            // Pass a File so that fileName is treated as a path rather than a URI
            Document document = db.parse(new File(fileName));

            // The document's child is the root element <users>
            NodeList users = document.getChildNodes();
            for (int i = 0; i < users.getLength(); i++) {
                Node user = users.item(i);

                // Children of <users>: <user> elements plus whitespace text nodes
                NodeList userInfo = user.getChildNodes();
                for (int j = 0; j < userInfo.getLength(); j++) {
                    Node node = userInfo.item(j);
                    if (node.getNodeType() != Node.ELEMENT_NODE) {
                        continue; // skip the whitespace between elements
                    }

                    // Children of <user>: <name>, <age>, <sex>
                    NodeList userMeta = node.getChildNodes();
                    for (int k = 0; k < userMeta.getLength(); k++) {
                        Node meta = userMeta.item(k);
                        // Compare node types, not strings with !=
                        if (meta.getNodeType() == Node.ELEMENT_NODE) {
                            System.out.println(meta.getNodeName() + ":" + meta.getTextContent());
                        }
                    }
                    System.out.println();
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
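
The code for building XML documents is not included in this article. As a rough idea of what the DOM side of it could look like, here is a minimal sketch that builds one user of the sample document in memory and serializes it with the JDK's Transformer API. The class name DomBuildSketch and the helper child are hypothetical, and this is not the author's original generation code.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomBuildSketch {

    /** Hypothetical helper: creates <tag>text</tag> owned by the given document. */
    static Element child(Document doc, String tag, String text) {
        Element element = doc.createElement(tag);
        element.setTextContent(text);
        return element;
    }

    /** Builds a small <users> document and returns it as an XML string. */
    static String buildUsersXml() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

            Element users = doc.createElement("users");
            doc.appendChild(users);

            Element user = doc.createElement("user");
            user.setAttribute("id", "0");
            user.appendChild(child(doc, "name", "Alexia"));
            user.appendChild(child(doc, "age", "23"));
            user.appendChild(child(doc, "sex", "Female"));
            users.appendChild(user);

            // An identity Transformer turns the in-memory tree back into text.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build the XML document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUsersXml());
    }
}
```

To write a file instead of a string, the StreamResult can wrap a java.io.File or an output stream rather than the StringWriter.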

2. SAX Example

package com.xml;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/**
 * @author Alexia
 *
 * Parses an XML document with SAX
 */
public class SaxDemo implements XmlDocument {

    public void parserXml(String fileName) {
        SAXParserFactory saxfac = SAXParserFactory.newInstance();
        // try-with-resources closes the stream even if parsing fails
        try (InputStream is = new FileInputStream(fileName)) {
            SAXParser saxparser = saxfac.newSAXParser();
            saxparser.parse(is, new MySAXHandler());
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

class MySAXHandler extends DefaultHandler {

    boolean hasAttribute = false;
    Attributes attributes = null;

    public void startDocument() throws SAXException {
        // System.out.println("Document printing started");
    }

    public void endDocument() throws SAXException {
        // System.out.println("Document printing finished");
    }

    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        if (qName.equals("users")) {
            return;
        }
        if (qName.equals("user")) {
            return;
        }
        if (attributes.getLength() > 0) {
            // Copy the attributes: the parser may reuse the object after this call
            this.attributes = new AttributesImpl(attributes);
            this.hasAttribute = true;
        }
    }

    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (hasAttribute && (attributes != null)) {
            // Print every stored attribute as name:value
            for (int i = 0; i < attributes.getLength(); i++) {
                System.out.print(attributes.getQName(i) + ":" + attributes.getValue(i));
            }
        }
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        // Prints element text as well as the whitespace between elements
        System.out.print(new String(ch, start, length));
    }
}
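
The handler above prints text as it arrives. When the values are needed as data, the application has to assemble them itself from the callbacks. The following sketch shows one possible way to collect each user into a map; the class SaxCollectSketch, the handler UserHandler, and the method collect are hypothetical names, and only the SAX API bundled with the JDK is used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCollectSketch {

    // Hypothetical sample with whitespace between elements, as in a real file.
    static final String XML = "<users>\n"
            + "  <user id=\"0\"><name>Alexia</name><age>23</age></user>\n"
            + "  <user id=\"1\"><name>Edward</name><age>24</age></user>\n"
            + "</users>";

    static class UserHandler extends DefaultHandler {
        final List<Map<String, String>> users = new ArrayList<>();
        private Map<String, String> current;              // the <user> being read, if any
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0);
            if ("user".equals(qName)) {
                current = new LinkedHashMap<>();
                current.put("id", atts.getValue("id"));
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text can arrive in several chunks, so buffer it until the end tag.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("user".equals(qName)) {
                users.add(current);
                current = null;
            } else if (current != null) {
                current.put(qName, text.toString().trim());
            }
        }
    }

    static List<Map<String, String>> collect(String xml) {
        UserHandler handler = new UserHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
        return handler.users;
    }

    public static void main(String[] args) {
        for (Map<String, String> user : collect(XML)) {
            System.out.println(user);
        }
    }
}
```

Only one user is held in the handler at a time while parsing, so the memory needed during parsing depends on what the application chooses to keep rather than on the size of the document.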

3. JDOM Example

package com.xml;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;

/**
 * @author Alexia
 *
 * Parses an XML document with JDOM
 */
public class JDomDemo implements XmlDocument {

    public void parserXml(String fileName) {
        SAXBuilder builder = new SAXBuilder();
        try {
            // Pass a File so that fileName is treated as a path rather than a URI
            Document document = builder.build(new File(fileName));
            Element users = document.getRootElement();

            // In JDOM 2, getChildren returns List<Element>, so no casts are needed
            List<Element> userList = users.getChildren("user");
            for (Element user : userList) {
                for (Element info : user.getChildren()) {
                    System.out.println(info.getName() + ":" + info.getValue());
                }
                System.out.println();
            }
        } catch (JDOMException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

4. dom4j Example

package com.xml;

import java.io.File;
import java.util.Iterator;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

/**
 * @author Alexia
 *
 * Parses an XML document with dom4j
 */
public class Dom4jDemo implements XmlDocument {

    public void parserXml(String fileName) {
        File inputXml = new File(fileName);
        SAXReader saxReader = new SAXReader();
        try {
            Document document = saxReader.read(inputXml);
            Element users = document.getRootElement();
            // Iterate over the <user> elements under the root
            for (Iterator<?> i = users.elementIterator(); i.hasNext();) {
                Element user = (Element) i.next();
                // Iterate over the child elements of each <user>
                for (Iterator<?> j = user.elementIterator(); j.hasNext();) {
                    Element node = (Element) j.next();
                    System.out.println(node.getName() + ":" + node.getText());
                }
                System.out.println();
            }
        } catch (DocumentException e) {
            System.out.println(e.getMessage());
        }
    }
}

That is all for this article. I hope it helps with your learning, and I hope you will support the Yunqi Community.

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
