As we all know, there are more and more ways to parse XML, but four are the mainstream: DOM, SAX, JDOM and dom4j.
Here is where to get the jar package for each of the four:
DOM: now ships with the Java JDK, in the xml-apis.jar package.
SAX: http://sourceforge.net/projects/sax/
JDOM: http://jdom.org/downloads/index.html
dom4j: http://sourceforge.net/projects/dom4j/
I. Introduction and analysis of advantages and disadvantages
1. DOM(Document Object Model)
DOM is the official W3C standard for representing XML documents in a platform- and language-neutral way. The DOM is a collection of nodes, or pieces of information, organized in a hierarchy. This hierarchy lets developers search the tree for specific information. Analyzing the structure usually requires loading the entire document and building the hierarchy before any work can be done. Because it is based on a hierarchy of information, the DOM is considered tree-based or object-based.
Advantages
① Allows applications to change both the data and the structure.
② Access is bidirectional: you can navigate up and down the tree at any time to retrieve and manipulate any part of the data.
Disadvantages
① Usually needs to load the entire XML document to build the hierarchy, which consumes a lot of resources.
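The two advantages listed above (in-place modification and navigation in both directions) can be illustrated with a minimal sketch that uses only the DOM API bundled with the JDK. The class name DomNavigationSketch, the method navigateAndEdit and the inline sample XML are assumptions made for this illustration; they are not part of the article's project.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomNavigationSketch {

    // Hypothetical sample input; written without whitespace so that
    // sibling navigation does not land on whitespace-only text nodes.
    static final String XML =
        "<users><user id=\"0\"><name>alexia</name><age>23</age></user></users>";

    /** Parses the XML, walks down, up and sideways, edits the tree, returns a summary. */
    static String navigateAndEdit(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The whole document is loaded into memory before any work is done.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // Navigate down: find the first <name> element.
            Element name = (Element) doc.getElementsByTagName("name").item(0);
            // Navigate up: from <name> to its parent <user>.
            Element user = (Element) name.getParentNode();
            // Navigate sideways: from <name> to its next sibling <age>.
            Node age = name.getNextSibling();

            // Change data (text of <age>) and structure (append a new <sex>).
            age.setTextContent("24");
            Element sex = doc.createElement("sex");
            sex.setTextContent("Female");
            user.appendChild(sex);

            return user.getTagName() + "#" + user.getAttribute("id")
                + " age=" + age.getTextContent()
                + " children=" + user.getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM processing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(navigateAndEdit(XML)); // prints: user#0 age=24 children=3
    }
}
```

With an indented input file, getNextSibling() would usually return a whitespace text node first, so real code typically checks the node type before casting.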
2. SAX (Simple API for XML)
The advantages of SAX processing are much like those of streaming media: analysis can begin immediately rather than waiting for all the data to be processed. Also, because the application examines data only as it is read, it does not need to hold the data in memory, which is a huge advantage for large documents. In fact, the application does not even have to parse the entire document; it can stop parsing once a condition is met. In general, SAX is also much faster than its alternative, DOM.
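Stopping the parse once a condition is met is commonly done by throwing an exception from a handler callback. The sketch below shows one way to do this with the JDK's SAX parser; the class SaxEarlyStopSketch, the handler FirstNameHandler, the found flag and the sample XML are all hypothetical names introduced for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxEarlyStopSketch {

    // Hypothetical sample input modeled on a users file with two entries.
    static final String XML =
        "<users><user id=\"0\"><name>alexia</name><age>23</age></user>"
        + "<user id=\"1\"><name>Edward</name><age>24</age></user></users>";

    /** Collects the text of the first <name> element, then aborts the parse. */
    static class FirstNameHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        boolean insideName = false;
        boolean found = false;
        int elementsSeen = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elementsSeen++;
            insideName = "name".equals(qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one text run, so append.
            if (insideName) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if ("name".equals(qName)) {
                found = true;
                // Deliberate abort: the rest of the document is never read.
                throw new SAXException("stop: first <name> has been read");
            }
        }
    }

    static FirstNameHandler findFirstName(String xml) {
        FirstNameHandler handler = new FirstNameHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Only swallow the exception if it is our own deliberate stop.
            if (!handler.found) {
                throw new IllegalStateException("XML parsing failed", e);
            }
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        FirstNameHandler h = findFirstName(XML);
        System.out.println(h.text + " after " + h.elementsSeen + " elements");
    }
}
```

The sample input contains eight elements, but the handler only sees the first three start tags (users, user, name) before the parse ends.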
DOM or SAX? For developers who need to write their own code to handle XML documents, choosing between the DOM and SAX parsing models is a very important design decision. DOM accesses the XML document as a tree structure, while SAX uses an event model.
The DOM parser converts an XML document into a tree containing its contents, and the tree can then be traversed. The advantage of the DOM model is that programming is easy: developers only need to call the tree-building instruction and then use the navigation APIs to reach the tree nodes they need. It is also easy to add and modify elements in the tree. However, because the DOM parser has to process the entire XML document, its performance and memory requirements are high, especially with large XML files. Because of its traversal capabilities, the DOM parser is often used in services where XML documents change frequently.
The SAX parser uses an event-based model: it fires a series of events while parsing the XML document, and when a given tag is found it invokes a callback method to tell that method the tag has been found. SAX's memory requirements are usually low because it lets the developer decide which tags to process. SAX scales better, particularly when the developer only needs part of the data in the document. However, coding with a SAX parser is harder, and it is difficult to access several different pieces of data in the same document at the same time.
Advantages
① Does not need to wait for all the data to be processed; analysis can begin immediately.
② Examines data only as it is read, so the data does not need to be kept in memory.
③ Can stop parsing once a condition is met, without parsing the entire document.
④ High efficiency and performance; can parse documents larger than system memory.
Disadvantages
① The application itself is responsible for the tag-processing logic (for example, maintaining parent/child relationships), and the more complex the document, the more complex the program.
② Navigation is one-way: the parser cannot locate a position in the document hierarchy, it is difficult to access different parts of the same document at the same time, and XPath is not supported.
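The parent/child bookkeeping mentioned in the first disadvantage usually amounts to keeping a stack of open elements in the handler. The following is a minimal sketch of that idea, assuming a hypothetical class SaxPathTrackingSketch, a handler named PathHandler and inline sample XML; none of these names come from the article's project.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxPathTrackingSketch {

    // Hypothetical sample input.
    static final String XML =
        "<users><user id=\"0\"><name>alexia</name><age>23</age></user></users>";

    /** Rebuilds the element path by hand, since SAX does not keep one. */
    static class PathHandler extends DefaultHandler {
        private final Deque<String> path = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();
        final List<String> lines = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            path.addLast(qName);   // entering a child element
            text.setLength(0);     // start collecting its text
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                // The deque iterates from the root to the current element.
                lines.add("/" + String.join("/", path) + "=" + value);
            }
            path.removeLast();     // back to the parent element
            text.setLength(0);
        }
    }

    static List<String> leafPaths(String xml) {
        PathHandler handler = new PathHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        // prints /users/user/name=alexia and /users/user/age=23
        leafPaths(XML).forEach(System.out::println);
    }
}
```

Everything a DOM tree would provide for free (the parent of a node, its depth, its siblings) has to be tracked this way, which is why handler code grows with document complexity.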
3. JDOM (Java-based Document Object Model)
The purpose of JDOM is to be a Java-specific document model that simplifies interaction with XML and is faster than using DOM implementations. Because it was the first Java-specific model, JDOM has been heavily promoted and publicized. It was being considered for eventual adoption as a Java Standard Extension through Java Specification Request JSR-102. JDOM has been in development since early 2000.
There are two main differences between JDOM and DOM. First, JDOM uses only concrete classes rather than interfaces. This simplifies the API in some ways, but it also limits flexibility. Second, the API makes extensive use of the Collections classes, which simplifies things for Java developers who are already familiar with them.
JDOM's stated goal is to "solve 80% (or more) of Java/XML problems with 20% (or less) of the effort" (where the 20% is assumed to refer to the learning curve). JDOM is certainly useful for most Java/XML applications, and most developers find the API much easier to understand than DOM. JDOM also includes fairly extensive checks on program behavior to keep users from doing anything that makes no sense in XML. However, it still requires you to understand XML well to do anything beyond the basics (or, in some cases, even to understand the errors). That is probably a more worthwhile effort than learning the DOM or JDOM interfaces.
JDOM itself does not contain a parser. It usually uses a SAX2 parser to parse and validate the input XML document (although it can also take a previously constructed DOM representation as input). It contains converters to output the JDOM representation as a SAX2 event stream, a DOM model, or an XML text document. JDOM is open source, released under a variant of the Apache license.
Advantages
① Simplifies the DOM API by using concrete classes rather than interfaces.
② Makes heavy use of the Java collection classes, which is convenient for Java developers.
Disadvantages
① Limited flexibility.
② Poor performance.
4. DOM4J (Document Object Model for Java)
Although dom4j represents a completely independent development effort, it started out as a kind of intelligent fork of JDOM. It incorporates a number of features beyond basic XML document representation, including integrated XPath support, XML Schema support, and event-based processing for large or streamed documents. It also offers the option of building a document representation that can be accessed in parallel through the dom4j API and the standard DOM interfaces. It has been in development since the second half of 2000.
To support all of these features, dom4j uses an approach based on interfaces and abstract base classes. dom4j makes extensive use of the Collections classes in its API, but in many cases it also provides alternatives that allow better performance or a more direct coding style. The direct benefit is that, although dom4j pays the price of a more complex API, it offers much greater flexibility than JDOM.
Apart from adding flexibility, XPath integration and large-document handling, dom4j's goals are the same as JDOM's: ease of use and intuitive operation for Java developers. It also aims to be a more complete solution than JDOM, with the goal of handling essentially all Java/XML problems. In pursuing that goal, it places less emphasis than JDOM on preventing incorrect application behavior.
dom4j is a very good Java XML API with excellent performance, powerful features and great ease of use, and it is also open source software. More and more Java software now uses dom4j to read and write XML; it is especially worth mentioning that even Sun's JAXM uses dom4j.
Advantages
① Makes heavy use of the Java collection classes, which is convenient for Java developers, while also providing alternative methods for better performance.
② Supports XPath.
③ Good performance.
Disadvantages
① Uses interfaces heavily, so the API is more complex.
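To give a feel for what XPath support buys you, here is a minimal sketch of an XPath query. Note that it does not use dom4j: so that it runs with the JDK alone, it evaluates the expression with javax.xml.xpath over a standard W3C DOM tree. The class XPathOverDomSketch, the method namesWithAge and the sample XML are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathOverDomSketch {

    // Hypothetical sample input with three users.
    static final String XML =
        "<users>"
        + "<user id=\"0\"><name>alexia</name><age>23</age></user>"
        + "<user id=\"1\"><name>Edward</name><age>24</age></user>"
        + "<user id=\"2\"><name>wjm</name><age>23</age></user>"
        + "</users>";

    /** Returns the names of all users whose <age> equals the given value. */
    static List<String> namesWithAge(String xml, int age) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // One expression replaces the nested loops needed for manual traversal.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                "/users/user[age=" + age + "]/name", doc, XPathConstants.NODESET);

            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(hits.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namesWithAge(XML, 23)); // prints: [alexia, wjm]
    }
}
```

The point of the sketch is the query style rather than the library: a path expression selects nodes anywhere in the tree without hand-written loops.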
II. Comparison
1. dom4j has the best performance; even Sun's JAXM uses dom4j. Many open source projects currently use dom4j heavily; the well-known Hibernate, for example, also uses dom4j to read its XML configuration files. If portability is not a concern, use dom4j.
2. JDOM and DOM performed poorly in performance tests, running out of memory when tested with a 10 MB document, but they are portable. For small documents, DOM and JDOM are still worth considering. Although the JDOM developers have said they expect to focus on performance before the official release, from a performance standpoint it really has little to recommend it. DOM, on the other hand, is still a very good choice. DOM implementations are widely used across many programming languages. It is also the basis for many other XML-related standards, and because it is an official W3C Recommendation (as opposed to the non-standard Java models), it may also be required in certain types of projects (such as using the DOM in JavaScript).
3. SAX performs well, thanks to its particular parsing mode: it is event-driven. A SAX parser inspects the incoming XML stream without loading it into memory (of course, parts of the document are temporarily held in memory while the stream is read).
My view: if the XML document is large and portability is not a concern, dom4j is recommended; if the document is small, JDOM is recommended; if you need to process the data promptly without keeping it, consider SAX. In any case, the old saying still applies: what suits you is best. If time permits, I suggest trying all four approaches and then choosing the one that suits you.
III. Examples
To save space, the code for building XML documents with the four approaches, and the differences between them, is not given here for now; only the code for parsing XML documents is shown. If you need the complete project (building XML documents + parsing XML + test comparison), you can download it from my CSDN.
The examples parse the following XML content:
<?xml version="1.0" encoding="UTF-8"?>
<users>
    <user id="0">
        <name>alexia</name>
        <age>23</age>
        <sex>Female</sex>
    </user>
    <user id="1">
        <name>Edward</name>
        <age>24</age>
        <sex>Male</sex>
    </user>
    <user id="2">
        <name>wjm</name>
        <age>23</age>
        <sex>Female</sex>
    </user>
    <user id="3">
        <name>wh</name>
        <age>24</age>
        <sex>Male</sex>
    </user>
</users>
The interface for parsing XML documents is defined first:
/**
 * @author Alexia
 *
 * Interface defining XML document parsing
 */
public interface XmlDocument {

    /**
     * Parse an XML document
     *
     * @param fileName
     *            full path of the file
     */
    public void parserXml(String fileName);
}
1. DOM example
package com.xml;

import java.io.File;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/**
 * @author Alexia
 *
 * DOM parsing of an XML document
 */
public class DomDemo implements XmlDocument {

    public void parserXml(String fileName) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            // Parse from a File so that a plain path is not treated as a URI
            Document document = db.parse(new File(fileName));

            // Top level of the document: the <users> root element
            NodeList users = document.getChildNodes();
            for (int i = 0; i < users.getLength(); i++) {
                Node user = users.item(i);
                // Children of the root: <user> elements plus whitespace text nodes
                NodeList userInfo = user.getChildNodes();
                for (int j = 0; j < userInfo.getLength(); j++) {
                    Node node = userInfo.item(j);
                    if (node.getNodeType() != Node.ELEMENT_NODE) {
                        continue; // skip whitespace between elements
                    }
                    // Children of <user>: <name>, <age>, <sex>
                    NodeList userMeta = node.getChildNodes();
                    for (int k = 0; k < userMeta.getLength(); k++) {
                        // Compare strings with equals(), not with !=
                        if (!"#text".equals(userMeta.item(k).getNodeName())) {
                            System.out.println(userMeta.item(k).getNodeName()
                                    + ":" + userMeta.item(k).getTextContent());
                        }
                    }
                    System.out.println();
                }
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            // Also covers FileNotFoundException
            e.printStackTrace();
        }
    }
}
2. SAX example
package com.xml;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/**
 * @author Alexia
 *
 * SAX parsing of an XML document
 */
public class SaxDemo implements XmlDocument {

    public void parserXml(String fileName) {
        SAXParserFactory saxfac = SAXParserFactory.newInstance();
        // try-with-resources closes the stream even if parsing fails
        try (InputStream is = new FileInputStream(fileName)) {
            SAXParser saxparser = saxfac.newSAXParser();
            saxparser.parse(is, new MySAXHandler());
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            // Also covers FileNotFoundException
            e.printStackTrace();
        }
    }
}

class MySAXHandler extends DefaultHandler {
    boolean hasAttribute = false;
    Attributes attributes = null;

    public void startDocument() throws SAXException {
        // System.out.println("document started printing");
    }

    public void endDocument() throws SAXException {
        // System.out.println("Document printing ended");
    }

    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        if (qName.equals("users")) {
            return;
        }
        if (qName.equals("user")) {
            return;
        }
        if (attributes.getLength() > 0) {
            // Copy the attributes: the parser may reuse the object it passes in
            this.attributes = new AttributesImpl(attributes);
            this.hasAttribute = true;
        }
    }

    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // Fixed: the original assigned null (=) instead of comparing (!=)
        if (hasAttribute && (attributes != null)) {
            for (int i = 0; i < attributes.getLength(); i++) {
                // Fixed: index with i rather than always 0
                System.out.print(attributes.getQName(i) + ":"
                        + attributes.getValue(i));
            }
            // Reset so the same attributes are not printed for later elements
            hasAttribute = false;
        }
    }

    public void characters(char[] ch, int start, int length)
            throws SAXException {
        System.out.print(new String(ch, start, length));
    }
}
3. JDOM example
package com.xml;

import java.io.IOException;
import java.util.List;

import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;

/**
 * @author Alexia
 *
 * JDOM parsing of an XML document
 */
public class JDomDemo implements XmlDocument {

    public void parserXml(String fileName) {
        SAXBuilder builder = new SAXBuilder();
        try {
            Document document = builder.build(fileName);
            Element users = document.getRootElement();
            // JDOM 2 returns typed lists, so no casts are needed
            List<Element> userList = users.getChildren("user");
            for (int i = 0; i < userList.size(); i++) {
                Element user = userList.get(i);
                List<Element> userInfo = user.getChildren();
                for (int j = 0; j < userInfo.size(); j++) {
                    System.out.println(userInfo.get(j).getName() + ":"
                            + userInfo.get(j).getValue());
                }
                System.out.println();
            }
        } catch (JDOMException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
4. dom4j example
package com.xml;

import java.io.File;
import java.util.Iterator;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

/**
 * @author Alexia
 *
 * dom4j parsing of an XML document
 */
public class Dom4jDemo implements XmlDocument {

    public void parserXml(String fileName) {
        File inputXml = new File(fileName);
        SAXReader saxReader = new SAXReader();
        try {
            Document document = saxReader.read(inputXml);
            Element users = document.getRootElement();
            // Outer loop: each <user> element under the root
            for (Iterator i = users.elementIterator(); i.hasNext();) {
                Element user = (Element) i.next();
                // Inner loop: <name>, <age>, <sex> of that user
                for (Iterator j = user.elementIterator(); j.hasNext();) {
                    Element node = (Element) j.next();
                    System.out.println(node.getName() + ":" + node.getText());
                }
                System.out.println();
            }
        } catch (DocumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
Four ways to generate and parse XML documents (Introduction + Advantages and disadvantages Comparison + example)