Java offers many techniques for parsing XML, including DOM, SAX, JDOM, and dom4j. The advantages and disadvantages of these four approaches are described below.
First, define an interface, XmlDocument, for working with XML documents. It declares the methods for creating and parsing XML, as follows:
package com.interview.xml;

/**
 * Defines an interface for creating and parsing XML documents.
 *
 * @author Xgj
 */
public interface XmlDocument {

    /** Create an XML document. */
    void createXml(String fileName);

    /** Parse an XML document. */
    void parserXml(String fileName);
}
First, DOM
DOM is the official W3C standard for representing XML documents in a platform- and language-neutral way. A DOM is a collection of nodes, or pieces of information, organized in a hierarchy. This hierarchy lets developers navigate the tree to look for specific information. Analyzing the structure usually requires loading the entire document and building the hierarchy before any work can be done. Because it is based on a hierarchy of information, the DOM is said to be tree-based or object-based. DOM, and tree-based processing in general, has several advantages. First, because the tree persists in memory, it can be modified, so an application can change both the data and the structure. Second, the application can move up and down the tree at any time, rather than getting the single pass that SAX offers. DOM can also be much simpler to use.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DomXml implements XmlDocument {

    private Document document;

    /** Creates the empty in-memory document; must be called before createXml. */
    public void init() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            this.document = builder.newDocument();
        } catch (ParserConfigurationException e) {
            System.out.println(e.getMessage());
        }
    }

    @Override
    public void createXml(String fileName) {
        // Build the tree: <xml-methods><method><type/><method-create/><method-parse/></method></xml-methods>
        Element root = this.document.createElement("xml-methods");
        this.document.appendChild(root);
        Element method = this.document.createElement("method");
        Element name = this.document.createElement("type");
        name.appendChild(this.document.createTextNode("DOM"));
        method.appendChild(name);
        Element methodCreate = this.document.createElement("method-create");
        methodCreate.appendChild(this.document.createTextNode("createXml"));
        method.appendChild(methodCreate);
        Element methodParse = this.document.createElement("method-parse");
        methodParse.appendChild(this.document.createTextNode("parserXml"));
        method.appendChild(methodParse);
        root.appendChild(method);

        // Serialize the tree. Handing the transformer a byte stream (rather than a
        // PrintWriter) lets it apply the declared encoding itself, and
        // try-with-resources makes sure the file is closed.
        try (OutputStream out = new FileOutputStream(fileName)) {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "gb2312");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.transform(new DOMSource(document), new StreamResult(out));
            System.out.println("Generate XML file successfully!");
        } catch (TransformerException | IOException | IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }

    @Override
    public void parserXml(String fileName) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            // parse(String) expects a URI, so pass a File for a plain file path.
            Document document = db.parse(new File(fileName));
            NodeList employees = document.getChildNodes();
            for (int i = 0; i < employees.getLength(); i++) {
                Node employee = employees.item(i);
                NodeList employeeInfo = employee.getChildNodes();
                for (int j = 0; j < employeeInfo.getLength(); j++) {
                    Node node = employeeInfo.item(j);
                    NodeList employeeMeta = node.getChildNodes();
                    for (int k = 0; k < employeeMeta.getLength(); k++) {
                        System.out.println(employeeMeta.item(k).getNodeName()
                                + ":" + employeeMeta.item(k).getTextContent());
                    }
                }
            }
            System.out.println("Parse completed");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String[] args) {
        DomXml dom = new DomXml();
        dom.init();
        // dom.createXml("D://dom.xml");
        dom.parserXml("D://dom.xml");
    }
}
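The ability to move up and down the tree at any time can be illustrated with a small, self-contained sketch. It is not part of the original listing: the class name DomNavigationSketch, its helper methods, and the sample XML string are all hypothetical, and the sample is only modeled on the kind of file the listing above writes. Only standard JDK classes are used.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomNavigationSketch {

    // Hypothetical sample input (no whitespace between tags, so there are no
    // whitespace-only text nodes between the elements).
    static final String SAMPLE =
            "<xml-methods><method><type>DOM</type>"
            + "<method-create>createXml</method-create>"
            + "<method-parse>parserXml</method-parse></method></xml-methods>";

    /** Parses an XML string into an in-memory DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Finds the first <type> element, then walks up to its parent and across to its next sibling. */
    static String describe(Document doc) {
        Element type = (Element) doc.getElementsByTagName("type").item(0);
        Node parent = type.getParentNode();    // one step up the tree
        Node sibling = type.getNextSibling();  // one step sideways
        return parent.getNodeName() + " > " + type.getNodeName() + " -> " + sibling.getNodeName();
    }

    public static void main(String[] args) {
        System.out.println(describe(parse(SAMPLE)));
    }
}
```

Because the whole tree stays in memory, the same Document can be walked again in any direction after this call. That is the property SAX does not offer.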
Second, SAX
The advantages of SAX processing are very similar to those of streaming media. Analysis can begin immediately, rather than waiting for all of the data to be processed. Also, because the application examines the data only as it is read, it does not need to hold the data in memory. This is a huge advantage for large documents. In fact, the application does not even have to parse the entire document; it can stop parsing once a condition is met. In general, SAX is also much faster than its alternative, DOM. So which should you choose, DOM or SAX? For developers who need to write their own code to process XML documents, choosing between the DOM and SAX parsing models is a very important design decision. DOM accesses an XML document through a tree structure, whereas SAX uses an event model.
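Stopping a parse once a condition is met is usually done by throwing an exception from a handler callback. The sketch below is one minimal way to do it, using only JDK classes. The class SaxEarlyStopSketch, the StopParsing marker exception, and the method elementsSeenUntil are hypothetical names introduced for illustration.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEarlyStopSketch {

    /** Hypothetical marker exception, used only to signal "stop parsing now". */
    static class StopParsing extends SAXException {
        StopParsing() {
            super("target element found");
        }
    }

    /**
     * Counts the elements opened up to and including the first element named
     * target, then aborts the parse. Returns -1 if target never appears.
     */
    static int elementsSeenUntil(String xml, String target) {
        final int[] seen = {0};
        final boolean[] found = {false};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) throws SAXException {
                seen[0]++;
                if (qName.equals(target)) {
                    found[0] = true;
                    throw new StopParsing(); // the rest of the input is never processed
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return -1; // reached the end of the document without finding target
        } catch (SAXException e) {
            if (found[0]) {
                return seen[0]; // our own stop signal, not a real parse error
            }
            throw new IllegalStateException(e);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementsSeenUntil("<a><b/><c/><d/></a>", "c"));
    }
}
```

The flag is checked in the catch block so that a genuine parse error is not mistaken for the deliberate stop.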
A DOM parser transforms an XML document into a tree that holds its contents and can then be traversed. The advantage of the DOM parsing model is that programming is easy: developers only need to call the tree-building method and then use the navigation APIs to reach the tree nodes they need. It is also easy to add and modify elements in the tree. However, because a DOM parser has to process the entire XML document, its performance and memory demands are high, especially for large XML files. Because of its traversal capabilities, a DOM parser is often used in services where XML documents change frequently.
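As a small illustration of adding and modifying elements in the tree, the following sketch parses a string, edits one node, appends a new subtree, and serializes the result. The class DomEditSketch, the method editAndSerialize, and the sample element names are assumptions made for this example. Only JDK classes are used.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomEditSketch {

    /**
     * Changes the text of the first <type> element, appends a new
     * <method><type>SAX</type></method> subtree to the root, and returns
     * the edited document as a string.
     */
    static String editAndSerialize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Modify an existing node in place.
            Element firstType = (Element) doc.getElementsByTagName("type").item(0);
            firstType.setTextContent("DOM (edited)");

            // Add a new subtree under the root element.
            Element method = doc.createElement("method");
            Element type = doc.createElement("type");
            type.setTextContent("SAX");
            method.appendChild(type);
            doc.getDocumentElement().appendChild(method);

            // Serialize the modified tree back to text.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(editAndSerialize(
                "<xml-methods><method><type>DOM</type></method></xml-methods>"));
    }
}
```

Both edits are plain method calls on objects already in memory. That is why DOM suits documents that change often, at the cost of loading the whole document first.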
A SAX parser uses an event-based model: as it parses an XML document it fires a series of events, and when it encounters a given tag it invokes a callback method to report that the tag has been found. SAX usually needs little memory because it lets developers decide for themselves which tags to process. SAX therefore scales much better, especially when developers only need part of the data in the document. However, coding against a SAX parser is harder, and it is difficult to access several different pieces of data in the same document at the same time.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxXml implements XmlDocument {

    @Override
    public void createXml(String fileName) {
        System.out.println("Pure SAX does not provide a way to write XML");
    }

    @Override
    public void parserXml(String fileName) {
        // try-with-resources closes the input stream when parsing ends
        try (InputStream in = new FileInputStream(fileName)) {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            parser.parse(in, new MyHandler());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }

    /** Receives the parser's callbacks and prints each event. */
    class MyHandler extends DefaultHandler {

        @Override
        public void startDocument() throws SAXException {
            System.out.println("Start parsing document");
        }

        @Override
        public void endDocument() throws SAXException {
            System.out.println("Parse document end");
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) throws SAXException {
            System.out.println("Current node is: " + qName);
        }
    }

    public static void main(String[] args) {
        SaxXml sax = new SaxXml();
        sax.parserXml("D://dom.xml");
    }
}
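The listing above reports every element it meets. To show how a SAX handler can process only the tags it cares about, here is a minimal sketch that collects the text of one chosen element and ignores everything else. The class SaxSelectiveSketch and the method textOf are hypothetical names, and the sketch assumes the chosen element is not nested inside another element of the same name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSelectiveSketch {

    /** Returns the text content of every element named tag, in document order. */
    static List<String> textOf(String xml, String tag) {
        List<String> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // Non-null only while the parser is inside a wanted element.
            private StringBuilder current;

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                if (qName.equals(tag)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text run in several chunks, so append.
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (current != null && qName.equals(tag)) {
                    values.add(current.toString());
                    current = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        String xml = "<xml-methods><method><type>DOM</type>"
                + "<method-create>createXml</method-create></method>"
                + "<method><type>SAX</type></method></xml-methods>";
        System.out.println(textOf(xml, "type"));
    }
}
```

The handler has to track its own state (here, the current field) because SAX keeps no tree. This bookkeeping is what makes SAX code harder to write than DOM code.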
Third, JDOM http://www.jdom.org/
JDOM aims to be a Java-specific document model that simplifies interaction with XML and is faster than working through DOM. Because it was the first Java-specific model, JDOM has been heavily publicized and promoted. It is being considered for eventual use as a Java Standard Extension through Java Specification Request JSR-102. JDOM has been under development since early 2000.
JDOM differs from DOM in two main ways. First, JDOM uses only concrete classes rather than interfaces. This simplifies the API in some respects, but it also limits flexibility. Second, the API makes extensive use of the Collections classes, which simplifies things for Java developers who are already familiar with them.
JDOM's documentation states that its purpose is to "solve 80% (or more) of Java/XML problems with 20% (or less) of the effort" (with the 20% presumably referring to the learning curve). JDOM is certainly usable for most Java/XML applications, and most developers find the API much easier to understand than DOM. JDOM also includes fairly extensive checks on program behavior to keep users from doing anything that makes no sense in XML. However, you still need to understand XML well to do anything beyond the basics (or, in some cases, even to understand the errors). Understanding XML is probably a more worthwhile effort than learning the DOM or JDOM interfaces.
JDOM itself does not contain a parser. It usually uses a SAX2 parser to parse and validate the input XML document (although it can also take a previously constructed DOM representation as input). It includes converters that output a JDOM representation as a SAX2 event stream, a DOM model, or an XML text document. JDOM is open source, released under a variant of the Apache license.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;

// JDOM 1.x package names; JDOM 2 uses org.jdom2 instead.
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;

public class JdomXml implements XmlDocument {

    @Override
    public void createXml(String fileName) {
        // Build the tree with plain constructors and addContent calls.
        Element root = new Element("xml-methods");
        Document document = new Document(root);
        Element method = new Element("method");
        root.addContent(method);
        Element type = new Element("type");
        type.setText("JDOM");
        method.addContent(type);
        Element methodCreate = new Element("method-create");
        methodCreate.setText("createXml");
        method.addContent(methodCreate);
        Element methodParse = new Element("method-parse");
        methodParse.setText("parserXml");
        method.addContent(methodParse);

        XMLOutputter xmlOut = new XMLOutputter();
        // try-with-resources closes the file once the document is written
        try (OutputStream out = new FileOutputStream(fileName)) {
            xmlOut.output(document, out);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @Override
    public void parserXml(String fileName) {
        try {
            SAXBuilder builder = new SAXBuilder();
            // build(String) expects a URI, so pass a File for a plain file path.
            Document document = builder.build(new File(fileName));
            Element root = document.getRootElement();
            List methodList = root.getChildren("method");
            for (int i = 0; i < methodList.size(); i++) {
                Element el = (Element) methodList.get(i);
                List info = el.getChildren();
                for (int j = 0; j < info.size(); j++) {
                    System.out.println(info.get(j));
                }
            }
        } catch (JDOMException | IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        JdomXml jdom = new JdomXml();
        String fileName = "D://jdom.xml";
        // jdom.createXml(fileName);
        jdom.parserXml(fileName);
    }
}
Fourth, dom4j http://dom4j.sourceforge.net/
Although dom4j is a completely separate development effort, it began as a kind of intelligent fork of JDOM. It incorporates a number of features beyond basic XML document representation, including integrated XPath support, XML Schema support, and event-based processing for very large or streamed documents. It also offers the option of building a document representation that can be accessed in parallel through the dom4j API and the standard DOM interfaces. It has been under development since the second half of 2000.
To support all of these features, dom4j uses an approach based on interfaces and abstract base classes. dom4j makes extensive use of the Collections classes in its API, but in many cases it also provides alternatives that allow better performance or a more direct style of coding. The direct benefit is that dom4j offers much greater flexibility than JDOM, at the cost of a more complex API.
Alongside its added goals of flexibility, XPath integration, and large-document handling, dom4j shares JDOM's goals: ease of use and intuitive operation for Java developers. It also aims to be a more complete solution than JDOM, one that handles essentially all Java/XML problems. In pursuing that goal, it places less emphasis than JDOM on preventing incorrect application behavior.
dom4j is a very good Java XML API: it performs well, is powerful and very easy to use, and is open source. More and more Java software uses dom4j to read and write XML. Notably, even Sun's JAXM uses dom4j.
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public class Dom4jXml implements XmlDocument {

    @Override
    public void createXml(String fileName) {
        // addElement creates the child and attaches it in one call.
        Document document = DocumentHelper.createDocument();
        Element employees = document.addElement("xml-methods");
        Element employee = employees.addElement("method");
        Element name = employee.addElement("type");
        name.setText("dom4j");
        Element methodCreate = employee.addElement("method-create");
        methodCreate.setText("createXml");
        Element methodParse = employee.addElement("method-parse");
        methodParse.setText("parserXml");

        // try-with-resources closes the underlying file writer
        try (Writer fileWriter = new FileWriter(fileName)) {
            XMLWriter xmlWriter = new XMLWriter(fileWriter);
            xmlWriter.write(document);
            xmlWriter.close();
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

    @Override
    public void parserXml(String fileName) {
        File inputXml = new File(fileName);
        SAXReader saxReader = new SAXReader();
        try {
            Document document = saxReader.read(inputXml);
            Element employees = document.getRootElement();
            // Walk two levels down: each <method>, then each of its children.
            for (Iterator i = employees.elementIterator(); i.hasNext();) {
                Element employee = (Element) i.next();
                for (Iterator j = employee.elementIterator(); j.hasNext();) {
                    Element node = (Element) j.next();
                    System.out.println(node.getName() + ":" + node.getText());
                }
            }
        } catch (DocumentException e) {
            System.out.println(e.getMessage());
        }
        System.out.println("dom4j parserXml");
    }

    public static void main(String[] args) {
        String fileName = "D://dom4j.xml";
        Dom4jXml dom4j = new Dom4jXml();
        // dom4j.createXml(fileName);
        dom4j.parserXml(fileName);
    }
}