Reading XML in Java: Comparing and Choosing Among SAX, DOM, JDOM, and dom4j

Source: Internet
Author: User

Original article: www.hicourt.gov.cn/homepage/show9_content.asp


SAX:

A SAX parser fires a series of events as it reads, and the application accesses the XML document through event handler functions. Because the events fire in document order, a SAX parser gives sequential access to the XML document: once a part has been parsed, you cannot go back and process it again.
SAX is called a "simple" API because the parser itself does only simple work and leaves most of the work to the application. While it runs, the parser scans the XML input stream in order, determines which XML construct the current content belongs to, checks that it conforms to XML syntax, and fires the corresponding event. The event handlers themselves must be implemented by the application. Compared with a DOM parser, a SAX parser is less flexible in how it lets you process a document. However, for applications that only need to read the data in an XML document and do not change it, a SAX parser is more efficient. Because a SAX parser is simple to implement and needs little memory, it runs efficiently and is widely applicable.
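The event-driven model described above can be sketched with the SAX classes that ship with the JDK. This is a minimal illustration, not a reference implementation: the class name `SaxTitleCollector`, the method `collectTitles`, and the sample `<library>` document are all made up for the example. The parser calls the handler methods in document order, and the application decides what to keep.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects the text of every <title> element as events arrive. */
final class SaxTitleCollector extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a <title> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("title".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one run of text in several chunks, so append.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(current.toString().trim());
            current = null;
        }
    }

    /** Parses the given XML text and returns the titles in document order. */
    static List<String> collectTitles(String xml) {
        SaxTitleCollector handler = new SaxTitleCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Checked parser exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return handler.titles;
    }

    public static void main(String[] args) {
        String xml = "<library><book><title>XML Basics</title></book>"
                   + "<book><title>Java IO</title></book></library>";
        System.out.println(collectTitles(xml));
    }
}
```

Note that the handler keeps only the state it needs (the title currently being read). Nothing else from the document stays in memory, which is where the low memory footprint comes from.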

DOM:

A DOM parser parses the XML document into a DOM tree that is held in memory. The application can access and manipulate any part of the tree at any time; that is, it has random access to the XML document through the DOM tree. This gives application development great flexibility, since the content of the whole document can be controlled at will. However, because the DOM parser converts the entire XML document into an in-memory tree, memory requirements are high when the document is large or structurally complex. Traversing a tree with a complex structure is also time-consuming. A DOM parser therefore places heavy demands on the machine and is comparatively inefficient. Even so, because the DOM tree mirrors the structure of the XML document and makes random access possible, DOM parsers are also widely used.
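As a rough illustration of random access, the following sketch uses the JDK's built-in DOM parser. The class name `DomRandomAccess`, its helper methods, and the sample document are assumptions made for the example. Once the tree is built, nodes can be read in any order, something a sequential SAX pass cannot do.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: builds a DOM tree and reads nodes from it in arbitrary order. */
final class DomRandomAccess {

    /** Parses the whole document into an in-memory tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Checked parser exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    /** Returns the text of the index-th <title> element, counted in document order. */
    static String titleAt(Document doc, int index) {
        NodeList titles = doc.getElementsByTagName("title");
        return titles.item(index).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse("<library><book><title>XML Basics</title></book>"
                           + "<book><title>Java IO</title></book></library>");
        // Read the second title first, then jump back to the first one.
        System.out.println(titleAt(doc, 1));
        System.out.println(titleAt(doc, 0));
    }
}
```

The cost of this convenience is that `parse` must read and store the entire document before the first lookup can happen.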

JDOM:

JDOM is a pure Java API for processing XML. It uses concrete classes rather than interfaces, so instances of most node types can be created with a constructor call that takes only one or two arguments. It is currently an excellent Java API for processing XML.

SAX
Advantages: ① The entire document does not need to be loaded into memory, so memory consumption is low
② The push model allows multiple ContentHandlers to be registered
Disadvantages: ① No built-in document navigation support
② No random access to the XML document
③ No support for modifying XML in place
④ No support for namespace scoping
Most suitable for: applications that only read data from XML (not for manipulating or modifying XML documents)

DOM
Advantages: ① Easy to use
② A rich set of APIs for easy navigation
③ The entire tree is loaded into memory, allowing random access to the XML document
Disadvantages: ① The entire XML document must be parsed in one pass
② Loading the entire tree into memory is costly
③ Generic DOM nodes are not ideal for object-type binding, because an object must be created for every node
Most suitable for: applications that need to modify XML documents, or XSLT applications (not a good fit for applications that only read XML)
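The modify-and-write-back use case named above can be sketched as follows, again with JDK classes only. The class name `DomPriceUpdater`, the method `updatePrices`, and the `<price>` element are hypothetical. The tree is changed in place and then serialized with an identity transform from `javax.xml.transform`, the same API that runs XSLT stylesheets.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: edits a DOM tree in place and writes it back out as text. */
final class DomPriceUpdater {

    /** Sets the text of every <price> element to newPrice and returns the serialized document. */
    static String updatePrices(String xml, String newPrice) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Modify the in-memory tree in place.
            NodeList prices = doc.getElementsByTagName("price");
            for (int i = 0; i < prices.getLength(); i++) {
                prices.item(i).setTextContent(newPrice);
            }

            // A Transformer created without a stylesheet copies the tree to the output unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("XML could not be processed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book><title>XML Basics</title><price>10</price></book>";
        System.out.println(updatePrices(xml, "12"));
    }
}
```

With SAX, the same task would require the application to re-emit every event it receives, since there is no tree to edit.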

JDOM
Advantages: ① A tree-based Java API for processing XML that loads the tree into memory
② No backward-compatibility constraints, so it is simpler than DOM
③ Fast, with few defects
④ Follows the Java conventions of SAX
Disadvantages: ① Cannot process documents larger than memory
② JDOM represents the logical model of the XML document, so a byte-for-byte faithful round trip is not guaranteed
③ Provides no real model of the DTD or schema for an instance document
④ Does not support an equivalent of DOM's traversal package
Most suitable for: cases that call for a balance, since JDOM combines the convenience of a tree with the Java conventions of SAX


dom4j

Although dom4j is the result of a completely independent development effort, it began as an intelligent fork of JDOM. It incorporates many features beyond basic XML document representation, including integrated XPath support, XML Schema support, and event-based processing for large or streaming documents. It also offers options for building the document representation so that it can be accessed through both the dom4j API and the standard DOM interfaces. It has been under development since the second half of 2000.

To support all these features, dom4j uses interfaces and abstract base classes. It makes heavy use of the Collections classes in its API, but in many cases it also provides alternatives that allow better performance or a more direct coding style. The direct benefit is that, although dom4j pays the price of a more complex API, it offers much greater flexibility than JDOM.

While adding flexibility, XPath integration, and large-document handling to its goals, dom4j shares JDOM's aims: ease of use and intuitive operation for Java developers. It also sets out to be a more complete solution than JDOM, with the goal of handling essentially all Java/XML problems. In pursuing that goal, it places less emphasis than JDOM on preventing incorrect application behavior.

dom4j is a very good Java XML API with excellent performance, powerful features, and great ease of use, and it is open source. More and more Java software now uses dom4j to read and write XML. Notably, Sun's JAXM also uses dom4j.

 

Overview

JDOM and DOM performed poorly in performance tests and ran out of memory when tested with 10 MB documents. For small documents, DOM and JDOM are still worth considering. Although JDOM's developers have said that they intend to focus on performance before the official release, from a performance standpoint it is hard to recommend. DOM, on the other hand, remains a good choice. DOM implementations are widely used across many programming languages. DOM is also the basis of many other XML-related standards, and because it is an official W3C Recommendation (unlike the non-standard Java-based models), it may be required in some kinds of projects (for example, when using the DOM in JavaScript).

SAX performs well, which follows from the way it parses. A SAX parser examines the incoming XML stream without loading the document into memory (although, of course, parts of the document are buffered in memory temporarily while the stream is read).

Of the four, dom4j comes out best. It is widely used in open-source projects; for example, the well-known Hibernate uses dom4j to read its XML configuration files. If portability is not a concern, use dom4j.

