Dom4j: the best XML solution?

Last Update:2018-12-07 Source: Internet

Author: User

Tags xml cdata xslt

Dom4j is an open source XML parsing package produced by dom4j.org. Its website defines it as follows:

dom4j is an easy to use, open source library for working with XML, XPath and
XSLT on the Java platform using the Java Collections Framework and with full
support for DOM, SAX and JAXP.

Dom4j is easy to use. You can work with it as long as you understand the basic XML DOM model. However, its own guide is only a single (HTML) page, though a fairly complete one, and little documentation is available in Chinese. So I am writing this short tutorial for your convenience. This article covers only basic usage; if you need to go deeper, please look for other resources yourself.

An article from the IBM developer community (see the appendix) compares the performance of several XML parsing packages. Dom4j performs very well and ranks near the top in several of the tests. (In fact, dom4j's official documentation also cites this comparison.) So in this project I used dom4j as the XML parsing tool.

JDOM is widely used as a parser in China. Each has its strengths, but the biggest feature of dom4j is its heavy use of interfaces, which is the main reason it is considered more flexible than JDOM. Didn't the masters tell us to "program to an interface"? More and more applications now use dom4j. If you are comfortable with JDOM, feel free to keep using it and read this article just for comparison; if you are about to choose a parser, consider dom4j.

Its main interfaces are defined in the org.dom4j package:

Attribute
Attribute defines an XML attribute.

Branch
Branch defines the common behavior of nodes that can contain child nodes, such as XML elements (Element) and documents (Document).

CDATA
CDATA defines an XML CDATA section.

CharacterData
CharacterData is a marker interface for character-based nodes, such as CDATA, Comment, and Text.

Comment
Comment defines the behavior of an XML comment.

Document
Document defines an XML document.

DocumentType
DocumentType defines an XML DOCTYPE declaration.

Element
Element defines an XML element.

ElementHandler
ElementHandler defines a handler for Element objects.

ElementPath
ElementPath is used by ElementHandler to obtain information about the path hierarchy currently being processed.

Entity
Entity defines an XML entity.

Node
Node defines the polymorphic behavior of all XML nodes in dom4j.

NodeFilter
NodeFilter defines the behavior of a filter or predicate that acts on a dom4j node.

ProcessingInstruction
ProcessingInstruction defines an XML processing instruction.

Text
Text defines an XML text node.

Visitor
Visitor is used to implement the Visitor pattern.

XPath
XPath represents an XPath expression after it has been parsed from a string.

Their names largely speak for themselves.

To understand these interfaces, you need to know how they inherit from one another:

interface java.lang.Cloneable
  interface org.dom4j.Node
    interface org.dom4j.Attribute
    interface org.dom4j.Branch
      interface org.dom4j.Document
      interface org.dom4j.Element
    interface org.dom4j.CharacterData
      interface org.dom4j.CDATA
      interface org.dom4j.Comment
      interface org.dom4j.Text
    interface org.dom4j.DocumentType
    interface org.dom4j.Entity
    interface org.dom4j.ProcessingInstruction

Much becomes clear at a glance: most of the interfaces inherit from Node. Knowing these relationships will help you avoid ClassCastException when you write programs later.

The following are some examples (some of them taken from the documentation provided with dom4j).

1. Read and parse the XML document:

Reading and writing XML documents relies mainly on the org.dom4j.io package, which provides two different readers, DOMReader and SAXReader. They are called in much the same way, which is the benefit of relying on interfaces.

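import java.io.File;
import java.net.MalformedURLException;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

// Read XML from a file: takes the file name and returns the parsed document
public Document read(String fileName) throws MalformedURLException, DocumentException {
    SAXReader reader = new SAXReader();
    Document document = reader.read(new File(fileName));
    return document;
}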

The reader's read method is overloaded and can read from several kinds of sources, such as an InputStream, a File, or a URL. The resulting Document object represents the entire XML document.

In my own experience, the characters that are read are decoded according to the encoding declared in the XML file header. Take care that the encoding names are consistent everywhere they appear.
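The effect of the encoding declaration can be tried out in a small experiment. The sketch below uses the JDK's built-in DOM parser rather than dom4j, so that it runs without extra libraries; the class name EncodingDemo, the method rootText, and the sample document are made up for illustration. It assumes the usual XML parser behavior of decoding the bytes according to the declaration in the header.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class (not part of dom4j): shows that the parser decodes
// the bytes using the encoding named in the XML declaration.
class EncodingDemo {

    // Builds a tiny document whose header declares `declared`, encodes it to
    // bytes with `actual`, parses the bytes, and returns the root element text.
    static String rootText(String declared, Charset actual) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + declared + "\"?>"
                + "<name>caf\u00e9</name>";
        byte[] bytes = xml.getBytes(actual);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrap the checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Declared and actual encodings agree, so the accented character survives.
        System.out.println(rootText("ISO-8859-1", StandardCharsets.ISO_8859_1));
        System.out.println(rootText("UTF-8", StandardCharsets.UTF_8));
    }
}
```

When the header names one encoding and the bytes were written in another, a parser will typically either reject the input or return different characters, which is why the names need to agree.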

2. Get the root node

The second step after reading is to get the root node. Anyone familiar with XML knows that all XML processing starts from the root element.

public Element getRootElement(Document doc) {
    return doc.getRootElement();
}

3. Traverse the XML tree

Dom4j provides at least three ways to traverse nodes:

1) Iterator

// Iterate over all child elements
for (Iterator i = root.elementIterator(); i.hasNext();) {
    Element element = (Element) i.next();
    // Do something
}

// Iterate over the child elements named "foo"
for (Iterator i = root.elementIterator("foo"); i.hasNext();) {
    Element foo = (Element) i.next();
    // Do something
}

// Iterate over the attributes
for (Iterator i = root.attributeIterator(); i.hasNext();) {
    Attribute attribute = (Attribute) i.next();
    // Do something
}

2) Recursion

Recursion can also use an Iterator to enumerate nodes. The example here instead indexes the child nodes with nodeCount() and node(i):

public void treeWalk(Document document) {
    treeWalk(document.getRootElement());
}

public void treeWalk(Element element) {
    for (int i = 0, size = element.nodeCount(); i < size; i++) {
        Node node = element.node(i);
        if (node instanceof Element) {
            // Descend into child elements
            treeWalk((Element) node);
        } else {
            // Do something with the non-element node
        }
    }
}

3) Visitor pattern

The most exciting thing is dom4j's support for Visitor, which greatly reduces the amount of code and is easy to understand. Anyone who knows design patterns knows that Visitor is one of the GoF patterns. Its main principle is that two kinds of classes hold references to each other, and one of them acts as the visitor that visits many Visitable objects. Let's take a look at the Visitor pattern in dom4j (the quick-start documentation does not cover it).

You only need to write a class that implements the Visitor interface.

public class MyVisitor extends VisitorSupport {
    public void visit(Element element) {
        System.out.println(element.getName());
    }

    public void visit(Attribute attr) {
        System.out.println(attr.getName());
    }
}

Call: root.accept(new MyVisitor());

The Visitor interface provides several overloads of visit(), one for each kind of object in an XML document. The example above gives simple implementations for Element and Attribute, the two that are used most often. VisitorSupport is the default adapter that dom4j provides for the Visitor interface (the Default Adapter pattern); it supplies empty implementations of all the visit(*) methods to keep the code simple.

Note that the visitor automatically traverses all child nodes: root.accept(myVisitor) visits the root and then every node beneath it. The first time I used it, I thought I had to traverse the tree myself and call the visitor inside the recursion. You can imagine the result.
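The automatic traversal and the default adapter can be modeled in a few lines of plain Java. This is a stripped-down sketch of the pattern, not dom4j's implementation: every Mini* type and the VisitorDemo class are hypothetical stand-ins written for this example. The point it illustrates is that accept() itself walks the subtree, so the caller makes a single call and never recurses.

```java
import java.util.ArrayList;
import java.util.List;

// All "Mini*" types are hypothetical stand-ins for this sketch, not dom4j classes.
interface MiniVisitor {
    void visit(MiniElement element);
    void visit(MiniAttribute attribute);
}

// Default adapter: empty implementations, so a subclass overrides only the
// visit() overloads it cares about.
class MiniVisitorSupport implements MiniVisitor {
    public void visit(MiniElement element) { }
    public void visit(MiniAttribute attribute) { }
}

class MiniAttribute {
    final String name;
    MiniAttribute(String name) { this.name = name; }
    void accept(MiniVisitor visitor) { visitor.visit(this); }
}

class MiniElement {
    final String name;
    final List<MiniAttribute> attributes = new ArrayList<>();
    final List<MiniElement> children = new ArrayList<>();
    MiniElement(String name) { this.name = name; }

    // Visits this element, then walks its attributes and children itself.
    void accept(MiniVisitor visitor) {
        visitor.visit(this);
        for (MiniAttribute attribute : attributes) attribute.accept(visitor);
        for (MiniElement child : children) child.accept(visitor);
    }
}

class VisitorDemo {
    // Collects the names of all elements at or below root. Attributes are
    // ignored simply by not overriding that overload.
    static List<String> elementNames(MiniElement root) {
        List<String> names = new ArrayList<>();
        root.accept(new MiniVisitorSupport() {
            @Override
            public void visit(MiniElement element) { names.add(element.name); }
        });
        return names;
    }

    // Builds <root id="..."><author><name/></author></root> in memory.
    static MiniElement sampleTree() {
        MiniElement root = new MiniElement("root");
        root.attributes.add(new MiniAttribute("id"));
        MiniElement author = new MiniElement("author");
        author.children.add(new MiniElement("name"));
        root.children.add(author);
        return root;
    }

    public static void main(String[] args) {
        System.out.println(elementNames(sampleTree())); // prints [root, author, name]
    }
}
```

Calling the visitor again from inside a hand-written recursion over such a tree would visit the deeper nodes more than once, which is the mistake described above.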

4. Support for XPath

Dom4j has good support for XPath. To access a node, you can select it directly with an XPath expression. (Dom4j's XPath support relies on the Jaxen library being on the classpath.)

public void bar(Document document) {
    List list = document.selectNodes("//foo/bar");
    Node node = document.selectSingleNode("//foo/bar/author");
    String name = node.valueOf("@name");
}

For example, if you want to find all the hyperlinks in an XHTML document, the following code does it:

public void findLinks(Document document) throws DocumentException {
    // Select the href attribute of every <a> element
    List list = document.selectNodes("//a/@href");
    for (Iterator iter = list.iterator(); iter.hasNext();) {
        Attribute attribute = (Attribute) iter.next();
        String url = attribute.getValue();
    }
}
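To see what the expression //a/@href selects without setting up dom4j, the same query can be run with the JDK's own javax.xml.xpath API. This is only a sketch of the expression at work, not dom4j code; the class name LinkFinder and the sample markup are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper that uses the JDK XPath API in place of dom4j's selectNodes().
class LinkFinder {

    // Returns the value of every href attribute found on an <a> element.
    static List<String> findLinks(String markup) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList attributes = (NodeList) xpath.evaluate(
                    "//a/@href",
                    new InputSource(new StringReader(markup)),
                    XPathConstants.NODESET);
            List<String> urls = new ArrayList<>();
            for (int i = 0; i < attributes.getLength(); i++) {
                urls.add(attributes.item(i).getNodeValue());
            }
            return urls;
        } catch (XPathExpressionException e) {
            // Wrap the checked exception to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String markup = "<html><body>"
                + "<a href=\"http://a.example/\">A</a>"
                + "<p><a href=\"http://b.example/\">B</a></p>"
                + "</body></html>";
        System.out.println(findLinks(markup));
    }
}
```

The sample markup deliberately has no namespace declaration. In XPath 1.0 an unprefixed name such as a matches only elements in no namespace, so a real XHTML document that declares the XHTML default namespace would need a namespace-aware expression.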

5. Conversion between strings and XML

Sometimes you need to convert a string to XML or the other way around:

// Convert XML to a string
Document document = ...;
String text = document.asXML();

// Convert a string to XML
String text = "<person> <name>James</name> </person>";
Document document = DocumentHelper.parseText(text);

6. Use XSLT to convert XML

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;
import org.dom4j.Document;
import org.dom4j.io.DocumentResult;
import org.dom4j.io.DocumentSource;

public Document styleDocument(Document document, String stylesheet) throws Exception {
    // Load the transformer using JAXP
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));

    // Now style the given document
    DocumentSource source = new DocumentSource(document);
    DocumentResult result = new DocumentResult();
    transformer.transform(source, result);

    // Return the transformed document
    Document transformedDoc = result.getDocument();
    return transformedDoc;
}
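The JAXP half of the function above can be tried in isolation. The following sketch swaps dom4j's DocumentSource and DocumentResult for the JDK's StreamSource and StreamResult so that it runs with the JDK alone; the class name XsltDemo, the stylesheet, and the input are invented for the example. Note that it passes the stylesheet text through a StringReader, whereas new StreamSource(stylesheet) with a String argument treats the string as a system identifier (a URL or file location).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: applies an XSLT stylesheet with plain JAXP.
class XsltDemo {

    // A made-up stylesheet that turns <person><name>X</name></person>
    // into the plain text "Hello, X".
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/person\">"
            + "Hello, <xsl:value-of select=\"name\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Compiles the stylesheet, applies it to the XML text, and returns the output.
    static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrap the checked exception to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<person><name>James</name></person>", STYLESHEET));
    }
}
```

Because JAXP is written against the Source and Result interfaces, replacing the stream classes with dom4j's DocumentSource and DocumentResult, as in the function above, leaves the rest of the code unchanged.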

7. Create XML

An XML document is usually created before it is written to a file, and building one is as easy as using a StringBuffer.

public Document createDocument() {
    Document document = DocumentHelper.createDocument();
    Element root = document.addElement("root");

    Element author1 = root.addElement("author")
        .addAttribute("name", "James")
        .addAttribute("location", "UK")
        .addText("James Strachan");

    Element author2 = root.addElement("author")
        .addAttribute("name", "Bob")
        .addAttribute("location", "US")
        .addText("Bob McWhirter");

    return document;
}

8. File output

A simple way to produce output is to write a document, or any node, with its write method.

FileWriter out = new FileWriter("foo.xml");
document.write(out);
// Close the writer so that buffered output reaches the file
out.close();

If you want to change the output format, for example to pretty-print or compact the output, you can use the XMLWriter class:

public void write(Document document) throws IOException {
    // Write to a file
    XMLWriter writer = new XMLWriter(new FileWriter("output.xml"));
    writer.write(document);
    writer.close();

    // Pretty-print the document to System.out
    OutputFormat format = OutputFormat.createPrettyPrint();
    writer = new XMLWriter(System.out, format);
    writer.write(document);

    // Write the document to System.out in compact format
    format = OutputFormat.createCompactFormat();
    writer = new XMLWriter(System.out, format);
    writer.write(document);
}

Dom4j is simple enough, isn't it? Of course, some more complex features, such as ElementHandler, have not been covered here. If this has tempted you, give dom4j a try.

This article is an English version of an article originally published in Chinese on aliyun.com.
