Java XML Processing Technology, Part 1 (Parsing and Generating XML)
XML technology grew up alongside Java. Before XML, simple data was usually stored in text files such as INI configuration files, and complex data was stored in custom file formats, so each file format needed its own parser. XML solved this problem: a program now deals with XML files in one fixed format and can process them through a standard API.
XML files are widely used in the case system. For example, Clientconfig.xml and serverconfig.xml are XML files used as configuration files, and the metadata files and the metadata loader also depend on XML. This chapter therefore gives a systematic explanation of XML file processing technology.
1.1 Comparison of XML Processing Technologies
Technologies for handling XML files in the Java world fall broadly into two categories: XML APIs and OXMapping. XML APIs are the foundation of XML processing; the options include JDOM, dom4j, and others. OXMapping is short for Object-XML Mapping. It hides the details of low-level XML operations: an XML file can be mapped to a JavaBean object, and a JavaBean object can be saved as an XML file. The options include XStream, Digester, Castor, and others. The relationship between XML APIs and OXMapping resembles that between JDBC and ORMapping: OXMapping is implemented internally on top of an XML API, and the two address XML processing at different levels.
XML API
The most popular XML APIs are JDOM and dom4j, which are used in very similar ways. dom4j nevertheless has clear advantages over JDOM:
dom4j makes heavy use of interfaces, which makes it more flexible and extensible than JDOM;
dom4j performs better than JDOM;
dom4j supports advanced features such as XPath.
Because of these advantages, many open source projects have adopted dom4j for XML parsing, and this book also uses dom4j as its first choice for XML processing.
OXMapping
Parsing with an XML API is somewhat cumbersome. Inspired by ORMapping, people invented OXMapping. With OXMapping, an XML file can be mapped to a JavaBean object and a JavaBean object can be saved as an XML file, which greatly reduces development effort and lets developers concentrate on application-level concerns.
Many OXMapping frameworks have emerged in the open source world, including XStream, Digester, and Castor. XStream and Digester describe the mapping in code, whereas Castor requires a mapping configuration file similar to cfg.xml in Hibernate. Compared with Digester, the main advantage of XStream is that it is more compact and more convenient to use. Digester, however, is a subproject of Apache, the "big name" of open source, so more reference material is available online for it than for XStream. Fortunately XStream is simple enough that this does not hinder its use much.
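To make the idea of object-XML mapping concrete, here is a minimal hand-written sketch of the two directions that an OXMapping framework automates. It uses only the XML parser bundled with the JDK; it is not the API of XStream, Digester, or Castor, and the Employee bean and the HandWrittenMapping class are hypothetical names introduced for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical JavaBean used only for this illustration.
class Employee {
    private String number;
    private String name;

    public String getNumber() { return number; }
    public void setNumber(String number) { this.number = number; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class HandWrittenMapping {
    // XML -> JavaBean: copy each attribute of the root element into the bean.
    static Employee fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            Employee employee = new Employee();
            employee.setNumber(root.getAttribute("number"));
            employee.setName(root.getAttribute("name"));
            return employee;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse employee XML", e);
        }
    }

    // JavaBean -> XML: built by hand; special characters are not escaped,
    // which is acceptable only in a sketch like this one.
    static String toXml(Employee employee) {
        return "<employee number=\"" + employee.getNumber()
                + "\" name=\"" + employee.getName() + "\"/>";
    }

    public static void main(String[] args) {
        Employee employee = fromXml("<employee number=\"001\" name=\"Tom\"/>");
        System.out.println(employee.getName() + ", " + employee.getNumber());
        System.out.println(toXml(employee));
    }
}
```

Every new bean property needs another line in both methods here; removing that repetitive work is what an OXMapping framework is for.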
1.2 Using dom4j
dom4j is an easy-to-use, open source library for working with XML, XPath, and XSLT. It runs on the Java platform, uses the Java Collections Framework, and fully supports DOM, SAX, and JAXP. dom4j is an open source project on SourceForge.net at http://sourceforge.net/projects/dom4j.
Interface-based programming is a significant advantage of dom4j. The following diagram shows the inheritance hierarchy of its main interfaces:
Figure 5.1
Most of these interfaces are defined in the package org.dom4j. The meaning of each interface is briefly described below:
Table 5.1 Main dom4j interfaces
Interface | Description
Node | Base type interface for all XML nodes in dom4j
Attribute | Defines an XML attribute
Branch | Defines the common behavior of nodes that can contain child nodes, such as XML elements (Element) and documents (Document)
Document | Defines an XML document
Element | Defines an XML element
DocumentType | Defines an XML DOCTYPE declaration
Entity | Defines an XML entity
CharacterData | Marker interface that identifies character-based nodes such as CDATA, Comment, and Text
CDATA | Defines an XML CDATA section
Comment | Defines the behavior of an XML comment
Text | Defines an XML text node
ProcessingInstruction | Defines an XML processing instruction
Reading an XML file
In XML applications, the most common task is reading and parsing XML files. dom4j provides several ways to read an XML document: DOM tree traversal, the Visitor pattern, and XPath.
Either way, we'll start by constructing a Document object from an XML file:
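SAXReader reader = new SAXReader();
// xmlPath stands for the path of the XML file; the original listing does not give it
Document document = reader.read(new File(xmlPath));
Here SAXReader serves as the XML reader. When the data is already available as a W3C DOM tree, DOMReader can be used instead; it converts an existing org.w3c.dom.Document into a dom4j Document:
DOMReader reader = new DOMReader();
// w3cDocument is an existing org.w3c.dom.Document
Document document = reader.read(w3cDocument);
The read method of SAXReader has several overloads, so XML documents can be read from various sources such as an InputStream, a File, or a URL.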
(1) DOM tree traversal
This approach treats the document as an ordinary tree: to read the value of a node in the XML, use the usual tree traversal algorithms to locate the node.
To traverse the DOM tree, first get its root node:
Element root = document.getRootElement();
Once the root node is available, the tree can be read downward one level at a time:
// Traverse all child elements
for (Iterator i = root.elementIterator(); i.hasNext();)
{
    Element element = (Element) i.next();
    // Do something
}
// Traverse the child elements named "foo"
for (Iterator i = root.elementIterator("foo"); i.hasNext();)
{
    Element foo = (Element) i.next();
    // Do something
}
// Traverse the attributes
for (Iterator i = root.attributeIterator(); i.hasNext();)
{
    Attribute attribute = (Attribute) i.next();
    // Do something
}
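The loops above read one level of the tree; reaching deeper nodes means applying the same step recursively. The following sketch shows such a depth-first walk. It is written against the W3C DOM API bundled with the JDK rather than dom4j, so that it runs without extra libraries; the class name TreeWalkSketch and its methods are hypothetical. The same recursion applies to dom4j's elementIterator.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class TreeWalkSketch {
    // Depth-first walk: record the path of this element, then recurse into
    // each child element (text and comment nodes are skipped).
    static void walk(Element element, String parentPath, List<String> paths) {
        String path = parentPath + "/" + element.getTagName();
        paths.add(path);
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) child, path, paths);
            }
        }
    }

    // Parses the XML string and returns the path of every element in document order.
    static List<String> elementPaths(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> paths = new ArrayList<>();
            walk(doc.getDocumentElement(), "", paths);
            return paths;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<department><name>Development Department</name><level>2</level>"
                + "<employeeList><employee number=\"001\" name=\"Tom\"/></employeeList>"
                + "</department>";
        for (String path : elementPaths(xml)) {
            System.out.println(path);
        }
    }
}
```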
(2) Visitor pattern
DOM tree traversal is the most common way of reading XML, and other XML parsing engines, such as JDOM, read XML this way as well. dom4j, however, offers another way of reading: the Visitor approach. It implements the Visitor pattern, so the caller only has to write a Visitor. The Visitor pattern makes it easy to add new operations, and it lets a visitor gather related operations in one place while keeping unrelated ones apart.
A Visitor must implement the org.dom4j.Visitor interface. dom4j also provides a default adapter, org.dom4j.VisitorSupport, in the style of the Adapter pattern:
public class DemoVisitor extends VisitorSupport
{
    // Called once for every element in the tree
    public void visit(Element element)
    {
        System.out.println(element.getName());
    }
    // Called once for every attribute in the tree
    public void visit(Attribute attr)
    {
        System.out.println(attr.getName());
    }
}
The Visitor is then passed to a node to start the traversal:
root.accept(new DemoVisitor());
This approach has to traverse every node below the starting node, so it is slightly slower.
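The mechanics behind accept can be sketched with a toy model. The types below (ToyVisitor, ToyElement, ToyAttribute, NameCollector) are hypothetical and are not the dom4j classes; the visiting order chosen here (element, then its attributes, then its children) is an assumption of the sketch. It illustrates how the node structure drives the traversal and calls back into the visitor for every node, which is also why the whole subtree is always walked.

```java
import java.util.ArrayList;
import java.util.List;

// Toy node types for illustration only; these are NOT the dom4j classes.
interface ToyVisitor {
    void visit(ToyElement element);
    void visit(ToyAttribute attribute);
}

class ToyAttribute {
    final String name;

    ToyAttribute(String name) { this.name = name; }

    void accept(ToyVisitor visitor) { visitor.visit(this); }
}

class ToyElement {
    final String name;
    final List<ToyAttribute> attributes = new ArrayList<>();
    final List<ToyElement> children = new ArrayList<>();

    ToyElement(String name) { this.name = name; }

    // The element drives the traversal: itself first, then its attributes,
    // then every child element, recursively.
    void accept(ToyVisitor visitor) {
        visitor.visit(this);
        for (ToyAttribute attribute : attributes) {
            attribute.accept(visitor);
        }
        for (ToyElement child : children) {
            child.accept(visitor);
        }
    }
}

// A visitor that records the names of the nodes in visiting order.
class NameCollector implements ToyVisitor {
    final List<String> names = new ArrayList<>();

    public void visit(ToyElement element) { names.add("element:" + element.name); }
    public void visit(ToyAttribute attribute) { names.add("attribute:" + attribute.name); }
}

class VisitorSketch {
    static List<String> demo() {
        ToyElement root = new ToyElement("employeeList");
        ToyElement employee = new ToyElement("employee");
        employee.attributes.add(new ToyAttribute("number"));
        employee.attributes.add(new ToyAttribute("name"));
        root.children.add(employee);

        NameCollector collector = new NameCollector();
        root.accept(collector);
        return collector.names;
    }

    public static void main(String[] args) {
        // prints [element:employeeList, element:employee, attribute:number, attribute:name]
        System.out.println(demo());
    }
}
```

Adding a new operation means writing another ToyVisitor implementation; the node classes stay untouched.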
(3) XPath
The most appealing feature of dom4j is its integrated support for XPath. Not every XML parsing engine supports XPath, but it is a very useful feature.
XPath is a language for addressing, searching, and matching parts of a document. It uses path notation, similar to the paths used in file systems and URLs, to specify and match parts of the document. For example, the XPath /x/y/z starts at the document's root element x and selects the z elements that are children of a y element under x. The expression returns all nodes that match the given path structure. /x/y/* returns every child element of a y element whose parent is x. /x/y[@name='a'] matches all y elements whose parent is x and that have an attribute called name with the value a.
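The three expressions just described can be tried out with a small sketch. It uses the javax.xml.xpath API bundled with the JDK instead of dom4j, so it runs without extra libraries; the expressions themselves are plain XPath 1.0 and are not tied to either engine. The sample document and the class name XPathSketch are made up for this illustration, and the attribute value is written as the quoted string literal 'a', as XPath requires.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Made-up sample document: root x with two y children.
    static final String XML =
            "<x>"
          + "<y name=\"a\"><z>1</z><w>2</w></y>"
          + "<y name=\"b\"><z>3</z></y>"
          + "</x>";

    // Evaluates the expression against the sample document and
    // returns the number of matching nodes.
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(XML)),
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/x/y/z"));            // prints 2: one z under each y
        System.out.println(count("/x/y/*"));            // prints 3: z and w, then z
        System.out.println(count("/x/y[@name='a']"));   // prints 1: only the first y
    }
}
```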
XPath greatly simplifies XML handling: the user only tells the engine, through a matching expression, which part of the document to match, and the XPath engine does the actual matching. This is much closer to the way people naturally think. Let's look at a practical example.
The following XML file records basic information about a department:
<?xml version="1.0" encoding="GB2312"?>
<department>
    <name>Development Department</name>
    <level>2</level>
    <employeeList>
        <employee number="001" name="Tom"/>
        <employee number="002" name="Jim"/>
        <employee number="003" name="Lily"/>
    </employeeList>
</department>
name is the department name, level is the department level, and employeeList is the list of employees in the department. The program below reads this file and prints the department's information.
Code 5.1 XPath Demo
InputStream instream = null;
try
{
    instream = Dom4jDemo01.class.getResourceAsStream(
            "/com/cownew/char0502/department01.xml");
    SAXReader reader = new SAXReader();
    // Pass the stream itself so the parser honors the encoding declared in the file
    Document doc = reader.read(instream);
    Node nameNode = doc.selectSingleNode("//department/name");
    System.out.println("Department name: " + nameNode.getText());
    Node levelNode = doc.selectSingleNode("//department/level");
    System.out.println("Department level: " + levelNode.getText());
    // XPath is case-sensitive: the element is called employeeList
    List employeeNodeList = doc
            .selectNodes("//department/employeeList/employee");
    System.out.println("Department employees:");
    for (int i = 0, n = employeeNodeList.size(); i < n; i++)
    {
        // Cast to the Element interface rather than to the DefaultElement implementation
        Element employeeElement = (Element) employeeNodeList.get(i);
        String name = employeeElement.attributeValue("name");
        String number = employeeElement.attributeValue("number");
        System.out.println(name + ", employee number: " + number);
    }
} finally
{
    // ResourceUtils is the helper used in the original listing to close the stream
    ResourceUtils.close(instream);
}
Result:
Department name: Development Department
Department level: 2
Department employees:
Tom, employee number: 001
Jim, employee number: 002
Lily, employee number: 003
With XPath, a specific node can be addressed directly with a very clear expression such as "//department/name". A single node is located with the selectSingleNode method, and multiple nodes are located with the selectNodes method.
All XML files in the case system are parsed with XPath, including in Clientconfig.java, Serverconfig.java, Entitymetadataparser.java, and so on.
Creating XML files
Creating an XML file with dom4j works much as it does with other XML engines: first build a tree of nodes starting from the root node of the Document, then call the corresponding IO classes to save the XML to the appropriate medium.
The following listing generates the department information XML file shown earlier:
Code 5.2 XML Creation Demo
import java.io.FileOutputStream;
import java.io.IOException;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;
public class Dom4jDemo02
{
    public static void main(String[] args)
    {
        // Create a Document object
        Document document = DocumentHelper.createDocument();
        // Add the root node "department"
        Element departElement = document.addElement("department");
        // Add the "name" node
        Element departNameElement = DocumentHelper.createElement("name");
        departNameElement.setText("Development Department");
        departElement.add(departNameElement);
        // Add the "level" node
        Element departLevelElement = DocumentHelper.createElement("level");
        departLevelElement.setText("2");
        departElement.add(departLevelElement);
        // Add the employee list node "employeeList"
        Element employeeElementList = DocumentHelper
                .createElement("employeeList");
        departElement.add(employeeElementList);
        // Add the employee nodes "employee"
        Element emp1Element = DocumentHelper.createElement("employee");
        emp1Element.addAttribute("number", "001");
        emp1Element.addAttribute("name", "Tom");
        employeeElementList.add(emp1Element);
        Element emp2Element = DocumentHelper.createElement("employee");
        emp2Element.addAttribute("number", "002");
        emp2Element.addAttribute("name", "Jim");
        employeeElementList.add(emp2Element);
        Element emp3Element = DocumentHelper.createElement("employee");
        // Add attributes
        emp3Element.addAttribute("number", "003");
        emp3Element.addAttribute("name", "Lily");
        employeeElementList.add(emp3Element);
        try
        {
            writeToFile(document, "C:/department.xml");
        } catch (IOException e)
        {
            e.printStackTrace();
        }
    }
    private static void writeToFile(Document document, String file)
            throws IOException
    {
        // Pretty-print format
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("GB2312");
        XMLWriter writer = null;
        try
        {
            // Write through an OutputStream so that XMLWriter encodes the bytes with
            // the encoding set on the format; a FileWriter would use the platform
            // default charset, which may not match the declared encoding
            writer = new XMLWriter(new FileOutputStream(file), format);
            writer.write(document);
        } finally
        {
            if (writer != null)
                writer.close();
        }
    }
}
After the program runs, C:/ contains a department.xml file with the same content as the department file shown earlier.
Here are two points to keep in mind:
(1) OutputFormat format = OutputFormat.createPrettyPrint()
XML files often need to be read by people. By default dom4j writes a compact format, which saves space but is very hard to read, so we use the pretty-print format for output.
(2) format.setEncoding("GB2312")
The default encoding of dom4j is "UTF-8", which can cause problems when outputting Chinese characters in some environments, so we change it to "GB2312". Whichever encoding is chosen, it must match the encoding actually used to write the bytes; otherwise the XML declaration and the file content disagree and Chinese characters come out garbled.
The listing uses the createElement method of the dom4j utility class DocumentHelper to create nodes. DocumentHelper also offers methods such as public static CDATA createCDATA(String text), public static Comment createComment(String text), and public static Entity createEntity(String name, String text), which help create nodes quickly. It also provides a parseText method that parses a string directly into a Document object.