This article briefly discusses four common methods of updating XML documents in Java programs and analyzes the advantages and disadvantages of each. It also discusses how to control the format of the XML output that a Java program produces.
JAXP is the acronym for the Java API for XML Processing, a programming interface written in the Java language for processing XML documents. JAXP supports standards such as DOM, SAX, and XSLT. To make JAXP more flexible, its designers gave it a pluggability layer. With the pluggability layer, JAXP can work with the various XML parsers that implement the DOM and SAX APIs (such as Apache Xerces) and with the XSLT processors that implement the XSLT standard (such as Apache Xalan). The advantage of the pluggability layer is that we only need to be familiar with the programming interfaces defined by JAXP; we do not need a thorough understanding of the specific XML parser or XSLT processor in use. For example, suppose a Java program uses JAXP to invoke the XML parser Apache Crimson, and we want to switch to another XML parser (such as Apache Xerces) to improve performance. The original program code may not need any change at all: all you need to do is add the jar file containing Apache Xerces to the CLASSPATH environment variable and remove the jar file containing Apache Crimson from it.
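The effect of the pluggability layer can be sketched with a few lines that name only JAXP and DOM interfaces. This is a minimal illustration, not code from the original example: the class name PluggabilityDemo and its helper methods are hypothetical, and the factory class it prints depends on whichever parser is found at run time.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: the code refers only to JAXP and DOM interfaces,
// so the concrete parser can be swapped without touching it.
final class PluggabilityDemo {

    // Parses an XML string with whichever parser JAXP selects at run time.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the name of the factory class that was actually plugged in.
    static String parserImplementation() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        Document doc = parse("<users><user>fancy</user></users>");
        System.out.println("root element:   " + doc.getDocumentElement().getTagName());
        System.out.println("parser factory: " + parserImplementation());
    }
}
```

Besides changing the CLASSPATH, the parser can usually be selected with the system property javax.xml.parsers.DocumentBuilderFactory, for example by setting it to org.apache.xerces.jaxp.DocumentBuilderFactoryImpl or org.apache.crimson.jaxp.DocumentBuilderFactoryImpl, provided the corresponding jar is available.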
Currently, JAXP is widely used and can be regarded as the standard API for processing XML documents in the Java language. Beginners learning JAXP often raise this question: the program I wrote updates the DOM tree, but when the program exits, the original XML document is unchanged; how do I keep the original XML document and the DOM tree in sync? At first glance, JAXP seems to offer no corresponding interface, method, or class, and this puzzles many beginners. The aim of this article is to solve this problem by briefly introducing several common methods for synchronizing the original XML document with the DOM tree. To narrow the scope of the discussion, the only XML parsers covered in this article are Apache Crimson and Apache Xerces, and the only XSLT processor is Apache Xalan.
Method One: Read and write XML documents directly
This is perhaps the crudest and most primitive way. After the program obtains the DOM tree, it updates the tree using the methods of the DOM Node interface, and the next step is to update the original XML document. We can use recursion or the TreeWalker class to traverse the entire DOM tree, writing each node and element to the original XML document, which has been opened for writing beforehand. Once the traversal is complete, the DOM tree and the original XML document are in sync. In practice this method is rarely used, but if you are writing your own XML parser it remains an option.
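The recursive traversal described above might look like the following sketch. It is an illustration only, not the author's code: the class name DomDump and its methods are hypothetical, it handles only elements, attributes, text, and CDATA (comments, processing instructions, and the DOCTYPE are skipped), and it builds a string that the caller would then write back to the file using the encoding named in the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper: serializes a DOM tree by hand with a recursive walk.
final class DomDump {

    // Parses an XML string into a DOM tree with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Serializes the document; the caller must write the result as UTF-8
    // so that it matches the declaration emitted here.
    static String toXml(Document doc) {
        StringBuilder out = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        write(doc.getDocumentElement(), out);
        return out.toString();
    }

    // Writes one node and, recursively, all of its children.
    static void write(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                out.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append(' ').append(attr.getNodeName()).append("=\"")
                       .append(escape(attr.getNodeValue())).append('"');
                }
                if (!node.hasChildNodes()) {
                    out.append("/>");
                    break;
                }
                out.append('>');
                for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                    write(c, out);
                }
                out.append("</").append(node.getNodeName()).append('>');
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                out.append(escape(node.getNodeValue()));
                break;
            default:
                break; // other node types are ignored in this sketch
        }
    }

    // Escapes the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Document doc = parse("<users><user id=\"1\">a &amp; b</user></users>");
        doc.getDocumentElement().appendChild(doc.createElement("fancy"));
        System.out.println(toXml(doc));
    }
}
```

Even this small sketch has to deal with escaping and node types by hand, which is one reason the method is rarely used in practice.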
Method Two: Use the XmlDocument class
Use the XmlDocument class? There is no such class in JAXP! Is the author mistaken? No: we really do use the XmlDocument class or, more precisely, its write() method.
As mentioned above, JAXP can be used with a variety of XML parsers, and this time the XML parser we choose is Apache Crimson. XmlDocument (org.apache.crimson.tree.XmlDocument) is a class of Apache Crimson and is not part of standard JAXP, which is why there is no trace of XmlDocument in the JAXP documentation. Now the question is: how do we use the XmlDocument class to update an XML document? The XmlDocument class provides the following three write() methods (as of the latest version, Apache Crimson 1.1.3):
public void write(OutputStream out) throws IOException
public void write(Writer out) throws IOException
public void write(Writer out, String encoding) throws IOException
The main function of these three write() methods is to output the contents of the DOM tree to a particular output medium, such as a file output stream or the application console. So how are they used? See the following Java program snippet:
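import java.io.FileOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.crimson.tree.XmlDocument;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

String name = "fancy";
DocumentBuilder parser;
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try
{
    parser = factory.newDocumentBuilder();
    // Build the DOM tree from the original document.
    Document doc = parser.parse("user.xml");
    // Append a new element to the end of the root element.
    Element newlink = doc.createElement(name);
    doc.getDocumentElement().appendChild(newlink);
    // Cast to Crimson's XmlDocument to reach write(); Document has no such method.
    FileOutputStream out = new FileOutputStream("xuser1.xml");
    ((XmlDocument) doc).write(out);
    out.close();
}
catch (Exception e)
{
    // log the error
    e.printStackTrace();
}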
The code above first creates a Document object doc to obtain the complete DOM tree, then uses the appendChild() method of the Node interface to append a new node (fancy) to the end of the DOM tree. Finally, it calls the write(OutputStream out) method of the XmlDocument class to output the contents of the DOM tree to xuser1.xml. (The output could equally go to user.xml, which would update the original XML document; it is written to xuser1.xml here only so that the two files can be compared.) Note that write() cannot be invoked directly on the Document object doc, because the JAXP Document interface does not define any write() method. The doc object must first be cast to XmlDocument, and only then can write() be called.
The write(OutputStream out) method used here outputs the contents of the DOM tree in the default UTF-8 encoding. If the DOM tree contains Chinese characters, the output may appear garbled; this is the so-called "Chinese character problem". The solution is to use the write(Writer out, String encoding) method and specify the output encoding explicitly, for example by setting the second parameter to "GB2312". The "Chinese character problem" then disappears and the output displays Chinese characters normally.
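The encoding issue comes down to the writer and the reader agreeing on a character set. The following JDK-only sketch does not use Crimson at all; it simply shows what happens when bytes written in one encoding are read back in another. The class name EncodingMismatchDemo and its methods are hypothetical, and the fallback to ISO-8859-1 is an assumption made in case GB2312 is not available in the runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: garbling appears when the charset used for reading
// differs from the charset used for writing.
final class EncodingMismatchDemo {

    // Encodes text with one charset and decodes the bytes with another.
    static String roundTrip(String text, Charset writeWith, Charset readWith) {
        byte[] bytes = text.getBytes(writeWith);
        return new String(bytes, readWith);
    }

    // GB2312 ships with most full JDKs but is not guaranteed to be present,
    // so fall back to ISO-8859-1 (assumption made for this sketch only).
    static Charset legacyCharset() {
        return Charset.isSupported("GB2312")
                ? Charset.forName("GB2312")
                : StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        String xml = "<user>\u4e2d\u6587</user>"; // element text is two Chinese characters

        String same = roundTrip(xml, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = roundTrip(xml, StandardCharsets.UTF_8, legacyCharset());

        System.out.println("matching charsets preserved text:   " + xml.equals(same));
        System.out.println("mismatched charsets preserved text: " + xml.equals(mixed));
    }
}
```

By the same reasoning, when write(Writer out, String encoding) is called with "GB2312", the Writer passed in should presumably be created with that same encoding (for example an OutputStreamWriter constructed with "GB2312"), so that the bytes on disk match the encoding named in the XML declaration.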
For a complete example, please refer to the following files: AddRecord.java (see annex) and user.xml (see annex). The operating environment for this example is Windows XP Professional with JDK 1.3.1. To compile and run AddRecord.java, you need to download Apache Crimson from http://xml.apache.org/dist/crimson/ and add the crimson.jar file you obtain to the CLASSPATH environment variable.
Note:
The predecessor of Apache Crimson is Sun's Project X parser. Project X later evolved into Apache Crimson, and even now much of the Apache Crimson code is carried over directly from the X parser. Take the XmlDocument class used above: in the X parser it is com.sun.xml.tree.XmlDocument, and in Apache Crimson it becomes org.apache.crimson.tree.XmlDocument. In fact the vast majority of the code is identical; probably only the package and import statements and the license at the top of each file differ. Early versions of JAXP were bundled with the X parser, so some older programs use the com.sun.xml packages, which is why they may fail to compile if you rebuild them today. Later versions of JAXP, such as JAXP 1.1, were bundled with Apache Crimson, so if you use JAXP 1.1 you can compile the example above (AddRecord.java) without downloading Apache Crimson separately. The latest JAXP 1.2 EA (Early Access) switches to the better-performing Apache Xalan and Apache Xerces as its XSLT processor and XML parser respectively, and does not directly support Apache Crimson. So if your development environment uses JAXP 1.2 EA or the Java XML Pack (which contains JAXP 1.2 EA), you will not be able to compile the example above (AddRecord.java) directly; you need to download and install Apache Crimson separately.