Java XML Tutorial (Chapter 5)

Tutorial source: http://d23xapp2.cn.ibm.com/developerWorks/education/xml/xmljava/tutorial/xmljava-1-1.html

Chapter 5: Advanced parser functions


Overview

We have discussed the basics of using an XML parser to work with XML documents. In this section, we will explore some advanced concepts.

First, we'll build a DOM tree from scratch. In other words, we will not need an XML source file to create a Document object.

We will then show you how to use the parser to process XML documents contained in a string.

Next, we'll show you how to manipulate a DOM tree. We will take our sample XML document and sort the lines of the sonnet.

Finally, we'll show you how standard interfaces such as DOM and SAX make it easy to change parsers. We'll show you two sample applications that use a different XML parser, while the DOM and SAX code stays the same.

Build a DOM tree from scratch

Sometimes you want to build a DOM tree from scratch. To do this, you create a Document object and then add various Node objects to it.

You can run java domBuilder to see an example application that builds a DOM tree from scratch. The application recreates the DOM tree (minus the whitespace) that was built by the initial parsing of sonnet.xml.

We first create an instance of the DocumentImpl class. This class implements the Document interface defined by the DOM.

Document doc = (Document) Class.
    forName("com.ibm.xml.dom.DocumentImpl").
    newInstance();
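
The DocumentImpl class above is specific to IBM's XML4J. As a hedged sketch of the same step on a current JDK, the standard JAXP DocumentBuilderFactory can also hand out an empty Document. The class name EmptyDocumentSketch and the method newDocument are made up for illustration and are not part of the tutorial's sample code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical helper class, not part of the tutorial's samples.
class EmptyDocumentSketch {

    // Creates an empty DOM Document without reading any XML source.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
        } catch (ParserConfigurationException e) {
            // Assumption: a misconfigured JAXP setup is treated as fatal.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        // A freshly created Document has no root element yet.
        System.out.println("Has root element: "
                + (doc.getDocumentElement() != null));
    }
}
```

The rest of the DOM code (createElement, appendChild and so on) should not care which of the two ways produced the Document.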


domBuilder.java (see Appendix 2 for the code)

This code does not use an XML document to build a DOM tree. When the tree is built, the code outputs the contents of the tree to standard output.

...
<address>
<name>
<title>Mrs.</title>
<first-name>Mary</first-name>
<last-name>McGoon</last-name>
</name>
<street>1401 Main street</street>
<city>Anytown</city>
<state>NC</state>
<zip>34829</zip>
</address>
<address>
<name>
...

Adding Nodes to our Document

Now that we have our own Document object, we can start creating Nodes. The first Node we want to create is a <sonnet> element. We will create all of the nodes and then add each node to its proper parent.

Note that we use the setAttribute method to set the value of the type attribute of the <sonnet> element.

Element root = doc.
    createElement("sonnet");
root.setAttribute("type",
    "Shakespearean");

Building your document structure

As we build our DOM tree, we need to create the structure of our document. To do that, we use the appendChild method in the right places. We'll create the <author> element, create the elements that belong beneath it, and then use appendChild to add each of these elements to its correct parent.

Note that createElement is a method of the Document interface. Our Document object owns all of the elements that we create here.

Finally, notice that we create a Text node for the content of every element. The Text node is a child of its element, and that element is in turn added to its own parent.

Element author =
    doc.createElement("author");

Element lastName = doc.
    createElement("last-name");
lastName.appendChild(doc.
    createTextNode("Shakespeare"));
author.appendChild(lastName);
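
Because every element with text content needs the same three calls (createElement, createTextNode, appendChild), a small helper can keep the structure-building code short. The following is only a sketch under stated assumptions: the helper appendTextElement and the class AuthorSketch are hypothetical names, the Document comes from standard JAXP rather than XML4J, and the first-name value is added purely for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch, not the tutorial's domBuilder.java.
class AuthorSketch {

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates <name>text</name> and appends it as the last child of parent.
    static Element appendTextElement(Document doc, Element parent,
                                     String name, String text) {
        Element child = doc.createElement(name);
        child.appendChild(doc.createTextNode(text));
        parent.appendChild(child);
        return child;
    }

    // Builds an <author> element with two text-only children.
    static Element buildAuthor(Document doc) {
        Element author = doc.createElement("author");
        appendTextElement(doc, author, "last-name", "Shakespeare");
        appendTextElement(doc, author, "first-name", "William"); // illustrative value
        return author;
    }

    public static void main(String[] args) {
        Element author = buildAuthor(newDocument());
        System.out.println(author.getFirstChild().getNodeName());
    }
}
```

Note that buildAuthor only creates the subtree; it still has to be appended to a parent that is part of the document.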

Completing our DOM tree

Once we have added everything to the <sonnet> element, we need to add it to the Document object. We call the appendChild method one last time, this time to append the root element to the Document object itself.

Remember that an XML document can have only one root element. If you try to add more than one root element to the Document, appendChild throws an exception.

Once the DOM tree is built, we create a domBuilder object and call its printDOMTree method to print the DOM tree.

Element line14 = doc.
    createElement("line");
line14.appendChild(doc.
    createTextNode("As any ..."));
text.appendChild(line14);
root.appendChild(text);

doc.appendChild(root);

domBuilder db = new domBuilder();
db.printDOMTree(doc);
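
The single-root rule is easy to check for yourself. The sketch below assumes the DOM implementation bundled with the JDK (obtained through JAXP, not XML4J) and a hypothetical class name SingleRootSketch. It appends a second root element and reports whether the attempt was rejected with the HIERARCHY_REQUEST_ERR code that the DOM specification defines for this case.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

// Hypothetical sketch, not part of the tutorial's samples.
class SingleRootSketch {

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if adding a second root element is rejected.
    static boolean secondRootRejected() {
        Document doc = newDocument();
        doc.appendChild(doc.createElement("sonnet"));     // first root: accepted
        try {
            doc.appendChild(doc.createElement("sonnet")); // second root
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.HIERARCHY_REQUEST_ERR;
        }
    }

    public static void main(String[] args) {
        System.out.println("Second root rejected: " + secondRootRejected());
    }
}
```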

Using DOM objects to avoid parsing

You can think of a DOM Document object as the compiled form of an XML document. If you are using XML to move data between different parties, you can save time by sending and receiving DOM objects instead of XML source data.

This is the most common reason why you need to build a DOM tree from scratch.

In the worst case, you have to create XML source from a DOM tree before you send the data, and then rebuild a DOM tree when you receive the XML data. Using DOM objects directly saves a great deal of time.

A caveat: be aware that a DOM object may be significantly larger than the XML source. If you have to send data over a very slow connection, sending the smaller XML source and reparsing it may be more efficient than sending the larger DOM object.
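
For the case where you do need XML source from a DOM tree before sending it, one possible approach on a current JDK is an identity transform from javax.xml.transform. This is an assumption on our part, since the tutorial does not say how the source is produced; the class SerializeSketch and its methods are hypothetical names.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: turns a DOM tree back into XML source text.
class SerializeSketch {

    // Builds a tiny document from scratch so there is something to serialize.
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("sonnet");
            root.setAttribute("type", "Shakespearean");
            Element author = doc.createElement("author");
            Element lastName = doc.createElement("last-name");
            lastName.appendChild(doc.createTextNode("Shakespeare"));
            author.appendChild(lastName);
            root.appendChild(author);
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the whole document with an identity transform.
    static String toXml(Document doc) {
        try {
            Transformer identity =
                    TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(buildSample()));
    }
}
```

The receiving side would then have to parse this text again, which is exactly the round trip that passing DOM objects directly avoids.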

Parsing an XML string

You may well need to parse an XML string at some point. IBM's XML4J parser supports this, although you first have to convert your string into an InputSource object.

The first step is to create a StringReader object from your string. Once you have done that, you can create an InputSource object from the StringReader.

You can run java parseString to see the results of this code. In the sample application, the XML string is hardcoded; there are many ways you could get XML input from a user or another machine. With this technique, you no longer have to write the XML document out to a file system in order to parse it.

parseString ps = new parseString();
StringReader sr =
    new StringReader("<?xml version=\"1.0\"?>" +
                     "<a>alphabravo ...");
InputSource isrc = new InputSource(sr);
ps.parseAndPrint(isrc);
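
The same StringReader and InputSource pattern works with the standard JAXP DocumentBuilder, which can serve as a stand-in if XML4J is not available. The following is a minimal sketch under that assumption; the class ParseStringSketch and the sample string are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the tutorial's parseString.java.
class ParseStringSketch {

    // Parses XML held in a String; no file is ever written.
    static Document parse(String xml) {
        try {
            InputSource source = new InputSource(new StringReader(xml));
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(source);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document.
        Document doc =
                parse("<?xml version=\"1.0\"?><a>alpha<b>bravo</b></a>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```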

parseString.java (see Appendix 2)

This code shows how to parse a string that contains an XML document.



Sorting Nodes in a DOM tree

To show how you can change the structure of a DOM tree, we will modify our DOM sample to sort the <line> elements of the 14 lines of the sonnet. There are a variety of DOM methods that can be used to move nodes around in the DOM tree.

To see the results of this code, run java domSorter sonnet.xml. It does nothing for the rhythm of the poem, but it does sort the <line> elements.

To begin sorting, we use the getElementsByTagName method to retrieve all of the <line> elements in the document. This method saves us the trouble of writing code to traverse the entire tree.

if (doc != null)
{
  sortLines(doc);
  printDOMTree(doc);
}
...
public void sortLines(Document doc)
{
  NodeList theLines =
    doc.getDocumentElement().
    getElementsByTagName("line");
  ...

domSorter.java (see Appendix 2)

This code finds all the <line> elements in the XML document and then sorts them. It shows how to manipulate a DOM tree.



Retrieving the text of our <line>s

To simplify the code, we create a helper function, getTextFromLine, that retrieves the text contained in a <line> element. It simply looks at the first child of the <line> element and returns its text if that child is a Text node.

This method returns a Java String so that our sort routine can use the String.compareTo method to determine the sort order.

This code should really check all of the children of the <line>, because they may include entity references (for example, the entity &miss; might stand for the text "Mistress"). We leave this improvement as an exercise for the reader.

public String getTextFromLine(Node
                              lineElement)
{
  StringBuffer returnString =
    new StringBuffer();
  if (lineElement.getNodeName().
      equals("line"))
  {
    NodeList kids = lineElement.
      getChildNodes();
    // Guard against an empty <line>, where item(0) would be null
    if (kids != null && kids.getLength() > 0)
      if (kids.item(0).getNodeType() ==
          Node.TEXT_NODE)
        returnString.append(kids.item(0).
          getNodeValue());
  }
  else
    returnString.setLength(0);

  return new String(returnString);
}
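
One possible take on the exercise mentioned above is sketched here. It walks all children of the <line>, appends Text and CDATA content, and descends into entity reference nodes if the parser kept them. It assumes the standard JAXP parser in place of XML4J, and the names LineTextSketch and collectText are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of the "check all children" improvement.
class LineTextSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the text of a <line> element, or "" for any other node.
    static String getTextFromLine(Node lineElement) {
        if (!"line".equals(lineElement.getNodeName())) {
            return "";
        }
        StringBuilder text = new StringBuilder();
        collectText(lineElement, text);
        return text.toString();
    }

    // Appends the text of every Text/CDATA child, in document order.
    static void collectText(Node parent, StringBuilder out) {
        for (Node kid = parent.getFirstChild(); kid != null;
                kid = kid.getNextSibling()) {
            switch (kid.getNodeType()) {
                case Node.TEXT_NODE:
                case Node.CDATA_SECTION_NODE:
                    out.append(kid.getNodeValue());
                    break;
                case Node.ENTITY_REFERENCE_NODE:
                    collectText(kid, out); // replacement text lives below
                    break;
                default:
                    break; // child elements, comments, etc. are ignored
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE line [<!ENTITY miss \"Mistress\">]>"
                + "<line>My &miss; eyes</line>"; // made-up sample line
        System.out.println(
                getTextFromLine(parse(xml).getDocumentElement()));
    }
}
```

Whether the parser expands entity references into plain text or keeps them as separate nodes, the loop should produce the same string, which is the point of checking every child.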

Text sorting

Now that we can get the text from a given <line> element, we are ready to sort the data. Since we have only 14 elements, we will use a bubble sort.

The bubble sort algorithm compares two adjacent values and swaps them if they are out of order. To do the swap, we use the getParentNode and insertBefore methods.

getParentNode returns the parent of any Node; we use this method to get the parent of the current <line> (a <lines> element, for documents that use the sonnet DTD).

insertBefore(nodeA, nodeB) inserts nodeA into the DOM tree before nodeB. The most important feature of insertBefore is that if nodeA already exists in the DOM tree, it is removed from its current position before being inserted in front of nodeB.

public void sortLines(Document doc)
{
  NodeList theLines =
    doc.getDocumentElement().
    getElementsByTagName("line");
  if (theLines != null)
  {
    int len = theLines.getLength();
    for (int i = 0; i < len; i++)
      for (int j = 0; j < (len - 1 - i); j++)
        if (getTextFromLine(
              theLines.item(j)).
            compareTo(getTextFromLine(
              theLines.item(j + 1))) > 0)
          theLines.item(j).
            getParentNode().insertBefore(
              theLines.item(j + 1),
              theLines.item(j));
  }
}
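
To try the sort without the sonnet file, the same bubble sort can be run against a small in-memory document. The sketch below makes several assumptions: it uses the standard JAXP parser instead of XML4J, the three <line> values are made up, the class SortLinesSketch is a hypothetical name, and the NodeList returned by getElementsByTagName is live, so it reflects each move immediately.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the tutorial's domSorter.java.
class SortLinesSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Text of the first child if it is a Text node, otherwise "".
    static String textOf(Node line) {
        Node first = line.getFirstChild();
        return (first != null && first.getNodeType() == Node.TEXT_NODE)
                ? first.getNodeValue() : "";
    }

    // Bubble sort: swap adjacent <line> elements that are out of order.
    static void sortLines(Document doc) {
        NodeList lines =
                doc.getDocumentElement().getElementsByTagName("line");
        int len = lines.getLength();
        for (int i = 0; i < len; i++) {
            for (int j = 0; j < len - 1 - i; j++) {
                Node a = lines.item(j);
                Node b = lines.item(j + 1);
                if (textOf(a).compareTo(textOf(b)) > 0) {
                    // Moves b in front of a; b leaves its old position first.
                    a.getParentNode().insertBefore(b, a);
                }
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<lines><line>c</line><line>a</line>"
                + "<line>b</line></lines>");
        sortLines(doc);
        NodeList sorted = doc.getElementsByTagName("line");
        for (int i = 0; i < sorted.getLength(); i++) {
            System.out.println(textOf(sorted.item(i)));
        }
    }
}
```

Like the original, this only works as written when all the <line> elements share the same parent.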

Useful DOM methods for manipulating the tree

In addition to insertBefore, there are other DOM methods you can use to manipulate the tree.

parentNode.appendChild(newChild)
Adds a node as the last child of the given parent node. Calling parentNode.insertBefore(newChild, null) does the same thing.
parentNode.replaceChild(newChild, oldChild)
Replaces oldChild with newChild. The node oldChild must be a child of parentNode.
parentNode.removeChild(oldChild)
Removes oldChild from parentNode.

parentNode.appendChild(newChild);
...
parentNode.insertBefore(newChild,
                        oldChild);
...
parentNode.replaceChild(newChild,
                        oldChild);
...
parentNode.removeChild(oldChild);
...
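
The effect of each of these methods is easiest to see on a small tree. The following sketch applies them in sequence to a parent with a few empty child elements; the element names and the class TreeOpsSketch are made up for illustration, and the Document comes from standard JAXP.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch showing the four tree-manipulation methods.
class TreeOpsSketch {

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Joins the names of parent's children, e.g. "a,b,c".
    static String childNames(Node parent) {
        StringBuilder names = new StringBuilder();
        for (Node kid = parent.getFirstChild(); kid != null;
                kid = kid.getNextSibling()) {
            if (names.length() > 0) {
                names.append(',');
            }
            names.append(kid.getNodeName());
        }
        return names.toString();
    }

    static String demo() {
        Document doc = newDocument();
        Element parent = doc.createElement("parent");
        Element a = doc.createElement("a");
        Element b = doc.createElement("b");
        Element c = doc.createElement("c");
        Element d = doc.createElement("d");

        parent.appendChild(a);        // children: a
        parent.appendChild(c);        // children: a,c
        parent.insertBefore(b, c);    // children: a,b,c
        parent.replaceChild(d, a);    // children: d,b,c
        parent.removeChild(b);        // children: d,c
        parent.insertBefore(a, null); // same as appendChild: d,c,a
        return childNames(parent);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```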

Things to note about tree operations

If you need to remove all the children of a given node, be aware that it is harder than it looks. Both code samples below look like they would do the job, but only the second one actually works. The first sample fails because kid's instance data is updated as soon as removeChild(kid) is called.

In other words, the for loop removes kid, the first child, and then checks whether kid.getNextSibling() is null. Because kid has just been removed, it no longer has any siblings, so kid.getNextSibling() is null. The for loop runs only once. Whether node has one child or a thousand, the first sample removes only the first child. Use the second sample to remove all of the child nodes.

/** Doesn't work **/
for (Node kid = node.getFirstChild();
     kid != null;
     kid = kid.getNextSibling())
  node.removeChild(kid);

/** Does work **/
while (node.hasChildNodes())
  node.removeChild(node.getFirstChild());
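
If you prefer to keep a loop that walks the siblings, another option is to remember the next sibling before removing the current child. This variant is not from the tutorial; it is a sketch with hypothetical names (RemoveChildrenSketch, removeAllChildren) that uses a JAXP-created Document for the demonstration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch of a sibling-walking removal loop that works.
class RemoveChildrenSketch {

    static void removeAllChildren(Node node) {
        Node kid = node.getFirstChild();
        while (kid != null) {
            Node next = kid.getNextSibling(); // read before kid is detached
            node.removeChild(kid);
            kid = next;
        }
    }

    // Builds a parent with three children for the demonstration.
    static Element sampleParent() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element parent = doc.createElement("parent");
            for (int i = 0; i < 3; i++) {
                parent.appendChild(doc.createElement("kid"));
            }
            return parent;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element parent = sampleParent();
        removeAllChildren(parent);
        System.out.println("Children left: "
                + parent.getChildNodes().getLength());
    }
}
```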


Using another DOM parser

Although we can't imagine why you would want to, you can use a parser other than XML4J to parse your XML document. If you look at the code for domTwo.java, you will see that switching to Sun's XML parser requires only two changes.

First, we have to import Sun's classes. That's simple enough. The only other thing we have to change is the code that creates the Parser object. As you can see, the setup for Sun's parser is more complicated, but the rest of the code is unchanged. None of the DOM code requires any modification.

Finally, the only other difference in domTwo is the command-line format. For some reason, Sun's parser does not resolve file names in the usual way. If you run java domTwo file:///d:/sonnet.xml (modifying the file URI for your system, of course), you will get the same results you saw with domOne.

import com.sun.xml.parser.Parser;
import
  com.sun.xml.tree.XmlDocumentBuilder;

...

XmlDocumentBuilder builder =
  new XmlDocumentBuilder();
Parser parser =
  new com.sun.xml.parser.Parser();
parser.setDocumentHandler(builder);
builder.setParser(parser);
parser.parse(uri);
doc = builder.getDocument();
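
When a parser insists on a file URI rather than a plain file name, the conversion does not have to be done by hand on the command line. As a small sketch (the class FileUriSketch and method toFileUri are hypothetical, and this is not how domTwo.java handles it), java.io.File can produce the URI:

```java
import java.io.File;

// Hypothetical helper: turns a file name into a file: URI string.
class FileUriSketch {

    static String toFileUri(String fileName) {
        // Relative names are resolved against the current directory.
        return new File(fileName).toURI().toString();
    }

    public static void main(String[] args) {
        System.out.println(toFileUri("sonnet.xml"));
    }
}
```

The result uses the form file:/path/to/sonnet.xml, with the path depending on your system.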


domTwo.java (see Appendix 2)

This code is equivalent to domOne.java, but it uses Sun's XML parser instead of IBM's. It shows the portability of the DOM interface.



Using a different SAX parser

We've also written saxTwo.java to show how to use Sun's SAX parser. As with domTwo, we have two changes to make. The first is to import Sun's Resolver class rather than IBM's SAXParser class.

We also need to change the code that creates the Parser object, and we need to create an InputSource object based on the URI we were given. The only other change is that the line that creates the parser has to sit inside a try block, to catch any exception that might occur when we create the Parser object.

import com.sun.xml.parser.Resolver;
...

try
{
  Parser parser =
    ParserFactory.makeParser();
  parser.setDocumentHandler(this);
  parser.setErrorHandler(this);
  parser.parse(Resolver.
    createInputSource(new File(uri)));
}
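
On a current JDK, the parser-independent way to obtain a SAX parser is the JAXP SAXParserFactory, with event handling in a SAX2 DefaultHandler rather than the older DocumentHandler interface shown above. The sketch below assumes that substitution; the class SaxSketch, the method countElements, and the sample input are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, not the tutorial's saxTwo.java.
class SaxSketch {

    // Counts start tags with the given name using SAX events.
    static int countElements(String xml, String name) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        try {
            // The factory picks whichever SAX implementation is configured.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<lines><line>a</line><line>b</line></lines>";
        System.out.println(countElements(xml, "line"));
    }
}
```

Because the handler depends only on the org.xml.sax interfaces, swapping the underlying parser should not require touching it.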

saxTwo.java (see Appendix 2)

This code is equivalent to saxOne.java, but it uses Sun's XML parser instead of IBM's. It shows the portability of the SAX interface.


Summary

In this section, we introduced some advanced programming techniques that use XML parsers. We showed how to build DOM trees directly, how to parse strings instead of files, how to move elements around in a DOM tree, and how to change parsers without affecting the DOM and SAX code.

Hope you enjoyed this tutorial!

That's it for this tutorial. We discussed the basic architecture of XML applications, and we described how you can work with XML documents. Future tutorials will cover more details of building XML applications, including:

Using visual tools to build XML applications
Converting XML documents from one form to another
Creating interfaces for end users or other processes, and interfaces to back-end data stores

For more information

If you want to learn more about XML, you can visit the developerWorks XML zone. The site has sample code, other tutorials, information about XML standards, and more.

Finally, we'd love to hear from you! We designed developerWorks as a resource for developers. If you have any comments, suggestions, or complaints, please let us know.

Thank you. Doug Tidwell, or the developerWorks China site


