As mentioned in the introduction, Sun now provides these tools for XML processing in Java:
- StAX Reader/Writer
- SAX Parser
- DOM Parser
- XPath Evaluator
- XSL Processor
- JAXB
In the following sections I'll talk a bit about what these tools are, and what their purposes are.
StAX
The Java StAX API is a streaming API for reading and writing XML streams. As such it resembles the older SAX API, except that the SAX API can only be used to read XML, not write it.
The main difference between the SAX and StAX APIs is that when using SAX you provide a handler to the SAX parser. This handler then has methods called on it, corresponding to the entities (elements, comments, text etc.) found in the XML document. It's the SAX parser that controls the iteration through the XML document. With the StAX parser you control the iteration through the XML document. You call the next() method (or a corresponding method) when you are ready to process the next element, text node etc. From the next() method you obtain an object that can tell you what entity was met in the XML document.
Controlling the iteration through the XML document like this can be an advantage. The iteration can be kept within the scope of one method (perhaps calling submethods). This means that you can use the same local variables when processing a piece of text as when processing an element. With SAX, these entities are handled by different methods in your handler object. Thus, if you need to access shared variables from these different methods, these variables have to be member variables in the handler object. Not a huge difference, but local variables may still be preferable in many situations.
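To make the pull style concrete, here is a minimal sketch, assuming the event-based reader (XMLEventReader) is used and that a StAX implementation is available on the classpath. The class and method names (StaxPullSketch, describe) are made up for illustration. Note how one local list is shared by the handling of elements and text, all within a single method.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class, not part of the StAX API.
class StaxPullSketch {

    // Iterates through the document and records each entity that is met.
    static List<String> describe(String xml) {
        List<String> seen = new ArrayList<>();   // local variable used for every entity type
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Ask the parser to deliver adjacent text as one event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));

            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();   // we decide when to move on
                if (event.isStartElement()) {
                    seen.add("start:" + event.asStartElement().getName().getLocalPart());
                } else if (event.isCharacters() && !event.asCharacters().isWhiteSpace()) {
                    seen.add("text:" + event.asCharacters().getData());
                } else if (event.isEndElement()) {
                    seen.add("end:" + event.asEndElement().getName().getLocalPart());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(describe("<book><title>StAX</title></book>"));
    }
}
```

With SAX, the three branches in the loop would be three separate handler methods, and the list would have to be a member variable of the handler.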
The StAX API interfaces come with Java 6, which also bundles an implementation. A standalone implementation can be found at stax.codehaus.org.
SAX Parser
The SAX parser was the first API for processing XML entity by entity (element, text, comments etc.) as they are met during traversal of the XML document. When you use a SAX parser you pass a handler object to the SAX parser. This handler object has a method for every "event" you want to handle during the traversal of the XML document. Examples of events are startDocument, startElement, characters etc.
The SAX parser is mostly suited for processing XML documents where each element can be processed individually. Documents where you need access to earlier or later parts of the document in order to process a given element are not as easy to handle. If you need access to earlier parts, you can store that earlier part in a member variable in the handler object when it occurs. But if you need access to later elements, this is more difficult, unless you postpone processing the earlier elements until the later elements they need are encountered. If you need to jump around the XML document when processing it, it might be easier to use a DOM parser.
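The handler idea can be sketched like this. The example assumes a document containing title elements; the class names (SaxHandlerSketch, TitleHandler) and the element name are hypothetical. The handler keeps the name of the current element in a member variable, so that the characters callback knows which element the text belongs to.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the SAX API.
class SaxHandlerSketch {

    // The parser calls these methods; the handler does not control the iteration.
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private String currentElement;                        // state shared between callbacks
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            currentElement = qName;
            text.setLength(0);                                // start collecting fresh text
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one text node, so append.
            if ("title".equals(currentElement)) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName)) {
                titles.add(text.toString());
            }
            currentElement = null;
        }
    }

    // Parses the XML and returns the text of every title element, in document order.
    static List<String> titles(String xml) {
        TitleHandler handler = new TitleHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
        return handler.titles;
    }

    public static void main(String[] args) {
        System.out.println(titles("<books><book><title>A</title></book></books>"));
    }
}
```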
In most cases where you find a SAX parser useful, you'll be better off using a StAX reader instead.
The SAX interfaces and implementation come with Java (at least from Java 5).
DOM Parser
The DOM (Document Object Model) parser parses an XML document into an object graph. The whole document is converted into one big object graph. Once created, you can traverse the object graph at will. You can walk up and down in the graph as you please. This object graph takes up a lot of memory, so it should only be used in situations where no other option is suitable.
The DOM interfaces and implementation come with Java (at least from Java 5).
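As a rough illustration of walking up and down the graph, the sketch below parses a small document, steps down to the first element with a given tag name, and then steps back up to its parent. The class and method names (DomWalkSketch, parentOfFirst) are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of the DOM API.
class DomWalkSketch {

    // Returns the name of the parent of the first element called tagName,
    // or null if the document contains no such element.
    static String parentOfFirst(String xml, String tagName) {
        try {
            // The whole document is read into memory as an object graph.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Node node = doc.getElementsByTagName(tagName).item(0);   // walk down
            if (node == null) {
                return null;
            }
            return node.getParentNode().getNodeName();               // walk up
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library><book><title>DOM</title></book></library>";
        System.out.println(parentOfFirst(xml, "title"));
    }
}
```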
XPath Evaluator
Java comes with a built-in XPath evaluator. You set up an XPath expression, and have the evaluator evaluate the expression on a DOM object. The evaluator then returns the elements matching the XPath expression. XPath can be a handy way of finding the nodes you need to process, rather than navigating down to them yourself.
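A minimal sketch of this, assuming the document is first parsed into a DOM object and the expression selects elements: the helper below returns the text content of every matching node. The class and method names (XPathSketch, select) are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of the XPath API.
class XPathSketch {

    // Evaluates the expression on the parsed document and returns the
    // text content of each matching node, in document order.
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException
                 | XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate the expression", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library><book><title>A</title></book><book><title>B</title></book></library>";
        System.out.println(select(xml, "/library/book/title"));
    }
}
```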
XSL Processor
Java comes with a built-in XSL processor. An XSL processor transforms an input XML document into an output document (not necessarily XML), according to an XSL stylesheet. An example application of a stylesheet would be to transform an XML document containing data into an HTML document with that data presented in a human-readable format, for instance in tables, lists etc.
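The sketch below shows the data-to-HTML case in its simplest form, assuming a made-up input format (a fruits element containing fruit elements) and a small stylesheet that turns it into an HTML list. The class, constant and method names (XsltSketch, FRUITS_TO_HTML, transform) are hypothetical; the exact whitespace of the output may vary between XSL processors.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the XSL transformation API.
class XsltSketch {

    // Illustrative stylesheet: renders each fruit element as an HTML list item.
    static final String FRUITS_TO_HTML =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\" indent=\"no\"/>"
          + "<xsl:template match=\"/fruits\">"
          + "<ul><xsl:for-each select=\"fruit\">"
          + "<li><xsl:value-of select=\".\"/></li>"
          + "</xsl:for-each></ul>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to the input document and returns the output as a string.
    static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not transform the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<fruits><fruit>Apple</fruit><fruit>Pear</fruit></fruits>";
        System.out.println(transform(xml, FRUITS_TO_HTML));
    }
}
```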
JAXB
JAXB is an API that resembles the DOM API. JAXB lets you generate classes from an XML schema, matching the XML document defined in this schema. JAXB then lets you read an XML document conforming to this schema into an object structure built from the generated classes. You can also serialize this object structure back to disk or network.
The generated JAXB classes look more like regular domain objects. They have getters and setters with names matching the element names. The DOM API just has generic methods like createElement() etc., where the concrete element name is a parameter, and you need to know what elements can be added as children of any given element in the DOM structure. The JAXB generated classes thus help you more, by reflecting the allowed structure in class and method names.