Document Object Model (DOM)
Simple API for XML (SAX)
JDOM
Java API for XML Parsing (JAXP)
DOM
The Document Object Model (commonly called the DOM) defines a set of interfaces to the parsed version of an XML document. The parser reads the entire document and builds a tree that resides in memory, and your code then manipulates that tree through the DOM interfaces. You can traverse the tree to see what the original document contained, delete sections of the tree, rearrange the tree, add new branches, and so on.
The DOM has several disadvantages. It builds an in-memory tree of the entire document, so a large document requires a great deal of memory. It creates objects that represent everything in the original document, including elements, text, attributes, and whitespace; if you care about only a small part of the document, creating objects that will never be used is wasteful. And a DOM parser must read the entire document before your code gets control, which can cause a significant delay for very large documents. These issues are simply consequences of the Document Object Model's design; despite them, the DOM API is a very useful way to parse XML documents.
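The tree operations described above (traverse, delete, add) can be sketched with the DOM classes that ship with the JDK. This is a minimal illustration, not a definitive implementation: the class name `DomSketch` and the `catalog`/`book`/`title` element names are hypothetical and chosen only for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomSketch {

    // Parse the whole document into an in-memory tree before doing anything else.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Deletes the first <book>, appends a new one, and returns the titles left in the tree.
    // Assumes the (hypothetical) input contains at least one <book> element.
    static List<String> editCatalog(String xml) throws Exception {
        Document doc = parse(xml);
        Element root = doc.getDocumentElement();

        // Delete a branch: detach the first <book> element from its parent.
        Node first = root.getElementsByTagName("book").item(0);
        first.getParentNode().removeChild(first);

        // Add a branch: build <book><title>New Title</title></book> and attach it to the root.
        Element book = doc.createElement("book");
        Element title = doc.createElement("title");
        title.setTextContent("New Title");
        book.appendChild(title);
        root.appendChild(book);

        // Traverse: collect the text of every <title> now in the tree, in document order.
        List<String> titles = new ArrayList<>();
        NodeList titleNodes = doc.getElementsByTagName("title");
        for (int i = 0; i < titleNodes.getLength(); i++) {
            titles.add(titleNodes.item(i).getTextContent());
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog><book><title>First</title></book>"
                   + "<book><title>Second</title></book></catalog>";
        System.out.println(editCatalog(xml));
    }
}
```

Note that nothing in `editCatalog` runs until `parse` has consumed the entire input, which is the delay and memory cost described above.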
SAX
Several characteristics of SAX address the DOM's problems. A SAX parser sends events to your code: it tells you when it finds the start of an element, the end of an element, text, the start or end of the document, and so on. You decide which events matter to you and what kind of data structure to build to hold the information from them. If you do not explicitly save the data from an event, it is discarded. A SAX parser does not create any objects for the document's contents; it simply passes events to your application, and creating objects from those events is up to you.
A SAX parser also starts sending events as soon as parsing begins. When the parser finds the start of the document, the start of an element, or text, your code receives an event. Your application can start building results immediately; you do not have to wait until the entire document has been parsed. Even better, if you are looking for only one thing in the document, your code can throw an exception once it has found it. The exception stops the SAX parser, and your code then does whatever it needs to do with the data it found.
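The event flow described above can be sketched with the SAX classes in the JDK. This is an illustrative handler under stated assumptions: `SaxEventsSketch` and `RecordingHandler` are hypothetical names, and the handler chooses to keep a simple list of strings, where a real application would keep only what it needs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventsSketch {

    // Records each event as it arrives; anything the handler does not save is gone.
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startDocument() {
            events.add("startDocument");
        }

        @Override
        public void endDocument() {
            events.add("endDocument");
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks; whitespace-only chunks are skipped here.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text:" + text);
            }
        }
    }

    // Parses the XML string and returns the events in the order the parser sent them.
    static List<String> recordEvents(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        RecordingHandler handler = new RecordingHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        recordEvents("<note><to>Ann</to></note>").forEach(System.out::println);
    }
}
```

The handler is called while the parser is still reading, so the list starts filling before the end of the document is reached.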
To be fair, SAX has disadvantages of its own. SAX events are stateless: when the SAX parser finds text in an XML document, it sends an event to your code, but the event gives you only the text that was found; it does not tell you which element contains that text. If you want to know that, you must write the state management code yourself. SAX events are also not persistent. If your application needs a data structure that models the XML document, you must write that code yourself, and if you need data from a SAX event that your code did not store, you have to parse the document again. Finally, SAX is not controlled by a centrally managed organization. Although this has not caused any problems so far, some developers would feel more comfortable if SAX were controlled by a standards body such as the W3C.
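The state management mentioned above can be as small as a stack of open element names. The following is one possible sketch, not the only approach: `SaxStateSketch`, `PathHandler`, and the sample `order` document are hypothetical, and the handler records text under a slash-separated path built from the stack.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxStateSketch {

    // Tracks which elements are currently open so each text event can be tied to its container.
    static class PathHandler extends DefaultHandler {
        private final Deque<String> openElements = new ArrayDeque<>();
        final Map<String, String> textByPath = new LinkedHashMap<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            openElements.addLast(qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            openElements.removeLast();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            String text = new String(ch, start, length);
            if (text.trim().isEmpty()) {
                return; // ignore whitespace-only chunks between elements
            }
            // The event carries only the text; the path comes from our own stack.
            String path = "/" + String.join("/", openElements);
            // Text may arrive in several chunks, so append to what is already stored.
            textByPath.merge(path, text, String::concat);
        }
    }

    // Returns the text found in the document, keyed by the path of the containing element.
    static Map<String, String> textByPath(String xml) throws Exception {
        PathHandler handler = new PathHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.textByPath;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><customer><name>Ann</name></customer>"
                   + "<item><name>Pen</name></item></order>";
        System.out.println(textByPath(xml));
    }
}
```

Without the stack, the two `name` elements in the sample would be indistinguishable, because both produce the same kind of text event.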
JDOM
Frustrated by how difficult certain tasks were with the DOM and SAX models, Jason Hunter and Brett McLaughlin created the JDOM package. JDOM is an open-source project based on Java technology that tries to follow the 80/20 rule: meet the needs of 80% of users with 20% of the functionality of DOM and SAX. JDOM works with SAX and DOM parsers, so it is implemented as a relatively small set of Java classes. The main feature of JDOM is that it greatly reduces the amount of code you have to write. A JDOM application is typically about one-third the length of a DOM application and about half the length of a SAX application. (Of course, DOM purists argue that learning and using the DOM pays off in the long run.) JDOM does not do everything, but for most of the parsing you want to do, it may be just what you need.
JAXP
Although DOM, SAX, and JDOM provide standard interfaces for most common tasks, there are still things they do not address. For example, the process of creating a DOMParser object in a Java program differs from one DOM parser to the next. To fix this problem, Sun released JAXP, the Java API for XML Parsing. This API provides a common interface for processing XML documents using DOM, SAX, and XSLT. JAXP provides interfaces such as DocumentBuilderFactory and DocumentBuilder that give you a standard interface to different parsers. There are also methods that let you control whether the underlying parser is namespace-aware and whether it uses a DTD or schema to validate the XML document.
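As a rough sketch of the factory settings mentioned above, the following uses `DocumentBuilderFactory.setNamespaceAware` and `setValidating` from the JDK. The class name `JaxpConfigSketch`, the helper methods, and the `urn:example:inventory` namespace are assumptions made for the example; validation is left off in the demo because the sample document has no DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class JaxpConfigSketch {

    // Obtains a parser through the JAXP factory, so no vendor-specific parser class is named.
    static DocumentBuilder newBuilder(boolean namespaceAware, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware); // should the parser recognize namespaces?
        factory.setValidating(validating);         // should the parser validate against a DTD?
        return factory.newDocumentBuilder();
    }

    // Returns the namespace URI the parser reports for the root element (null if none is reported).
    static String rootNamespace(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilder builder = newBuilder(namespaceAware, false);
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<inv:item xmlns:inv=\"urn:example:inventory\"/>";
        System.out.println("namespace-aware:     " + rootNamespace(xml, true));
        System.out.println("not namespace-aware: " + rootNamespace(xml, false));
    }
}
```

The calling code is the same whichever parser implementation the factory returns; only the factory settings change what the parser reports.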
Which interface is right for you?
To determine which interface is right for you, you need to understand the design points of each interface, and you need to understand what your application will do with the XML documents it processes. Consider the following questions to help you find the right approach.
Will your application be written in Java? JAXP works with DOM, SAX, and JDOM; if you write your code in Java, you should use JAXP to isolate your code from the details of particular parser implementations.
How will your application be deployed? If your application will be deployed as a Java applet, you will want to minimize the amount of code that has to be downloaded; keep in mind that SAX parsers are smaller than DOM parsers. Also be aware that using JDOM requires a small amount of code in addition to the SAX or DOM parser.
Once you parse the XML document, will you need to access that data more than once? If you need to go back to the parsed version of the XML file, DOM is probably the right choice. When a SAX event is fired, it is up to you (the developer) to save it somehow if you will need it later. If you need to access an event that you did not save, you must parse the file again. The DOM saves all of the data automatically.
Do you need only a small part of the XML source file? If so, SAX is probably the right choice. SAX does not create objects for everything in the source file; you decide what is important. With SAX, you examine each event to see whether it is relevant to your needs and process it accordingly. Even better, once you have found what you are looking for, your code can throw an exception to stop the SAX parser altogether.
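The stop-by-exception technique described above might look like the following. It is a sketch under stated assumptions: `FoundException` is a hypothetical marker exception defined here (it is not part of SAX), and `findFirst` and the sample element names are illustrative only.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxEarlyStopSketch {

    // Hypothetical marker exception: carries the value we wanted and stops the parse.
    static class FoundException extends SAXException {
        final String value;

        FoundException(String value) {
            super("target element found");
            this.value = value;
        }
    }

    // Ignores every event except those belonging to the first element with the target name.
    static class FirstTextHandler extends DefaultHandler {
        private final String target;
        private StringBuilder buffer; // non-null only while inside the target element

        FirstTextHandler(String target) {
            this.target = target;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if (qName.equals(target)) {
                buffer = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (buffer != null) {
                buffer.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (buffer != null && qName.equals(target)) {
                // We have what we came for: abort the rest of the parse.
                throw new FoundException(buffer.toString());
            }
        }
    }

    // Returns the text of the first element named elementName, or null if there is none.
    static String findFirst(String xml, String elementName) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(xml)), new FirstTextHandler(elementName));
        } catch (FoundException e) {
            return e.value; // the parser stopped early, on purpose
        }
        return null; // parsed the whole document without a match
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list><a>1</a><b>hit</b><c>3</c><b>second</b></list>";
        System.out.println(findFirst(xml, "b"));
    }
}
```

Using a dedicated exception type keeps the deliberate stop separate from genuine parse errors, which still surface as ordinary `SAXException`s.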
Are you working on a machine with very little memory? If so, SAX is your best choice, regardless of the other factors you might consider.
Be aware that XML APIs exist for other languages as well; the Perl and Python communities in particular have excellent XML tools.