Comparison of Several Common XML Parsers
Currently, the most common XML parsing options are SAX, DOM, and Xerces.
1. SAX processes a document much the way streaming media is played. Analysis can begin immediately instead of waiting for all of the data to arrive. Because the application only examines data as it is read, it does not need to hold the document in memory, which is a huge advantage for large documents. The application does not even have to parse the entire document; it can stop as soon as some condition is met. In general, SAX is also much faster than the alternative, DOM. On the other hand, because the application does not store the data in any way, SAX cannot be used to change the data or to move backward in the data stream.
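The early-exit behavior described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class names SaxEarlyStop, FirstTextHandler, and StopParsing are hypothetical, and throwing a marker exception from the handler is one common way to stop a SAX parse, assumed here for the sake of the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class SaxEarlyStop {

    /** Hypothetical marker exception used only to abort parsing early. */
    static final class StopParsing extends SAXException {
        StopParsing() { super("target found, stopping"); }
    }

    /** Collects the text of the first element named {@code target}. */
    static final class FirstTextHandler extends DefaultHandler {
        private final String target;
        private final StringBuilder text = new StringBuilder();
        private boolean inside;
        int elementsSeen; // number of start tags reported before parsing stopped

        FirstTextHandler(String target) { this.target = target; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            elementsSeen++;
            if (qName.equals(target)) inside = true;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Data is inspected as it streams past; nothing else is retained.
            if (inside) text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (inside && qName.equals(target)) throw new StopParsing();
        }

        String text() { return text.toString(); }
    }

    /** Runs the handler over the XML string and returns it for inspection. */
    static FirstTextHandler findFirst(String xml, String target) {
        FirstTextHandler handler = new FirstTextHandler(target);
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing expected) {
            // Aborted on purpose; the rest of the document is never examined.
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>First</title></book>"
                   + "<book><title>Second</title></book></books>";
        FirstTextHandler h = findFirst(xml, "title");
        System.out.println(h.text() + " (after " + h.elementsSeen + " start tags)");
    }
}
```

The handler keeps only the text it was asked for, so memory use does not grow with the size of the document, and the second book is never visited.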
2. DOM, and tree-based processing in general, has several advantages. First, because the tree persists in memory, the application can modify it, changing both the data and the structure. The application can also navigate up and down the tree at any time, rather than being limited to a single pass as with SAX. DOM is also much easier to use. On the other hand, building such a tree in memory involves considerable overhead. It is not uncommon for a large file to exhaust the system's memory, and creating the DOM tree can be a slow process.
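As a rough sketch of the in-memory editing and navigation described above, the following parses a small document, changes a value, walks up the tree, and adds a node. The class name DomEditSketch and the sample element names are made up for illustration; serializing through an identity Transformer is one option among several, assumed here because it ships with the JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DomEditSketch {

    /** Parses the XML string into an in-memory DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    /** Changes the text of the first title element and appends a new book to the root. */
    static void edit(Document doc) {
        Node title = doc.getElementsByTagName("title").item(0);
        title.setTextContent("Revised");

        // Navigate upward: from the title to its parent book, then to the root.
        Node book = title.getParentNode();
        Node root = book.getParentNode();

        Element added = doc.createElement("book");
        added.setAttribute("id", "2");
        root.appendChild(added);
    }

    /** Serializes the tree back to XML text without an XML declaration. */
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<books><book id=\"1\"><title>Draft</title></book></books>");
        edit(doc);
        System.out.println(serialize(doc));
    }
}
```

Note that the whole document is held in memory between parse and serialize, which is exactly the cost the paragraph above warns about for large files.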
3. Whether to choose DOM or SAX depends on the following factors:
Purpose of the application: if you plan to change the data and output it as XML, DOM is the right choice in most cases. This does not mean that you cannot change data with SAX, but the process is much more complicated, because you must produce a modified copy of the data rather than change the data in place.
Data volume: for large files, SAX is the better choice.
How the data is used: if only a small part of the data is needed, it may be better to use SAX to extract that part into the application. On the other hand, if you know that you will later need to refer back to large amounts of information that has already been processed, SAX may not be an appropriate choice.
Speed requirement: SAX implementations are usually faster than DOM implementations.
It is important to remember that SAX and DOM are not mutually exclusive. You can use DOM to produce a stream of SAX events, and you can use SAX to build a DOM tree. In fact, most parsers that create a DOM tree actually use SAX to do so!
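To illustrate that the two models can feed each other, here is a minimal sketch using the JDK's identity transform in both directions. The class name SaxDomBridge is hypothetical, and the cast to SAXTransformerFactory assumes the installed TransformerFactory supports SAX input, which is the case for the one bundled with common JDKs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class SaxDomBridge {

    /** SAX to DOM: feeds SAX events from an XMLReader into a handler that builds a tree. */
    static Node saxToDom(String xml) {
        try {
            // Assumption: the default TransformerFactory is a SAXTransformerFactory.
            SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler treeBuilder = stf.newTransformerHandler();
            DOMResult result = new DOMResult();
            treeBuilder.setResult(result);

            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setContentHandler(treeBuilder);
            reader.parse(new InputSource(new StringReader(xml)));
            return result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("SAX to DOM build failed", e);
        }
    }

    /** DOM to SAX: replays an existing tree as SAX events and records element names. */
    static List<String> domToSaxEvents(Node tree) {
        List<String> names = new ArrayList<>();
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            // An identity transform copies the source to the result unchanged.
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(tree), new SAXResult(recorder));
        } catch (Exception e) {
            throw new IllegalStateException("DOM to SAX replay failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        Node tree = saxToDom("<books><book/><book/></books>");
        System.out.println("root built from SAX events: " + tree.getFirstChild().getNodeName());
        System.out.println("elements replayed from DOM: " + domToSaxEvents(tree));
    }
}
```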
4. SAX and DOM are two approaches to analyzing XML documents. They are interfaces only, with no concrete implementation, so they are not parsers themselves and cannot process an XML document on their own. The SAX package is org.xml.sax, and the DOM package is org.w3c.dom. The package names are important because they help you understand how the pieces relate to each other.
5. JAXP is an API that wraps the two interfaces, SAX and DOM, and provides developers with a simpler set of APIs built on top of them. The JAXP package is javax.xml.parsers. If you look at the JAXP source files, you will see that they import SAX or DOM classes. JAXP is not a concrete implementation either; it is only a set of APIs, so JAXP alone will not work. (In fact, JAXP merely wraps SAX and DOM and provides DocumentBuilderFactory/DocumentBuilder and SAXParserFactory/SAXParser. This is the factory pattern from design patterns; its advantage is that the concrete object, the parser, is created by a subclass.)
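The factory arrangement can be observed at run time. The sketch below, with the hypothetical class name JaxpFactoryLookup, asks the abstract JAXP factories for instances and reports which concrete subclasses were actually supplied; the names printed depend on the parser implementation available in your environment.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

final class JaxpFactoryLookup {

    /** Concrete class that JAXP selected for the abstract DocumentBuilderFactory. */
    static Class<?> domFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass();
    }

    /** Concrete class that JAXP selected for the abstract SAXParserFactory. */
    static Class<?> saxFactoryClass() {
        return SAXParserFactory.newInstance().getClass();
    }

    public static void main(String[] args) {
        // Application code names only the abstract JAXP types; the classes
        // printed here come from whichever implementation is installed.
        System.out.println("DOM factory: " + domFactoryClass().getName());
        System.out.println("SAX factory: " + saxFactoryClass().getName());
    }
}
```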
6. The Xerces parser (reputed to be among the fastest XML parsers) extends the SAXParser, SAXParserFactory, DocumentBuilder, and DocumentBuilderFactory classes defined in JAXP; the corresponding Xerces classes are SAXParserImpl, SAXParserFactoryImpl, DocumentBuilderImpl, and DocumentBuilderFactoryImpl. This is why your classpath needs only xerces.jar (which contains SAX, DOM, and JAXP) and xercesImpl.jar.
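If you want to request the Xerces implementation explicitly rather than rely on JAXP's default lookup, something like the following sketch may help. The class name XercesSelection is hypothetical, and the package org.apache.xerces.jaxp is an assumption based on the Apache Xerces-J distribution; the text above gives only the simple class names. The code falls back to the platform default when that class is not on the classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

final class XercesSelection {

    // Assumed fully qualified name; the package is not stated in the text above.
    static final String XERCES_DOM_FACTORY =
            "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl";

    /** Requests the Xerces factory by name; falls back to the platform default. */
    static DocumentBuilderFactory xercesOrDefault() {
        try {
            // A null class loader means the thread context class loader is used.
            return DocumentBuilderFactory.newInstance(XERCES_DOM_FACTORY, null);
        } catch (FactoryConfigurationError notOnClasspath) {
            return DocumentBuilderFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        System.out.println("Using: " + xercesOrDefault().getClass().getName());
    }
}
```

Either way, the rest of the application continues to work with the abstract DocumentBuilderFactory type, so switching implementations does not require changes elsewhere.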