Incomplete test of four XML parsing techniques in Java

Source: Internet
Author: User

Test environment:

AMD Duron 1.4 GHz (overclocked to 1.5 GHz), 256 MB DDR333, Windows 2000 Server SP4, Sun JDK 1.4.1 + Eclipse 2.1 + Resin 2.1.8, tested in Debug mode.

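The XML file format is as follows (each record is a VALUE element holding a NO and an ADDR element, the names the parsing code looks up):

<VALUE>
   <NO>A1234</NO>
   <ADDR>No. XX, Section X, XX Road, XX town, XX county, Sichuan province</ADDR>
</VALUE>
<VALUE>
   <NO>B1234</NO>
   <ADDR>XX group, XX village, XX Township, XXX city, Sichuan province</ADDR>
</VALUE>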

Test method:

Each scheme parses XML files of 10 K, 100 K, 1000 K, and 10000 K, and the elapsed time of each run is recorded (unit: milliseconds).
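A minimal sketch of how the per-run timings might be collected. The class name TimingSketch, the helper timeRuns, and the placeholder workload are illustrative assumptions, not the original test harness, and the sketch uses System.nanoTime rather than System.currentTimeMillis.

```java
import java.util.ArrayList;
import java.util.List;

final class TimingSketch {

    /** Runs the task the given number of times and returns each elapsed time in milliseconds. */
    static List<Long> timeRuns(Runnable task, int runs) {
        List<Long> elapsed = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed.add((System.nanoTime() - start) / 1_000_000);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption); a real run would parse one of the XML files here.
        Runnable placeholder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        System.out.println("elapsed (ms): " + timeRuns(placeholder, 4));
    }
}
```

Repeating each measurement several times, as the results in this test do, makes it easier to spot outliers such as a slow first run.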

JSP file:

<%@ page contentType="text/html; charset=gb2312" %>
<%@ page import="com.test.*" %>
<%
String args[] = {""};
MyXMLReader.main(args);
%>

Test

First up is DOM (JAXP Crimson parser)

DOM is the official W3C standard for representing XML documents in a platform- and language-independent way. A DOM is a collection of nodes, or pieces of information, organized in a hierarchy. This hierarchy lets developers search the tree for specific information. Analyzing the structure usually requires loading the entire document and building the hierarchy before any work can be done. Because it is based on this hierarchy of information, DOM is considered tree-based or object-based. DOM, and tree-based processing in general, has several advantages. First, because the tree persists in memory, it can be modified, so an application can change both data and structure. The tree can also be navigated at any time, rather than processed in a single pass as with SAX. DOM is also much simpler to use.

On the other hand, for very large documents, parsing and loading the entire document can be slow and resource-intensive, so such data is better processed by other means, such as event-based models like SAX.
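The point that the in-memory tree can be both navigated and modified can be illustrated with a short sketch. The class name DomTreeSketch, its helper methods, the sample data, and the root element name RECORDS are all assumptions made for this example; only the VALUE, NO, and ADDR element names come from the test code in this article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class DomTreeSketch {

    // Hypothetical sample data; RECORDS is a made-up root element name.
    static final String SAMPLE =
        "<RECORDS>"
      + "<VALUE><NO>A1234</NO><ADDR>Address one</ADDR></VALUE>"
      + "<VALUE><NO>B1234</NO><ADDR>Address two</ADDR></VALUE>"
      + "</RECORDS>";

    /** Loads the whole document into an in-memory tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Modifies the tree: appends a new VALUE record under the root element. */
    static void addRecord(Document doc, String no, String addr) {
        Element value = doc.createElement("VALUE");
        Element noEl = doc.createElement("NO");
        noEl.setTextContent(no);
        Element addrEl = doc.createElement("ADDR");
        addrEl.setTextContent(addr);
        value.appendChild(noEl);
        value.appendChild(addrEl);
        doc.getDocumentElement().appendChild(value);
    }

    /** Navigates the tree: returns the NO text of the record at the given index. */
    static String plateAt(Document doc, int index) {
        Element value = (Element) doc.getElementsByTagName("VALUE").item(index);
        return value.getElementsByTagName("NO").item(0).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        addRecord(doc, "C1234", "Address three");
        // Records can be visited in any order, here last to first.
        for (int i = doc.getElementsByTagName("VALUE").getLength() - 1; i >= 0; i--) {
            System.out.println(plateAt(doc, i));
        }
    }
}
```

Because the whole tree stays in memory, the record added after parsing is visible to later lookups, which a single-pass parser cannot offer.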

Bean file:

package com.test;

import java.io.*;
import java.util.*;
import org.w3c.dom.*;
import javax.xml.parsers.*;

public class MyXMLReader {

    public static void main(String args[]) {
        long lasting = System.currentTimeMillis();
        try {
            File f = new File("data_10k.xml");
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Parse the whole file into an in-memory tree
            Document doc = builder.parse(f);
            NodeList nl = doc.getElementsByTagName("VALUE");
            // NO and ADDR are looked up from the document root on every iteration
            for (int i = 0; i < nl.getLength(); i++) {
                System.out.print("license plate number:" + doc.getElementsByTagName("NO").item(i).getFirstChild().getNodeValue());
                System.out.println("owner address:" + doc.getElementsByTagName("ADDR").item(i).getFirstChild().getNodeValue());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("Run time:" + (System.currentTimeMillis() - lasting) + " milliseconds");
    }
}

10 K elapsed time: 265 203 219 172

100 K elapsed time: 9172 9016 8891 9000

1000 K elapsed time: 691719 675407 708375 739656

10000 K elapsed time: OutOfMemoryError

Next is SAX

The advantages of this kind of processing are much like those of streaming media: analysis can start immediately rather than waiting for all the data to be processed. In addition, because the application only inspects the data as it is read, it does not need to hold the data in memory, which is a huge advantage for large documents. In fact, the application does not even have to parse the entire document; it can stop parsing as soon as some condition is met. In general, SAX is also much faster than its alternative, DOM.
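Stopping a parse once a condition is met might look like the following sketch. One common way to do this with SAX is to throw an exception from the handler and catch it around the parse call. The class names, the sample data, and the root element name RECORDS are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class SaxEarlyStopSketch {

    // Hypothetical sample data; RECORDS is a made-up root element name.
    static final String SAMPLE =
        "<RECORDS>"
      + "<VALUE><NO>A1234</NO><ADDR>Address one</ADDR></VALUE>"
      + "<VALUE><NO>B1234</NO><ADDR>Address two</ADDR></VALUE>"
      + "<VALUE><NO>C1234</NO><ADDR>Address three</ADDR></VALUE>"
      + "</RECORDS>";

    /** Thrown by the handler to abort parsing on purpose. */
    static final class StopParsing extends SAXException {
        StopParsing() {
            super("first plate found");
        }
    }

    /** Captures the text of the first NO element, then aborts the parse. */
    static final class FirstPlateHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inNo;
        String plate;        // text of the first NO element, or null if none was seen
        int elementsSeen;    // number of start tags reported before parsing stopped

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            elementsSeen++;
            inNo = qName.equals("NO");
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inNo) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (qName.equals("NO")) {
                plate = text.toString();
                throw new StopParsing();   // the condition is met, so skip the rest
            }
        }
    }

    static FirstPlateHandler findFirstPlate(String xml) {
        FirstPlateHandler handler = new FirstPlateHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing stop) {
            // Expected: the handler ended the parse early.
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        FirstPlateHandler h = findFirstPlate(SAMPLE);
        System.out.println(h.plate + " found after " + h.elementsSeen + " start tags");
    }
}
```

With the sample above, the handler sees only the root, the first VALUE, and the first NO start tag before parsing ends; the remaining records are never reported.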

DOM or SAX?

For developers who need to write their own code to process XML documents, choosing between the DOM and SAX parsing models is a very important design decision.

DOM uses a tree structure to access XML documents, while SAX uses an event model.

The DOM parser converts an XML document into a tree holding its content, which can then be traversed. The advantage of the DOM model is ease of programming: developers only need to call the tree-building instruction and then use the navigation APIs to reach the required tree nodes. Elements in the tree can easily be added and modified. However, because the DOM parser has to process the entire XML document, its performance and memory requirements are relatively high, especially with large XML files. Because of its traversal capability, the DOM parser is often used in services where XML documents change frequently.

The SAX parser uses an event-based model. It fires a series of events as it parses the XML document; when it finds a given tag, it can invoke a callback method to report that the tag has been found. SAX usually has lower memory requirements, because it lets developers decide which tags to process. Its scalability shows best when developers only need part of the data in the document. However, coding against a SAX parser is harder, and it is difficult to access several different pieces of data in the same document at the same time.
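The difficulty of combining several pieces of data under SAX can be seen in a sketch like the one below: since each callback sees only one event, the handler itself has to remember the plate number until the matching address arrives. The class names, the sample data, and the root element name RECORDS are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxStateSketch {

    // Hypothetical sample data; RECORDS is a made-up root element name.
    static final String SAMPLE =
        "<RECORDS>"
      + "<VALUE><NO>A1234</NO><ADDR>Address one</ADDR></VALUE>"
      + "<VALUE><NO>B1234</NO><ADDR>Address two</ADDR></VALUE>"
      + "</RECORDS>";

    static final class RecordHandler extends DefaultHandler {
        private final Deque<String> open = new ArrayDeque<>();  // names of currently open elements
        private final StringBuilder text = new StringBuilder();
        private String currentPlate;                            // remembered until ADDR arrives
        final Map<String, String> addressByPlate = new LinkedHashMap<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            open.push(qName);
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one text node in several chunks, so accumulate.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            open.pop();
            // After the pop, the top of the stack is the parent of the element just closed.
            boolean insideValue = "VALUE".equals(open.peek());
            if (insideValue && qName.equals("NO")) {
                currentPlate = text.toString().trim();
            } else if (insideValue && qName.equals("ADDR") && currentPlate != null) {
                addressByPlate.put(currentPlate, text.toString().trim());
                currentPlate = null;
            }
        }
    }

    static Map<String, String> readRecords(String xml) {
        RecordHandler handler = new RecordHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.addressByPlate;
    }

    public static void main(String[] args) {
        System.out.println(readRecords(SAMPLE));
    }
}
```

The stack is pushed in startElement and popped in endElement so that it always reflects the elements currently open, and the only data kept in memory is what the handler chooses to store.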

Bean file:

package com.test;

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import javax.xml.parsers.*;

public class MyXMLReader extends DefaultHandler {

    // Names of the elements opened so far; the stack is never popped,
    // so its top is always the most recently opened tag
    java.util.Stack tags = new java.util.Stack();

    public MyXMLReader() {
        super();
    }

    public static void main(String args[]) {
        long lasting = System.currentTimeMillis();
        try {
            SAXParserFactory sf = SAXParserFactory.newInstance();
            SAXParser sp = sf.newSAXParser();
            MyXMLReader reader = new MyXMLReader();
            sp.parse(new InputSource("data_10k.xml"), reader);
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("Run time:" + (System.currentTimeMillis() - lasting) + " milliseconds");
    }

    // Called for each run of character data; prints it if the latest tag is NO or ADDR
    public void characters(char ch[], int start, int length) throws SAXException {
        String tag = (String) tags.peek();
        if (tag.equals("NO")) {
            System.out.print("license plate number:" + new String(ch, start, length));
        }
        if (tag.equals("ADDR")) {
            System.out.println("address:" + new String(ch, start, length));
        }
    }

    // Called at each start tag; records the tag name
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        tags.push(qName);
    }
}

10 K elapsed time: 110 47 109 78

100 K elapsed time: 344 406 375 422

1000 K elapsed time: 3234 3281 3688 3312
