Parsing and processing XML, and other notes

Last Update:2017-02-28 Source: Internet

Author: User

DOM and SAX are the two mainstream approaches to parsing XML; JDOM and dom4j are also good choices.

A DOM parser converts an XML document into an in-memory tree of its contents, which you can then traverse. The advantage of the DOM model is that it is easy to program against: the developer only needs to invoke the build call and then use the navigation APIs to reach the required tree nodes. Elements in the tree are also easy to add and modify. However, because a DOM parser has to process the entire XML document, its performance and memory requirements are high, especially for large XML files. Because of its traversal capabilities, DOM is often used in services where XML documents change frequently.

Example:

import java.io.*;
import java.util.*;
import org.w3c.dom.*;
import javax.xml.parsers.*;

public class MyXMLReader {
    public static void main(String[] args) {
        // Start time, kept so the run can be timed.
        long lasting = System.currentTimeMillis();
        try {
            File f = new File("Data_10k.xml");
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Builds the whole document tree in memory.
            Document doc = builder.parse(f);
            NodeList nl = doc.getElementsByTagName("VALUE");
            // Pairs the i-th NO with the i-th ADDR, which assumes every VALUE holds exactly one of each.
            for (int i = 0; i < nl.getLength(); i++) {
                System.out.print("License plate number:" + doc.getElementsByTagName("NO").item(i).getFirstChild().getNodeValue());
                System.out.println("Owner address:" + doc.getElementsByTagName("ADDR").item(i).getFirstChild().getNodeValue());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
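The DOM description above also says that elements in the tree are easy to add and modify. The following is a minimal sketch of that idea, not part of the original example: it parses a small in-memory document instead of Data_10k.xml, and the root element name RESULT, the class name DomEditSketch, and the sample values are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomEditSketch {

    // Parses an XML string into a DOM tree.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Appends a new <VALUE> record with <NO> and <ADDR> children to the root element.
    public static void addRecord(Document doc, String no, String addr) {
        Element value = doc.createElement("VALUE");
        Element noEl = doc.createElement("NO");
        noEl.setTextContent(no);
        Element addrEl = doc.createElement("ADDR");
        addrEl.setTextContent(addr);
        value.appendChild(noEl);
        value.appendChild(addrEl);
        doc.getDocumentElement().appendChild(value);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data; the root name RESULT is an assumption.
        Document doc = parse("<RESULT><VALUE><NO>A1234</NO><ADDR>Example Road 1</ADDR></VALUE></RESULT>");
        addRecord(doc, "B5678", "Example Road 2");
        // Change the text of the first <ADDR> element in place.
        doc.getElementsByTagName("ADDR").item(0).setTextContent("Example Road 9");
        System.out.println(doc.getElementsByTagName("VALUE").getLength());
        System.out.println(doc.getElementsByTagName("ADDR").item(0).getTextContent());
    }
}
```

Because the whole tree is already in memory, adding a record and editing an existing one are both ordinary method calls on nodes, which is the convenience the DOM model trades memory for.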

A SAX parser uses an event-based model: it fires a series of events while parsing an XML document, and when a given tag is found it invokes a callback method to report that the tag has been encountered. SAX typically needs less memory, because it lets the developer decide which tags to handle. It scales particularly well when only part of the data in a document is needed. However, coding against a SAX parser is harder, and it is difficult to access several different pieces of data in the same document at the same time.

Example:

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import javax.xml.parsers.*;

public class MyXMLReader extends DefaultHandler {
    // Stack of the element names that are currently open.
    java.util.Stack<String> tags = new java.util.Stack<>();

    public MyXMLReader() {
        super();
    }

    public static void main(String[] args) {
        long lasting = System.currentTimeMillis();
        try {
            SAXParserFactory sf = SAXParserFactory.newInstance();
            SAXParser sp = sf.newSAXParser();
            MyXMLReader reader = new MyXMLReader();
            sp.parse(new InputSource("Data_10k.xml"), reader);
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("Run time: " + (System.currentTimeMillis() - lasting) + " milliseconds");
    }

    // Called for text content; the top of the stack names the enclosing element.
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (tags.isEmpty()) {
            return;
        }
        String tag = tags.peek();
        if (tag.equals("NO")) {
            System.out.print("License plate number:" + new String(ch, start, length));
        }
        if (tag.equals("ADDR")) {
            System.out.println("Address:" + new String(ch, start, length));
        }
    }

    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        tags.push(qName);
    }

    // Pop on close so text after an end tag is not attributed to the closed element.
    public void endElement(String uri, String localName, String qName) {
        tags.pop();
    }
}

Note: When XML is used mainly as a format for passing data around, DOM is usually the more suitable choice. It places higher demands on the system (memory, performance, and so on), but an ordinary server can generally cope, even with XML documents in the gigabyte range.

SAX is the better fit when you only need certain parts of an XML document, need targeted access to particular nodes, or want to react to content as soon as it is parsed. It is based on an event-handling mechanism: in your program, you process the XML document by overriding the relevant event methods.
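As a rough illustration of picking out specific nodes by overriding event methods, the sketch below collects the text of one chosen element and ignores everything else. The class name SaxPickSketch, the helper collect, the root element name RESULT, and the sample data are all assumptions made for this example; only DefaultHandler and its callback methods come from the SAX API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxPickSketch extends DefaultHandler {
    private final String wanted;
    private final List<String> values = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a wanted element

    SaxPickSketch(String wanted) {
        this.wanted = wanted;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (qName.equals(wanted)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver one run of text in several chunks, so append.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(wanted) && current != null) {
            values.add(current.toString());
            current = null;
        }
    }

    // Hypothetical helper: returns the text of every element named tag.
    public static List<String> collect(String xml, String tag) throws Exception {
        SaxPickSketch handler = new SaxPickSketch(tag);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<RESULT><VALUE><NO>A1234</NO><ADDR>Example Road 1</ADDR></VALUE>"
                   + "<VALUE><NO>B5678</NO><ADDR>Example Road 2</ADDR></VALUE></RESULT>";
        System.out.println(collect(xml, "NO")); // prints [A1234, B5678]
    }
}
```

Nothing is kept in memory except the values of the requested element, which is why this style suits large documents where only a few fields matter.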

XML encoding and the relationship between InputStreamReader and XMLReader:

DOM and SAX usually handle ASCII-encoded documents without trouble. When an XML document is read through an InputStreamReader, however, its bytes are decoded into Unicode characters, and if the decoding charset does not match the file, XMLReader may reject the result with an error about invalid Unicode characters. (Printing the same text with System.out.println() shows no problem, because the output is automatically converted to the local machine's encoding.)

Solution: specify the encoding explicitly when creating the reader.

BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(f), "ISO-8859-1"));

Fixing the encoding in this way avoids the problem.
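To show how an explicitly decoded reader is handed to a parser, here is a small sketch under stated assumptions: it uses an in-memory byte array instead of a file, and the class name EncodingSketch, the helper readAddr, and the sample address are made up for illustration. When an InputSource wraps a character stream, the parser works on the already decoded characters, so the charset given to InputStreamReader is the one that matters.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EncodingSketch {

    // Decodes the stream with an explicit charset, parses the resulting
    // characters, and returns the text of the first <ADDR> element.
    public static String readAddr(InputStream in, Charset charset) throws Exception {
        Reader reader = new InputStreamReader(in, charset);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(reader));
        return doc.getElementsByTagName("ADDR").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample: "é" is the single byte 0xE9 in ISO-8859-1.
        byte[] bytes = "<VALUE><ADDR>Caf\u00e9 Street</ADDR></VALUE>"
                .getBytes(StandardCharsets.ISO_8859_1);
        String addr = readAddr(new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1);
        System.out.println(addr.equals("Caf\u00e9 Street"));
    }
}
```

Using the StandardCharsets constants instead of a charset name string avoids the checked UnsupportedEncodingException and any doubt about name spelling.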

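String length: the String type has no explicitly declared length limit, but because String.length() returns an int, a string can never hold more than Integer.MAX_VALUE (2^31 - 1) characters, which is roughly 4 GB at two bytes per character. In practice the available heap is usually the real limit.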

String versus StringBuffer: when the code does not do a large amount of string manipulation, plain String is usually fine. StringBuffer has the advantage when a lot of string processing is involved, such as building up a long string piece by piece.
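The difference can be sketched with two ways of building the same text. The class and method names below are hypothetical. Each += on a String creates a new String object, whereas StringBuffer appends into one growing buffer; StringBuilder offers the same methods without synchronization and is a common choice in single-threaded code.

```java
public class StringBuildSketch {

    // Builds "0,1,2,..." with String concatenation: a new String per iteration.
    public static String joinWithString(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i + ",";
        }
        return s;
    }

    // Builds the same text by appending into a single StringBuffer.
    public static String joinWithBuffer(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithBuffer(5)); // prints 0,1,2,3,4,
        System.out.println(joinWithString(5).equals(joinWithBuffer(5))); // prints true
    }
}
```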
