Java learning notes: parsing XML with DOM and with SAX

Source: Internet
Author: User
Tags xml parser

Background:

1) DOM

DOM is the official W3C standard for representing XML documents in a platform- and language-independent way. A DOM is a collection of nodes (pieces of information) organized in a hierarchy, which lets developers search the tree for specific information. Working with this structure usually requires loading the entire document and building the hierarchy before any other work can be done. Because it is built on this hierarchy, DOM is described as tree-based or object-based. DOM, and tree-based processing in general, has several advantages. First, because the tree persists in memory, the application can modify it to change both the data and the structure. Second, the application can navigate up and down the tree at any time, unlike SAX, which processes the document in a single pass. DOM is also generally much easier to use.


On the other hand, parsing and loading a very large document can be slow and resource-intensive, so such data is better handled by other means, such as event-based models like SAX.
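To make the "modify and navigate" point concrete, here is a minimal sketch that parses a small document into a DOM tree, edits one node, and then moves up and sideways from it. The class name DomTreeSketch and the sample XML are made up for this illustration; only the standard JAXP and org.w3c.dom calls are real.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DomTreeSketch {

    // Hypothetical sample document, invented for this sketch.
    static final String XML =
            "<pets><dog id=\"1\"><name>Rex</name><health>80</health></dog></pets>";

    /** Parses XML into an in-memory tree, edits it, then navigates from the edited node. */
    public static String editAndNavigate() {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(XML)));

            // Navigate down: find the first <health> element.
            Element health = (Element) doc.getElementsByTagName("health").item(0);
            // Modify the tree in place; the change stays in memory.
            health.setTextContent("95");
            // Navigate up: the parent of <health> is <dog>.
            Element dog = (Element) health.getParentNode();
            // Navigate sideways: the previous sibling of <health> is <name>
            // (the sample XML has no whitespace between elements).
            Node name = health.getPreviousSibling();

            return dog.getAttribute("id") + ":" + name.getTextContent()
                    + ":" + health.getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(editAndNavigate()); // prints 1:Rex:95
    }
}
```

None of this is possible with a single-pass parser unless the application stores the data itself.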

 

2) SAX

SAX is an event-based model, and the advantages of this kind of processing are much like those of streaming media: analysis can start immediately instead of waiting for all the data to be read. In addition, because the application only inspects data as it is read, it does not need to keep the data in memory, which is a huge advantage for large documents. The application does not even have to parse the entire document; it can stop parsing once some condition is met. In general, SAX is much faster than its alternative, DOM.
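The "stop parsing when a condition is met" point can be sketched as follows. SAX has no dedicated stop call, so a common idiom (an assumption here, not something these notes prescribe) is to throw a SAXException from the handler and treat it as a normal exit. The class name SaxEarlyStopSketch and the method firstDogId are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEarlyStopSketch {

    /** Returns the id attribute of the first <dog> element, or null if there is none. */
    public static String firstDogId(String xml) {
        final String[] found = new String[1];
        final boolean[] stopped = new boolean[1];

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) throws SAXException {
                if ("dog".equals(qName)) {
                    found[0] = attributes.getValue("id");
                    stopped[0] = true;
                    // Abort the parse: nothing after this point is needed.
                    throw new SAXException("stop requested");
                }
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Our own stop signal is expected; anything else is a real parse error.
            if (!stopped[0]) {
                throw new IllegalStateException(e);
            }
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return found[0];
    }

    public static void main(String[] args) {
        String xml = "<pets><dog id=\"7\"><name>Rex</name></dog><dog id=\"8\"/></pets>";
        System.out.println(firstDogId(xml)); // prints 7
    }
}
```

The handler never sees the second dog element, because the parse is abandoned as soon as the first one starts.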

Advantages and disadvantages

DOM:

The parser reads the entire document and builds a memory-resident tree structure; the code can then use the DOM interface to operate on this tree.

Advantage: the entire document tree is in memory, which makes it easy to work with. It supports operations such as deletion, modification, and rearrangement;

Disadvantage: loading the entire document into memory (including nodes that are never used) wastes time and space;

Usage: the data needs to be accessed multiple times after the document has been parsed, and hardware resources (memory and CPU) are sufficient.

 

SAX:

Event-driven. When the parser encounters the start or end of the document, the start or end of an element, or text, it fires an event; the programmer writes code that responds to these events and saves the data.

Advantage: the entire document does not need to be loaded in advance, so fewer resources are consumed.

Disadvantages: nothing is persistent, so data that is not saved during an event is lost; the parser is stateless; the text event delivers only the text, without saying which element it belongs to;

Usage scenarios: only a small part of the XML document is needed and it is rarely accessed again; one-pass reading; limited machine memory;

Note: the SAX parser does not build any objects that represent the document.
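Because the text event does not say which element it belongs to, the handler has to keep that state itself. The following is a minimal sketch of one way to do it, assuming text only matters inside leaf elements: a stack records the currently open elements and a buffer collects text, which also covers the case where the parser delivers one text run in several chunks. The class name SaxContextSketch and the method leafText are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxContextSketch {

    /** Maps the path of each text-bearing element (e.g. pets/dog/name) to its text. */
    public static Map<String, String> leafText(String xml) {
        final Deque<String> open = new ArrayDeque<>();   // currently open elements, root first
        final StringBuilder text = new StringBuilder();  // text collected since the last tag
        final Map<String, String> result = new LinkedHashMap<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                open.addLast(qName);
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called more than once per text run, so append rather than assign.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    result.put(String.join("/", open), value);
                }
                open.removeLast();
                text.setLength(0);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<pets><dog id=\"1\"><name>Rex</name><health>80</health></dog></pets>";
        System.out.println(leafText(xml)); // prints {pets/dog/name=Rex, pets/dog/health=80}
    }
}
```

The SAX example later in these notes takes a simpler route and remembers only the name of the most recently opened tag, which is enough for a flat document.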

Use DOM to parse XML:

package com.accp.xml;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

import com.accp.entity.Pet;

public class DomParseXML {

    public static void main(String[] args)
            throws SAXException, IOException, ParserConfigurationException {
        String fileURL = "pet.xml";
        List<Pet> petList = parseXML(fileURL);
        for (Pet pet : petList) {
            System.out.println(pet);
        }
    }

    public static List<Pet> parseXML(String fileURL)
            throws SAXException, IOException, ParserConfigurationException {
        List<Pet> petList = new ArrayList<Pet>();
        // Obtain a DOM parser factory instance
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Get a parser (DocumentBuilder) from the factory
        DocumentBuilder db = dbf.newDocumentBuilder();
        // Parse the file into a Document object
        Document doc = db.parse(fileURL);
        // Obtain all dog elements in the document
        NodeList nodeList = doc.getElementsByTagName("dog");
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node node = nodeList.item(i);
            // Only element nodes are of interest
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Pet pet = new Pet();
                Element dog = (Element) node;
                // Read the id attribute of the current node and assign it
                String id = dog.getAttribute("id");
                pet.setId(Integer.parseInt(id));
                // Walk all child nodes of the current node
                for (Node childNode = dog.getFirstChild(); childNode != null;
                        childNode = childNode.getNextSibling()) {
                    if (childNode.getNodeType() == Node.ELEMENT_NODE) {
                        // Node name
                        String name = childNode.getNodeName();
                        // Node text; getTextContent() is also safe for empty elements
                        String value = childNode.getTextContent().trim();
                        // Assign the matching property
                        if ("name".equals(name)) {
                            pet.setName(value);
                        } else if ("health".equals(name)) {
                            pet.setHealth(Integer.parseInt(value));
                        } else if ("love".equals(name)) {
                            pet.setLove(Integer.parseInt(value));
                        } else if ("strain".equals(name)) {
                            pet.setStrain(value);
                        }
                    }
                }
                // Add the object to the list
                petList.add(pet);
            }
        }
        return petList;
    }
}

Use SAX to parse XML:

package com.accp.xml;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import com.accp.entity.Pet;

public class SaxParseXML extends DefaultHandler {

    private List<Pet> list = null;
    private Pet pet = null;
    private String preTagName = null;

    // Parse the file and return the list of pets
    public List<Pet> getPet(String fileURL)
            throws ParserConfigurationException, SAXException, IOException {
        // Obtain a SAX parser factory
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Create a parser from the factory
        SAXParser parser = factory.newSAXParser();
        // Parse the content identified by the given URI, with this object as the handler
        parser.parse(fileURL, this);
        // Return the list of pets
        return this.list;
    }

    // Called when the start of an element is received
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        // Print the node name
        System.out.println(qName);
        if ("dog".equals(qName)) {
            // Create a Pet object
            pet = new Pet();
            // Read the id attribute of the current node and assign it
            String id = attributes.getValue("id");
            pet.setId(Integer.parseInt(id));
        }
        // Remember the current node name
        preTagName = qName;
    }

    // Called when the end of an element is received
    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if ("dog".equals(qName)) {
            // Add the finished Pet object to the list
            list.add(pet);
            // Reset so the same object is not added twice
            pet = null;
        }
        // Clear the remembered node name
        preTagName = null;
    }

    // Called for text content; assigns the matching property of the pet
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Text outside a dog element, or between elements, is ignored
        if (pet == null || preTagName == null) {
            return;
        }
        String content = new String(ch, start, length);
        if ("name".equals(preTagName)) {
            pet.setName(content);
        } else if ("health".equals(preTagName)) {
            pet.setHealth(Integer.parseInt(content.trim()));
        } else if ("love".equals(preTagName)) {
            pet.setLove(Integer.parseInt(content.trim()));
        } else if ("strain".equals(preTagName)) {
            pet.setStrain(content);
        }
    }

    // Called when document parsing starts
    @Override
    public void startDocument() throws SAXException {
        System.out.println("");
        list = new ArrayList<Pet>();
    }

    // Called when document parsing ends
    @Override
    public void endDocument() throws SAXException {
        System.out.println("Document parsing ended");
    }
}

Test class:

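package com.accp.xml;

import java.io.IOException;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.SAXException;

import com.accp.entity.Pet;

public class SaxParseXMLTest {

    public static void main(String[] args)
            throws ParserConfigurationException, SAXException, IOException {
        SaxParseXML saxParseXML = new SaxParseXML();
        String fileURL = "pet.xml";
        List<Pet> list = saxParseXML.getPet(fileURL);
        for (Pet pet : list) {
            System.out.println(pet);
        }
    }
}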
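Neither the Pet entity class nor pet.xml appears in these notes. The sketch below is one hypothetical shape for them, inferred only from the setters the parsers call and the element names they match. The field types, the toString format, and the sample values are assumptions. In the layout used above the class would live in package com.accp.entity; the package line is left out so the sketch compiles on its own.

```java
/**
 * Hypothetical Pet entity, reconstructed from the setters used by the parsers.
 *
 * A pet.xml that both parsers could read might look like this (sample values invented):
 *
 *   <pets>
 *     <dog id="1">
 *       <name>Rex</name>
 *       <health>80</health>
 *       <love>90</love>
 *       <strain>husky</strain>
 *     </dog>
 *   </pets>
 */
public class Pet {

    private int id;
    private String name;
    private int health;
    private int love;
    private String strain;

    public void setId(int id) { this.id = id; }

    public void setName(String name) { this.name = name; }

    public void setHealth(int health) { this.health = health; }

    public void setLove(int love) { this.love = love; }

    public void setStrain(String strain) { this.strain = strain; }

    // Used by System.out.println(pet) in the examples.
    @Override
    public String toString() {
        return "Pet{id=" + id + ", name=" + name + ", health=" + health
                + ", love=" + love + ", strain=" + strain + "}";
    }
}
```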
