DOM Parsing of XML Files

Last Update:2014-11-01 Source: Internet

Author: User

XML files are a common data exchange format. They are platform-independent, language-independent, and system-independent, which greatly facilitates data integration and interaction. The two basic parsing approaches are DOM parsing and SAX parsing; the options commonly used in Java are DOM, SAX, dom4j, and JDOM. This article starts with the implementation of DOM parsing:

1. Important objects

DocumentBuilderFactory: the factory object that creates document parsers.

DocumentBuilder: the document parser object, which is obtained from the factory object.

Document: the document object, that is, the in-memory tree produced by parsing the XML.
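
How the three objects fit together can be sketched as follows. This is a minimal illustration, assuming a small XML string held in memory instead of a file; the class name ImportantObjectsSketch, the method rootName, and the sample string are hypothetical and not part of the original article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

public class ImportantObjectsSketch {

    // Parses an XML string and returns the name of its root element.
    public static String rootName(String xml) throws Exception {
        // The factory creates parsers
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // The parser is obtained from the factory
        DocumentBuilder builder = factory.newDocumentBuilder();
        // The Document is the in-memory tree produced by the parser
        Document document = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return document.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // Prints the root element name of the illustrative input
        System.out.println(rootName("<world><name>China</name></world>"));
    }
}
```

Each object is created from the previous one, so the factory is the only one obtained through a static call.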

2. Sample XML file

XML file (world.xml):

<?xml version="1.0" encoding="UTF-8"?>
<world>
    <comuntry id="1">
        <name>China</name>
        <capital>Beijing</capital>
        <population>1234</population>
        <area>960</area>
    </comuntry>
    <comuntry id="2">
        <name id="">America</name>
        <capital>Washington</capital>
        <population>234</population>
        <area>900</area>
    </comuntry>
    <comuntry id="3">
        <name>Japan</name>
        <capital>Tokyo</capital>
        <population>234</population>
        <area>60</area>
    </comuntry>
    <comuntry id="4">
        <name>Russia</name>
        <capital>Moscow</capital>
        <population>34</population>
        <area>1960</area>
    </comuntry>
</world>

3. XML parsing implementation

1. Obtain a DocumentBuilderFactory

2. Obtain a DocumentBuilder from the factory

3. Read the input stream of the file

4. Obtain the root element of the document and call a recursive function to process it.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DomParse {

    /**
     * 1. Obtain a DocumentBuilderFactory
     * 2. Obtain a DocumentBuilder
     * 3. Read the input stream of the file
     * 4. Obtain the root element of the document and call the recursive function to process it
     *
     * @param args unused
     */
    public static void main(String[] args) {
        // Obtain the DocumentBuilderFactory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Read the input stream of the file; try-with-resources closes it afterwards
        try (InputStream inputStream = new FileInputStream(new File("world.xml"))) {
            // Obtain the DocumentBuilder
            DocumentBuilder documentBuilder = factory.newDocumentBuilder();
            // Obtain the Document object
            Document document = documentBuilder.parse(inputStream);
            // Obtain the root element of the document
            Element rootElement = document.getDocumentElement();
            listChildNodes(rootElement, 0);
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Recursively traverses and prints all element nodes (including their attributes).
     * 1. Process the node itself
     * 2. Process the node's attributes
     * 3. Process the child nodes (by recursion)
     *
     * @param node  the node to print
     * @param level the depth of the node (the root element is level 0)
     */
    public static void listChildNodes(Node node, int level) {
        // Print the XML declaration once, before the root element
        if (level == 0) {
            System.out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        }
        // Only process nodes of the element type
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            boolean hasTextChild = false;
            // Indentation for the current level
            String levelSpace = "";
            for (int i = 0; i < level; i++) {
                levelSpace += "    ";
            }
            // Print the start tag first; when attributes exist, the closing ">" is printed after them
            System.out.print(levelSpace + "<" + node.getNodeName() + (node.hasAttributes() ? " " : ">"));
            if (node.hasAttributes()) {
                // Print all attributes of the node
                NamedNodeMap namedNodeMap = node.getAttributes();
                for (int i = 0; i < namedNodeMap.getLength(); i++) {
                    // Double quotation marks inside a string literal must be escaped with \
                    System.out.print(namedNodeMap.item(i).getNodeName() + "=\""
                            + namedNodeMap.item(i).getNodeValue() + "\""
                            // If this is not the last attribute, leave a space before the next one
                            + (i == (namedNodeMap.getLength() - 1) ? "" : " "));
                }
                // All attributes are printed, so close the start tag
                System.out.print(">");
            }
            // Process the child nodes
            if (node.hasChildNodes()) {
                // Children are one level deeper than the current node
                level++;
                // Obtain the list of all child nodes
                NodeList nodeList = node.getChildNodes();
                for (int i = 0; i < nodeList.getLength(); i++) {
                    // Text node whose content is not whitespace only (checked with a regular expression)
                    if (nodeList.item(i).getNodeType() == Node.TEXT_NODE
                            && !nodeList.item(i).getTextContent().matches("\\s+")) {
                        hasTextChild = true;
                        // Print the text content directly after the start tag
                        System.out.print(nodeList.item(i).getTextContent());
                    } else if (nodeList.item(i).getNodeType() == Node.ELEMENT_NODE) {
                        System.out.println();
                        // Recurse into the child element; level determines its indentation
                        listChildNodes(nodeList.item(i), level);
                    }
                }
                // Back to the level of the current node
                level--;
            }
            // Print the end tag: right after the text if the element contains text,
            // otherwise on a new line, indented to the element's own level
            System.out.print((hasTextChild ? "" : "\n" + levelSpace) + "</" + node.getNodeName() + ">");
        }
    }
}
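
The whitespace check in the recursive method above is needed because a DOM parser, by default, reports the line breaks and indentation between elements as text nodes. The following sketch illustrates this under the assumption of a small in-memory XML string; the class name WhitespaceNodesSketch and the method countChildren are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class WhitespaceNodesSketch {

    // Counts the direct children of the root element that have the given node type.
    public static int countChildren(String xml, short nodeType) throws Exception {
        Node root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        NodeList children = root.getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Two child elements, separated by line breaks and indentation
        String xml = "<world>\n  <name>China</name>\n  <name>Japan</name>\n</world>";
        System.out.println("elements: " + countChildren(xml, Node.ELEMENT_NODE));
        // Non-zero: the whitespace between the elements shows up as text nodes
        System.out.println("text nodes: " + countChildren(xml, Node.TEXT_NODE));
    }
}
```

Without the whitespace check, these text nodes would be printed as if they were element content and would disturb the indentation of the output.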

4. Summary

The code above parses an XML file using the DOM approach. Note that the recursive function call is the key to handling documents of arbitrary depth. Once you know the structure of an XML file, you can encapsulate the parsed data in objects for storage or other uses. I will share more when I get the chance.
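
Encapsulating the parsed data in objects could look like the sketch below. It assumes the element and attribute names of the sample file (including the spelling "comuntry") and that every entry has a name and a capital child; the classes CountryMappingSketch and Country and the helper childText are hypothetical and not part of the original article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CountryMappingSketch {

    // Hypothetical value class; its fields mirror part of the sample file.
    public static class Country {
        public final String id;
        public final String name;
        public final String capital;

        Country(String id, String name, String capital) {
            this.id = id;
            this.name = name;
            this.capital = capital;
        }
    }

    // Returns the text of the first descendant element with the given tag name.
    // Assumes that such an element exists.
    private static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Parses the XML and maps each "comuntry" element to a Country object.
    public static List<Country> parse(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = document.getElementsByTagName("comuntry");
        List<Country> countries = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            countries.add(new Country(element.getAttribute("id"),
                    childText(element, "name"), childText(element, "capital")));
        }
        return countries;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<world>"
                + "<comuntry id=\"1\"><name>China</name><capital>Beijing</capital></comuntry>"
                + "<comuntry id=\"2\"><name>America</name><capital>Washington</capital></comuntry>"
                + "</world>";
        for (Country country : parse(xml)) {
            System.out.println(country.id + ": " + country.name + " (" + country.capital + ")");
        }
    }
}
```

Unlike the generic recursive printer, this mapping depends on knowing the element names in advance, which is why it only becomes practical once the structure of the file is known.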

This article is an English version of an article that was originally written in Chinese on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness, ownership, or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.