A classic introductory guide to parsing XML with DOM in Java

Source: Internet
Author: User
I. Preface

There are two common ways to parse XML documents in Java: the event-based Simple API for XML, called SAX, and the Document Object Model, called DOM, which is based on a tree of nodes. Sun provides the Java API for XML Processing (JAXP) as an interface to both SAX and DOM, and with JAXP we can use any JAXP-compliant XML parser.
The JAXP interface contains three packages:
(1) org.w3c.dom: the interfaces of the Document Object Model that the W3C recommends for XML.
(2) org.xml.sax: the event-driven Simple API for XML (SAX) for parsing XML.
(3) javax.xml.parsers: the parser factory tools, with which the programmer obtains and configures a specific parser.
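
To make the roles of the three packages concrete, here is a minimal sketch that counts the elements of the same document twice: once through SAX callbacks and once by querying a DOM tree. The class name JaxpPackagesSketch, its method names, and the inline sample string are assumptions made for illustration; they are not part of the original program.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class JaxpPackagesSketch {
    // Inline sample document (an assumption made for this sketch).
    static final String SAMPLE =
            "<books><book email=\"Zhoujunhui\"><name>rjzjh</name><price>jjjjjj</price></book></books>";

    private static InputStream toStream(String xml) {
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    // SAX: the parser pushes events to a handler; no tree is kept in memory.
    static int countElementsWithSax(String xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(toStream(xml), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
        return count[0];
    }

    // DOM: the parser builds the whole tree first, then we query it.
    static int countElementsWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(toStream(xml));
            return doc.getElementsByTagName("*").getLength(); // "*" matches every element
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElementsWithSax(SAMPLE));
        System.out.println(countElementsWithDom(SAMPLE));
    }
}
```

Both factories live in javax.xml.parsers, the SAX handler types come from org.xml.sax, and the tree types come from org.w3c.dom.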

II. Prerequisites

DOM programming needs no additional dependencies, because the org.w3c.dom, org.xml.sax, and javax.xml.parsers packages that ship with the JDK are sufficient.

III. Parsing XML documents using DOM

Let's see how DOM parses XML. Here too, I'll explain it with an example that could hardly be simpler. First, let's look at the XML itself.

<?xml version="1.0" encoding="gb2312"?>
<books>
    <book email="Zhoujunhui">
        <name>rjzjh</name>
        <price>jjjjjj</price>
    </book>
</books>

It could hardly be simpler, yet it has everything we need: a root element, an attribute, and child nodes. That is enough to illustrate the point, so let's look at the Java code that parses this XML file!

// The import statements this class needs are listed in the code interpretation section.
public class DomParse {
    public DomParse() {
        DocumentBuilderFactory domfac = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder dombuilder = domfac.newDocumentBuilder();
            InputStream is = new FileInputStream("bin/library.xml");
            Document doc = dombuilder.parse(is);
            Element root = doc.getDocumentElement();
            NodeList books = root.getChildNodes();
            if (books != null) {
                for (int i = 0; i < books.getLength(); i++) {
                    Node book = books.item(i);
                    // Skip the whitespace text nodes between elements.
                    if (book.getNodeType() == Node.ELEMENT_NODE) {
                        String email = book.getAttributes().getNamedItem("email").getNodeValue();
                        System.out.println(email);
                        for (Node node = book.getFirstChild(); node != null; node = node.getNextSibling()) {
                            if (node.getNodeType() == Node.ELEMENT_NODE) {
                                if (node.getNodeName().equals("name")) {
                                    // An element has no value of its own; its text lives in a child text node.
                                    String name = node.getNodeValue();
                                    String name1 = node.getFirstChild().getNodeValue();
                                    System.out.println(name);
                                    System.out.println(name1);
                                }
                                if (node.getNodeName().equals("price")) {
                                    String price = node.getFirstChild().getNodeValue();
                                    System.out.println(price);
                                }
                            }
                        }
                    }
                }
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new DomParse();
    }
}

IV. Code interpretation

Let's first look at the classes the program imports:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

The following are mainly classes from the org.w3c.dom package (SAXException comes from org.xml.sax):
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

The code above is simple enough to follow at a glance, but as an introduction to DOM programming, let's walk through it step by step:

(1) Get a factory instance for DOM parsers

DocumentBuilderFactory domfac = DocumentBuilderFactory.newInstance();
The instance of the javax.xml.parsers.DocumentBuilderFactory class is the parser factory we want.
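
The factory is also where a parser is configured before it is created. The sketch below shows a few of the standard switches; which ones to turn on is an assumption made for illustration, and the class name FactoryConfigSketch is hypothetical. The original program uses the factory with its defaults.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

class FactoryConfigSketch {
    // Returns a factory with a few illustrative settings applied.
    static DocumentBuilderFactory newConfiguredFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);   // report namespace URIs and local names
        factory.setIgnoringComments(true); // leave comment nodes out of the tree
        factory.setCoalescing(true);       // merge CDATA sections into adjacent text nodes
        try {
            // Ask the implementation to apply its secure-processing limits.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("secure processing is not supported", e);
        }
        return factory;
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = newConfiguredFactory();
        System.out.println("namespace aware: " + factory.isNamespaceAware());
    }
}
```

Every parser later created from this factory inherits these settings.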

(2) Get a DOM parser from the DOM factory

DocumentBuilder dombuilder = domfac.newDocumentBuilder();
The DOM parser is obtained by calling the newDocumentBuilder() instance method on the javax.xml.parsers.DocumentBuilderFactory instance.

(3) Open the XML document to be parsed as an input stream so that the DOM parser can parse it

InputStream is = new FileInputStream("bin/library.xml");
InputStream is an abstract class; FileInputStream is the subclass that reads from a file.
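
The line above never closes the stream. A minimal sketch of the same step with try-with-resources, so that the stream is closed even when parsing fails, might look like this. The class StreamHandlingSketch and the temporary sample file it writes are assumptions for illustration, not part of the original program.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class StreamHandlingSketch {
    // Opens the file, parses it, and always closes the stream afterwards.
    static Document parseFile(Path xmlFile) {
        try (InputStream in = Files.newInputStream(xmlFile)) {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse " + xmlFile, e);
        }
    }

    // Writes a small sample document to a temporary file (for this sketch only).
    static Path writeSampleFile() {
        try {
            Path tmp = Files.createTempFile("library", ".xml");
            String xml = "<books><book email=\"Zhoujunhui\"/></books>";
            Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseFile(writeSampleFile());
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```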

(4) Parse the input stream of the XML document to get a Document

Document doc = dombuilder.parse(is);
An org.w3c.dom.Document object is built from the input stream of the XML document, and all subsequent processing is done on this Document object.
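
An input stream is not the only thing DocumentBuilder can parse. As a small sketch, the standard parse(InputSource) overload lets us parse XML that is already held in a String; the class name ParseSourcesSketch and the sample string are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ParseSourcesSketch {
    // Parses XML text held in memory instead of reading it from a file.
    static Document parseString(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseString("<books><book email=\"Zhoujunhui\"/></books>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

This is handy for small experiments, because no file has to exist on disk.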

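(5) Get the root node of the XML document

Element root = doc.getDocumentElement();
getDocumentElement() returns the root element as an org.w3c.dom.Element object. Every other element in the tree is an Element as well, although the code below handles them through the more general Node interface.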
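
As a quick check that nested elements are Element objects too, not just the root, here is a minimal sketch. The helper firstElementChild and the class RootElementSketch are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class RootElementSketch {
    static final String SAMPLE =
            "<books>\n<book email=\"Zhoujunhui\">\n<name>rjzjh</name>\n</book>\n</books>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    // Returns the first child that is an element, or null if there is none.
    static Element firstElementChild(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n; // nested elements implement Element as well
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Element root = parse(SAMPLE).getDocumentElement();
        Element book = firstElementChild(root);
        System.out.println(root.getTagName());
        System.out.println(book.getTagName() + " email=" + book.getAttribute("email"));
    }
}
```

Casting to Element gives access to conveniences such as getAttribute(), which the Node interface does not offer.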

(6) Get the node's child nodes

NodeList books = root.getChildNodes();
for (int i = 0; i < books.getLength(); i++) {
    Node book = books.item(i);
}
getChildNodes() returns an org.w3c.dom.NodeList that holds all of the node's child nodes. There is another way to iterate over child nodes, which is introduced later.
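
The child list contains more than elements, which is why the program checks getNodeType() before using each child. The following sketch counts all children and the element children of a node for a sample laid out like the XML file above; the class ChildNodesSketch and its helpers are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ChildNodesSketch {
    // Same shape as the sample file, including the line breaks between tags.
    static final String SAMPLE =
            "<books>\n<book email=\"Zhoujunhui\">\n<name>rjzjh</name>\n<price>jjjjjj</price>\n</book>\n</books>";

    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    // Counts every child node, whatever its type.
    static int countChildren(Node parent) {
        return parent.getChildNodes().getLength();
    }

    // Counts only the children that are elements.
    static int countElementChildren(Node parent) {
        NodeList children = parent.getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Element root = parseRoot(SAMPLE);
        System.out.println(countChildren(root) + " children, "
                + countElementChildren(root) + " of them elements");
    }
}
```

The line breaks before and after <book> are kept as whitespace text nodes, so the root has three children but only one element child.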

(7) Get the attribute value of the node

String email = book.getAttributes().getNamedItem("email").getNodeValue();
System.out.println(email);
Notice that an attribute is not a child node of its element: it is reached through getAttributes(), not getChildNodes(), and its node type is Node.ATTRIBUTE_NODE, not Node.ELEMENT_NODE.
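
The sketch below looks at the attribute node more closely and shows Element.getAttribute() as a shorter route to the same value. The class AttributeSketch and the trimmed sample string are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class AttributeSketch {
    static final String SAMPLE = "<book email=\"Zhoujunhui\"><name>rjzjh</name></book>";

    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        Element book = parseRoot(SAMPLE);

        // Attributes live in a NamedNodeMap, separate from the child list.
        NamedNodeMap attrs = book.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            System.out.println(attr.getNodeName() + "=" + attr.getNodeValue());
        }

        Node email = attrs.getNamedItem("email");
        System.out.println(email.getNodeType() == Node.ATTRIBUTE_NODE); // an attribute node
        System.out.println(email.getParentNode() == null);              // not part of the child tree

        // Element offers a shortcut; a missing attribute yields "" rather than null.
        System.out.println(book.getAttribute("email"));
        System.out.println(book.hasAttribute("missing"));
    }
}
```

Because getNamedItem() returns null for a missing attribute, the chained call in the original line would throw a NullPointerException on a <book> without an email attribute; getAttribute() avoids that.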

(8) Iterate over the nodes

for (Node node = book.getFirstChild(); node != null; node = node.getNextSibling()) {
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        if (node.getNodeName().equals("name")) {
            String name = node.getNodeValue();
            String name1 = node.getFirstChild().getNodeValue();
            System.out.println(name);
            System.out.println(name1);
        }
        if (node.getNodeName().equals("price")) {
            String price = node.getFirstChild().getNodeValue();
            System.out.println(price);
        }
    }
}

The printout of this code is:
null
rjzjh
jjjjjj
As the output shows,
String name = node.getNodeValue(); yields null, whereas
String name1 = node.getFirstChild().getNodeValue(); yields the real value. This is because DOM treats <name>rjzjh</name> as two levels of nodes: the <name> element node itself is the parent, and it has a single child node (an element may have more than one), a text node whose value is "rjzjh". That is why we see the result above.
Note that this child is a text node, so its node type is Node.TEXT_NODE, not Node.ELEMENT_NODE. The node.getNextSibling() method returns the next adjacent sibling node.
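
The element-versus-text distinction can be checked directly. The sketch below also shows getTextContent(), a standard DOM Level 3 method that returns the text inside an element without reaching for the first child by hand; the class TextContentSketch is a hypothetical name for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class TextContentSketch {
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        Element name = parseRoot("<name>rjzjh</name>");

        // The element itself carries no value; its text is a child text node.
        System.out.println(name.getNodeValue());
        Node text = name.getFirstChild();
        System.out.println(text.getNodeType() == Node.TEXT_NODE);
        System.out.println(text.getNodeValue());

        // getTextContent() concatenates the text of all descendants.
        System.out.println(name.getTextContent());

        // For an empty element getFirstChild() is null, but getTextContent() is "".
        Element empty = parseRoot("<name/>");
        System.out.println(empty.getFirstChild() == null);
        System.out.println(empty.getTextContent().isEmpty());
    }
}
```

For an empty element such as <name/>, node.getFirstChild().getNodeValue() would throw a NullPointerException, so getTextContent() is the more forgiving choice when empty elements are possible.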

V. DOM nodes

A DOM document is a collection of nodes. Because a document may contain different kinds of information, several node types are defined. The most common node types in the DOM are:

(1) Element:
An element is the basic building block of XML. The child nodes of an element can be other elements, text nodes, or both. Element nodes are also the only node type that can have attributes.

(2) Attribute:
An attribute node holds information about an element node, but it is not a child of that element.

(3) Text:
A text node holds text content, which may consist of nothing but whitespace.

(4) Document:
The document node is the root of the tree and therefore the ancestor of all other nodes in the document.
The element is a very important node type, because an element node can act as a container for other nodes.
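
To see the four node types side by side, here is a small sketch that maps getNodeType() to a readable name. The helper typeName and the class NodeTypesSketch are hypothetical; the constants it switches on are the standard ones defined on org.w3c.dom.Node.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NodeTypesSketch {
    static final String SAMPLE = "<book email=\"Zhoujunhui\">rjzjh</book>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    // Maps the numeric node type to the names used in the list above.
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:   return "Element";
            case Node.ATTRIBUTE_NODE: return "Attribute";
            case Node.TEXT_NODE:      return "Text";
            case Node.DOCUMENT_NODE:  return "Document";
            default:                  return "Other(" + node.getNodeType() + ")";
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        Element book = doc.getDocumentElement();
        System.out.println(typeName(doc));
        System.out.println(typeName(book));
        System.out.println(typeName(book.getAttributeNode("email")));
        System.out.println(typeName(book.getFirstChild()));
    }
}
```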

VI. Steps for parsing an XML document with DOM

For the main steps, see steps (1), (2), (3), and (4) in section IV.


