Introduction to Using the Document Object Model (DOM) in Java [reprint]

Last Update:2018-12-07 Source: Internet

Author: User


The Document Object Model (DOM) is a standard for working with complete documents in complex applications, and it gives programs a lot of flexibility. DOM is a formal standard: it is robust and complete, and it has many implementations. That is the deciding factor for many large installations, especially production applications, because it avoids a lot of rewriting when an API changes.
These are the reasons I chose DOM over more object-oriented APIs such as JDOM or dom4j when processing XML data. However, DOM was designed from the start as a language-independent model, so its interfaces lean toward languages such as C or Perl and do not take advantage of Java's object-oriented features. As a result, I ran into quite a few difficulties while using it, which I summarize here. Note that I currently use XML mainly as a uniform format for data transfer and for display in the user interface. That is a fairly narrow range of uses, so this article touches only a small part of DOM.
Before starting, I expected to run into difficulties and prepared for them, so from the beginning I wrote a simple utility class that wraps the common operations needed for DOM objects. That turned out to be a wise decision. Creating a Document object is a simple operation, but if every call takes more than five lines of code plus handling for those annoying exceptions, it quickly dampens everyone's enthusiasm. So I wrote an XmlTool class that wraps the following common operations:
1. Create a Document object, either an empty one or one with a given node as its root node.
2. Convert a well-formed XML string into a Document object.
3. Read an XML file from disk and return it as a Document object.
4. Convert a Node object to a string.

Each method catches the checked exceptions thrown by the underlying DOM operations and rethrows them as a RuntimeException. In practice these exceptions are almost never thrown, so it is not worth spending effort on, for example, the ParserConfigurationException that can occur when creating a Document object or the TransformerConfigurationException that can occur when converting a node to a string. Besides, when such an exception does occur there is usually nothing the caller can do about it, because it is typically caused by a problem in the environment configuration (for example, the required DOM parser implementation is not on the classpath). The wrapper therefore simply takes the exception's message and throws it in a RuntimeException.
The code is as follows:
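// Imports needed by the XmlTool methods shown in this article.
import java.io.*;
import java.util.Properties;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/**
 * Creates an empty Document object and returns it.
 * @return a new, empty Document
 */
public static Document newXmlDocument() {
    try {
        return newDocumentBuilder().newDocument();
    } catch (ParserConfigurationException e) {
        throw new RuntimeException(e.getMessage());
    }
}

/**
 * Creates a DocumentBuilder.
 * @return a DocumentBuilder
 * @throws ParserConfigurationException if the parser cannot be configured
 */
public static DocumentBuilder newDocumentBuilder()
        throws ParserConfigurationException {
    return newDocumentBuilderFactory().newDocumentBuilder();
}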

/**
 * Creates a namespace-aware DocumentBuilderFactory.
 * @return a DocumentBuilderFactory
 */
public static DocumentBuilderFactory newDocumentBuilderFactory() {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setNamespaceAware(true);
    return dbf;
}

/**
 * Parses an XML string into an org.w3c.dom.Document and returns it.
 * @param xmlString a string containing well-formed XML
 * @return the parsed Document
 */
public static Document parseXmlDocument(String xmlString) {
    if (xmlString == null) {
        throw new IllegalArgumentException();
    }
    try {
        return newDocumentBuilder().parse(
            new InputSource(new StringReader(xmlString)));
    } catch (Exception e) {
        throw new RuntimeException(e.getMessage());
    }
}

/**
 * Parses the given input stream into an org.w3c.dom.Document and returns it.
 * @param input the stream to read the XML from
 * @return the parsed org.w3c.dom.Document
 */
public static Document parseXmlDocument(InputStream input) {
    if (input == null) {
        throw new IllegalArgumentException("The parameter is null!");
    }
    try {
        return newDocumentBuilder().parse(input);
    } catch (Exception e) {
        throw new RuntimeException(e.getMessage());
    }
}
/**
 * Loads the file with the given name and parses it into an org.w3c.dom.Document.
 * @param filename name of the file to be parsed
 * @return the parsed org.w3c.dom.Document
 */
public static Document loadXmlDocumentFromFile(String filename) {
    if (filename == null) {
        throw new IllegalArgumentException(
            "The file name and its physical path are not specified!");
    }
    try {
        return newDocumentBuilder().parse(new File(filename));
    } catch (SAXException e) {
        throw new IllegalArgumentException(
            "The target file (" + filename + ") cannot be correctly parsed into XML!\n" + e.getMessage());
    } catch (IOException e) {
        throw new IllegalArgumentException(
            "The target file cannot be obtained (" + filename + ")!\n" + e.getMessage());
    } catch (ParserConfigurationException e) {
        throw new RuntimeException(e.getMessage());
    }
}
/**
 * Creates a new Document and adds a deep copy of the given node to it as the root.
 * @param node a DOM node
 * @return a new Document
 */
public static Document newXmlDocument(Node node) {
    Document doc = newXmlDocument();
    doc.appendChild(doc.importNode(node, true));
    return doc;
}

/**
 * Serializes a DOM Node to an XML string.
 * A failed transformation is rethrown as a RuntimeException; the error XML
 * from errXmlString is returned only if no Transformer is available.
 * @param node the DOM Node to serialize
 * @return the XML string for the node
 */
public static String toString(Node node) {
    if (node == null) {
        throw new IllegalArgumentException();
    }
    Transformer transformer = newTransformer();
    if (transformer != null) {
        try {
            StringWriter sw = new StringWriter();
            transformer.transform(
                new DOMSource(node),
                new StreamResult(sw));
            return sw.toString();
        } catch (TransformerException te) {
            throw new RuntimeException(te.getMessage());
        }
    }
    return errXmlString("XML information cannot be generated!");
}
/**
 * Creates a Transformer. The same initialization is needed wherever one is
 * used, so it is extracted into a shared method.
 * @return a Transformer that outputs XML 1.0 declared as gb2312, without indentation
 */
public static Transformer newTransformer() {
    try {
        Transformer transformer =
            TransformerFactory.newInstance().newTransformer();
        Properties properties = transformer.getOutputProperties();
        properties.setProperty(OutputKeys.ENCODING, "gb2312");
        properties.setProperty(OutputKeys.METHOD, "xml");
        properties.setProperty(OutputKeys.VERSION, "1.0");
        properties.setProperty(OutputKeys.INDENT, "no");
        transformer.setOutputProperties(properties);
        return transformer;
    } catch (TransformerConfigurationException tce) {
        throw new RuntimeException(tce.getMessage());
    }
}
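
As a rough illustration of how the parsing and serialization helpers fit together, the following self-contained sketch parses an XML string and writes it back out. The class name RoundTripSketch and the sample XML are made up for this example. Unlike the gb2312 setting above, it omits the XML declaration so that the output is easy to inspect, which is an assumption of the sketch and not part of the original tool class.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the original XmlTool.
class RoundTripSketch {

    // Parse an XML string, wrapping every checked exception in a RuntimeException.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e.getMessage(), e);
        }
    }

    // Serialize a node to a string; the XML declaration is omitted (sketch assumption).
    static String serialize(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.INDENT, "no");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sw = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(sw));
            return sw.toString();
        } catch (Exception e) {
            throw new RuntimeException(e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<order><id>42</id></order>");
        System.out.println(serialize(doc));
    }
}
```

Callers of parse and serialize need no try/catch blocks, which is the point of the wrapper approach described above.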
/**
 * Returns an XML error message whose title is "system error". The XML is
 * assembled by plain string concatenation.
 * No exception is thrown.
 * @param errMsg the error message
 * @return an XML string that carries the error message
 */
public static String errXmlString(String errMsg) {
    StringBuffer msg = new StringBuffer(100);
    msg.append("<?xml version=\"1.0\" encoding=\"gb2312\"?>");
    msg.append("<errNode title=\"system error\" errMsg=\"" + errMsg + "\"/>");
    return msg.toString();
}
/**
 * Returns an XML error message whose title is "system error".
 * @param errMsg the error message
 * @param errClass the class that raised the error, used to record the error source
 * @return an XML string that carries the error message
 */
public static String errXmlString(String errMsg, Class<?> errClass) {
    StringBuffer msg = new StringBuffer(100);
    msg.append("<?xml version=\"1.0\" encoding=\"gb2312\"?>");
    msg.append(
        "<errNode title=\"system error\" errMsg=\""
            + errMsg
            + "\" errSource=\""
            + errClass.getName()
            + "\"/>");
    return msg.toString();
}
/**
 * Returns an XML error message.
 * @param title the title of the message
 * @param errMsg the error message
 * @param errClass the class that raised the error, used to record the error source
 * @return an XML string that carries the error message
 */
public static String errXmlString(
        String title,
        String errMsg,
        Class<?> errClass) {
    StringBuffer msg = new StringBuffer(100);
    msg.append("<?xml version=\"1.0\" encoding=\"gb2312\"?>");
    msg.append(
        "<errNode title=\""
            + title
            + "\" errMsg=\""
            + errMsg
            + "\" errSource=\""
            + errClass.getName()
            + "\"/>");
    return msg.toString();
}
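
One thing to keep in mind with string assembly is that the error message is inserted as-is, so a message containing a double quote, a less-than sign or an ampersand would produce XML that is not well formed. Below is a minimal sketch of one way around this, assuming it is acceptable to escape the five predefined XML characters by hand. The class name ErrXmlSketch and the helper escapeAttr are hypothetical and not part of the original tool class.

```java
// Hypothetical sketch; not part of the original XmlTool.
class ErrXmlSketch {

    // Escape the characters that are special inside a double-quoted XML attribute value.
    static String escapeAttr(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Same shape as the one-argument errXmlString, but with the message escaped first.
    static String errXmlString(String errMsg) {
        StringBuilder msg = new StringBuilder(100);
        msg.append("<?xml version=\"1.0\" encoding=\"gb2312\"?>");
        msg.append("<errNode title=\"system error\" errMsg=\"")
           .append(escapeAttr(errMsg))
           .append("\"/>");
        return msg.toString();
    }

    public static void main(String[] args) {
        System.out.println(errXmlString("value \"a<b\" & more"));
    }
}
```

An alternative would be to build the error node with the DOM API and serialize it, which leaves the escaping to the serializer at the cost of a little more code.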

The above are all basic uses of DOM, so I will not go through them in detail.
In actual use, a few situations come up again and again, yet the design of the DOM interfaces makes them tedious to handle, so I added a helper method for each.
The most tedious one is getting the text held by the text child node of an element, as in an XML element that contains only the following text:

text

Given the Element node object, getting the text "text" takes several steps: get the element's list of child nodes, check whether any child nodes exist, and if they do, iterate over them to find a Text node and call its getNodeValue() method to get the text. An element with no content has no child nodes at all, so you must always check for child nodes before accessing the Text node that actually holds the text. If all the data to be processed arrives in this form, this adds a large amount of repetitive code and makes development tedious. I therefore adopted a convention and provided a shared method: it returns the text of the first Text node among the direct children of the given node, or null if there is no such Text node. The convention limits where the method can be used and it can be misused, but it fits the way the data is actually used, which is exactly the form shown in the example above. The code:
/**
 * Returns the text of the Text node under the given node, or null if there
 * is no Text node.
 * Note: only direct children are examined; deeper levels are not considered.
 * @param node a node
 * @return the text of the first Text child found, or null if there is none
 */
public static String getNodeValue(Node node) {
    if (node == null) {
        return null;
    }

    Text text = getTextNode(node);

    if (text != null) {
        return text.getNodeValue();
    }

    return null;
}

/**
 * Returns the Text node under the given node, or null if there is none.
 * Note: only direct children are examined; deeper levels are not considered.
 * @param node a node
 * @return the first Text child found, or null if there is none
 */
public static Text getTextNode(Node node) {
    if (node == null) {
        return null;
    }
    if (node.hasChildNodes()) {
        NodeList list = node.getChildNodes();
        for (int i = 0; i < list.getLength(); i++) {
            if (list.item(i).getNodeType() == Node.TEXT_NODE) {
                return (Text) list.item(i);
            }
        }
    }
    return null;
}

The code above also puts the lookup of the direct Text child of a given node into a method of its own (getTextNode), so it can be used separately.
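
To make the convention concrete, here is a small self-contained sketch of the same lookup. The element names name and empty, the class name TextValueSketch and the sample documents are assumptions made for illustration. The sketch also calls Node.getTextContent(), which DOM Level 3 (Java 5 and later) offers as a built-in alternative. Note that getTextContent() concatenates the text of all descendants and returns an empty string, not null, for an empty element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the original XmlTool.
class TextValueSketch {

    // Text of the first direct Text child, or null if there is none.
    static String firstText(Node node) {
        if (node == null) {
            return null;
        }
        NodeList list = node.getChildNodes();
        for (int i = 0; i < list.getLength(); i++) {
            if (list.item(i).getNodeType() == Node.TEXT_NODE) {
                return list.item(i).getNodeValue();
            }
        }
        return null;
    }

    // Parse a string and return its root element (checked exceptions are wrapped).
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new RuntimeException(e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        Element name = root("<name>text</name>");
        Element empty = root("<empty/>");
        System.out.println(firstText(name));             // text of the direct Text child
        System.out.println(firstText(empty));            // null: the element has no children
        System.out.println(name.getTextContent());       // DOM Level 3 alternative
        System.out.println(empty.getTextContent().isEmpty());
    }
}
```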

Another common need is to locate a target node directly and get its Node object, rather than finding it by traversing the tree level by level. The DOM Level 2 interfaces provide at least the following methods for locating nodes:
1. For Document objects:
1) getDocumentElement() - returns the root element. It is rarely used for locating data, because the root node is usually just a container and the actual data nodes start with its direct children.
2) getElementById(String elementId) - this should be the best method for locating a node, but I have not managed to use it in practice. The main reason is that the "ID" here is not the same thing as an attribute named "id" on a node, as the org.w3c.dom.Document API documentation clearly states. I searched through a lot of material without finding how to use it, so I had to give up.
3) getElementsByTagName(String tagName) - with no real alternative, this is the method I ended up using, and it has proved very useful in practice. Although it returns a NodeList, in practice tag names are designed to be distinctive strings, so the target node can be fetched directly. Tag names are often chosen the same way: in many cases the name of a database field is used as the tag name, so that the field's value can be fetched conveniently. With a simple convention, the following method is used:
/**
 * Finds all elements under the given element whose tag name is tagName and
 * returns the first one in the resulting node list.
 * Returns null if no element with that tag name exists.
 * @param element the element to search under
 * @param tagName the tag name to search for
 * @return the first element named tagName, or null
 */
public static Element getFirstElementByName(
        Element element,
        String tagName) {
    return (Element) getFirstElement(element.getElementsByTagName(tagName));
}
/**
 * Returns the first node in the given node list, or null if the list is
 * null or empty.
 * @param nodeList a NodeList
 * @return the first node, or null
 */
private static Node getFirstElement(NodeList nodeList) {
    if (nodeList == null || nodeList.getLength() == 0) {
        return null;
    }
    return nodeList.item(0);
}
This convention looks very restrictive, but it matches actual use: in practice, getting the first element node with a given tag name is almost always all that is needed.
4) getElementsByTagNameNS(String namespaceURI, String localName) - I have essentially never used this method, because I have not yet needed namespaces.
2. For Element objects - the Element interface offers the same getElementsByTagName() lookups as Document, but it has no getDocumentElement() method (and no getElementById()). As with Document, getElementsByTagName() is the method mainly used.
3. Other node objects do not have methods for locating nodes directly.
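
On the getElementById() question raised in the list above: an attribute counts as an ID only if it has been declared with type ID (for example in a DTD) or has been flagged as one at run time. The following is a hedged sketch of the second route, assuming a DOM Level 3 implementation such as the one bundled with Java 5 and later, where Element.setIdAttribute(name, true) marks an existing attribute as an ID. The element and attribute names (items, item, id) and the class name ElementByIdSketch are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the original XmlTool.
class ElementByIdSketch {

    // Parse an XML string, wrapping checked exceptions in a RuntimeException.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e.getMessage(), e);
        }
    }

    // Flag the "id" attribute of every <item> element as an ID-typed attribute.
    static void markIds(Document doc) {
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            if (item.hasAttribute("id")) {
                item.setIdAttribute("id", true);
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<items><item id=\"a1\">first</item><item id=\"b2\">second</item></items>");
        // Without a DTD, "id" is an ordinary attribute, so this lookup finds nothing.
        System.out.println(doc.getElementById("b2"));
        markIds(doc);
        // After flagging, the same lookup returns the matching element.
        Element found = doc.getElementById("b2");
        System.out.println(found.getTextContent());
    }
}
```

Flagging the attributes still requires one pass over the elements, so this mainly pays off when the same document is queried by ID many times.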

Another difficulty comes from a restriction in DOM Level 2. Under the specification, a node that belongs to document docA cannot be appended directly to the child list of a node that belongs to another document docB. You must first import the docA node into docB with docB's importNode method and then append the imported copy to the target node. I therefore added methods that handle this uniformly:
/**
 * Appends the root element of appendedDoc, together with all of its
 * descendants, as the last child of the root element of doc.
 * Conceptually: doc.appendDoc(appendedDoc);
 * @param doc the target Document
 * @param appendedDoc the Document whose content is appended
 */
public static void appendXmlDocument(Document doc, Document appendedDoc) {
    if (appendedDoc != null) {
        // getDocumentElement() is used instead of getFirstChild(), because the
        // first child of a Document can be a comment or a DOCTYPE node.
        doc.getDocumentElement().appendChild(
            doc.importNode(appendedDoc.getDocumentElement(), true));
    }
}
/**
 * Appends appendedNode, together with all of its descendants, as the last
 * child of node. If appendedNode is a Document, its root element is used.
 * Conceptually: node.appendDoc(appendedNode);
 * @param node the node to append to; it must not itself be a Document,
 *             because getOwnerDocument() returns null for a Document
 * @param appendedNode the node that is imported and added as the last child of node
 */
public static void appendXmlDocument(Node node, Node appendedNode) {
    if (appendedNode == null) {
        return;
    }
    if (appendedNode instanceof Document) {
        appendedNode = ((Document) appendedNode).getDocumentElement();
    }
    node.appendChild(
        node.getOwnerDocument().importNode(appendedNode, true));
}
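
The restriction and the workaround can be seen in a few lines. This sketch, with made-up documents and a hypothetical class name ImportNodeSketch, first tries to append a node from another document directly, which the DOM specification says must fail with a DOMException whose code is WRONG_DOCUMENT_ERR, and then does the same thing through importNode.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the original XmlTool.
class ImportNodeSketch {

    // Parse an XML string, wrapping checked exceptions in a RuntimeException.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e.getMessage(), e);
        }
    }

    // Returns true if appending a node owned by another document is rejected.
    static boolean directAppendFails(Document target, Document source) {
        try {
            target.getDocumentElement().appendChild(source.getDocumentElement());
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.WRONG_DOCUMENT_ERR;
        }
    }

    // Import a deep copy of the source root into the target document, then append it.
    static void appendViaImport(Document target, Document source) {
        Node copy = target.importNode(source.getDocumentElement(), true);
        target.getDocumentElement().appendChild(copy);
    }

    public static void main(String[] args) {
        Document docB = parse("<b/>");
        Document docA = parse("<a><child/></a>");
        System.out.println(directAppendFails(docB, docA));
        appendViaImport(docB, docA);
        System.out.println(docB.getDocumentElement().getFirstChild().getNodeName());
    }
}
```

Because importNode creates a copy, docA keeps its own root element after the call. DOM Level 3 also defines Document.adoptNode for cases where the node should be moved instead of copied.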

That covers the basics. There are a few other helper methods, but they are not used often. Finally, if anyone knows a concrete, convenient way to use the getElementById() method mentioned above, please let me know.

author blog: http://blog.csdn.net/Morgan0916/

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
