The DOM of an XML Parser

Source: Internet
Author: User
Tags: xml, parser

The Document Object Model (usually called the DOM) defines a set of interfaces for the parsed version of an XML document. The parser reads the entire document and builds a tree that resides in memory. Your code can then use the DOM interfaces to work with this tree: you can traverse it to find out what the original document contains, delete sections of it, rearrange it, and add new branches.

 

DOM Problems

The DOM provides a rich set of functions for interpreting and manipulating XML documents, but they come at a cost. While the original DOM for XML documents was being developed, many people on the XML-DEV mailing list raised several issues:

  • The DOM builds a tree of the entire document in memory. If the document is large, a large amount of memory is required.
  • The DOM creates an object for every item in the original document, including elements, text, attributes, and whitespace. If you only care about a small part of the document, creating objects that will never be used is extremely wasteful.
  • A DOM parser must read the entire document before your code gets control. For very large documents this can cause significant latency.

These problems are simply consequences of how the Document Object Model is designed. Despite them, the DOM API is a very useful way to parse XML documents.

Common DOM properties and functions
Let's look at a simple document tree.

The top-level node of the tree is NodeA. The following pseudocode reflects the relationships between the nodes in the tree:

NodeA.firstChild = NodeA1
NodeA.lastChild = NodeA3
NodeA.childNodes.length = 3
NodeA.childNodes[0] = NodeA1
NodeA.childNodes[1] = NodeA2
NodeA.childNodes[2] = NodeA3
NodeA1.parentNode = NodeA
NodeA1.nextSibling = NodeA2
NodeA3.previousSibling = NodeA2
NodeA3.nextSibling = null
NodeA.lastChild.firstChild = NodeA3a
NodeA3b.parentNode.parentNode = NodeA
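
To make the relationships in the pseudocode concrete, here is a minimal sketch in plain JavaScript. SimpleNode and siblingAt are hypothetical names invented for this illustration; they are not the browser's Node implementation, and the sketch assumes that NodeA3a and NodeA3b are the only children of NodeA3.

```javascript
// Hypothetical stand-in for a DOM node; not the browser's Node implementation.
function SimpleNode(name) {
  this.nodeName = name;
  this.parentNode = null;
  this.childNodes = [];
}

// Append a child and record its parent (mirrors appendChild in spirit).
SimpleNode.prototype.appendChild = function (child) {
  child.parentNode = this;
  this.childNodes.push(child);
  return child;
};

// Return the sibling at the given offset from a node, or null if there is none.
function siblingAt(node, offset) {
  if (!node.parentNode) return null;
  var siblings = node.parentNode.childNodes;
  return siblings[siblings.indexOf(node) + offset] || null;
}

// Derived relationships, computed from parentNode and childNodes.
Object.defineProperties(SimpleNode.prototype, {
  firstChild: { get: function () { return this.childNodes[0] || null; } },
  lastChild: { get: function () { return this.childNodes[this.childNodes.length - 1] || null; } },
  nextSibling: { get: function () { return siblingAt(this, 1); } },
  previousSibling: { get: function () { return siblingAt(this, -1); } }
});

// Build the tree described by the pseudocode.
var nodeA = new SimpleNode("NodeA");
var nodeA1 = nodeA.appendChild(new SimpleNode("NodeA1"));
var nodeA2 = nodeA.appendChild(new SimpleNode("NodeA2"));
var nodeA3 = nodeA.appendChild(new SimpleNode("NodeA3"));
var nodeA3a = nodeA3.appendChild(new SimpleNode("NodeA3a"));
var nodeA3b = nodeA3.appendChild(new SimpleNode("NodeA3b"));

console.log(nodeA.lastChild.firstChild.nodeName); // prints "NodeA3a"
console.log(nodeA3.nextSibling);                  // prints null
```

Only parentNode and childNodes are stored here; the other four properties are derived from them, which is one simple way to keep the relationships consistent.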
The DOM provides methods for manipulating the node structure of a document tree, such as inserting, replacing, deleting, and cloning nodes. The common functions are described below:
insertBefore() inserts a new child node before a reference child node. If the reference node is null, the new node is inserted as the last child of the calling node.
replaceChild() replaces oldChild with the specified newChild in the childNodes collection. If the replacement succeeds, oldChild is returned. In some implementations (for example, MSXML), passing null as newChild deletes oldChild; browser DOM implementations expect a node and throw an error instead.
removeChild() deletes the specified node from the node's childNodes collection. If the deletion succeeds, the deleted child node is returned.
appendChild() adds a new node to the end of the childNodes collection. If it succeeds, the new node is returned.
cloneNode() creates a copy of the node. If the argument is true, the node's descendants are copied as well. If the node is an element, its attributes are also copied. The new node is returned.
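
The semantics described above can be sketched with a small array-backed model. SketchNode and detach are hypothetical names used only for this illustration; real DOM implementations differ in many details (error types, document fragments, the null newChild case), so treat this as a sketch of the behavior rather than of any actual parser.

```javascript
// Hypothetical array-backed node used only to illustrate the semantics above.
function SketchNode(name, attributes) {
  this.nodeName = name;
  this.attributes = attributes || {};
  this.parentNode = null;
  this.childNodes = [];
}

// Detach a node from its current parent, if it has one.
function detach(node) {
  if (node.parentNode) node.parentNode.removeChild(node);
}

// Insert newChild before refChild; a null refChild means "append at the end".
SketchNode.prototype.insertBefore = function (newChild, refChild) {
  detach(newChild);
  var index = refChild == null ? this.childNodes.length : this.childNodes.indexOf(refChild);
  if (index === -1) throw new Error("refChild is not a child of this node");
  this.childNodes.splice(index, 0, newChild);
  newChild.parentNode = this;
  return newChild;
};

SketchNode.prototype.appendChild = function (newChild) {
  return this.insertBefore(newChild, null);
};

// Remove oldChild and return it.
SketchNode.prototype.removeChild = function (oldChild) {
  var index = this.childNodes.indexOf(oldChild);
  if (index === -1) throw new Error("oldChild is not a child of this node");
  this.childNodes.splice(index, 1);
  oldChild.parentNode = null;
  return oldChild;
};

// Put newChild where oldChild was and return oldChild.
SketchNode.prototype.replaceChild = function (newChild, oldChild) {
  this.insertBefore(newChild, oldChild);
  return this.removeChild(oldChild);
};

// Copy the node and its attributes; copy descendants only when deep is true.
SketchNode.prototype.cloneNode = function (deep) {
  var copy = new SketchNode(this.nodeName, Object.assign({}, this.attributes));
  if (deep) {
    this.childNodes.forEach(function (child) {
      copy.appendChild(child.cloneNode(true));
    });
  }
  return copy;
};

// Usage: build a list, copy it, then remove an item from the original.
var list = new SketchNode("ul", { id: "menu" });
var first = list.appendChild(new SketchNode("li"));
var third = list.appendChild(new SketchNode("li"));
var second = list.insertBefore(new SketchNode("li"), third);
var deepCopy = list.cloneNode(true);
list.removeChild(first);
console.log(list.childNodes.length, deepCopy.childNodes.length); // prints 2 3
```

Note that the deep copy is unaffected by the later removeChild() call, because cloneNode(true) copies the descendants instead of sharing them.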

Nodes in the tree can be divided into element nodes and text nodes.

An element node corresponds to a tag in the document. It can have attributes and can contain child nodes.

A text node corresponds to plain text content in the document. It has no corresponding tag in the document and cannot contain child nodes.

Element nodes and text nodes are accessed in different ways, but they are deleted in the same way.

Element Node
Accessing an element node
You can use the following methods:

The relationships between nodes: jump from one node to another until you reach the target node, mainly using firstChild, parentNode, and so on.
getElementById()
getElementsByTagName()

The following describes these methods in detail:

Use the relationship between nodes to locate an element node
Example 1:

<html>
<head>
<title></title>
</head>

<body><p>This is a piece of text.</p>
<script language="JavaScript">
alert(document.documentElement.lastChild.firstChild.tagName);
</script>
</body>
</html>
The alert displays "P". The expression breaks down as follows:

document.documentElement
Gets the html element.
lastChild
Gets the body element.
firstChild
Gets the first node in the body, that is, the p element.
tagName
Gets the tag name of that node, "P".
Example 2:

<html>
<head>
<title></title>
</head>

<body>
<p>This is a piece of text.</p>
<script language="JavaScript">
alert(document.documentElement.lastChild.firstChild.tagName);
</script>
</body>
</html>
In this example, a line break is added after the opening body tag. Firefox treats the line break as a text node, so firstChild is that text node and tagName returns undefined. Older versions of IE skip the whitespace, so firstChild still points to the p element.
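
One way to make this kind of navigation independent of whitespace is to skip every node that is not an element. The sketch below is only an illustration: firstElementChildOf is a hypothetical helper, and the plain objects stand in for parsed nodes so that the logic can be followed without a browser. Current browsers also expose a built-in firstElementChild property on elements.

```javascript
var ELEMENT_NODE = 1; // the nodeType value that the DOM assigns to element nodes

// Return the first child that is an element, skipping text and comment nodes.
function firstElementChildOf(node) {
  var child = node.firstChild;
  while (child && child.nodeType !== ELEMENT_NODE) {
    child = child.nextSibling;
  }
  return child || null;
}

// Plain objects standing in for <body>, the whitespace text node, and <p>.
var paragraph = { nodeType: 1, tagName: "P", nextSibling: null };
var whitespace = { nodeType: 3, nodeValue: "\n", nextSibling: paragraph };
var body = { nodeType: 1, tagName: "BODY", firstChild: whitespace };

console.log(firstElementChildOf(body).tagName); // prints "P"
```

In a page, the same helper could be called as firstElementChildOf(document.documentElement.lastChild).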

Use document.getElementById() to access an element node
First, set an id value on the node you want to access.

<p id="section">This is a paragraph.</p>
Then pass the id value to document.getElementById() to obtain the corresponding node.

alert(document.getElementById("section").tagName);
With this method you do not need to care where the node sits in the document tree; you only need to make sure that its id value is unique on the page.

Use document.getElementsByTagName(tagName)[n] to access an element node
document.getElementsByTagName(tagName) returns an array-like collection of nodes. For example, the following code changes the color of every link on the page.

var nodeList = document.getElementsByTagName("a");
for (var i = 0; i < nodeList.length; i++)
  nodeList[i].style.color = "#ff0000";
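
Conceptually, a tag-name lookup walks the tree and collects every matching element in document order. The following sketch shows that idea on plain objects. collectByTagName is a hypothetical function written for this illustration, it returns an ordinary array, and it is not how any particular browser implements getElementsByTagName().

```javascript
// Collect descendant elements whose tagName matches, in document order.
function collectByTagName(root, tagName) {
  var wanted = tagName.toUpperCase();
  var matches = [];
  (function visit(node) {
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      var child = children[i];
      if (child.nodeType === 1) { // element nodes only
        if (wanted === "*" || child.tagName === wanted) matches.push(child);
        visit(child);
      }
    }
  })(root);
  return matches;
}

// Plain objects standing in for a small parsed page with two links.
var link1 = { nodeType: 1, tagName: "A", childNodes: [] };
var link2 = { nodeType: 1, tagName: "A", childNodes: [] };
var para = { nodeType: 1, tagName: "P", childNodes: [{ nodeType: 3, nodeValue: "see " }, link2] };
var page = { nodeType: 1, tagName: "BODY", childNodes: [link1, para] };

console.log(collectByTagName(page, "a").length); // prints 2
```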

Creating an element node
HTML before the new node is created:

<p id="section">This is a paragraph.</p>

JavaScript that creates a span element containing text and appends it to the paragraph:

var span_text = document.createElement("span");
span_text.appendChild(document.createTextNode("this is the newly added text."));
document.getElementById("section").appendChild(span_text);
Text Node
Accessing a text node
A text node has no id attribute as an element node does, so it cannot be accessed through document.getElementById().

A text node does not correspond to any tag in the document, so it cannot be accessed through document.getElementsByTagName() either.

Therefore, a text node can only be located by using the relationships between nodes.

Example 1:

<p id="section">here is the initial paragraph text</p>
Document structure: the text inside the p element, "here is the initial paragraph text", forms a text node. It is the only child of the p element and can be accessed through firstChild.

[Figure: diagram reflecting the document structure]

JavaScript for accessing the text node:

alert(document.getElementById("section").firstChild.nodeValue);
Note: the nodeValue property returns the text content of a text node, but it cannot be used this way on an element node. For example, alert(document.getElementById("section").nodeValue); does not return the text, because nodeValue is null for an element.

Example 2 (a more complex document structure):

<p id="section">here is the paragraph's <b>initial</b> text</p>
Document structure: the p element has three children:

"here is the paragraph's " forms a text node.
<b>initial</b> forms an element node.
" text" forms a text node.

[Figure: diagram reflecting the document structure]

Here, document.getElementById("section").firstChild.nodeValue returns only "here is the paragraph's ". It does not include the <b>initial</b> element or the text that follows it.

You can also assign to nodeValue:

document.getElementById("section").firstChild.nodeValue = "<b>New</b> text";
The HTML markup in the string is not interpreted; the browser displays it as ordinary text.
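
Because firstChild.nodeValue covers only the first text node, reading all of the text in a paragraph like the one in Example 2 means visiting every text node below it. The sketch below shows one way to do that. collectText is a hypothetical helper, and the plain objects stand in for the parsed p element so that the logic can be followed without a browser. Browsers also provide a textContent property that returns similar results.

```javascript
// Concatenate the nodeValue of every text node at or below the given node.
function collectText(node) {
  if (node.nodeType === 3) return node.nodeValue; // 3 is the nodeType of a text node
  var text = "";
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    text += collectText(children[i]);
  }
  return text;
}

// Plain objects standing in for <p>here is the paragraph's <b>initial</b> text</p>.
var bold = { nodeType: 1, tagName: "B", childNodes: [{ nodeType: 3, nodeValue: "initial" }] };
var p = {
  nodeType: 1,
  tagName: "P",
  childNodes: [
    { nodeType: 3, nodeValue: "here is the paragraph's " },
    bold,
    { nodeType: 3, nodeValue: " text" }
  ]
};

console.log(collectText(p)); // prints "here is the paragraph's initial text"
```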

Creating a text node:

var new_textNode = document.createTextNode("newly added text");
The code above creates a new text node, but the node is not yet part of the document tree. To display it on the page, you must make it a child of a node in the document tree. Because a text node cannot have children, you cannot add it to another text node; you can only add it to an element node.

<p id="section">here is the initial paragraph text</p>
The following code creates a new text node and adds it to the end of the childNodes collection with the appendChild() method.

var new_textNode = document.createTextNode("newly added text");
var el = document.getElementById("section");
el.appendChild(new_textNode);
Deleting a text node:

var el = document.getElementById("section");
if (el.hasChildNodes())
  el.removeChild(el.lastChild);
The code first checks whether the parent node has any children, so that removeChild() is not called on a node that has none, which would cause an error.
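
The same check can be turned into a loop that removes every child of a node. The sketch below is only illustrative: removeAllChildren and makeParent are hypothetical names, and makeParent builds an array-backed stand-in so that the loop can be run outside a browser. On a real element, hasChildNodes(), lastChild, and removeChild() are provided by the DOM.

```javascript
// Remove every child of a node, one at a time, starting from the last one.
function removeAllChildren(node) {
  while (node.hasChildNodes()) {
    node.removeChild(node.lastChild);
  }
}

// Hypothetical array-backed stand-in for an element node.
function makeParent(children) {
  return {
    childNodes: children.slice(),
    hasChildNodes: function () { return this.childNodes.length > 0; },
    get lastChild() { return this.childNodes[this.childNodes.length - 1] || null; },
    removeChild: function (child) {
      var index = this.childNodes.indexOf(child);
      if (index === -1) throw new Error("not a child of this node");
      return this.childNodes.splice(index, 1)[0];
    }
  };
}

var parent = makeParent(["first text", "second text"]);
removeAllChildren(parent);
console.log(parent.childNodes.length); // prints 0
```

Checking hasChildNodes() on every pass means the loop also works on a node that starts out empty.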

Attribute
Attribute objects are associated with elements, but they are not child nodes in the document tree.

Accessing an attribute
You can define attributes in HTML tags:

<p id="section" myattribute="myvalue">here is the paragraph text</p>
Then use getAttribute() to obtain the value of the attribute.

alert(document.getElementById("section").getAttribute("myattribute"));
The returned value is "myvalue". You must use getAttribute() here rather than element.attributeName, because some browsers do not expose custom attributes as properties.

Creating an attribute
There are three ways to create a new attribute on an element.

Method 1:
var attr = document.createAttribute("myattribute");
attr.value = "myvalue";
var el = document.getElementById("section");
el.setAttributeNode(attr);

Method 2:
var el = document.getElementById("section");
el.setAttribute("myattribute", "myvalue");

Method 3:
var el = document.getElementById("section");
el.myattribute = "myvalue";
Note that in standards-compliant browsers the third form sets a JavaScript property on the element object rather than a markup attribute, so getAttribute("myattribute") will not see it.

Deleting an attribute
An attribute can also be deleted from an element. You can use removeAttribute(), or set element.attributeName to null.

Changing an attribute value
Example 1:

<p id="section" align="left">here is the paragraph text</p>
Use setAttribute() to change the value of an attribute.

document.getElementById("section").setAttribute("align", "right");
Example 2:

<p id="section" style="text-align: left;">here is the paragraph text</p>
Modify the text-align property in the element's style.

document.getElementById("section").style.textAlign = "right";
Note: textAlign in the DOM corresponds to the text-align property of the style. The basic rule is that if a style property name contains the "-" character, the DOM name drops the "-" and capitalizes the letter that follows it. For example, backgroundColor in the DOM corresponds to background-color.
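
The naming rule can be expressed as a short function. This is a minimal sketch: cssToDomName is a hypothetical helper, not part of the DOM, and it covers only the basic rule described above. A few names are exceptions; for example, the float property is exposed as cssFloat.

```javascript
// Convert a hyphenated CSS property name to its camel-cased DOM style name.
function cssToDomName(cssName) {
  return cssName.replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

console.log(cssToDomName("text-align"));       // prints "textAlign"
console.log(cssToDomName("background-color")); // prints "backgroundColor"
```

With such a helper, a style property chosen at run time could be set as el.style[cssToDomName("text-align")] = "right".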

 

This article is from the CSDN blog. If you reproduce it, please cite the source: http://blog.csdn.net/gang_gang_gang/archive/2008/11/29/3408127.aspx
