Use XPath to read XML files

Source: Internet
Author: User

Node matching paths with XPath

When transforming documents with XSL, the concept of matching is very important. In the template declaration <xsl:template match=""> and the template application statement <xsl:apply-templates select="">, the expression inside the quotation marks must locate nodes precisely. XPath provides the way to express that location.

In addition, you can use XPath on its own to search XML documents and locate nodes in them.

XPath was introduced so that a node can be found accurately when matching against the XML document tree. XPath can be compared to a file system path: a file path leads to the desired file according to certain rules, and in the same way the rules of XPath make it easy to find any node in the XML document tree.

Before introducing XPath's matching rules, let's look at some basic XPath concepts, starting with its data types. XPath has four data types:

Node set (node-set)
A node set is the set of nodes, returned by path matching, that meet the given conditions. Values of the other types cannot be converted to a node set.

Boolean (boolean)
A Boolean is the value returned by a condition test in a function or Boolean expression. As in general-purpose languages, it has two values: true and false. Boolean values can be converted to the number and string types.

String (string)
A string is a sequence of characters. XPath provides a series of string functions. A string can be converted to the number or boolean type.

Number (number)
In XPath a number is a floating-point value, specifically a 64-bit double-precision floating-point number. It also includes special values: NaN (Not-a-Number), positive infinity, negative infinity, and positive and negative zero. An integer value can be obtained from a number through functions such as floor(), ceiling(), and round(). A number can also be converted to the boolean and string types.

The last three data types resemble their counterparts in other programming languages; only the first is specific to the XML document tree. Because XPath consists of a series of operations on the document tree, it is also necessary to understand XPath's node types. Following the logical structure of XML, a document can contain elements, CDATA sections, comments, processing instructions, and other constructs, and elements can carry attributes, some of which declare namespaces. Correspondingly, XPath divides nodes into seven types:

Root node
The root node is the top of the tree, and it is unique. All element nodes in the tree are its children or descendants. The root node is processed in the same way as other nodes. In XSLT, tree matching always starts from the root node.

Element node
There is an element node for every element in the document. The children of an element node can be element nodes, comment nodes, processing instruction nodes, and text nodes. An element node can be given a unique id.
Each element node has an expanded name, which consists of two parts: a namespace URI and a local name.

Text node
A text node holds a run of character data, including the characters inside CDATA sections. A text node is never immediately adjacent to a sibling text node, and a text node has no expanded name.

Attribute node
Each element node has an associated set of attribute nodes. The element is the parent of each of its attribute nodes, but an attribute node is not a child of its parent element. In other words, the element can be reached from the attribute through the parent relationship, but the attribute cannot be reached by matching the element's children: the relationship is one-way. Attribute nodes are also never shared, that is, different element nodes do not have the same attribute node.
Defaulted attributes are treated the same as attributes specified in the document. If an attribute is declared in the DTD as #IMPLIED and is not specified on the element, the element's attribute set does not contain a node for it.
In addition, there are no attribute nodes for the attributes that declare namespaces. Namespace declarations correspond to another type of node.

Namespace node
Each element node has an associated set of namespace nodes. In XML documents, namespaces are declared through reserved attributes, so in XPath these nodes behave much like attribute nodes: their relationship with the parent element is one-way, and they are not shared between elements.

Processing instruction node
There is a processing instruction node for each processing instruction in the XML document. It also has an expanded name: the local name is the target of the processing instruction, and the namespace part is empty.

Comment node
There is a comment node for each comment in the document.

Next, we construct an XML document tree:

<A id="a1">
  <B id="b1">
    <C id="c1">
      <B name="B"/>
      <D id="d1"/>
      <E id="e1"/>
      <E id="e2"/>
    </C>
  </B>
  <B id="b2"/>
  <C id="c2">
    <B/>
    <D id="d2"/>
    <F/>
  </C>
  <E/>
</A>
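
To make the shape of this tree easier to see, the following is a minimal sketch that loads the sample document with Python's standard xml.etree.ElementTree module and prints one indented line per element. The names SAMPLE and outline are chosen for this illustration only and are not part of XPath.

```python
import xml.etree.ElementTree as ET

# The sample document from the article.
SAMPLE = """<A id="a1">
  <B id="b1">
    <C id="c1">
      <B name="B"/>
      <D id="d1"/>
      <E id="e1"/>
      <E id="e2"/>
    </C>
  </B>
  <B id="b2"/>
  <C id="c2">
    <B/>
    <D id="d2"/>
    <F/>
  </C>
  <E/>
</A>"""


def outline(element, depth=0):
    """Return one indented line per element, in document order."""
    attrs = " ".join(f'{key}="{value}"' for key, value in element.attrib.items())
    lines = ["  " * depth + f"{element.tag} {attrs}".rstrip()]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)  # root is the A element
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The printed outline lists 13 elements, which is the set that the expression //* matches in the examples that follow.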

Now we can use XPath to apply some basic methods of matching nodes in this XML document.

Path matching
Path matching resembles a file path expression and is easy to understand. It uses the following symbols:

Symbol
Meaning
Example
Matching result

/
Separates the steps of a node path
/A/C/D
The D child of the C child of node A, that is, the D element whose id is d2

/
On its own, the root node

//
All elements whose path ends with the sub-path given after "//"
//E
All E elements; the result is all three E elements

//C/E
All E elements whose parent is a C element, that is, the E elements whose ids are e1 and e2

*
Path wildcard
/A/B/C/*
All child elements under A → B → C, that is, the B element whose name is B, the D element whose id is d1, and the two E elements whose ids are e1 and e2

/*/*/D
All D elements that have exactly two levels of elements above them; the result is the D element whose id is d2

//*
All elements

|
Union ("or") of two node sets
//B | //C
All B and C elements
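
The path expressions above can be tried out with a short sketch. It assumes Python's standard xml.etree.ElementTree module, which implements only a subset of XPath: paths are written relative to the parsed root element (so ./C/D stands for /A/C/D), and the "|" operator is not available, so the union is emulated by filtering on tag names. The helper names label and select are made up for this example.

```python
import xml.etree.ElementTree as ET

# The sample document from the article.
SAMPLE = """<A id="a1">
  <B id="b1">
    <C id="c1">
      <B name="B"/> <D id="d1"/> <E id="e1"/> <E id="e2"/>
    </C>
  </B>
  <B id="b2"/>
  <C id="c2"> <B/> <D id="d2"/> <F/> </C>
  <E/>
</A>"""


def label(element):
    """Render an element as its tag plus attributes, e.g. 'D[id=d2]'."""
    attrs = ",".join(f"{key}={value}" for key, value in element.attrib.items())
    return f"{element.tag}[{attrs}]" if attrs else element.tag


root = ET.fromstring(SAMPLE)  # root is the A element


def select(path):
    """Evaluate an ElementTree path relative to A and label the matches."""
    return [label(element) for element in root.findall(path)]


print(select("./C/D"))    # /A/C/D    -> ['D[id=d2]']
print(select(".//E"))     # //E       -> ['E[id=e1]', 'E[id=e2]', 'E']
print(select(".//C/E"))   # //C/E     -> ['E[id=e1]', 'E[id=e2]']
print(select("./B/C/*"))  # /A/B/C/*  -> the B, D and two E children of C
print(select("./*/D"))    # /*/*/D    -> ['D[id=d2]']

# //* : root.iter() walks the root element and all of its descendants.
all_elements = [label(element) for element in root.iter()]

# //B | //C : emulated by filtering on the tag name, in document order.
b_or_c = [label(element) for element in root.iter() if element.tag in ("B", "C")]
print(b_or_c)
```

A full XPath 1.0 engine would accept the expressions exactly as written in the article; the rewritten relative paths are only needed because of ElementTree's restrictions.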


Position matching
The child elements of each element are ordered, so they can be matched by position. For example:

Example
Meaning
Matching result

/A/B/C/*[1]
The first child element under A → B → C
The B element whose name is B

/A/B/C/*[last()]
The last child element under A → B → C
The E element whose id is e2

/A/B/C/*[position() > 1]
The child elements under A → B → C whose position is greater than 1
The D element whose id is d1 and the two E elements whose ids are e1 and e2
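
Position matching can be sketched in Python as well. This example assumes the standard xml.etree.ElementTree module, whose position predicates are documented to work only after a tag name, so selecting "the n-th child of any name" is emulated here with ordinary list indexing on the child list of C. The helper name label is made up for this example.

```python
import xml.etree.ElementTree as ET

# The sample document from the article.
SAMPLE = """<A id="a1">
  <B id="b1">
    <C id="c1">
      <B name="B"/> <D id="d1"/> <E id="e1"/> <E id="e2"/>
    </C>
  </B>
  <B id="b2"/>
  <C id="c2"> <B/> <D id="d2"/> <F/> </C>
  <E/>
</A>"""


def label(element):
    """Render an element as its tag plus attributes, e.g. 'D[id=d2]'."""
    attrs = ",".join(f"{key}={value}" for key, value in element.attrib.items())
    return f"{element.tag}[{attrs}]" if attrs else element.tag


root = ET.fromstring(SAMPLE)  # root is the A element

# All child elements under A -> B -> C, in document order.
children = root.findall("./B/C/*")

# XPath positions start at 1, Python list indexes start at 0.
first = label(children[0])                                # first child
last = label(children[-1])                                # last child
after_first = [label(child) for child in children[1:]]    # position() > 1

# With a tag name, ElementTree's own position predicates can be used.
first_e = [label(e) for e in root.findall("./B/C/E[1]")]
last_e = [label(e) for e in root.findall("./B/C/E[last()]")]

print(first, last, after_first, first_e, last_e)
```

Note the off-by-one difference: the first child is position 1 in XPath but index 0 in the Python list.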

Attributes and attribute values
In XPath, attributes and attribute values can be used to match elements. Note that an attribute name must be prefixed with "@". For example:

Example
Meaning
Matching result

//B[@id]
All B elements that have an id attribute
The two B elements whose ids are b1 and b2

//B[@*]
All B elements that have at least one attribute
The two B elements with an id attribute and the B element with a name attribute

//B[not(@*)]
All B elements that have no attributes
The B element under A → C, that is, under the C element whose id is c2

//B[@id="b1"]
The B element whose id is b1
The first B element under element A
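
The attribute tests can be sketched as follows, again assuming Python's standard xml.etree.ElementTree module. Its XPath subset supports [@id] and [@id='b1'] directly; the wildcard forms [@*] and [not(@*)] are not part of that subset, so they are emulated by checking whether the element's attribute dictionary is empty. The helper names label and select are made up for this example.

```python
import xml.etree.ElementTree as ET

# The sample document from the article.
SAMPLE = """<A id="a1">
  <B id="b1">
    <C id="c1">
      <B name="B"/> <D id="d1"/> <E id="e1"/> <E id="e2"/>
    </C>
  </B>
  <B id="b2"/>
  <C id="c2"> <B/> <D id="d2"/> <F/> </C>
  <E/>
</A>"""


def label(element):
    """Render an element as its tag plus attributes, e.g. 'D[id=d2]'."""
    attrs = ",".join(f"{key}={value}" for key, value in element.attrib.items())
    return f"{element.tag}[{attrs}]" if attrs else element.tag


root = ET.fromstring(SAMPLE)  # root is the A element


def select(path):
    """Evaluate an ElementTree path relative to A and label the matches."""
    return [label(element) for element in root.findall(path)]


with_id = select(".//B[@id]")          # B elements that have an id attribute
exactly_b1 = select(".//B[@id='b1']")  # B elements whose id equals b1

# Emulated: B elements with at least one attribute, and with none.
any_attribute = [label(b) for b in root.iter("B") if b.attrib]
no_attribute = [label(b) for b in root.iter("B") if not b.attrib]

print(with_id, exactly_b1, any_attribute, no_attribute)
```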

Kinship matching
An XML document forms a tree, so no node is isolated. The relationships between nodes are usually described in terms of kinship: parent, child, ancestor, descendant, and sibling. These concepts can also be used to match elements. For example:

Example
Meaning
Matching result

//E/parent::*
The parent elements of all E elements
The A element whose id is a1 and the C element whose id is c1

//F/ancestor::*
The ancestor elements of all F elements
The A element whose id is a1 and the C element whose id is c2

/A/child::*
The child elements of A
The B elements whose ids are b1 and b2, the C element whose id is c2, and the E element without any attribute

/A/descendant::*
All descendant elements of A
All elements except A itself

//F/self::*
The F elements themselves
The F element itself

//F/ancestor-or-self::*
All F elements and their ancestor elements
The F element, its parent C element, and the A element

/A/C/descendant-or-self::*
The C elements under A and all of their descendant elements
The C element whose id is c2 and its child elements B, D, and F

/A/C/following-sibling::*
All sibling elements that follow the C element under A
The E element without any attribute

/A/C/preceding-sibling::*
All sibling elements that precede the C element under A
The two B elements whose ids are b1 and b2

/A/B/C/following::*
All elements that come after the C element under A → B in document order
The B element whose id is b2, the C element whose id is c2, the B element without attributes, the D element whose id is d2, the F element without attributes, and the E element without attributes

/A/C/preceding::*
All elements that come before the C element under A in document order, excluding its ancestors
The B element whose id is b2, the E element whose id is e2, the E element whose id is e1, the D element whose id is d1, the B element whose name is B, the C element whose id is c1, and the B element whose id is b1
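
Python's standard xml.etree.ElementTree module does not implement XPath axes such as ancestor:: or following::, so the following sketch emulates a few of them by hand to show what each axis selects. It is an illustration under that assumption, not how an XPath engine is implemented, and the helper names (label, names, ancestors, siblings, following, preceding) are made up for this example.

```python
import xml.etree.ElementTree as ET

# The sample document from the article.
SAMPLE = """<A id="a1">
  <B id="b1">
    <C id="c1">
      <B name="B"/> <D id="d1"/> <E id="e1"/> <E id="e2"/>
    </C>
  </B>
  <B id="b2"/>
  <C id="c2"> <B/> <D id="d2"/> <F/> </C>
  <E/>
</A>"""


def label(element):
    """Render an element as its tag plus attributes, e.g. 'D[id=d2]'."""
    attrs = ",".join(f"{key}={value}" for key, value in element.attrib.items())
    return f"{element.tag}[{attrs}]" if attrs else element.tag


def names(elements):
    return [label(element) for element in elements]


root = ET.fromstring(SAMPLE)  # root is the A element
# ElementTree elements do not know their parent, so build a lookup table.
parent_of = {child: parent for parent in root.iter() for child in parent}
document_order = list(root.iter())


def ancestors(element):
    """Emulate ancestor::* (nearest ancestor first)."""
    found = []
    while element in parent_of:
        element = parent_of[element]
        found.append(element)
    return found


def siblings(element, following_side):
    """Emulate following-sibling::* or preceding-sibling::*."""
    parent = parent_of.get(element)
    if parent is None:
        return []
    family = list(parent)
    index = family.index(element)
    return family[index + 1:] if following_side else family[:index]


def following(element):
    """Emulate following::* (later in document order, descendants excluded)."""
    inside = set(element.iter())
    start = document_order.index(element) + 1
    return [e for e in document_order[start:] if e not in inside]


def preceding(element):
    """Emulate preceding::* (earlier in document order, ancestors excluded)."""
    above = set(ancestors(element))
    end = document_order.index(element)
    return [e for e in document_order[:end] if e not in above]


f = root.find(".//F")
c1 = root.find("./B/C")  # the C element under A -> B
c2 = root.find("./C")    # the C element directly under A

# //E/parent::* : each distinct parent once, in order of first appearance.
e_parents = list(dict.fromkeys(parent_of[e] for e in root.iter("E")))

print(names(e_parents))
print(names(ancestors(f)))        # //F/ancestor::*
print(names(siblings(c2, True)))  # /A/C/following-sibling::*
print(names(siblings(c2, False))) # /A/C/preceding-sibling::*
print(names(following(c1)))       # /A/B/C/following::*
print(names(preceding(c2)))       # /A/C/preceding::*
```

The preceding list is printed in document order here, whereas the table above lists the same seven elements from nearest to farthest.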

Condition matching
Condition matching uses the Boolean results of functions to select the nodes that meet a condition. The functions commonly used for this fall into four groups: node functions, string functions, numeric functions, and Boolean functions. The last() and position() functions used above are examples. These functions help us find the desired node precisely.

Function
Purpose

count()
Returns the number of nodes that meet the conditions

number()
Converts its argument, for example the text of an attribute value, to a number

substring()
Syntax: substring(value, start, length)
Extracts part of a string

sum()
Returns the sum of the numeric values of the nodes in a node set
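
To give a feel for what these four functions compute, here is a rough Python sketch that emulates them on a small, hypothetical order document (the element and attribute names are invented for this example and do not come from the sample tree). Python's standard xml.etree.ElementTree module does not evaluate XPath functions, so xpath_number and xpath_substring are hand-written approximations that only cover simple cases, such as integer arguments to substring().

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical data, not from the article: three items with numeric prices.
ORDER = """<order>
  <item name="notebook" price="2.50"/>
  <item name="pencil case" price="4.00"/>
  <item name="eraser" price="1.25"/>
</order>"""


def xpath_number(text):
    """Roughly emulate number(): text that is not a number becomes NaN."""
    try:
        return float(text.strip())
    except ValueError:
        return math.nan


def xpath_substring(value, start, length=None):
    """Emulate substring() for integer arguments; positions start at 1."""
    begin = max(start, 1)
    if length is None:
        return value[begin - 1:]
    end = max(start + length, begin)  # first position that is NOT included
    return value[begin - 1:end - 1]


items = ET.fromstring(ORDER).findall("./item")

item_count = len(items)                                    # count(//item)
first_price = xpath_number(items[0].get("price"))          # number(//item[1]/@price)
total = sum(xpath_number(i.get("price")) for i in items)   # sum(//item/@price)
prefix = xpath_substring(items[0].get("name"), 1, 4)       # substring(//item[1]/@name, 1, 4)

print(item_count, first_price, total, prefix)
```

As with position(), substring() counts characters from 1, so substring("12345", 2, 3) yields "234".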


These functions are only a small part of XPath. Many other functions have not been introduced here, and XPath itself is still evolving. With these functions we can build more complex queries and operations.

Of the matching methods above, path matching is the one used most often: it locates nodes by a given sub-path relative to the current path.
