Web Crawler: XPath Learning (1)

Source: Internet
Author: User
Tags xslt xpath contains xquery

(1) Introduction:

XPath is a language for finding information in an XML document. It is used to navigate through the elements and attributes of the document.

XPath is a major element of the XSLT standard, and both XQuery and XPointer are built on XPath expressions.

An understanding of XPath is therefore the foundation of many advanced XML applications.

XPath stands for XML Path Language. It is a language for addressing parts of an XML document (XML is a subset of the Standard Generalized Markup Language, SGML). XPath is based on the tree structure of XML and provides the ability to find nodes in that tree. XPath was originally intended as a common syntax model shared by XPointer and XSL, but developers quickly adopted it as a small query language.

What is XPath

*XPath uses path expressions to navigate in XML documents

*XPath contains a library of standard functions

*XPath is a major element in XSLT

*XPath is a W3C standard

XPath path expressions

XPath uses path expressions to select a node or a set of nodes in an XML document. These path expressions look very much like the paths used in a traditional computer file system.

XPath Standard Functions

XPath 2.0 contains more than 100 built-in functions; XPath 1.0 defines a much smaller core library. There are functions for string values, numeric values, date and time comparison, node and QName manipulation, sequence manipulation, Boolean values, and more.

XPath is used in XSLT

XPath is a major element of the XSLT standard. Without knowledge of XPath, you cannot create XSLT documents.

Both XQuery and XPointer are built on XPath expressions. XQuery 1.0 and XPath 2.0 share the same data model and support the same functions and operators.

XPath is a W3C standard

XPath 1.0 became a W3C Recommendation on November 16, 1999.

XPath is designed for use by XSLT, XPointer, and other XML parsing software.

This article covers two versions, XPath 1.0 and XPath 2.0. XPath 1.0 was published in 1999, and XPath 2.0 became a standard in 2007.

(2) XPath nodes

In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document nodes (the document node is also called the root node).

Nodes

An XML document is treated as a tree of nodes. The root of the tree is called the document node or root node.

Take a look at the following XML document:

<?xml version="1.0" encoding="ISO-8859-1"?>

<bookstore>

<book>
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

</bookstore>

Examples of nodes in the XML document above:

<bookstore> (root element node)
<author>J K. Rowling</author> (element node)
lang="en" (attribute node)
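As a rough illustration, the node kinds above can be inspected from Python. The sketch below assumes Python's standard xml.etree.ElementTree module, which the article itself does not prescribe. ElementTree exposes attributes and text as properties of an element, not as separate node objects, so this is an approximation of the XPath node model.

```python
import xml.etree.ElementTree as ET

# Same sample document as above (the XML declaration is omitted for brevity).
XML = """<bookstore>
<book>
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
</bookstore>"""

root = ET.fromstring(XML)            # root element node: <bookstore>
author = root.find("book/author")    # element node: <author>
title = root.find("book/title")      # element node: <title>

root_tag = root.tag                  # name of the root element
author_text = author.text            # content of the text node inside <author>
lang = title.get("lang")             # value of the attribute node lang

print(root_tag, author_text, lang)
```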

Atomic values

An atomic value is a simple value, such as a string or a number, that has no parent and no children.

Examples of atomic values:

J K. Rowling
"en"

Items

An item is an atomic value or a node.

Node relationships

* Parent

Each element and attribute has one parent.

In the following example, the book element is the parent of the title, author, year, and price elements:

<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

* Children

An element node can have zero, one, or more children.

In the example above, the title, author, year, and price elements are the children of the book element.

* Siblings

Nodes that have the same parent.

In the example above, the title, author, year, and price elements are all siblings.

* Ancestors

A node's parent, the parent's parent, and so on.

In the following example, the ancestors of the title element are the book element and the bookstore element:

<bookstore>

<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

</bookstore>

* Descendants

A node's children, the children's children, and so on.

In the example above, the descendants of bookstore are the book, title, author, year, and price elements.
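The relationships described in this section can be explored with a short sketch. It assumes Python's standard xml.etree.ElementTree module. ElementTree elements do not keep a reference to their parent, so the child-to-parent map built here (the hypothetical name parent_of) is a workaround and not part of XPath itself.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book>
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
</bookstore>"""

root = ET.fromstring(XML)
book = root.find("book")
title = book.find("title")

# Build a child -> parent map, because elements have no parent pointer.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Children of book, and the parent of title.
children = [child.tag for child in book]
parent = parent_of[title].tag

# Siblings of title: the other children of the same parent.
siblings = [el.tag for el in parent_of[title] if el is not title]

# Ancestors of title: follow the parent map upward until the root is reached.
ancestors = []
node = title
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node.tag)

# Descendants of bookstore: every element below the root, in document order.
descendants = [el.tag for el in root.iter() if el is not root]

print(children, parent, siblings, ancestors, descendants)
```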

(3) XPath syntax

XPath uses path expressions to select a node or a set of nodes in an XML document. Nodes are selected by following a path or a sequence of steps.

XML example document

We will use this XML document in the following examples:

<?xml version="1.0" encoding="ISO-8859-1"?>

<bookstore>

<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>

<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>

</bookstore>

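Selecting nodes

XPath uses path expressions to select nodes in an XML document. A node is selected by following a path or steps.

The most useful path expressions are listed below:

Expression    Description
nodename      Selects the child elements of the current node that have this name
/             Selects from the root node
//            Selects matching nodes below the current node, no matter how deep they are
.             Selects the current node
..            Selects the parent of the current node
@             Selects attributes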

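Examples

Some path expressions and their results for the example document:

Path expression    Result
bookstore          Selects the bookstore child elements of the current node
/bookstore         Selects the root element bookstore
bookstore/book     Selects all book elements that are children of bookstore
//book             Selects all book elements, no matter where they are in the document
bookstore//book    Selects all book elements that are descendants of bookstore, at any depth
//@lang            Selects all attributes named lang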
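A few of these selections can be tried from Python. This is a minimal sketch that assumes the standard xml.etree.ElementTree module, which supports only a subset of XPath: paths are evaluated relative to an element, so //book is written .//book, and attribute selection such as //@lang has to be emulated.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(XML)   # root is the <bookstore> element

# bookstore/book: book elements that are children of bookstore.
books = root.findall("book")

# //book in full XPath; relative to the root element it is spelled .//book.
all_books = root.findall(".//book")

# //title, followed by reading the text of each match.
titles = [t.text for t in root.findall(".//title")]

# //@lang is not supported by ElementTree; select the elements that carry
# the attribute and read its value instead.
langs = [el.get("lang") for el in root.findall(".//*[@lang]")]

# .. selects the parent: each element that has a title child.
parents = [el.tag for el in root.findall(".//title/..")]

print(len(books), len(all_books), titles, langs, parents)
```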

Predicates

Predicates are used to find a specific node or a node that contains a specific value.

Predicates are always embedded in square brackets.

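Examples

Some path expressions with predicates and their results for the example document:

Path expression                       Result
/bookstore/book[1]                    Selects the first book element that is a child of bookstore
/bookstore/book[last()]               Selects the last book element that is a child of bookstore
/bookstore/book[last()-1]             Selects the second-to-last book element that is a child of bookstore
/bookstore/book[position()<3]         Selects the first two book elements that are children of bookstore
//title[@lang]                        Selects all title elements that have an attribute named lang
//title[@lang='eng']                  Selects all title elements whose lang attribute has the value "eng"
/bookstore/book[price>35.00]          Selects all book elements of bookstore whose price element has a value greater than 35.00
/bookstore/book[price>35.00]/title    Selects the title elements of those book elements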
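Predicates can be tried in the same way. The sketch below assumes Python's standard xml.etree.ElementTree module. It supports positional and attribute predicates, but not numeric comparisons such as price>35.00, so that filter is applied in Python code here, not inside the path expression.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(XML)   # root is the <bookstore> element

# /bookstore/book[1]: the first book (XPath positions start at 1, not 0).
first = root.find("book[1]/title").text

# /bookstore/book[last()]: the last book.
last = root.find("book[last()]/title").text

# //title[@lang='eng']: titles whose lang attribute equals "eng".
eng_titles = [t.text for t in root.findall(".//title[@lang='eng']")]

# /bookstore/book[price>35.00]/title: emulated with a Python filter,
# because ElementTree predicates cannot compare numbers.
expensive = [
    book.find("title").text
    for book in root.findall("book")
    if float(book.find("price").text) > 35.00
]

print(first, last, eng_titles, expensive)
```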

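Selecting unknown nodes

XPath wildcards can be used to select unknown XML elements:

Wildcard    Description
*           Matches any element node
@*          Matches any attribute node
node()      Matches any node of any kind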

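Examples

Some path expressions with wildcards and their results for the example document:

Path expression    Result
/bookstore/*       Selects all child elements of the bookstore element
//*                Selects all elements in the document
//title[@*]        Selects all title elements that have at least one attribute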
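The wildcard selections can be approximated as follows. The sketch assumes Python's standard xml.etree.ElementTree module, which understands the * element wildcard but not @* or node(), so the attribute check is done in Python.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(XML)   # root is the <bookstore> element

# /bookstore/*: every child element of bookstore.
children = [el.tag for el in root.findall("*")]

# //*: in full XPath this also matches bookstore itself; the relative
# ElementTree form .//* returns only the elements below the root.
below_root = [el.tag for el in root.findall(".//*")]

# //title[@*]: ElementTree cannot parse the @* wildcard, so the attribute
# dictionary of each title element is tested in Python.
titles_with_attrs = [t.text for t in root.iter("title") if t.attrib]

print(children, below_root, titles_with_attrs)
```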

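Selecting several paths

By using the | operator in a path expression, you can select several paths at once.

Examples:

Path expression                    Result
//book/title | //book/price        Selects all title and price elements of all book elements
//title | //price                  Selects all title and price elements in the document
/bookstore/book/title | //price    Selects all title elements of the book elements of bookstore, and all price elements in the document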
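Python's standard xml.etree.ElementTree module does not implement the | operator, so the sketch below only emulates //book/title | //book/price by filtering the children of each book in document order. A full XPath engine, such as the third-party lxml library, can evaluate the union expression directly; that library is mentioned here as an assumption about what a crawler might use, not something the article specifies.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(XML)   # root is the <bookstore> element

# Emulate //book/title | //book/price: keep the title and price children
# of every book element, preserving document order as an XPath union would.
wanted = {"title", "price"}
selected = [
    (el.tag, el.text)
    for book in root.iter("book")
    for el in book
    if el.tag in wanted
]

print(selected)
```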

 
