1. Summary of Python re (regular expression) usage
(1) ^ anchors the match at the start of the string, e.g. '^g' matches a string that starts with g. A dot (.) matches any single character, e.g. '^g.d' matches a string that starts with g, has any second character, and has d as its third character. An asterisk (*) means the preceding character may appear any number of times, including zero.
import re

line = 'bobby123'
regex_str = '^b.*'  # ^ start of string, . any character, * any number of times: any string that starts with b
if re.match(regex_str, line):
    print('Yes')
(2) $ anchors the match at the end of the string, so the string must end with the character before it.
lines = 'bobby123'
regex_strs = '^b.*3$'  # starts with b, followed by any characters any number of times, ends with 3
if re.match(regex_strs, lines):
    print('Yes')
The result is: 'Yes'
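The two anchors and the wildcard can be checked together in a short sketch. The helper name `matches` and the sample strings other than 'bobby123' are made up for illustration.

```python
import re

def matches(pattern, text):
    """Return True if pattern matches from the beginning of text."""
    return re.match(pattern, text) is not None

# '^g.d': starts with g, any second character, d as the third character
print(matches('^g.d', 'god'))         # True
print(matches('^g.d', 'good'))        # False: the third character is 'o'

# '^b.*3$': starts with b, any characters in between, ends with 3
print(matches('^b.*3$', 'bobby123'))  # True
print(matches('^b.*3$', 'bobby124'))  # False: the string ends with 4
```

Note that re.match already starts at the beginning of the string, so ^ is redundant here; it matters when the same pattern is used with re.search.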
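(3) ? placed after a quantifier selects non-greedy matching. By default a quantifier is greedy: it consumes as many characters as possible and backtracks only when the rest of the pattern cannot match. A non-greedy quantifier consumes as few characters as possible, so the two modes can give different results on the same string. + means the preceding character appears at least once, and () extracts the matched substring as a group.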
# Non-greedy mode adds ? after the quantifier; greedy mode simply leaves it out
lines = 'boooooooooooooobby123'
regex_strs = '.*?(b.*?b).*'  # () extracts a substring
mat = re.match(regex_strs, lines)
print(mat.group(1))
The result is: 'boooooooooooooob'
Greedy matching example:
# + means the character appears at least once
lines = 'boooooooooooooobbbbby123'
regex_strs = '.*(b.+b).*'  # () extracts a substring; with greedy matching and + the result is bbb
mat = re.match(regex_strs, lines)
print(mat.group(1))
The result is: 'bbb'
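To see the difference directly, the sketch below runs a greedy and a non-greedy pattern on the same string. The non-greedy pattern '.*?(b.+?b).*' is a variation written for this comparison and does not appear in the examples above.

```python
import re

line = 'boooooooooooooobbbbby123'

# Greedy: the leading .* consumes as much as it can, so the group
# is pushed as far to the right as possible.
greedy = re.match('.*(b.+b).*', line)
print(greedy.group(1))   # bbb

# Non-greedy: the leading .*? consumes as little as it can, and .+?
# stops at the first b that completes the group.
lazy = re.match('.*?(b.+?b).*', line)
print(lazy.group(1))     # the first b, all the o's, and the next b
```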
2. XPath syntax
XML example document
We will use this XML document in the examples below.
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>
Selecting nodes
XPath uses path expressions to select nodes in an XML document. A node is selected by following a path or a sequence of steps.
The most useful path expressions are listed below:
Expression | Description
nodename | Selects all child elements of the current node that are named nodename.
/ | Selects from the root node.
// | Selects nodes in the document, starting from the current node, that match the selection, regardless of their location.
. | Selects the current node.
.. | Selects the parent of the current node.
@ | Selects attributes.
Examples
In the table below, we have listed some path expressions and the results of the expressions:
Path expression | Result
bookstore | Selects all bookstore elements that are children of the current node.
/bookstore | Selects the root element bookstore. Note: a path that starts with a forward slash (/) is always an absolute path to an element.
bookstore/book | Selects all book elements that are children of bookstore.
//book | Selects all book elements, regardless of their position in the document.
bookstore//book | Selects all book elements that are descendants of the bookstore element, regardless of where they are located under bookstore.
//@lang | Selects all attributes named lang.
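As a rough sketch, some of these path expressions can be tried from Python with the standard-library xml.etree.ElementTree module. This is an assumption made for illustration: ElementTree supports only a subset of XPath, its paths are relative to the element you call them on (here the bookstore root), and it cannot return attribute nodes, so //@lang is emulated by reading the attribute from each element. The XML declaration is left out of the string for brevity.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(XML)   # root is the bookstore element
print(root.tag)             # bookstore

# 'book' relative to the root corresponds to bookstore/book
books = root.findall('book')
print(len(books))           # 2

# './/title' corresponds to //title: descendants at any depth
titles = [t.text for t in root.findall('.//title')]
print(titles)               # ['Harry Potter', 'Learning XML']

# Stand-in for //@lang: read the attribute from each title element
langs = [t.get('lang') for t in root.findall('.//title')]
print(langs)                # ['eng', 'eng']

# '..' selects the parent of each matched node
parents = [p.tag for p in root.findall('.//title/..')]
print(parents)              # ['book', 'book']
```

A library with full XPath 1.0 support would accept the expressions from the table unchanged.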
Predicates
Predicates are used to find a specific node or a node that contains a specific value.
Predicates are always written in square brackets.
Examples
In the table below, we list some path expressions with predicates, as well as the results of expressions:
Path expression | Result
/bookstore/book[1] | Selects the first book element that is a child of the bookstore element.
/bookstore/book[last()] | Selects the last book element that is a child of the bookstore element.
/bookstore/book[last()-1] | Selects the second-to-last book element that is a child of the bookstore element.
/bookstore/book[position()<3] | Selects the first two book elements that are children of the bookstore element.
//title[@lang] | Selects all title elements that have an attribute named lang.
//title[@lang='eng'] | Selects all title elements whose lang attribute has the value eng.
/bookstore/book[price>35.00] | Selects all book elements of the bookstore element whose price element has a value greater than 35.00.
/bookstore/book[price>35.00]/title | Selects all title elements of the book elements of the bookstore element whose price element has a value greater than 35.00.
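The predicates above can be sketched with the standard-library xml.etree.ElementTree module, again as an assumption for illustration. ElementTree understands positional and attribute predicates but not comparisons such as price>35.00 or position()<3, so those two are emulated in plain Python. The helper `title_of` is a made-up name.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book><title lang="eng">Harry Potter</title><price>29.99</price></book>
<book><title lang="eng">Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)   # root is the bookstore element

def title_of(book):
    """Return the text of the title child of a book element."""
    return book.findtext('title')

first = root.findall('book[1]')                  # positions start at 1
last = root.findall('book[last()]')
second_to_last = root.findall('book[last()-1]')  # with two books, this is the first one
print(title_of(first[0]))   # Harry Potter
print(title_of(last[0]))    # Learning XML

# Attribute predicates
with_lang = root.findall('.//title[@lang]')
eng_titles = root.findall(".//title[@lang='eng']")
print(len(with_lang), len(eng_titles))   # 2 2

# Emulating book[position()<3] and book[price>35.00] in Python
first_two = root.findall('book')[:2]
expensive = [b for b in root.findall('book')
             if float(b.findtext('price')) > 35.00]
print([title_of(b) for b in expensive])  # ['Learning XML']
```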
Selecting unknown nodes
XPath wildcards can be used to select unknown XML elements.
Wildcard | Description
* | Matches any element node.
@* | Matches any attribute node.
node() | Matches any node of any type.
Examples
In the table below, we list some path expressions and the results of these expressions:
Path expression | Result
/bookstore/* | Selects all child elements of the bookstore element.
//* | Selects all elements in the document.
//title[@*] | Selects all title elements that have at least one attribute.
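A small sketch of the wildcard examples, assuming the standard-library xml.etree.ElementTree module. ElementTree accepts * for elements but, as far as I know, not the [@*] predicate, so that case is emulated by checking each element's attribute dictionary.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book><title lang="eng">Harry Potter</title><price>29.99</price></book>
<book><title lang="eng">Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)   # root is the bookstore element

# '*' relative to the root corresponds to /bookstore/*
children = [e.tag for e in root.findall('*')]
print(children)      # ['book', 'book']

# './/*' selects every element below the root, in document order
descendants = [e.tag for e in root.findall('.//*')]
print(descendants)   # ['book', 'title', 'price', 'book', 'title', 'price']

# Stand-in for //title[@*]: keep title elements with a non-empty attrib dict
with_attrs = [t.text for t in root.iter('title') if t.attrib]
print(with_attrs)    # ['Harry Potter', 'Learning XML']
```

Unlike //*, './/*' does not include the root element itself; root.iter() would.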
Selecting several paths
By using the | operator in a path expression, you can select several paths at once.
Examples
In the table below, we list some path expressions and the results of these expressions:
Path expression | Result
//book/title | //book/price | Selects all the title and price elements of all book elements.
//title | //price | Selects all the title and price elements in the document.
/bookstore/book/title | //price | Selects all the title elements of the book elements of the bookstore element, and all the price elements in the document.
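The standard-library xml.etree.ElementTree module has no | operator, so the sketch below only emulates //title | //price by walking the tree and keeping the wanted tags. This is an illustration under that assumption, not how a full XPath engine evaluates the expression. The name `wanted` is made up.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
<book><title lang="eng">Harry Potter</title><price>29.99</price></book>
<book><title lang="eng">Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)   # root is the bookstore element

# Stand-in for //title | //price: one pass over the tree keeps document order
wanted = {'title', 'price'}
selected = [e for e in root.iter() if e.tag in wanted]

print([e.tag for e in selected])    # ['title', 'price', 'title', 'price']
print([e.text for e in selected])   # ['Harry Potter', '29.99', 'Learning XML', '39.95']
```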
Source for the XPath section: W3School, http://www.w3school.com.cn/xpath/xpath_syntax.asp