Using XPath and BeautifulSoup in Python

Source: Internet
Author: User

XPath basics

XPath syntax: a path expression selects a node or a set of nodes in an XML or HTML document.

Path expressions

nodename: selects the child elements of the current node that have this name

/: selects starting from the root node

//: selects matching nodes anywhere in the document, regardless of their position

.: selects the current node

..: selects the parent of the current node

@: selects attributes
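
The basic path expressions above can be tried out with lxml. This is a minimal sketch: the small classroom document and the variable names are assumptions made up for illustration, not the file used by the original article.

```python
from lxml import etree

# Hypothetical sample document, invented for this illustration.
doc = etree.XML(
    "<classroom>"
    "<student><name lang='en'>Alice</name><age>19</age></student>"
    "<student><name lang='zh'>Bob</name><age>22</age></student>"
    "<teacher><name>Carol</name></teacher>"
    "</classroom>"
)

# "/" starts at the root: name children of student children of classroom.
root_names = doc.xpath("/classroom/student/name/text()")

# "//" matches at any depth, so the teacher's name is included as well.
all_names = doc.xpath("//name/text()")

# "@" selects attributes; here the lang attribute of every name element.
langs = doc.xpath("//name/@lang")

# "." is the current node and ".." is its parent.
first_name = doc.xpath("//name")[0]
self_tag = first_name.xpath(".")[0].tag
parent_tag = first_name.xpath("..")[0].tag

# A bare node name is relative to the context node (here, classroom).
children = [el.tag for el in doc.xpath("student")]

print(root_names, all_names, langs, self_tag, parent_tag, children)
```

Note that `xpath()` always returns a list, even when only one node matches, which is why the sketch indexes with `[0]`.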

Predicate examples

A predicate is written in square brackets and narrows a selection to the nodes that satisfy a condition.

Effect: path expression

Select the first student element that is a child of the classroom element: /classroom/student[1]

Select the last student element that is a child of the classroom element: /classroom/student[last()]

Select the penultimate student element that is a child of the classroom element: /classroom/student[last()-1]

Select the first two student elements that are children of the classroom element: /classroom/student[position()<3]

Select all name elements that have an attribute named lang: //name[@lang]

Select all name elements whose lang attribute has the value en: //name[@lang='en']

Select all student elements of the classroom element whose age element has a value greater than 20: /classroom/student[age>20]

Select the name elements of all student elements of the classroom element whose age element has a value greater than 20: /classroom/student[age>20]/name
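
The predicates above can be checked with a short lxml script. This is a sketch under assumptions: the three-student document and the helper `names` are hypothetical and exist only for this illustration.

```python
from lxml import etree

# Hypothetical sample document, invented for this illustration.
doc = etree.XML(
    "<classroom>"
    "<student><name lang='en'>Alice</name><age>19</age></student>"
    "<student><name lang='zh'>Bob</name><age>22</age></student>"
    "<student><name>Chen</name><age>25</age></student>"
    "</classroom>"
)


def names(expr):
    """Return the name text of every student element matched by expr."""
    return [student.findtext("name") for student in doc.xpath(expr)]


# Positions in XPath are 1-based, not 0-based.
first = names("/classroom/student[1]")
last = names("/classroom/student[last()]")
penultimate = names("/classroom/student[last()-1]")
first_two = names("/classroom/student[position()<3]")

# Attribute tests: any lang attribute, then a specific value.
with_lang = doc.xpath("//name[@lang]/text()")
english = doc.xpath("//name[@lang='en']/text()")

# Comparing a child element's value inside the predicate.
over_20 = doc.xpath("/classroom/student[age>20]/name/text()")

print(first, last, penultimate, first_two, with_lang, english, over_20)
```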

The wildcard "*" and the "|" operator

Effect: path expression

Select all child elements of the classroom element: /classroom/*

Select all elements in the document: //*

Select all name elements that have at least one attribute: //name[@*]

Select all name and age elements of student elements: //student/name | //student/age

Select all name elements of student elements that belong to the classroom element, plus all age elements in the document: /classroom/student/name | //age
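
A minimal sketch of the wildcard and union expressions above, again with lxml. The sample document is an assumption made up for this example.

```python
from lxml import etree

# Hypothetical sample document, invented for this illustration.
doc = etree.XML(
    "<classroom>"
    "<student><name lang='en'>Alice</name><age>19</age></student>"
    "<teacher><name>Carol</name><age>41</age></teacher>"
    "</classroom>"
)

# "*" matches any element: direct children of classroom.
child_tags = [el.tag for el in doc.xpath("/classroom/*")]

# "//*" matches every element in the document, including the root.
all_tags = [el.tag for el in doc.xpath("//*")]

# "@*" matches any attribute, so this keeps only names that have one.
with_attr = doc.xpath("//name[@*]/text()")

# "|" combines the results of two complete path expressions.
student_fields = [el.tag for el in doc.xpath("//student/name | //student/age")]
mixed = [el.text for el in doc.xpath("/classroom/student/name | //age")]

print(child_tags, len(all_tags), with_attr, student_fields, mixed)
```

Each side of `|` must be a full path expression. Writing `//student/name | age` would evaluate `age` relative to the context node instead of searching under student.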

The syntax of an XPath axis step is axisname::nodetest[predicate]

Axis name: meaning

child: selects all children of the current node

parent: selects the parent of the current node

ancestor: selects all ancestors of the current node (parent, grandparent, and so on)

ancestor-or-self: selects all ancestors of the current node and the current node itself

descendant: selects all descendants of the current node

descendant-or-self: selects all descendants of the current node and the current node itself

preceding: selects all nodes that appear before the start tag of the current node in the document, excluding its ancestors

following: selects all nodes that appear after the end tag of the current node in the document

preceding-sibling: selects all siblings before the current node

following-sibling: selects all siblings after the current node

self: selects the current node

attribute: selects all attributes of the current node

namespace: selects all namespace nodes of the current node

XPath axis examples

Effect: path expression

Select the teacher child element of the classroom node: /classroom/child::teacher

Select the parent node of every id node: //id/parent::*

Select all ancestors of every classid node: //classid/ancestor::*

Select all descendants of the classroom node: /classroom/descendant::*

Select all id elements that are descendants of a student node: //student/descendant::id

Select every classid element together with all of its ancestors: //classid/ancestor-or-self::*

Select /classroom/student itself and all of its descendant elements: /classroom/student/descendant-or-self::*

Select all siblings before /classroom/teacher; here the result is all the student nodes: /classroom/teacher/preceding-sibling::*

Select all siblings after the second student in /classroom: /classroom/student[2]/following-sibling::*

Select all nodes before /classroom/teacher, excluding its ancestors; this returns not only the student nodes but also their child nodes: /classroom/teacher/preceding::*

Select all nodes after the second student in /classroom; here the result is the teacher node and its child nodes: /classroom/student[2]/following::*

Select the student nodes themselves; used on its own this has little value: //student/self::*

Select all attributes of the /classroom/teacher/name node: /classroom/teacher/name/attribute::*
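
Several of the axis expressions above can be compared side by side with lxml. This is a sketch only: the document layout (two students followed by a teacher) is an assumption chosen to match the descriptions in the table, and the helper `tags` is hypothetical.

```python
from lxml import etree

# Hypothetical sample document, invented for this illustration.
doc = etree.XML(
    "<classroom>"
    "<student><id>1</id><name>Alice</name></student>"
    "<student><id>2</id><name>Bob</name></student>"
    "<teacher><classid>7</classid><name lang='en'>Carol</name></teacher>"
    "</classroom>"
)


def tags(expr):
    """Return the tag name of every element matched by expr."""
    return [el.tag for el in doc.xpath(expr)]


child = tags("/classroom/child::teacher")
parents = tags("//id/parent::*")
ancestors = tags("//classid/ancestor::*")
ancestors_or_self = tags("//classid/ancestor-or-self::*")
subtree = tags("/classroom/student[1]/descendant-or-self::*")

# Sibling axes stay on the same level; preceding/following also
# include the children of the nodes they pass over.
siblings_before = tags("/classroom/teacher/preceding-sibling::*")
siblings_after = tags("/classroom/student[2]/following-sibling::*")
all_before = tags("/classroom/teacher/preceding::*")
all_after = tags("/classroom/student[2]/following::*")

# The attribute axis returns attribute values as strings in lxml.
attrs = doc.xpath("/classroom/teacher/name/attribute::*")

for label, value in [
    ("child", child), ("parents", parents), ("ancestors", ancestors),
    ("siblings_before", siblings_before), ("all_before", all_before),
    ("all_after", all_after), ("attrs", attrs),
]:
    print(label, value)
```

The difference between `preceding-sibling` and `preceding` shows up in the counts: the first yields only the two student elements, while the second also yields their id and name children.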

XPath operator examples

Meaning: example

Select all student elements of the classroom element whose age element equals 20, using +, *, - and div: /classroom/student[age=19+1], /classroom/student[age=5*4], /classroom/student[age=21-1], /classroom/student[age=40 div 2]

The comparison operators greater than (>), less than (<) and not equal to (!=) can be used in the same way.

or: /classroom/student[age<20 or age>25] selects students whose age is less than 20 or greater than 25

and: /classroom/student[age>20 and age<25] selects students whose age is between 20 and 25

mod: computes the remainder of a division, for example 5 mod 2 is 1
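
The operators above can be exercised inside predicates with lxml. This is a minimal sketch; the four-student document and the helper `names` are assumptions introduced for illustration.

```python
from lxml import etree

# Hypothetical sample document, invented for this illustration.
doc = etree.XML(
    "<classroom>"
    "<student><name>Alice</name><age>19</age></student>"
    "<student><name>Bob</name><age>20</age></student>"
    "<student><name>Chen</name><age>23</age></student>"
    "<student><name>Dana</name><age>26</age></student>"
    "</classroom>"
)


def names(predicate):
    """Return the names of the students that satisfy the predicate."""
    return doc.xpath(f"/classroom/student[{predicate}]/name/text()")


# Four arithmetic spellings of "age equals 20".
plus = names("age=19+1")
times = names("age=5*4")
minus = names("age=21-1")
div = names("age=40 div 2")  # div and mod need surrounding spaces

# Comparison and boolean operators.
not_20 = names("age!=20")
outside = names("age<20 or age>25")
between = names("age>20 and age<25")

# mod gives the remainder, so this keeps the even ages.
even = names("age mod 2 = 0")

print(plus, times, minus, div, not_20, outside, between, even)
```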

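Example code

    from lxml import etree

    # Read the file as bytes so that lxml can handle any XML encoding
    # declaration itself; passing a decoded str that still carries such
    # a declaration to etree.XML raises ValueError.
    with open(r'Xpathtext.xml', 'rb') as content_stream:
        raw = content_stream.read()

    root = etree.XML(raw)
    print(raw.decode('utf-8'))
    print('-------')

    # All elements that come after the end tag of the second student.
    em = root.xpath('/classroom/student[2]/following::*')
    print(em[0].xpath('./name/text()'))  # text content of the name element
    print(em[0].xpath('./name/@lang'))   # value of the lang attribute on name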

BeautifulSoup Basic Knowledge
