This article introduces the usage of PyQuery in Python. pyquery is a Python library that implements a jQuery-like API, and it can be used to parse the content of HTML web pages. Usage:
The code is as follows:
from pyquery import PyQuery as pq
1. You can load an HTML string, an HTML file, or a URL, for example:
The code is as follows:
d = pq("<html><title>hello</title></html>")
d = pq(filename=path_to_html_file)
d = pq(url='http://www.baidu.com')  # the URL must be written in full, including the scheme
2. html() and text() -- obtain the corresponding HTML block or text block, for example:
The code is as follows:
p = pq("<head><title>hello</title></head>")
p('head').html()  # returns <title>hello</title>
p('head').text()  # returns hello
3. Obtain elements by HTML tag name, for example:
The code is as follows:
d = pq('<div><p>test 1</p><p>test 2</p></div>')
d('p')  # returns [<p>, <p>]
print(d('p'))  # prints <p>test 1</p><p>test 2</p>
print(d('p').html())  # prints test 1
Note: when more than one element is matched, html() returns the content of the first element only, while text() returns the text of all matched elements joined by spaces.
4. eq(index) -- obtain the element at the given index (counting from 0)
Continuing the previous example, to get the content of the second p tag:
The code is as follows:
print(d('p').eq(1).html())  # prints test 2
5. filter() -- narrow the matched elements by class name or id, for example:
The code is as follows:
d = pq("<div><p id='first'>test 1</p><p class='second'>test 2</p></div>")
d('p').filter('#first')  # returns [<p#first>]
d('p').filter('.second')  # returns [<p.second>]
6. find() -- search for nested elements, for example:
The code is as follows:
d = pq("<div><p id='first'>test 1</p><p class='second'>test 2</p></div>")
d('div').find('p')  # returns [<p#first>, <p.second>]
d('div').find('p').eq(0)  # returns [<p#first>]
7. Obtain elements directly by class name or id, for example:
The code is as follows:
d = pq("<div><p id='first'>test 1</p><p class='second'>test 2</p></div>")
d('#first').html()  # returns test 1
d('.second').html()  # returns test 2
8. Get an attribute value, for example:
The code is as follows:
d = pq("<p id='my_id'><a href='http://hello.com'>hello</a></p>")
d('a').attr('href')  # returns http://hello.com
d('p').attr('id')  # returns my_id
9. Modify an attribute value, for example:
The code is as follows:
d('a').attr('href', 'http://baidu.com')  # changes the href attribute to http://baidu.com
10. addClass(value) -- add a class to the element, for example:
The code is as follows:
d = pq('<div></div>')
d.addClass('my_class')  # returns [<div.my_class>]
11. hasClass(name) -- check whether the element has the given class, for example:
The code is as follows:
d = pq("<div class='my_class'></div>")
d.hasClass('my_class')  # returns True
12. children(selector=None) -- obtain the child elements, for example:
The code is as follows:
d = pq("<span><p id='1'>hello</p><p id='2'>world</p></span>")
d.children()  # returns [<p#1>, <p#2>]
d.children('#2')  # returns [<p#2>]
13. parents(selector=None) -- obtain the parent (ancestor) elements, for example:
The code is as follows:
d = pq("<span><p id='1'>hello</p><p id='2'>world</p></span>")
d('p').parents()  # returns [<span>]
d('#1').parents('span')  # returns [<span>]
d('#1').parents('p')  # returns []
14. clone() -- return a copy of the node
15. empty() -- remove the content of the node
16. nextAll(selector=None) -- return all sibling elements that follow the matched element, for example:
The code is as follows:
d = pq("<p id='1'>hello</p><p id='2'>world</p><img src='' />")
d('p:first').nextAll()  # returns [<p#2>, <img>]
d('p:last').nextAll()  # returns [<img>]
17. not_(selector) -- return the elements that do not match the selector, for example:
The code is as follows:
d = pq("<p id='1'>test 1</p><p id='2'>test 2</p>")
d('p').not_('#2')  # returns [<p#1>]
For more information, refer to the official documentation at http://packages.python.org/pyquery