Some knowledge points of BeautifulSoup

Source: Internet
Author: User


import re

from bs4 import BeautifulSoup

# Attribute access: multi-valued attributes such as class come back as a list
html_doc = "<p class='body strikeout' id='zhangsan'></p>"
sp = BeautifulSoup(html_doc, "html.parser")
print(sp.p['class'])
# ['body', 'strikeout']
print(sp.p['id'])
# zhangsan

# rel is also a multi-valued attribute, so its value is a list
html_doc = '<p>Back to the <a rel="index">homepage</a></p>'
sp = BeautifulSoup(html_doc, "html.parser")
print(sp.a['rel'])
# ['index']

# .string only returns content when the tag has 0 sub-tags (text only)
# or exactly 1 sub-tag as its only child; otherwise it is None
sp = BeautifulSoup('<p>123<a href="index">abc</a></p>', "html.parser")
sp.p.string  # None, because p contains both text and a tag
sp.a.string  # abc
sp = BeautifulSoup('<p><a href="index">abc</a></p>', "html.parser")
sp.p.string  # abc, the same as sp.a.string

# Get a sibling node (siblings share the same parent node)
html_doc = '<p class="st">123<a rel="index">homepage</a><div rel="bey">abc</div><a rel="index">def</a></p>'
sp = BeautifulSoup(html_doc, "html.parser")
a_sp = sp.a
print(a_sp.next_sibling)  # the div node
# .previous_sibling returns the preceding sibling in the same way

# Iterate over all following sibling nodes
for i in sp.find("a").next_siblings:
    print(i)
# <div rel="bey">abc</div>
# <a rel="index">def</a>

# next_element is the next node in parse order; it may or may not be the same as next_sibling
sp.find("a").next_element  # homepage
sp.find("a").next_element.next_element  # <div rel="bey">abc</div>
sp.find("a").next_element.next_element.next_element  # abc
# .next_elements iterates over the successive next_element results

# find(re.compile("^b")) uses a regular expression to search for tags whose name starts with "b", such as body

# True matches every tag (but no text strings)
for i in sp.find_all(True):
    print(i)

# The remaining snippets assume sp holds a parsed document that contains the tags, ids, and classes they mention.

# find_all searches nodes by id and class. When class_ is given a full multi-class string
# such as "body strikeout", the classes must appear in the same order as in the document,
# otherwise nothing matches.
sp.find_all("p", class_="sister")
sp.find_all(id="link1")
sp.find_all(class_="sister")

# Find links whose URL contains "elis"
sp.find_all(href=re.compile("elis"))

# Print the href of every link inside a div
for i in sp.find("div", class_="xiaoba").find_all("a"):
    print(i.get("href"))

# The attrs parameter of find_all searches for tags with custom attributes; its value is a dict
sp.find_all(attrs={"data-info": "xiaobai"})

# Filter tags by their text (older code spells this argument text=)
sp.find_all("a", string="xiaobai")  # a tags whose text is xiaobai

# limit caps the number of results returned
sp.find_all("a", limit=2)  # returns the first two matches

# Find direct child tags
sp.select("p > a")  # a tags that are direct children of a p tag; always returns a list
sp.select("p > a:nth-of-type(1)")  # the first direct child a; counting starts from 1

# Find direct children by class or id, including chained child selectors
sp.select("div > .class_name")
sp.select("div > #id_name")
sp.select("div > ul > li > span > b")  # child selectors can be chained

# Query by CSS class name; returns a list
sp.select(".class_name")

# Query by id
sp.select("#id_name")

# Combine a tag name with a class name or an id
sp.select("a.class_name")
sp.select("a#id_name")

# Combine a tag name with an attribute
sp.select("a[data-info]")  # a tags that have a data-info attribute
sp.select("a[data-info='xiaoba']")  # a tags whose data-info value is xiaoba
# Prefix and suffix matching, similar to ^ and $ in regular expressions
sp.select('a[data-info$="is"]')  # a tags whose data-info value ends with "is"
sp.select('a[data-info^="is"]')  # a tags whose data-info value starts with "is"
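The find_all and select calls in these notes are shown without a document to run them against. The following is a minimal sketch that applies several of them to one small page. The HTML, the example.com URLs, the link ids, and the ids() helper are made up for illustration and are not from the original notes.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample document; this markup is an assumption for illustration.
HTML = """
<div class="xiaoba">
  <p class="story">
    <a class="sister" id="link1" href="http://example.com/elsie" data-info="this">Elsie</a>
    <a class="sister" id="link2" href="http://example.com/lacie" data-info="island">Lacie</a>
    <a class="brother" id="link3" href="http://example.com/tom">Tom</a>
  </p>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")


def ids(tags):
    """Return the id attribute of every tag, in document order (hypothetical helper)."""
    return [tag["id"] for tag in tags]


# Keyword filters with find_all
by_class = ids(soup.find_all("a", class_="sister"))           # ['link1', 'link2']
by_href = ids(soup.find_all(href=re.compile("elsie")))        # ['link1']
by_attrs = ids(soup.find_all(attrs={"data-info": "island"}))  # ['link2']
limited = ids(soup.find_all("a", limit=2))                    # ['link1', 'link2']

# CSS selectors with select
direct = ids(soup.select("p > a"))                   # all three links
first = ids(soup.select("p > a:nth-of-type(1)"))     # ['link1']
has_attr = ids(soup.select("a[data-info]"))          # ['link1', 'link2']
ends_is = ids(soup.select('a[data-info$="is"]'))     # ['link1'], since "this" ends with "is"
starts_is = ids(soup.select('a[data-info^="is"]'))   # ['link2'], since "island" starts with "is"

# Collect every href inside the div
hrefs = [a.get("href") for a in soup.find("div", class_="xiaoba").find_all("a")]

print(by_class, by_href, by_attrs, limited)
print(direct, first, has_attr, ends_is, starts_is)
print(hrefs)
```

Both styles return lists of Tag objects, so the choice between find_all keyword filters and select is mostly a matter of which syntax reads more clearly for a given query.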

 
