BeautifulSoup Study Notes

Source: Internet
Author: User

This article uses BeautifulSoup 3. The current version is BeautifulSoup 4, and its package name has changed to bs4.

(1) Download and install

    # Download and install BeautifulSoup
    pip install BeautifulSoup

Alternatively, you can download the installation package and install it manually.

(2) Quick Start
    # BeautifulSoup quick start (Python 2, BeautifulSoup 3)
    import urllib2
    from BeautifulSoup import BeautifulSoup

    html_doc = urllib2.urlopen('http://baike.baidu.com/view/1059363.htm')
    soup = BeautifulSoup(html_doc)
    print soup.title

Result:

    <title>前门大街_百度百科</title>
(3) BeautifulSoup Object Introduction

BeautifulSoup mainly works with three types of objects:
    • BeautifulSoup.BeautifulSoup
    • BeautifulSoup.Tag
    • BeautifulSoup.NavigableString
The following example shows these three types:

    # BeautifulSoup example
    from BeautifulSoup import BeautifulSoup
    import urllib2

    html_doc = urllib2.urlopen('http://www.baidu.com')
    soup = BeautifulSoup(html_doc)

    print type(soup)
    print type(soup.title)
    print type(soup.title.string)

    print soup.title
    print soup.title.string

Result:

    <class 'BeautifulSoup.BeautifulSoup'>
    <class 'BeautifulSoup.Tag'>
    <class 'BeautifulSoup.NavigableString'>
    <title>百度一下,你就知道</title>
    百度一下,你就知道
The example above shows the three main kinds of objects in BeautifulSoup:
    • BeautifulSoup.BeautifulSoup: the BeautifulSoup object, representing the whole parsed document
    • BeautifulSoup.Tag: a Tag object
    • BeautifulSoup.NavigableString: a navigable string, the text inside a tag
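For readers on BeautifulSoup 4, the same three object types can be checked with a minimal sketch like the one below. It assumes Python 3 with the bs4 package installed, and it parses a small made-up HTML string (html_doc here is illustrative, not the Baidu page used above) so that it runs without network access. In bs4, the Tag and NavigableString classes live in bs4.element.

```python
# Sketch only: BeautifulSoup 4 on Python 3, not the BeautifulSoup 3 used in this article.
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Hypothetical sample document, used instead of fetching a live page.
html_doc = "<html><head><title>Example page</title></head><body><p>Hello</p></body></html>"

# "html.parser" is the parser bundled with the Python standard library.
soup = BeautifulSoup(html_doc, "html.parser")

print(type(soup))               # the BeautifulSoup object (whole document)
print(type(soup.title))         # a Tag object
print(type(soup.title.string))  # a NavigableString object

print(isinstance(soup.title, Tag))
print(isinstance(soup.title.string, NavigableString))
```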
(4) BeautifulSoup parse tree

1. BeautifulSoup.Tag object methods

Getting a Tag object: access a tag by dot notation, for example soup.title.

    # BeautifulSoup example
    title = soup.title
    print type(title.contents)
    print title.contents
    print title.contents[0]

Result:

    <type 'list'>
    [u'\u767e\u5ea6\u4e00\u4e0b\uff0c\u4f60\u5c31\u77e5\u9053']
    百度一下,你就知道
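How contents relates to string can be sketched as follows. This is a BeautifulSoup 4 / Python 3 sketch rather than the BeautifulSoup 3 code used in this article, and the HTML fragment and variable names are made up for illustration.

```python
# Sketch only: assumes Python 3 with the bs4 package installed.
from bs4 import BeautifulSoup

# Hypothetical fragment: <h1> holds only text, <p> mixes text and a child tag.
html_doc = "<div><h1>Plain title</h1><p>Some <b>bold</b> text</p></div>"
soup = BeautifulSoup(html_doc, "html.parser")

h1 = soup.h1
p = soup.p

# A tag whose only child is a string: string and contents[0] agree.
print(h1.contents)
print(h1.string == h1.contents[0])

# A tag with several children: contents lists them all, string is None.
print(len(p.contents))
print(p.string)
```

For a tag with several children, string gives None, so contents is the safer choice when the structure of the tag is not known in advance.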
contents

Returns the list of the current tag's children. If the tag has no child tags, string and contents[0] return the same content. See the title.contents example above.

next, parent

next returns the element parsed immediately after the current one (for a tag with children, its first child); parent returns the tag that encloses the current one.
    # BeautifulSoup example
    html = soup.html
    print html.next
    print ''
    print html.next.next
    print html.next.next.nextSibling

Result:

    -equiv="content-type"content="text/html;charset=utf-8"/><meta http-equiv="X-UA-Compatible"content="IE=Edge"/><meta content="always"name="referrer"/><meta name="theme-color" content="#2932e1"/><link rel="shortcut icon"href="/favicon.ico"type="image/x-icon"/><link rel="icon"sizes="any"mask="mask"href="//www.baidu.com/img/baidu.svg"/><link rel="dns-prefetch"href="//s1.bdstatic.com"/><link rel="dns-prefetch"href="//t1.baidu.com"/><link rel="dns-prefetch"href="//t2.baidu.com"/><link rel="dns-prefetch"href="//t3.baidu.com"/><link rel="dns-prefetch"href="//t10.baidu.com"/><link rel="dns-prefetch"href="//t11.baidu.com"/><link rel="dns-prefetch" href="//t12.baidu.com"/><link rel="dns-prefetch"href="//b1.bdstatic.com"/><title>百度一下,你就知道</title>......</head>

    <meta http-equiv="content-type"content="text/html;charset=utf-8"/>
    <meta http-equiv="X-UA-Compatible"content="IE=Edge"/>
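The same kind of navigation can be tried on a small document where every step is easy to follow. The sketch below assumes Python 3 with bs4 installed; BeautifulSoup 4 spells the attribute next_element rather than next, and the sample HTML is made up for illustration.

```python
# Sketch only: BeautifulSoup 4 on Python 3, using next_element in place of BS3's next.
from bs4 import BeautifulSoup

# Hypothetical sample document with no whitespace between tags.
html_doc = "<html><head><title>Example page</title></head><body><p>Hello</p></body></html>"
soup = BeautifulSoup(html_doc, "html.parser")

html = soup.html
head = html.next_element    # first element parsed after <html>: the <head> tag
title = head.next_element   # first element parsed after <head>: the <title> tag
text = title.next_element   # the string inside <title>

print(head.name)
print(title.name)
print(text)

# parent walks back up the tree.
print(title.parent.name)
print(head.parent.name)
```

Because next_element follows parse order rather than tree level, it descends into children before moving on to siblings.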
nextSibling, previousSibling

Get the next sibling and the previous sibling of the current tag.
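A minimal sketch of sibling navigation, assuming Python 3 with bs4 installed. BeautifulSoup 4 names these attributes next_sibling and previous_sibling; the list markup and variable names below are made up for illustration.

```python
# Sketch only: BeautifulSoup 4 on Python 3 (next_sibling / previous_sibling).
from bs4 import BeautifulSoup

# Hypothetical list with no whitespace between items, so every sibling is a tag.
html_doc = "<ul><li>one</li><li>two</li><li>three</li></ul>"
soup = BeautifulSoup(html_doc, "html.parser")

first = soup.li               # dot access returns the first <li>
second = first.next_sibling   # the <li> right after it
third = second.next_sibling

print(second.string)
print(second.previous_sibling.string)

# At either end of the sibling chain there is nothing further, so the result is None.
print(first.previous_sibling)
print(third.next_sibling)
```

If the markup contains whitespace or newlines between tags, the nearest sibling is often a whitespace string rather than a tag, so real pages may need an extra step to reach the next tag.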
