Python BeautifulSoup HTML parsing


* Function signatures of BeautifulSoup's .find() and .findAll()

findAll(tag, attributes, recursive, text, limit, keywords)
find(tag, attributes, recursive, text, keywords)
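To make the parameters above concrete, here is a minimal sketch run against a small inline document. The HTML snippet and the variable names are made up for illustration. It uses `find_all`, the current bs4 spelling of `findAll`.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this illustration
html = """
<div id="box">
  <p class="a">one</p>
  <section><p class="a">two</p></section>
  <p class="b">three</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
box = soup.find("div", {"id": "box"})

all_p = box.find_all("p")                      # tag: searches every descendant
direct_p = box.find_all("p", recursive=False)  # recursive=False: direct children only
first_two = box.find_all("p", limit=2)         # limit: stop after two matches
by_keyword = box.find_all("p", class_="a")     # keyword filter on the class attribute
first_p = box.find("p")                        # find returns only the first match

for label, result in [("all", all_p), ("direct", direct_p),
                      ("limit=2", first_two), ("class_=a", by_keyword)]:
    print(label, [p.get_text() for p in result])
print("find", first_p.get_text())
```

`find` behaves like `find_all` with a limit of one, except that it returns the element itself (or `None`) instead of a list.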

  

* Get span.green

bsObj.findAll("span", {"class":"green"})

#!/usr/local/bin/python
# -*- coding: UTF-8 -*-
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

from bs4 import BeautifulSoup


def getBsObj(url):
    # Fetch the page with a 3-second timeout; return None on a network error
    try:
        html = urlopen(url, None, 3)
    except (HTTPError, URLError) as e:
        print(e)
        return None
    # Parse the response body with the built-in HTML parser
    try:
        bsObj = BeautifulSoup(html.read(), "html.parser")
    except AttributeError as e:
        return None
    return bsObj


bsObj = getBsObj("http://www.pythonscraping.com/pages/warandpeace.html")
# getBsObj returns None on failure, so check before searching
if bsObj is not None:
    nameList = bsObj.findAll("span", {"class":"green"})
    for name in nameList:
        print(name.get_text())

  

* Get h1, h2, h3, h4, h5, h6

bsObj.findAll(["h1","h2","h3","h4","h5","h6"])

  

// JavaScript: build a string with each element wrapped in quotes

function quote(s) {
    return "\"" + s.split(",").join("\",\"") + "\"";
}
var s = "h1,h2,h3,h4,h5,h6";
console.log(quote(s));
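In Python the list of heading names can be built directly, so no quoting helper is needed. The sketch below uses a made-up HTML snippet and shows two equivalent ways to match all six heading levels: a generated list of names and a compiled regular expression. `find_all` is the current bs4 spelling of `findAll`.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this illustration
html = "<h1>Title</h1><p>intro</p><h2>Part</h2><h3>Detail</h3><header>banner</header>"
soup = BeautifulSoup(html, "html.parser")

# Build ["h1", ..., "h6"] without typing the quotes by hand
heading_tags = [f"h{i}" for i in range(1, 7)]

by_list = soup.find_all(heading_tags)
# The anchors ^ and $ keep the pattern from matching <header>
by_regex = soup.find_all(re.compile(r"^h[1-6]$"))

print([tag.name for tag in by_list])
print([tag.name for tag in by_regex])
```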

  

* Get span.green and span.red

bsObj.findAll("span", {"class":{"green", "red"}})
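As a small sketch of how this class filter behaves, the example below runs the same kind of query on a made-up snippet. The HTML and names are assumptions for illustration. An element matches if any one of its classes is in the given collection, so a span with `class="green bold"` is included too.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this illustration
html = (
    '<span class="green">Anna</span>'
    '<span class="red">a quote</span>'
    '<span class="blue">ignored</span>'
    '<span class="green bold">Pierre</span>'
)
soup = BeautifulSoup(html, "html.parser")

# Matches spans whose class list contains "green" or "red"
either = soup.find_all("span", {"class": ["green", "red"]})

print([span.get_text() for span in either])
```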

* Count the places in the page where "the prince" is the entire text of a tag

nameList = bsObj.findAll(text="the prince")
print(len(nameList))
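A plain string filter matches only text nodes whose whole content equals that string. The sketch below, on a made-up snippet, contrasts that with a regular expression, which also matches longer text containing the phrase. It uses `string=`, the name bs4 has used for the `text` argument since version 4.4.0.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this illustration
html = (
    "<p><span>the prince</span> said that <span>the prince</span>"
    " was late; <span>the prince and Anna</span></p>"
)
soup = BeautifulSoup(html, "html.parser")

# Exact match: the text node must be exactly "the prince"
exact = soup.find_all(string="the prince")
# Regex match: any text node that contains "the prince"
partial = soup.find_all(string=re.compile("the prince"))

print(len(exact), len(partial))
# The results are strings, not tags; .parent gives the enclosing tag
print(exact[0].parent.name)
```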

* Find #text (id="text")

allText = bsObj.find(id="text")
print(allText.get_text())

* Find div#text

allText = bsObj.find("div", {"id":"text"})
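The two id lookups differ only in whether the tag name is constrained. A minimal sketch on a made-up snippet follows. It also shows that `find` returns `None` when nothing matches, which is worth checking before calling `get_text()`.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this illustration
html = '<p id="intro">paragraph</p><div id="text">division</div>'
soup = BeautifulSoup(html, "html.parser")

any_tag = soup.find(id="text")               # any element with id="text"
div_only = soup.find("div", {"id": "text"})  # only a <div> with id="text"
wrong_tag = soup.find("p", {"id": "text"})   # no <p> has that id, so None

print(any_tag.name, div_only.get_text(), wrong_tag)
```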

* Find the first span.red that is a direct child of div#text (div#text > span.red)

red = bsObj.find("div", {"id":"text"}).find("span", {"class":"red"}, False)
print(red.get_text())
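The third positional argument in the call above is `recursive`. The sketch below, on a made-up snippet, shows what setting it to `False` changes, and compares the result with the CSS selector form via `select_one`. This assumes the Soup Sieve package that normally ships with bs4 is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this illustration
html = """
<div id="text">
  <p><span class="red">nested</span></p>
  <span class="green">Anna</span>
  <span class="red">direct</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
container = soup.find("div", {"id": "text"})

# Default search walks all descendants, so the nested span comes first
nested_first = container.find("span", {"class": "red"})
# recursive=False looks at direct children only
direct_first = container.find("span", {"class": "red"}, recursive=False)
# CSS child combinator: first span.red directly under div#text
css_first = soup.select_one("div#text > span.red")

print(nested_first.get_text(), direct_first.get_text(), css_first.get_text())
```

Passing `recursive=False` by keyword rather than by position makes the intent easier to read.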

  

 
