Python example: scraping Douban movies

Source: Internet
Uploader: User


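# coding=utf-8
import json

# parse is a local helper module that is not shown here; parse_url(url) is
# expected to send the request and return the response body as a str.
from parse import parse_url


class DoubanSpider:
    def __init__(self):
        # URL template; {} is filled with the paging offset (the start parameter)
        self.temp_url = "https://m.douban.com/rexxar/api/v2/subject_collection/filter_movie_occident_hot/items?os=android&for_mobile=1&callback=jsonp3&start={}&count=18&loc_id=108288&_=0"

    def get_content_list(self, html_str):
        # Extract the data. json.loads expects plain JSON, so if the
        # callback=jsonp3 parameter makes the server wrap the body in
        # jsonp3(...), that wrapper has to be removed before this call.
        dict_data = json.loads(html_str)
        content_list = dict_data["subject_collection_items"]
        total = dict_data["total"]
        return content_list, total

    def save_content_list(self, content_list):
        # Append one JSON object per line to db.json
        with open("db.json", "a", encoding="utf-8") as f:
            for content in content_list:
                f.write(json.dumps(content, ensure_ascii=False))
                f.write("\n")
                print("Saved successfully")

    def run(self):
        # Main logic
        num = 0
        total = 100  # placeholder; replaced by the real total after the first response
        # The + 18 margin means at least one page past the reported total is requested.
        while num < total + 18:
            # 1. Build start_url
            start_url = self.temp_url.format(num)
            # 2. Send the request and get the response
            html_str = parse_url(start_url)
            # 3. Extract the data
            content_list, total = self.get_content_list(html_str)
            # 4. Save
            self.save_content_list(content_list)
            # 5. Build the URL of the next page, then repeat steps 2-5
            num += 18


if __name__ == "__main__":
    douban = DoubanSpider()
    douban.run()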
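The listing imports parse_url from a parse module that the article does not include. The following is a minimal sketch of what such a helper might look like, using only the standard library. It is not the author's implementation: the function names strip_jsonp and build_request, the User-Agent value, and the timeout are all assumptions made for illustration. It also assumes that a request carrying callback=jsonp3 may come back wrapped as jsonp3(...), in which case the wrapper is removed so that json.loads can read the body.

```python
import json
import re
import urllib.request

# Hypothetical header values; adjust to whatever the target site expects.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Linux; Android 10) AppleWebKit/537.36 Mobile Safari/537.36",
}

# Matches an optional leading ';', a callback name, and a parenthesised payload.
_JSONP_RE = re.compile(r"^\s*;?\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def strip_jsonp(text):
    """Return the payload of a JSONP body, or the text unchanged if it is not wrapped."""
    match = _JSONP_RE.match(text)
    return match.group(1) if match else text


def build_request(url, headers=None):
    """Create a Request object carrying the given (or default) headers."""
    return urllib.request.Request(url, headers=headers or DEFAULT_HEADERS)


def parse_url(url, timeout=10):
    """Fetch url and return the response body as a str, without any JSONP wrapper."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    return strip_jsonp(body)
```

Saved as parse.py next to the spider, a helper along these lines would satisfy the import in the listing. Keeping strip_jsonp separate from the network call makes it possible to check the unwrapping on a sample string without sending a request.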

 
