Inserting Scrapy crawler results into a MySQL database


1. Create the database scrapy using a database tool
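Any MySQL client works; from the mysql command line, for example (the utf8 character set here is an assumption, chosen to match the table definition in step 2):

mysql> create database scrapy default character set utf8;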

2. Create the douban table in the scrapy database

mysql> create table scrapy.douban(
         id int primary key auto_increment,
         name varchar(100) NOT NULL,
         author varchar(50) NULL,
         press varchar(100) NULL,
         date varchar(30) NULL,
         page varchar(30) NULL,
         price varchar(30) NULL,
         score varchar(30) NULL,
         ISBN varchar(30) NULL,
         author_profile varchar(1500) NULL,
         content_description varchar(1500) NULL,
         link varchar(255) NULL
       ) default charset=utf8;
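To confirm the table came out as intended:

mysql> desc scrapy.douban;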

 

3. In the crawler's pipelines.py, configure the pipeline that points at the database

# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html

import json

from twisted.enterprise import adbapi
from scrapy import log
import MySQLdb
import MySQLdb.cursors


class DoubanPipeline(object):

    def __init__(self):
        self.file = open("./books.json", "wb")

    def process_item(self, item, spider):
        # Convert every field to a UTF-8 byte string before serializing
        for k in item:
            item[k] = item[k].encode("utf8")
        line = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(line)
        return item


class MySQLPipeline(object):

    def __init__(self):
        self.dbpool = adbapi.ConnectionPool(
            "MySQLdb",
            db="scrapy",              # database name
            user="root",              # database user
            passwd="qmf123456",       # password
            cursorclass=MySQLdb.cursors.DictCursor,
            charset="utf8",
            use_unicode=False,
        )

    def process_item(self, item, spider):
        query = self.dbpool.runInteraction(self._conditional_insert, item)
        query.addErrback(self.handle_error)
        return item

    def _conditional_insert(self, tb, item):
        tb.execute(
            "insert into douban (name, author, press, date, page, price, "
            "score, ISBN, author_profile, content_description, link) "
            "values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)",
            (item["name"], item["author"], item["press"], item["date"],
             item["page"], item["price"], item["score"], item["ISBN"],
             item["author_profile"], item["content_description"],
             item["link"]))
        log.msg("Item data in db: %s" % item, level=log.DEBUG)

    def handle_error(self, e):
        log.err(e)

Then register the pipelines in the settings.py file:
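The original post does not show this snippet; here is a minimal sketch, assuming the Scrapy project is named douban so the pipeline module is douban.pipelines (adjust the path to your own project):

# settings.py -- register both pipelines; lower numbers run earlier
ITEM_PIPELINES = {
    "douban.pipelines.DoubanPipeline": 300,
    "douban.pipelines.MySQLPipeline": 400,
}

Very old Scrapy releases (the kind that still ship the scrapy.log API used above) accepted a plain list of pipeline paths instead of a dict; use whichever form your version expects.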

 

4. Install the MySQLdb driver

MySQL-python-1.2.3.win-amd64-py2.7.exe

Check whether the driver installed successfully:
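The original relied on a screenshot here; a quick check from a Python 2 interpreter works just as well (the exact version string depends on the installer, e.g. '1.2.3' for the package named above):

>>> import MySQLdb        # an ImportError here means the driver is not installed
>>> MySQLdb.__version__
'1.2.3'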

 

5. Query the database with Python's MySQLdb

import MySQLdb

conn = MySQLdb.connect(host="127.0.0.1", user="root", passwd="qmf123456", db="scrapy")
cursor = conn.cursor()
n = cursor.execute("select count(*) from douban")
for row in cursor.fetchall():
    for r in row:
        print r
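Not in the original snippet, but good hygiene: release both handles once the query is finished.

cursor.close()
conn.close()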

