Scrapy Getting Started Tutorial: Writing to the Database

Source: Internet
Author: User
Tags: mysql, host, xpath

Following on from the previous article, http://blog.csdn.net/androidworkor/article/details/51176387, this post shows how to write the captured data to a database.

1. Writing a crawler script

Continuing with the Qiushibaike (qiushibaike.com) example, write a spider and save it as spider_qiushibaike.py in the Hello/spiders directory:

# -*- coding: utf-8 -*-
import scrapy
from Hello.items import HelloItem

class spider_qiushibaike(scrapy.Spider):
    name = "qiubai"
    start_urls = ['http://www.qiushibaike.com']

    def parse(self, response):
        for item in response.xpath('//div[@id="content-left"]/div[@class="article block untagged mb15"]'):
            qiubai = HelloItem()
            icon = item.xpath('./div[@class="author clearfix"]/a[1]/img/@src').extract()
            if icon:
                qiubai['userIcon'] = icon[0]
            userName = item.xpath('./div[@class="author clearfix"]/a[2]/h2/text()').extract()
            if userName:
                qiubai['userName'] = userName[0]
            content = item.xpath('./div[@class="content"]/descendant::text()').extract()
            if content:
                con = ''
                for s in content:
                    con += s
                qiubai['content'] = con
            like = item.xpath('./div[@class="stats"]/span[@class="stats-vote"]/i/text()').extract()
            if like:
                qiubai['like'] = like[0]
            comment = item.xpath('./div[@class="stats"]/span[@class="stats-comments"]/a/i/text()').extract()
            if comment:
                qiubai['comment'] = comment[0]
            yield qiubai
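One detail worth calling out: `descendant::text()` returns the post body as a list of text fragments, which the spider stitches together in a loop. A small standalone illustration of that step (the fragment values here are made up for the example; `"".join` is the idiomatic equivalent of the loop):

```python
# XPath's descendant::text() yields the text split into fragments.
fragments = ["My first line", "\n", "my second line"]

# The spider's loop, concatenating piece by piece:
con = ""
for s in fragments:
    con += s

# Idiomatic one-liner that produces the same string:
joined = "".join(fragments)

assert con == joined
print(joined)
```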
2. Create the database

2.1 Creating the database

I'm using the SQLyog client. Open SQLyog and fill in the connection details: the MySQL host address, user name, and password, as shown below:

When you are finished, click Connect to enter the database interface, as shown below:

Right-click the area circled in red and select Create Database, as shown below:

So the database is created.

2.2 Creating a Table

Select the database you just created, right-click, and select Create Table, as shown below:

An interface appears, as shown below:

Fill in the table name, fields, types, and so on to build the table. The table structure must match the fields defined earlier in items.py: the same number of columns, with compatible types.
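If you prefer SQL over the SQLyog GUI, the equivalent DDL could look like the following sketch. The database name `joke`, the table name `t_baike`, and the column names are taken from the INSERT statement in the pipeline further down; the column types and lengths are assumptions to adjust for your data:

```sql
CREATE DATABASE IF NOT EXISTS joke DEFAULT CHARACTER SET utf8;

CREATE TABLE IF NOT EXISTS joke.t_baike (
    id       INT AUTO_INCREMENT PRIMARY KEY,
    userIcon VARCHAR(255),
    userName VARCHAR(64),
    content  TEXT,
    likes    VARCHAR(16),
    comment  VARCHAR(16)
) DEFAULT CHARACTER SET utf8;
```

Note that the column is named `likes` rather than `like`, since LIKE is a reserved word in SQL; the pipeline's INSERT statement uses the same spelling.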

# -*- coding: utf-8 -*-
# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html
import scrapy

class HelloItem(scrapy.Item):
    # define the fields for your item here like:
    userIcon = scrapy.Field()
    userName = scrapy.Field()
    content = scrapy.Field()
    like = scrapy.Field()
    comment = scrapy.Field()
3. Writing the pipelines.py file

Next, write the pipelines.py file. Its location in the project tree is as follows:

Where the code is as follows:

# -*- coding: utf-8 -*-
# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html
import pymysql

def dbHandle():
    conn = pymysql.connect(
        host='localhost',
        user='root',
        passwd='root',
        charset='utf8',
        use_unicode=False
    )
    return conn

class HelloPipeline(object):
    def process_item(self, item, spider):
        dbObject = dbHandle()
        cursor = dbObject.cursor()
        sql = ('insert into joke.t_baike(userIcon,userName,content,likes,comment) '
               'values (%s,%s,%s,%s,%s)')
        try:
            cursor.execute(sql, (item['userIcon'], item['userName'],
                                 item['content'], item['like'], item['comment']))
            dbObject.commit()
        except Exception as e:
            print(e)
            dbObject.rollback()
        return item
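One thing to note about the pipeline above: it opens a fresh MySQL connection for every item. Scrapy pipelines can instead hold a single connection for the whole crawl via the `open_spider`/`close_spider` hooks. Here is a minimal sketch of that pattern, using Python's stdlib sqlite3 in place of pymysql so the example is self-contained; the class name and file path are hypothetical, and with pymysql the connect/execute calls would be analogous:

```python
import sqlite3

class SQLitePipeline:
    """Sketch: keep one database connection for the whole crawl."""

    def __init__(self, db_path="joke.db"):
        # Hypothetical file name; with pymysql you would store host/user/passwd here.
        self.db_path = db_path

    def open_spider(self, spider):
        # Called once when the crawl starts: connect a single time
        # instead of reconnecting for every item.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS t_baike ("
            "userIcon TEXT, userName TEXT, content TEXT, "
            "likes TEXT, comment TEXT)"
        )

    def process_item(self, item, spider):
        try:
            self.conn.execute(
                "INSERT INTO t_baike (userIcon, userName, content, likes, comment) "
                "VALUES (?, ?, ?, ?, ?)",
                (item.get("userIcon"), item.get("userName"),
                 item.get("content"), item.get("like"), item.get("comment")),
            )
            self.conn.commit()
        except Exception as e:
            spider.logger.error(e)
            self.conn.rollback()
        return item

    def close_spider(self, spider):
        # Called once when the crawl ends.
        self.conn.close()
```

To use a pipeline like this, you would register it in ITEM_PIPELINES the same way as HelloPipeline below.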
4. Configure the pipelines.py to take effect

Open the settings.py file and add the following setting:

ITEM_PIPELINES = {
    'Hello.pipelines.HelloPipeline': 300,
}

That's all there is to it. The finished file is shown below:

5. Running

All the preparations are done; now it's time to witness the miracle.
Open a command line in the project root (the directory containing scrapy.cfg) and run:

scrapy crawl qiubai

At this point you should see a log similar to the one shown below:

Okay, now check whether the rows were inserted into the database.

The data was inserted successfully. To verify it was crawled correctly, open qiushibaike.com and compare:

All right, it's done.

Finally, allow me a quick plug:

If you don't know how to set up the environment, check out this article: http://blog.csdn.net/androidworkor/article/details/51171098

If you are not familiar with the basic usage of scrapy you can read this article: http://blog.csdn.net/androidworkor/article/details/51176387

OK, if you run into any problems, please leave a comment.

