The previous article, http://blog.csdn.net/androidworkor/article/details/51176387, covered the basics of Scrapy; this article shows how to write the scraped data to a database.
1. Writing a crawler script
Continuing with the Qiushibaike (qiushibaike.com) example, write a spider and save it as spider_qiushibaike.py in the Hello/spiders directory:
```python
# -*- coding: utf-8 -*-
import scrapy
from Hello.items import HelloItem


class QiushibaikeSpider(scrapy.Spider):
    name = "qiubai"
    start_urls = ['http://www.qiushibaike.com']

    def parse(self, response):
        for item in response.xpath('//div[@id="content-left"]/div[@class="article block untagged mb15"]'):
            qiubai = HelloItem()
            icon = item.xpath('./div[@class="author clearfix"]/a[1]/img/@src').extract()
            if icon:
                qiubai['userIcon'] = icon[0]
            userName = item.xpath('./div[@class="author clearfix"]/a[2]/h2/text()').extract()
            if userName:
                qiubai['userName'] = userName[0]
            content = item.xpath('./div[@class="content"]/descendant::text()').extract()
            if content:
                con = ''
                for s in content:
                    con += s
                qiubai['content'] = con
            like = item.xpath('./div[@class="stats"]/span[@class="stats-vote"]/i/text()').extract()
            if like:
                qiubai['like'] = like[0]
            comment = item.xpath('./div[@class="stats"]/span[@class="stats-comments"]/a/i/text()').extract()
            if comment:
                qiubai['comment'] = comment[0]
            yield qiubai
```
2. Creating the Database

2.1 Creating the Database
I'm using SQLyog. Open SQLyog, create a new connection, and fill in the connection name, MySQL host address, user name, and password as follows:
When you are finished, click Connect to enter the database interface, as shown below:
Right-click the area circled in red and select Create Database, as shown below:
The database is now created.
2.2 Creating a Table
Select the database you just created, right-click, and choose Create Table, as shown below:
The following dialog appears:
Fill in the table name, fields, types, and so on, and the table is built. The table structure must match the fields defined earlier in items.py (same fields, compatible types). Note that the column is named likes rather than like, since LIKE is a reserved word in MySQL.
```python
# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html

import scrapy


class HelloItem(scrapy.Item):
    # define the fields for your item here
    userIcon = scrapy.Field()
    userName = scrapy.Field()
    content = scrapy.Field()
    like = scrapy.Field()
    comment = scrapy.Field()
```
3. Writing the pipelines.py file
The pipelines.py file sits at the top level of the project, like this:
The code is as follows:
```python
# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html

import pymysql


def dbHandle():
    conn = pymysql.connect(
        host='localhost',
        user='root',
        passwd='root',
        charset='utf8',
        use_unicode=False
    )
    return conn


class HelloPipeline(object):
    def process_item(self, item, spider):
        dbObject = dbHandle()
        cursor = dbObject.cursor()
        sql = 'insert into joke.t_baike(userIcon,userName,content,likes,comment) values (%s,%s,%s,%s,%s)'
        try:
            cursor.execute(sql, (item['userIcon'], item['userName'],
                                 item['content'], item['like'], item['comment']))
            dbObject.commit()
        except Exception as e:
            print(e)
            dbObject.rollback()
        return item
```
4. Configure settings.py so the pipeline takes effect
Open the settings.py file and add the following lines. The number 300 is the pipeline's order: values range from 0 to 1000, and pipelines with lower numbers run first.
```python
ITEM_PIPELINES = {
    'Hello.pipelines.HelloPipeline': 300,
}
```
And that's all that's needed.
5. Running
All the preparations are done; now it's time to witness the miracle.
Open a command line and run:
scrapy crawl qiubai
At this point you should see a log similar to this:
Okay, now let's check whether the rows were inserted into the database.
The rows were inserted successfully. To confirm the data was scraped correctly, open qiushibaike.com and compare:
All right, it's done.
Allow me a quick plug:
If you don't know how to set up the environment, check out this article: http://blog.csdn.net/androidworkor/article/details/51171098
If you are not familiar with the basic usage of Scrapy, read this article: http://blog.csdn.net/androidworkor/article/details/51176387
OK, if you run into any problems, please leave a comment.
Scrapy Getting Started Tutorial: Writing to the Database