Development environment: PyCharm
The target site is the same as in the previous note; for reference, see http://dingbo.blog.51cto.com/8808323/1597695
This time, instead of running everything from a single file, we create a Scrapy project.
1. Use the Scrapy command-line tool to create the basic directory structure of a Scrapy project
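The project skeleton is normally generated with `scrapy startproject <name>`. The sketch below does not call Scrapy; it only lists the files that command typically creates, so the layout is easy to check. The project name `myproject` and the helper names are hypothetical, and the file list is an assumption based on the usual Scrapy template (newer releases also add `middlewares.py`).

```python
from pathlib import Path

# Files that `scrapy startproject <name>` typically generates (assumed layout).
LAYOUT = [
    "scrapy.cfg",
    "{name}/__init__.py",
    "{name}/items.py",
    "{name}/pipelines.py",
    "{name}/settings.py",
    "{name}/spiders/__init__.py",
]


def expected_layout(name):
    """Return the relative paths expected for a project called `name`."""
    return [Path(name) / entry.format(name=name) for entry in LAYOUT]


def missing_files(parent, name):
    """Return the expected paths that do not exist under `parent`."""
    return [p for p in expected_layout(name) if not (Path(parent) / p).exists()]


if __name__ == "__main__":
    # "myproject" is a hypothetical project name.
    for path in expected_layout("myproject"):
        print(path.as_posix())
```

Running `missing_files(".", "myproject")` from the directory where the command was executed is a quick way to confirm that the skeleton is complete.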
2. Edit items.py
3. Under the spiders directory, create a new file, spider1.py
The import error that PyCharm reports here is expected.
TorrentItem is imported according to the directory structure of the Scrapy project, not that of the PyCharm project, so PyCharm cannot resolve the import even though Scrapy can.
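The difference comes down to which directory is on the import path. As a rough illustration, the sketch below builds a throwaway stand-in for the Scrapy layout and imports `TorrentItem` with the Scrapy project root (the directory holding scrapy.cfg) placed on `sys.path`. The project name `myproject`, the helper names, and the plain `dict` subclass used in place of a real Scrapy item are all assumptions made to keep the demo free of third-party dependencies.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_project(base, name="myproject"):
    """Create a stand-in layout: <base>/<name>/scrapy.cfg and <base>/<name>/<name>/items.py."""
    scrapy_root = Path(base) / name      # directory that holds scrapy.cfg
    package = scrapy_root / name         # the importable Python package
    package.mkdir(parents=True)
    (scrapy_root / "scrapy.cfg").write_text("[settings]\ndefault = %s.settings\n" % name)
    (package / "__init__.py").write_text("")
    # Plain class instead of a scrapy.Item, so the demo needs only the stdlib.
    (package / "items.py").write_text("class TorrentItem(dict):\n    pass\n")
    return scrapy_root


def import_torrent_item(scrapy_root, name="myproject"):
    """Import <name>.items.TorrentItem with the Scrapy root on sys.path."""
    sys.path.insert(0, str(scrapy_root))
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(name + ".items")
        return module.TorrentItem
    finally:
        sys.path.remove(str(scrapy_root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = build_demo_project(tmp)
        print(import_torrent_item(root).__name__)
```

Without the `sys.path.insert` call the same import fails, which is essentially what the IDE sees when its own project root sits one level above the Scrapy root.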
4. Run the spider
The crawled data is saved to the file Mininova-data.json
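To check the result, the output file can be loaded back into Python. The sketch below assumes the file was written by Scrapy's JSON feed export (for example with the `-o` option of `scrapy crawl`), which produces a single JSON array with one object per item; the helper name `load_items` is hypothetical.

```python
import json
from pathlib import Path


def load_items(path):
    """Load a JSON feed file: one JSON array with one object per scraped item."""
    with Path(path).open(encoding="utf-8") as handle:
        items = json.load(handle)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of items")
    return items


if __name__ == "__main__":
    feed = Path("Mininova-data.json")
    if feed.exists():
        items = load_items(feed)
        print(len(items), "items loaded")
    else:
        print("run the spider first to create", feed)
```

If the spider was run with a different export format, such as JSON lines or CSV, the file needs a different reader.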
This article is from the "big numbers for use" blog; please keep this source attribution: http://dingbo.blog.51cto.com/8808323/1599566
Python crawler framework Scrapy, Learning Note 3: first Scrapy project