How to open, run, and debug a Scrapy crawler under PyCharm

Source: Internet
Author: User
Tags: pycharm community edition


First, you need a Scrapy project. I created a new Scrapy project named test1 on the Desktop: open a command line in the Desktop directory and run the command: scrapy startproject test1


The directory structure is as follows:
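(The original screenshot is missing; a project generated by scrapy startproject test1 typically has a layout like the following, where the outer test1 folder contains scrapy.cfg and marks the project root.)

```
test1/
    scrapy.cfg          # project configuration; marks the project root
    test1/              # the project's Python package
        __init__.py
        items.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
```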




Open PyCharm and choose Open.


Select the project and click OK.

The project opens in the following interface; press Alt + 1 to show the Project panel.


In the test1/spiders/ folder, create a crawler file spider.py with name = "dmoz". This name will be used later.

Create a begin.py file in the same directory as scrapy.cfg, i.e. the outer test1 folder (the name is arbitrary; main.py would work just as well). Note that the spider name in the crawl command below must be the same as name = 'dmoz' in the spider.

from scrapy import cmdline
cmdline.execute("scrapy crawl dmoz".split())
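If you prefer not to call Scrapy's internal cmdline module, an alternative begin.py can shell out to the scrapy command in a child process. This is a sketch under the assumption that Scrapy is installed in the same interpreter; the helper names are mine, not from the article:

```python
import subprocess
import sys


def crawl_command(name: str) -> list:
    """Build the argv that runs `scrapy crawl <name>` through the current interpreter."""
    return [sys.executable, "-m", "scrapy", "crawl", name]


def run_spider(name: str) -> int:
    """Launch the crawl in a child process and return its exit code."""
    return subprocess.call(crawl_command(name))
```

begin.py could then end with `raise SystemExit(run_spider("dmoz"))`. A side benefit is that the child process always uses the interpreter that PyCharm launched, avoiding PATH surprises.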

7. With the files in place, configure PyCharm: click Run > Edit Configurations.

8. Click + and add a new Python run configuration.

9. Name: change to spider; Script: select the newly created begin.py file; Working directory: set it to your project's root directory.


10. That's it. Click the Run button in the upper-right corner to run the crawler.

Debugging

You can set breakpoints in the spider code and launch the same configuration with Debug instead of Run to step through the program.



Problem Encountered

1. Unknown command: crawl

When running under the debugger, the breakpoint is never hit and the console output is as follows:

H:\Python\Python36\python.exe "H:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py" --multiproc --client 127.0.0.1 --port 59810 --file H:/Python/Python36/Lib/site-packages/scrapy/cmdline.py crawl quotes -o quotes.jl
pydev debugger: process 4740 is connecting
Connected to pydev debugger (build 141.3058)
Scrapy 1.3.2 - no active project
Unknown command: crawl
Use "scrapy" to see available commands
Process finished with exit code 2

The working directory is set incorrectly, so Scrapy reports "no active project" and does not recognize the crawl command. As described in step 9, set the working directory to the directory containing scrapy.cfg and run again to solve the problem.
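To make begin.py robust against a wrong working directory, it could locate the project root itself before invoking Scrapy. The helper below is a hypothetical addition of mine, using only the standard library; it walks upward from a starting directory until it finds scrapy.cfg:

```python
from pathlib import Path


def find_project_root(start: Path):
    """Walk upward from `start` looking for the directory that contains scrapy.cfg.

    Returns that directory as a Path, or None if no ancestor contains it.
    """
    for d in [start, *start.parents]:
        if (d / "scrapy.cfg").exists():
            return d
    return None
```

begin.py could call `os.chdir(find_project_root(Path(__file__).parent))` before executing the crawl, so the "Unknown command: crawl" error cannot occur regardless of how PyCharm was configured.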

That is all the content of this article. I hope it is helpful for your learning.
