Python crawler Scrapy: how to run multiple Scrapy crawl tasks at the same time


Background:

When I first started learning the Scrapy crawler framework, I was already wondering how I would run crawl tasks on a server, because I can't create a new project for every single crawl task. For example, I built a project for crawling Zhihu, but I wrote multiple spiders inside that one project, and the important thing is that I wanted them all to run at the same time.

The beginner's solution:

1. Create a new run.py file alongside the spiders, with the content shown below (you can append options to the argument list, such as --nolog).

2. As a beginner I thought: fine, then I just need to write a few more such lines. Of course that didn't work. Then I thought: add a loop, write the spider names into a list, and take each spider name from the loop. The result was even worse (see the sketch after the code below).

3. The following command is only useful for quick debugging, or for running the crawl task of a single spider within a project.

from scrapy.cmdline import execute

execute(['scrapy', 'crawl', 'httpbin'])
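For illustration, the naive loop from step 2 looked roughly like this (the spider names are placeholders). It cannot work: scrapy.cmdline.execute() hands control to Scrapy and exits the whole process once the first crawl finishes, so the loop never reaches the second spider.

from scrapy.cmdline import execute

spider_names = ['httpbin', 'zhihu_user']  # placeholder spider names
for name in spider_names:
    # execute() never returns: it runs the crawl and then exits the
    # process, so only the first spider in the list is ever started.
    execute(['scrapy', 'crawl', name])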

  

After some more study, I learned that the proper way is the following:

1. Create a directory at the same level as the spiders directory, for example: commands

2. Create a crawlall.py file inside it (this file name becomes the name of the custom command).
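Assuming the project is called zhihuuser (the name used in the settings example below), the resulting layout would look roughly like this; note that the commands directory needs an __init__.py so that Python treats it as a package:

zhihuuser/
├── scrapy.cfg
└── zhihuuser/
    ├── __init__.py
    ├── settings.py
    ├── commands/
    │   ├── __init__.py
    │   └── crawlall.py
    └── spiders/
        ├── __init__.py
        └── ...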

  

  

crawlall.py
from scrapy.commands import ScrapyCommand
from scrapy.utils.project import get_project_settings


class Command(ScrapyCommand):
    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        # On newer Scrapy versions the spider list is available as
        # self.crawler_process.spider_loader.list() instead.
        spider_list = self.crawler_process.spiders.list()
        for name in spider_list:
            self.crawler_process.crawl(name, **opts.__dict__)
        self.crawler_process.start()
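All spiders are scheduled on the same crawler_process before start() is called, so Scrapy runs them in a single process on one Twisted reactor and they crawl concurrently rather than one after another.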

3. We are not done yet: a line also has to be added to the settings.py configuration file.

COMMANDS_MODULE = '<project name>.<directory name>'

For example:

COMMANDS_MODULE = 'zhihuuser.commands'

4. Now the question: I have written a number of crawl tasks (spiders) under spiders and set up everything described above, so how do I actually run them in the end? Easy! Just put the following command into a scheduled task, and that's it.

scrapy crawlall
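For example, a crontab entry along these lines (the project path is a placeholder; depending on your environment you may need the full path to the scrapy executable) would run all spiders every day at 2 a.m.:

0 2 * * * cd /path/to/zhihuuser && scrapy crawlall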

