Background:
When I first started learning the Scrapy crawler framework, I kept wondering how I would run crawl jobs on a server. I could not create a separate project for every single crawl task. For example, I built a Zhihu crawling project, wrote several spiders inside it, and, importantly, I wanted them all to run at the same time.
A beginner's solution (what I tried first):
1. Create a new run.py file in the spiders directory; its content is shown below (you can add extra arguments to the list, such as --nolog).
2. My naive thought at the time: fine, then I will just write a few more execute() lines, one per spider. That failed, because execute() hands control to the first crawl and the script never reaches the later lines. My next idea was to put the spider names in a list and loop over them in a while loop, calling the crawl for each name. The result was even worse.
3. So the following command is only suitable for quick debugging, or for a project that has just a single spider.
from scrapy.cmdline import execute

execute(['scrapy', 'crawl', 'httpbin'])
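By the way, Scrapy's crawler API can also start several spiders from one script. A minimal sketch, where 'spider_one' and 'spider_two' are placeholder spider names, not names from my project:

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# Load the project settings so pipelines and middlewares still apply.
process = CrawlerProcess(get_project_settings())

# Spiders can be scheduled by name; they run in the same process.
process.crawl('spider_one')
process.crawl('spider_two')

# Blocks until every scheduled crawl has finished.
process.start()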
After some digging, I learned that the proper way is this:
1. Create a directory at the same level as the spiders directory, for example: commands
2. Create a crawlall.py file inside it (the file name becomes the name of the custom command); the overall layout is sketched after the code below.
crawlall.py
from scrapy.commands import ScrapyCommand
from scrapy.utils.project import get_project_settings


class Command(ScrapyCommand):

    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        # Get every spider name registered in the project.
        spider_list = self.crawler_process.spiders.list()
        for name in spider_list:
            # Schedule each spider in the same crawler process.
            self.crawler_process.crawl(name, **opts.__dict__)
        # Start all scheduled crawls; they run concurrently.
        self.crawler_process.start()
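To make the structure clearer, the project ends up looking roughly like this (assuming the project is named Zhihuuser; note that commands also needs an empty __init__.py so Python can import it as a module):

Zhihuuser/
    scrapy.cfg
    Zhihuuser/
        __init__.py
        settings.py
        commands/
            __init__.py      # empty file, makes commands importable
            crawlall.py      # the custom command shown above
        spiders/
            __init__.py
            ...              # your spiders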
3. We are not done yet; the settings.py configuration file also needs one more line.
COMMANDS_MODULE = '<project name>.<directory name>'
For example: COMMANDS_MODULE = 'Zhihuuser.commands'
4. Then comes the question: after writing several spiders in the project and doing all of the above, how do I finally run everything? Easy: just put the command below in your scheduled task, and you are done.
scrapy crawlall
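For example, the scheduled task could be a crontab entry like the following sketch (the path is a placeholder; point it at the directory that contains scrapy.cfg):

# Hypothetical crontab line: run all spiders every day at 03:00.
# /path/to/your_project is a placeholder, not a real path from this article.
0 3 * * * cd /path/to/your_project && scrapy crawlall >> crawlall.log 2>&1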