In our company's project, some crawlers need to use a domestic proxy, some need a foreign proxy, and some need no proxy at all.
I tried three approaches.
Approach one:
Enable the proxy in settings.py and override DOWNLOADER_MIDDLEWARES in the spider. The override did not take effect.
Approach two:
Have Scrapy switch to a different settings.py file. Switching manually works, but after half a day of digging I still could not find a way to make the program switch automatically, so I gave up on this approach.
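For reference, one possible way to automate the switch is the SCRAPY_SETTINGS_MODULE environment variable, which Scrapy reads to decide which settings module to load. The sketch below is not something I tested in the project. The spider names and the settings module names (myproject.settings_domestic and so on) are made up for illustration.

```python
import os

# Hypothetical mapping from spider name to settings module.
# None of these names come from the real project.
SETTINGS_BY_SPIDER = {
    "baidu_spider": "myproject.settings_domestic",
    "facebook_spider": "myproject.settings_foreign",
}
DEFAULT_SETTINGS = "myproject.settings"


def select_settings_module(spider_name, environ=os.environ):
    """Point SCRAPY_SETTINGS_MODULE at the settings module for this spider."""
    module = SETTINGS_BY_SPIDER.get(spider_name, DEFAULT_SETTINGS)
    # Scrapy consults this variable when it loads the project settings.
    environ["SCRAPY_SETTINGS_MODULE"] = module
    return module
```

The function would have to run before Scrapy loads its settings, for example at the top of a small launcher script, and each spider would probably need its own process, since one process loads one settings module.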
Approach three:
Write the logic directly in middlewares.py and use request.url to decide which proxy to enable.
class ProxyMiddleware(object):
    def process_request(self, request, spider):
        url = request.url
        if 'baidu.com' in url:
            # Domestic site: route through the domestic proxy
            request.meta['proxy'] = 'here set the domestic HTTP proxy'
        elif 'facebook.com' in url:
            # Foreign site: route through the foreign proxy
            request.meta['proxy'] = 'here set foreign HTTP proxy'
        else:
            # Everything else: no proxy
            pass
In the end, this is the solution I went with.
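One caveat with the substring check: 'baidu.com' in url also matches a URL that only mentions baidu.com in its path or query string. A slightly stricter variant could match on the host name instead. This is only a sketch. The proxy addresses are placeholders, and pick_proxy and HostProxyMiddleware are names I made up for illustration.

```python
from urllib.parse import urlparse

# Placeholder proxy addresses (assumptions, replace with real ones).
DOMESTIC_PROXY = "http://domestic-proxy.example:8080"
FOREIGN_PROXY = "http://foreign-proxy.example:8080"

# Domain suffix -> proxy to use for that domain and its subdomains.
PROXY_BY_DOMAIN = {
    "baidu.com": DOMESTIC_PROXY,
    "facebook.com": FOREIGN_PROXY,
}


def pick_proxy(url, proxy_by_domain=PROXY_BY_DOMAIN):
    """Return the proxy for the URL's host, or None for a direct connection."""
    host = (urlparse(url).hostname or "").lower()
    for domain, proxy in proxy_by_domain.items():
        # Match the domain itself or any subdomain, never the path or query.
        if host == domain or host.endswith("." + domain):
            return proxy
    return None


class HostProxyMiddleware:
    """Same idea as the middleware above, but keyed on the host name."""

    def process_request(self, request, spider):
        proxy = pick_proxy(request.url)
        if proxy is not None:
            request.meta["proxy"] = proxy
        # Returning None lets Scrapy continue processing the request.
        return None
```

Like the original middleware, it would still have to be registered in DOWNLOADER_MIDDLEWARES in settings.py. Adding a new site then only means adding one entry to PROXY_BY_DOMAIN.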
Scrapy: using different proxies for different sites