This example shows how to assign a random user-agent to each request when using Scrapy to collect data. It is shared here for your reference; the details are as follows:

This approach lets every request go out with a different user-agent, which helps prevent the target site from blocking the Scrapy spider based on its user-agent string.

First, add the following code to the settings.py file to replace the default user-agent handling middleware:
The code is as follows:
DOWNLOADER_MIDDLEWARES = {
    'scraper.random_user_agent.RandomUserAgentMiddleware': 400,
    'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
}
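The middleware below imports a USER_AGENT_LIST from settings.py, so that setting must exist as well. The list contents here are illustrative placeholders; substitute real browser user-agent strings for production crawls:

```python
# settings.py (continued)
# Pool of user-agent strings the middleware picks from at random.
# These values are examples only; use current, real browser strings.
USER_AGENT_LIST = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/602.3.12 "
    "(KHTML, like Gecko) Version/10.0.2 Safari/602.3.12",
    "Mozilla/5.0 (X11; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0",
]
```

A longer list gives better spread; the middleware only requires that it be a non-empty list of strings.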
Next, define the custom user-agent middleware module:
The code is as follows:
from scraper.settings import USER_AGENT_LIST
import random
from scrapy import log

class RandomUserAgentMiddleware(object):
    def process_request(self, request, spider):
        ua = random.choice(USER_AGENT_LIST)
        if ua:
            request.headers.setdefault('User-Agent', ua)
            #log.msg('>>>> UA %s' % request.headers)
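Outside of Scrapy, the core logic of the middleware can be sketched with a plain dict standing in for request.headers (the user-agent strings and the helper function name here are stand-ins, not part of Scrapy's API):

```python
import random

# Stand-in for the project's USER_AGENT_LIST setting (illustrative values).
USER_AGENT_LIST = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/58.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12) Safari/602.3.12",
    "Mozilla/5.0 (X11; Linux x86_64; rv:53.0) Firefox/53.0",
]

def assign_random_user_agent(headers):
    """Pick a random user-agent and set it only if none is already present,
    mirroring request.headers.setdefault in the middleware."""
    ua = random.choice(USER_AGENT_LIST)
    if ua:
        headers.setdefault('User-Agent', ua)
    return headers

# A fresh request gets a random user-agent from the pool.
headers = assign_random_user_agent({})

# A request that already carries a User-Agent keeps it, because
# setdefault never overwrites an existing key.
custom = assign_random_user_agent({'User-Agent': 'my-custom-agent'})
```

The use of setdefault rather than direct assignment is what lets an individual spider or request override the random choice by setting its own User-Agent header first.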
Hopefully this article helps you with your Python programming.