Randomly assigning a user-agent to each request in a Python Scrapy spider

This example shows how to make Scrapy randomly assign a user-agent to each request when collecting data. The specifics are as follows:
Rotating the user-agent on every request helps prevent a website from blocking the Scrapy spider based on its user-agent string.
First, add the following code to the settings.py file to replace the default user-agent processing module:
DOWNLOADER_MIDDLEWARES = {
    'scraper.random_user_agent.RandomUserAgentMiddleware': 400,
    'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
}
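The middleware below imports a USER_AGENT_LIST from the project settings, but the original article does not show that list. A minimal sketch of what it might look like in settings.py follows; the user-agent strings themselves are illustrative examples, not taken from the original:

```python
# settings.py -- illustrative user-agent pool (strings are examples only)
USER_AGENT_LIST = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15',
    'Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0',
]
```

The longer and more varied this pool is, the less predictable the spider's requests look to the target site.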
Next, define the custom user-agent processing module:
from scraper.settings import USER_AGENT_LIST
import random
from scrapy import log

class RandomUserAgentMiddleware(object):
    def process_request(self, request, spider):
        # Pick a random user-agent from the pool for this request
        ua = random.choice(USER_AGENT_LIST)
        if ua:
            request.headers.setdefault('User-Agent', ua)
            # log.msg('>>>> UA %s' % request.headers)
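Outside a running Scrapy project, the middleware's logic can be exercised with a small stand-alone sketch. FakeRequest and the two user-agent strings below are illustrative stand-ins (a real Scrapy request exposes a case-insensitive Headers mapping, but a plain dict is enough to show the behavior):

```python
import random

# Stand-in for scraper.settings.USER_AGENT_LIST (strings are illustrative)
USER_AGENT_LIST = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/91.0',
    'Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Firefox/89.0',
]

# Minimal stand-in for a Scrapy Request object
class FakeRequest(object):
    def __init__(self):
        self.headers = {}

class RandomUserAgentMiddleware(object):
    def process_request(self, request, spider):
        ua = random.choice(USER_AGENT_LIST)
        if ua:
            # setdefault keeps any user-agent the spider set explicitly
            request.headers.setdefault('User-Agent', ua)

req = FakeRequest()
RandomUserAgentMiddleware().process_request(req, spider=None)
print(req.headers['User-Agent'])  # one of the entries above
```

Note that setdefault only fills in the header when it is absent, so a spider that deliberately sets its own User-Agent on a request is not overridden.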
I hope this article helps you with your Python programming.