References:
- https://stackoverflow.com/questions/13303449/urllib2-httperror-http-error-403-forbidden
- https://segmentfault.com/q/1010000000470724
After testing, the problem appears to be the headers sent with the request.
import urllib2


class S0819MtimeTiantangPipeline(object):
    def process_item(self, item, spider):
        # Browser-like headers, so the server does not reject the request with 403
        headers = {
            "User-Agent": 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:53.0) Gecko/20100101 Firefox/53.0',
            "Accept": 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            "Accept-Language": 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
            "Accept-Encoding": 'gzip, deflate',
            "Upgrade-Insecure-Requests": '1',  # header values should be strings
            'Connection': 'keep-alive',
        }

        req = urllib2.Request(url=item['addr'], headers=headers)
        res = urllib2.urlopen(req)
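For readers on Python 3, where urllib2 has been merged into urllib.request, the same idea might look roughly like the sketch below. The names BROWSER_HEADERS, build_request and fetch are hypothetical helpers introduced only for illustration; the header values are the ones from the code above. Accept-Encoding is left out on purpose, because urllib does not decompress gzip responses by itself.

```python
import urllib.request

# Header values taken from the post; other browser-like values may work as well.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:53.0) "
                  "Gecko/20100101 Firefox/53.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
    "Connection": "keep-alive",
}


def build_request(url, headers=None):
    """Return a Request that carries browser-like headers (hypothetical helper)."""
    return urllib.request.Request(url, headers=dict(headers or BROWSER_HEADERS))


def fetch(url, timeout=10):
    """Download url and return the raw response body as bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as res:
        return res.read()
```

Calling fetch(item['addr']) inside process_item would then replace the urllib2.Request / urllib2.urlopen pair. Whether a given server still answers 403 depends on which headers it checks.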
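Here is how I obtained the correct headers:
1. Preparation:
Firefox browser + the HttpFox add-on
2. Steps
1. Open HttpFox, then type the URL you want to request into the Firefox address bar and press Enter
Example: http://img31.mtime.cn/pi/2013/01/15/163845.87188937_1000X1000.jpg
2. Select the headers you need from the request that HttpFox captured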
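Once the headers have been picked out in the browser, they still have to be turned into a Python dict. A minimal sketch, assuming the headers were copied as plain "Name: value" lines (the exact copy format of HttpFox is an assumption here, and parse_raw_headers is a hypothetical helper):

```python
def parse_raw_headers(raw):
    """Turn 'Name: value' lines copied from a header viewer into a dict."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and a request line such as "GET /x HTTP/1.1"
        name, _, value = line.partition(":")
        name = name.strip()
        if not name or " " in name:
            continue  # header names never contain spaces
        headers[name] = value.strip()
    return headers


# Illustrative input only, built from header values that appear in the post.
sample = """
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:53.0) Gecko/20100101 Firefox/53.0
Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3
Connection: keep-alive
"""
headers = parse_raw_headers(sample)
```

The resulting dict can be passed as the headers argument when building the request.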
urllib2.HTTPError: HTTP Error 403: Forbidden (a solution)