2017-10-09 19:06:22
Copyright notice: This is an original article by the blogger. Do not reproduce it without the blogger's permission.
Preface:
Obtain the cookies first, then use them to log in automatically to Douban and Sina Weibo.
System environment:
64-bit Windows 10 with both Python 2.7 and Python 3.6 installed (this article uses Python 3.6); IDE: PyCharm; browser: Chrome; third-party Python library: requests.
View cookies:
Open the Douban homepage and log in to your account (it is best to practice crawling with a throwaway account). Right-click the page, open the browser's developer tools, and select the Network tab. Then press Fn+F5 to refresh the page and click the www.douban.com entry at the top of the request list. The cookie information is shown in that request's headers.
Login:
Copy the cookie into the following code:
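import requests

# Paste the User-Agent value copied from the browser between the quotes.
headers = {'User-Agent': ''}
# Paste the cookie string copied from the browser between the quotes.
cookies = {'Cookies': ''}
url = 'http://www.douban.com'

# Request the homepage with the logged-in cookies and browser headers.
r = requests.get(url, cookies=cookies, headers=headers)

# Save the raw response body to a file next to the script.
with open('douban_2.txt', 'wb+') as f:
    f.write(r.content)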
Note: Obtain the User-Agent value from the same request headers and copy it into the code above in the same way.
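The cookie string copied from the browser has the form name1=value1; name2=value2, while the cookies argument of requests.get treats each dictionary key as the name of a single cookie. One option is therefore to split the copied string into name/value pairs before passing it in. The following is only a minimal sketch: the helper name parse_cookie_header and the sample string are hypothetical and are not part of the original code.

```python
def parse_cookie_header(raw):
    """Split a raw 'Cookie' header string into a {name: value} dict."""
    cookies = {}
    for pair in raw.split(';'):
        pair = pair.strip()
        if not pair or '=' not in pair:
            continue  # skip empty or malformed fragments
        # Split on the first '=' only, since values may contain '=' themselves.
        name, _, value = pair.partition('=')
        cookies[name.strip()] = value.strip()
    return cookies


# Hypothetical sample; replace it with the string copied from the browser.
sample = 'name1=value1; name2="quoted"; token=abc='
print(parse_cookie_header(sample))
```

The returned dictionary could then be passed as cookies=parse_cookie_header(raw) in the requests.get call, where raw is the copied string.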
Run the code. A text file named douban_2.txt appears in the script's directory; it contains the HTML source of the Douban homepage as returned for the logged-in account.
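A quick way to check whether the saved page really reflects a logged-in session is to search it for text that only appears after login, such as the account's nickname. This is a rough sketch under assumptions: the helper name page_contains, the marker text, and the UTF-8 encoding are all illustrative choices, not something taken from the original code.

```python
def page_contains(content, marker, encoding='utf-8'):
    """Return True if the marker text occurs in the raw page bytes."""
    # Undecodable bytes are ignored so the check never raises on odd input.
    text = content.decode(encoding, errors='ignore')
    return marker in text


# Hypothetical stand-in for the bytes read from douban_2.txt.
sample = '<html><body>Hello, my_nickname</body></html>'.encode('utf-8')
print(page_contains(sample, 'my_nickname'))
```

To apply it to the real output, read douban_2.txt in 'rb' mode and pass the bytes together with your own nickname as the marker.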
Python crawler: using cookies to log in to Douban