Earlier we simulated a login by submitting a form with the account name and password. Once that login succeeds, the session cookies can be saved, and later runs can log in with the saved cookie without repeating those steps.
import requests
import http.cookiejar
from bs4 import BeautifulSoup

session = requests.Session()
# Replace the session's default jar with one that can be saved to a file
session.cookies = http.cookiejar.LWPCookieJar("Cookies")
agent = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Maxthon/5.1.2.3000 Chrome/55.0.2883.75 Safari/537.36'
headers = {
    "Host": "www.zhihu.com",
    "Origin": "https://www.zhihu.com/",
    "Referer": "http://www.zhihu.com/",
    'User-Agent': agent
}
postdata = {
    'password': '*******',  # fill in the password
    'account': '********',  # fill in the account
}
# Fetch the home page first to read the hidden _xsrf token from the form
response = session.get("https://www.zhihu.com", headers=headers)
soup = BeautifulSoup(response.content, "html.parser")
xsrf = soup.find('input', attrs={"name": "_xsrf"}).get("value")
postdata['_xsrf'] = xsrf
result = session.post('http://www.zhihu.com/login/email', data=postdata, headers=headers)
# Write the cookies to the "Cookies" file, including session cookies
session.cookies.save(ignore_discard=True, ignore_expires=True)
After running, a cookie file named Cookies appears in the directory the script was run from.
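To see what such a file contains without contacting any site, here is a minimal offline sketch using only the standard library. The helper make_cookie, the domain www.example.com, and the cookie name and value are assumptions made up for illustration; they are not part of the login code above. It also shows why the save call passes ignore_discard=True: a cookie with no expiry is a session cookie, and it is skipped on save otherwise.

```python
import http.cookiejar
import os
import tempfile


def make_cookie(name, value, domain="www.example.com"):
    """Build a session cookie by hand (hypothetical helper for this demo)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False,
        expires=None,   # no expiry date, so this is a session cookie
        discard=True,   # session cookies are marked to be discarded
        comment=None, comment_url=None, rest={},
    )


tmpdir = tempfile.mkdtemp()
cookie_path = os.path.join(tmpdir, "Cookies")
skipped_path = os.path.join(tmpdir, "Cookies_default")

jar = http.cookiejar.LWPCookieJar(cookie_path)
jar.set_cookie(make_cookie("session_id", "abc123"))

# Same flags as the login script: keep session cookies in the file.
jar.save(ignore_discard=True, ignore_expires=True)
with open(cookie_path) as f:
    saved_text = f.read()

# Default flags: the session cookie is left out of the file.
jar.save(filename=skipped_path)
with open(skipped_path) as f:
    skipped_text = f.read()

# A fresh jar can read the first file back.
restored = http.cookiejar.LWPCookieJar(cookie_path)
restored.load(ignore_discard=True)
restored_names = [cookie.name for cookie in restored]

print(saved_text)
print(restored_names)
```

The file is plain text, so it can be opened in an editor to check which cookies were actually stored.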
Now load the saved cookie and use it to log in:
import requests
import http.cookiejar as cookielib

session = requests.session()
session.cookies = cookielib.LWPCookieJar(filename='Cookies')
try:
    session.cookies.load(ignore_discard=True)
except OSError:
    # Raised when the file is missing or is not a valid cookie file
    print("Cookie failed to load")

def isLogin():
    url = "https://www.zhihu.com/"
    # Do not follow redirects: only a direct 200 response counts as logged in
    login_code = session.get(url, headers=headers, allow_redirects=False).status_code
    return login_code == 200

if __name__ == '__main__':
    agent = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Maxthon/5.1.2.3000 Chrome/55.0.2883.75 Safari/537.36'
    # headers is a module-level name, so isLogin() can read it when called below
    headers = {
        "Host": "www.zhihu.com",
        "Origin": "https://www.zhihu.com/",
        "Referer": "http://www.zhihu.com/",
        'User-Agent': agent
    }
    if isLogin():
        print('You are already logged in')
Output after running: You are already logged in
The main job of the cookielib module (named http.cookiejar in Python 3, and imported under the alias cookielib above) is to provide objects that store cookies, so that it can be used together with the requests module to access Internet resources. A CookieJar object captures the cookies a server sends and resends them on subsequent requests, which is what makes the simulated login possible. The main classes of the module are CookieJar, FileCookieJar, MozillaCookieJar, and LWPCookieJar.
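The capture-and-resend behaviour can be sketched offline with the two CookieJar methods that do the work, extract_cookies and add_cookie_header. This sketch uses standard-library urllib.request.Request objects rather than requests, and FakeResponse, the example.com URLs, and the cookie value are assumptions invented for the demonstration: FakeResponse only mimics the info() method that the jar reads headers from.

```python
import email.message
import http.cookiejar
import urllib.request


class FakeResponse:
    """Hypothetical stand-in for an HTTP response carrying Set-Cookie headers."""

    def __init__(self, set_cookie_values):
        self._headers = email.message.Message()
        for value in set_cookie_values:
            # Assigning the same header name repeatedly appends a new header
            self._headers["Set-Cookie"] = value

    def info(self):
        return self._headers


jar = http.cookiejar.CookieJar()

# Step 1: the "login" response sets a cookie, and the jar captures it.
login_request = urllib.request.Request("http://www.example.com/login")
login_response = FakeResponse(["session_id=abc123; Path=/"])
jar.extract_cookies(login_response, login_request)

# Step 2: a later request to the same host gets the cookie attached.
next_request = urllib.request.Request("http://www.example.com/profile")
jar.add_cookie_header(next_request)
sent_cookie = next_request.get_header("Cookie")

# A request to an unrelated host gets no Cookie header at all.
other_request = urllib.request.Request("http://other.example.org/")
jar.add_cookie_header(other_request)
other_cookie = other_request.get_header("Cookie")

print(sent_cookie)
print(other_cookie)
```

The jar decides per request which cookies apply, based on the domain and path recorded when the cookie was captured.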
Their relationship: CookieJar --derived--> FileCookieJar --derived--> MozillaCookieJar and LWPCookieJar
FileCookieJar itself does not implement the save() method.
MozillaCookieJar and LWPCookieJar both implement it.
So use MozillaCookieJar or LWPCookieJar when the cookies need to be saved to a file; calling save() on either one writes the file.
          CookieJar
              |
        FileCookieJar
          /        \
MozillaCookieJar  LWPCookieJar
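The class relationship and the save() behaviour described above can be checked directly. This is a small sketch using only the standard library; the temporary directory and file names are assumptions chosen for the demonstration.

```python
import http.cookiejar as cookielib
import os
import tempfile

# The inheritance chain: CookieJar -> FileCookieJar -> the two file formats.
hierarchy_ok = (
    issubclass(cookielib.FileCookieJar, cookielib.CookieJar)
    and issubclass(cookielib.MozillaCookieJar, cookielib.FileCookieJar)
    and issubclass(cookielib.LWPCookieJar, cookielib.FileCookieJar)
)

tmpdir = tempfile.mkdtemp()

# FileCookieJar declares save() but leaves the implementation to subclasses.
try:
    cookielib.FileCookieJar(os.path.join(tmpdir, "base.txt")).save()
    base_save_works = True
except NotImplementedError:
    base_save_works = False

# Both subclasses can write a file; each uses its own format header.
first_lines = {}
for cls in (cookielib.MozillaCookieJar, cookielib.LWPCookieJar):
    jar = cls(os.path.join(tmpdir, cls.__name__ + ".txt"))
    jar.save()
    with open(jar.filename) as f:
        first_lines[cls.__name__] = f.readline().strip()

print(hierarchy_ok, base_save_works)
print(first_lines)
```

The two subclasses differ only in file format: MozillaCookieJar writes the Netscape cookies.txt layout, while LWPCookieJar writes the libwww-perl Set-Cookie3 layout used in this article.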
Python crawler: using cookies to simulate login