Python crawler--Using cookies to simulate login

Source: Internet
Author: User

An earlier article described how to simulate a login to Zhihu by submitting the account name and password through a form. Once that login has succeeded, the session cookies can be saved and reused to log in later, without repeating the previous steps.

    import requests
    import http.cookiejar
    from bs4 import BeautifulSoup

    session = requests.Session()
    # Store the session's cookies in a jar that can be written to a file
    session.cookies = http.cookiejar.LWPCookieJar("Cookies")
    agent = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Maxthon/5.1.2.3000 Chrome/55.0.2883.75 Safari/537.36'
    headers = {
        "Host": "www.zhihu.com",
        "Origin": "https://www.zhihu.com/",
        "Referer": "http://www.zhihu.com/",
        'User-Agent': agent
    }
    postdata = {
        'password': '*******',  # fill in the password
        'account': '********',  # fill in the account
    }
    # Fetch the home page and read the _xsrf token from the login form
    response = session.get("https://www.zhihu.com", headers=headers)
    soup = BeautifulSoup(response.content, "html.parser")
    xsrf = soup.find('input', attrs={"name": "_xsrf"}).get("value")
    postdata['_xsrf'] = xsrf
    result = session.post('http://www.zhihu.com/login/email', data=postdata, headers=headers)
    # Write the cookies to disk, including session and expired cookies
    session.cookies.save(ignore_discard=True, ignore_expires=True)

After the script runs, a cookie file appears in the current working directory (normally the folder where the code resides).

Now load the saved cookies to log in:

    import requests
    import http.cookiejar as cookielib

    session = requests.session()
    session.cookies = cookielib.LWPCookieJar(filename='Cookies')
    try:
        # Read the cookies saved by the login script
        session.cookies.load(ignore_discard=True)
    except OSError:  # missing or unreadable cookie file
        print("Cookie failed to load")

    agent = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Maxthon/5.1.2.3000 Chrome/55.0.2883.75 Safari/537.36'
    headers = {
        "Host": "www.zhihu.com",
        "Origin": "https://www.zhihu.com/",
        "Referer": "http://www.zhihu.com/",
        'User-Agent': agent
    }

    def isLogin():
        # With redirects disabled, a 200 status code is taken to mean
        # that the saved cookies were accepted
        url = "https://www.zhihu.com/"
        login_code = session.get(url, headers=headers, allow_redirects=False).status_code
        return login_code == 200

    if __name__ == '__main__':
        if isLogin():
            print('you are already logged in')

Output after running: you are already logged in

The main job of the cookielib module (http.cookiejar in Python 3, imported above under the name cookielib) is to provide objects that store cookies, so that it can be used together with the requests module to access Internet resources. An object of the module's CookieJar class captures the cookies a server sets and sends them back on subsequent requests, which is what makes the simulated login possible. The main classes of the module are CookieJar, FileCookieJar, MozillaCookieJar, and LWPCookieJar.
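As a rough offline illustration of the "store and resend" behaviour, the sketch below puts one cookie into a plain CookieJar by hand and asks the jar to attach it to outgoing requests. The helper make_cookie, the cookie name and value, and the example.com and example.org URLs are all hypothetical; in real use the jar fills itself from server responses rather than from hand-built cookies.

```python
import http.cookiejar
import urllib.request


def make_cookie(name, value, domain):
    """Build a simple session cookie by hand (hypothetical helper)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


jar = http.cookiejar.CookieJar()
jar.set_cookie(make_cookie("sessionid", "abc123", "example.com"))

# A request to the cookie's own domain gets a Cookie header added.
req = urllib.request.Request("http://example.com/profile")
jar.add_cookie_header(req)
print(req.get_header("Cookie"))

# A request to a different domain gets none.
other = urllib.request.Request("http://example.org/")
jar.add_cookie_header(other)
print(other.get_header("Cookie"))
```

No network traffic is involved here: the requests are only constructed, never sent. A requests.Session performs the same matching automatically on every call once its cookies attribute is set to a jar.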

Their relationship: CookieJar --derived--> FileCookieJar --derived--> MozillaCookieJar and LWPCookieJar

FileCookieJar itself does not implement the save function, while MozillaCookieJar and LWPCookieJar both do.

So use MozillaCookieJar or LWPCookieJar when the cookies need to be saved to a file.
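The difference can be checked without touching the network. The following is a minimal sketch, assuming a hand-built cookie and a temporary file path (the make_cookie helper, the cookie values, and the domain are hypothetical): calling save() on a bare FileCookieJar raises NotImplementedError, while an LWPCookieJar writes the file and a second jar can load it back.

```python
import http.cookiejar
import os
import tempfile


def make_cookie(name, value, domain):
    """Build a simple session cookie by hand (hypothetical helper)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


path = os.path.join(tempfile.mkdtemp(), "Cookies")

# The base class declares save() but leaves it unimplemented.
base = http.cookiejar.FileCookieJar(path)
try:
    base.save()
    base_can_save = True
except NotImplementedError:
    base_can_save = False

# LWPCookieJar implements save() and load().
lwp = http.cookiejar.LWPCookieJar(path)
lwp.set_cookie(make_cookie("sessionid", "abc123", "example.com"))
# ignore_discard=True keeps session cookies, which are otherwise skipped.
lwp.save(ignore_discard=True, ignore_expires=True)

restored = http.cookiejar.LWPCookieJar(path)
restored.load(ignore_discard=True, ignore_expires=True)
restored_names = [cookie.name for cookie in restored]

with open(path) as f:
    first_line = f.readline().strip()

print(base_can_save, restored_names, first_line)
```

Session cookies carry no expiry date, so saving or loading without ignore_discard=True would silently drop them; that is why both scripts in this article pass the flag.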

            CookieJar
                |
          FileCookieJar
           /         \
MozillaCookieJar   LWPCookieJar
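The inheritance chain can be confirmed from Python itself. This short sketch only inspects the standard library classes; the variable names are chosen for illustration.

```python
import http.cookiejar as cookielib

# Method resolution order of each file-backed jar, as class names.
lwp_chain = [cls.__name__ for cls in cookielib.LWPCookieJar.__mro__]
mozilla_chain = [cls.__name__ for cls in cookielib.MozillaCookieJar.__mro__]

print(lwp_chain)      # ['LWPCookieJar', 'FileCookieJar', 'CookieJar', 'object']
print(mozilla_chain)  # ['MozillaCookieJar', 'FileCookieJar', 'CookieJar', 'object']
```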
