1. Read the local cookie file saved by the Selenium module
Read the local cookie file saved in http://www.cnblogs.com/strivepy/p/9233389.html to access the user settings page. The JSON file saved with Selenium has the following format:
[
  {"domain": "www.zhihu.com", "expiry": 1527855266.402958, "httpOnly": false, "name": "tgw_l7_route", "path": "/", "secure": false, "value": "200d77f3369d188920b797ddf09ec8d1"},
  {"domain": ".zhihu.com", "expiry": 1622462366.40309, "httpOnly": false, "name": "d_c0", "path": "/", "secure": false, "value": "\"Afakky_hrg2ptvlvtwew-ok8mrlkop4ijzy=|1527854371\""},
  {"domain": ".zhihu.com", "httpOnly": false, "name": "_xsrf", "path": "/", "secure": false, "value": "7da6b4e4-c77d-47a4-81fa-68b1262235c8"},
  ... (remaining entries omitted)
]
The file contains many fields that are not needed here, such as path and secure. When reading the cookies, only the name and value properties of each cookie are required. The code is placed in a module named zhihu.py:
# -*- coding: utf-8 -*-

import requests
import json
import os
from requests.cookies import RequestsCookieJar


def parse_index():
    url = 'https://www.zhihu.com/settings/account'
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36'
    }
    cookies = getcookies_decode_to_dict()
    # cookies = getcookies_decode_to_cookiejar()
    # The cookies parameter of requests.get() only accepts a dict or a CookieJar object
    response = requests.get(url=url, headers=headers, cookies=cookies)
    print(response.url)
    print(response.text)


def getcookies_decode_to_dict():
    path = os.getcwd() + '/cookies/'
    if not os.path.exists(path):
        print('The cookie file does not exist, please run cookiesload.py first')
    else:
        cookies_dict = {}
        with open(path + 'cookies.txt', 'r') as f:
            cookies = json.loads(f.read())
            # Keep only the name and value of each cookie
            for cookie in cookies:
                cookies_dict[cookie['name']] = cookie['value']
        return cookies_dict


def getcookies_decode_to_cookiejar():
    path = os.getcwd() + '/cookies/'
    if not os.path.exists(path):
        print('The cookie file does not exist, please run cookiesload.py first')
    else:
        cookiejar = RequestsCookieJar()
        with open(path + 'cookies.txt', 'r') as f:
            cookies = json.loads(f.read())
            # Keep only the name and value of each cookie
            for cookie in cookies:
                cookiejar.set(cookie['name'], cookie['value'])
        return cookiejar


if __name__ == '__main__':
    parse_index()
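The helpers above copy every cookie from the file, including ones whose expiry time may already have passed. As a possible refinement, the sketch below skips expired entries while building the name/value dict. It is only an illustration: the function name load_valid_cookies, the now parameter, and the placeholder value for d_c0 are assumptions that do not appear in the original code, and it assumes expiry is a Unix timestamp in seconds, as in the sample file.

```python
import json
import time


def load_valid_cookies(raw_json, now=None):
    """Return {name: value} for cookies that have not expired.

    raw_json is the text of a Selenium-style cookie file: a JSON list of
    dicts with at least 'name' and 'value', and optionally 'expiry'
    (assumed to be a Unix timestamp in seconds). Cookies without 'expiry'
    are treated as session cookies and kept.
    """
    if now is None:
        now = time.time()
    cookies = {}
    for cookie in json.loads(raw_json):
        expiry = cookie.get('expiry')
        if expiry is not None and expiry <= now:
            continue  # skip cookies whose expiry time has passed
        cookies[cookie['name']] = cookie['value']
    return cookies


# Sample data modelled on the cookie file format; the d_c0 value is a placeholder.
sample = json.dumps([
    {"name": "tgw_l7_route", "value": "200d77f3369d188920b797ddf09ec8d1", "expiry": 1527855266.402958},
    {"name": "d_c0", "value": "placeholder-value", "expiry": 1622462366.40309},
    {"name": "_xsrf", "value": "7da6b4e4-c77d-47a4-81fa-68b1262235c8"},
])

# Evaluate at a fixed, hypothetical "current time" that lies between the two expiry values.
print(load_valid_cookies(sample, now=1600000000))
```

With the fixed time used here, the first cookie is dropped and the other two are kept. The returned dict can be passed to the cookies parameter of requests.get() in the same way as the dict built by getcookies_decode_to_dict().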
The page source that is printed shows that the user settings page was crawled successfully.
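Instead of reading the printed page source by eye, the final URL can also be compared with the requested one, since response.url in requests holds the URL after any redirects. The sketch below is a rough check under the assumption that the site redirects requests without valid cookies to a different page; the original text does not say how the site behaves in that case, so the sign-in URL and the helper name landed_on_requested_page are hypothetical.

```python
from urllib.parse import urlsplit


def landed_on_requested_page(requested_url, final_url):
    """Return True if final_url has the same host and path as requested_url."""
    requested = urlsplit(requested_url)
    final = urlsplit(final_url)
    requested_key = (requested.netloc.lower(), requested.path.rstrip('/'))
    final_key = (final.netloc.lower(), final.path.rstrip('/'))
    return requested_key == final_key


settings_url = 'https://www.zhihu.com/settings/account'

# Same host and path: consistent with the cookies having been accepted.
print(landed_on_requested_page(settings_url, 'https://www.zhihu.com/settings/account'))

# Hypothetical redirect target, used only for illustration.
print(landed_on_requested_page(settings_url, 'https://www.zhihu.com/signin?next=%2Fsettings%2Faccount'))
```

In parse_index(), the same check could be applied as landed_on_requested_page(url, response.url) after the request returns.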
Python3 using the requests library to read locally saved cookie files for login-free access