Simulating login to a well-known Chinese knowledge-sharing website

Source: Internet
Author: User

I had been studying front-end topics for a long time and was afraid my Python was getting rusty, so I wrote a simulated login to get back into practice.

Some of the information on Zhihu can only be crawled after logging in, so it is worth a try.

1 First log in manually, then capture the packets with Fiddler

Logging in to Zhihu requires POSTing the following data:

Huh? The CAPTCHA is gone. Never mind, so much the better.

Next we will write the code. But wait, first take a look at Zhihu's response.

The response body is in JSON format, and on inspection the value of msg is the login status, so we print that value to show whether the login succeeded.
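As a minimal sketch of reading that field, the snippet below parses a response body and returns its msg value. It is written for Python 3 and runs without a network connection. The sample body is made up for illustration, and login_message is a hypothetical helper name, not part of the original script.

```python
import json


def login_message(body):
    """Return the 'msg' field of a JSON response body, or None if it is absent."""
    try:
        payload = json.loads(body)
    except ValueError:
        # The body was not valid JSON (for example, an HTML error page).
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get("msg")


# Made-up body for illustration; a real response may carry other fields too.
SAMPLE_BODY = '{"msg": "login succeeded"}'

if __name__ == "__main__":
    print(login_message(SAMPLE_BODY))
```

Returning None instead of raising keeps the caller simple when the server answers with something other than JSON.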

2 Without further ado, here is the code

#!/usr/bin/python
# -*- coding: utf-8 -*-
# Python 2 code (cookielib, raw_input, print statement)
import requests
from bs4 import BeautifulSoup
import cookielib
import json

homepage = 'https://www.zXXXu.com/'  # home page URL
# filename = r'zhihu_cookies.txt'
session = requests.session()
cookie = cookielib.CookieJar()  # this jar can be used to hold cookies temporarily
"""
session.cookies = cookielib.MozillaCookieJar(filename)  # this variant stores cookies in a file
try:
    session.cookies.load(filename=filename, ignore_discard=True, ignore_expires=True)
    # ignore_discard: load cookies even if they are marked to be discarded;
    # ignore_expires: load cookies even if they have already expired
except:
    print 'Cookie can not load!'
"""
headers = {
    'Connection': 'keep-alive',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.8,zh-Hans-CN;q=0.5,zh-Hans;q=0.3',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Host': 'www.zXXXu.com',
}


def get_xsrf():
    # The _xsrf token is the value of the first input inside the sign-in view
    text = session.get(homepage, headers=headers).text
    soup = BeautifulSoup(text, 'html.parser')
    result = soup.find('div', class_='view view-signin').find('input')['value']
    return result


# get the CAPTCHA
def get_captcha():
    pass


def login_zhihu(phone, passwd):
    login_url = homepage + '/login/phone_num'
    data = {
        '_xsrf': '%s' % get_xsrf(),
        'password': passwd,
        'phone_num': phone,
        'captcha_type': 'cn'
    }
    result = session.post(login_url, data=data, headers=headers)
    # the body of result is JSON, and the value of 'msg' is the login status
    print json.loads(result.text)['msg']
    return


if __name__ == '__main__':
    phone = raw_input('Please input phone_num:')
    passwd = raw_input('Please input password:')
    url = homepage + '/settings/profile'  # the profile page is only accessible after login
    login_zhihu(phone, passwd)
    # redirects are disabled here
    resp_status = session.get(url, headers=headers, allow_redirects=False).status_code
    print resp_status  # the result is the HTTP status code
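The get_xsrf function in the script relies on BeautifulSoup to pull a hidden token out of the sign-in form. The same idea can be sketched with only the Python 3 standard library. The HTML below is a made-up stand-in for the real page, and matching the input by name="_xsrf" is an assumption (the script itself simply takes the first input inside the sign-in view). XsrfFinder and extract_xsrf are hypothetical names.

```python
from html.parser import HTMLParser


class XsrfFinder(HTMLParser):
    """Remember the value of the first <input name="_xsrf"> seen in a page."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.token is not None:
            return
        attributes = dict(attrs)
        # Assumption: the hidden field carries the same name as the POST key.
        if attributes.get("name") == "_xsrf":
            self.token = attributes.get("value")


def extract_xsrf(html):
    """Return the _xsrf token found in html, or None if there is none."""
    finder = XsrfFinder()
    finder.feed(html)
    return finder.token


# Made-up markup for illustration only.
SAMPLE_PAGE = """
<div class="view view-signin">
  <form method="post">
    <input type="hidden" name="_xsrf" value="abc123"/>
    <input type="text" name="phone_num"/>
  </form>
</div>
"""

if __name__ == "__main__":
    print(extract_xsrf(SAMPLE_PAGE))
```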

Two points are worth explaining.

2.1 Cookie handling: I created a CookieJar to hold the cookies, but you can skip this step.

2.2 The headers must be written out in full. Changing only the UA used to be enough to log in, but now the full set is required. Zhihu is also fighting crawlers (I only realized this after many attempts, so don't be as silly as I was).
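For readers who do want cookies to survive between runs, here is a minimal sketch of the file-based variant that is disabled in the script, written for Python 3, where cookielib is named http.cookiejar. It runs offline: the cookie is built by hand rather than received from a real login, and the cookie name, value, domain, and both helper names are hypothetical.

```python
import http.cookiejar
import os
import tempfile


def make_cookie(name, value, domain):
    """Build a session cookie by hand (a stand-in for one set by a real login)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def save_and_reload(cookie, filename):
    """Write the cookie to a Mozilla-format file, then read it back as a dict."""
    jar = http.cookiejar.MozillaCookieJar(filename)
    jar.set_cookie(cookie)
    # ignore_discard=True keeps session cookies that would otherwise be skipped;
    # ignore_expires=True keeps cookies whose expiry time has already passed.
    jar.save(ignore_discard=True, ignore_expires=True)

    reloaded = http.cookiejar.MozillaCookieJar(filename)
    reloaded.load(ignore_discard=True, ignore_expires=True)
    return {c.name: c.value for c in reloaded}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "zhihu_cookies.txt")
        cookie = make_cookie("session_id", "example-token", "www.example.com")
        print(save_and_reload(cookie, path))
```

With requests, a jar like this can be assigned to session.cookies before logging in, which is what the disabled block in the script does.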

3 Finally, the returned result

To finish, I recommend a Zhihu crawler written by a Jianshu author, which includes CAPTCHA handling (I find typing it in by hand really annoying): link address
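The script's last step requests the profile page with redirects disabled and prints the status code. Below is a minimal Python 3 sketch of how that number might be read. It assumes, which the article does not state explicitly, that 200 means the profile page was served and that a 3xx code means an anonymous visitor was redirected elsewhere. describe_status is a hypothetical helper.

```python
def describe_status(status_code):
    """Turn the profile-page status code into a short human-readable verdict."""
    if status_code == 200:
        # Assumption: the page is only served directly to a logged-in session.
        return "logged in: profile page served"
    if 300 <= status_code < 400:
        # Assumption: anonymous visitors are redirected away from the page.
        return "not logged in: redirected away from the profile page"
    return "unexpected status %d" % status_code


if __name__ == "__main__":
    for code in (200, 302, 500):
        print(code, describe_status(code))
```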
