Recently I needed to use Python to simulate logging in to a website. I had never studied this area before, and the target site's login process is fairly complex, so I got stuck. I decided to start with a simple simulated login and work up to the harder cases step by step.
Note: This article is for learning and exchange purposes only.
Login characteristics: credentials are sent in plain text, together with a special token field
The session object requests.Session persists certain parameters across requests, such as cookies. All requests made from the same Session instance share the same cookies, and the requests module processes cookies automatically on every request, which makes cookie handling during login very convenient. For cookie handling, a single line with a session object replaces several lines of urllib code. It is equivalent to the following in urllib:
import http.cookiejar
import urllib.request

cj = http.cookiejar.CookieJar()
pro = urllib.request.HTTPCookieProcessor(cj)
opener = urllib.request.build_opener(pro)
urllib.request.install_opener(opener)
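As a minimal sketch of the cookie persistence described above, the snippet below stores a cookie on a requests.Session and checks that it is attached to a later request. It runs offline because the request is only prepared, not sent. The cookie name sessionid, its value, and the example.com URL are made up for illustration.

```python
import requests

# One Session instance keeps a single cookie jar for every request made through it.
session = requests.Session()

# Hypothetical cookie, standing in for what a server would set after login.
session.cookies.set("sessionid", "abc123")

# Build (but do not send) a request through the session.
request = requests.Request("GET", "http://example.com/profile")
prepared = session.prepare_request(request)

# The session merged its stored cookie into the outgoing headers.
cookie_header = prepared.headers["Cookie"]
print(cookie_header)
```

With urllib, the same effect needs the CookieJar, processor, and opener wiring shown above; with requests, creating the Session is enough.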
Simulating Login to V2EX
The task of this article is to use requests.Session to simulate logging in to V2EX (http://www.v2ex.com/), also known as "V station".
Tools: Python 3.5, the BeautifulSoup module, the requests module, Chrome
The data captured while logging in to this site is as follows:
The username (u) and password (p) are transmitted in plain text, which is very convenient. Judging from the page source of the login URL http://www.v2ex.com/signin, the once field appears to be unique to each login, so we need to fetch it in advance and include it in the form data posted to the site.
To fetch it we use the usual method: BeautifulSoup does the job. Here is one more way to read an attribute from inside a tag. To grab the "value" above, soup.find('input', {'name': 'once'})['value'] is enough.
It returns the value attribute of the input tag whose name is "once".
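The lookup can be tried offline against a small piece of HTML. This is only a sketch: SAMPLE_HTML is a hypothetical, simplified form rather than the real sign-in page, and extract_once is a helper name introduced here. It also guards against the case where the input is missing, since soup.find returns None in that situation.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified sign-in form; the real page has more markup.
SAMPLE_HTML = """
<form method="post" action="/signin">
  <input type="text" name="u">
  <input type="password" name="p">
  <input type="hidden" name="once" value="91279">
</form>
"""

def extract_once(html):
    """Return the value of the input named 'once', or raise if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("input", {"name": "once"})
    if tag is None or not tag.has_attr("value"):
        raise ValueError("no 'once' input with a value found in the page")
    return tag["value"]

once = extract_once(SAMPLE_HTML)
print(once)
```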
Then build postdata and POST it.
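To see what "plain text" means for the form, the sketch below builds the payload and prepares the POST without sending it, so the encoded body can be inspected. The field names u, p, once, and next are the ones used in this article; build_postdata is a helper name introduced here, and the password and once value are placeholders.

```python
import requests

def build_postdata(username, password, once):
    """Assemble the login form fields used by the sign-in request."""
    return {"u": username, "p": password, "once": once, "next": "/"}

# Placeholder credentials and token, for illustration only.
postdata = build_postdata("whatbeg", "secret", "91279")

# Prepare the POST without sending it, to inspect what would go on the wire.
prepared = requests.Request(
    "POST", "http://www.v2ex.com/signin", data=postdata
).prepare()

body = prepared.body
content_type = prepared.headers["Content-Type"]
print(content_type)
print(body)
```

The body is ordinary URL-encoded form data, so the password is readable by anything that can see the request.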
How do we show that the login succeeded? By visiting http://www.v2ex.com/settings, because this URL cannot be viewed without being logged in.
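One way to turn this check into code is to request the protected page without following redirects and look at the response. This is a sketch under an assumption not confirmed by the article: that an anonymous visitor is answered with a redirect (for example to the sign-in page) while a logged-in visitor gets a 200. The function names looks_logged_in and check_settings are introduced here for illustration.

```python
import requests

def looks_logged_in(response):
    """Guess login state from a response to a protected page.

    Assumption: an anonymous visitor is redirected, a logged-in one gets 200.
    """
    if response.is_redirect:
        return False
    return response.status_code == 200

def check_settings(session, url="http://www.v2ex.com/settings"):
    """Fetch the protected page without following redirects and classify it."""
    response = session.get(url, allow_redirects=False)
    return looks_logged_in(response)

# Offline demonstration with hand-built responses instead of real traffic.
redirected = requests.Response()
redirected.status_code = 302
redirected.headers["Location"] = "/signin"

served = requests.Response()
served.status_code = 200

print(looks_logged_in(redirected), looks_logged_in(served))
```

In real use, check_settings would be called with the same session object that performed the login, so that its cookies are sent along.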
Based on the analysis above, here is the source code (written with reference to alexkh's code):
import requests
from bs4 import BeautifulSoup

url = "http://www.v2ex.com/signin"
UA = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.13 Safari/537.36"
header = {"User-Agent": UA,
          "Referer": "http://www.v2ex.com/signin"
          }

# One session for all requests, so cookies carry over automatically
v2ex_session = requests.Session()

# Fetch the sign-in page and read the per-login "once" value
f = v2ex_session.get(url, headers=header)
soup = BeautifulSoup(f.content, "html.parser")
once = soup.find('input', {'name': 'once'})['value']
print(once)

# Submit the login form
postdata = {'u': 'whatbeg',
            'p': '****',
            'once': once,
            'next': '/'
            }
v2ex_session.post(url, data=postdata, headers=header)

# Request a page that is only visible when logged in
f = v2ex_session.get('http://www.v2ex.com/settings', headers=header)
print(f.content.decode())
Running it shows that the login succeeded:
The page source printed is that of http://www.v2ex.com/settings. Here once is 91279.
At this point, the login has succeeded.