This article walks through a simple simulated login with a Python web crawler. Besides fetching information from a web page, a simulated login also requires sending information to the server, such as a username and password.
Simulating a login to a site roughly breaks down into the following steps:
1. Locate any hidden fields on the login page and save their values (the site I log into here has no extra hidden fields, so there is nothing to filter out and save)
2. Submit the information
3. Get the information after login
First, the complete source code:
```python
# -*- coding: utf-8 -*-
import requests

def login():
    session = requests.session()
    # res = session.get('http://my.its.csu.edu.cn/').content
    login_data = {'userName': '3903150327', 'passWord': '136510', 'enter': 'true'}
    session.post('http://my.its.csu.edu.cn/', data=login_data)
    res = session.get('http://my.its.csu.edu.cn/Home/Default')
    print(res.text)

login()
```
First, filter out the hidden information
Open the developer tools (press F12), switch to the Network tab, and log in manually once. Find the first request; at the bottom of its Headers panel there is a Form Data section, which is exactly the information required to log in. If the form also contains hidden fields whose values you need, first fetch the page's HTML:

res = session.get('http://my.its.csu.edu.cn/').content

then filter the content out with a regular expression.
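For sites that do carry hidden fields (for example a CSRF token), here is a minimal sketch of extracting them with a regular expression. The HTML snippet and the field name `__token` are hypothetical, standing in for whatever the real login page contains:

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical HTML, standing in for session.get(...).text;
# the field name "__token" is made up for illustration.
html = '<form><input type="hidden" name="__token" value="abc123"></form>'

# Capture the name and value of every hidden <input> element.
hidden = dict(re.findall(
    r'<input[^>]*type="hidden"[^>]*name="([^"]+)"[^>]*value="([^"]*)"',
    html))
print(hidden)  # {'__token': 'abc123'}
```

Any fields found this way would be merged into `login_data` before posting.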
Second, submit the information
In the page source, find the form's action and method attributes. Then submit the information with:

session.post('http://my.its.csu.edu.cn/(the form's action goes here)', data=login_data)
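When a dict is passed as `data`, requests sends it as a URL-encoded form body. A quick stdlib sketch of what that body looks like for the article's login data (the field names come from the example above):

```python
from urllib.parse import urlencode

# The same form data the article posts with session.post(..., data=login_data);
# requests encodes it as an application/x-www-form-urlencoded body like this.
login_data = {'userName': '3903150327', 'passWord': '136510', 'enter': 'true'}
body = urlencode(login_data)
print(body)  # userName=3903150327&passWord=136510&enter=true
```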
Third, get the information after logging in
After the information is submitted, the simulated login succeeds.
The same session object can then fetch pages that require authentication:

res = session.get('http://my.its.csu.edu.cn/Home/Default').content
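Note that the snippets above mix `.content` and `.text`: in requests, `.content` is the raw response bytes, while `.text` decodes them to a string. A small stdlib-only sketch of the distinction:

```python
# .content returns raw bytes; .text decodes them to str using the
# response's encoding. Here we fake a response body to show the difference.
raw = '登录成功'.encode('utf-8')   # stands in for res.content
text = raw.decode('utf-8')        # roughly what res.text does
print(type(raw).__name__, type(text).__name__)  # bytes str
```

Use `.text` when you want to read or regex-search the page, and `.content` when you need the raw bytes (e.g. for images or non-UTF-8 pages you decode yourself).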