Several ways to simulate a login with Python

Source: Internet
Author: User
This article introduces four ways to simulate a login with Python and walks through each of them in detail.

Body

Method One: Access the page directly with a known cookie

Characteristics:

Simple, but you need to log in with a browser first.

Principle:

Simply put, a cookie is stored on the client that initiated the request, and the server uses it to tell clients apart. HTTP is a stateless protocol, so when the server receives several requests it cannot tell which of them came from the same client. Showing a "page after login" requires the client to prove to the server that "I am the client that just logged in". The cookie does this: it identifies the client and carries its information, such as login status.

This also means that if we obtain another client's cookie, we can impersonate that client when talking to the server. That is the opening our program uses.

We first log in with a browser and use the developer tools to view the cookie. The program then sends its request with that cookie attached, so it is taken for the browser that just logged in and receives a page that is visible only after login.
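To make the principle concrete, here is a minimal sketch of the server's side of the story, with no real network involved. The names `SESSIONS`, `login` and `handle_request` are hypothetical and exist only for this illustration; real websites keep session state in their own ways.

```python
import secrets

# Hypothetical in-memory session table: session id -> user name
SESSIONS = {}

def login(username):
    """Pretend the password was checked, then issue a cookie for this client."""
    session_id = secrets.token_hex(8)
    SESSIONS[session_id] = username
    return {"Set-Cookie": f"sessionid={session_id}"}

def handle_request(headers):
    """Decide who is asking by looking only at the Cookie header."""
    cookie = headers.get("Cookie", "")
    name, _, value = cookie.partition("=")
    user = SESSIONS.get(value) if name == "sessionid" else None
    return f"Welcome back, {user}" if user else "Please log in"

# The browser logs in and receives a cookie.
issued = login("alice")["Set-Cookie"]

# Any program that presents the same cookie is treated as that browser.
print(handle_request({"Cookie": issued}))  # Welcome back, alice
print(handle_request({}))                  # Please log in
```

The server never checks who is sending the cookie, only whether the value is one it issued. That is why a copied cookie is enough.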

Specific steps:

1. Log in with your browser and get the cookie string

Log in with your browser first. Then open the developer tools and go to the Network tab. Find the current URL in the Name column on the left, select the Headers tab on the right, and look at the Request Headers, which contain the cookie the website issued to the browser. It is the string that follows "Cookie:". Copy it; the code uses it shortly.

It is best to log in shortly before running your program. If you logged in too long ago or have closed the browser, the copied cookie has probably expired.
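The copied string is a list of name=value pairs separated by semicolons. As a small sketch, the hypothetical helper `parse_cookie_string` below turns such a string into a dict; the cookie names and values are placeholders, not real ones.

```python
def parse_cookie_string(cookie_str):
    """Turn 'a=1; b=2' (as copied from the request headers) into a dict."""
    cookies = {}
    for pair in cookie_str.split(';'):
        pair = pair.strip()          # drop the space that follows each ';'
        if not pair:
            continue
        key, _, value = pair.partition('=')   # split on the first '=' only
        cookies[key] = value
    return cookies

# Placeholder values, not a real cookie
copied = 'sessionid=xxxx; token=yyyy'
print(parse_cookie_string(copied))  # {'sessionid': 'xxxx', 'token': 'yyyy'}
```

Stripping each pair matters: without it, every key after the first would start with a space.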

2. Write code

urllib version:

    import sys
    import io
    from urllib import request

    # Change the default encoding of standard output
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')

    # URL of a page that can only be accessed after login
    url = 'http://ssfw.xmu.edu.cn/cmstar/index.portal'

    # Cookie obtained after logging in with the browser, i.e. the string just copied
    cookie_str = r'JSESSIONID=xxxxxxxxxxxxxxxxxxxxxx; iPlanetDirectoryPro=xxxxxxxxxxxxxxxxxx'

    req = request.Request(url)
    # Set the cookie
    req.add_header('Cookie', cookie_str)
    # Set the request header
    req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36')

    resp = request.urlopen(req)
    print(resp.read().decode('utf-8'))

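requests version:

    import requests
    import sys
    import io

    # Change the default encoding of standard output
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')

    # URL of a page that can only be accessed after login
    url = 'http://ssfw.xmu.edu.cn/cmstar/index.portal'

    # Cookie obtained after logging in with the browser, i.e. the string just copied
    cookie_str = r'JSESSIONID=xxxxxxxxxxxxxxxxxxxxxx; iPlanetDirectoryPro=xxxxxxxxxxxxxxxxxx'

    # Turn the cookie string into a dict for later use
    cookies = {}
    for line in cookie_str.split(';'):
        key, value = line.strip().split('=', 1)
        cookies[key] = value

    # Set the request header
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'}

    # Carry the request header and the cookies when sending the GET request
    resp = requests.get(url, headers=headers, cookies=cookies)
    print(resp.content.decode('utf-8'))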

Method Two: Simulate the login, then carry the cookie on later requests

Principle:

The program first sends a login request to the website, that is, it submits a form containing the login information (user name, password, and so on). It takes the cookie from the response and carries it on every later request, so that it receives pages that can only be seen after logging in.
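On the wire, the cookie moves between two headers: the login response carries Set-Cookie, and later requests carry Cookie. The sketch below shows that conversion with the standard library's `http.cookies.SimpleCookie`. The helper name `cookie_header_from_response` and the header values are made up for illustration.

```python
from http.cookies import SimpleCookie

def cookie_header_from_response(set_cookie_headers):
    """Build the value of a Cookie request header from Set-Cookie response headers."""
    pairs = []
    for header in set_cookie_headers:
        parsed = SimpleCookie()
        parsed.load(header)                  # parses name, value and attributes
        for name, morsel in parsed.items():
            pairs.append(f"{name}={morsel.value}")
    return "; ".join(pairs)

# Made-up headers, as a server might send them after a successful login
login_response_headers = [
    "sessionid=abc123; Path=/; HttpOnly",
    "token=xyz789; Path=/",
]
print(cookie_header_from_response(login_response_headers))
# sessionid=abc123; token=xyz789
```

Attributes such as Path and HttpOnly tell the client how to store the cookie; only the name=value part is sent back to the server.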

Specific steps:

1. Find out which page the form is submitted to

Again, use the browser's developer tools. Go to the Network tab and tick Preserve log (important!). Log in to the website in your browser, then look in the Name column on the left for the request the form was submitted to. How do you recognize it? Open the Headers tab on the right. First, in the General section, the Request Method should be POST. Second, near the bottom there should be a Form Data section showing the user name and password you just entered. The name on the left is also a hint: if it contains the word "login", it may be the form submission (but not necessarily!).

Note that the page the form is submitted to is usually not the page where you type your user name and password, so use the tools to find it.

2. Find the data you need to submit

Although you only type a user name and password when logging in with the browser, the form contains more data than that. The Form Data section shows everything that has to be submitted.
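The fields listed under Form Data end up as one URL-encoded string in the body of the POST request. As a rough sketch, with hypothetical field names and values (a real site's form will differ), the encoding looks like this:

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields: two typed by the user, one extra field taken from the form
form = {
    "username": "student01",
    "password": "p@ss word",
    "goto": "http://example.com/loginSuccess",
}

# urlencode escapes special characters; encode() gives the bytes a POST body needs
body = urlencode(form).encode("utf-8")
print(body)
# b'username=student01&password=p%40ss+word&goto=http%3A%2F%2Fexample.com%2FloginSuccess'

# Decoding the body gives back the original values
print(parse_qs(body.decode("utf-8")))
```

Fields you did not type yourself must still be included, because the server may expect the complete form.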

3. Write code

urllib version:

    import sys
    import io
    import urllib.parse
    import urllib.request
    import http.cookiejar

    # Change the default encoding of standard output
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')

    # Data to POST when logging in
    data = {'Login.Token1': 'student number',
            'Login.Token2': 'password',
            'goto': 'http://ssfw.xmu.edu.cn/cmstar/loginSuccess.portal',
            'gotoOnFail': 'http://ssfw.xmu.edu.cn/cmstar/loginFailure.portal'}
    post_data = urllib.parse.urlencode(data).encode('utf-8')

    # Set the request header
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'}

    # Address the login form is submitted to (visible in the developer tools)
    login_url = 'http://ssfw.xmu.edu.cn/cmstar/userPasswordValidate.portal'

    # Build the login request
    req = urllib.request.Request(login_url, headers=headers, data=post_data)
    # Create the cookie jar
    cookie = http.cookiejar.CookieJar()
    # Build an opener from the cookie jar
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookie))
    # Send the login request; from now on the opener carries the cookie that proves it has logged in
    resp = opener.open(req)

    # URL of a page that can only be accessed after login
    url = 'http://ssfw.xmu.edu.cn/cmstar/index.portal'
    # Build the access request
    req = urllib.request.Request(url, headers=headers)
    resp = opener.open(req)
    print(resp.read().decode('utf-8'))

requests version:

    import requests
    import sys
    import io

    # Change the default encoding of standard output
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')

    # Data to POST when logging in
    data = {'Login.Token1': 'student number',
            'Login.Token2': 'password',
            'goto': 'http://ssfw.xmu.edu.cn/cmstar/loginSuccess.portal',
            'gotoOnFail': 'http://ssfw.xmu.edu.cn/cmstar/loginFailure.portal'}

    # Set the request header
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'}

    # Address the login form is submitted to (visible in the developer tools)
    login_url = 'http://ssfw.xmu.edu.cn/cmstar/userPasswordValidate.portal'

    # Send the login request and take the cookies from the response
    resp = requests.post(login_url, data=data, headers=headers)
    cookies = resp.cookies

    # URL of a page that can only be accessed after login
    url = 'http://ssfw.xmu.edu.cn/cmstar/index.portal'

    # Carry the request header and the cookies when sending the GET request
    resp = requests.get(url, headers=headers, cookies=cookies)
    print(resp.content.decode('utf-8'))

The requests library is clearly the more convenient of the two.

Method Three: Simulate the login and use a Session to stay logged in

Principle:

Session means a conversation. Like a cookie, it lets the server "recognize" the client. Put simply, a series of interactions between a client and the server is treated as one "session", and within the same session the server naturally knows whether the client has logged in.
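The idea can be tried without touching a real website. The sketch below starts a toy server on the local machine and talks to it through a single cookie-aware opener, which plays the role of the session. The handler, the `/login` and `/home` paths, and the cookie value are all invented for this demonstration and use only the standard library.

```python
import threading
import urllib.parse
import urllib.request
import http.cookiejar
from http.server import BaseHTTPRequestHandler, HTTPServer

class DemoHandler(BaseHTTPRequestHandler):
    """Toy site: POST /login sets a cookie, GET /home requires it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)                  # form data is ignored in this toy
        self.send_response(200)
        self.send_header("Set-Cookie", "sessionid=demo123; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        logged_in = "sessionid=demo123" in self.headers.get("Cookie", "")
        body = b"secret page" if logged_in else b"please log in"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):                # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), DemoHandler)   # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

# One opener with one cookie jar acts as the session.
# ProxyHandler({}) ignores any system proxy for this local demo.
jar = http.cookiejar.CookieJar()
session = urllib.request.build_opener(
    urllib.request.ProxyHandler({}),
    urllib.request.HTTPCookieProcessor(jar),
)

before = session.open(base + "/home").read()
login_data = urllib.parse.urlencode({"user": "x"}).encode("utf-8")
session.open(base + "/login", data=login_data)   # the cookie is stored in the jar
after = session.open(base + "/home").read()      # and sent back automatically
print(before, after)

server.shutdown()
server.server_close()
```

If the demo runs as intended, the first read returns the "please log in" body and the second returns "secret page", even though the program never handles the cookie itself.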

Specific steps:

1. Find out which page the form was submitted to

2. Find the data you want to submit

These two steps are the same as the first two steps of method two.

3. Write code

requests version:

    import requests
    import sys
    import io

    # Change the default encoding of standard output
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')

    # Data to POST when logging in
    data = {'Login.Token1': 'student number',
            'Login.Token2': 'password',
            'goto': 'http://ssfw.xmu.edu.cn/cmstar/loginSuccess.portal',
            'gotoOnFail': 'http://ssfw.xmu.edu.cn/cmstar/loginFailure.portal'}

    # Set the request header
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'}

    # Address the login form is submitted to (visible in the developer tools)
    login_url = 'http://ssfw.xmu.edu.cn/cmstar/userPasswordValidate.portal'

    # Create the Session and send the request header with every request
    session = requests.Session()
    session.headers.update(headers)

    # Send the login request within the session; the session then stores the cookie
    # It can be inspected with print(session.cookies.get_dict())
    resp = session.post(login_url, data)

    # URL of a page that can only be accessed after login
    url = 'http://ssfw.xmu.edu.cn/cmstar/index.portal'

    # Send the access request
    resp = session.get(url)
    print(resp.content.decode('utf-8'))

Method Four: Access the site with a headless browser

Characteristics:

Powerful and able to handle almost any web page, but slow to run.

Principle:

If the program can drive a real browser to visit the site, operations such as logging in become easy. In Python, the Selenium library can control a browser: the actions written in the code (open a page, click, ...) are carried out faithfully by the browser. The controlled browser can be Firefox, Chrome, and so on, but the most commonly used one is PhantomJS, a browser with no interface. In other words, once filling in the user name and password, clicking the "Login" button, opening another page, and similar operations are written into the program, PhantomJS actually performs the login and returns the response to you.

Specific steps:

1. Install the Selenium library and the PhantomJS browser

2. Locate the input boxes and the button in the login page's source

Because the operations are performed in a headless browser, the program has to find an input box before it can type into it, and find the login button before it can click it.

Open the login page in the browser, move the cursor to the user name text box, right-click and choose "Inspect element"; the page source on the right then shows which element the text box is. The password text box and the login button can be found in the source the same way.
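The same information can also be pulled out of the page source by a program. As a rough sketch using only the standard library, the hypothetical `InputFinder` class below lists the attributes of every input and button tag. The HTML is a made-up login form; a real page will differ.

```python
from html.parser import HTMLParser

class InputFinder(HTMLParser):
    """Collect the attributes of every <input> and <button> tag."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "button"):
            self.fields.append(dict(attrs))

# Made-up login form, for illustration only
page = """
<form action="/login" method="post">
  <input type="text" name="user">
  <input type="password" name="pwd">
  <input type="radio" name="role" value="student">
  <button name="btn" type="submit">Login</button>
</form>
"""

finder = InputFinder()
finder.feed(page)
for field in finder.fields:
    print(field.get("name"), field.get("type"))
# user text
# pwd password
# role radio
# btn submit
```

The name, type and value attributes found this way are the kind of detail needed to locate each element from the program.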

3. Decide how to find these elements in your program

The Selenium library provides find_element(s)_by_xxx methods for finding input boxes, buttons, and other elements in a web page. Here xxx can be id, name, tag_name (tag name), class_name (class), xpath (XPath expression), and so on. Which one to use depends on the source of the particular page. These method names belong to Selenium 3 and earlier; Selenium 4 replaces them with find_element(By.NAME, ...) and similar calls.

4. Write code

    import sys
    import io
    from selenium import webdriver

    # Change the default encoding of standard output
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')

    # Create the PhantomJS browser object; the argument is the path to phantomjs.exe on your computer
    browser = webdriver.PhantomJS('D:/tool/07-net/phantomjs-windows/phantomjs-2.1.1-windows/bin/phantomjs.exe')

    # Login page
    url = r'http://ssfw.xmu.edu.cn/cmstar/index.portal'
    # Open the login page
    browser.get(url)
    # Wait a while so that the JavaScript can finish loading
    browser.implicitly_wait(3)

    # Enter the user name
    username = browser.find_element_by_name('user')
    username.send_keys('student number')
    # Enter the password
    password = browser.find_element_by_name('pwd')
    password.send_keys('password')
    # Select the "student" radio button
    student = browser.find_element_by_xpath('//input[@value="student"]')
    student.click()
    # Click the "Login" button
    login_button = browser.find_element_by_name('btn')
    login_button.submit()

    # Take a screenshot of the page
    browser.save_screenshot('picture1.png')
    # Print the page source
    print(browser.page_source.encode('utf-8').decode())
    browser.quit()
