Python: Simulating a Browser Login

Source: Internet
Author: User
Tags: set cookie, urlencode

Translated from: http://blog.csdn.net/shomy_liu/article/details/37658701

The previous article roughly covered two simple cases of fetching a web page with Python; this one moves on to logging in, using Renren as the working example.

Let's first summarize the steps a login requires:

1. Add the cookie configuration

When a site requires an account and password, hitting the URL directly or naively imitating the browser generally gets you nowhere. The usual solution is Python's cookielib module, which remembers the cookies saved locally after a successful login;

The specific code is in the Renren login example below.
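Note that cookielib and urllib2 are Python 2 modules; in Python 3 they became http.cookiejar and urllib.request. A minimal sketch of the same cookie setup in Python 3 terms:

```python
import urllib.request
import http.cookiejar

# http.cookiejar is the Python 3 name for cookielib;
# urllib.request replaces urllib2.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# After install_opener, plain urllib.request.urlopen() calls also go
# through this opener, so cookies are stored and resent automatically.
urllib.request.install_opener(opener)
```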

2. Add the form data that the login needs to submit

The POST data for a login generally includes the username, password, and a number of other fields; whether the remaining fields are necessary has to be tested. To see them, use HttpFox or the Network tab of the browser's developer tools: when you click Login, the POST and GET data appear there, and you can copy what you need.

Below is the code that imitates logging in to Renren. The comments are fairly detailed, for later reference. (No actual crawling is done here; that is left for later study.)

# -*- coding: cp936 -*-
# renren login
# filename: renren.py

import urllib2, urllib, cookielib

# Set up the cookie jar and bind it to an opener
cookiejar = cookielib.CookieJar()
cookie = urllib2.HTTPCookieProcessor(cookiejar)
opener = urllib2.build_opener(cookie, urllib2.HTTPHandler())
urllib2.install_opener(opener)

# Account information
email = raw_input('input mailbox')
password = raw_input('input password')
domain = 'renren.com'  # domain name
url = 'http://www.renren.com/PLogin.do'  # found by inspecting the page elements

# Captured with HttpFox: headers and domain are optional, and postdata
# contains many fields, but username and password are the essential ones.
# The User-Agent header is there to deal with anti-crawler checks.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0'
}
data = {
    'email': email,
    'password': password,
    'domain': domain
}

# Encode the form data
postdata = urllib.urlencode(data)

# Make the request
req = urllib2.Request(url, postdata, headers)

# Print the page source
print urllib2.urlopen(req).read()
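The urllib2 code above is Python 2 only. As a rough Python 3 port of the same flow (the email and password below are placeholders, and the Renren endpoint may no longer behave as it did when the post was written):

```python
import urllib.request
import urllib.parse
import http.cookiejar

def build_login_request(url, fields, user_agent):
    """Build a POST request carrying URL-encoded form data and a browser User-Agent."""
    postdata = urllib.parse.urlencode(fields).encode('utf-8')  # Request wants bytes in Python 3
    return urllib.request.Request(url, data=postdata, headers={'User-Agent': user_agent})

jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

req = build_login_request(
    'http://www.renren.com/PLogin.do',
    {'email': 'me@example.com', 'password': 'secret', 'domain': 'renren.com'},  # placeholders
    'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0',
)
# opener.open(req) would send the POST and let `jar` capture the session cookies.
```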

Translated from: http://zipperary.com/2013/08/16/python-login/

I have posted several crawler scripts on this blog; they are handy for downloading pictures in bulk, and that kind of crawler is easy to implement. But some sites require users to log in before files can be downloaded, and the earlier approach cannot handle that. Today I describe how to simulate a browser's login process with Python, so that downloads requiring a login become possible.

Compared with the earlier crawlers, the one extra module the login needs is cookielib, which remembers the cookies saved locally after a successful login, making it easy to move between the site's pages.
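The "saved locally" part can be made literal: cookielib's MozillaCookieJar (http.cookiejar.MozillaCookieJar in Python 3) can write cookies to a file and reload them on the next run. A sketch, with an arbitrary filename:

```python
import http.cookiejar
import urllib.request

# MozillaCookieJar can persist cookies in a Netscape-format text file.
jar = http.cookiejar.MozillaCookieJar('cookies.txt')
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# After a successful login through `opener`:
jar.save(ignore_discard=True)   # keep session cookies too
# ...and on a later run, restore them and skip the login step:
jar.load(ignore_discard=True)
```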

The first code example:

#encoding=utf8
import urllib
import urllib2
import cookielib

### URL of the login page
lgurl = 'http://mlook.mobi/member/login'

### Create a CookieJar with the cookielib module, then build a cookie handler with urllib2
cookie = cookielib.CookieJar()
cookie_handler = urllib2.HTTPCookieProcessor(cookie)

### Some sites block crawlers; these headers disguise the program as a browser
hds = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36'}

### Form data the login needs to submit
pstdata = {'formhash': '',         # fill in the formhash
           'person[login]': '',    # fill in your username for the site
           'person[password]': '', # fill in your password for the site
           }

dt = urllib.urlencode(pstdata)  # encode the form data into the format URLs use
req = urllib2.Request(url=lgurl, data=dt, headers=hds)  # a browser-like request carrying the form data; nothing is sent yet, this only creates the object
opener = urllib2.build_opener(cookie_handler)  # bind the handler, creating a custom opener
response = opener.open(req)  # request the page; returns a handle
page = response.read()  # read and return the page content

print page  # print to the terminal

Explanation:

I will not provide a username and password here. To find the form data to submit, Chrome users can press F12 -> Network -> fill in the account and password and log in -> find the POST request in Network; see the screenshots.

Click "login" to enter the following image interface.


"from data" inside the data is more, usually need user name, password, the remaining data is necessary, need to test. For this website, also need "formhash".

There are no encoding problems under Linux; if encoding problems appear on Windows, the terminal's incomplete encoding support is the likely cause.

After a successful login, the cookie_handler we created manages the cookies automatically; if the program needs to visit other pages of the site later, just open their URLs with the opener.

"user-agent" can also be found by F12.

For a more detailed and better-written description, refer to the link in the original post.

This post is not meant to explain the underlying principles; the point is to record this simple block of code, which other crawlers that need to log in can imitate.

The purpose of this program is to download Mlook e-books in bulk.
