Simulating browser requests and session persistence in Python

Last Update:2014-10-05 Source: Internet

Author: User

Tags set cookie

In Python 2, urllib2 can read the contents of a page with a single request:

    import urllib2
    print urllib2.urlopen('http://www.baidu.com').read()
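urllib2 exists only in Python 2. In Python 3 the same functionality lives in urllib.request, so a rough equivalent of the fetch above might look like the sketch below. The helper name fetch_page, the 10-second timeout, and the UTF-8 fallback are illustrative assumptions, not part of the original article.

```python
import urllib.request


def fetch_page(url, timeout=10):
    """Return the body of `url` as text (hypothetical helper, Python 3)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Use the charset declared in the response, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (needs network access):
#     print(fetch_page("http://www.baidu.com")[:200])
```

In Python 3, read() returns bytes, so the sketch decodes them before returning text.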

A POST request needs three things: the request headers, the POST data to submit, and the URL of the page being requested.

The POST data must be passed through urllib.urlencode() to convert the dictionary into the form "data1=value1&data2=value2".
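To make the encoding step concrete, here is a small sketch using the Python 3 location of the same function, urllib.parse.urlencode. The second value is made up to show how special characters are escaped.

```python
from urllib.parse import urlencode, parse_qs

# The second value is hypothetical; it contains a space and an "&".
post_data = {"data1": "value1", "data2": "value 2 & more"}

# urlencode joins key=value pairs with "&" and percent-encodes
# unsafe characters (spaces become "+").
encoded = urlencode(post_data)
print(encoded)  # data1=value1&data2=value+2+%26+more

# In Python 3 a request body must be bytes.
body = encoded.encode("ascii")

# parse_qs reverses the encoding, which is a quick way to check it.
decoded = parse_qs(encoded)
print(decoded)
```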

    import urllib
    import urllib2

    # Headers that make the request look like it comes from a browser
    HEADER = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0',
        'Referer': 'http://202.206.1.163/logout.do'
    }
    # Form fields to submit
    POSTDATA = {
        'data1': 'value1',
        'data2': 'value2'
    }
    HOSTURL = 'http://xxx.com'

    # Encode the form data, build the request, send it, and print the reply
    enpostdata = urllib.urlencode(POSTDATA)
    urlrequest = urllib2.Request(HOSTURL, enpostdata, HEADER)
    urlresponse = urllib2.urlopen(urlrequest)
    print urlresponse.read()
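It can help to inspect a request before sending it. The sketch below builds the same POST request with Python 3's urllib.request and prints what would go on the wire, without contacting any server. The helper name build_post_request is hypothetical, and http://xxx.com is the article's placeholder URL.

```python
import urllib.parse
import urllib.request

HEADER = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) "
                  "Gecko/20100101 Firefox/31.0",
    "Referer": "http://202.206.1.163/logout.do",
}
POSTDATA = {"data1": "value1", "data2": "value2"}
HOSTURL = "http://xxx.com"  # placeholder URL from the article


def build_post_request(url, postdata, headers):
    """Build (but do not send) a POST request; hypothetical helper."""
    body = urllib.parse.urlencode(postdata).encode("ascii")
    return urllib.request.Request(url, data=body, headers=headers)


request = build_post_request(HOSTURL, POSTDATA, HEADER)
print(request.get_method())    # POST, because a body is attached
print(request.data)            # b'data1=value1&data2=value2'
print(request.header_items())  # header names are stored capitalized
```

Sending the request is then a matter of passing it to urllib.request.urlopen(request).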

After a request, a browser keeps the session alive. The session identifier is stored in a cookie, and every later page request sends that cookie back in the request header. If the cookie is lost, the session is lost with it.
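The cookie round trip can be illustrated without any network traffic. The sketch below parses a made-up Set-Cookie header with the standard http.cookies module (Python 3) and builds the Cookie header a client would send back. The cookie name sessionid and its value are assumptions for illustration only.

```python
from http.cookies import SimpleCookie

# A hypothetical Set-Cookie header, as a server might send it after login.
set_cookie_header = "sessionid=abc123; Path=/"

cookie = SimpleCookie()
cookie.load(set_cookie_header)

session_id = cookie["sessionid"].value
print(session_id)  # abc123

# The client sends only name=value pairs back in the Cookie request header.
cookie_header = "; ".join(
    "{}={}".format(name, morsel.value) for name, morsel in cookie.items()
)
print(cookie_header)  # sessionid=abc123
```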

To keep cookies between requests in Python, install an opener that carries a cookie jar:

    import cookielib
    import urllib2

    # Cookie setup, used to keep the session
    cj = cookielib.LWPCookieJar()
    cookie_support = urllib2.HTTPCookieProcessor(cj)
    opener = urllib2.build_opener(cookie_support, urllib2.HTTPHandler)
    urllib2.install_opener(opener)
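To see what the cookie jar does on each request, the following sketch drives a jar by hand using the Python 3 modules http.cookiejar and urllib.request. FakeResponse, the host example.com, and the cookie value are stand-ins invented for the demonstration; with a real opener, HTTPCookieProcessor performs both steps automatically.

```python
import email
import http.cookiejar
import urllib.request


class FakeResponse:
    """Stand-in for an HTTP response; CookieJar only calls info()."""

    def __init__(self, raw_headers):
        self._headers = email.message_from_string(raw_headers)

    def info(self):
        return self._headers


jar = http.cookiejar.CookieJar()

# 1. The "server" answers a login request and sets a session cookie.
login = urllib.request.Request("http://example.com/login")
jar.extract_cookies(
    FakeResponse("Set-Cookie: sessionid=abc123; Path=/\n\n"), login
)

# 2. The jar attaches the stored cookie to the next request for that site.
follow_up = urllib.request.Request("http://example.com/profile")
jar.add_cookie_header(follow_up)
print(follow_up.get_header("Cookie"))  # sessionid=abc123

# Python 3 spelling of the opener built above (not used for any request here).
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```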

The following library file collects the points above so they are easy to reuse:

# Filename: analogop.py

    #!/usr/bin/python
    # -*- coding: UTF-8 -*-
    # author: First line
    # qq: 121866673
    # mail: zxbd1016@163.com
    # message: I need a python job
    # time:

    import urllib
    import urllib2
    import cookielib

    # cookie setup, used to maintain the session
    cj = cookielib.LWPCookieJar()
    cookie_support = urllib2.HTTPCookieProcessor(cj)
    opener = urllib2.build_opener(cookie_support, urllib2.HTTPHandler)
    urllib2.install_opener(opener)

    # default header
    HEADER = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0',
        'Referer': 'http://202.206.1.163/logout.do'
    }

    # operate method
    def geturlopen(hosturl, postdata={}, headers=HEADER):
        # encode postdata
        enpostdata = urllib.urlencode(postdata)
        # build the request
        urlrequest = urllib2.Request(hosturl, enpostdata, headers)
        # open the url; the installed opener sends and stores cookies
        urlresponse = urllib2.urlopen(urlrequest)
        # return the response object
        return urlresponse
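The library uses LWPCookieJar, which can also write its cookies to a file so that a session survives between runs of a program. The sketch below shows one way this might be done with the Python 3 module http.cookiejar. The cookie is built by hand only for the demonstration, and the helper make_session_cookie, the file name, and the cookie values are all assumptions.

```python
import http.cookiejar
import os
import tempfile


def make_session_cookie(name, value, domain):
    """Build a cookie by hand for the demo (normally the server sets it)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


cookie_file = os.path.join(tempfile.mkdtemp(), "cookies.lwp")

jar = http.cookiejar.LWPCookieJar(cookie_file)
jar.set_cookie(make_session_cookie("sessionid", "abc123", "example.com"))
# Session cookies are marked "discard"; ignore_discard=True keeps them.
jar.save(ignore_discard=True)

# Later, for example in another run of the program, reload the jar.
restored = http.cookiejar.LWPCookieJar(cookie_file)
restored.load(ignore_discard=True)
print([(c.name, c.value) for c in restored])  # [('sessionid', 'abc123')]
```

Without ignore_discard=True, session cookies that have no expiry date are skipped when saving and loading.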

The following is a test file. No test environment is provided, so you will need to build your own or find a website to test against:

# Filename: test.py

    from analogop import geturlopen

    postd = {
        'usernum': '2011411111',
        'upw': '124569',
        'userip': '192.168.10.1',
        'token': 'xxx'
    }

    urlread = geturlopen('http://127.0.0.1:8000/login/', postd)
    print urlread.read().decode('utf-8')

    urlread = geturlopen('http://127.0.0.1:8000/chafen/', {})
    print urlread.read().decode('utf-8')
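For readers without a test site, one option is a throwaway local server. The sketch below (Python 3, standard library only) starts a toy server with /login/ and /chafen/ paths, then logs in through an opener that carries a cookie jar. The server behavior, the cookie name sessionid, and the reply texts are all invented for the demonstration and do not describe any real site.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class LoginHandler(BaseHTTPRequestHandler):
    """Toy site: /login/ sets a session cookie, other paths require it."""

    def _reply(self, text, cookie=None):
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        if cookie:
            self.send_header("Set-Cookie", cookie)
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode("ascii"))
        if self.path == "/login/" and "usernum" in form:
            self._reply("logged in", cookie="sessionid=abc123; Path=/")
        elif "sessionid=abc123" in self.headers.get("Cookie", ""):
            self._reply("scores for the logged-in user")
        else:
            self._reply("not logged in")

    def log_message(self, *args):
        pass  # keep the demo output quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), LoginHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_address[1]

    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # do not route localhost via a proxy
        urllib.request.HTTPCookieProcessor(jar),
    )

    def post(path, data):
        body = urllib.parse.urlencode(data).encode("ascii")
        with opener.open(base + path, body, timeout=5) as resp:
            return resp.read().decode("utf-8")

    try:
        before = post("/chafen/", {})
        login = post("/login/", {"usernum": "2011411111", "upw": "124569"})
        after = post("/chafen/", {})
    finally:
        server.shutdown()
        server.server_close()
    return before, login, after


if __name__ == "__main__":
    print(run_demo())
```

The second request to /chafen/ succeeds only because the opener replays the cookie set by /login/.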

Tutorial: how to simulate a web browser to read dynamic web pages in Python

Simulating the request process step by step is tedious. Driving a real browser to crawl resources is slow, and modifying a browser engine to support JavaScript is too troublesome and difficult.

Help with simulating a browser in Python

There are two routes on Linux:
1. Use Qt for the interface and WebKit to open the login page, the same as on Windows. This is the method I recommend most.
2. Use a packet capture tool to analyze the login process. The browser does nothing more than apply security controls or JavaScript encryption, so you can analyze these two mechanisms and reimplement them in Python. This is more difficult and is not recommended.

If you must use Python, you can also use Tcl/Tkinter for the interface, which is about as difficult as the first method.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
