Simulating browser requests and session persistence in Python

Source: Internet
Author: User
Tags: set cookie


In Python, urllib2 can read the data from a page, which makes a basic request easy to implement. (urllib2 is a Python 2 module; in Python 3 its functionality lives in urllib.request.)

    import urllib2
    print urllib2.urlopen('http://www.baidu.com').read()

 

A POST request needs three things: the header information, the POST data to submit, and the URL of the page being requested.

The POST data must be passed through urllib.urlencode(), which converts the dictionary into the form "data1=value1&data2=value2".
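As a small illustration of that encoding step, here is a sketch using the Python 3 standard library, where the same function is available as urllib.parse.urlencode. The Python 3 spelling is an assumption on my part, since the article itself targets Python 2. The second dictionary is a made-up example showing how special characters are escaped.

```python
from urllib.parse import urlencode

# The dictionary from the article becomes a form-encoded string.
post_data = {"data1": "value1", "data2": "value2"}
encoded = urlencode(post_data)
print(encoded)  # data1=value1&data2=value2

# Hypothetical value with special characters: the space becomes '+'
# and '&' is percent-encoded so it cannot be mistaken for a separator.
escaped = urlencode({"q": "a b&c"})
print(escaped)  # q=a+b%26c
```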

    import urllib
    import urllib2

    HEADER = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0',
        'Referer': 'http://202.206.1.163/logout.do'
    }

    POSTDATA = {
        'data1': 'value1',
        'data2': 'value2'
    }

    HOSTURL = 'http://xxx.com'

    # Encode the dictionary as "data1=value1&data2=value2"
    enpostdata = urllib.urlencode(POSTDATA)
    # Supplying data turns the request into a POST
    urlrequest = urllib2.Request(HOSTURL, enpostdata, HEADER)
    urlresponse = urllib2.urlopen(urlrequest)
    print urlresponse.read()
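To see what such a request object contains without touching the network, the sketch below builds the same request with Python 3's urllib.request (an assumed port; the article's code is Python 2) and inspects it instead of sending it. In Python 3 the body must be bytes, so the encoded string is converted first. The URL is the article's placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

HEADER = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0",
    "Referer": "http://202.206.1.163/logout.do",
}
POSTDATA = {"data1": "value1", "data2": "value2"}
HOSTURL = "http://xxx.com"  # placeholder URL taken from the article

# Python 3 requires the request body as bytes, not str.
body = urlencode(POSTDATA).encode("ascii")
request = Request(HOSTURL, data=body, headers=HEADER)

# A request that carries data is sent as POST; without data it is a GET.
print(request.get_method())           # POST
print(Request(HOSTURL).get_method())  # GET
print(request.data)                   # b'data1=value1&data2=value2'

# Request stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```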

 

After the request, the browser keeps the session alive. The session is stored in a cookie, and the next page request sends that cookie in the request header. If the cookie is lost, the session is disconnected.
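The round trip described here can be sketched by hand: the server's Set-Cookie value is parsed, and the name=value pair is what goes back in the Cookie header of the next request. The example uses Python 3's http.cookies module. The cookie name and value are invented for illustration and do not come from any real site.

```python
from http.cookies import SimpleCookie

# Hypothetical value of a Set-Cookie response header.
set_cookie_value = "sessionid=abc123; Path=/; HttpOnly"

# Parse it the way a client would after receiving the response.
jar = SimpleCookie()
jar.load(set_cookie_value)

# Only the name=value pairs are sent back; attributes such as Path
# stay on the client and decide when the cookie applies.
cookie_header = "; ".join(
    "%s=%s" % (name, morsel.value) for name, morsel in jar.items()
)
print(cookie_header)             # sessionid=abc123
print(jar["sessionid"]["path"])  # /
```

If the client drops this header on a later request, the server has no way to associate the request with the earlier login.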

 

Set up cookie persistence in Python:

    import cookielib
    import urllib2

    # Cookie jar used to keep the session
    cj = cookielib.LWPCookieJar()
    cookie_support = urllib2.HTTPCookieProcessor(cj)
    opener = urllib2.build_opener(cookie_support, urllib2.HTTPHandler)
    urllib2.install_opener(opener)
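For readers on Python 3, a rough equivalent of this setup might look like the following. This is a sketch under the assumption that cookielib maps to http.cookiejar and urllib2 to urllib.request. It only builds the opener and checks that the cookie handler is in place; no request is sent.

```python
import http.cookiejar
import urllib.request

# Cookie jar that will hold the session cookies between requests.
cookie_jar = http.cookiejar.LWPCookieJar()
cookie_support = urllib.request.HTTPCookieProcessor(cookie_jar)
opener = urllib.request.build_opener(cookie_support, urllib.request.HTTPHandler)

# Optional: make urllib.request.urlopen() use this opener globally,
# as the article does with urllib2.install_opener().
# urllib.request.install_opener(opener)

# Confirm the cookie processor is part of the handler chain.
has_cookie_handler = any(
    isinstance(handler, urllib.request.HTTPCookieProcessor)
    for handler in opener.handlers
)
print(has_cookie_handler)  # True
print(len(cookie_jar))     # 0, nothing stored until a response sets a cookie
```

Calling opener.open(url) directly instead of installing the opener keeps the cookie handling local to the code that needs it.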

 

The following library file pulls the points above together for ease of use:

# Filename: analogop.py

    #!/usr/bin/python
    # -*- coding: UTF-8 -*-
    # author: First line
    # qq: 121866673
    # mail: zxbd1016@163.com
    # message: I need a python job
    # time:

    import urllib
    import urllib2
    import cookielib

    # Cookie jar used to maintain the session
    cj = cookielib.LWPCookieJar()
    cookie_support = urllib2.HTTPCookieProcessor(cj)
    opener = urllib2.build_opener(cookie_support, urllib2.HTTPHandler)
    urllib2.install_opener(opener)

    # Default header
    HEADER = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0',
        'Referer': 'http://202.206.1.163/logout.do'
    }

    # Request helper
    def geturlopen(hosturl, postdata={}, headers=HEADER):
        # Encode the POST data
        enpostdata = urllib.urlencode(postdata)
        # Build the request
        urlrequest = urllib2.Request(hosturl, enpostdata, headers)
        # Open the URL; cookies are handled by the installed opener
        urlresponse = urllib2.urlopen(urlrequest)
        # Return the response object
        return urlresponse

 

The following is a test file. No test environment is provided, so you will need to build your own or find a website to test against:

# Filename: test.py

    from analogop import geturlopen

    postd = {
        'usernum': '2011411111',
        'upw': '124569',
        'userip': '192.168.10.1',
        'token': 'xxx'
    }

    # Log in; the session cookie is kept by the opener installed in analogop
    urlread = geturlopen('http://127.0.0.1:8000/login/', postd)
    print urlread.read().decode('utf-8')

    # Request a second page within the same session
    urlread = geturlopen('http://127.0.0.1:8000/chafen/', {})
    print urlread.read().decode('utf-8')
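Since the test needs a server to talk to, one possible way to get a throwaway environment is to start a tiny local server in the same script. The sketch below is written for Python 3 and is only an illustration. DemoHandler, run_demo, the cookie value and the response texts are all hypothetical stand-ins, not the real /login/ and /chafen/ pages. It logs in with a cookie-aware opener, then fetches the second page once with and once without cookies.

```python
import http.cookiejar
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlencode


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the /login/ and /chafen/ pages."""

    def _reply(self, text, set_cookie=None):
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        if set_cookie:
            self.send_header("Set-Cookie", set_cookie)
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Read and discard the form body, then hand out a session cookie.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        if self.path == "/login/":
            self._reply("logged in", set_cookie="sessionid=demo123; Path=/")
        else:
            self._reply("not found")

    def do_GET(self):
        # The session only exists if the client sends the cookie back.
        if "sessionid=demo123" in self.headers.get("Cookie", ""):
            self._reply("session ok")
        else:
            self._reply("no session")

    def log_message(self, *args):
        pass  # keep the demo output quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_address[1]

    # ProxyHandler({}) ignores proxy environment variables for loopback use.
    no_proxy = urllib.request.ProxyHandler({})
    jar = http.cookiejar.CookieJar()
    with_cookies = urllib.request.build_opener(
        no_proxy, urllib.request.HTTPCookieProcessor(jar))
    without_cookies = urllib.request.build_opener(no_proxy)

    form = urlencode({"usernum": "2011411111", "upw": "124569"}).encode("ascii")
    login = with_cookies.open(base + "/login/", form).read().decode("utf-8")
    kept = with_cookies.open(base + "/chafen/").read().decode("utf-8")
    lost = without_cookies.open(base + "/chafen/").read().decode("utf-8")

    server.shutdown()
    server.server_close()
    return login, kept, lost


print(run_demo())  # ('logged in', 'session ok', 'no session')
```

The third result shows the point made earlier: without the cookie, the second request is no longer part of the session.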

 


Tutorial: how to simulate a web browser to read dynamic web pages in Python

Simulating the request process step by step is tedious. Driving a real browser to crawl resources is slow, and modifying a browser engine to support JS is too troublesome and difficult.

Python help request: a browser simulation problem

There are two routes on Linux:
1. Use Qt for the interface and WebKit to open the web login page, the same as on Windows. This is the method I recommend most.
2. Use a packet capture tool to analyze the login process. What the browser does amounts to nothing more than security controls or JS encryption. You can analyze these two mechanisms and then implement them in Python. This is more difficult and is not recommended.

If you have to use Python, you can also use Tcl/Tkinter for the interface, which is about as difficult as the first method.

