Summary of methods for simulating HTTP requests in Python
Python is a powerful tool for web crawling. This article summarizes methods and techniques for simulating HTTP requests with Python.
Python 2 ships two request-related libraries: urllib and urllib2. These are not two versions of the same library. urllib mainly handles URL-related work; when sending a request with it, the request can only be given as a URL string. urllib2 can build a request from a Request object, which makes it possible to forge headers, set a proxy, and issue HTTP GET, HTTP POST, and other methods.
Before reading this article, you should know the basics of HTTP requests, such as:
- What are HTTP requests and responses?
- What are GET and POST?
- What is a cookie?
This article describes the following techniques for simulating requests:
- Set a proxy
- Forge header information
- Enable cookies
- Process URL parameters
Send a request directly with urllib2.urlopen
import urllib2

url = 'http://www.baidu.com/'
# urlopen accepts either a URL string or a Request object as input
response = urllib2.urlopen(url)
response_text = response.read()
Send a request with urllib2.build_opener
import urllib2

url = 'http://www.baidu.com/'
opener = urllib2.build_opener()
response = opener.open(url)
response_text = response.read()
Access a site through a proxy
import urllib2

url = 'http://www.baidu.com/'
proxy_handler = urllib2.ProxyHandler({'http': 'http://localhost:8888'})
opener = urllib2.build_opener(proxy_handler)
response = opener.open(url)
response_text = response.read()
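The snippets in this article use Python 2's urllib2. For readers on Python 3, where urllib2 was folded into urllib.request, here is a minimal sketch of the equivalent proxy setup; no request is actually sent, and the localhost address is the same placeholder used above:

```python
from urllib import request

# register a local proxy for plain-HTTP traffic, as in the Python 2 example
proxy_handler = request.ProxyHandler({'http': 'http://localhost:8888'})
opener = request.build_opener(proxy_handler)

# the opener now carries the proxy handler alongside the default handlers
print(any(isinstance(h, request.ProxyHandler) for h in opener.handlers))  # True
```

Calling `opener.open(url)` would then route the request through the proxy.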
Attach a request body (HTTP POST)
import urllib2

url = 'http://www.baidu.com/'
opener = urllib2.build_opener()
# passing a second argument makes this an HTTP POST with that body
response = opener.open(url, 'request body')
response_text = response.read()
If the body is in key-value format, refer to the URL-parameter processing section below.
Enable cookies
import cookielib
import urllib2

url = 'http://www.baidu.com/'
cookie = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
response = opener.open(url)
response_text = response.read()
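In Python 3, cookielib was renamed to http.cookiejar; a minimal sketch of the same cookie-enabled opener (no request is sent here):

```python
import http.cookiejar
from urllib import request

# http.cookiejar is the Python 3 name for cookielib
jar = http.cookiejar.CookieJar()
opener = request.build_opener(request.HTTPCookieProcessor(jar))

# the jar stays empty until a response's Set-Cookie headers are processed
print(len(jar))  # 0
```

After `opener.open(url)`, any cookies the server sets are stored in `jar` and sent back automatically on subsequent requests through the same opener.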
Use urllib2.Request to add custom header information
import urllib2

url = 'http://www.baidu.com/'
request = urllib2.Request(url)
request.add_data('1234567')  # attaching data turns the request into a POST
request.add_header('User-Agent', 'fake-client')
response = urllib2.urlopen(request)
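Python 3's urllib.request.Request dropped add_data; the body and headers are passed to the constructor instead, and the body must be bytes. A sketch of the equivalent request object (not actually sent):

```python
from urllib import request

url = 'http://www.baidu.com/'
# data must be bytes in Python 3; supplying it makes the request a POST
req = request.Request(url, data=b'1234567',
                      headers={'User-Agent': 'fake-client'})

print(req.get_method())  # POST
print(req.data)          # b'1234567'
```

`urllib.request.urlopen(req)` would then send it, just as urllib2.urlopen does above.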
Process parameter information in a URL
Whether you use GET or POST, you will often need to pass parameters. The following library calls can handle them.
Convert a parameter dictionary to a query string
import urllib

para = {'111': '222', 'aaa': 'bbb'}
encodeurl = urllib.urlencode(para)
Output (pair order may vary with dict ordering): aaa=bbb&111=222
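In Python 3, urlencode lives in urllib.parse. A sketch of the same conversion, with a round-trip check through parse_qs so the result does not depend on pair order:

```python
from urllib import parse

para = {'111': '222', 'aaa': 'bbb'}
encoded = parse.urlencode(para)
print(encoded)  # e.g. 111=222&aaa=bbb (pair order follows the dict)

# round-trip check that is independent of pair order
print(parse.parse_qs(encoded) == {'111': ['222'], 'aaa': ['bbb']})  # True
```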
Convert URL parameters to a dictionary
import urlparse

url = 'https://www.baidu.com/s?wd=python%20url%20querystring&pn=10&oq=python%20url%20querystring&tn=baiduhome_pg&ie=utf-8&usm=1&rsv_idx=2&rsv_pq=d09af93600035cb8&rsv_t=d151qRmNNdybGINHcKbyO360E2%2Fg%2FUs2t0MiKqRQXwhHZuNF3IlKyyStzYuofVZczQA3'
splitresult_instance = urlparse.urlsplit(url)
Output object:
SplitResult(scheme='https', netloc='www.baidu.com', path='/s', query='wd=python%20url%20querystring&pn=10&oq=python%20url%20querystring&tn=baiduhome_pg&ie=utf-8&usm=1&rsv_idx=2&rsv_pq=d09af93600035cb8&rsv_t=d151qRmNNdybGINHcKbyO360E2%2Fg%2FUs2t0MiKqRQXwhHZuNF3IlKyyStzYuofVZczQA3', fragment='')
If you want to convert the query string to a dictionary:

result_dic = urlparse.parse_qs(splitresult_instance.query)
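In Python 3, urlsplit and parse_qs both live in urllib.parse. A sketch of the same two steps, using a shortened stand-in URL (the article's full Baidu URL works the same way):

```python
from urllib import parse

# shortened stand-in for the article's longer Baidu search URL
url = 'https://www.baidu.com/s?wd=python%20url%20querystring&pn=10'
split = parse.urlsplit(url)
result_dic = parse.parse_qs(split.query)

print(split.netloc)  # www.baidu.com
print(result_dic)    # {'wd': ['python url querystring'], 'pn': ['10']}
```

Note that parse_qs decodes the %20 escapes back to spaces and returns each value as a list, since a key may appear more than once in a query string.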
In this way, putting the encoded parameters in the URL implements HTTP GET, and putting them in the request body implements HTTP POST.
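That GET/POST distinction can be sketched with Python 3's urllib.request: the same encoded parameter string goes into the URL's query string for GET, or into the body for POST. Nothing is sent over the network here; the requests are only constructed:

```python
from urllib import request, parse

params = parse.urlencode({'wd': 'python'})

# HTTP GET: parameters travel in the URL's query string
get_req = request.Request('http://www.baidu.com/s?' + params)

# HTTP POST: the same parameters travel in the request body, as bytes
post_req = request.Request('http://www.baidu.com/s', data=params.encode('utf-8'))

print(get_req.get_method())   # GET
print(post_req.get_method())  # POST
```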
This article is also hosted at http://simmon.club/blog/Python-HttpRequest/