HTTP proxy for Python

Source: Internet
Author: User
Tags haproxy
0x00 preface

HTTP proxies are familiar to most readers and are widely used. They come in two kinds: forward proxies and reverse proxies. A reverse proxy generally exposes services behind a firewall to users or provides load balancing; typical examples are Nginx and HAProxy. This article discusses forward proxies.

The most common uses of an HTTP proxy are sharing a network connection, accelerating access, and getting around network restrictions. HTTP proxies are also often used to debug Web applications and to monitor and analyze the Web APIs called by Android/iOS apps. Well-known tools include Fiddler, Charles, Burp Suite, and mitmproxy. An HTTP proxy can also modify request/response content, which makes it possible to add functions to a Web application or change its behavior without changing the server side.

0x01 what is the HTTP proxy?

An HTTP proxy is essentially a Web application, no different in principle from other Web applications. After receiving a request, the proxy determines the target host from the Host field in the header and the address in the GET/POST request line, creates a new HTTP request to that host, forwards the request data, and then relays the response it receives back to the client.

If the request address is an absolute address, the HTTP proxy uses the host in that address; otherwise, it uses the Host field in the header. A simple test demonstrates this, assuming the following network environment:

  • 192.168.1.2 Web server

  • 192.168.1.3 HTTP proxy server

Use telnet for testing

$ telnet 192.168.1.3
GET / HTTP/1.0
HOST: 192.168.1.2

Note that the request must end with two consecutive line breaks (that is, an empty line); the HTTP protocol requires this. You then receive the content of the page at http://192.168.1.2/. Next, change the GET request so that it carries an absolute address:

$ telnet 192.168.1.3
GET http://httpbin.org/ip HTTP/1.0
HOST: 192.168.1.2

Note that HOST is still set to 192.168.1.2, yet the result is the content of http://httpbin.org/ip, that is, the public IP address information.
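
The rule exercised by the two telnet requests can be sketched in Python. This is only an illustration under assumptions: resolve_target is a hypothetical helper (not part of any library), it assumes plain HTTP with a default port of 80, and it does not handle IPv6 literals in the Host header.

```python
from urllib.parse import urlsplit


def resolve_target(raw_request: bytes):
    """Return the (host, port, path) a forward proxy would contact.

    Hypothetical helper for illustration only.
    """
    # The request head ends at the empty line (two consecutive CRLFs).
    head, _, _ = raw_request.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)

    # Header names are case-insensitive, so HOST and Host are equivalent.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        # Absolute request address: its host wins over the Host header.
        host, port = parts.hostname, parts.port or 80
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
    else:
        # Relative request address: fall back to the Host header.
        host, _, port_text = headers["host"].partition(":")
        port = int(port_text) if port_text else 80
        path = target
    return host, port, path
```

For the two requests above, resolve_target(b"GET / HTTP/1.0\r\nHOST: 192.168.1.2\r\n\r\n") returns ('192.168.1.2', 80, '/'), while the request for http://httpbin.org/ip resolves to ('httpbin.org', 80, '/ip') even though HOST still names 192.168.1.2.
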

The test above shows that an HTTP proxy is not very complicated: the client only needs to send the original request to the proxy server. When an application cannot be configured to use an HTTP proxy and only a small number of hosts are involved, the simplest approach is to point the target host's domain name at the proxy server's IP address, for example by modifying the hosts file.
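
To make the receive, forward, and relay flow concrete, here is a minimal sketch of a forward proxy built only on the Python 3 standard library. It is not the implementation of any tool mentioned in this article. ForwardProxyHandler and run are hypothetical names, and the sketch makes several simplifying assumptions: it handles plain HTTP GET only (no HTTPS CONNECT), it does not forward the client's request headers, and the upstream fetch follows redirects on its own.

```python
import http.server
import urllib.error
import urllib.request


class ForwardProxyHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical minimal forward proxy: plain HTTP GET only."""

    # An empty ProxyHandler makes the upstream fetch ignore *_proxy variables,
    # so the proxy never tries to reach the target through another proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def do_GET(self):
        if self.path.startswith("http://"):
            url = self.path  # absolute request address
        else:
            host = self.headers.get("Host")
            if host is None:
                self.send_error(400, "Missing Host header")
                return
            url = "http://" + host + self.path  # fall back to the Host header

        default_type = "application/octet-stream"
        try:
            with self.opener.open(url, timeout=10) as upstream:
                status, body = upstream.status, upstream.read()
                content_type = upstream.headers.get("Content-Type", default_type)
        except urllib.error.HTTPError as err:
            # Upstream answered with an error status: relay it unchanged.
            status, body = err.code, err.read()
            content_type = err.headers.get("Content-Type", default_type)
        except OSError:
            self.send_error(502, "Upstream request failed")
            return

        # Relay the upstream response to the client.
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8080):
    """Serve until interrupted; the port number is an arbitrary example."""
    server = http.server.ThreadingHTTPServer(("", port), ForwardProxyHandler)
    server.serve_forever()
```

Calling run() starts the proxy; a client can then be pointed at it, for example with http_proxy=localhost:8080 curl httpbin.org/ip.
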

0x02 set the HTTP proxy in the Python program

urllib2/urllib proxy settings

urllib2 is a powerful Python 2 standard library, but it is a little cumbersome to use. In Python 3, urllib2 no longer exists; its functionality moved into the urllib package (urllib.request). In urllib2, ProxyHandler is used to set the proxy server.

import urllib2

proxy_handler = urllib2.ProxyHandler({'http': '121.193.143.249:80'})
opener = urllib2.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())

You can also use install_opener to install the configured opener globally, so that every urllib2.urlopen call automatically uses the proxy.

urllib2.install_opener(opener)
r = urllib2.urlopen('http://httpbin.org/ip')
print(r.read())

In Python 3, use urllib.request.

import urllib.request

proxy_handler = urllib.request.ProxyHandler({'http': 'http://121.193.143.249:80/'})
opener = urllib.request.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())

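
The dictionary passed to ProxyHandler maps a URL scheme to a proxy URL, and an empty dictionary turns off proxy auto-detection for that opener. A small sketch of both cases follows; build_proxy_opener is a hypothetical helper name, and using the same proxy for http and https is an assumption made only for brevity.

```python
import urllib.request


def build_proxy_opener(proxy_url=None):
    """Return (handler, opener). Hypothetical helper for illustration.

    With proxy_url=None the opener never uses a proxy, even if proxy
    environment variables are set.
    """
    if proxy_url is None:
        proxies = {}  # empty mapping: auto-detected proxies are disabled
    else:
        # Assumption: the same proxy serves both schemes.
        proxies = {"http": proxy_url, "https": proxy_url}
    handler = urllib.request.ProxyHandler(proxies)
    return handler, urllib.request.build_opener(handler)
```

For example, handler, opener = build_proxy_opener('http://121.193.143.249:80/') gives an opener equivalent to the one above with an added https entry, while build_proxy_opener() gives one that always connects directly.
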
Requests proxy settings

Requests is one of the best HTTP libraries currently available and one of the most frequently used for constructing HTTP requests. Its API is user-friendly and easy to use. Setting a proxy for requests is simple: pass a dictionary such as {'http': 'x.x.x.x:8080', 'https': 'x.x.x.x:8080'} as the proxies parameter. The http and https entries are independent of each other.

In [5]: requests.get('http://httpbin.org/ip', proxies={'http': '121.193.143.249:80'}).json()
Out[5]: {'origin': '121.193.143.249'}

You can set the proxies attribute of a session directly, which saves passing the proxies parameter on every request.

import requests

s = requests.session()
s.proxies = {'http': '121.193.143.249:80'}
print(s.get('http://httpbin.org/ip').json())

0x03 HTTP_PROXY/HTTPS_PROXY environment variable

Both urllib2 and Requests recognize the HTTP_PROXY and HTTPS_PROXY environment variables; once they detect these variables, they use the proxy automatically. This is useful when debugging with an HTTP proxy, because you can change the proxy server's IP address and port through the environment without modifying the code. Most software on *nix also recognizes the HTTP_PROXY environment variable, for example curl, wget, axel, and aria2c.

$ http_proxy=121.193.143.249:80 python -c 'import requests; print(requests.get("http://httpbin.org/ip").json())'
{u'origin': u'121.193.143.249'}
$ http_proxy=121.193.143.249:80 curl httpbin.org/ip
{
  "origin": "121.193.143.249"
}
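
When it is unclear which proxy a Python program will pick up, it can help to print what the standard library detects. The sketch below uses urllib.request.getproxies(), which returns a scheme-to-proxy mapping built from the *_proxy environment variables (on some platforms it may fall back to system settings when none are set). describe_detected_proxies is a hypothetical helper name.

```python
import urllib.request


def describe_detected_proxies():
    """Return lines such as 'http -> http://host:port', one per detected proxy."""
    proxies = urllib.request.getproxies()
    return ["%s -> %s" % (scheme, url) for scheme, url in sorted(proxies.items())]
```

Running python -c with http_proxy set as in the commands above and printing describe_detected_proxies() should list an http entry carrying the value of that variable.
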

In the IPython interactive environment, you may need to debug HTTP requests temporarily. You can add or cancel the HTTP proxy by setting os.environ['http_proxy'].

In [245]: os.environ['http_proxy'] = '121.193.143.249:80'
In [246]: requests.get("http://httpbin.org/ip").json()
Out[246]: {u'origin': u'121.193.143.249'}
In [249]: os.environ['http_proxy'] = ''
In [250]: requests.get("http://httpbin.org/ip").json()
Out[250]: {u'origin': u'x.x.x.x'}

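
Outside an interactive session, the same add-then-cancel pattern can be wrapped in a context manager so that the previous value is always restored. This is a minimal sketch; temporary_proxy is a hypothetical helper, not part of Requests or the standard library.

```python
import contextlib
import os


@contextlib.contextmanager
def temporary_proxy(url, name="http_proxy"):
    """Set os.environ[name] inside the with-block, then restore the old state."""
    previous = os.environ.get(name)
    os.environ[name] = url
    try:
        yield
    finally:
        if previous is None:
            # The variable did not exist before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous
```

With this helper, a request issued inside "with temporary_proxy('121.193.143.249:80'):" sees the proxy variable and later requests do not. Note that urllib.request.urlopen builds its default opener once, so it may not notice changes made after its first call.
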
0x04 MITM-Proxy

MITM stands for man-in-the-middle. The term comes from the man-in-the-middle attack, which generally means intercepting, eavesdropping on, and tampering with data on the network between a client and a server.

mitmproxy is an open-source man-in-the-middle proxy written in Python. It supports SSL, transparent proxying, reverse proxying, traffic recording and replay, and custom scripts. Its functions are similar to those of Fiddler on Windows, but mitmproxy is a console program without a GUI; it is nevertheless easy to use. With mitmproxy you can easily filter, intercept, and modify any HTTP request/response that passes through the proxy, and you can even use its scripting API to write scripts that intercept and modify HTTP data automatically.

# test.py
def response(flow):
    flow.response.headers["BOOM"] = "boom!boom!boom!"

The script above adds a header named BOOM to every HTTP response that passes through the proxy. Start mitmproxy with the command mitmproxy -s 'test.py'; verifying with curl shows that the BOOM header is indeed present.

$ http_proxy=localhost:8080 curl -I 'httpbin.org/get'
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 03 Nov 2016 09:02:04 GMT
Content-Type: application/json
Content-Length: 186
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
BOOM: boom!boom!boom!
...

Obviously, mitmproxy scripts can do much more than this; combined with the power of Python, they open up many applications. mitmproxy also provides a powerful API on which you can build a dedicated proxy server with special functions.
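
As one possible extension of test.py, a script can define a request hook alongside the response hook and act only on selected hosts. The sketch below assumes mitmproxy's script hooks request(flow) and response(flow) and the pretty_host attribute of the request object. The file name rewrite.py, the TARGET_HOST constant, and the X-Debug header are illustrative choices, not anything mitmproxy prescribes.

```python
# rewrite.py (hypothetical file name)
TARGET_HOST = "httpbin.org"  # assumption: the host used in this article's examples


def request(flow):
    # Called for each client request before it is forwarded upstream.
    if flow.request.pretty_host == TARGET_HOST:
        flow.request.headers["X-Debug"] = "1"


def response(flow):
    # Called for each response before it is returned to the client.
    if flow.request.pretty_host == TARGET_HOST:
        flow.response.headers["BOOM"] = "boom!boom!boom!"
```

It would be started the same way as the earlier script, with mitmproxy -s 'rewrite.py'. Traffic to other hosts passes through unchanged.
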

Performance tests show that mitmproxy is not very efficient. This is fine for debugging, but in a production environment with a large number of concurrent requests passing through the proxy, its performance falls short. I used twisted to implement a simple proxy that adds features to my company's internal websites and improves the user experience; I will share it later.
