HTTP Proxies in Python Explained

Source: Internet
Author: User

0x00 Preface

Most readers will be familiar with HTTP proxies, which are used in many settings. HTTP proxies come in two kinds: forward proxies and reverse proxies. A reverse proxy usually sits behind the firewall to expose services to users or to balance load; typical examples are nginx and HAProxy. This article discusses forward proxies.

The most common uses of an HTTP proxy are sharing a network connection, speeding up access, and getting around network restrictions. HTTP proxies are also widely used for debugging web applications and for monitoring and analyzing the web APIs called by Android/iOS apps; well-known tools include Fiddler, Charles, Burp Suite, and mitmproxy. An HTTP proxy can also modify request and response content, adding functionality to a web application or changing its behavior without touching the server.

0x01 What is an HTTP proxy?

An HTTP proxy is essentially a web application and is not fundamentally different from any other. When the proxy receives a request, it determines the target host from the Host field in the header together with the request address of the GET/POST, establishes a new HTTP request to that host, forwards the request data, and relays the response it receives back to the client.

If the request address is an absolute address, the HTTP proxy uses the host in that address; otherwise it uses the Host field in the header. For a simple test, assume the following network environment:

    • Web Server

    • HTTP proxy Server

Test with telnet:

$ telnet

Note that the two consecutive carriage returns at the end are required by the HTTP protocol. Once they are sent, the proxy returns the content of the requested page. Now adjust the test and send a GET request with an absolute address:

$ telnet
HTTP/1.0
Host:

Note that the Host header is set here as well, yet the response is the content of the http://httpbin.org/ip page, that is, the public IP address information. The absolute address in the request line took precedence over the Host header.
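The rule these two tests illustrate can be sketched in a few lines of Python. This is a minimal illustration rather than a full HTTP parser; the function names are made up for this example, and example.com is a placeholder host, not one taken from the tests above.

```python
from urllib.parse import urlsplit


def parse_request(raw: bytes):
    """Split a raw HTTP request head into (method, target, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, headers


def target_host(raw: bytes) -> str:
    """Return the host a forward proxy would contact for this request."""
    _method, target, headers = parse_request(raw)
    host = urlsplit(target).netloc  # non-empty only for an absolute address
    return host or headers.get("host", "")


# Placeholder requests: same Host header, relative vs. absolute address.
relative = b"GET /ip HTTP/1.0\r\nHost: example.com\r\n\r\n"
absolute = b"GET http://httpbin.org/ip HTTP/1.0\r\nHost: example.com\r\n\r\n"

print(target_host(relative))  # example.com
print(target_host(absolute))  # httpbin.org
```

With a relative address the Host header decides; with an absolute address the Host header is ignored.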

As the tests above show, an HTTP proxy is not complicated: it is enough to send the original request to the proxy server. When an HTTP proxy cannot be configured and only a few hosts need to go through one, the simplest approach is to point the domain names of those hosts at the proxy server's IP address, which can be done by editing the hosts file.
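To make the point concrete, here is a toy forward proxy built only on the Python 3 standard library. It is a sketch under several assumptions: it handles GET only, has no error handling, and the class names, the stand-in origin server, and the localhost ports are all invented for this example. The client side sends raw bytes over a socket, much like the telnet tests.

```python
import socket
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class OriginHandler(BaseHTTPRequestHandler):
    """Stand-in web server: answers every GET with a fixed body."""

    def do_GET(self):
        body = b"hello from origin " + self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass


class ForwardProxyHandler(BaseHTTPRequestHandler):
    """Toy forward proxy: resolve the target, re-issue the request, relay the reply."""

    # Empty proxy mapping, so the proxy itself never goes through another proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def do_GET(self):
        if self.path.startswith("http://"):
            url = self.path  # absolute address wins
        else:
            url = "http://" + self.headers["Host"] + self.path  # fall back to Host
        with self.opener.open(url, timeout=5) as upstream:
            body = upstream.read()
            self.send_response(upstream.status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass


def serve(handler):
    """Start a server on a free localhost port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def raw_request(port, request: bytes) -> bytes:
    """Send raw bytes to 127.0.0.1:port and read until the peer closes."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


origin = serve(OriginHandler)
proxy = serve(ForwardProxyHandler)
origin_addr = f"127.0.0.1:{origin.server_port}"

# Absolute address: the (bogus) Host header is ignored.
reply_abs = raw_request(
    proxy.server_port,
    f"GET http://{origin_addr}/ip HTTP/1.0\r\nHost: ignored.invalid\r\n\r\n".encode(),
)
# Relative address: the proxy falls back to the Host header.
reply_rel = raw_request(
    proxy.server_port,
    f"GET /ip HTTP/1.0\r\nHost: {origin_addr}\r\n\r\n".encode(),
)

print(reply_abs.split(b"\r\n\r\n", 1)[1])  # b'hello from origin /ip'
print(reply_rel.split(b"\r\n\r\n", 1)[1])  # b'hello from origin /ip'

for server in (origin, proxy):
    server.shutdown()
    server.server_close()
```

The second request is the hosts-file scenario: the client believes it is talking to the web server directly, and the proxy recovers the real target from the Host header.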

0x02 Setting an HTTP proxy in a Python program

urllib2/urllib proxy settings

urllib2 is part of the Python standard library; it is very powerful, just a little cumbersome to use. In Python 3, urllib2 no longer exists and its functionality has moved into the urllib module. With urllib2, the proxy server is set through ProxyHandler.

import urllib2
proxy_handler = urllib2.ProxyHandler({'http': ''})
opener = urllib2.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())

You can also use install_opener to install the configured opener globally, so that every urllib2.urlopen call uses the proxy automatically.

urllib2.install_opener(opener)
r = urllib2.urlopen('')
print(r.read())

In Python 3, use urllib.request instead.

import urllib.request
proxy_handler = urllib.request.ProxyHandler({'http': ''})
opener = urllib.request.build_opener(proxy_handler)
r = opener.open('')
print(r.read())
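As a small Python 3 sketch of the same idea, the helper below builds an opener either with an explicit proxy or with proxies switched off. Passing an empty mapping to ProxyHandler disables the automatic proxy detection. The helper names and the address 127.0.0.1:8080 are assumptions for illustration, and uses_proxy peeks at the opener's handlers list, which is a CPython implementation detail rather than a documented interface.

```python
import urllib.request


def make_opener(proxy_url=None):
    """Build an opener that uses proxy_url for http, or no proxy at all."""
    proxies = {"http": proxy_url} if proxy_url else {}
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


def uses_proxy(opener):
    """Report whether a ProxyHandler is active in the opener's handler chain."""
    # `handlers` is an implementation attribute of OpenerDirector in CPython.
    return any(isinstance(h, urllib.request.ProxyHandler) for h in opener.handlers)


proxied = make_opener("http://127.0.0.1:8080")  # hypothetical local proxy
direct = make_opener()  # empty mapping: environment proxies are ignored

print(uses_proxy(proxied))  # True
print(uses_proxy(direct))   # False
# proxied.open("http://httpbin.org/ip") would now be sent through the proxy.
```

Keeping two openers side by side makes it easy to compare a proxied and a direct response while debugging.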

Requests proxy settings

Requests is one of the best HTTP libraries around, and it is the one I use most when constructing HTTP requests. Its API is very user-friendly and easy to pick up. Setting a proxy for Requests is simple: pass a proxies parameter of the form {'http': 'x.x.x.x:8080', 'https': 'x.x.x.x:8080'}. The http and https entries are independent of each other.

In [5]: requests.get('', proxies={'http': ''}).json()
Out[5]: {'origin': ''}

You can also set the proxies attribute of a session directly, which saves passing the proxies parameter with every request.

s = requests.session()
s.proxies = {'http': ''}
print(s.get('http://httpbin.org/ip').json())

0x03 HTTP_PROXY / HTTPS_PROXY environment variables

Both urllib2 and Requests recognize the HTTP_PROXY and HTTPS_PROXY environment variables and automatically use the proxy once they detect them. This is handy when debugging with an HTTP proxy, because the proxy server's IP address and port can be changed freely through the environment without modifying any code. Most *nix software also honors the HTTP_PROXY environment variable, for example curl, wget, axel, and aria2c.

$ http_proxy= python -c 'import requests; print(requests.get("http://httpbin.org/ip").json())'
{u'origin': u''}
$ http_proxy= curl
{
  "origin": ""
}

In an IPython interactive session you often need to debug HTTP requests temporarily; you can add or remove the HTTP proxy simply by setting os.environ['http_proxy'].

In [245]: os.environ['http_proxy'] = ''
In [246]: requests.get("http://httpbin.org/ip").json()
Out[246]: {u'origin': u''}
In [249]: os.environ['http_proxy'] = ''
In [250]: requests.get("http://httpbin.org/ip").json()
Out[250]: {u'origin': u'x.x.x.x'}
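The same add/remove pattern can be wrapped in a context manager so the environment is always restored afterwards. The sketch below uses only the Python 3 standard library and inspects what urllib would pick up, without sending any request. The name http_proxy_env and the address 127.0.0.1:8080 are assumptions for illustration; getproxies_environment is the helper in urllib.request that reads the *_proxy variables.

```python
import os
import urllib.request
from contextlib import contextmanager

PROXY_VARS = ("http_proxy", "HTTP_PROXY")


@contextmanager
def http_proxy_env(url):
    """Temporarily set (url) or clear (None) the http_proxy environment variable."""
    # Remove both spellings first and remember their previous values.
    saved = {name: os.environ.pop(name, None) for name in PROXY_VARS}
    if url:
        os.environ["http_proxy"] = url
    try:
        yield
    finally:
        os.environ.pop("http_proxy", None)
        for name, value in saved.items():
            if value is not None:
                os.environ[name] = value


with http_proxy_env("http://127.0.0.1:8080"):  # hypothetical local proxy
    with_proxy = urllib.request.getproxies_environment().get("http")

with http_proxy_env(None):
    without_proxy = urllib.request.getproxies_environment().get("http")

print(with_proxy)     # http://127.0.0.1:8080
print(without_proxy)  # None
```

Any urllib or Requests call made inside the first with block would be routed through the proxy, and the previous environment is back in place once the block exits.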

0x04 Mitm-proxy

MITM stands for man-in-the-middle attack: an attack that intercepts, eavesdrops on, and tampers with data on the network between a client and a server.

mitmproxy is an open-source man-in-the-middle proxy written in Python. It supports SSL, transparent proxying, reverse proxying, traffic recording and replay, custom scripts, and more. Functionally it resembles Fiddler on Windows, but mitmproxy is a console program without a GUI; it is nevertheless easy to use. With mitmproxy you can easily filter, intercept, and modify any proxied HTTP request or response, and you can even use its scripting API to write scripts that intercept and modify HTTP data automatically.

# test.py
def response(flow):
    flow.response.headers["BOOM"] = "boom!boom!boom!"

The script above adds a header named BOOM to every proxied HTTP response. Start mitmproxy with the command mitmproxy -s '' and verify with curl; the response does indeed contain the BOOM header.

$ http_proxy=localhost:8080 curl -i ''
HTTP/1.1 200 OK
Server: nginx
Date: Thu, Geneva 09:02:04 GMT
Content-Type: application/json
Content-Length: 186
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
BOOM: boom!boom!boom!
...

mitmproxy scripts can obviously do far more than this, and with the full power of Python behind them they lend themselves to many uses. mitmproxy also provides a powerful API on top of which you can build a dedicated proxy server with fully customized functionality.

Performance testing showed that mitmproxy is not particularly efficient. That is fine for debugging, but in a production environment with a large number of concurrent requests passing through the proxy, its performance falls somewhat short. I have used Twisted to implement a simple proxy that adds functionality to our company's internal website and improves the user experience; I will share it when I get the chance.
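As one possible next step, a script can filter by host and touch both directions of the traffic. The sketch below assumes a recent mitmproxy, where module-level request and response functions are called as hooks and flow.request.pretty_host gives the target host name. The file name, the X-Debug-Via header, and the fake_flow helper are invented for this example; fake_flow only exists so the hooks can be tried without mitmproxy installed.

```python
# inject_header.py (hypothetical name); load with: mitmproxy -s inject_header.py
from types import SimpleNamespace

TARGET_HOST = "httpbin.org"  # assumption: only touch traffic for this host


def request(flow):
    """Hook called for each client request before it is forwarded."""
    if flow.request.pretty_host == TARGET_HOST:
        flow.request.headers["X-Debug-Via"] = "mitmproxy"


def response(flow):
    """Hook called for each server response before it reaches the client."""
    if flow.request.pretty_host == TARGET_HOST:
        flow.response.headers["BOOM"] = "boom!boom!boom!"


def fake_flow(host):
    """Stand-in for a mitmproxy flow, with plain dicts as headers (testing only)."""
    return SimpleNamespace(
        request=SimpleNamespace(pretty_host=host, headers={}),
        response=SimpleNamespace(headers={}),
    )


demo = fake_flow("httpbin.org")
request(demo)
response(demo)
print(demo.request.headers)   # {'X-Debug-Via': 'mitmproxy'}
print(demo.response.headers)  # {'BOOM': 'boom!boom!boom!'}
```

Traffic for any other host passes through unchanged, which keeps the script safe to leave loaded while debugging a single service.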
