0x00 Preface
HTTP proxies should be familiar to everyone; they are used in a great many scenarios. HTTP proxies come in two flavors: forward proxies and reverse proxies. The latter are generally used to expose services behind a firewall to users or to provide load balancing; typical examples are Nginx and HAProxy. This article discusses forward proxies.
The most common uses of HTTP proxies are network sharing, network acceleration, and bypassing network restrictions. In addition, HTTP proxies are often used for debugging web applications and for monitoring and analyzing the web APIs called by Android/iOS apps; well-known tools in this space include Fiddler, Charles, Burp Suite, and mitmproxy. HTTP proxies can also be used to modify request/response content, adding extra functionality or changing an application's behavior without touching the server side.
0x01 What is an HTTP proxy?
An HTTP proxy is essentially a web application, and it is not fundamentally different from other common web applications. After receiving a request, the HTTP proxy determines the target host from the host name and the GET/POST request address, creates a new HTTP request to forward the request data, and relays the response data it receives back to the client.
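To make this concrete, below is a minimal, illustrative sketch of that forwarding logic in Python 3 (standard library only, GET requests only, no HTTPS/CONNECT support). It is not a production proxy, just enough to show the idea; it also handles the absolute-URL versus Host-header distinction described next.

# proxy_sketch.py -- a minimal forward-proxy sketch (illustration only)
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit
import http.client

class SimpleProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Absolute URL in the request line: take the target host from it;
        # otherwise fall back to the Host header.
        if self.path.lower().startswith('http://'):
            parts = urlsplit(self.path)
            host = parts.netloc
            path = parts.path or '/'
            if parts.query:
                path += '?' + parts.query
        else:
            host, path = self.headers['Host'], self.path
        # Create a new request to the target host and relay its response.
        conn = http.client.HTTPConnection(host)
        conn.request('GET', path, headers={'Host': host})
        resp = conn.getresponse()
        body = resp.read()
        self.send_response(resp.status)
        for name, value in resp.getheaders():
            if name.lower() not in ('transfer-encoding', 'connection'):
                self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
        conn.close()

HTTPServer(('0.0.0.0', 8080), SimpleProxy).serve_forever()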
If the request address is an absolute URL, the HTTP proxy takes the host from that address; otherwise it uses the Host field in the request headers. Let's do a simple test. Assume the network environment is as follows:
192.168.1.2  web server
192.168.1.3  HTTP proxy server
Test it with telnet:
$ telnet 192.168.1.3
GET / HTTP/1.0
Host: 192.168.1.2
Note that the request must end with two consecutive carriage returns (an empty line); this is required by the HTTP protocol. Once that is done, you will receive the content of the page http://192.168.1.2/. Now adjust the request and put an absolute address in the GET line:
$ telnet 192.168.1.3
GET http://httpbin.org/ip HTTP/1.0
Host: 192.168.1.2
Note that Host is still set to 192.168.1.2, yet the result returned is the content of the http://httpbin.org/ip page, i.e. the public IP address information.
As the test above shows, an HTTP proxy is nothing very complicated, as long as the original request is sent to the proxy server. When an HTTP proxy cannot be configured directly and only a small number of hosts need to go through one, the simplest approach is to point those hosts' domain names to the proxy server's IP, which can be done by modifying the hosts file.
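For example, in the test environment above, an entry like the following on the client machine would route requests for a given site through 192.168.1.3 (the domain name below is only a placeholder, and this assumes the proxy listens on port 80):

# /etc/hosts on the client (placeholder domain)
192.168.1.3    www.example.com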
0x02 Setting up an HTTP proxy in a Python program
urllib2/urllib proxy settings
urllib2 is a Python standard library; it is powerful, but a little cumbersome to use. In Python 3, urllib2 is no longer kept under that name; its functionality has been merged into the urllib package. In urllib2, a proxy server is configured with ProxyHandler.
import urllib2

proxy_handler = urllib2.ProxyHandler({'http': '121.193.143.249:80'})
opener = urllib2.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())
You can also use install_opener to install the configured opener globally, so that every call to urllib2.urlopen automatically uses the proxy.
urllib2.install_opener(opener)
r = urllib2.urlopen('http://httpbin.org/ip')
print(r.read())
In Python 3, use urllib:
import urllib.request

proxy_handler = urllib.request.ProxyHandler({'http': 'http://121.193.143.249:80/'})
opener = urllib.request.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())
Requests proxy settings
requests is one of the best HTTP libraries around and the one most frequently used for constructing HTTP requests. Its API design is very user-friendly and easy to work with. Setting a proxy for requests is simple: just pass a proxies parameter such as {'http': 'x.x.x.x:8080', 'https': 'x.x.x.x:8080'}, where the http and https proxies are independent of each other.
In [5]: requests.get('http://httpbin.org/ip', proxies={'http': '121.193.143.249:80'}).json()
Out[5]: {'origin': '121.193.143.249'}
You can also set the proxies attribute of a session directly, which saves passing a proxies parameter on every request.
s = requests.session()
s.proxies = {'http': '121.193.143.249:80'}
print(s.get('http://httpbin.org/ip').json())
0x03 The http_proxy / https_proxy environment variables
Both urllib2 and the requests library recognize the http_proxy and https_proxy environment variables and automatically use a proxy when they are set. This is very handy when debugging with an HTTP proxy, because the proxy server's IP address and port can be adjusted through the environment variables without modifying any code. Most software on *nix also honors the http_proxy environment variable, for example curl, wget, axel, aria2c, and so on.
$ http_proxy=121.193.143.249:80 python -c 'import requests; print(requests.get("http://httpbin.org/ip").json())'
{u'origin': u'121.193.143.249'}
$ http_proxy=121.193.143.249:80 curl httpbin.org/ip
{
  "origin": "121.193.143.249"
}
In an IPython interactive session you often need to debug HTTP requests temporarily; this can be done simply by setting os.environ['http_proxy'] to add or remove an HTTP proxy.
In [245]: os.environ['http_proxy'] = '121.193.143.249:80'
In [246]: requests.get("http://httpbin.org/ip").json()
Out[246]: {u'origin': u'121.193.143.249'}
In [249]: os.environ['http_proxy'] = ''
In [250]: requests.get("http://httpbin.org/ip").json()
Out[250]: {u'origin': u'x.x.x.x'}
0x04 mitmproxy
MITM comes from man-in-the-middle attack and generally refers to intercepting, monitoring, and tampering with the data exchanged between client and server over the network.
mitmproxy is an open-source proxy tool written in Python. It supports SSL, transparent proxying, reverse proxying, traffic recording and replay, custom scripts, and more. Functionally it is similar to Fiddler on Windows, except that mitmproxy is a console program with no GUI, though it is still quite convenient to use. With mitmproxy you can easily filter, intercept, and modify any HTTP request/response passing through the proxy, and even use its scripting API to write scripts that intercept and modify HTTP data automatically.
# test.py
def response(flow):
    flow.response.headers["BOOM"] = "boom!boom!boom!"
The script above adds a header named BOOM to every HTTP response passing through the proxy. Start mitmproxy with the command mitmproxy -s 'test.py', then verify with curl; the response does indeed contain a BOOM header.
$ http_proxy=localhost:8080 curl -i 'httpbin.org/get'
HTTP/1.1 200 OK
Server: nginx
Date: Thu, Nov 2016 09:02:04 GMT
Content-Type: application/json
Content-Length: 186
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
BOOM: boom!boom!boom!
...
Obviously, mitmproxy scripts can do far more than that; combined with Python's power, there are many possible uses. In addition, mitmproxy provides a powerful API, and on top of it you can build a fully customized proxy server with special functionality.
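As a taste of what else a script might do, here is a hedged sketch in the same module-level hook style as test.py above; the host names are placeholders and exact attribute names may vary between mitmproxy versions.

# test2.py -- illustrative sketch only (placeholder hosts)
def request(flow):
    # Silently reroute requests for one host to another, e.g. a staging server
    if flow.request.host == "www.example.com":
        flow.request.host = "staging.example.com"

def response(flow):
    # Rewrite part of the response body on the fly
    flow.response.content = flow.response.content.replace(b"Hello", b"Hi")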
Performance testing showed that mitmproxy is not especially efficient. That is perfectly fine for debugging purposes, but in a production environment with a large number of concurrent requests going through the proxy, its performance falls somewhat short. I used Twisted to implement a simple proxy that adds functionality to an internal company website and improves the user experience; I will share it when there is an opportunity.
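For reference, the standard minimal starting point for a Twisted-based forward proxy looks roughly like this; it is the generic twisted.web recipe, not the author's actual internal implementation.

# twisted_proxy.py -- minimal Twisted forward-proxy sketch
from twisted.web import proxy, http
from twisted.internet import reactor

class ProxyFactory(http.HTTPFactory):
    protocol = proxy.Proxy  # twisted.web.proxy.Proxy handles forward-proxy requests

reactor.listenTCP(8080, ProxyFactory())
reactor.run()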