HTTP Proxy Server: A Technology Selection Journey

Source: Internet
Author: User
Tags: virtualenv

Background

For a long time, Tieba has had many developers, tightly coupled business logic, and frequently changing requirements, so it is prone to bugs. I am responsible for the advertising business, which is closely tied to the UI. If a bug appears for any reason (even just a code change elsewhere), it is bound to hurt advertising revenue significantly.

One way to solve the problem is to test frequently: since the coupling cannot be removed at the code level, regular checks can still keep problems from slipping through. So we maintain a set of core cases that keep a close eye on the core features. Choosing core cases is a trade-off between coverage and test cost, but each case has different test steps, so testing efficiency is hard to improve.

Therefore, our goal is a proxy server that can change the data of any build of the app (including the released one) into whatever I want at runtime. In other words, the proxy also behaves like a server: it can read and modify the client's request data, and it can read and modify the response data coming from the server side.

Proxy Server Working model

In the early version we chose plain HTTP, which has the lowest technical requirements. We implemented the proxy server ourselves: open a socket, listen on a port, forward the client's request to the server, and return the server's response to the client. This model is also known as "man in the middle" (MITM).
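
The forwarding model described above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the names relay_exchange and serve are made up here, the request is assumed to fit in a single read, and the upstream server is assumed to close the connection once its reply is complete.

```python
import socket

def relay_exchange(client, upstream, bufsize=65536):
    """Forward one request from client to upstream and the reply back."""
    request = client.recv(bufsize)        # assumption: request fits in one read
    upstream.sendall(request)
    chunks = []
    while True:
        data = upstream.recv(bufsize)
        if not data:                      # upstream closed: reply is complete
            break
        chunks.append(data)
    response = b"".join(chunks)
    client.sendall(response)
    return request, response

def serve(listen_addr, upstream_addr):
    """Accept clients one at a time and relay each to upstream_addr."""
    with socket.create_server(listen_addr) as listener:   # Python 3.8+
        while True:
            client, _addr = listener.accept()
            with client, socket.create_connection(upstream_addr) as upstream:
                relay_exchange(client, upstream)
```

Because relay_exchange sees both the request and the response as plain bytes, it is the natural place to inspect or rewrite either of them.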

Although the principle is simple, some implementation details need attention. First, when the socket accepts a connection, a new process or thread should be started to handle it. Because new processes or threads are involved, pay attention to when they are released; otherwise memory use grows without bound.
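
One way to keep per-connection threads from leaking resources is to close the socket in a finally block, so it is released even when the handler fails. The sketch below assumes a thread-per-connection design; handle_in_thread and accept_loop are illustrative names, and handler stands for whatever function processes a connection.

```python
import socket
import threading

def handle_in_thread(conn, handler):
    """Run handler(conn) in its own thread and always close conn afterwards."""
    def run():
        try:
            handler(conn)
        finally:
            conn.close()          # release the socket even if handler raised
    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker

def accept_loop(listener, handler):
    """Hand every accepted connection to a fresh worker thread."""
    while True:
        conn, _addr = listener.accept()
        handle_in_thread(conn, handler)
```

The thread ends as soon as run returns, so nothing is kept alive after the connection has been handled. For high connection counts a bounded thread pool would probably be a safer choice than one thread per connection.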

Second, a socket has no built-in way to wait for data, so there is no telling when data becomes readable. That job is handed to select. We pass in the socket objects to monitor, and the function blocks until one of them is readable or writable, or until the timeout expires.
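
A rough sketch of how select can drive the relay is shown below. The helper names wait_readable and pump, and the idle_timeout parameter, are assumptions made for this example; only select.select itself comes from the standard library.

```python
import select
import socket

def wait_readable(socks, timeout):
    """Block until at least one socket is readable or timeout seconds pass."""
    readable, _writable, _errored = select.select(socks, [], socks, timeout)
    return readable               # an empty list means the timeout expired

def pump(a, b, idle_timeout=5.0, bufsize=65536):
    """Copy bytes in both directions until one side closes or goes idle."""
    peers = {a: b, b: a}
    while True:
        readable = wait_readable([a, b], idle_timeout)
        if not readable:
            return "timeout"
        for src in readable:
            data = src.recv(bufsize)
            if not data:          # an empty read means the peer closed
                return "closed"
            peers[src].sendall(data)
```

The timeout doubles as an idle limit: if neither side sends anything for idle_timeout seconds, pump returns and the caller can close both sockets.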

The Keep-Alive field lets a TCP connection be reused, a common HTTP optimization that is already the default in HTTP/1.1. When this field is set, the server's data may be returned in batches, which improves the user experience but makes the proxy server harder to implement. Therefore, when the proxy acts as a client and requests data from the real server, it should remove this field.
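
Removing the field can be done on the raw request bytes before they are forwarded. The function below is a sketch under the assumption that the request contains a complete header block terminated by a blank line; force_connection_close is a name chosen for this example. Besides dropping the keep-alive headers, it adds "Connection: close" so that an HTTP/1.1 server does not keep the connection open by default.

```python
def force_connection_close(raw_request):
    """Drop keep-alive headers and ask the server to close after replying."""
    head, sep, body = raw_request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    request_line, header_lines = lines[0], lines[1:]
    dropped = (b"connection", b"keep-alive")
    kept = [h for h in header_lines
            if h.split(b":", 1)[0].strip().lower() not in dropped]
    kept.append(b"Connection: close")
    return b"\r\n".join([request_line] + kept) + sep + body
```

With the connection closed after every reply, the proxy can treat end of stream as the end of the response.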

Because the whole flow is implemented by ourselves, it is easy to hook the downstream data and modify it. Put simply, the key point is that modification happens only after all of the data has been received.
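
As an illustration of "receive everything, then modify", the sketch below rewrites a fully buffered response. It assumes the body is neither chunked nor compressed; rewrite_response and the hook callback are hypothetical names, not part of the original proxy.

```python
def rewrite_response(raw_response, hook):
    """Apply hook to the body of a complete response and fix Content-Length."""
    head, _sep, body = raw_response.partition(b"\r\n\r\n")
    new_body = hook(body)
    # The old length no longer matches, so replace the header.
    lines = [line for line in head.split(b"\r\n")
             if not line.lower().startswith(b"content-length:")]
    lines.append(b"Content-Length: " + str(len(new_body)).encode("ascii"))
    return b"\r\n".join(lines) + b"\r\n\r\n" + new_body
```

For example, rewrite_response(raw, lambda body: body.replace(b"hello", b"goodbye")) swaps one word in the body and updates the length header to match.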


Working Mode: Short Connections

Because a long connection runs directly over TCP, there is no need to create a new connection each time and unnecessary HTTP headers are omitted, so it is noticeably more efficient than HTTP. Most major companies have therefore chosen long connections for their production environments. However, since we are not familiar with the WebSocket protocol and the app still supports short connections, the proxy server ended up using HTTP.

To make this work, the mock backend sends a control message to the client when the app starts, forcing the client to use HTTP requests. This way even the released build goes through the proxy server.

HTTPS

Apple has moved to require HTTPS. Although the deadline has been postponed, it is still the trend for next year. With later use in mind, we decided to upgrade the proxy server implemented earlier. HTTPS involves parsing the request protocol, encryption and decryption, and certificate management, which the home-grown solution above could hardly handle. After some research, we settled on a well-known open source library, mitmproxy.

Mitmproxy

The main reason for choosing this library is that it supports HTTPS directly. However, it has no Chinese documentation and is not widely used in China, so getting started may take a little time.

This is a Python library, so first install virtualenv. If it is not installed locally, run:

sudo pip install virtualenv

After installation, go to the mitmproxy/venv3.5/bin folder and run:

source ./activate

This activates the virtualenv environment.

Hook Script

This library can be thought of as an interactive, command-line version of Charles, but that is not the feature I need. My main requirement is to hook requests with a script, so I chose the mitmdump tool. A script can be specified when running it:

mitmdump -s "xxx.py"

The script is also very simple. We can override the request or response function:

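def response(flow):
    # flow.response only exists once the server has replied,
    # so the body is replaced in the response hook.
    flow.response.content = b"<p>hello world</p>"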
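
A slightly more selective script might look like the sketch below. It assumes a mitmproxy release in which hooks take a single flow argument (older releases used a different signature); TARGET_HOST and the X-Debug-Proxy header are made-up values for illustration only.

```python
TARGET_HOST = "example.com"   # hypothetical host, not from the article

def request(flow):
    # Runs after the client request is read and before it is sent upstream.
    if flow.request.pretty_host == TARGET_HOST:
        flow.request.headers["X-Debug-Proxy"] = "1"   # illustrative header

def response(flow):
    # Runs once the full response has arrived.
    if flow.request.pretty_host != TARGET_HOST:
        return
    if "text/html" in flow.response.headers.get("content-type", ""):
        flow.response.content = flow.response.content.replace(
            b"</body>", b"<p>hello world</p></body>")
```

Limiting the hook to one host and one content type keeps unrelated traffic, such as images or other apps on the phone, untouched.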

After starting the script, set the phone's proxy to the IP address of the machine running mitmdump and the port to 8080, then open http://mitm.it/ in the mobile browser. If everything is configured correctly, the certificate installation page appears.

Once the certificate is installed, visiting any website (including HTTPS sites) on the phone should show a small hello world, which means all configuration is complete.

Bug Fix

This open source library has a serious bug that can occur when parsing multipart data. It uses the splitlines method to split on line breaks, so any \n inside the data is lost. Unfortunately, protobuf-encoded data often contains \n, and losing it causes parsing to fail.
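
The data loss is easy to reproduce outside the library. In the sketch below the payload bytes are made up to stand in for protobuf data, and the join mimics a parser that reassembles the body from the lines after the blank separator; it is not the library's exact code.

```python
# Made-up binary payload containing a newline byte.
payload = b"\x08\x01\x12\x03a\nb"
part = b'Content-Disposition: form-data; name="data"\r\n\r\n' + payload + b"\r\n"

# Line-based parsing: everything after the empty line is treated as the body.
lines = part.splitlines()
naive_body = b"".join(lines[lines.index(b"") + 1:])

# The newline inside the payload was consumed as a line separator.
lost_newline = naive_body != payload
```

Here naive_body comes out one byte shorter than payload, which is enough to break protobuf decoding.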

If you are unlucky enough to fall into the same pit as I did, you can change the code to my version:

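for i in content.split(b"--" + boundary):
    # Split headers from body on the blank line instead of on every line break.
    # maxsplit is 1 (the original used 2) so a body containing \r\n\r\n stays whole.
    parts = i.split(b'\r\n\r\n', 1)
    if len(parts) > 1 and parts[0][0:2] != b"--":
        match = rx.search(parts[0])
        if match:
            key = match.group(1)
            value = parts[1][0:len(parts[1]) - 2]  # Remove last \r\n
            r.append((key, value))
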
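
For experimenting outside the library, the same idea can be wrapped in a standalone function. This is a sketch, not the library's code: decode_multipart is a name chosen here, the regular expression for the field name is an assumption about what rx matches, and the boundary is passed in directly instead of being read from the Content-Type header.

```python
import re

def decode_multipart(content, boundary):
    """Return (name, value) pairs from a multipart/form-data body."""
    rx = re.compile(rb'\bname="([^"]+)"')
    fields = []
    for chunk in content.split(b"--" + boundary):
        # Headers and body are separated by the first blank line only.
        parts = chunk.split(b"\r\n\r\n", 1)
        if len(parts) > 1 and parts[0][0:2] != b"--":
            match = rx.search(parts[0])
            if match:
                fields.append((match.group(1), parts[1][:-2]))  # drop trailing \r\n
    return fields
```

A value that contains \n bytes now comes back unchanged. A payload that happens to contain the boundary string itself would still be split incorrectly, so this is not a complete multipart parser.
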
More

At this point we have a working proxy server that supports HTTPS. What remains is parsing protobuf, polishing the business code, and similar chores; with a little care these should not cause problems.
