What is a "reverse proxy"

Source: Internet
Author: User

Ordinarily, a proxy server only proxies connection requests from the internal network to the Internet. The client must be configured with the proxy server's address and must send to it the HTTP requests that it would otherwise send directly to the Web server. Hosts on the external network do not configure or use this proxy server. An ordinary proxy server is also designed to reach many unspecified servers on the Internet on behalf of its clients, not to serve requests from many Internet clients for one fixed server. An ordinary Web proxy server therefore does not support access requests from the external network to the internal network. When a proxy server accepts requests from hosts on the external network and forwards them to the internal network, the service is called a reverse proxy service (or simply a reverse proxy).
A reverse proxy server can record all user access behavior, but doing so consumes additional system resources. Each server, however, protects only a limited set of resource types, so we only need to record and control users' access to the resources we care about. For example, an mp3 service provider only needs to control the number of user requests for .mp3 files.

From the perspective of user access, abusive access takes two main forms: bursts of malicious requests within a short time, and attacks sustained over a long time. The following control policies address these two situations:

1) Defending against instantaneous malicious attacks

In this case, a malicious user either uses multiple threads to access the same resource or accesses many resources in a short time. For the first case, define a parameter m, the number of threads with which the same user may access a resource simultaneously. Requests beyond this number are suspended or discarded. For the second case, the access control system defines two parameters: a time t, in seconds, and a number of resources g. Together they determine a rule r = F(g, t), which means that a user may access at most g resources in t seconds. Several rules r can be defined at the same time, which allows more precise control.
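The two limits described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and function names (ResourceRateRule, ConcurrencyLimit, check_request) are hypothetical, users are identified by an arbitrary key such as an IP address, and rule r is interpreted as a sliding window of t seconds.

```python
import time
from collections import defaultdict, deque


class ResourceRateRule:
    """Rule r = F(g, t): at most g resource requests per user in any t seconds."""

    def __init__(self, g, t):
        self.g = g
        self.t = t
        self._hits = defaultdict(deque)  # user -> timestamps of accepted requests

    def would_allow(self, user, now):
        hits = self._hits[user]
        # Drop timestamps that have fallen out of the t-second window.
        while hits and now - hits[0] >= self.t:
            hits.popleft()
        return len(hits) < self.g

    def record(self, user, now):
        self._hits[user].append(now)


def check_request(rules, user, now=None):
    """Accept a request only if every rule r allows it, then record it."""
    now = time.monotonic() if now is None else now
    if all(rule.would_allow(user, now) for rule in rules):
        for rule in rules:
            rule.record(user, now)
        return True
    return False


class ConcurrencyLimit:
    """Parameter m: at most m simultaneous requests per user."""

    def __init__(self, m):
        self.m = m
        self._active = defaultdict(int)  # user -> requests currently in progress

    def acquire(self, user):
        if self._active[user] >= self.m:
            return False  # the caller suspends or discards the request
        self._active[user] += 1
        return True

    def release(self, user):
        if self._active[user] > 0:
            self._active[user] -= 1
```

Passing several ResourceRateRule objects to check_request corresponds to defining multiple rules r. A rejected request is not recorded, so it does not count against the user later.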

2) Defending against long-lasting sustained attacks

The rule r = F(g, t) mentioned above can also be used to defend against continuous downloading over a long time. In addition, we define a rule k = F(t1, t2), where t1 and t2 are both durations in seconds: t1 is the longest period of continuous access allowed, and t2 is the idle interval used to decide whether a user's access is still continuous. For example, a rule can specify that a user may access a type of resource for at most three consecutive hours (parameter t1). If the user then does not access such resources for eight hours (parameter t2), the next access is treated as a new access period.

Defining only one long t1 and t2 does not stop a malicious user who attacks on a regular schedule. For example, to stay within the rule above, the user downloads for three hours, waits eight hours, and then repeats. To deal with this situation, you can define multiple k rules. For example, the user must rest for one minute after every 3 minutes of access and for three minutes after every 10 minutes of access.
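One possible reading of rule k = F(t1, t2), including several k rules applied together, is sketched below. The class name ContinuousAccessGuard and its bookkeeping are assumptions made for illustration: an access period starts with the first request, lasts at most t1 seconds, and a new period begins once the user has been idle for at least t2 seconds since the last accepted request.

```python
class ContinuousAccessGuard:
    """Applies one or more rules k = F(t1, t2) to each user."""

    def __init__(self, rules):
        self.rules = list(rules)  # list of (t1, t2) pairs, in seconds
        self._last = {}           # user -> time of the last accepted request
        self._starts = {}         # user -> start of the current period, per rule

    def allow(self, user, now):
        last = self._last.get(user)
        starts = self._starts.get(user, [now] * len(self.rules))
        new_starts = []
        for (t1, t2), start in zip(self.rules, starts):
            if last is None or now - last >= t2:
                start = now       # idle long enough: a new access period begins
            elif now - start > t1:
                return False      # more than t1 seconds of continuous access
            new_starts.append(start)
        # State is only updated when every rule accepts the request.
        self._starts[user] = new_starts
        self._last[user] = now
        return True
```

Under these assumptions, the two rules from the example above would be written as ContinuousAccessGuard([(180, 60), (600, 180)]), and the three-hour/eight-hour rule as ContinuousAccessGuard([(3 * 3600, 8 * 3600)]).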
The reverse proxy is the core of the access control system, and its core technique is address translation. It lets users pass through the access control system without any configuration on the client computer. Seen from outside, that is, from an ordinary user's point of view, the reverse proxy looks like an ordinary Web server, and every Web server it proxies looks like a directory on the reverse proxy server. For example, if the reverse proxy server's own URL is http://reverse-proxy and it proxies the Tsinghua homepage server (http://www.tsinghua.edu.cn/index.html), then to reach the Tsinghua homepage server through the reverse proxy you only need to access http://reverse-proxy/www.tsinghua.edu.cn/index.html.
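The mapping from a reverse proxy URL back to the real server can be sketched as follows. The function name parse_proxy_url is hypothetical, and the "host:port" form for a non-default port is an assumption; the text only shows the plain host name as the first directory.

```python
from urllib.parse import urlsplit


def parse_proxy_url(url, default_port=80):
    """Split a reverse proxy URL into (host, port, path) of the real server."""
    parts = urlsplit(url)
    # The first path segment names the proxied server.
    first, _, rest = parts.path.lstrip("/").partition("/")
    if not first:
        raise ValueError("no target server in URL: %r" % url)
    # Assumption: an optional ":port" suffix selects a non-default port.
    host, _, port = first.partition(":")
    path = "/" + rest
    if parts.query:
        path += "?" + parts.query
    return host, int(port) if port else default_port, path
```

For http://reverse-proxy/www.tsinghua.edu.cn/index.html this yields the host www.tsinghua.edu.cn, port 80, and path /index.html, which is what the reverse proxy needs in order to open a socket to the real server.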

Implementation of reverse proxy
In a reverse proxy, a directory on the reverse proxy server is mapped to the server being proxied. On its own, this mapping only handles the user's first request and does not keep the user browsing through the reverse proxy. Because the content on the proxied server is not known in advance, it may well contain absolute addresses or absolute HTTP links. A normal user who follows such a link will bypass the reverse proxy and access the real server directly, which is what we want to avoid. The information that the server returns to the user must therefore be filtered, and every absolute URL must be changed into a URL that goes through the reverse proxy. We use regular expressions to detect and replace all links, which makes continuous access through the reverse proxy possible.

After a user's request arrives at the reverse proxy server, the reverse proxy applies its rules to parse the directory information in the URL into the server and port to connect to by socket. In other words, the reverse proxy server acts as a client and sends an HTTP request to the real Web server. When the response arrives, the Content-Type in the HTTP header and the file extension are used to determine the type of content. If it is not HTML (for example jpg, gif, or zip), it is sent to the user as is. If it is HTML (html, htm, shtml, etc.), a matcher built from regular expressions searches the HTML text for all link information, for example <a href="...">. The link type is then determined. An absolute URL (starting with "http://") that is within the proxy scope is rewritten to the reverse proxy format (URLs outside the proxy scope are not rewritten). For example

<A href="http://www.test.org">

is changed to

<A href="http://reverse-proxy/www.test.org">

An absolute URI path (one starting with "/") also needs to be changed, so that it begins with the proxied server's directory on the reverse proxy.
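The link rewriting described above can be sketched with a regular expression as follows. This is a minimal illustration, not a full HTML parser: the name rewrite_links, the choice to look only at href and src attributes, and the parameters origin_host (the server the page came from) and proxied_hosts (the set of hosts within the proxy scope) are all assumptions.

```python
import re

# Matches href="..." or src="..." attribute values (double- or single-quoted).
LINK_RE = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def rewrite_links(html, origin_host, proxied_hosts,
                  proxy_base="http://reverse-proxy"):
    """Rewrite links in html so that they pass through the reverse proxy."""

    def fix(match):
        prefix, quote, url = match.groups()
        if url.lower().startswith("http://"):
            # Absolute URL: rewrite only if the host is within the proxy scope.
            host, sep, rest = url[len("http://"):].partition("/")
            if host.lower() in proxied_hosts:
                url = proxy_base + "/" + host + sep + rest
        elif url.startswith("/") and not url.startswith("//"):
            # Path starting with "/": prefix the proxied server's directory.
            url = "/" + origin_host + url
        # Relative links are left unchanged.
        return prefix + quote + url + quote

    return LINK_RE.sub(fix, html)
```

With origin_host set to "www.test.org" and that host in proxied_hosts, the link http://www.test.org becomes http://reverse-proxy/www.test.org, a path such as /images/logo.gif becomes /www.test.org/images/logo.gif, and a relative link such as page2.html is returned untouched.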

Relative links do not need to be changed. Besides the HTML itself, the HTTP header information must also be parsed. The most important task is rewriting the cookie path. For example, if the following cookie is received from www.test.org

Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/

it is rewritten as

Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/www.test.org/
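The cookie path rewrite can be sketched as a small function operating on the value of a Set-Cookie header. The name rewrite_cookie_path is hypothetical, and leaving cookies without a path attribute unchanged is an assumption of this sketch.

```python
import re

# Matches the path attribute of a Set-Cookie value, e.g. "; path=/shop".
PATH_ATTR_RE = re.compile(r'(;\s*path\s*=\s*)(/[^;]*)', re.IGNORECASE)


def rewrite_cookie_path(set_cookie_value, origin_host):
    """Prefix the cookie's path with the proxied server's directory."""

    def fix(match):
        return match.group(1) + "/" + origin_host + match.group(2)

    # Only the first path attribute is rewritten; other attributes are kept.
    return PATH_ATTR_RE.sub(fix, set_cookie_value, count=1)
```

Calling rewrite_cookie_path("PART_NUMBER=ROCKET_LAUNCHER_0001; path=/", "www.test.org") returns "PART_NUMBER=ROCKET_LAUNCHER_0001; path=/www.test.org/", matching the example above.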

In addition, it is not only the information returned by the server that needs to be rewritten. The request information that the user sends to the server must be rewritten as well, in a similar way.
