First, what is a reverse proxy
A reverse proxy is a proxy server that accepts connection requests from the Internet, forwards them to servers on the internal network, and returns the servers' results to the clients that made the requests. To the outside world, the reverse proxy server appears to be the server itself.
An ordinary (forward) proxy server only proxies connection requests from the internal network out to the Internet: the client must explicitly configure the proxy server and send to it the HTTP requests it would otherwise send directly to the Web server. Connection requests from the external network to the internal network are not supported, because the internal network is not visible from outside. When a proxy server instead accepts requests from hosts on the external network and proxies them to the internal network, the service is called a reverse proxy. In this role, the proxy server presents itself as a Web server, and the external network can treat it as a standard Web server with no special configuration. The difference is that this server stores no real data: all static Web pages and CGI programs live on the internal Web servers. An attack on the reverse proxy server therefore cannot destroy the Web page content, which enhances the security of the Web server.
A reverse proxy is often referred to as a Web server accelerator: a technique that reduces the load on the actual Web server by placing a high-speed Web caching server between the busy Web server and the external network. The reverse proxy serves as a proxy cache on behalf of one or more specific Web servers, rather than on behalf of browser users, proxying access requests from the external network to the internal network.
Because all external access to the server is forced through it, the reverse proxy server receives the client's request, fetches the content from the origin server, returns it to the user, and saves a copy locally. When it later receives a request for the same content, it sends the locally cached copy directly to the user, reducing the load on the backend Web server and improving response times.
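In Nginx, this caching behavior can be sketched with the proxy_cache directives. The cache path, the zone name STATIC, and the backend address below are all hypothetical examples:

```nginx
# In the http context: define an on-disk cache area (path and zone name are examples)
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=STATIC:10m max_size=1g inactive=60m;

server {
    listen 80;

    location / {
        proxy_pass        http://192.168.0.10:80;   # internal origin server (example address)
        proxy_cache       STATIC;                   # use the cache zone defined above
        proxy_cache_valid 200 302 10m;              # cache successful responses for 10 minutes
        proxy_cache_valid 404 1m;                   # cache "not found" briefly
    }
}
```

With this in place, repeated requests for the same URL are answered from the local cache instead of hitting the origin server each time.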
Second, how a reverse proxy server works
A reverse proxy server typically operates in one of two models: as an alias for a content server, or as a load balancer for a cluster of content servers.
1, as an alias for the content server
If your content server holds sensitive information that must remain secure, such as a database of credit card numbers, you can set up a proxy server outside the firewall as an alias for the content server. When an external client tries to access the content server, the request is sent to the proxy server instead. The actual content remains on the content server, secured inside the firewall. The proxy server sits outside the firewall and appears to the client to be the content server.
When a client requests the site, the request goes to the proxy server. The proxy server then forwards the client's request to the content server through a specific path in the firewall, and the content server passes the result back through the same channel. The proxy server sends the retrieved information to the client as if it were the actual content server (see Figure 2). If the content server returns an error message, the proxy server intercepts it and rewrites any URLs listed in the headers before sending the message on to the client. This prevents external clients from obtaining redirect URLs that point to the internal content server.
In this way, the proxy server provides another barrier between the secure database and potential malicious attacks. Even a successful attacker is at best limited to the information involved in a single transaction, rather than gaining access to the entire database. An unauthorized user cannot reach the real content server, because the firewall path only allows the proxy server access.
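A minimal Nginx sketch of this alias model: the proxy listens on the public side of the firewall and forwards through the one permitted path to the internal content server. The server name and addresses here are hypothetical:

```nginx
server {
    listen 80;                        # public-facing side, outside the firewall
    server_name www.example.com;      # hypothetical public name

    location / {
        # the only path the firewall allows: proxy -> internal content server
        proxy_pass       http://10.0.0.5:80;    # internal address, never visible externally
        proxy_set_header Host $host;
        # rewrite redirect headers so internal URLs never leak to the client
        proxy_redirect   http://10.0.0.5:80/  /;
    }
}
```

The proxy_redirect line implements the header rewriting described above: Location headers pointing at the internal server are rewritten relative to the public name before reaching the client.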
2, as a load balancer for content servers
You can use multiple proxy servers within an organization to balance network load across Web servers. In this model, you take advantage of the proxy servers' caching to create a server pool for load balancing. Here, the proxy servers can be on either side of the firewall. If a Web server receives a large number of requests per day, proxy servers can share its load and improve network access efficiency.
The proxy server acts as an intermediary for requests from clients to the real server, storing requested documents in its cache. If there is more than one proxy server, DNS can select among their IP addresses in round-robin fashion, effectively choosing a random route for each request. The client uses the same URL every time, but successive requests may travel through different proxy servers.
Using multiple proxy servers to handle requests for a high-volume content server has the benefit that the content server can handle a higher load, and more efficiently, than it could alone. After an initial warm-up period, during which each proxy server retrieves a document from the content server for the first time, the number of requests reaching the content server drops significantly.
Third, the advantages of a reverse proxy
1, it solves the problem of making the Web server reachable from outside;
2, it conserves scarce IP address resources: all of an enterprise's sites can share a single registered Internet IP address, with the servers themselves assigned private addresses and served through virtual hosts;
3, it protects the real Web servers: the Web servers are not visible from outside, the external network sees only the reverse proxy server, and the reverse proxy server holds no real data, so the Web servers' resources remain secure;
4, it speeds up site access and reduces the burden on the Web server: a reverse proxy can cache Web pages, so content that is already in the cache can be served directly by the proxy, lowering the Web server's load while also speeding up user access.
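Advantage 2 can be sketched as name-based virtual hosts in Nginx: one registered IP address, several sites, each proxied to a backend with a private address. The site names and addresses below are hypothetical:

```nginx
# Both server blocks share the same public IP; Nginx picks one by the Host header
server {
    listen 80;
    server_name www.example.com;
    location / { proxy_pass http://192.168.1.10:80; }   # backend on a private address
}

server {
    listen 80;
    server_name mail.example.com;
    location / { proxy_pass http://192.168.1.11:80; }   # another private-address backend
}
```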
Fourth, an example of Nginx as a reverse proxy for load balancing
Because Nginx excels at handling concurrency, this application is now very common. Of course, Apache's mod_proxy and mod_cache can also implement reverse proxying and load balancing for multiple app servers, but Apache is not as adept as Nginx at handling high concurrency.
1) Environment:
A. Locally we have a Windows system, with VirtualBox running a virtual Linux system. On the local Windows system, install Nginx (listening on port 8080) and Apache (listening on port 80). On the virtual Linux system, install Apache (listening on port 80). So we have one Nginx in front as the reverse proxy server and two Apaches behind it as application servers (which can be thought of as a small server cluster ;-) );
B. Nginx serves as the reverse proxy server, placed in front of the two Apaches as the entry point for user access. Nginx handles only static pages; dynamic pages (PHP requests) are all delivered to the two Apaches in the background. In other words, the static pages and files of our website are placed in the Nginx directory, while dynamic pages and database access are left to the background Apache servers.
C. The following two methods are introduced to implement server cluster load balancing.
We assume that the front-end Nginx (at localhost:8080) contains only a static page, index.html, and that the two background Apache servers (at localhost:80 and 158.37.70.143:80, respectively) are set up as follows: the root directory of one contains the phpMyAdmin folder and a test.php (whose test code is print "Server1";), while the root directory of the other contains only a test.php (whose test code is print "Server2";).
2) Load balancing for different requests:
A. In the simplest reverse-proxy setup (Nginx handles only static content and passes dynamic content to the background Apache server), we specifically set, in nginx.conf: location ~ \.php$ { proxy_pass http://158.37.70.143:80; }
So when the client accesses the localhost:8080/index.html, the front-end Nginx will automatically respond;
When the user accesses localhost:8080/test.php (a file the Nginx directory does not have), the setting location ~ \.php$ (a regular expression matching files that end in .php; for details on how location is defined and matched, see http://wiki.nginx.org/NginxHttpCoreModule) causes Nginx to pass the request automatically to the Apache server at 158.37.70.143. That server parses test.php and returns the resulting HTML page to Nginx, which displays it (Nginx can also support caching via the memcached module or squid). The output printed is Server2.
The above is the simplest example of using Nginx as a reverse proxy server;
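Putting the pieces of this simplest example together, the front-end server block might look like this (the static root path is an assumption):

```nginx
server {
    listen 8080;               # front-end Nginx entry point
    root   /var/www/static;    # static files (index.html) served directly by Nginx

    # dynamic requests: anything ending in .php goes to the backend Apache
    location ~ \.php$ {
        proxy_pass http://158.37.70.143:80;
    }
}
```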
B. We now extend the example above to support the two servers described earlier.
We modify the corresponding section of the server module in nginx.conf to:
location ^~ /phpmyadmin/ { proxy_pass http://127.0.0.1:80; }
location ~ \.php$ { proxy_pass http://158.37.70.143:80; }
In the first location above, ^~ /phpmyadmin/ means that no regular expression matching is used; ^~ is a direct prefix match. That is, if the URL the client accesses begins with http://localhost:8080/phpmyadmin/ (the local Nginx directory has no phpmyadmin directory at all), Nginx automatically passes the request to the Apache server at 127.0.0.1:80, which parses the pages in its phpMyAdmin directory and sends the results back to Nginx for display;
If the client accesses the URL http://localhost:8080/test.php, it is passed to 158.37.70.143:80 and processed by that Apache.
Therefore, we have implemented load balancing for different requests.
If the user accesses the static page index.html, the front-end Nginx responds directly;
If the user accesses the test.php page, the Apache at 158.37.70.143:80 responds;
If the user accesses a page under the phpMyAdmin directory, the Apache at 127.0.0.1:80 responds;
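The three routing rules above correspond to one server block with two location rules: the ^~ prefix match wins for /phpmyadmin/, the regex catches the remaining .php requests, and everything else is served statically (the root path is an assumption):

```nginx
server {
    listen 8080;
    root   /var/www/static;        # index.html lives here (assumed path)

    # prefix match: /phpmyadmin/... -> first Apache
    location ^~ /phpmyadmin/ {
        proxy_pass http://127.0.0.1:80;
    }

    # regex match: other .php requests -> second Apache
    location ~ \.php$ {
        proxy_pass http://158.37.70.143:80;
    }
}
```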
3) Load balancing when accessing the same page:
That is, when users access the same page, http://localhost:8080/test.php, we balance the load across the two servers. (In practice the data on the two servers would be identical; here we have them print Server1 and Server2 respectively just to tell them apart.)
A. Our current situation: under Windows, Nginx listens on localhost port 8080;
there are two Apaches: one at 127.0.0.1:80 (its test.php page prints Server1), the other at 158.37.70.143:80 on the virtual machine (its test.php page prints Server2).
B. Therefore reconfigure nginx.conf as follows:
First, in the HTTP module of the Nginx configuration file nginx.conf, add the definition of the server cluster (here containing our two servers):

    upstream myCluster {
        server 127.0.0.1:80;
        server 158.37.70.143:80;
    }

Then, in the server module, define the load balancing:

    location ~ \.php$ {
        proxy_pass http://myCluster;    # this name must match the upstream name above
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

Now, if you visit the page http://localhost:8080/test.php, the Nginx directory does not have that file, but Nginx automatically passes the request to the server cluster defined as myCluster, and either 127.0.0.1:80 or 158.37.70.143:80 does the work. In the upstream definition above, no weight is assigned to either server, indicating an even balance between the two. If you want one server to respond more often, for example:

    upstream myCluster {
        server 127.0.0.1:80 weight=5;
        server 158.37.70.143:80;
    }

this gives a 5/6 chance of accessing the first server and a 1/6 chance of accessing the second. In addition, parameters such as max_fails and fail_timeout can also be defined.
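The max_fails and fail_timeout parameters mentioned above are set per server in the upstream block; this sketch marks a backend as unavailable for 30 seconds after 3 failed attempts (the values are illustrative):

```nginx
upstream myCluster {
    server 127.0.0.1:80      weight=5 max_fails=3 fail_timeout=30s;
    server 158.37.70.143:80  max_fails=3 fail_timeout=30s;
}

server {
    listen 8080;

    location ~ \.php$ {
        proxy_pass http://myCluster;                # balanced across the upstream servers
        proxy_set_header Host            $host;
        proxy_set_header X-Real-IP       $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

If a server fails 3 times within the fail_timeout window, Nginx stops sending it requests for 30 seconds and routes everything to the remaining server, then retries it afterward.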
In summary, we use Nginx's reverse proxy capability, placing it in front of multiple Apache servers.
Nginx only handles static page responses and the proxy passing of dynamic requests; the background Apache servers act as app servers, processing the dynamic pages passed from the front end and returning them to Nginx.
Through the above architecture, we can achieve load balancing between Nginx and a cluster of multiple Apaches, in two forms:
1) We can define in Nginx that requests for different content are proxied to different background servers, as in the example above: access to the phpMyAdmin directory is proxied to the first server, and access to test.php is proxied to the second server;
2) We can define in Nginx that requests for the same page are proxied, balanced (optionally by weight, if the servers' performance differs), to different background servers. As in the example above, access to the test.php page is served evenly by Server1 or Server2.
In practice, Server1 and Server2 would hold the same application and data, so data synchronization between the two needs to be considered.
Article translated from: http://www.cnblogs.com/heluo/p/3922770.html
http://blog.csdn.net/keyeagle/article/details/6723408