After introducing how the Squid reverse proxy works, this article shows that reverse proxy technology is well suited to improving website access speed and to enhancing the availability and security of websites. The author uses DNS round-robin (polling) together with Squid reverse proxies to load-balance a website, which improves the site's availability and reliability in the specific experimental environment described.
Many large portals, such as SINA, now use Squid reverse proxy technology to speed up access to their websites. A reverse proxy can distribute requests for different URLs to different backend Web servers, and because Internet users see only the address of the reverse proxy server, it also makes access to the site more secure.
The concept of reverse proxy
The reverse proxy server, also known as a Web accelerator, sits in front of the Web server and acts as a content cache for it. Its system structure is shown in Figure 1.
Figure 1. System Structure
A reverse proxy server is set up on behalf of the Web server. The backend Web server is hidden from Internet users: they see only the address of the reverse proxy server and cannot tell how the backend Web servers are organized. When an Internet user requests a Web service, DNS resolves the requested domain name to the IP address of the reverse proxy server, so the URL request is sent to the reverse proxy, which handles the user's request and response and interacts with the backend Web server. Using a reverse proxy server reduces the load on the backend Web server, improves access speed, and avoids the security risks of users communicating directly with the Web server.
How the Squid reverse proxy works
Many reverse proxy packages are available today; Nginx and Squid are among the best known. Nginx is a high-performance HTTP and reverse proxy server, and also an IMAP/POP3/SMTP proxy server, developed by Igor Sysoev for rambler.ru, the second most visited site in Russia.
Squid originated as a research project funded by the U.S. government, aimed at solving the problem of insufficient network bandwidth. It supports HTTP, HTTPS, FTP, and other protocols, and is among the most widely used and most full-featured software of its kind on Unix systems. The rest of this article focuses on implementing a Squid reverse proxy and using it to improve the performance of a Web site.
The Squid reverse proxy server sits between the local Web servers and the Internet. The organizational structure is shown in Figure 2:
Figure 2. Organizational Structure
When a client requests a Web service, DNS resolves the requested domain name to the IP address of the Squid reverse proxy server, so the client's URL request is sent to the reverse proxy. If the requested resource is already cached on the Squid reverse proxy server, it is returned directly to the client. Otherwise, the reverse proxy requests the resource from the backend Web server, returns the response to the client, and also caches the response locally for the next requester.
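The hit/miss flow described above can be pictured with a toy model. This is a minimal sketch, not how Squid is implemented: the class `ReverseProxyCache`, the function `demo_origin`, and the in-memory dictionary are all hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict


class ReverseProxyCache:
    """Toy model of the reverse proxy hit/miss flow (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin stands in for a request to the backend Web server.
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def handle(self, url: str) -> bytes:
        # Cache hit: answer directly without contacting the backend.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Cache miss: fetch from the backend, keep a copy, then answer.
        self.misses += 1
        body = self._fetch_from_origin(url)
        self._store[url] = body
        return body


def demo_origin(url: str) -> bytes:
    """Hypothetical backend that returns a fixed body per URL."""
    return ("content of " + url).encode("utf-8")


proxy = ReverseProxyCache(demo_origin)
first = proxy.handle("/index.html")   # miss: goes to the backend
second = proxy.handle("/index.html")  # hit: served from the cache
```

In this sketch the second request never reaches `demo_origin`, which is the effect that reduces load on the backend Web server.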
A Squid reverse proxy typically caches only cacheable data (such as HTML pages and images); the output of CGI scripts and of dynamic programs such as ASP or JSP is not cached by default. Squid decides how to cache a static page based on the HTTP headers returned by the Web server. The four most important HTTP headers are:
Last-Modified: tells the reverse proxy when the page was last modified
Expires: tells the reverse proxy when the page should be removed from the cache
Cache-Control: tells the reverse proxy whether the page should be cached
Pragma: carries implementation-specific directives; the most commonly used is Pragma: no-cache
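To make the role of these four headers concrete, here is a simplified sketch of a cacheability check. It is an assumption-laden illustration, not Squid's actual algorithm: the function name `is_cacheable` is hypothetical, only a few Cache-Control directives are considered, and a response with neither Expires nor Last-Modified is simply treated as not cacheable.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def is_cacheable(headers: Mapping[str, str], now: Optional[datetime] = None) -> bool:
    """Simplified decision based on the four headers (illustrative only)."""
    now = now or datetime.now(timezone.utc)
    h = {k.lower(): v for k, v in headers.items()}

    # Cache-Control and Pragma can forbid caching outright.
    directives = {d.strip().lower() for d in h.get("cache-control", "").split(",") if d.strip()}
    if directives & {"no-cache", "no-store", "private"}:
        return False
    if "no-cache" in h.get("pragma", "").lower():
        return False

    # Expires gives the time after which the copy must leave the cache.
    expires = h.get("expires")
    if expires is not None:
        try:
            expires_at = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return False  # assumption: an unparseable date counts as expired
        if expires_at.tzinfo is None:
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        return expires_at > now

    # Assumption: without Expires, require Last-Modified so the copy
    # could later be revalidated against the Web server.
    return "last-modified" in h
```

For example, a response carrying `Pragma: no-cache` is rejected even if it also has a Last-Modified header, while a response whose Expires date lies in the future is accepted.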
An example: accelerating a website with Squid reverse proxies
The domain name in this example is wenjin.cache.ibm.com.cn. DNS round-robin distributes each client request to one of the Squid reverse proxy servers. If that Squid has the requested resource cached, it returns the resource directly to the user; otherwise it forwards the uncached request to its neighbor Squid servers and the backend Web servers according to the configured rules. This reduces the load on the backend Web servers and improves the performance and security of the whole site. The system structure is shown in Figure 3:
Figure 3. System Structure
System environment for this configuration: one DNS server: operating system FreeBSD, software BIND 9.5, IP 192.168.76.222; three Squid servers: operating system Linux AS 4, software Squid 3.0, with the following IP addresses:
Three Web servers: operating system Linux AS 4, application software Tomcat 5.0 + MySQL, with the following IP addresses:
webserver1:192.168.76.227
Installation and configuration of application software
Configuring a DNS server
This setup uses the BIND 9.5 that ships with FreeBSD. To configure BIND for the system, first edit the BIND configuration file /etc/namedb/named.conf, adding the following to the file:
Then add a file named cache.ibm.com.cn in the /etc/namedb/master directory, with the following contents:
$TTL
@ IN SOA search.ibm.com.cn. root.ibm.com.cn. (
20080807
3600
900
3600)
IN NS
1 IN PTR
wenjin IN A
wenjin IN A
wenjin IN A 192.168.76.225
With this configuration, when a user makes a request, DNS resolves the wenjin.cache.ibm.com.cn domain name to one of 192.168.76.223, 192.168.76.224, and 192.168.76.225 in round-robin fashion.
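The effect of DNS round-robin on clients can be sketched as follows. This is a simplified model under stated assumptions: `RoundRobinResolver` is a hypothetical class, it rotates the address list strictly cyclically, and it assumes each client connects to the first address in the answer. The ordering a real BIND server returns depends on its configuration.

```python
from collections import deque
from typing import Deque, List


class RoundRobinResolver:
    """Toy model of cyclic A-record rotation (illustrative only)."""

    def __init__(self, addresses: List[str]) -> None:
        self._addresses: Deque[str] = deque(addresses)

    def resolve(self) -> List[str]:
        # Return the current ordering, then rotate so the next query
        # sees a different address first.
        answer = list(self._addresses)
        self._addresses.rotate(-1)
        return answer


# The three Squid addresses named in the text.
resolver = RoundRobinResolver(["192.168.76.223", "192.168.76.224", "192.168.76.225"])

# Assumption: each client uses the first address in the answer.
first_choices = [resolver.resolve()[0] for _ in range(4)]
```

Over four successive queries the first address cycles through the three Squid servers and then wraps around, which spreads clients roughly evenly across them.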
After the configuration is complete, run rndc start to start the BIND service. To start BIND automatically at boot, set named_enable="YES" in /etc/rc.conf.
Use ps -a | grep named to check whether the BIND service is running;
Use nslookup wenjin.cache.ibm.com.cn to test whether the BIND service is working properly.
Configuring the Squid1 server
Download the squid-3.0.STABLE8.tar.gz source package, place it in the /home directory, and extract it: tar -zxvf squid-3.0.STABLE8.tar.gz
Set the configuration parameters: cd squid-3.0.STABLE8
./configure --prefix=/usr/local/squid
This installs Squid under /usr/local/squid. Compile and install: make && make install. After installation, the squid directory appears under /usr/local.
Configuring the Squid configuration file
Edit the squid.conf file: vi /usr/local/squid/etc/squid.conf
cache_effective_user squid
cache_effective_group squid
# Set the Squid host name; without it Squid will not start
visible_hostname squid1.nlc.gov.cn
# Configure Squid for accelerator mode
http_port accel vhost vport
icp_port 3130
# Configure Squid2 and Squid3 as siblings: when Squid1 does not find the requested resource in its own cache, it queries its neighbors' caches via ICP
cache_peer squid2.ibm.com.cn sibling 80 3130
cache_peer squid3.ibm.com.cn sibling 80 3130
# Squid1 has three parent nodes. The originserver option marks a peer as an origin server, and the round-robin option makes Squid distribute requests across the parents in turn. Squid also checks the health of the parents; if one is down, it fetches data from the remaining origin servers
cache_peer 210.82.118.195 parent 8080 0 no-query originserver round-robin name=webserver1
cache_peer 192.168.76.226 parent 8080 0 no-query originserver round-robin name=webserver2
cache_peer 192.168.76.227 parent 8080 0 no-query originserver round-robin name=webserver3
# Requests for the wenjin.cache.ibm.com.cn domain are forwarded to one of the three parents by round-robin
cache_peer_domain webserver1 wenjin.cache.ibm.com.cn
cache_peer_domain webserver2 wenjin.cache.ibm.com.cn
cache_peer_domain webserver3 wenjin.cache.ibm.com.cn
# Access control, log, and cache directory settings
acl localnet src 192.168.76.223 192.168.76.224 192.168.76.225
acl all src 0.0.0.0/0.0.0.0
http_access allow all
icp_access allow localnet
cache_log /usr/local/squid/var/logs/cache.log
access_log /usr/local/squid/var/logs/access.log squid
cache_dir ufs /usr/local/squid/var/cache/ 1000 16 256
# Some Squid tuning
# Largest object that can be cached: 10 MB
maximum_object_size 10240 KB
# Largest object kept in the memory cache: 512 KB
maximum_object_size_in_memory 512 KB
# Amount of memory Squid uses for caching
cache_mem 256 MB
Save and exit with :wq.
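The round-robin selection among parent nodes, with dead parents skipped, can be modeled as follows. This is a minimal sketch of the idea stated in the configuration comments, not Squid's peer-selection code: `RoundRobinParents`, `mark`, and `pick` are hypothetical names, and health status is set by hand rather than detected.

```python
from typing import Dict, List


class RoundRobinParents:
    """Toy model of round-robin parent selection (illustrative only)."""

    def __init__(self, names: List[str]) -> None:
        if not names:
            raise ValueError("at least one parent is required")
        self._names = list(names)
        self._alive: Dict[str, bool] = {name: True for name in names}
        self._next = 0

    def mark(self, name: str, alive: bool) -> None:
        # In this sketch health is set manually by the caller.
        if name not in self._alive:
            raise KeyError(name)
        self._alive[name] = alive

    def pick(self) -> str:
        # Try each parent at most once, starting after the last pick.
        for _ in range(len(self._names)):
            name = self._names[self._next]
            self._next = (self._next + 1) % len(self._names)
            if self._alive[name]:
                return name
        raise RuntimeError("no parent is alive")


parents = RoundRobinParents(["webserver1", "webserver2", "webserver3"])
healthy_round = [parents.pick() for _ in range(3)]

parents.mark("webserver2", False)  # simulate one Web server going down
degraded_round = [parents.pick() for _ in range(4)]
```

With all three parents up, requests rotate through them in order; once webserver2 is marked down, requests alternate between the two remaining parents.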
Add the following to the /etc/hosts file:
192.168.76.223
192.168.76.224
192.168.76.225 squid3.ibm.com.cn
Save and exit with :wq.
Check whether the Squid configuration file is correct: /usr/local/squid/bin/squid -k parse
Create the cache directories: /usr/local/squid/bin/squid -z
Start Squid: /usr/local/squid/bin/squid
Configuring the Squid2 and Squid3 servers
Squid2 and Squid3 are configured in the same way and with the same parameters as Squid1. After configuration is complete, start the Squid service on each of the two servers.
In the Squid log file cache.log, the following log message shows that the sibling relationships among the three Squid servers and the three parent peers were configured successfully.
2008/11/17 10:08:47| Ready to serve requests.
Test
Before testing, make sure that the DNS service, the three Squid services, and the three Web services are all running normally. Entering http://wenjin.cache.ibm.com.cn on the client displays the page correctly. How the server side responds is transparent to the client: the client does not know which Web server handled the request, and the failure of one of the Squid servers or Web servers does not affect normal operation of the service.