Reposted from: http://www.ibm.com/developerworks/cn/linux/l-cn-squid/
After introducing the working principle of the Squid reverse proxy, this article shows that reverse proxy technology is well suited to improving website access speed and to enhancing website availability and security. The author uses DNS round robin and Squid reverse proxies to load-balance a website, which improved the site's availability and reliability in a specific experimental environment.
Many large portals, such as SINA, now use Squid reverse proxy technology to speed up access to their websites. Requests for different URLs can be distributed to different backend Web servers, while Internet users see only the address of the reverse proxy server, which improves the security of the site.
The concept of reverse proxy
A reverse proxy server, also known as a Web accelerator, sits in front of the Web server and acts as a content cache for it. Its system structure is shown in Figure 1.
Figure 1. System Structure
A reverse proxy server is set up on behalf of the Web servers. The backend Web servers are hidden from Internet users, who see only the address of the reverse proxy server and cannot tell how the backend is structured. When an Internet user requests a Web service, DNS resolves the requested domain name to the IP address of the reverse proxy server, so the URL request is sent to the reverse proxy. The reverse proxy handles the user's request and response and interacts with the backend Web servers. Using a reverse proxy reduces the load on the backend Web servers, improves access speed, and avoids the security risks of users communicating directly with the Web servers.
Implementation principle of the Squid reverse proxy
There are many reverse proxy software packages; among the better known are Nginx and Squid. Nginx is a high-performance HTTP and reverse proxy server, and also an IMAP/POP3/SMTP proxy server, developed by Igor Sysoev for Rambler.ru, the second most visited site in Russia.
Squid began as a research project heavily funded by the U.S. government, aimed at solving the problem of insufficient network bandwidth. It supports HTTP, HTTPS, FTP, and other protocols, and is among the most widely used and most full-featured software of its kind on Unix systems. The rest of this article focuses on how the Squid reverse proxy works and how it is applied to improve website performance.
The Squid reverse proxy server sits between the local Web servers and the Internet. The organizational structure is shown in Figure 2.
Figure 2. Organization Structure
When a client requests a Web service, DNS resolves the requested domain name to the IP address of the Squid reverse proxy server, so the client's URL request is sent to the reverse proxy. If the requested resource is cached on the Squid reverse proxy server, it is returned directly to the client. Otherwise the reverse proxy requests the resource from the backend Web server, returns the response to the client, and caches the response locally for the next requester.
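The hit and miss flow described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name ReverseProxyCache, the fake_origin function, and the in-memory dictionary are hypothetical and are not how Squid stores or fetches objects.

```python
from typing import Callable, Dict


class ReverseProxyCache:
    """Toy model of the hit/miss flow; not Squid's actual implementation."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin   # hypothetical stand-in for the backend Web server
        self._store: Dict[str, str] = {}  # URL -> cached response body
        self.origin_requests = 0          # how many times the backend was contacted

    def get(self, url: str) -> str:
        if url in self._store:            # hit: answer from the local cache
            return self._store[url]
        self.origin_requests += 1         # miss: ask the backend server
        body = self._fetch(url)
        self._store[url] = body           # keep a copy for the next requester
        return body


def fake_origin(url: str) -> str:
    # Hypothetical backend used only for this illustration.
    return "content of " + url


proxy = ReverseProxyCache(fake_origin)
first = proxy.get("/index.html")   # miss, goes to the backend
second = proxy.get("/index.html")  # hit, served from the cache
```

After the two calls, the backend has been contacted only once, which is the load reduction the article attributes to the reverse proxy.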
The Squid reverse proxy typically caches only cacheable data (such as HTML pages and images); the output of CGI scripts and dynamic programs such as ASP or JSP is not cached by default. Squid caches static pages according to the HTTP headers returned by the Web server. The four most important HTTP headers are:
Last-Modified: tells the reverse proxy when the page was last modified
Expires: tells the reverse proxy when the page should be removed from the cache
Cache-Control: tells the reverse proxy whether the page should be cached
Pragma: carries implementation-specific directives; the most commonly used is Pragma: no-cache
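To make the role of these four headers concrete, here is a deliberately simplified Python sketch of a cacheability check. The function name is_cacheable and its rules are assumptions made for illustration. Real proxies, Squid included, apply many more rules, and treating Cache-Control: no-cache as "do not cache" is a simplification of what the directive means.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def is_cacheable(headers: Mapping[str, str], now: Optional[datetime] = None) -> bool:
    """Simplified decision based only on the four headers discussed above."""
    now = now or datetime.now(timezone.utc)
    h = {k.lower(): v for k, v in headers.items()}

    # Cache-Control and Pragma can forbid caching outright.
    cache_control = h.get("cache-control", "").lower()
    directives = {d.strip().split("=")[0] for d in cache_control.split(",") if d.strip()}
    if directives & {"no-cache", "no-store", "private"}:
        return False
    if "no-cache" in h.get("pragma", "").lower():
        return False

    # Expires says when the copy must be dropped.
    expires = h.get("expires")
    if expires:
        try:
            return parsedate_to_datetime(expires) > now
        except (TypeError, ValueError):
            return False  # unparseable date: treat as already expired

    # Without an expiry time, require Last-Modified so the copy can be revalidated.
    return "last-modified" in h
```

For example, a response carrying only Last-Modified is accepted, while one carrying Pragma: no-cache is rejected regardless of its other headers.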
Example: accelerating a website with Squid reverse proxies
The domain name in this example is wenjin.cache.ibm.com.cn. DNS round robin distributes each client request to one of the Squid reverse proxy servers. If that Squid has cached the requested resource, it returns it directly to the user; otherwise it forwards the uncached request to its neighbor Squids and to the backend Web servers according to the configured rules. This reduces the load on the backend Web servers and improves the performance and security of the whole site. The system structure is shown in Figure 3.
Figure 3. System Architecture
The configuration is as follows:
One DNS server: operating system FreeBSD, software BIND 9.5, IP 192.168.76.222.
Three Squid servers: operating system Linux AS 4, software Squid 3.0, with the following IP addresses:
squid1:192.168.76.223
squid2:192.168.76.224
squid3:192.168.76.225
Three Web servers: operating system Linux AS 4, application software Tomcat 5.0 + MySQL, with the following IP addresses:
webserver1:210.82.118.195
webserver2:192.168.76.226
webserver3:192.168.76.227
Installation and configuration of application software
Configuring a DNS Server
This setup uses the BIND 9.5 that ships with FreeBSD. To configure BIND, first edit its configuration file, /etc/namedb/named.conf, and add the following:
zone "cache.ibm.com.cn" {
    type master;
    file "master/cache.ibm.com.cn";
};
Then create the file cache.ibm.com.cn in the /etc/namedb/master directory with the following contents:
$TTL 3600
@       IN SOA search.ibm.com.cn. root.ibm.com.cn. (
        20080807 ; Serial
        3600     ; Refresh
        900      ; Retry
        3600000  ; Expire
        3600 )   ; Minimum
        IN NS  search.ibm.com.cn.
1       IN PTR localhost.ibm.com.cn.
wenjin  IN A   192.168.76.223
wenjin  IN A   192.168.76.224
wenjin  IN A   192.168.76.225
With this in place, each time a user makes a request, DNS resolves wenjin.cache.ibm.com.cn to one of 192.168.76.223, 192.168.76.224, and 192.168.76.225 in turn (round robin).
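The effect of DNS round robin can be modeled with a short Python sketch. It assumes the simplest possible behavior, rotating the list of A records by one position per query. The class name RoundRobinRecordSet is hypothetical, and the order a real BIND server returns depends on its version and configuration.

```python
from collections import deque
from typing import Deque, List


class RoundRobinRecordSet:
    """Rotates a list of A records by one position per query (simplified model)."""

    def __init__(self, addresses: List[str]) -> None:
        self._addresses: Deque[str] = deque(addresses)

    def answer(self) -> List[str]:
        current = list(self._addresses)
        self._addresses.rotate(-1)  # the next query starts with the next address
        return current


wenjin = RoundRobinRecordSet(["192.168.76.223", "192.168.76.224", "192.168.76.225"])
# Clients usually connect to the first address in the answer.
first_choices = [wenjin.answer()[0] for _ in range(4)]
```

Across four consecutive queries, the first address in the answer cycles through the three Squid servers and then wraps around to the first one again.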
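After the configuration is complete, run rndc start to start the BIND service. To have BIND start at boot, set named_enable="YES" in /etc/rc.conf. With that setting in place the service can also be started through the rc script, /etc/rc.d/named start, which is the route to take if rndc start is rejected, since in BIND 9 rndc only controls a named process that is already running.
Use ps -a | grep named to check whether the BIND service is up.
Use nslookup wenjin.cache.ibm.com.cn to test whether the BIND service is working properly.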
Configuring the squid1 server
Download the squid-3.0.STABLE8.tar.gz source package, place it in the /home directory, and extract it:
tar -zxvf squid-3.0.STABLE8.tar.gz
Set the configuration parameters, which install Squid under /usr/local/squid:
cd squid-3.0.STABLE8
./configure --prefix=/usr/local/squid
Compile and install:
make && make install
After installation, a squid directory appears under /usr/local.
Configuring the Squid configuration file
Edit the squid.conf file:
vi /usr/local/squid/etc/squid.conf
cache_effective_user squid
cache_effective_group squid
# Set the Squid host name; without it Squid will not start
visible_hostname squid1.nlc.gov.cn
# Configure Squid for accelerator mode
http_port 80 accel vhost vport
icp_port 3130
# Configure squid2 and squid3 as neighbors; when squid1 does not find the
# requested resource in its own cache, it queries its neighbors through ICP
cache_peer squid2.ibm.com.cn sibling 80 3130
cache_peer squid3.ibm.com.cn sibling 80 3130
# The three parent nodes of squid1. The originserver option marks a peer as an
# origin server; round-robin makes Squid distribute requests among the parents
# in turn. Squid also checks the health of the parents, and if one is down it
# fetches data from the remaining origin servers
cache_peer 210.82.118.195 parent 8080 0 no-query originserver round-robin \
name=webserver1
cache_peer 192.168.76.226 parent 8080 0 no-query originserver round-robin \
name=webserver2
cache_peer 192.168.76.227 parent 8080 0 no-query originserver round-robin \
name=webserver3
# Forward requests for the wenjin.cache.ibm.com.cn domain to one of the three
# parent nodes through round robin
cache_peer_domain webserver1 wenjin.cache.ibm.com.cn
cache_peer_domain webserver2 wenjin.cache.ibm.com.cn
cache_peer_domain webserver3 wenjin.cache.ibm.com.cn
# Access control, log, and cache directory settings
acl localnet src 192.168.76.223 192.168.76.224 192.168.76.225
acl all src 0.0.0.0/0.0.0.0
http_access allow all
icp_access allow localnet
cache_log /usr/local/squid/var/logs/cache.log
access_log /usr/local/squid/var/logs/access.log squid
cache_dir ufs /usr/local/squid/var/cache/ 1000 16 256
# Some tuning for Squid
# The largest object that can be cached is 10 MB
maximum_object_size 10240 KB
# The largest object cached in memory is 512 KB
maximum_object_size_in_memory 512 KB
# Amount of memory Squid uses for caching
cache_mem 256 MB
Save and exit with :wq.
Add the following to the /etc/hosts file:
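The peer selection that this configuration describes (ask the siblings first, then pick a parent in round-robin order while skipping parents that are down) can be sketched in Python as follows. This is a simplified model, not Squid's algorithm: the class PeerSelector and the callbacks sibling_has and parent_alive are hypothetical names introduced for illustration, and real ICP queries and health checks work differently.

```python
from typing import Callable, List, Optional


class PeerSelector:
    """Simplified model: siblings first, then round robin over live parents."""

    def __init__(self, siblings: List[str], parents: List[str]) -> None:
        self.siblings = siblings
        self.parents = parents
        self._next = 0  # index of the parent to try first on the next miss

    def choose(
        self,
        url: str,
        sibling_has: Callable[[str, str], bool],  # stand-in for an ICP query
        parent_alive: Callable[[str], bool],      # stand-in for a health check
    ) -> Optional[str]:
        # 1. Ask each sibling whether it already holds the object.
        for sibling in self.siblings:
            if sibling_has(sibling, url):
                return sibling
        # 2. Otherwise walk the parents in round-robin order, skipping dead ones.
        for offset in range(len(self.parents)):
            index = (self._next + offset) % len(self.parents)
            candidate = self.parents[index]
            if parent_alive(candidate):
                self._next = (index + 1) % len(self.parents)
                return candidate
        return None  # no sibling hit and no live parent


selector = PeerSelector(
    ["squid2.ibm.com.cn", "squid3.ibm.com.cn"],
    ["webserver1", "webserver2", "webserver3"],
)
no_sibling_hit = lambda peer, url: False
webserver2_down = lambda peer: peer != "webserver2"
picks = [selector.choose("/index.html", no_sibling_hit, webserver2_down) for _ in range(4)]
```

With webserver2 marked as down and no sibling holding the object, the requests alternate between webserver1 and webserver3, which mirrors the failover behavior described in the configuration comments.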
192.168.76.223 squid1.ibm.com.cn
192.168.76.224 squid2.ibm.com.cn
192.168.76.225 squid3.ibm.com.cn
Save and exit with :wq.
Check that the Squid configuration file is correct:
/usr/local/squid/bin/squid -k parse
Create the cache directories:
/usr/local/squid/bin/squid -z
Start Squid:
/usr/local/squid/bin/squid
Configuring the squid2 and squid3 servers
The squid2 and squid3 servers are installed and configured in the same way as squid1, apart from per-server values such as the host name and the sibling entries. After the configuration is complete, start the Squid service on each of the two servers.
The following messages in the Squid log file cache.log show that the sibling relationships among the three Squid servers and the three parent peers were configured successfully:
2008/11/17 10:08:47| Configuring sibling squid1.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring sibling squid3.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring Parent 210.82.118.195/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.226/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.227/8080/0
2008/11/17 10:08:47| Ready to serve requests.
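A small Python sketch can check such a log automatically. It assumes that the lines have exactly the shape quoted above ("Configuring sibling host/http-port/icp-port" and "Configuring Parent host/http-port/icp-port"); the function name configured_peers is hypothetical, and other Squid versions may word these messages differently.

```python
import re
from typing import Dict, List

# Pattern derived from the log lines quoted in this article (an assumption,
# not a documented log format).
PEER_LINE = re.compile(
    r"Configuring (?P<kind>sibling|parent) (?P<host>[^/\s]+)/(?P<http>\d+)/(?P<icp>\d+)",
    re.IGNORECASE,
)


def configured_peers(log_text: str) -> Dict[str, List[str]]:
    """Collect the sibling and parent hosts mentioned in cache.log text."""
    peers: Dict[str, List[str]] = {"sibling": [], "parent": []}
    for match in PEER_LINE.finditer(log_text):
        peers[match.group("kind").lower()].append(match.group("host"))
    return peers


sample = """\
2008/11/17 10:08:47| Configuring sibling squid1.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring sibling squid3.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring Parent 210.82.118.195/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.226/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.227/8080/0
2008/11/17 10:08:47| Ready to serve requests.
"""
peers = configured_peers(sample)
```

For the sample above, peers lists two siblings and three parents, so a script could compare those counts with the expected topology before the site is tested.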
Test
Before testing, make sure the DNS service, the three Squid services, and the three Web services are all running normally. Entering http://wenjin.cache.ibm.com.cn in a client browser should then display the page correctly. The server side is transparent to the client: the client does not know which Web server handles the request, and the failure of any single Squid server or Web server does not affect the normal operation of the service.
Summary
Squid is open-source software whose reverse proxy technology can improve the access speed of a website. In a real network environment, three Squid reverse proxy servers were used to accelerate the website, combined with DNS round robin to balance the load. After a period of testing and trial operation, the site's access speed and availability improved considerably, and no service interruption has been observed.