Using Squid reverse proxy to improve website performance


Reposted from: http://www.ibm.com/developerworks/cn/linux/l-cn-squid/


This article introduces the working principle of the Squid reverse proxy and shows how reverse proxy technology helps improve a website's access speed, availability, and security. The author uses DNS round-robin and Squid reverse proxies to load-balance a website, which improved the site's availability and reliability in the experimental environment described here.

Many large portals, such as SINA, now use Squid reverse proxy technology to speed up access to their sites. Requests for different URLs can be distributed to different back-end Web servers, while Internet users see only the address of the reverse proxy server, which improves the security of the site.

The concept of reverse proxy

A reverse proxy server, also known as a Web accelerator, sits in front of the Web server and acts as a content cache for it. Its system structure is shown in Figure 1.

Figure 1. System structure

A reverse proxy server is deployed on behalf of the Web servers. The back-end Web servers are hidden from Internet users: users see only the address of the reverse proxy and do not know how the back-end servers are organized. When a user requests a Web service, DNS resolves the requested domain name to the IP address of the reverse proxy server, so the URL request is sent to the reverse proxy, which handles the user's request and response and interacts with the back-end Web servers. Using a reverse proxy reduces the load on the back-end Web servers, improves access speed, and avoids the security risks of users communicating directly with the Web servers.

How the Squid reverse proxy works

There are many reverse proxy packages; among the best known are Nginx and Squid. Nginx is a high-performance HTTP and reverse proxy server, and also an IMAP/POP3/SMTP proxy server, developed by Igor Sysoev for Rambler.ru, the second most visited site in Russia.

Squid originated in a research project funded by the U.S. government that aimed to address the shortage of network bandwidth. It supports HTTP, HTTPS, FTP, and other protocols, and is among the most widely used and most full-featured proxy packages on Unix systems. The rest of this article focuses on how the Squid reverse proxy works and how it can be applied to improve website performance.

The Squid reverse proxy server sits between the local Web servers and the Internet. The organizational structure is shown in Figure 2.

Figure 2. Organizational structure

When a client requests a Web service, DNS resolves the domain name to the IP address of the Squid reverse proxy server, so the client's URL request is sent to the reverse proxy. If the requested resource is already cached on the Squid server, it is returned directly to the client. Otherwise, the reverse proxy requests the resource from the back-end Web server, returns the response to the client, and caches the response locally for the next requester.
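The hit-or-miss flow described above can be sketched in a few lines of Python. This is only a toy model, not Squid's implementation: the class name `ReverseProxyCache`, the `demo_origin` function, and the in-memory dictionary are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple


class ReverseProxyCache:
    """Toy model of the flow: serve from cache, otherwise fetch and store."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}

    def handle(self, url: str) -> Tuple[str, bytes]:
        # Cache hit: answer directly without contacting the back-end server.
        if url in self._store:
            return "HIT", self._store[url]
        # Cache miss: ask the back-end server, then keep a local copy.
        body = self._fetch(url)
        self._store[url] = body
        return "MISS", body


origin_calls: List[str] = []


def demo_origin(url: str) -> bytes:
    """Hypothetical stand-in for a back-end Web server."""
    origin_calls.append(url)
    return ("content of " + url).encode("utf-8")


proxy = ReverseProxyCache(demo_origin)
first = proxy.handle("/index.html")
second = proxy.handle("/index.html")
print(first[0], second[0], len(origin_calls))
```

The second request for the same URL is answered from the local copy, so the stand-in origin is contacted only once. A real proxy would also expire and revalidate entries, which this sketch leaves out.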

A Squid reverse proxy typically caches only cacheable data (such as HTML pages and images); output from CGI scripts and dynamic programs such as ASP or JSP is not cached by default. Squid caches static pages according to the HTTP headers returned by the Web server. The four most important HTTP headers are:

Last-Modified: tells the reverse proxy when the page was last modified
Expires: tells the reverse proxy when the page should be removed from the cache
Cache-Control: tells the reverse proxy whether the page should be cached
Pragma: carries implementation-specific directives; the most commonly used is Pragma: no-cache
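To make the role of these four headers concrete, here is a simplified Python sketch of a cacheability check. It is an illustration under stated assumptions, not Squid's actual algorithm: the function names `parse_cache_control` and `is_cacheable` are hypothetical, and treating a response with only Last-Modified as cacheable is a deliberate simplification of heuristic freshness.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Mapping, Optional


def parse_cache_control(value: str) -> Dict[str, str]:
    """Split a Cache-Control value into {directive: argument}; argument may be ''."""
    directives: Dict[str, str] = {}
    for part in value.split(","):
        name, _, argument = part.strip().partition("=")
        if name:
            directives[name.lower()] = argument.strip('"')
    return directives


def is_cacheable(headers: Mapping[str, str], now: Optional[datetime] = None) -> bool:
    """Simplified decision: may a shared cache store this response?"""
    now = now or datetime.now(timezone.utc)
    lowered = {name.lower(): value for name, value in headers.items()}
    directives = parse_cache_control(lowered.get("cache-control", ""))

    # Explicit "do not cache" signals win over everything else.
    if directives.keys() & {"no-store", "no-cache", "private"}:
        return False
    if "no-cache" in lowered.get("pragma", "").lower():
        return False
    # An explicit lifetime: max-age takes precedence over Expires.
    if "max-age" in directives:
        try:
            return int(directives["max-age"]) > 0
        except ValueError:
            return False
    if "expires" in lowered:
        try:
            expires = parsedate_to_datetime(lowered["expires"])
        except (TypeError, ValueError):
            return False  # An unparsable date is treated as already expired.
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        return expires > now
    # Assumption: a validator alone is enough for this sketch to cache.
    return "last-modified" in lowered


check_time = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_cacheable({"Expires": "Tue, 01 Jan 2030 00:00:00 GMT"}, check_time))
print(is_cacheable({"Pragma": "no-cache"}, check_time))
```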

An example of accelerating a website with Squid reverse proxies

The domain name in this example is wenjin.cache.ibm.com.cn. DNS round-robin distributes each client request to one of the Squid reverse proxy servers. If that Squid server has the requested resource cached, it returns the resource directly to the user. Otherwise, it forwards the uncached request to its neighbor Squid servers and the back-end Web servers according to the configured rules. This reduces the load on the back-end Web servers and improves the performance and security of the whole site. The system structure is shown in Figure 3.

Figure 3. System architecture

Configuration:

One DNS server: operating system FreeBSD, software BIND 9.5, IP 192.168.76.222.

Three Squid servers: operating system Linux AS 4, software Squid 3.0. The corresponding IP addresses are as follows:

squid1: 192.168.76.223
squid2: 192.168.76.224
squid3: 192.168.76.225

Three Web servers: operating system Linux AS 4, application software Tomcat 5.0 + MySQL. The corresponding IP addresses are as follows:

webserver1: 210.82.118.195
webserver2: 192.168.76.226
webserver3: 192.168.76.227

Installation and configuration of application software

Configuring a DNS Server

This setup uses the BIND 9.5 that ships with FreeBSD. To configure BIND, first edit its configuration file, /etc/namedb/named.conf, and add the following:

zone "cache.ibm.com.cn" {
        type master;
        file "master/cache.ibm.com.cn";
};

Create the cache.ibm.com.cn file in the /etc/namedb/master directory with the following content:

$TTL    3600
@       IN      SOA     search.ibm.com.cn. root.ibm.com.cn. (
                                20080807        ; Serial
                                3600    ; Refresh
                                900     ; Retry
                                3600000 ; Expire
                                3600 )  ; Minimum
        IN      NS      search.ibm.com.cn.
1       IN      PTR     localhost.ibm.com.cn.
wenjin  IN      A       192.168.76.223
wenjin  IN      A       192.168.76.224
wenjin  IN      A       192.168.76.225

With this configuration, when a user makes a request, DNS resolves wenjin.cache.ibm.com.cn to one of 192.168.76.223, 192.168.76.224, and 192.168.76.225 in round-robin fashion.
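The effect of DNS round-robin can be modeled with a small Python sketch. It assumes a strictly cyclic ordering of the three A records; the real ordering BIND returns depends on its configuration and may differ. The class name `RoundRobinResolver` is hypothetical and exists only for this illustration.

```python
from collections import deque
from typing import Deque, List


class RoundRobinResolver:
    """Toy model of a name server that rotates its A records on every query."""

    def __init__(self, addresses: List[str]) -> None:
        self._addresses: Deque[str] = deque(addresses)

    def resolve(self) -> List[str]:
        # Return the current ordering, then rotate so that the next query
        # sees a different address in first position.
        answer = list(self._addresses)
        self._addresses.rotate(-1)
        return answer


resolver = RoundRobinResolver(
    ["192.168.76.223", "192.168.76.224", "192.168.76.225"]
)
# Clients usually connect to the first address in the answer.
first_choices = [resolver.resolve()[0] for _ in range(6)]
print(first_choices)
```

Over six queries each Squid server appears in first position twice, which is how the load ends up spread across the three proxies.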

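After the configuration is complete, run rndc start to start the BIND service. (In BIND 9, rndc normally controls a named process that is already running, so if this command fails, starting named through /etc/rc.d/named start on FreeBSD is a likely alternative.) To start BIND at boot, set named_enable="YES" in /etc/rc.conf.

Use ps -a | grep named to check whether the BIND service is running.

Use nslookup wenjin.cache.ibm.com.cn to test whether the BIND service resolves the name correctly.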

Configuring the squid1 server

Download the squid-3.0.STABLE8.tar.gz source package, place it in the /home directory, and extract it:

tar -zxvf squid-3.0.STABLE8.tar.gz

Set the configuration parameters:

cd squid-3.0.STABLE8

./configure --prefix=/usr/local/squid

This installs Squid under the /usr/local/squid directory. Compile and install:

make && make install

After installation, the squid directory appears under /usr/local.

Configuring the Squid configuration file

Edit the squid.conf file:

vi /usr/local/squid/etc/squid.conf

cache_effective_user squid
cache_effective_group squid

######### Set the Squid host name; without it, Squid will not start
visible_hostname squid1.nlc.gov.cn

############# Configure Squid in accelerator mode #################
http_port 80 accel vhost vport
icp_port 3130

##### Configure squid2 and squid3 as neighbors. When squid1 does not find the
##### requested resource in its own cache, it queries its neighbors through ICP
cache_peer squid2.ibm.com.cn sibling 80 3130
cache_peer squid3.ibm.com.cn sibling 80 3130

##### The three parent nodes of squid1. The originserver option marks a peer as
##### an origin server, and round-robin makes Squid distribute requests among
##### the parents in rotation. Squid also checks the health of the parents; if
##### one is down, it fetches data from the remaining origin servers
cache_peer 210.82.118.195 parent 8080 0 no-query originserver round-robin name=webserver1
cache_peer 192.168.76.226 parent 8080 0 no-query originserver round-robin name=webserver2
cache_peer 192.168.76.227 parent 8080 0 no-query originserver round-robin name=webserver3

#### Forward requests for the wenjin.cache.ibm.com.cn domain to one of the three
#### parents through round-robin (cache_peer_domain takes one peer name per line)
cache_peer_domain webserver1 wenjin.cache.ibm.com.cn
cache_peer_domain webserver2 wenjin.cache.ibm.com.cn
cache_peer_domain webserver3 wenjin.cache.ibm.com.cn

##### Access control, log, and cache directory settings
acl localnet src 192.168.76.223 192.168.76.224 192.168.76.225
acl all src 0.0.0.0/0.0.0.0
http_access allow all
icp_access allow localnet
cache_log /usr/local/squid/var/logs/cache.log
access_log /usr/local/squid/var/logs/access.log squid
cache_dir ufs /usr/local/squid/var/cache/ 1000 16 256

####### Some Squid tuning ###############
### The largest object that can be cached is 10 MB
maximum_object_size 10240 KB
### The largest object cached in memory is 512 KB
maximum_object_size_in_memory 512 KB
### Amount of memory Squid uses for caching
cache_mem 256 MB
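The round-robin selection with health checking that the parent settings describe can be illustrated with a short Python sketch. It is a conceptual model only and does not reproduce Squid's internal peer-selection code; the class `RoundRobinParents` and its `mark` and `pick` methods are hypothetical names.

```python
from typing import Dict, List, Optional


class RoundRobinParents:
    """Toy model: rotate over parent servers, skipping any marked as down."""

    def __init__(self, parents: List[str]) -> None:
        self._parents = list(parents)
        self._alive: Dict[str, bool] = {name: True for name in parents}
        self._next = 0  # Index of the parent to try first on the next pick.

    def mark(self, parent: str, alive: bool) -> None:
        self._alive[parent] = alive

    def pick(self) -> Optional[str]:
        # Try each parent at most once, starting from the rotation point.
        for offset in range(len(self._parents)):
            index = (self._next + offset) % len(self._parents)
            candidate = self._parents[index]
            if self._alive[candidate]:
                self._next = (index + 1) % len(self._parents)
                return candidate
        return None  # Every parent is down.


parents = RoundRobinParents(["webserver1", "webserver2", "webserver3"])
healthy_order = [parents.pick() for _ in range(4)]
parents.mark("webserver2", alive=False)
degraded_order = [parents.pick() for _ in range(4)]
print(healthy_order)
print(degraded_order)
```

While all three parents are up, requests rotate through them in order; once webserver2 is marked down, the rotation continues over the two remaining servers.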

Save and exit with :wq.

Add the following entries to the /etc/hosts file:

192.168.76.223  squid1.ibm.com.cn
192.168.76.224  squid2.ibm.com.cn
192.168.76.225  squid3.ibm.com.cn

Save and exit with :wq.

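Check whether the Squid configuration file is correct (depending on the build, the binary may be installed as /usr/local/squid/sbin/squid instead):

/usr/local/squid/bin/squid -k parse

Create the cache directories:

/usr/local/squid/bin/squid -z

Start Squid:

/usr/local/squid/bin/squid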

Configuring the squid2 and squid3 servers

The squid2 and squid3 servers are installed and configured in the same way as squid1, with each server listing the other two as its siblings. After the configuration is complete, start the Squid service on both servers.

In the Squid log file cache.log, the following messages show that the siblings among the three Squid servers and the three parent peers were configured successfully.

2008/11/17 10:08:47| Configuring sibling squid1.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring sibling squid3.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring Parent 210.82.118.195/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.226/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.227/8080/0
2008/11/17 10:08:47| Ready to serve requests.
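Checking these messages by eye works for three servers, but the same check can be scripted. The Python sketch below extracts the configured peers from log text in the format shown above. The function name `configured_peers` and the regular expression are assumptions written for this article's log excerpt, and other Squid versions may word these lines differently.

```python
import re
from typing import Dict, List

# Matches lines such as "Configuring Parent 192.168.76.226/8080/0".
PEER_LINE = re.compile(
    r"Configuring\s+(sibling|parent)\s+(\S+)/(\d+)/(\d+)", re.IGNORECASE
)


def configured_peers(log_text: str) -> Dict[str, List[str]]:
    """Collect sibling and parent host names announced in cache.log text."""
    peers: Dict[str, List[str]] = {"sibling": [], "parent": []}
    for line in log_text.splitlines():
        match = PEER_LINE.search(line)
        if match:
            peers[match.group(1).lower()].append(match.group(2))
    return peers


sample_log = """\
2008/11/17 10:08:47| Configuring sibling squid1.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring sibling squid3.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring Parent 210.82.118.195/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.226/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.227/8080/0
2008/11/17 10:08:47| Ready to serve requests.
"""

peers = configured_peers(sample_log)
print(len(peers["sibling"]), "siblings,", len(peers["parent"]), "parents")
```

For the excerpt above, a correctly configured server should report two siblings and three parents.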

Test

Before testing, make sure that the DNS service, the three Squid services, and the three Web services are all running normally. When http://wenjin.cache.ibm.com.cn is entered on a client, the page is displayed correctly. The server-side processing is transparent to the client: the client does not know which Web server handled the request, and the failure of one Squid server or one Web server does not affect the normal operation of the service.

Summary

Squid is open-source software whose reverse proxy capability can improve the access speed of a website. In a real network environment, three Squid reverse proxy servers were used to accelerate the website, combined with DNS round-robin to load-balance it. After a period of testing and tuning, the site's access speed and availability improved considerably, and no service interruption was observed.
