Using Squid reverse proxy to improve website performance

Source: Internet
Author: User
Tags website performance
After introducing the working principle of the Squid reverse proxy, this article shows that reverse proxy technology is useful for improving website access speed and for enhancing the availability and security of a website. The author uses DNS round robin and Squid reverse proxy technology to load-balance a website, improving its availability and reliability in a specific experimental environment.


Many large portals, such as SINA, now use Squid reverse proxy technology to speed up access to their websites. Requests for different URLs can be distributed to different backend Web servers, while Internet users see only the address of the reverse proxy server, which enhances the security of the site.

The concept of reverse proxy

The reverse proxy server, also known as a Web accelerator, sits in front of the Web server and acts as a content cache for it. Its system structure is shown in Figure 1.


Figure 1. System Structure

The reverse proxy server is set up on behalf of the Web server. The backend Web server is hidden from Internet users: they see only the address of the reverse proxy server and cannot tell how the backend is structured. When an Internet user requests a Web service, DNS resolves the requested domain name to the IP address of the reverse proxy server, so the URL request is sent to the reverse proxy server, which handles the user's requests and replies and interacts with the backend Web server. Using a reverse proxy server reduces the load on the backend Web server, improves access speed, and avoids the security risks of users communicating directly with the Web server.






How the Squid reverse proxy works

There is a good deal of reverse proxy software available today; the best known are Nginx and Squid. Nginx is a high-performance HTTP and reverse proxy server, as well as an IMAP/POP3/SMTP proxy server, developed by Igor Sysoev for rambler.ru, the second most visited site in Russia.

Squid began as a research project funded by the U.S. government with the aim of relieving insufficient network bandwidth. It supports HTTP, HTTPS, FTP, and other protocols, and is now among the most widely used and most complete proxy software on Unix systems. The rest of this article focuses on how the Squid reverse proxy is implemented and how it is applied to improve website performance.

The Squid reverse proxy server sits between the local Web server and the Internet. The organizational structure is shown in Figure 2:


Figure 2. Organizational Structure

When a client requests a Web service, DNS resolves the requested domain name to the IP address of the Squid reverse proxy server, so the client's URL request is sent to the reverse proxy server. If the requested resource is cached on the Squid reverse proxy server, it is returned directly to the client. Otherwise the reverse proxy server requests the resource from the backend Web server, returns the response to the client, and caches the response locally for the next requester.
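The hit and miss paths described above can be sketched in a few lines of Python. This is a toy model for illustration, not how Squid is implemented; the class `ReverseProxyCache` and the `fake_origin` backend are hypothetical names that do not come from the article.

```python
from typing import Callable, Dict


class ReverseProxyCache:
    """Toy model of the hit/miss flow of a caching reverse proxy."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # stands in for the backend Web server
        self._store: Dict[str, bytes] = {}   # URL -> cached response body
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:               # cache hit: answer without touching the backend
            self.hits += 1
            return self._store[url]
        self.misses += 1                     # cache miss: ask the backend, then keep a copy
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_origin(url: str) -> bytes:
    """Hypothetical backend that simply echoes the URL."""
    return f"content of {url}".encode()


proxy = ReverseProxyCache(fake_origin)
proxy.get("/index.html")   # miss: fetched from the backend
proxy.get("/index.html")   # hit: served from the cache
print(proxy.hits, proxy.misses)
```

The second request never reaches the backend, which is where the load reduction on the Web server comes from.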

A Squid reverse proxy typically caches only static content (such as HTML pages and images); CGI scripts and dynamic pages such as ASP or JSP are not cached by default. It decides whether to cache a page based on the HTTP headers returned by the Web server. The four most important headers are:

Last-Modified: tells the reverse proxy when the page was last modified
Expires: tells the reverse proxy when the page should be removed from the cache
Cache-Control: tells the reverse proxy whether the page should be cached
Pragma: carries implementation-specific directives; the most commonly used is Pragma: no-cache
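As a rough illustration of how these four headers can drive a caching decision, the following Python sketch applies a deliberately simplified rule set. The function name `is_cacheable` and the exact rules are assumptions made for this example; Squid and the HTTP caching rules handle many more cases (for instance `max-age` and revalidation), which are ignored here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def is_cacheable(headers: Mapping[str, str], now: Optional[datetime] = None) -> bool:
    """Simplified decision based on Cache-Control, Pragma, Expires and Last-Modified."""
    now = now or datetime.now(timezone.utc)
    h = {k.lower(): v for k, v in headers.items()}

    # Cache-Control and Pragma can forbid caching outright.
    cache_control = {d.strip().lower() for d in h.get("cache-control", "").split(",")}
    if cache_control & {"no-cache", "no-store", "private"}:
        return False
    if "no-cache" in h.get("pragma", "").lower():
        return False

    # Expires gives an explicit removal time.
    if "expires" in h:
        try:
            expires = parsedate_to_datetime(h["expires"])
        except (TypeError, ValueError):
            return False  # unparseable date: treat the response as already expired
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        return expires > now

    # No explicit expiry: this sketch only caches responses that carry Last-Modified.
    return "last-modified" in h


now = datetime(2008, 11, 17, tzinfo=timezone.utc)
print(is_cacheable({"Last-Modified": "Mon, 10 Nov 2008 08:00:00 GMT"}, now))
print(is_cacheable({"Pragma": "no-cache"}, now))
```

The first call reports a cacheable static page; the second is rejected because of `Pragma: no-cache`.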






Example: using a Squid reverse proxy to accelerate a website

The domain name in this example is wenjin.cache.ibm.com.cn. DNS round robin distributes each client request to one of the Squid reverse proxy servers. If that Squid has the requested resource cached, it returns the resource directly to the user. Otherwise it forwards the uncached request to its neighbor Squid servers and to the backend Web servers according to the configured rules. This reduces the load on the backend Web servers and improves the performance and security of the whole site. The system structure is shown in Figure 3:
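The lookup order described above (local cache, then neighbors, then a backend server) can be modeled with a short Python sketch. It is a simplified model under stated assumptions: the class `SquidNode` and the helper `make_backend` are hypothetical, the neighbor query stands in for the ICP exchange, and parents are chosen in plain rotation without health checks.

```python
from itertools import cycle
from typing import Callable, Dict, Iterator, List


class SquidNode:
    """Toy model of one proxy: local cache, sibling lookup, then round-robin parents."""

    def __init__(self, name: str, parents: List[Callable[[str], str]]) -> None:
        self.name = name
        self.cache: Dict[str, str] = {}
        self.siblings: List["SquidNode"] = []
        self._parents: Iterator[Callable[[str], str]] = cycle(parents)

    def handle(self, url: str) -> str:
        if url in self.cache:                  # 1. local cache
            return self.cache[url]
        for sibling in self.siblings:          # 2. ask the neighbors
            if url in sibling.cache:
                body = sibling.cache[url]
                break
        else:                                  # 3. fall back to a backend Web server
            body = next(self._parents)(url)
        self.cache[url] = body
        return body


calls: List[str] = []  # records which backend served each uncached request


def make_backend(name: str) -> Callable[[str], str]:
    def backend(url: str) -> str:
        calls.append(name)
        return f"{url} from {name}"
    return backend


backends = [make_backend(n) for n in ("webserver1", "webserver2", "webserver3")]
squid1 = SquidNode("squid1", backends)
squid2 = SquidNode("squid2", backends)
squid1.siblings, squid2.siblings = [squid2], [squid1]

squid1.handle("/a")   # miss everywhere: squid1 asks its first parent
squid2.handle("/a")   # found in squid1's cache: no backend call
squid2.handle("/b")   # miss everywhere: squid2 asks its first parent
print(calls)
```

Only two of the three requests reach a backend, because the second one is answered from the neighbor's cache.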


Figure 3. System Structure

System environment: one DNS server (operating system FreeBSD, software BIND 9.5, IP 192.168.76.222) and three Squid servers (operating system Linux AS 4, software Squid 3.0) with the following IP addresses:



squid3:192.168.76.225

Three Web servers (operating system Linux AS 4, application software Tomcat 5.0 + MySQL) with the following IP addresses:



webserver1:192.168.76.227

Installation and configuration of application software

Configuring a DNS server

The BIND 9.5 that ships with FreeBSD is used. To configure BIND, first edit its configuration file /etc/namedb/named.conf and add the following to the file:




};

Add the cache.ibm.com.cn file to the /etc/namedb/master directory with the following content:

$TTL
@ IN SOA search.ibm.com.cn. root.ibm.com.cn. (
20080807
3600
900

3600 )
IN NS
1 IN PTR
wenjin IN A 192.168.76.223
wenjin IN A 192.168.76.224
wenjin IN A 192.168.76.225

With this configuration, each time a user makes a request, DNS resolves the wenjin.cache.ibm.com.cn domain name to one of 192.168.76.223, 192.168.76.224, and 192.168.76.225 in round-robin fashion.
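The effect of several A records for one name can be illustrated with a small Python simulation. It assumes the server rotates the record order cyclically on every query; the actual order BIND returns depends on its configuration (for example the `rrset-order` option), and the class name `RoundRobinZone` is made up for this sketch.

```python
from collections import deque
from typing import Deque, List


class RoundRobinZone:
    """Simulates a name with several A records whose order rotates per query."""

    def __init__(self, addresses: List[str]) -> None:
        self._records: Deque[str] = deque(addresses)

    def resolve(self) -> List[str]:
        answer = list(self._records)
        self._records.rotate(-1)   # the next query starts with the following address
        return answer


zone = RoundRobinZone(["192.168.76.223", "192.168.76.224", "192.168.76.225"])

# Clients usually connect to the first address in the answer.
first_choices = [zone.resolve()[0] for _ in range(6)]
print(first_choices)
```

Over six queries each Squid address comes first twice, which is how the client load is spread across the three proxies.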

After the configuration is complete, run rndc start to start the BIND service. To start it automatically at boot, set named_enable="YES" in /etc/rc.conf.

Use ps -a | grep named to check whether the BIND service is running.

Use nslookup wenjin.cache.ibm.com.cn to test whether the BIND service is working properly.

Configuring the Squid1 server

Download the squid-3.0.STABLE8.tar.gz source package, place it in the /home directory, and extract it:

tar -zxvf squid-3.0.STABLE8.tar.gz

Set the configuration parameters:

cd squid-3.0.STABLE8

./configure --prefix=/usr/local/squid

This installs Squid under /usr/local/squid. Compile and install:

make && make install

After installation, the squid directory appears under /usr/local.

Configuring the Squid configuration file

Edit the squid.conf file:

vi /usr/local/squid/etc/squid.conf

cache_effective_user squid
cache_effective_group squid
######### Set the Squid host name; without it Squid will not start
visible_hostname squid1.nlc.gov.cn
############# Configure Squid for accelerator mode #################
http_port 80 accel vhost vport
icp_port 3130
##### Configure Squid2 and Squid3 as neighbors; when Squid1 does not find the requested resource in its cache,
##### it queries its neighbors for the cached object via ICP
cache_peer squid2.ibm.com.cn sibling 80 3130
cache_peer squid3.ibm.com.cn sibling 80 3130
##### Squid1 has three parent nodes. The originserver option marks each one as an origin server;
##### the round-robin option makes Squid distribute requests across the parents in turn.
##### Squid also checks the health of the parents, and if a parent is down,
##### it fetches data from the remaining origin servers
cache_peer 210.82.118.195 parent 8080 0 no-query originserver round-robin name=webserver1
cache_peer 192.168.76.226 parent 8080 0 no-query originserver round-robin name=webserver2
cache_peer 192.168.76.227 parent 8080 0 no-query originserver round-robin name=webserver3
##### Requests for the wenjin.cache.ibm.com.cn domain are forwarded to one of the three parents by round robin
cache_peer_domain webserver1 wenjin.cache.ibm.com.cn
cache_peer_domain webserver2 wenjin.cache.ibm.com.cn
cache_peer_domain webserver3 wenjin.cache.ibm.com.cn
##### Settings for access control, logs, and the cache directory
acl localnet src 192.168.76.223 192.168.76.224 192.168.76.225
acl all src 0.0.0.0/0.0.0.0
http_access allow all
icp_access allow localnet
cache_log /usr/local/squid/var/logs/cache.log
access_log /usr/local/squid/var/logs/access.log squid
cache_dir ufs /usr/local/squid/var/cache 1000 16 256
####### Some Squid tuning ###############
### The largest object that can be cached is 10 MB
maximum_object_size 10240 KB
### The largest object kept in the memory cache is 512 KB
maximum_object_size_in_memory 512 KB
### Amount of memory Squid uses for caching
cache_mem 256 MB

Save and exit with :wq.

Add the following to the /etc/hosts file:

192.168.76.223  
192.168.76.224
192.168.76.225 squid3.ibm.com.cn

Save and exit with :wq.

Check whether the Squid configuration file is correct: /usr/local/squid/bin/squid -k parse

Create the cache directories: /usr/local/squid/bin/squid -z

Start Squid: /usr/local/squid/bin/squid

Configuring the Squid2 and Squid3 servers

Squid2 and Squid3 are configured in the same way and with the same parameters as Squid1. After the configuration is complete, start the Squid service on both servers.

In the Squid log file cache.log, the following messages show that the sibling relationships among the three Squid servers and the three parent peers have been configured successfully.






2008/11/17 10:08:47| Ready to serve requests.

Test

Before testing, make sure the DNS service, the three Squid services, and the three Web services are all running normally. When you enter http://wenjin.cache.ibm.com.cn on the client, the page is displayed correctly. The server-side processing is transparent to the client: the client does not know which Web server handled the request, and the failure of any one Squid server or Web server does not affect normal operation of the service.
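Why a single failed Web server does not interrupt the service can be seen in a small Python sketch of round-robin selection that skips unhealthy peers. This is an illustrative model only, not Squid's actual peer selection code; the function `pick_parent` and the `alive` table are hypothetical.

```python
from typing import Dict, List


def pick_parent(parents: List[str], alive: Dict[str, bool], start: int) -> str:
    """Return the first live parent at or after position start, wrapping around."""
    for offset in range(len(parents)):
        candidate = parents[(start + offset) % len(parents)]
        if alive.get(candidate, False):
            return candidate
    raise RuntimeError("no backend Web server is reachable")


parents = ["webserver1", "webserver2", "webserver3"]
alive = {"webserver1": True, "webserver2": False, "webserver3": True}  # webserver2 is down

# Six consecutive requests, each starting one position further along the list.
served_by = [pick_parent(parents, alive, start=i) for i in range(6)]
print(served_by)
```

Every request is still answered; the ones that would have gone to webserver2 are simply passed on to the next live server.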
