Ji Zengxiong, from: http://www.cnlinux.net
If you reprint this article, please keep this information. Thank you!
I. What is a proxy server? The role of a proxy.
In the real world, we often do things on other people's behalf, such as paying someone's electricity bill. In that case you are not the owner of the electricity meter; you act as an agent. In the online world, a proxy plays the role of the person who pays the bill. When we send a connection request, the proxy communicates with the target server on our behalf and fetches the information for us.
A caching proxy trades space for time: it spends storage on keeping copies of data so that later requests take less time.
The steps for a client to access the Internet through the proxy server are as follows:
① The client sends a request to the proxy server.
② After receiving the request, the proxy server checks whether the information the client wants is in its cache. If it is not, the proxy server sends a request for the data to the remote server.
③ The proxy server stores the data it receives in the cache and then sends the data to the client.
④ When the data requested by a client is already in the cache, the cached data is sent to the client directly.
On the first access, the requested data is not yet in the proxy's cache. The proxy has to fetch the data and save it in the cache first, which slows down that first access. However, when the second and later visitors need the same information, the proxy no longer has to contact the remote server; it sends the cached copy directly to the requester, which reduces the traffic to the remote server. In addition, because the proxy is local, the transfer is faster.
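The lookup steps above can be sketched as a toy shell script. This is only an illustration of the hit/miss logic, not how Squid is implemented: fetch_origin is a hypothetical stand-in for contacting the remote server, and the cache is assumed to be a temporary directory.

```shell
#!/bin/sh
# Toy model of a caching proxy's lookup logic (illustration only).

CACHE_DIR=$(mktemp -d)   # assumed cache location for this sketch

# Hypothetical stand-in for a request to the remote server.
fetch_origin() {
    printf 'content of %s\n' "$1"
}

# Serve a URL: answer from the cache on a hit; on a miss, fetch the
# object, store it in the cache, then answer.
proxy_get() {
    key=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    if [ -f "$CACHE_DIR/$key" ]; then
        echo "HIT $1" >&2
    else
        echo "MISS $1" >&2
        fetch_origin "$1" > "$CACHE_DIR/$key"
    fi
    cat "$CACHE_DIR/$key"
}
```

Calling proxy_get twice with the same URL reports MISS on standard error the first time and HIT the second time; both calls print the same content.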
II. Using Squid to build a proxy server
The environment used by the author in this article is:
Operating system: RedHat 9.0, kernel 2.4.20-31.9; all other system packages have been updated to the latest versions through APT.
1. Compile and install squid
Because Squid places heavy demands on system hardware, we should optimize it as much as possible at installation time.
Code:
    # groupadd squid
    # useradd -g squid squid
This adds the squid user and the squid group.
Code:
    # export CFLAGS='-O2 -mcpu=pentium4 -march=pentium4 -mmmx -msse -msse2'
Choose the parameters according to your CPU.
GCC 3.1 and later can optimize for a specific CPU:
Code:
    Pentium2: -O2 -mcpu=i686 -march=i686 -mmmx
    Pentium3: -O2 -mcpu=pentium3 -march=pentium3 -mmmx -msse
    Pentium4: -O2 -mcpu=pentium4 -march=pentium4 -mmmx -msse -msse2

    # ./configure --prefix=/usr/local/squid \
        --enable-gnuregex \
        --enable-async-io=80 \
        --enable-icmp \
        --enable-kill-parent-hack \
        --enable-snmp \
        --disable-ident-lookups \
        --enable-cache-digests \
        --enable-arp-acl \
        --enable-err-languages="Simplify_Chinese" \
        --enable-default-err-language="Simplify_Chinese" \
        --enable-poll \
        --enable-linux-netfilter \
        --enable-underscores
    # make
    # make install
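As a small convenience, the CPU-to-flags table above can be wrapped in a shell function. This is only a sketch: the function name cflags_for and the plain -O2 fallback for other CPUs are assumptions, not part of the original procedure.

```shell
#!/bin/sh
# Print the CFLAGS listed above for a given CPU family.
cflags_for() {
    case "$1" in
        pentium2) echo "-O2 -mcpu=i686 -march=i686 -mmmx" ;;
        pentium3) echo "-O2 -mcpu=pentium3 -march=pentium3 -mmmx -msse" ;;
        pentium4) echo "-O2 -mcpu=pentium4 -march=pentium4 -mmmx -msse -msse2" ;;
        *)        echo "-O2" ;;   # assumption: plain -O2 for anything else
    esac
}
```

For example, export CFLAGS="$(cflags_for pentium3)" could be run before ./configure on a Pentium III machine.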
I personally prefer to compile software from the source package: you know exactly what you are doing, whereas an RPM package gets installed without your really knowing what happened. Each compile parameter is explained below. You can also run ./configure --help to see the other parameters, each with an explanation in English.
--prefix=/usr/local/squid: specifies the installation path of the software.
--enable-gnuregex: Squid does a great deal of string matching for its various checks; this option compiles in the GNU regex code, which handles it better.
--enable-async-io=80: runs Squid in async mode. As I understand it, this makes Squid use threads, here 80 of them. If the server is well equipped, with more than 1 GB of memory and SMP CPUs, you can set it to 160 or higher. If the server is weaker, set it according to the actual situation. This option also adds aufs support for the cache directories.
--enable-icmp: adds ICMP support.
--enable-kill-parent-hack: when Squid is shut down, its parent process is killed along with it.
--enable-snmp: lets tools such as MRTG monitor the server's traffic through SNMP. You must select this option if you want Squid to provide an SNMP interface.
--disable-ident-lookups: prevents the use of the identification lookups defined in RFC 931.
--enable-cache-digests: speeds up the lookup of cached content during requests.
--enable-arp-acl: lets ACL rules match the MAC address of the client directly, to guard against IP spoofing.
--enable-err-languages="Simplify_Chinese" and
--enable-default-err-language="Simplify_Chinese": specify that error pages are displayed in simplified Chinese.
--enable-poll: uses the poll() function instead of select(). poll is generally better than select, but the configure script knows that poll is broken on some platforms. If you think you are smarter than the configure script, you can use this option to force poll on. In short, it can improve performance.
--enable-linux-netfilter: supports transparent proxying.
--enable-underscores: allows underscores in host names.
Squid is now installed; the next step is to modify the configuration file.
2. Modify the configuration parameters
The following is my squid.conf file:
Code:
    # NETWORK OPTIONS
    # -----------------------------------------------------------------------------
    # Proxy port
    http_port 3128
    # ICP port
    icp_port 3130

    # OPTIONS WHICH AFFECT THE NEIGHBOR SELECTION ALGORITHM
    # -----------------------------------------------------------------------------
    # Do not cache
    hierarchy_stoplist cgi-bin ?
    hierarchy_stoplist -i ^https:// ?
    acl QUERY urlpath_regex -i cgi-bin \? \.asp \.php \.jsp \.cgi
    acl denyssl urlpath_regex -i ^https://
    no_cache deny QUERY
    no_cache deny denyssl
    # The lines above mean that URLs containing cgi-bin and URLs starting with
    # https:// are not cached, and neither are dynamic scripts such as ASP, CGI
    # and PHP: these are usually generated dynamically, so cached copies would
    # be out of date.
    # https:// is not cached because it is generally used for e-commerce
    # transactions such as bank payments; caching a credit card number would
    # be dangerous.

    # OPTIONS WHICH AFFECT THE CACHE SIZE
    # -----------------------------------------------------------------------------
    # Additional memory used by Squid; set it according to your system,
    # generally 1/3 of the physical memory
    cache_mem 8 MB
    # Minimum cache usage percentage
    cache_swap_low 90
    # Maximum cache usage percentage
    cache_swap_high 95
    # Maximum size of a single cached file; larger files are not cached
    maximum_object_size 4096 KB
    # Maximum size of a single file cached in memory; larger files are not
    # kept in memory
    maximum_object_size_in_memory 8 KB
    # Size of the cache for IP addresses obtained by DNS resolution, which
    # speeds up resolution
    ipcache_size 1024
    ipcache_low 90
    ipcache_high 95
    fqdncache_size 1024

    # LOGFILE PATHNAMES AND CACHE DIRECTORIES
    # -----------------------------------------------------------------------------
    # cache_dir <aufs|ufs> <directory> <size in MB> <dir1> <dir2>
    # aufs is available only if --enable-async-io was given at compile time.
    # The directory and size depend on your host.
    # dir1 and dir2 are the numbers of first-level and second-level
    # subdirectories, usually 16 256 or 64 64. They should generally be
    # multiples of 16; this is said to give better performance.
    cache_dir aufs /cache1 100 16 256
    cache_dir aufs /cache2 100 16 256
    # Log locations
    cache_access_log /usr/local/squid/var/logs/access.log
    cache_log /usr/local/squid/var/logs/cache.log
    # TAG: cache_store_log
    cache_store_log /usr/local/squid/var/logs/store.log
    # TAG: pid_filename
    pid_filename /usr/local/squid/var/logs/squid.pid

    # OPTIONS FOR EXTERNAL SUPPORT PROGRAMS
    # -----------------------------------------------------------------------------
    # Use the proxy to log in to anonymous FTP services
    # TAG: ftp_user (user name)
    ftp_user squid@
    # Passive mode
    ftp_passive on
    # Authentication
    # auth_param basic children 5
    # auth_param basic realm Squid proxy-caching web server
    # auth_param basic credentialsttl 2 hours
    # auth_param basic casesensitive off

    # OPTIONS FOR TUNING THE CACHE
    # -----------------------------------------------------------------------------
    # TAG: refresh_pattern (cache refresh time settings)
    # refresh_pattern <regex> <minimum time> <percentage> <maximum time>
    refresh_pattern ^ftp: 1440 20% 10080
    refresh_pattern ^gopher: 1440 0% 1440
    refresh_pattern . 0 20% 4320
    # Taking the first line above as an example: for URLs starting with ftp:,
    # if the proxy uses the file again more than one day (1440 minutes) later,
    # the data in the cache is refreshed.

    # TIMEOUTS
    # -----------------------------------------------------------------------------
    # Maximum time to wait when connecting to another machine
    connect_timeout 1 minute
    # Timeout for connecting to an upper-layer proxy
    peer_connect_timeout 30 seconds
    # Request timeout
    request_timeout 2 minutes
    # Persistent connection timeout
    persistent_request_timeout 1 minute

    # ACCESS CONTROLS
    # -----------------------------------------------------------------------------
    # TAG: acl
    # Examples:
    # acl myexample dst_as 1241
    # acl password proxy_auth REQUIRED
    # acl fileupload req_mime_type -i ^multipart/form-data$
    # acl javascript rep_mime_type -i ^application/x-javascript$
    #
    # Recommended minimum configuration:
    acl all src 0.0.0.0/0.0.0.0
    acl manager proto cache_object
    acl localhost src 127.0.0.1/255.255.255.255
    acl to_localhost dst 127.0.0.0/8
    acl SSL_ports port 443 563
    acl Safe_ports port 80          # http
    acl Safe_ports port 21          # ftp
    acl Safe_ports port 443 563     # https, snews
    acl Safe_ports port 70          # gopher
    acl Safe_ports port 210         # wais
    acl Safe_ports port 1025-65535  # unregistered ports
    acl Safe_ports port 280         # http-mgmt
    acl Safe_ports port 488         # gss-http
    acl Safe_ports port 591         # filemaker
    acl Safe_ports port 777         # multiling http
    acl CONNECT method CONNECT
    # Intranet IP address segment
    acl inside src 192.168.1.0/24
    # File listing the permitted MAC addresses
    acl localmac arp "/usr/local/squid/localmac"

    # TAG: http_access
    # Allow requests matching the inside rule
    http_access allow inside
    # Allow the MAC addresses registered in localmac
    http_access allow localmac
    #
    # Recommended minimum configuration:
    #
    # Only allow cachemgr access from localhost
    http_access allow manager localhost
    http_access deny manager
    # Deny requests to unknown ports
    http_access deny !Safe_ports
    # Deny CONNECT to other than SSL ports
    http_access deny CONNECT !SSL_ports
    #
    # http_access deny to_localhost
    #
    # And finally deny all other access to this proxy
    http_access deny all
    # TAG: http_reply_access
    http_reply_access allow all
    # TAG: icp_access
    # icp_access allow all
    # TAG: cache_peer_access

    # ADMINISTRATIVE PARAMETERS
    # -----------------------------------------------------------------------------
    # TAG: cache_mgr (administrator mailbox)
    cache_mgr webmaster@localhost
    # TAG: cache_effective_user (user and group that run Squid)
    cache_effective_user squid
    cache_effective_group squid
    # TAG: visible_hostname (proxy server name)
    visible_hostname proxyserver

    # OPTIONS FOR THE CACHE REGISTRATION SERVICE
    # -----------------------------------------------------------------------------

    # HTTPD-ACCELERATOR OPTIONS
    # -----------------------------------------------------------------------------
    # Set up the transparent proxy
    # Host name
    httpd_accel_host proxyserver
    # Transparent proxy port
    httpd_accel_port 80
    httpd_accel_with_proxy on
    httpd_accel_uses_host_header on

    # MISCELLANEOUS
    # -----------------------------------------------------------------------------
    # TAG: logfile_rotate
    # When the logs are rotated, Squid renames the log file in use: access.log
    # becomes access.log.0, the previous access.log.0 becomes access.log.1,
    # and so on.
    # logfile_rotate sets how many rotated files are kept; once that number is
    # reached, Squid deletes the oldest one. The default is 10. To manage the
    # log files yourself, use logfile_rotate 0 to turn this off.
    logfile_rotate 4
    # TAG: forwarded_for on|off
    # If this option is off, forums and similar sites show your IP address as
    # unknown; if it is on, they show the intranet IP address of your client.
    forwarded_for off
    # Icon file directory
    # icon_directory /usr/local/squid/share/icons
    # Error message file directory
    # error_directory /usr/local/squid/share/errors/Simplify_Chinese

    # TAG: snmp_port
    # Squid can now serve statistics and status information via SNMP.
    # By default it listens to port 3401 on the machine. If you don't
    # wish to use SNMP, set this to "0".
    #
    # Default:
    # snmp_port 3401

    # TAG: snmp_access
    # Allowing or denying access to the SNMP port.
    #
    # All access to the agent is denied by default.
    # Usage:
    #
    # snmp_access allow|deny [!]aclname ...
    #
    # Example:
    # snmp_access allow snmppublic localhost
    # snmp_access deny all
    #
    # Default:
    # snmp_access deny all

    # DELAY POOL PARAMETERS (all require the delay_pools compilation option)
    # -----------------------------------------------------------------------------
    # TAG: coredump_dir
    # When Squid crashes unexpectedly, the contents of its memory are written
    # to this directory on the hard disk
    coredump_dir /usr/local/squid/var/cache
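The configuration comments suggest setting cache_mem to about one third of the physical memory. A minimal sketch of that rule of thumb follows; the function names are made up for illustration, and the rule itself is the author's suggestion rather than a fixed requirement.

```shell
#!/bin/sh
# Suggest a cache_mem value (in MB) as one third of physical memory.
# $1: total memory in kB, as shown on the MemTotal line of /proc/meminfo.
suggest_cache_mem_mb() {
    echo $(( $1 / 1024 / 3 ))
}

# Read the MemTotal value (in kB) from /proc/meminfo (Linux only).
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}
```

Running echo "cache_mem $(suggest_cache_mem_mb "$(mem_total_kb)") MB" prints a line that can be compared with the value in squid.conf.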
3. Set up iptables to support transparent proxying
Before setting up Squid + iptables for transparent proxying, set up NAT first. The following simple statements are enough:
Code:
    # Enable forwarding
    echo "1" > /proc/sys/net/ipv4/ip_forward
    # Enable NAT
    /sbin/iptables -t nat -A POSTROUTING -j MASQUERADE
    # Redirect all requests for port 80 to Squid's port 3128
    iptables -t nat -A PREROUTING -i eth0 -p tcp -s 192.168.1.0/24 --dport 80 -j REDIRECT --to-ports 3128
192.168.1.0/24 means that the hosts in the segment 192.168.1.1-254 go through the Squid transparent proxy and NAT.
In this way, when users access WWW services they go through the cache as a high-speed proxy, which reduces traffic, while other services are forwarded through NAT.
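To adapt the statements above to another interface, subnet or Squid port, a small helper can print them for review before they are run. This is only a sketch; the function name and its three arguments are assumptions made for illustration.

```shell
#!/bin/sh
# Print (do not run) the forwarding, NAT and redirect commands.
# $1: LAN interface, $2: LAN subnet, $3: Squid port
print_transparent_rules() {
    echo "echo 1 > /proc/sys/net/ipv4/ip_forward"
    echo "iptables -t nat -A POSTROUTING -j MASQUERADE"
    echo "iptables -t nat -A PREROUTING -i $1 -p tcp -s $2 --dport 80 -j REDIRECT --to-ports $3"
}
```

For example, print_transparent_rules eth0 192.168.1.0/24 3128 prints the same three commands as above. Rules added with iptables do not survive a reboot on their own, so the printed lines could also be placed in a startup script.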
4. Use an upper-layer proxy
When a foreign website is slow to reach, we can access it through a proxy. Can our own proxy server in turn use someone else's proxy to reach foreign websites? The answer is yes.
For example, suppose the proxy proxy1.cnlinux.net reaches foreign sites quickly and we can reach it quickly. We then use it as the upper-layer proxy for accessing foreign websites.
We need to add the cache_peer parameter to squid.conf, in the form: cache_peer <host name> <type> <HTTP port> <ICP port> [options]
The type is mainly either parent (upper layer) or sibling (same layer). Here we mainly introduce the upper-layer proxy, that is, parent. If you need to set up a proxy server cluster you can use sibling, which we will not discuss here.
Other options include:
Code:
    proxy-only: data obtained from the upper-layer proxy is not cached in the local proxy.
    weight=N: weight. When several upper-layer proxies with the same function are set up, this option decides which of them are more important; the larger N is, the more important the proxy.
    no-query: with the sibling type, an ICP request is sent to the same-layer proxy when data is requested. no-query cancels the ICP request. Generally there is no need to send ICP packets when requesting data from an upper-layer proxy, which reduces traffic.
    default: sets this proxy as the default proxy.
    no-netdb-exchange: the ICMP round-trip database (NetDB) is not requested from this proxy.
    no-digest: cache digests are not requested from this proxy.

    # Upper-layer proxy settings
    cache_peer proxy1.cnlinux.net parent 3128 3130 no-digest no-netdb-exchange
    # Access rules; you can use domain names or IP addresses
    # .com.us websites in the United States
    acl usa dstdomain .com.us
    # Some IP address segments in the United States
    acl usaip dst 18.0.0.0/8
    # Allow and deny rules. Rules are checked in order and the first match
    # wins, so both allow rules come before the deny rule.
    # Allow requests matching the usa rule to use this upper-layer proxy
    cache_peer_access proxy1.cnlinux.net allow usa
    # Allow requests matching the usaip rule to use this upper-layer proxy
    cache_peer_access proxy1.cnlinux.net allow usaip
    # Everything else must not use this upper-layer proxy
    cache_peer_access proxy1.cnlinux.net deny all
5. Start and stop Squid
A. Change the owner of the cache directory to squid.
Code:
    # chown -R squid:squid /cache1
    # chown -R squid:squid /cache2
B. Initialize the cache directory.
Code:
    # /usr/local/squid/sbin/squid -z
    23:06:29| Creating Swap Directories
    FATAL: Failed to make swap directory /cache1/00: (13) Permission denied
    Squid Cache (Version 2.5.STABLE7): Terminated abnormally.
    CPU Usage: 0.000 seconds = 0.000 user + 0.000 sys
    Maximum Resident Size: 0 KB
    Page faults with physical i/o: 10
If the error message above appears, the permissions on your /cache1 directory are wrong. Check whether the /cache1 directory is owned by the squid user.
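The ownership check can be scripted before running squid -z. The following is a minimal sketch; the function name owned_by is made up for illustration, and it assumes GNU stat, whose -c %U option prints the owner's user name.

```shell
#!/bin/sh
# Succeed if directory $1 is owned by user $2.
# Assumes GNU stat (-c %U prints the owner's user name).
owned_by() {
    [ "$(stat -c %U "$1")" = "$2" ]
}

# Report every given directory that is not owned by the squid user.
check_cache_dirs() {
    for dir in "$@"; do
        owned_by "$dir" squid || echo "$dir is not owned by squid"
    done
}
```

With the directories used in this article, the check would be run as check_cache_dirs /cache1 /cache2; it prints nothing when both directories are owned by squid.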
C. Start squid
Code:
    # su squid -c "/usr/local/squid/bin/RunCache &"
D. Stop Squid.
Code:
    # /usr/local/squid/sbin/squid -k shutdown
Squid shuts down properly only after the command has been run twice.
E. Re-read the squid.conf file.
Code:
    # /usr/local/squid/sbin/squid -k reconfigure
You need to run the command twice for the squid.conf file to be re-read.
6. Log Analysis
After the proxy server is installed, we naturally need to monitor it. Log analysis tells us which users generated how much traffic on which websites. The following describes the log analysis tool Sarg. Several other log analysis tools are listed on Squid's official website; have a look if you are interested.
A. Install
Code:
    # ./configure --prefix=/usr/local/sarg --enable-bindir=/usr/local/sarg/bin
    # make && make install
B. Set up the sarg.conf file
Code:
    # vi /usr/local/sarg.conf

    # No Chinese translation has been released on the official website, so
    # we use English. If you are interested, translate it yourself.
    language English
    # Location of the Squid log file
    access_log /usr/local/squid/var/logs/access.log.0
    # Report title
    title "Squid use report"
    # Temporary directory
    temporary_dir /tmp
    # Save the generated HTML under your website directory so that it can
    # be browsed
    output_dir /var/www/html/sarg
    # Whether to overwrite the report if one for that date already exists
    overwrite_report no
    mail_utility mail
    topsites_num 100
    exclude_codes /usr/local/sarg/exclude_codes
    # max_elapsed 28800000
    # Character set
    charset gb2312
C. Generate a report
After setting up the sarg.conf file, run:
Code:
    # /usr/local/sarg/bin/sarg
    Output:
    SARG: Successful report generated on /usr/local/apache/htdocs/sarg/2004Oct31-2004Nov01
This means that the report has been generated successfully, and it shows where the report is stored. You can open your browser and view the report right away.
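Because sarg.conf above reads access.log.0, the report is built from a rotated log file. A helper that rotates the logs with squid -k rotate and then runs Sarg could look like the sketch below, for example when run from cron. The paths are the ones used in this article; the function name and the WAIT pause, which gives Squid time to finish renaming its logs, are assumptions.

```shell
#!/bin/sh
# Rotate Squid's logs, then build the Sarg report from the rotated log.
# SQUID and SARG default to the install paths used in this article.
# WAIT is an assumed pause (seconds) before Sarg is started.
SQUID=${SQUID:-/usr/local/squid/sbin/squid}
SARG=${SARG:-/usr/local/sarg/bin/sarg}
WAIT=${WAIT:-30}

rotate_and_report() {
    "$SQUID" -k rotate || return 1
    sleep "$WAIT"
    "$SARG"
}
```

Calling rotate_and_report once a day would produce one report per rotated log.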
III. Suggestions on the cache directory
Because the cache directory is read and written very frequently, it is best to put it on SCSI hard disks, which are fast and stable. If the cache is about 40 GB, try to spread it over several hard disks: instead of one 40 GB disk, use four 10 GB disks and the cache will be faster. For example, suppose 10 MB of data has to be written to the cache. With a single hard disk, even if you have put four cache directories on four partitions, there is still only one disk doing the writing. With four hard disks, each disk only has to write a quarter of the data. Isn't that faster?
In addition, we recommend giving each cache directory a partition of its own. Do not make the partitions too large; 2 to 4 GB is generally enough, so that the proxy does not spend too much time searching for data.
This article draws on articles by earlier authors. If I have caused any offense, please forgive me!