Set up a proxy server in Linux (1)

Source: Internet
Author: User
Tags: internet, cache, squid, proxy
1. Proxy Server Overview
  
1.1 What is a proxy server
  
In a TCP/IP network, communication traditionally works as follows: the client requests data from the server, and the server responds by transmitting the data to the client. Once a proxy server is introduced, the process changes: the client's request goes to the proxy server, which analyzes it and first checks whether the requested data is already in its cache. If it is, the proxy server sends the data directly to the client. If not, the proxy server sends the request to the server on the client's behalf. When the server responds, the proxy server passes the response data to the client and keeps a copy in its own cache. The next time a client requests the same data, the proxy server can send it directly without contacting the server again.
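
The request flow described above can be sketched in a few lines of Python. This is only a toy model, not how any particular proxy is implemented: the class name `CachingProxy` and the `fake_origin` function are made up for illustration, and the "server" is a local function so the sketch runs without a network.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy model of the flow: check the cache, otherwise ask the origin server."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # hypothetical callback that contacts the server
        self._cache: Dict[str, bytes] = {}   # URL -> cached response body
        self.origin_requests = 0             # how many times the server was contacted

    def handle(self, url: str) -> bytes:
        if url in self._cache:               # cache hit: answer the client directly
            return self._cache[url]
        self.origin_requests += 1            # cache miss: the proxy asks the server itself
        body = self._fetch(url)
        self._cache[url] = body              # keep a copy for later requests
        return body


def fake_origin(url: str) -> bytes:
    """Stand-in for a real web server (an assumption made for this sketch)."""
    return ("content of " + url).encode()


proxy = CachingProxy(fake_origin)
first = proxy.handle("http://www.linuxaid.com.cn/")
second = proxy.handle("http://www.linuxaid.com.cn/")  # served from the cache
```

Both calls return the same data, but only the first one reaches the server. A real proxy would also have to decide when cached data expires.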
  
1.2 Proxy Server Functions
  
Generally, a proxy server provides the following functions:

1. Increase access speed through caching

With the rapid growth of the Internet, network bandwidth is becoming ever more precious. Many ISPs therefore provide proxy servers and use their cache to speed up network access. Most proxy servers support HTTP caching, and some also support FTP caching. When selecting a proxy server, HTTP caching is sufficient for most organizations.
Caching can be passive or active. With passive caching, the proxy server caches data returned by the server only when a client requests it. If the data has expired and a client requests it again, the proxy server must issue a new request and cache the response as it delivers the data to the client. With active caching, the proxy server continually checks the data in its cache and, as soon as an item expires, issues a new request to refresh it. As a result, the response time is much shorter when a client later requests that data. Note also that most proxy servers do not cache the authentication information contained in the data.
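
The difference between the two strategies can be illustrated with a small Python sketch. It rests on several assumptions that are not in the text: entries expire after a fixed lifetime `ttl`, the current time is passed in explicitly as `now` so the example is deterministic, and `ExpiringCache` and `fake_origin` are hypothetical names.

```python
from typing import Callable, Dict, Tuple


class ExpiringCache:
    """Toy cache whose entries expire `ttl` seconds after they were fetched."""

    def __init__(self, fetch: Callable[[str], str], ttl: float) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._entries: Dict[str, Tuple[str, float]] = {}  # url -> (body, expiry time)
        self.fetches = 0

    def _refresh(self, url: str, now: float) -> str:
        self.fetches += 1
        body = self._fetch(url)
        self._entries[url] = (body, now + self._ttl)
        return body

    def get(self, url: str, now: float) -> Tuple[str, bool]:
        """Client request. Returns (body, served_from_cache)."""
        entry = self._entries.get(url)
        if entry is not None and now < entry[1]:
            return entry[0], True
        # Passive behaviour: refetch only because a client asked.
        return self._refresh(url, now), False

    def refresh_expired(self, now: float) -> int:
        """Active behaviour: a periodic sweep run by the proxy, not by a client."""
        expired = [u for u, (_, expiry) in self._entries.items() if now >= expiry]
        for url in expired:
            self._refresh(url, now)
        return len(expired)


def fake_origin(url: str) -> str:
    """Stand-in for the real server (an assumption made for this sketch)."""
    return "body of " + url


URL = "http://www.linuxaid.com.cn/"

# Passive: the entry expired at t=10, so the client at t=15 waits for a new fetch.
passive = ExpiringCache(fake_origin, ttl=10)
passive.get(URL, now=0)
passive_result = passive.get(URL, now=15)

# Active: a sweep at t=12 already refreshed the entry, so t=15 is a cache hit.
active = ExpiringCache(fake_origin, ttl=10)
active.get(URL, now=0)
active.refresh_expired(now=12)
active_result = active.get(URL, now=15)
```

Both caches contact the server twice in total. The difference is when: with the active sweep, the second fetch happens before the client asks, so the client is served from the cache.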
  
2. Provide Internet access for hosts with private IP addresses

IP addresses are a valuable, non-renewable resource. If you have only a limited number of public IP addresses but need to provide Internet access for the entire organization, you can do so with a proxy server.
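
As a small aside, whether an address is private can be checked with Python's standard `ipaddress` module. The sketch below assumes that "private" means the three address blocks reserved by RFC 1918; the helper name `is_rfc1918` and the sample addresses are illustrative only.

```python
import ipaddress

# The three blocks reserved for private networks by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr: str) -> bool:
    """Return True if `addr` lies in one of the RFC 1918 private blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETWORKS)


intranet_host = is_rfc1918("192.168.1.100")   # a typical intranet address
public_host = is_rfc1918("1.2.3.4")           # a publicly routable address
```

Hosts for which the check is true cannot be reached directly from the Internet, which is why they need a proxy (or some form of address translation) to get out.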
  
3. Improve network security

This is fairly evident. If internal users access the Internet through a proxy server, the proxy server becomes the only channel from the intranet to the Internet and, conversely, the only channel from the Internet to the intranet. Unless you set up a reverse proxy, the proxy server is the only host visible from the Internet, which greatly enhances network security.
  
1.3 Classification and features of proxy servers
  
Proxy servers are usually classified into circuit-level proxies, application-level proxies, and intelligent circuit-level proxies. Here, I instead divide them into traditional proxy servers and transparent proxy servers.
It is worth clarifying the difference between the two. Only by truly understanding the internal mechanism can we work methodically when problems arise, rather than being confused about where to start. For this reason, we explain the difference through a concrete example. The idea for this section, and the example below, come from the IPCHAINS-HOWTO written by Paul Russell. I think the biggest benefit of reading that document is a clear understanding of how an intranet reaches external networks and how external networks reach the intranet. Here, "intranet" means an internal network that uses private IP addresses.
Our examples are based on the following assumptions:
Your domain name is sample.com, and users on your intranet (192.168.1.*) access the Internet through the proxy server proxy.sample.com (external interface eth0: 1.2.3.4; internal interface eth1: 192.168.1.1). In other words, the proxy server is the only machine connected directly to both the Internet and the intranet. Assume that proxy server software (such as squid) is running on the proxy server, and that one client on the intranet is client.sample.com (192.168.1.100).
  
+-------------------------+
| Intranet (192.168.1.*)  | eth1 +-------+ eth0     DDN
|                         +------| Proxy |<===========> Internet
| Client 192.168.1.100    |      +-------+
+-------------------------+

eth0: 1.2.3.4
eth1: 192.168.1.1
  
  
1.3.1 Traditional proxy
  
Based on the assumptions above, we set things up as follows:
1. The proxy server software is bound to port 8080 of the proxy server.
2. The client browser is configured to use port 8080 of the proxy server as its proxy.
3. The client does not need DNS configured.
4. DNS must be configured on the proxy server.
5. The client does not need a default route.
  
When we request a web page in the client browser, for example http://www.linuxaid.com.cn, the following events occur in order:
1. The client uses a local port (such as 1025) to connect to port 8080 of the proxy server and requests the web page "http://www.linuxaid.com.cn".
2. The proxy server looks up "www.linuxaid.com.cn" in DNS and obtains the IP address 202.99.11.120. It then uses a local port (such as 1037) to open a connection to port 80 of that address and requests the web page.
3. After receiving the web page in response, the proxy server sends the data to the client.
4. The client browser displays the page.

From the perspective of www.linuxaid.com.cn, the connection runs between port 1037 of 1.2.3.4 and port 80 of 202.99.11.120. From the client's perspective, the connection runs between port 1025 of 192.168.1.100 and port 8080 of the proxy server's internal interface, 192.168.1.1.
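
The following Python sketch shows what the browser setting in this setup changes on the wire. With an explicit proxy, the client connects to the proxy and puts the full URL in the HTTP request line, which is why it needs neither DNS nor a default route. The function name `plan_request` is hypothetical, and the sketch assumes the client reaches the proxy on its internal interface, 192.168.1.1, port 8080.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Assumption: the client reaches the proxy on its internal interface.
PROXY_ADDR = ("192.168.1.1", 8080)


def plan_request(url: str, use_proxy: bool) -> Tuple[Tuple[Optional[str], int], str]:
    """Return (where the TCP connection goes, the HTTP request line)."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    path = parts.path or "/"
    if use_proxy:
        # Explicit proxy: connect to the proxy and send the full URL, so the
        # proxy knows which server to contact. The client resolves no names.
        return PROXY_ADDR, f"GET {url} HTTP/1.1"
    # Direct access: the client must resolve `host` itself and connect to it.
    return (host, port), f"GET {path} HTTP/1.1"


via_proxy = plan_request("http://www.linuxaid.com.cn/", use_proxy=True)
direct = plan_request("http://www.linuxaid.com.cn/", use_proxy=False)
```

In the proxied case the connection target is the proxy and the request line carries the whole URL. In the direct case the target is the web server and the request line carries only the path.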
  
1.3.2 Transparent proxy
  
With a transparent proxy, the client does not need to know that a proxy server exists.
Based on the assumptions above, we set things up as follows:
1. Configure the transparent proxy server software to run on port 8080 of the proxy server.
2. Configure the proxy server to redirect all connections destined for port 80 to port 8080.
3. Configure the client browser to connect directly to the Internet.
4. Configure DNS on the client.
5. Set the client's default gateway to 192.168.1.1.
  
When we request a web page in the client browser, for example http://www.linuxaid.com.cn, the following events occur in order:
1. The client looks up "www.linuxaid.com.cn" in DNS and obtains the IP address 202.99.11.120. It then uses a local port (such as 1066) to open a connection to port 80 of that address and requests the web page.
2. When the request packets pass through the transparent proxy server, they are redirected to port 8080, where the proxy software is listening. The transparent proxy server then uses a local port (such as 1088) to open a connection to port 80 of 202.99.11.120 and requests the web page.
3. After receiving the web page in response, the proxy server sends the data to the client.
4. The client browser displays the page.

From the perspective of www.linuxaid.com.cn, the connection runs between port 1088 of 1.2.3.4 and port 80 of 202.99.11.120. From the client's perspective, the connection runs between port 1066 of 192.168.1.100 and port 80 of 202.99.11.120.
  
These are the differences between traditional proxy servers and transparent proxy servers.
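
To make the comparison concrete, the sketch below models in Python which endpoints each side believes it is talking to in the two modes. It is a toy model only: `redirect` imitates the port 80 to port 8080 rule rather than configuring a real firewall, all names are hypothetical, and it assumes the proxy software is reached on the internal interface 192.168.1.1.

```python
from typing import NamedTuple, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

CLIENT_IP = "192.168.1.100"
PROXY_INTERNAL_IP = "192.168.1.1"   # eth1, assumed to be what the client side reaches
PROXY_EXTERNAL_IP = "1.2.3.4"       # eth0
PROXY_PORT = 8080
WEB_SERVER: Endpoint = ("202.99.11.120", 80)


class Views(NamedTuple):
    client_view: Tuple[Endpoint, Endpoint]  # (source, destination) as the client sees it
    server_view: Tuple[Endpoint, Endpoint]  # (source, destination) as the web server sees it
    handled_by: Endpoint                    # where the client's packets are really accepted


def redirect(dst: Endpoint) -> Endpoint:
    """Toy version of the redirect rule: port 80 traffic goes to local port 8080."""
    if dst[1] == 80:
        return (PROXY_INTERNAL_IP, PROXY_PORT)
    return dst


def traditional(client_port: int, proxy_src_port: int) -> Views:
    proxy = (PROXY_INTERNAL_IP, PROXY_PORT)
    return Views(
        client_view=((CLIENT_IP, client_port), proxy),  # the client knows about the proxy
        server_view=((PROXY_EXTERNAL_IP, proxy_src_port), WEB_SERVER),
        handled_by=proxy,
    )


def transparent(client_port: int, proxy_src_port: int) -> Views:
    return Views(
        client_view=((CLIENT_IP, client_port), WEB_SERVER),  # the client addresses the server
        server_view=((PROXY_EXTERNAL_IP, proxy_src_port), WEB_SERVER),
        handled_by=redirect(WEB_SERVER),                     # but the proxy answers
    )


trad = traditional(1025, 1037)
trans = transparent(1066, 1088)
```

In both modes the web server only ever sees the proxy's external address, and the same proxy port handles the request. What differs is the client's view: it addresses the proxy explicitly in traditional mode and the web server itself in transparent mode.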
  
2. Comparison of various proxy servers
  
Many proxy server packages are available for Linux. A search on www.freshmeat.com (a well-known Linux software site) turned up more than 60. However, only a few, such as Apache, Socks, and Squid, are widely used and have proven in practice to be high-performance proxy software. We compare these packages below:
  
2.1 Apache
  
Apache is the most widely used HTTP server in the world, thanks to its powerful features, efficiency, security, and speed. Apache has included a proxy module since version 1.1.x. However, Apache offers no clear performance advantage as a proxy server, so using it for this purpose is not recommended.
  
  
2.2 Socks
  
Socks is a network proxy protocol that lets clients access the Internet through a Socks server. Socks establishes a secure proxy data channel between the server and the client. From the client's perspective, Socks is transparent; from the server's perspective, the Socks server is the client. The client does not need direct access to the Internet (that is, it can use a private IP address), because the Socks server relays the client's connection requests to the Internet. In addition, the Socks server can authenticate connection requests and allow only legitimate users to establish proxy connections. Socks also prevents unauthorized Internet users from accessing the internal network, so it is often used as a firewall.
Common browsers such as Netscape and IE can use Socks directly, and the client included with socks5 lets Internet software that does not support Socks directly use it as well.
For more information, see the official Socks site, http://www.socks.nec.com.
  
2.3 Squid
  
For web users, Squid is a high-performance proxy cache server. It supports the FTP, gopher, and HTTP protocols. Unlike typical proxy cache software, Squid handles all client requests in a single, non-blocking, I/O-driven process.
Squid keeps metadata in memory and caches DNS lookup results. It also supports non-blocking DNS lookups and negative caching of failed requests. Squid supports SSL and access control. Using ICP (the lightweight Internet Cache Protocol), Squid caches can be arranged in a hierarchy to maximize bandwidth savings.
Squid consists of the main server program squid, a DNS lookup program dnsserver, several programs for rewriting requests and performing authentication, and some management tools. When Squid starts, it spawns a configured number of dnsserver processes in advance, each of which can perform a separate DNS lookup. This greatly reduces the time the server spends waiting for DNS lookups.
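
The benefit of handing DNS lookups to a pool of helpers can be sketched in Python. This is not Squid's code. It swaps in a thread pool where Squid uses separate dnsserver processes, and `FAKE_ZONE` and `slow_lookup` are hypothetical stand-ins for real DNS so the example runs offline.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional

# Hypothetical name table standing in for real DNS.
FAKE_ZONE: Dict[str, str] = {
    "www.linuxaid.com.cn": "202.99.11.120",
    "proxy.sample.com": "1.2.3.4",
    "client.sample.com": "192.168.1.100",
}


def slow_lookup(name: str) -> Optional[str]:
    """Pretend DNS query: blocks briefly, then answers (None if the name is unknown)."""
    time.sleep(0.05)
    return FAKE_ZONE.get(name)


def resolve_all(names: List[str], workers: int) -> Dict[str, Optional[str]]:
    """Run the blocking lookups on a pool of at most `workers` helper threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with `names`.
        return dict(zip(names, pool.map(slow_lookup, names)))


names = ["www.linuxaid.com.cn", "proxy.sample.com", "no-such-host.sample.com"]
answers = resolve_all(names, workers=3)
```

With three helpers the three lookups overlap instead of running one after another, so the main program waits roughly as long as a single lookup takes.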
  
2.4 Selection

From the comparison above, we can see that Apache is primarily a web server and that its proxy function is just a module. Socks is powerful but not flexible. We therefore recommend Squid. The following sections cover Squid's features and its installation and configuration.
  
3. Install Squid Proxy Server
  
3.1 Obtain the software

You can obtain the software in the following ways:
1. Download the software from Squid's official site, http://www.squid-cache.org;
2. Obtain the software from your Linux distribution.
Squid packages generally come in two types:
