What is LVS?
LVS (Linux Virtual Server) is a cluster technology that combines IP load balancing with content-based request distribution. The scheduler has high throughput: it distributes requests evenly across the servers for execution and automatically masks server failures, so that a group of servers forms a single high-performance, highly available virtual server. The structure of the cluster is transparent to clients, and neither client-side nor server-side programs need to be modified.
To achieve this, the design must account for transparency, scalability, high availability, and manageability. An LVS cluster generally uses a three-tier architecture, made up of the components below.
Main components of LVS
Load balancer (director): the front-end machine that the whole cluster presents to the outside. It sends client requests to a set of servers for execution, while clients believe the service comes from a single IP address (which we can call the virtual IP address, or VIP).
Server pool (real servers): the set of servers that actually execute client requests, typically running services such as web, mail, FTP, and DNS.
Shared storage: provides a shared storage area for the server pool, making it easy for all servers to hold the same content and provide the same service.
LVS load balancing modes: Virtual Server via Network Address Translation (VS/NAT)
VS/NAT is one of the simplest methods: every real server only needs to point its gateway at the director, because both request and response packets pass through the director. The client can run any operating system, but the number of real servers that one director can drive is relatively limited. In VS/NAT, the director can also act as a real server.
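As a rough illustration of why the real servers must use the director as their gateway, the sketch below models the address rewriting in both directions. It is a simplified model and not LVS code: the `Packet` and `NatDirector` names, the addresses, and the round-robin server choice are all assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str  # "ip:port" of the sender
    dst: str  # "ip:port" of the receiver


# Hypothetical addresses, chosen only for this example.
VIP = "203.0.113.10:80"
REAL_SERVERS = ["10.0.0.11:80", "10.0.0.12:80"]


class NatDirector:
    """Toy model of a VS/NAT director (assumed round-robin scheduling)."""

    def __init__(self, vip, real_servers):
        self.vip = vip
        self.real_servers = real_servers
        self.connections = {}  # client address -> chosen real server
        self._next = 0

    def forward_request(self, pkt):
        """Rewrite the destination of a client request from the VIP to a real server."""
        if pkt.dst != self.vip:
            raise ValueError("packet is not addressed to the virtual service")
        if pkt.src not in self.connections:
            index = self._next % len(self.real_servers)
            self.connections[pkt.src] = self.real_servers[index]
            self._next += 1
        return replace(pkt, dst=self.connections[pkt.src])

    def forward_response(self, pkt):
        """Rewrite the source of a server response back to the VIP."""
        if self.connections.get(pkt.dst) != pkt.src:
            raise ValueError("no matching connection for this response")
        return replace(pkt, src=self.vip)
```

Because the response is rewritten on the way back, the client only ever sees the VIP, which is also why every response has to travel through the director.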
Virtual Server via IP Tunneling (VS/TUN)
IP tunneling, also known as IP encapsulation, is the technique of encapsulating one IP packet inside another, so that a packet addressed to one IP address can be wrapped and forwarded to a different IP address. It is mainly used for mobile hosts and virtual private networks, where tunnels are established statically and each end of the tunnel has its own unique IP address.
In VS/TUN, connection scheduling and management are the same as in VS/NAT; only the packet forwarding method differs. The scheduler dynamically chooses a server according to the load on each server, encapsulates the request packet inside another IP packet, and forwards the encapsulated packet to the selected server. On receipt, the server unpacks it to obtain the original packet, whose destination address is the VIP. Because the server finds the VIP configured on its local IP tunnel device, it processes the request and then returns the response directly to the client according to its routing table.
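The encapsulate-then-unpack flow can be sketched as follows. This is only a conceptual model with nested objects standing in for IP-in-IP packets; the `IPPacket` class, the function names, and the addresses are assumptions for illustration and do not reflect the kernel implementation.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class IPPacket:
    src: str
    dst: str
    payload: Union[str, "IPPacket"]  # application data or an inner packet


def encapsulate(inner, director_ip, real_server_ip):
    """Director side: wrap the untouched client packet in an outer packet."""
    return IPPacket(src=director_ip, dst=real_server_ip, payload=inner)


def real_server_receive(outer, tunnel_device_ips) -> Optional[IPPacket]:
    """Real server side: unpack, check the VIP, and answer the client directly."""
    inner = outer.payload
    if not isinstance(inner, IPPacket):
        raise ValueError("outer packet does not carry an encapsulated packet")
    if inner.dst not in tunnel_device_ips:
        return None  # the VIP is not configured locally, so the packet is ignored
    # The response goes straight to the client with the VIP as its source.
    return IPPacket(src=inner.dst, dst=inner.src, payload="response")
```

Note that the response never passes back through the director; only the request path is tunneled.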
Virtual Server via Direct Routing (VS/DR)
VS/DR is implemented by rewriting the MAC address of the request frame. The director and the real servers must each have a network card on the same uninterrupted LAN segment. Each real server binds the VIP to a non-ARP network device (such as lo or tunl), so the director's VIP is visible from outside while the VIPs on the real servers are not. The real server addresses can be either internal (private) addresses or public addresses.
VS/DR workflow: connection scheduling and management are the same as in VS/NAT and VS/TUN, but packets are forwarded differently, being routed directly to the target server. The scheduler dynamically chooses a server based on the load of each server; it neither modifies nor encapsulates the IP packet, but instead changes the destination MAC address of the data frame to the MAC address of the chosen server and sends the modified frame onto the LAN of the server group. Because the frame's destination MAC address is that of the chosen server, that server is sure to receive the frame and can extract the IP packet from it. When the server finds that the packet's destination address (the VIP) is configured on a local network device, it processes the packet and then returns the response directly to the client according to its routing table.
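A minimal model of that forwarding step is shown below. The `Frame` class, the helper names, and all MAC and IP addresses are hypothetical; the point is only that the IP fields stay unchanged while the frame's destination MAC is replaced.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str  # link-layer destination
    src_ip: str   # IP header fields, never modified in VS/DR
    dst_ip: str


def dr_forward(frame, server_mac):
    """Director side: change only the destination MAC address of the frame."""
    return replace(frame, dst_mac=server_mac)


def server_accepts(frame, my_mac, local_ips):
    """Real server side: take the frame if the MAC matches and the VIP is local."""
    return frame.dst_mac == my_mac and frame.dst_ip in local_ips
```

Since the destination IP is still the VIP, the scheme only works if each real server has the VIP on a device that does not answer ARP, as described above.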
VS/DR is one of the most widely used load balancing methods on large websites today.
Comparison of three load balancing methods
The advantage of VS/NAT is that the servers can run any TCP/IP-enabled operating system, only one IP address needs to be configured on the scheduler, and the server group can use private IP addresses. Its drawback is limited scalability: once the number of server nodes rises to about 20, the scheduler itself may become the new bottleneck, because in VS/NAT both request and response packets must pass through the load scheduler. We measured an average packet-rewriting delay of 60 µs on a Pentium 166 host; the delay is shorter on faster processors. Assuming an average TCP packet length of 536 bytes, the maximum throughput of the scheduler is 8.93 MB/s. If each server has a throughput of 800 KB/s, one such scheduler can drive about 10 servers. (Note: these measurements were taken long ago.)
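The estimate above can be reproduced with a few lines of arithmetic. The sketch assumes decimal units (1 MB = 10^6 bytes, 1 KB = 10^3 bytes) and that the scheduler rewrites one packet at a time; the variable names are chosen for this example.

```python
# Figures taken from the paragraph above.
rewrite_delay_s = 60e-6        # 60 microseconds per rewritten packet
avg_packet_bytes = 536         # assumed average TCP packet length
per_server_bytes_s = 800e3     # assumed throughput of one real server

# One packet per rewrite delay gives the scheduler's upper bound.
max_throughput_bytes_s = avg_packet_bytes / rewrite_delay_s

# How many servers that bound can keep busy.
servers = max_throughput_bytes_s / per_server_bytes_s

print(f"max throughput: {max_throughput_bytes_s / 1e6:.2f} MB/s")
print(f"servers driven: {servers:.1f}")
```

The ratio comes out slightly above 11, which is consistent with the rounded figure of about 10 servers.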
A VS/NAT-based cluster can meet the performance requirements of many sites. If the load scheduler does become the new bottleneck, there are three ways to solve the problem: a hybrid approach, VS/TUN, and VS/DR. In a DNS hybrid cluster, there are several VS/NAT load schedulers, each with its own server cluster, and these load schedulers are grouped under a single domain name through RR-DNS (round-robin DNS).
However, VS/TUN and VS/DR are better ways to improve system throughput.
For network services that carry IP addresses or port numbers inside the packet data, a corresponding application module must be written to translate those addresses or port numbers. This adds implementation work, and the overhead of the module inspecting packets reduces system throughput.
In a VS/TUN cluster, the load scheduler only dispatches requests to the different back-end servers, and each back-end server returns its response data directly to the user. The load scheduler can therefore handle a very large number of requests; it can dispatch more than a hundred servers (of the same size as itself) without becoming the system bottleneck. Even if the load scheduler has only a 100 Mbps full-duplex NIC, the maximum throughput of the whole system can exceed 1 Gbps. VS/TUN can thus greatly increase the number of servers that one load scheduler dispatches.
Because the VS/TUN scheduler can dispatch hundreds of servers without itself becoming a bottleneck, it can be used to build high-performance super servers. VS/TUN requires that all servers support the "IP tunneling" (or "IP encapsulation") protocol. Currently, VS/TUN back-end servers mainly run Linux; we did not test other operating systems. Because IP tunneling is becoming a standard protocol in every operating system, VS/TUN should also work with back-end servers running other operating systems.
As with VS/TUN, the VS/DR scheduler handles only the client-to-server half of each connection, and response data returns to the client directly over a separate network route. This greatly improves the scalability of the LVS cluster. Compared with VS/TUN, this method has no IP tunneling overhead, but it requires that the load scheduler and the real servers each have a NIC attached to the same physical network segment, and that the server's network device (or device alias) either does not respond to ARP or can redirect packets to a local socket port.
The advantages and disadvantages of the three LVS load balancing techniques are summarized in the following table:
| Indicator | VS/NAT | VS/TUN | VS/DR |
|---|---|---|---|
| Server operating system | Any | Tunnel support | Majority (support non-ARP) |
| Server network | Private network | LAN/WAN | LAN |
| Number of servers (100M network) | 10~20 | 100 | Greater than 100 |
| Server gateway | Load balancer | Own routing | Own routing |
| Efficiency | So-so | High | Highest |
Note: the estimates of the maximum number of servers supported by the three methods assume that the scheduler uses a 100M network adapter, that the scheduler's hardware configuration is the same as that of the back-end servers, and that the workload is a general web service. With better hardware for the scheduler (such as gigabit NICs and faster processors), the number of servers it can dispatch increases accordingly, and the number also changes for different applications. The figures above are therefore mainly meant as a quantitative comparison of the scalability of the three methods.
Appendix: Other load balancing methods
HTTP redirect load balancing
When a user sends a request, the web server returns a new URL in the Location field of the HTTP response header, and the browser then requests that new URL. This is simply page redirection, and the redirection is what achieves the load balancing. For example, when we click the download link for the PHP source package, the site returns a download address close to us in order to address the differing download speeds across countries and regions. The HTTP status code for the redirect is 302.
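A small sketch of such a redirecting server, using only the Python standard library, might look like this. The mirror host names are placeholders, and rotating through them in round-robin order is an assumption made for the example; a real site would more likely choose by region or load.

```python
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mirror servers that hold the same content.
MIRRORS = ["http://mirror1.example.com", "http://mirror2.example.com"]
_mirror_cycle = itertools.cycle(MIRRORS)


def next_location(path):
    """Pick the next mirror in turn and build the redirect target URL."""
    return next(_mirror_cycle) + path


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer with 302 and a Location header instead of the content itself.
        self.send_response(302)
        self.send_header("Location", next_location(self.path))
        self.end_headers()


def serve(port=8080):
    """Run the redirector until interrupted (not called automatically)."""
    HTTPServer(("", port), RedirectHandler).serve_forever()
```

Calling `serve()` starts the redirector; each client then makes a second request to whichever mirror appears in the `Location` header.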
Advantages: relatively simple.
Disadvantages: the browser needs two requests to the server to complete one visit, so performance is poor. The processing capacity of the redirect service itself can become a bottleneck, which limits the scalability of the whole cluster. Using HTTP 302 redirects may also cause search engines to judge the site as SEO cheating and lower its search ranking.
DNS domain name resolution load balancing
DNS (Domain Name System) provides the domain name resolution service. A domain name is effectively an alias for a server; what it actually maps to is an IP address, and resolution is the process by which DNS completes that name-to-IP mapping. Because one domain name can be configured with multiple IP addresses, DNS can also serve as a load balancing service.
In practice, large sites generally use DNS resolution as the first level of load balancing. The group of servers that the domain name resolves to are not the physical servers providing the web service but internal servers that themselves provide load balancing, and these internal load balancers in turn distribute requests to the real web servers.
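The idea of one name mapping to several addresses can be modeled as below. This is a toy in-memory resolver rather than a real DNS server: the `RoundRobinZone` class is hypothetical, and rotating the answer list on every query is one common round-robin behavior, assumed here for illustration.

```python
from collections import deque


class RoundRobinZone:
    """Toy name table that rotates the address list after each query."""

    def __init__(self, records):
        # records: domain name -> list of IP addresses
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)  # current ordering is returned to the client
        ips.rotate(-1)      # the next query sees a different first address
        return answer
```

Clients that simply connect to the first address in each answer end up spread across all configured servers.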
Advantages: load balancing is handed over to DNS, sparing the site the work of maintaining load balancer servers. Many DNS services also support geolocation-based resolution, which resolves the domain name to the server address geographically closest to the user, speeding up access and improving performance.
Disadvantages: rules cannot be defined freely, changing a mapped IP address or handling a failed machine is cumbersome, and DNS changes take effect only after a delay. Moreover, control of DNS load balancing lies with the domain name service provider, so the site cannot make further improvements or exercise more powerful management.
Reverse proxy load balancing
A reverse proxy can cache resources to improve site performance. In terms of deployment, the reverse proxy server sits in front of the web servers (which is what lets it cache web responses and accelerate access), and this is exactly where a load balancer sits. Most reverse proxy servers therefore also provide load balancing: they manage a set of web servers and forward each request to one of them according to a load balancing algorithm. The response produced by the web server must also be returned to the user through the reverse proxy server. Because the web servers are not accessed directly from outside, they do not need external IP addresses, whereas the reverse proxy server needs two network adapters and two sets of IP addresses, one internal and one external.
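The request path through a reverse proxy can be sketched as follows. The back ends are plain Python callables standing in for web servers, and the `ReverseProxy` class, its unbounded cache, and the round-robin choice are all assumptions made to keep the example short.

```python
import itertools


class ReverseProxy:
    """Toy reverse proxy: caches responses and round-robins across back ends."""

    def __init__(self, backends):
        # backends: name -> callable taking a path and returning a response body
        self._backends = backends
        self._order = itertools.cycle(list(backends))
        self._cache = {}

    def handle(self, path):
        # Serve from the cache when possible, without touching any back end.
        if path in self._cache:
            return self._cache[path]
        name = next(self._order)
        body = self._backends[name](path)  # request goes in via the proxy
        self._cache[path] = body
        return body                        # response also returns via the proxy
```

Every request and response crosses `handle`, which illustrates why the proxy itself can limit throughput.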
Advantages: integrated with the reverse proxy server's other functions, and easy to deploy.
Disadvantage: A reverse proxy server is a broker for all requests and responses, and its performance can be a bottleneck.