Linux Load Balancing Software LVS, Part 1: Concepts
I. A Brief Introduction to LVS
LVS is short for Linux Virtual Server, a free software project initiated by Dr. Wensong Zhang; its official site is www.linuxvirtualserver.org. LVS is now part of the standard Linux kernel. Before the 2.4 kernel, using LVS required recompiling the kernel to add the LVS modules; from the 2.4 kernel onward the LVS modules are fully built in, so no kernel patching is needed and the LVS features can be used directly.
The goal of LVS is to use its load balancing technology together with the Linux operating system to build a high-performance, highly available server cluster that is reliable, scalable, and maintainable, and thus to achieve optimal service performance at low cost.
Development of LVS began in 1998, and it has since grown into a relatively mature project. LVS can be used to build highly scalable, highly available network services such as WWW, caching, DNS, FTP, mail, and video/audio on-demand services. Many well-known sites and organizations run LVS-based cluster systems, for example the Linux portal www.linux.com, RealNetworks (www.real.com), known for its RealPlayer audio/video services, and SourceForge (sourceforge.net), the world's largest open-source site.
II. The LVS System Architecture
A server cluster built with LVS consists of three parts: the front-end load balancing layer (the Load Balancer), the middle server group layer (the Server Array), and the bottom shared storage layer (Shared Storage). From the user's point of view all of the internals are transparent; the user simply sees a single high-performance virtual server.
The LVS architecture is shown in Figure 1:
Fig. 1 The architecture of LVS
The following is a detailed introduction to the various components of LVS:
Load Balancer layer: the front end of the entire cluster system, consisting of one or more load schedulers (Director Servers). The LVS modules are installed on the Director Server, whose main role is similar to that of a router: it uses the routing table maintained by LVS to distribute user requests to the application servers (Real Servers) in the Server Array layer. In addition, a monitoring module for the Real Server services, Ldirectord, is installed on the Director Server to check the health of each Real Server; a Real Server is removed from the LVS routing table when it becomes unavailable and rejoined when it recovers (a sample Ldirectord configuration follows this list).
Server Array layer: a group of machines that actually run the application services; these can be Web servers, mail servers, FTP servers, DNS servers, or video servers. The Real Servers are connected to one another over a high-speed LAN, or distributed across a WAN. In practice, the Director Server can also double as a Real Server.
Shared Storage layer: a storage area that provides shared storage space and consistent content for all the Real Servers. Physically it usually consists of disk array devices. To keep content consistent, data can generally be shared through the NFS network file system, but NFS performs poorly in busy systems, so a cluster file system is often used instead, such as Red Hat's GFS or Oracle's OCFS2.
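To illustrate the monitoring role described above, here is a minimal Ldirectord configuration sketch. The VIP 192.168.60.200, the Real Server addresses, and the check page are hypothetical values, and the exact directives can vary with the Ldirectord version:

    # /etc/ha.d/ldirectord.cf -- all addresses are hypothetical
    checktimeout=3            # seconds before a health check counts as failed
    checkinterval=5           # seconds between health checks
    quiescent=no              # drop failed real servers from the table entirely
    virtual=192.168.60.200:80
            real=192.168.60.132:80 gate
            real=192.168.60.144:80 gate
            service=http
            request="index.html"
            receive="Test Page"
            scheduler=rr
            protocol=tcp
            checktype=negotiate

With checktype=negotiate, Ldirectord fetches the request page from each Real Server and looks for the receive string; a server that fails the check is removed from the IPVS table and re-added once it passes again.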
As this structure shows, the Director Server is the core of LVS. At present the Director Server can only run Linux or FreeBSD; the Linux 2.6 kernel supports LVS without any additional setup, while FreeBSD is rarely used as a Director Server and does not perform very well in that role.
For the Real Server, almost any system platform is well supported, including Linux, Windows, Solaris, AIX, and the BSD family.
III. Characteristics of the LVS Cluster
3.1 IP load balancing and load scheduling algorithms
1. IP Load Balancing Technology
Load balancing can be implemented in several ways: DNS round-robin, client-side scheduling, scheduling at the application layer based on system load, and scheduling based on IP address. Of these, IP load balancing is the most efficient.
LVS's IP load balancing is implemented by the IPVS module, the core software of an LVS cluster system. IPVS is installed on the Director Server, where it creates a virtual IP address; users must access the service through this virtual IP, generally called the VIP (Virtual IP) of LVS. Requests first reach the load scheduler through the VIP, and the scheduler then selects a service node from the Real Server list to answer each request.
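As a concrete sketch, a virtual service and its Real Server list are defined on the Director with the ipvsadm tool. All addresses below are hypothetical:

    # On the Director Server: create a virtual HTTP service on the VIP
    ipvsadm -A -t 192.168.60.200:80 -s rr     # -s rr selects round-robin scheduling
    # Attach two real servers (-g = direct routing, ipvsadm's default
    # forwarding method; the three forwarding modes are described below)
    ipvsadm -a -t 192.168.60.200:80 -r 192.168.60.132:80 -g
    ipvsadm -a -t 192.168.60.200:80 -r 192.168.60.144:80 -g
    ipvsadm -L -n                             # list the current IPVS table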
When a user request arrives at the load scheduler, how the scheduler forwards it to the Real Server that provides the service, and how the Real Server returns its data to the user, are the key techniques implemented by IPVS. IPVS provides three load balancing mechanisms: NAT, TUN, and DR. They work as follows:
VS/NAT: Virtual Server via Network Address Translation
This mode implements a virtual server through network address translation. When a user request reaches the scheduler, the scheduler rewrites the packet's destination address (the virtual IP address) to the address of the selected Real Server, rewrites the destination port to that server's corresponding port, and then forwards the request to the selected Real Server. When the Real Server returns its data, the reply must again pass through the load scheduler, which rewrites the packet's source address and source port back to the virtual IP address and its port before sending the data on to the user, completing the whole scheduling cycle.
As can be seen, in NAT mode both request and response packets must be rewritten by the Director Server, so as user requests grow in number, the scheduler's processing capacity becomes the bottleneck.
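A minimal VS/NAT sketch, assuming the Director has a public VIP and an internal interface at 10.0.0.1, and that the Real Servers sit on the internal network; all addresses are hypothetical:

    # On the Director: enable packet forwarding and define the service in NAT mode
    echo 1 > /proc/sys/net/ipv4/ip_forward
    ipvsadm -A -t 202.96.1.10:80 -s rr
    ipvsadm -a -t 202.96.1.10:80 -r 10.0.0.2:80 -m    # -m = masquerading (NAT)
    ipvsadm -a -t 202.96.1.10:80 -r 10.0.0.3:80 -m
    # On each real server: replies must route back through the Director
    route add default gw 10.0.0.1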
VS/TUN: Virtual Server via IP Tunneling
This mode implements the virtual server with IP tunneling. Its connection scheduling and management are the same as in VS/NAT, but its packet forwarding method differs: in VS/TUN the scheduler forwards user requests to a Real Server through an IP tunnel, and that Real Server responds to the user directly, no longer passing through the front-end scheduler. There is also no requirement on the geographic location of the Real Servers; they may share a network segment with the Director Server or live on independent networks. Because the scheduler handles only the request half of each connection, the throughput of the cluster system is greatly improved.
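A VS/TUN sketch with hypothetical addresses; the Real Server side shown here assumes a Linux 2.6 kernel, where ARP behaviour for the VIP is controlled with the arp_ignore/arp_announce sysctls:

    # On the Director: forward to the real server through an IP tunnel (-i)
    ipvsadm -A -t 202.96.1.10:80 -s rr
    ipvsadm -a -t 202.96.1.10:80 -r 202.96.2.20:80 -i
    # On each real server: load the ipip module and bind the VIP to tunl0
    modprobe ipip
    ifconfig tunl0 202.96.1.10 netmask 255.255.255.255 up
    # Keep the real server from answering ARP queries for the VIP
    echo 1 > /proc/sys/net/ipv4/conf/tunl0/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/tunl0/arp_announce
    echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce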
VS/DR: Virtual Server via Direct Routing
This mode implements the virtual server with direct routing. Its connection scheduling and management are the same as in VS/NAT and VS/TUN, but its packet forwarding method differs: VS/DR rewrites the MAC address of the request packet and sends it to the Real Server, which responds directly to the client, eliminating the IP tunneling overhead of VS/TUN. This gives the best performance of the three load scheduling mechanisms, but it requires that the Director Server and the Real Servers each have a network card attached to the same physical network segment.
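A VS/DR sketch with hypothetical addresses; as with TUN, the Real Server must hold the VIP (here on the loopback) without answering ARP for it, again assuming a 2.6 kernel:

    # On the Director: forward by rewriting the MAC address (-g, "gatewaying")
    ipvsadm -A -t 192.168.60.200:80 -s rr
    ipvsadm -a -t 192.168.60.200:80 -r 192.168.60.132:80 -g
    # On each real server: bind the VIP to the loopback so its packets are accepted
    ifconfig lo:0 192.168.60.200 netmask 255.255.255.255 up
    echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce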
2. Load scheduling algorithms
As mentioned above, the load scheduler dynamically selects a Real Server to answer each user request according to the load on the different servers. How is this dynamic selection made? Through the load scheduling algorithm. According to different network service requirements and server configurations, IPVS implements eight load scheduling algorithms. The four most commonly used are described in detail here, followed by an ipvsadm example; for the remaining four, please consult other references.
Round-robin scheduling (Round Robin)
Round-robin scheduling, also called 1:1 scheduling, distributes external user requests in turn, one at a time, to each Real Server in the cluster. The algorithm treats every Real Server equally, regardless of the server's actual load or connection state.
Weighted round-robin scheduling (Weighted Round Robin)
The weighted round-robin algorithm dispatches requests according to the differing processing capacities of the Real Servers. Each Real Server can be given its own scheduling weight: a higher weight for servers with better performance, a lower weight for servers with less processing power, so that capable servers handle more traffic and server resources are used fully and sensibly. The scheduler can also query the load on each Real Server automatically and adjust its weight dynamically.
Least-connection scheduling (Least Connections)
The least-connection algorithm dynamically dispatches network requests to the server with the fewest established connections. If the Real Servers in the cluster have similar performance, least-connection scheduling balances the load well.
Weighted least-connection scheduling (Weighted Least Connections)
Weighted least-connection scheduling is a superset of least-connection scheduling. Each service node expresses its processing capacity as a weight, which the system administrator can set dynamically (the default weight is 1). When assigning new connection requests, weighted least-connection scheduling keeps each node's number of established connections roughly proportional to its weight.
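In ipvsadm these four algorithms are selected with the -s flag (rr, wrr, lc, wlc), and per-server weights are set with -w. A brief sketch with hypothetical addresses:

    # Weighted round robin: .132 receives roughly three requests for every one to .144
    ipvsadm -A -t 192.168.60.200:80 -s wrr
    ipvsadm -a -t 192.168.60.200:80 -r 192.168.60.132:80 -g -w 3
    ipvsadm -a -t 192.168.60.200:80 -r 192.168.60.144:80 -g -w 1
    # Switch the existing virtual service to weighted least-connections
    ipvsadm -E -t 192.168.60.200:80 -s wlc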
The other four scheduling algorithms are: Locality-Based Least Connections, Locality-Based Least Connections with Replication, Destination Hashing, and Source Hashing. They are not described further in this article; to learn more about these four scheduling strategies, visit the LVS Chinese site at zh.linuxvirtualserver.org.
3.2 High Availability
LVS is kernel-level software and therefore offers very high processing performance. A load-balanced cluster built on the LVS framework has outstanding capacity: the failure of any single service node does not affect the normal operation of the system as a whole, and the load is balanced sensibly, giving applications very high load-serving capacity; millions of concurrent connection requests can be supported. With 100 Mbit/s network cards and the VS/TUN or VS/DR scheduling technique, the throughput of the whole cluster can reach 1 Gbit/s; with Gigabit NICs, the maximum throughput approaches 10 Gbit/s.
3.3 High Reliability
LVS load balancing software has been widely used by enterprises, schools, and other organizations, and many large, critical web sites have adopted the LVS cluster software, so its reliability has been well confirmed in practice. There are many LVS load balancing systems that run for long periods and have never been restarted. All of this demonstrates LVS's high stability and reliability.
3.4 Applicable environment
For the front-end Director Server, LVS currently supports only Linux and FreeBSD, but it supports most TCP and UDP protocols. Applications over TCP include HTTP, HTTPS, FTP, SMTP, POP3, IMAP4, proxy, LDAP, SSMTP, and so on. Applications over UDP include DNS, NTP, ICP, and video and audio streaming protocols.
LVS places no restrictions on the Real Server operating system; a Real Server can run any operating system that supports TCP/IP, including Linux, the various Unixes (FreeBSD, Sun Solaris, HP-UX, and so on), Mac OS, and Windows.
3.5 Open Source Software
The LVS cluster software is free software released under the GPL (GNU General Public License), so users can obtain the source code and modify it to suit their own needs, but any modifications must also be released under the GPL.