Linux Load Balancing Software, Part 1: LVS (Concepts)

Source: Internet
Author: User

I. Introduction to LVS
LVS is short for Linux Virtual Server, a free software project initiated by Dr. Wensong Zhang; its official site is www.linuxvirtualserver.org. LVS is now part of the standard Linux kernel. Before the Linux 2.4 kernel, the kernel had to be patched and recompiled to support the LVS modules. Since Linux 2.4, the LVS functions have been built in, so they can be used directly without patching the kernel.
The goal of LVS is to build a high-performance, highly available server cluster from the load balancing technology it provides and the Linux operating system, with good reliability, scalability, and manageability, and the best service performance at low cost.
The LVS project started in 1998 and has since grown into a mature technology. LVS can be used to build highly scalable, highly available network services such as WWW, cache, DNS, FTP, mail, and video/audio-on-demand services. Many well-known sites and organizations have built their cluster systems with LVS, for example the Linux portal (www.linux.com), Real (www.real.com), the company that provides audio and video services for RealPlayer, and the world's largest open source website (sourceforge.net).
II. Structure of the LVS System
A server cluster built with LVS consists of three parts: the load balancer layer at the front (Load Balancer), the server group layer in the middle (Server Array), and the shared data storage layer at the bottom (Shared Storage). To users, the internals are transparent: they see only the high-performance service provided by a single virtual server.
The LVS architecture is shown in Figure 1:

The components of LVS are described in detail below:
• Load Balancer layer: Located at the very front of the cluster, it consists of one or more load schedulers (Director Servers). The LVS module is installed on the Director Server, whose main role is similar to that of a router: it holds the routing tables set up for LVS and uses them to distribute user requests to the application servers (Real Servers) in the Server Array layer. The Director Server also runs ldirectord, a module that monitors the health of the services on each Real Server. When a Real Server becomes unavailable, ldirectord removes it from the LVS routing table, and adds it back once it recovers.
• Server Array layer: A set of machines that actually run the application services. A Real Server can be one or more of a web server, mail server, FTP server, DNS server, or video server. The Real Servers are connected to each other over a high-speed LAN or across a WAN. In practice, a Director Server can also act as a Real Server at the same time.
• Shared Storage layer: A storage area that gives all Real Servers shared storage space and consistent content. Physically it usually consists of disk array devices. To keep content consistent, data is generally shared over the NFS network file system. NFS does not perform well in a busy production system, however, and in that case a cluster file system can be used instead, such as Red Hat's GFS or Oracle's OCFS2.
As the overall structure shows, the Director Server is the core of LVS. Currently, only Linux and FreeBSD can serve as the operating system of the Director Server. The Linux 2.6 kernel supports LVS without any extra setup, whereas FreeBSD is rarely used as a Director Server and does not perform as well.
For Real Servers, almost every platform is well supported, including Linux, Windows, Solaris, AIX, and the BSD family.
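
The monitoring role described for the Director Server (take a failed Real Server out of rotation, put it back after recovery) can be illustrated with a minimal Python sketch. This is not how ldirectord is implemented; the class `RealServerPool`, its methods, and the IP addresses are hypothetical names chosen for illustration, and the health check is passed in as a plain function rather than a real network probe.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RealServerPool:
    """Hypothetical stand-in for the director's table of Real Servers."""
    servers: List[str]
    active: Dict[str, bool] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumption: every Real Server starts out healthy.
        self.active = {s: True for s in self.servers}

    def run_checks(self, is_healthy: Callable[[str], bool]) -> None:
        """One monitoring pass: failed servers are dropped, recovered ones rejoin."""
        for server in self.servers:
            self.active[server] = is_healthy(server)

    def available(self) -> List[str]:
        """Real Servers that may currently receive requests."""
        return [s for s in self.servers if self.active[s]]


pool = RealServerPool(["10.0.0.11", "10.0.0.12", "10.0.0.13"])
down = {"10.0.0.12"}                      # simulate one failed node
pool.run_checks(lambda s: s not in down)
print(pool.available())                   # the failed node is left out
down.clear()                              # the node recovers
pool.run_checks(lambda s: s not in down)
print(pool.available())                   # the node is back in rotation
```

In a real deployment the check would be an actual service probe run periodically; here it is injected as a function so the remove-and-rejoin behavior can be seen in isolation.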

III. Characteristics of the LVS Cluster
3.1 IP Load Balancing and Load Scheduling Algorithms

1. IP Load Balancing Technology
Load balancing can be implemented in many ways: DNS round-robin, client-side scheduling, application-layer scheduling based on system load, and scheduling based on IP addresses. Of these, IP load balancing is the most efficient.
LVS implements IP load balancing in the IPVS module, the core software of an LVS cluster. IPVS is installed on the Director Server and creates a virtual IP address there, through which users must access the service. This address is usually called the LVS VIP (Virtual IP). A request first reaches the load scheduler through the VIP, and the scheduler then picks a service node from the Real Server list to respond to it.
How the scheduler forwards a request to the Real Server that provides the service, and how that Real Server returns data to the user, is the key technology implemented by IPVS. IPVS offers three load balancing mechanisms, NAT, TUN, and DR, described below:
• VS/NAT (Virtual Server via Network Address Translation)
This mode implements the virtual server with network address translation. When a user request reaches the scheduler, the scheduler rewrites the destination address of the packet (the virtual IP address) to the address of the selected Real Server, rewrites the destination port to the corresponding port on that Real Server, and then sends the request to it. When the Real Server returns data to the user, the response must pass through the load scheduler again, which changes the source address and source port of the packet back to the virtual IP address and its port before sending the data to the user. This completes the load scheduling process.
In NAT mode, then, both request and response packets must have their addresses rewritten by the Director Server. As the number of user requests grows, the processing capacity of the scheduler becomes the bottleneck.
• VS/TUN (Virtual Server via IP Tunneling)
This mode implements the virtual server with IP tunneling. Connection scheduling and management are the same as in VS/NAT, but packets are forwarded differently. In VS/TUN mode the scheduler forwards user requests to a Real Server through an IP tunnel, and that Real Server responds directly to the user without going back through the front-end scheduler. There is also no restriction on the geographic location of the Real Server: it can be in the same network segment as the Director Server or in a separate network. Because the scheduler handles only incoming requests in TUN mode, the throughput of the cluster is greatly improved.
• VS/DR (Virtual Server via Direct Routing)
This mode implements the virtual server with direct routing. Connection scheduling and management are the same as in VS/NAT and VS/TUN, but packets are forwarded differently again. VS/DR sends the request to the Real Server by rewriting the MAC address of the request packet, and the Real Server returns the response directly to the client, which avoids the IP tunneling overhead of VS/TUN. This mode performs best of the three, but it requires that the Director Server and every Real Server have a network card attached to the same physical network segment.
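
To make the difference between the three forwarding modes concrete, the following Python sketch models which header fields each mode touches. It is a simplification for illustration only, not the kernel's IPVS code: the `Packet` class, the function names, and all addresses are hypothetical, and a real packet carries many more fields.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    """Simplified packet holding only the fields the three modes touch."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    dst_mac: str = ""
    inner: Optional["Packet"] = None  # set only for an encapsulated packet


def forward_nat(pkt: Packet, real_ip: str, real_port: int) -> Packet:
    """VS/NAT request path: rewrite destination IP and port to the Real Server."""
    return replace(pkt, dst_ip=real_ip, dst_port=real_port)


def reply_nat(reply: Packet, vip: str, vip_port: int) -> Packet:
    """VS/NAT return path: the director restores the VIP as source IP and port."""
    return replace(reply, src_ip=vip, src_port=vip_port)


def forward_tun(pkt: Packet, director_ip: str, real_ip: str) -> Packet:
    """VS/TUN: wrap the untouched request in an outer packet for the Real Server."""
    return Packet(src_ip=director_ip, src_port=0,
                  dst_ip=real_ip, dst_port=0, inner=pkt)


def forward_dr(pkt: Packet, real_mac: str) -> Packet:
    """VS/DR: change only the destination MAC; the IP header still names the VIP."""
    return replace(pkt, dst_mac=real_mac)


request = Packet("203.0.113.5", 40000, "192.0.2.100", 80,
                 dst_mac="aa:aa:aa:aa:aa:01")
print(forward_nat(request, "10.0.0.11", 8080))
print(forward_tun(request, "192.0.2.1", "10.0.0.11"))
print(forward_dr(request, "aa:aa:aa:aa:aa:11"))
```

Only VS/NAT has a return-path function here, which mirrors the point made above: in VS/TUN and VS/DR the response goes from the Real Server straight to the client, so the director never sees it.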

2. Load Scheduling Algorithms
As mentioned above, the load scheduler dynamically selects a Real Server to respond to each user request based on the load of each server. How this selection is made is the job of the load scheduling algorithm. To suit different network service requirements and server configurations, IPVS implements eight load scheduling algorithms. The four most commonly used are described in detail here; for the remaining four, please refer to other material.
• Round Robin scheduling
Round robin scheduling is also called 1:1 scheduling. The scheduler assigns external user requests in order, one at a time, to each Real Server in the cluster. The algorithm treats every Real Server equally, regardless of the actual load and number of connections on the server.
• Weighted Round Robin scheduling
Weighted round robin scheduling dispatches requests according to the different processing capabilities of the Real Servers. Each Real Server can be given its own scheduling weight: a higher weight for a more powerful Real Server and a lower weight for a less powerful one. This ensures that more powerful servers handle more traffic, so server resources are used fully and sensibly. The scheduler can also automatically query the load on the Real Servers and adjust their weights dynamically.
• Least Connections scheduling
Least connections scheduling dynamically dispatches network requests to the server with the fewest established connections. If the Real Servers in the cluster have similar performance, this algorithm balances the load well.
• Weighted Least Connections scheduling
Weighted least connections scheduling is a superset of least connections scheduling. Each service node has a weight that represents its processing capacity, which the system administrator can set dynamically; the default weight is 1. When allocating new connection requests, the algorithm tries to keep the number of established connections on each service node proportional to its weight.
The other four scheduling algorithms are Locality-Based Least Connections, Locality-Based Least Connections with Replication, Destination Hashing, and Source Hashing. They are not described in this article; for more detailed information about them, see the LVS Chinese site, zh.linuxvirtualserver.org.
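
The four commonly used algorithms described above can be sketched in a few lines of Python. This is an illustration of the selection rules only, under stated assumptions: the names `RealServer`, `round_robin`, and so on are hypothetical, weights are assumed to be positive integers, and the weighted round robin shown is a naive version that simply repeats each server `weight` times per cycle, which is not necessarily how IPVS orders its picks.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class RealServer:
    name: str
    weight: int = 1        # default weight of 1, as in the text
    connections: int = 0   # currently established connections


def round_robin(servers: List[RealServer]) -> Iterator[RealServer]:
    """Hand out servers strictly in turn, ignoring load and weight."""
    return cycle(servers)


def weighted_round_robin(servers: List[RealServer]) -> Iterator[RealServer]:
    """Naive weighting: each server appears `weight` times per cycle."""
    expanded = [s for s in servers for _ in range(s.weight)]
    return cycle(expanded)


def least_connections(servers: List[RealServer]) -> RealServer:
    """Pick the server with the fewest established connections."""
    return min(servers, key=lambda s: s.connections)


def weighted_least_connections(servers: List[RealServer]) -> RealServer:
    """Pick the server with the lowest connections-to-weight ratio."""
    candidates = [s for s in servers if s.weight > 0]
    return min(candidates, key=lambda s: s.connections / s.weight)


servers = [RealServer("rs1", weight=3, connections=30),
           RealServer("rs2", weight=1, connections=12)]
rr = round_robin(servers)
print([next(rr).name for _ in range(4)])
wrr = weighted_round_robin(servers)
print([next(wrr).name for _ in range(4)])
print(least_connections(servers).name)
print(weighted_least_connections(servers).name)
```

With these sample numbers, least connections picks rs2 (12 connections against 30), while weighted least connections picks rs1, because 30 connections at weight 3 is a lighter relative load than 12 connections at weight 1.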

3.2 High Availability
LVS runs at the kernel level and therefore has high processing performance. A load balancing cluster built on the LVS framework has excellent processing capacity: the failure of an individual service node does not affect normal use of the whole system, and reasonable load balancing gives the application very high service capacity, supporting millions of concurrent connection requests. With 100 Mbit/s network cards and VS/TUN or VS/DR scheduling, the throughput of the whole cluster can reach 1 Gbit/s; with gigabit network cards, the maximum throughput of the system approaches 10 Gbit/s.

3.3 High Reliability
LVS load balancing cluster software is widely used in enterprises, schools, and other sectors, and many large, critical websites have adopted it, so its reliability has been well proven in practice. Many load balancing systems built with LVS have run for long periods without ever being restarted, which demonstrates the high stability and reliability of LVS.

3.4 Applicable environment
On the front-end Director Server, LVS currently supports only Linux and FreeBSD, but it supports most TCP and UDP protocols. Supported TCP-based applications include HTTP, HTTPS, FTP, SMTP, POP3, IMAP4, proxy, LDAP, and SSMTP. Supported UDP-based applications include DNS, NTP, ICP, and video and audio streaming protocols.
LVS places no restrictions on the operating system of the Real Servers: a Real Server can run any operating system that supports TCP/IP, including Linux, the various Unix systems (such as FreeBSD, Sun Solaris, and HP-UX), Mac OS, and Windows.

3.5 Open Source Software
LVS cluster software is free software released under the GPL (GNU General Public License). Users can therefore obtain its source code and modify it to suit their own needs, but modified versions must also be distributed under the GPL.

This article is from the "Technical Achievement Dream" blog. Please be sure to keep this source: http://ixdba.blog.51cto.com/2895551/552947
