Analysis and Implementation of LVS Load Balancing Algorithms for an Enterprise Cluster Platform


I. Common Architecture Diagram of an LVS Cluster

[Figure: typical LVS cluster architecture]


Load balancer layer: located at the very front of the cluster system, consisting of one or more load schedulers (Director Servers). The LVS kernel module IPVS is installed on the Director Server. The Director's role is similar to a router: it maintains the forwarding rules that implement the LVS function and uses them to dispatch user requests to the application servers (Real Servers) in the server array layer. The Director Server also runs the monitoring module ldirectord, which checks the health of each Real Server's service. When a Real Server becomes unavailable it is removed from the LVS table, and it is re-added when it recovers.

Server array layer: consists of a group of machines that actually run application services, such as web servers, mail servers, FTP servers, DNS servers, or video servers. The Real Servers are connected to each other over a high-speed LAN or distributed across a WAN. In real-world deployments, the Director Server can also double as a Real Server.

Shared storage layer: a storage area that provides shared storage space and content consistency for all Real Servers, typically built from disk-array devices. Content consistency is usually achieved by sharing data over the NFS network file system, but NFS does not perform well in a busy production system; a cluster file system, such as Red Hat's GFS or Oracle's OCFS2, can be used instead.


As the overall LVS structure shows, the Director Server is the core of LVS. Currently, only Linux and FreeBSD are supported as the Director Server's operating system. The Linux 2.6 kernel has all of the LVS modules built in, so the LVS feature is available without any extra setup.

For the Real Servers, almost all system platforms are well supported: Linux, Windows, Solaris, AIX, and the BSD family.


II. Load Scheduling


Load balancing can be implemented in many ways: DNS round-robin, client-side scheduling, scheduling based on application-layer system load, and scheduling based on IP address. Of these, IP load balancing is the most efficient.

The IP load balancing technology of LVS is implemented by the IPVS module, the core software of an LVS cluster system. IPVS is installed on the Director Server, where a virtual IP address is configured; users must access the service through this virtual IP, generally called the LVS VIP (Virtual IP). Incoming requests first reach the load scheduler via the VIP, and the scheduler then picks a service node from the Real Server list to respond to the user's request.
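As an illustrative sketch (the VIP address and the round-robin scheduler here are placeholders chosen for the example, not taken from the article), a virtual service is typically defined on the Director with the `ipvsadm` tool:

```shell
# Load the IPVS kernel module (built into Linux 2.6+ kernels)
modprobe ip_vs

# Define a virtual service on the VIP (203.0.113.100 is a placeholder),
# TCP port 80, round-robin scheduling
ipvsadm -A -t 203.0.113.100:80 -s rr

# Show the current virtual server table (numeric output)
ipvsadm -Ln
```

These commands require root privileges and a kernel with IPVS support; they are a configuration fragment, not a complete setup.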

When a user's request arrives at the load scheduler, how the scheduler forwards the request to the Real Server node that provides the service, and how the Real Server returns data to the user, are the key techniques implemented by IPVS. IPVS provides three forwarding modes for load balancing: NAT (including full NAT), TUN, and DR, which are described in detail below.


III. DR Mode


Here is the DR Mode data transfer diagram:

[Figure: DR mode data transfer diagram]


DR mode: Virtual Server via Direct Routing, i.e., using direct routing technology to implement the virtual server. Connection scheduling and management work the same as in the other two modes, but the packet forwarding method differs: VS/DR forwards a request to a Real Server by rewriting the destination MAC address of the request packet, and the Real Server returns the response directly to the client, eliminating the IP tunneling overhead of VS/TUN. This mode gives the best performance of the three forwarding modes.

The following is the DR Mode IP packet scheduling process diagram:

[Figure: DR mode IP packet scheduling process]



Schematic description:

DR mode routes packets directly to the target Real Server. According to the load of each Real Server, the scheduler dynamically selects one; it does not modify the destination IP address or destination port, and does not encapsulate the IP packet. Instead, it rewrites the destination MAC address of the request frame to the MAC address of the chosen Real Server, then sends the modified frame on the LAN shared by the server group. Because the frame's destination MAC address is that of the Real Server, and both machines are on the same LAN, the Real Server is guaranteed by normal network behavior to receive the packet sent by the load balancer (LB). When the Real Server unpacks the IP header of the request, it finds that the destination IP is the VIP.

A host only accepts packets whose destination IP matches one of its own addresses, so the VIP must be configured on each Real Server's local loopback interface. There is a complication: network interfaces answer ARP broadcasts, and since every machine in the cluster has the VIP on its lo interface, their ARP responses would conflict. Therefore, ARP responses for the VIP on the Real Servers' lo interfaces must be suppressed.

The Real Server then processes the request and sends the response packet back to the client according to its own routing information, with the VIP as the source IP address.

DR mode summary:

1. Forwarding is implemented by modifying the packet's destination MAC address on the scheduler (LB). Note that the source address remains the CIP (client IP) and the destination address remains the VIP.

2. The request packets pass through the scheduler, but the Real Servers' response packets do not, so DR handles high concurrency very efficiently (compared with NAT mode).

3. Because DR mode forwards by rewriting MAC addresses, all RS nodes and the scheduler LB must be on the same LAN.

4. Each RS host must bind the VIP address on its lo interface and must configure ARP suppression.

5. The default gateway of an RS node does not need to point to the LB; it can be configured directly as the upstream router's gateway, letting the RS reach the network directly.

6. Because the DR scheduler only rewrites the MAC address, it cannot rewrite the destination port, so the RS servers must serve on the same port as the VIP service.
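The VIP binding and ARP suppression described above are commonly implemented on each Real Server with commands like the following (a sketch only; 203.0.113.100 is a placeholder VIP):

```shell
# Suppress ARP replies for addresses bound on lo (ARP suppression)
sysctl -w net.ipv4.conf.lo.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.lo.arp_announce=2
sysctl -w net.ipv4.conf.all.arp_announce=2

# Bind the VIP on the loopback interface with a /32 mask so the
# Real Server accepts packets whose destination IP is the VIP
ip addr add 203.0.113.100/32 dev lo
```

This is a configuration fragment requiring root; the sysctls must be set before the VIP is bound to avoid a window of conflicting ARP replies.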


IV. NAT / Full NAT Mode


[Figure: NAT mode data transfer diagram]

NAT mode: Virtual Server via Network Address Translation, i.e., using network address translation. When a user request reaches the scheduler, the scheduler rewrites the destination address of the request packet (the virtual IP address) to the address of the selected Real Server, rewrites the destination port to the corresponding port of that Real Server, and then sends the request to it. When the Real Server returns data to the user, the response passes through the load scheduler again, which rewrites the source address and source port back to the virtual IP address and its port before sending the data to the user, completing the load scheduling process.


The following is the NAT mode IP packet scheduling process diagram:

[Figure: NAT mode IP packet scheduling process]


Schematic Description:

1. The client sends request data with the VIP as the destination IP.

2. The request reaches the LB server. According to the scheduling algorithm, the LB changes the destination address to an RIP (Real Server IP) and the corresponding port (the RIP is chosen by the scheduling algorithm), and records the connection in a connection hash table.

3. The packet travels from the LB server to the RS web server, which builds the response. The web server's gateway must be the LB, so the response returns via the LB server.

4. After receiving the returned data from the RS, the LB looks up the connection hash table and rewrites the source address to the VIP and the destination address to the CIP, with the corresponding port (e.g., 80). The data then travels from the LB to the client.

5. The client sees only the VIP/DIP information; the RIP addresses stay hidden.

NAT mode pros and cons:

1. NAT rewrites addresses in both the request and the response packets, so when site traffic is heavy the LB scheduler becomes a significant bottleneck; the general recommendation is at most 10-20 nodes.

2. Only one public IP address needs to be configured on the LB.

3. The gateway address of every internal Real Server must be the intranet address of the scheduler LB.

4. NAT mode supports translation of both IP address and port; that is, the port the user requests and the Real Server's port may differ.
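On the Director, a NAT-mode service can be sketched as follows (all addresses are placeholders for this example; `-m` selects masquerading, i.e. NAT forwarding):

```shell
# Enable IP forwarding so the Director can route RS responses back out
sysctl -w net.ipv4.ip_forward=1

# Virtual service on the public VIP, weighted least-connection scheduling
ipvsadm -A -t 203.0.113.100:80 -s wlc

# Real Servers on the intranet; their ports may differ from the VIP's
# port, since NAT translates both address and port
ipvsadm -a -t 203.0.113.100:80 -r 192.168.10.11:8080 -m
ipvsadm -a -t 203.0.113.100:80 -r 192.168.10.12:8080 -m
```

Each Real Server's default gateway must be set to the Director's intranet address (192.168.10.x side here) for the return path to work.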



Full NAT mode


The fundamentals of full NAT:

When a client requests the VIP, full NAT replaces not only the packet's destination IP but also its source IP; on the way back from the VIP to the client, both addresses are replaced again.

[Figure: full NAT mode packet flow]


(1) The client first sends a request packet to the VIP.

(2) When the packet arrives, the LVS LB selects a suitable Real Server according to the configured LB algorithm, then rewrites the packet's destination IP to the Real Server's IP and its source IP to the LVS cluster LB's IP.

(3) The Real Server receives the packet, sees that the destination IP is its own, processes the request, and sends the response packet back to the LVS LB IP.

(4) When LVS receives the response, it rewrites the source IP to the VIP and the destination IP to the client's IP, then sends the packet to the client.

Considerations for full NAT mode:

Full NAT mode does not require the LB IP and the Real Server IPs to be in the same network segment;

The advantage of full NAT over NAT is that the RS's return packets are guaranteed to come back to the LVS, because their destination is the LVS's own address (the rewritten source IP), rather than depending on the RS's gateway configuration;

Because full NAT also rewrites the source IP, its performance is about 10% lower than NAT mode.
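The two rewrites in steps (2) and (4) can be sketched with a toy packet model (plain dicts; all function names and addresses are illustrative inventions, not LVS internals):

```python
def fullnat_inbound(pkt, lb_local_ip, rs_ip):
    """Step (2): rewrite destination (VIP -> RS) AND source (client -> LB)."""
    return {**pkt, "src": lb_local_ip, "dst": rs_ip}

def fullnat_outbound(pkt, vip, client_ip):
    """Step (4): rewrite source (RS -> VIP) AND destination (LB -> client)."""
    return {**pkt, "src": vip, "dst": client_ip}

# A request from client 198.51.100.7 to the VIP
request = {"src": "198.51.100.7", "dst": "203.0.113.100", "payload": "GET /"}

# The Director forwards it to a chosen Real Server; both addresses change,
# so the RS always answers the Director regardless of its gateway setup
to_rs = fullnat_inbound(request, "192.168.10.1", "192.168.10.11")

# The RS replies to the Director, which rewrites the reply for the client
reply = {"src": "192.168.10.11", "dst": "192.168.10.1", "payload": "200 OK"}
to_client = fullnat_outbound(reply, "203.0.113.100", "198.51.100.7")
```

The client only ever sees VIP addresses; the RS only ever sees LB addresses, which is exactly why the LB and RS networks can be fully decoupled.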


V. IP Tunnel Mode

[Figure: TUN mode data transfer diagram]

TUN: Virtual Server via IP Tunneling, i.e., implementing the virtual server through IP tunneling technology. In VS/TUN mode, the scheduler uses an IP tunnel to forward user requests to a Real Server, and the Real Server responds to the user directly, no longer passing through the front-end scheduler. In addition, there is no requirement on the geographic location of the Real Servers: they can be in the same network segment as the Director Server or on entirely separate networks. Because the scheduler in TUN mode only processes the users' request packets, the throughput of the cluster system is greatly improved.


The workflow for TUN is as follows:


[Figure: TUN mode workflow]

TUN differs from NAT mode in that no IP addresses are rewritten between the LB and the RS. Instead, the client's request packet is encapsulated in an IP tunnel and sent to the RS node server. The node server strips the tunnel header, processes the request, and sends the response packet directly to the client through its own external address, without passing back through the LB server.


Schematic process in brief:

1. The client sends a request packet, with the VIP as the destination address, to the LB.

2. The LB receives the request packet and performs IP tunnel encapsulation, i.e., it prepends an IP tunnel header to the original packet header. It then selects a suitable Real Server according to the configured LB algorithm and wraps the client's packet into a new IP packet whose destination is that Real Server's IP.

3. The RS node server receives the request packet and, from the IP tunnel header, determines that the outer destination IP is its own; it unwraps the inner packet and finds that the inner destination is the VIP. It then checks whether the VIP is bound on one of its interfaces: if so, the packet is processed; if not, it is dropped. This is why the VIP is usually bound on lo:0 on the Real Server, so it can accept and respond to the client's request packet.

4. After building the response, the RS server sends the response data to the client over its own public network line, with the VIP as the source IP address.


Considerations for IP Tunnel mode:

In tunnel mode, the VIP address must be bound on every Real Server machine;

In tunnel mode, VIP -> Real Server packets travel through the tunnel, which works over both intranets and the public Internet, so the LVS VIP and the Real Servers do not need to be in the same network segment;

In tunnel mode, the Real Server sends packets directly to the client, not back through the LVS;

Because tunnel mode requires tunneling support everywhere and is harder to operate and maintain, it is generally not used.


VI. LVS Load Scheduling Algorithms


The LVS scheduling algorithm determines how the workload is distributed among the cluster nodes. When the Director scheduler receives an inbound client request for the cluster service on the VIP, it must decide which cluster node should handle the request. The Director's scheduling methods fall into two categories:

Fixed scheduling algorithms: RR, WRR, DH, SH

Dynamic scheduling algorithms: WLC, LC, LBLC, LBLCR

[Figure: overview of the LVS scheduling algorithms]

Selecting an LVS scheduling algorithm for a production environment:

1. For common network services such as WWW, mail, and MySQL, the usual LVS scheduling algorithms are:

A. Basic round-robin scheduling (RR)

B. Weighted least-connection scheduling (WLC)

C. Weighted round-robin scheduling (WRR)

2. Locality-based least-connection (LBLC) and locality-based least-connection with replication (LBLCR) are mainly suitable for web-cache and DB-cache clusters.

3. Source-address hashing (SH) and destination-address hashing (DH) can be used in firewall clusters to guarantee that traffic for a given address always passes through the same node.

In practice, many of these algorithms see wide use. It is best to study how the connection scheduling algorithms are implemented in the kernel, and then choose one according to your specific business needs.
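To make the fixed vs. dynamic distinction concrete, here is a sketch of weighted round-robin (WRR), one of the fixed algorithms listed above, in the gcd-stepping style the LVS kernel scheduler uses (the function name and server list are made up for this example; this is an illustration, not the kernel code):

```python
from functools import reduce
from math import gcd

def wrr_schedule(servers, n):
    """Return n picks from servers, a list of (name, weight) pairs,
    using weighted round-robin: the current-weight threshold starts at
    the maximum weight and drops by the gcd of all weights each sweep,
    so heavier servers are picked proportionally more often."""
    weights = [w for _, w in servers]
    max_weight = max(weights)
    step = reduce(gcd, weights)   # threshold decreases by the weights' gcd
    i = -1                        # index of the last selected server
    cw = 0                        # current weight threshold
    picks = []
    while len(picks) < n:
        i = (i + 1) % len(servers)
        if i == 0:                # a full sweep completed: lower the bar
            cw -= step
            if cw <= 0:
                cw = max_weight
        if servers[i][1] >= cw:   # server is heavy enough for this round
            picks.append(servers[i][0])
    return picks

# Example: weights 4:3:2 yield picks in exactly that proportion
schedule = wrr_schedule([("A", 4), ("B", 3), ("C", 2)], 9)
```

Dynamic algorithms such as WLC differ in that the choice also depends on live state (active connection counts), which a fixed table like this one never consults.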

This article is from the "Love Linux" blog, make sure to keep this source http://ixdba.blog.51cto.com/2895551/1767494


This article was reposted from the "There is nothing, know in Providence" blog; please keep this source: http://yangsj.blog.51cto.com/8702844/1789111
