High-availability Implementation: LVS

Last Update:2018-07-24 Source: Internet

Author: User

Tags iptables


Linux Virtual Server
Several terms:

Director: also called the dispatcher or scheduler; the LVS front-end device.

Realserver: the real internal server that actually provides the service.

VIP: the external IP, that is, the address that client requests are sent to.

DIP: the director's address for communication with the realservers.

Introduction to LVS working modes
LVS can implement load balancing for a server cluster in three ways: NAT, DR, and TUN. The differences between the three are outlined below.

LVS-NAT:
The idea of this mode is to rewrite addresses at the network (IP) layer: the director replaces the destination address of the IP packets that the client sends to it.

1. Network environment
One director plus N realservers. The director and the realservers are on the same private network segment, and the director is the realservers' default gateway. Only the director has a public IP that is exposed to the wide area network.

2. Client request
The client's request first arrives at the public IP (the director). The director selects a realserver according to its load-balancing policy and replaces the destination address of the request's IP packets with that realserver's IP.

3. Realserver response
The realserver processes the request and generates a return packet whose source address is the realserver's IP and whose destination address is the client's IP. Because the realserver's default gateway is the director, the return packet is sent to the director first, even though its destination is the client. The director then rewrites the packet again, replacing the source address with its own IP, and forwards it to the switch for delivery to the client. Throughout the process the director performs two IP-layer modifications. It rewrites the destination address of request packets in order to distribute the load, and it rewrites the source address of response packets in order to hide the realservers, so that the client is unaware of them.

4. Limitation: the throughput of the whole cluster is limited by the director (mainly its outbound bandwidth).
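
The two address rewrites in NAT mode can be illustrated with a small simulation. This is only a sketch: the `Packet` class, the example addresses, and the simple rotation used to pick a realserver are assumptions made for illustration, and a real director also tracks connections so that every packet of one connection reaches the same realserver.

```python
from dataclasses import dataclass, replace
from itertools import cycle

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str

# Hypothetical addresses chosen for the example.
VIP = "203.0.113.10"
REALSERVERS = ["192.168.0.11", "192.168.0.12"]
_next_server = cycle(REALSERVERS)

def director_inbound(pkt):
    """Request path: swap the destination (VIP) for a chosen realserver IP."""
    return replace(pkt, dst=next(_next_server))

def director_outbound(pkt):
    """Response path: swap the realserver source address for the VIP."""
    return replace(pkt, src=VIP)

request = Packet(src="198.51.100.7", dst=VIP, payload="GET /")
to_realserver = director_inbound(request)

# The realserver answers the client address it saw in the request.
reply = Packet(src=to_realserver.dst, dst=to_realserver.src, payload="200 OK")
to_client = director_outbound(reply)

print(to_realserver)
print(to_client)
```

From the client's point of view both packets involve only the VIP, which is why the realservers stay hidden.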

LVS-DR:
The idea of this mode is to rewrite addresses at the data link layer: the director modifies the MAC address of the network frame.

1. Network environment
One director plus N realservers. The director and the realservers all have public IPs and are exposed to the WAN. In addition, each realserver has an IP alias identical to the director's IP. In other words, a realserver has two IPs: its own real IP address, and an IP alias equal to the director's address (the public IP that clients access).
The realservers also need to be configured to ignore all ARP broadcasts for the public IP. When an ARP broadcast asks which MAC address owns the public IP, only the director responds, so external traffic is never sent straight to a realserver.

2. Client request
The client's request first arrives at the public IP (the director), because the network has been configured so that only the director answers ARP broadcasts for it. The director then changes the destination MAC address of the request frame to the MAC address of a realserver chosen by the load-balancing policy. The IP packet inside the frame is left unchanged.

3. Realserver response
The realserver receives the MAC frame, unpacks the IP packet from it, and finds that the packet's destination address matches its own IP alias, so it continues processing. (This is why the alias is required: if the addresses differed, the operating system would discard the packet.) It then generates the response and sends it back. Because the director is not the realserver's default gateway, the response goes directly to the WAN, which delivers it to the client.

4. Advantage: return packets do not pass through the director, so the director's bandwidth is not a bottleneck. In principle, the bandwidth of the cluster is the sum of the bandwidth of all realservers, although it cannot exceed the bandwidth of the WAN switch they are connected to.

5. Limitation: multiple public IPs are required, and the director and the realservers must be on the same network segment, that is, on the same switch. The reason is simple: if a realserver were on another segment, the director would rewrite the MAC address of the frame and send it to the switch, but the switch would not find that MAC address on its segment and could not forward the frame.
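
The DR request path can be sketched in a few lines. The `Frame` class, MAC addresses, and IP addresses below are hypothetical; the point is only that the director changes the destination MAC and nothing else, and that the realserver accepts the packet because the VIP is one of its local addresses.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_ip: str
    dst_ip: str

# Hypothetical addresses chosen for the example.
VIP = "203.0.113.10"
DIRECTOR_MAC = "02:00:00:00:00:01"
REALSERVER = {"mac": "02:00:00:00:00:11", "addresses": {"203.0.113.11", VIP}}
NO_ALIAS = {"mac": "02:00:00:00:00:11", "addresses": {"203.0.113.11"}}

def director_forward(frame, realserver_mac):
    """Rewrite only the destination MAC; the IP header stays untouched."""
    return replace(frame, dst_mac=realserver_mac)

def realserver_accepts(frame, server):
    """Accept the frame only if its MAC and its destination IP are both local."""
    return frame.dst_mac == server["mac"] and frame.dst_ip in server["addresses"]

def realserver_reply(frame):
    """The reply goes straight to the client, with the VIP alias as its source."""
    return {"src_ip": frame.dst_ip, "dst_ip": frame.src_ip}

incoming = Frame(dst_mac=DIRECTOR_MAC, src_ip="198.51.100.7", dst_ip=VIP)
forwarded = director_forward(incoming, REALSERVER["mac"])

print(realserver_accepts(forwarded, REALSERVER))
print(realserver_accepts(forwarded, NO_ALIAS))
print(realserver_reply(forwarded))
```

The second check shows why the alias matters: without the VIP among its local addresses, the realserver would drop the forwarded packet.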
LVS-TUN:
This mode removes the same-segment limitation of LVS-DR. It does not rewrite any addresses. Instead, it wraps the packet a second time at the network layer.

1. Network environment
One director plus N realservers. The director and the realservers all have public IPs and are exposed to the WAN. The public IPs are all different, there is no alias restriction, and the machines do not need to be on the same network segment.

2. Client request
The client sends its data to the director. The director places the received IP packet, as payload, into a new IP packet, and sets the destination address of the new packet to a realserver IP chosen by the scheduling policy. These new IP packets fully comply with the network protocol and involve no address rewriting, so they travel across the WAN in the normal way until they reach the chosen realserver.

3. Realserver response
When the realserver receives the data, it has one extra step to perform: it extracts the payload from the outer IP packet, treats that payload as the original IP packet, and passes it up through TCP to obtain the final request data. The realserver then generates the response and returns it directly to the client.

4. Advantage: as with LVS-DR, the director's outbound bandwidth is not a bottleneck.

5. Shortcoming: the extra encapsulation and decapsulation carry a certain cost.
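
The double packaging used by TUN mode can be sketched as one packet nested inside another. The `IPPacket` class and all addresses are hypothetical, and the reply's source address is an assumption of this sketch (it reuses the original destination so that the client recognizes the answer).

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class IPPacket:
    src: str
    dst: str
    payload: Union[str, "IPPacket"]

def encapsulate(inner, director_ip, realserver_ip):
    """Director: wrap the untouched client packet in a new IP packet."""
    return IPPacket(src=director_ip, dst=realserver_ip, payload=inner)

def decapsulate(outer):
    """Realserver: unwrap the outer packet to recover the original one."""
    if not isinstance(outer.payload, IPPacket):
        raise ValueError("payload is not an encapsulated IP packet")
    return outer.payload

def reply_to(inner, body):
    """Assumption: the reply uses the original destination as its source."""
    return IPPacket(src=inner.dst, dst=inner.src, payload=body)

# Hypothetical addresses; the realserver sits on a different network.
VIP, DIRECTOR_IP, REALSERVER_IP = "203.0.113.10", "203.0.113.1", "192.0.2.21"

request = IPPacket(src="198.51.100.7", dst=VIP, payload="GET /")
outer = encapsulate(request, DIRECTOR_IP, REALSERVER_IP)
inner = decapsulate(outer)
response = reply_to(inner, "200 OK")

print(outer)
print(response)
```

Because the outer packet is an ordinary IP packet addressed to the realserver, it can be routed across segments, which is what DR mode cannot do.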

LVS Configuration
LVS-NAT configuration

1, realserver configuration requirements:

Configure the internal private network address, and point the default gateway at the director

2, director configuration requirements
Basic configuration:

The director needs two network cards, one external and one internal. A single card also works if you configure a sub-interface so that the external VIP and the DIP share the same card, but this reduces the director's performance, so two cards are recommended.

Disable SELinux and iptables

setenforce 0

service iptables stop

(to avoid unnecessary trouble, turn off these two services on every server)

Turn on packet forwarding

echo "1" > /proc/sys/net/ipv4/ip_forward

ipvsadm -A -t $VIP:$Port -s rr

Explanation: -A adds a cluster service (you can add several, such as a web service on port 80 and an HTTPS service on port 443); -t means the service uses TCP; -s selects the scheduling algorithm, here rr for round robin (there are 10 scheduling algorithms in total, choose according to your actual needs)

ipvsadm -a -t $VIP:$Port -r $RIP:$Port -m

Explanation: -a adds a realserver and is followed by the address and port of the cluster service defined previously; -r gives the address of the specific realserver (written here as $RIP, the realserver's own IP); -m sets the forwarding mode to NAT
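
When there are several realservers, the same two ipvsadm invocations are repeated with different addresses. The helper below is a hypothetical convenience, not part of LVS: it only builds the command strings, using the -A, -a, -t, -r, -s options described above plus the forwarding flags -m (NAT), -g (DR), and -i (TUN), and leaves running them to the administrator.

```python
# Forwarding-mode flags of ipvsadm: masquerading (NAT), gatewaying (DR), ipip (TUN).
FORWARDING_FLAGS = {"nat": "-m", "dr": "-g", "tun": "-i"}

def ipvsadm_commands(vip, port, realservers, mode="nat", scheduler="rr"):
    """Return the ipvsadm command lines for one TCP cluster service."""
    flag = FORWARDING_FLAGS[mode]
    service = f"{vip}:{port}"
    commands = [f"ipvsadm -A -t {service} -s {scheduler}"]
    for rip in realservers:
        commands.append(f"ipvsadm -a -t {service} -r {rip}:{port} {flag}")
    return commands

# Hypothetical addresses chosen for the example.
for line in ipvsadm_commands("203.0.113.10", 80, ["192.168.0.11", "192.168.0.12"]):
    print(line)
```

Generating the lines from one list keeps the service address identical in every command, which is easy to get wrong when typing them by hand.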
Configuration of LVS-DR

1, realserver configuration requirements
Basic configuration

First restrict ARP behavior by modifying kernel parameters. Otherwise, configuring the VIP on the realserver will cause an address conflict.

By default, Linux announces the IP addresses of all its interfaces in the ARP broadcasts it sends from any one interface, and it answers ARP requests arriving on any one interface for the IP addresses of all its interfaces

arp_announce: limits ARP announcements

Restriction levels

0: Announce the IP addresses of all interfaces in ARP broadcasts sent from this interface

1: Try to avoid announcing addresses that do not belong to the target's subnet on this interface (not strict)

2: Only announce the IP addresses of this interface in ARP broadcasts

arp_ignore: limits ARP replies

Restriction levels

0: Answer ARP requests from other devices for an IP address configured on any interface

1: Answer ARP requests from other devices only if the requested IP address is configured on the interface that received the request

echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore

echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce

echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore

echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce

Configure the VIP on the lo interface of the realserver. Together with the ARP restrictions above, this keeps the VIP out of the MAC address table on the physical switch and so avoids an IP conflict

ifconfig lo:1 $VIP broadcast $VIP netmask 255.255.255.255

ifconfig eth0 $RIP up

Here $RIP is the realserver's own address. Note that the broadcast address of the VIP's interface is the VIP itself and the subnet mask is 32 bits, which confines its broadcasts. The same applies to the director configuration below

Add a host route so that packets for the VIP are handled through lo:1 and responses go out with the VIP as their source address

route add -host $VIP dev lo:1
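
The effect of arp_ignore on a realserver that holds the VIP on lo can be modelled as a simple membership test. This is a sketch of levels 0 and 1 only, as described above; the interface table and addresses are hypothetical.

```python
def answers_arp(target_ip, incoming_iface, interfaces, arp_ignore):
    """Return True if the host would answer an ARP request for target_ip."""
    if arp_ignore == 0:
        # Level 0: answer for an address configured on any interface.
        return any(target_ip in addrs for addrs in interfaces.values())
    if arp_ignore == 1:
        # Level 1: answer only for addresses on the receiving interface.
        return target_ip in interfaces.get(incoming_iface, set())
    raise ValueError("only levels 0 and 1 are modelled in this sketch")

# Hypothetical realserver: its own address on eth0, the VIP on lo.
VIP = "203.0.113.10"
realserver = {"eth0": {"203.0.113.11"}, "lo": {"127.0.0.1", VIP}}

print(answers_arp(VIP, "eth0", realserver, arp_ignore=0))
print(answers_arp(VIP, "eth0", realserver, arp_ignore=1))
print(answers_arp("203.0.113.11", "eth0", realserver, arp_ignore=1))
```

With level 1 the realserver stays silent about the VIP but still answers for its own address, so only the director claims the VIP on the segment.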

2, director configuration requirements

Configure the VIP and the DIP, with the VIP on a sub-interface of the physical network card

ifconfig eth0 $DIP netmask 255.255.255.0 up

ifconfig eth0:1 $VIP broadcast $VIP netmask 255.255.255.255 up

Add a host route so that packets destined for the VIP go out through the physical interface on which the VIP is configured

route add -host $VIP dev eth0:1

Cluster configuration

ipvsadm -A -t $VIP:$Port -s rr

Explanation: -A adds a cluster service (the same as in the NAT configuration)

ipvsadm -a -t $VIP:$Port -r $RIP:$Port -g

Explanation: the options are the same as for NAT, with $RIP standing for the realserver's own address, except that the final mode flag is -g, which selects DR mode

Introduction to LVS scheduling algorithms:

-s specifies the algorithm used by the service. The commonly used algorithm parameters are as follows:
rr Round robin
With the round-robin scheduling algorithm, the scheduler assigns external requests to the real servers in the cluster in turn. It treats every server equally, regardless of the actual number of connections and the system load on the server.
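
Round robin amounts to walking the server list in a loop. A minimal sketch, with hypothetical server names:

```python
from itertools import cycle

def round_robin(servers):
    """Yield servers in order, starting again from the first after the last."""
    return cycle(servers)

picker = round_robin(["rs1", "rs2", "rs3"])
picks = [next(picker) for _ in range(5)]
print(picks)
```

Each server receives the same share of requests, whether or not it is already busy.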

wrr Weighted round robin
With the weighted round-robin scheduling algorithm, the scheduler dispatches requests according to the differing processing capacity of the real servers. This ensures that servers with greater processing capacity handle more of the traffic. The scheduler can automatically query the load of the real servers and adjust their weights dynamically.
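
One common way to formulate weighted round robin is to cycle through the servers while lowering a threshold from the highest weight in steps of the weights' greatest common divisor. The sketch below follows that formulation with static weights and hypothetical server names; it is not taken from the LVS source code.

```python
from functools import reduce
from math import gcd

def weighted_round_robin(weights):
    """Yield server names so that each appears in proportion to its weight."""
    servers = list(weights)
    if not servers or any(weights[s] <= 0 for s in servers):
        raise ValueError("weights must be a non-empty mapping of positive integers")
    step = reduce(gcd, weights.values())
    highest = max(weights.values())
    index, current = -1, 0
    while True:
        index = (index + 1) % len(servers)
        if index == 0:
            # Start of a new pass: lower the threshold, wrapping at zero.
            current -= step
            if current <= 0:
                current = highest
        if weights[servers[index]] >= current:
            yield servers[index]

picker = weighted_round_robin({"big": 3, "small": 1})
picks = [next(picker) for _ in range(8)]
print(picks)
```

With weights 3 and 1, every block of four requests sends three to the larger server and one to the smaller.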

lc Least connections
With the least-connections scheduling algorithm, the scheduler dynamically dispatches each network request to the server with the fewest established connections. If the real servers in the cluster have similar performance, this algorithm balances the load well.
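
Least connections is a minimum search over the current connection counts. A minimal sketch with hypothetical counts:

```python
def least_connections(active):
    """Return the server that currently has the fewest connections."""
    return min(active, key=active.get)

active = {"rs1": 4, "rs2": 1, "rs3": 3}
first = least_connections(active)
active[first] += 2            # the chosen server takes on new connections
second = least_connections(active)
print(first, second)
```

Unlike round robin, the choice changes as soon as the counts change.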

wlc Weighted least connections
When server performance differs widely within the cluster, the scheduler uses the weighted least-connections algorithm to optimize load balancing: servers with higher weights bear a larger share of the active connections. The scheduler can automatically query the load of the real servers and adjust their weights dynamically.
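
A simple way to express weighted least connections is to compare connections per unit of weight. The sketch below assumes static, positive weights and hypothetical server names.

```python
def weighted_least_connections(active, weight):
    """Return the server with the lowest connections-to-weight ratio."""
    return min(active, key=lambda s: active[s] / weight[s])

active = {"big": 6, "small": 3}
weight = {"big": 4, "small": 1}
choice = weighted_least_connections(active, weight)
print(choice)
```

Here the larger server is chosen even though it holds more connections, because 6/4 is lower than 3/1.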

lblc Locality-based least connections
The locality-based least-connections algorithm balances load by destination IP address and is mainly used in cache cluster systems. Based on the destination IP address of the request, the algorithm finds the server most recently used for that address. If that server is available and not overloaded, the request is sent to it. If the server does not exist, or if it is overloaded and some other server is at half of its workload, an available server is selected by the least-connections principle and the request is sent to that server.
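
The LBLC decision can be sketched with a dictionary that remembers the last server used for each destination. The overload test (more active connections than the server's weight) and the half-load test are assumptions of this sketch, as are all names and numbers.

```python
def lblc_pick(dest_ip, table, active, weight):
    """Pick a server for dest_ip, preferring the one used for it last time."""
    server = table.get(dest_ip)
    if server in active:
        overloaded = active[server] > weight[server]
        half_loaded_exists = any(active[s] * 2 < weight[s] for s in active)
        if not (overloaded and half_loaded_exists):
            return server
    # No usable mapping: fall back to least connections and remember the choice.
    server = min(active, key=active.get)
    table[dest_ip] = server
    return server

table = {}
active = {"cache1": 1, "cache2": 0}
weight = {"cache1": 10, "cache2": 10}

first = lblc_pick("10.0.0.5", table, active, weight)
active["cache2"] = 3
second = lblc_pick("10.0.0.5", table, active, weight)
active["cache2"] = 11
third = lblc_pick("10.0.0.5", table, active, weight)
print(first, second, third)
```

The destination stays with the same cache server until that server is overloaded and a lightly loaded alternative exists.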

lblcr Locality-based least connections with replication
The locality-based least-connections algorithm with replication also balances load by destination IP address and is mainly used in cache cluster systems. It differs from LBLC in that it maintains a mapping from a destination IP address to a group of servers, whereas LBLC maintains a mapping from a destination IP address to a single server. Based on the destination IP address of the request, the algorithm finds the corresponding server group and selects a server from the group by the least-connections principle. If that server is not overloaded, the request is sent to it. If it is overloaded, a server is selected from the whole cluster by the least-connections principle, added to the server group, and sent the request. In addition, when the server group has not been modified for some time, the busiest server is removed from the group to reduce the degree of replication.
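
The selection step of LBLCR can be sketched with a dictionary of server sets. This sketch omits the time-based removal of the busiest server, and its overload test (more active connections than the server's weight) is an assumption, as are all names and numbers.

```python
def lblcr_pick(dest_ip, groups, active, weight):
    """Pick from the destination's server group, growing the group if needed."""
    group = groups.setdefault(dest_ip, set())
    candidates = sorted(s for s in group if s in active)
    if candidates:
        server = min(candidates, key=active.get)
        if active[server] <= weight[server]:
            return server
    # Group empty or its best member overloaded: draw from the whole cluster.
    server = min(active, key=active.get)
    group.add(server)
    return server

groups = {}
active = {"cache1": 5, "cache2": 2}
weight = {"cache1": 8, "cache2": 4}

first = lblcr_pick("10.0.0.5", groups, active, weight)
active["cache2"] = 6          # cache2 now exceeds its weight of 4
second = lblcr_pick("10.0.0.5", groups, active, weight)
print(first, second, sorted(groups["10.0.0.5"]))
```

After the second request the destination is served by two cache servers, which is the replication that distinguishes this algorithm from LBLC.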

dh Destination hashing
The destination-hashing algorithm uses the destination IP address of the request as a hash key to look up the corresponding server in a statically allocated hash table. If that server is available and not overloaded, the request is sent to it. Otherwise the lookup returns nothing.

sh Source hashing
The source-hashing algorithm uses the source IP address of the request as a hash key to look up the corresponding server in a statically allocated hash table. If that server is available and not overloaded, the request is sent to it. Otherwise the lookup returns nothing.
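
Destination hashing and source hashing share the same lookup and differ only in which address is used as the key. The sketch below uses the integer value of the address modulo the table size as its hash, which is an assumption made for simplicity and not the function LVS uses; the table size and server names are also hypothetical.

```python
import ipaddress

def build_table(servers, size=256):
    """Statically assign every hash bucket to a server."""
    return [servers[i % len(servers)] for i in range(size)]

def hash_pick(ip, table, available, overloaded=frozenset()):
    """Look up the server for ip; return None if it cannot take the request."""
    bucket = int(ipaddress.ip_address(ip)) % len(table)
    server = table[bucket]
    if server in available and server not in overloaded:
        return server
    return None

table = build_table(["rs1", "rs2"], size=4)

# Pass the destination address for dh, or the source address for sh.
print(hash_pick("10.0.0.5", table, available={"rs1", "rs2"}))
print(hash_pick("10.0.0.4", table, available={"rs1", "rs2"}))
print(hash_pick("10.0.0.5", table, available={"rs1"}))
```

The same address always maps to the same server, and the lookup yields nothing when that server is unavailable or overloaded.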

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
