Detailed description of building an LVS load-balancing cluster service
I. LVS Overview
1. LVS: Linux Virtual Server
Layer-4 switching (routing): forwards request packets to a server in the backend host cluster based on the destination IP address and destination port (according to the scheduling algorithm);
it cannot perform load balancing at the application layer.
LVS (also called ipvs) is implemented on top of netfilter, the firewall framework in the kernel.
2. LVS cluster terminology:
VS: Virtual Server, the virtual service host, also called Director, Dispatcher, or Balancer
RS: Real Server, a real backend server
CIP: Client IP, the client's IP address
VIP: Director Virtual IP, the virtual (floating) IP address of the load balancer
DIP: Director IP, the scheduling IP address (the IP of the Director's second NIC)
RIP: Real Server IP, the real server's IP address
3. LVS: ipvsadm/ipvs
(1) ipvsadm: CLI tool
A user-space command-line tool for managing cluster services and the RS behind them; install with: # yum install -y ipvsadm
(2) ipvs: kernel component (supported by CentOS by default)
Code that works on the netfilter INPUT hook in the kernel; its clustering function depends on the cluster service rules defined with ipvsadm;
Supports a wide range of services based on the TCP, UDP, SCTP, AH, ESP, AH_ESP, and other protocols;
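A quick way to confirm ipvs support in the running kernel is to check the kernel build configuration and the loaded modules (a minimal sketch; the config file name depends on the installed kernel, and the ip_vs modules may only appear in lsmod after ipvsadm defines the first service):
[root@localhost ~]# grep -i 'config_ip_vs' /boot/config-$(uname -r) | head
[root@localhost ~]# lsmod | grep ip_vs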
4. Key considerations in a load-balancing cluster:
(1) session persistence
Session sticky (ip hash): source-IP binding; requests from the same source IP, recorded in an IP hash table, are always scheduled to the same RS.
Session cluster (multicast/broadcast/unicast): sessions are synchronized (replicated) across the cluster; only suitable for small-scale scenarios.
Session server: a dedicated session server stores the sessions.
(2) Data Sharing (providing consistent storage)
1) Shared storage:
NAS: Network Attached Storage (file level), e.g. a file server
SAN: Storage Area Network (block level)
DS: Distributed Storage
2) Data synchronization: rsync, etc.
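Where a full shared-storage setup is overkill, the RS document roots can be kept in sync with rsync; a minimal sketch, assuming the other RS is reachable over ssh as the hypothetical host rs2:
[root@localhost ~]# rsync -avz --delete /var/www/html/ rs2:/var/www/html/
A cron job or an inotify-based tool can run this periodically to keep the content consistent.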
II. LVS Models
1. lvs-nat: the address masquerading model
Multi-target DNAT: the destination address and port of the request packet are rewritten to the RIP and port of a selected RS;
The client sends a request packet with source CIP and destination VIP. The Director receives it on the VIP and, using the kernel's forwarding between its NICs, hands it to the scheduler on the DIP side; the scheduler load-balances to the RIP of an RS according to the configured algorithm, rewriting the destination IP to the RIP. Both the request and the response must pass through the Director, which performs the address translation in each direction.
(1) RIP and DIP should use private IP addresses, and the gateway of each RS should point to the DIP;
(2) Both request and response packets must be forwarded by the Director; under very high load the Director can become the system bottleneck (response packets are typically large);
(3) Port mapping (forwarding) is supported;
(4) The VS must be Linux, while the RS can run any operating system;
(5) The RIP of the RS and the DIP of the Director must be on the same IP network;
2. lvs-dr (direct routing): the gateway model
Forwarding is done by rewriting the MAC address of the request frame; the IP header is not modified (the source IP remains CIP and the destination IP remains VIP).
The client's request travels hop by hop to the switch/router closest to the VS, which forwards it to the Director; the Director then forwards the request to an RS by rewriting only the destination MAC address, using ARP broadcasts on the LAN to find the MAC address of the chosen real host. Each RS has the VIP configured as an alias address on one of its interfaces, so throughout the whole path the source IP stays CIP and the destination IP stays VIP; scheduling is based purely on the MAC lookup. In this gateway model all hosts must be able to reach the Internet, so each RS can respond to the client directly.
(1) Ensure that the front-end router sends request packets whose destination IP is the VIP to the Director;
Solutions:
1) static binding (bind the VIP's MAC address to the Director on the front-end router);
2) prevent the RS from answering ARP requests for the VIP:
a) define rules with arptables;
b) modify kernel parameters on each RS so that the VIP configured on a specific interface does not answer ARP requests;
(2) The RIP of the RS can be a private or a public address;
if the RIP is private, the RS still responds to the client directly through a router that connects it to the Internet.
(3) The RS and the Director must be on the same physical network;
(4) Request packets must pass through the Director for scheduling, but response packets must not pass through the Director;
(5) Port mapping is not supported;
(6) Most operating systems can be used for the RS;
3. lvs-tun (IP tunneling): the IP tunneling model
Forwarding method: the IP header of the request packet is not modified (source IP CIP, destination IP VIP); instead a new IP header is encapsulated around the original one (source IP DIP, destination IP RIP);
(1) RIP, DIP, and VIP must all be public IP addresses;
(2) The gateway of the RS cannot and must not point to the DIP;
(3) Request packets are scheduled through the Director, but response packets are sent directly to the CIP;
(4) Port mapping is not supported;
(5) The OS of each RS must support IP tunneling;
4. lvs-fullnat: the full-NAT model (both the source and the destination IP of the request packet are changed)
Forwarding is done by rewriting both the source IP (CIP --> DIP) and the destination IP (VIP --> RIP) of the request packet;
Note: the first three types are standard; the fourth was added later, may not be supported by the default kernel, and may require compiling the kernel yourself.
(1) The VIP is a public IP address; RIP and DIP are private addresses and may be on different IP networks, but they must be able to reach each other through routing;
(2) The source IP of the request packet received by the RS is the DIP, so the response packet is sent back to the DIP;
(3) Both request and response packets must pass through the Director;
(4) Port mapping is supported;
(5) The RS can use any OS;
III. LVS scheduling algorithms
1. Static methods: scheduling is based on the algorithm alone
(1) RR: round robin; requests are distributed in turn. Simple, but the load-balancing effect is only average.
(2) WRR: weighted round robin; the higher the weight, the more load an RS receives.
(3) SH: source IP hash; requests from the same source IP are bound, via an IP hash table, to the same server, which implements session persistence.
Disadvantages: coarse scheduling granularity and poor load balancing; sessions stick for different lengths of time because connections last different lengths of time.
(4) DH: destination IP hash; requests for the same destination are always sent to the same server. It supports connection tracking but does not consider the load-balancing effect.
Typical use: forwarding requests from the load balancer's intranet to the Internet through web proxies;
Client --> Director --> Web Cache Server (forward proxy)
2. Dynamic methods: scheduling considers both the algorithm and the current load status of each RS (a worked example follows this list).
Overhead: the load value; during forwarding the VS records the number of active and inactive connections (and the weight) of each RS and uses them in the calculation.
Active: the number of active connections, i.e. connections in the ESTABLISHED state that still have outstanding request/response traffic.
Inactive: the number of inactive connections, i.e. connections that are still in the ESTABLISHED state and have not been closed but are currently idle.
(1) LC: least connection
Overhead = Active * 256 + Inactive
The request is dispatched to the backend RS with the fewest connections (the smallest Overhead); when several RS tie, they are chosen top-down from the list in round-robin fashion.
(2) WLC: weighted least connection
Overhead = (Active * 256 + Inactive) / weight; the RS with the smallest result is selected as the next hop.
Disadvantage: when the Overhead values are equal, the RS are chosen top-down, so an RS with a small weight may respond first simply because it sits higher in the list.
(3) SED: Shortest Expected Delay
Overhead = (Active + 1) * 256 / weight
Disadvantage: it solves the WLC tie problem, but a host with a very low weight may never get to respond.
(4) NQ: Never Queue, an improvement on SED
The RS are sorted by weight; each RS is first given one request, and the remaining requests are then distributed according to weight.
(5) LBLC: Locality-Based Least Connection, a dynamic version of the DH algorithm
(6) LBLCR: LBLC with Replication
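A worked example with assumed numbers, to show how the Overhead formulas differ: suppose RS1 has weight 1 with 10 active and 200 inactive connections, and RS2 has weight 4 with 30 active and 100 inactive connections.
WLC: Overhead(RS1) = (10 * 256 + 200) / 1 = 2760; Overhead(RS2) = (30 * 256 + 100) / 4 = 1945, so RS2 is selected.
SED: Overhead(RS1) = (10 + 1) * 256 / 1 = 2816; Overhead(RS2) = (30 + 1) * 256 / 4 = 1984, so RS2 is again selected; note that SED would also prefer an idle high-weight RS even before it has handled any connection.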
IV. The ipvsadm command
1. Manage cluster services:
ipvsadm -A|E -t|u|f service-address [-s scheduler] [-p [timeout]]
ipvsadm -D -t|u|f service-address
-A: add, -E: modify, -D: delete
service-address: the service address, used together with -t|u|f in the following formats:
-t, tcp, vip:port: TCP service IP and port
-u, udp, vip:port: UDP service IP and port
-f, fwm, MARK: firewall mark
-s scheduler: the scheduling algorithm; defaults to WLC and can be omitted;
-p [timeout]: the persistence timeout, used for persistent connections; the default is 300 seconds.
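A hedged example of defining a cluster service (the VIP 192.168.0.100 is an assumed address for illustration):
[root@localhost ~]# ipvsadm -A -t 192.168.0.100:80 -s wrr -p 300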
2. Manage the RS on the cluster service:
ipvsadm -a|e -t|u|f service-address -r server-address [-g|i|m] [-w weight]
ipvsadm -d -t|u|f service-address -r server-address
-a: add an RS, -e: modify an RS, -d: delete an RS
server-address is rip[:port]; the port can be omitted, meaning it is the same as in service-address. Only the NAT mode supports port mapping.
[-g|i|m]
-g: GATEWAY (the default), the lvs-dr model
-i: IPIP, the lvs-tun tunnel model
-m: MASQUERADE, the lvs-nat model
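A hedged example of adding two RS to the service defined above (the RIPs and backend port 8080 are assumptions; port mapping is only possible because -m selects the NAT mode):
[root@localhost ~]# ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.11:8080 -m -w 2
[root@localhost ~]# ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.12:8080 -m -w 1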
3. View
ipvsadm -L|l [options]
-n: numeric; display addresses and ports in numeric form;
-c: connection; display current ipvs connections;
--stats: display statistics;
--rate: display rates;
--exact: display exact values, without unit conversion.
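For example, to list the rules numerically together with their traffic statistics:
[root@localhost ~]# ipvsadm -L -n --stats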
4. Clear rules:
ipvsadm -C
5. Zero the counters:
ipvsadm -Z [-t|u|f service-address]
6. Save and reload:
Save:
ipvsadm -S > /PATH/TO/SOME_RULE_FILE
ipvsadm-save > /PATH/TO/SOME_RULE_FILE
Reload:
ipvsadm -R < /PATH/FROM/SOME_RULE_FILE
ipvsadm-restore < /PATH/FROM/SOME_RULE_FILE
Note: redirection is used to export the rules to, and import them from, a custom rule file.
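On CentOS a common convention (assumed here) is to save the rules to /etc/sysconfig/ipvsadm, the file the ipvsadm service loads at boot:
[root@localhost ~]# ipvsadm-save -n > /etc/sysconfig/ipvsadm
[root@localhost ~]# ipvsadm-restore < /etc/sysconfig/ipvsadm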
Appendix (ipvsadm -h):
ipvsadm -A|E -t|u|f service-address [-s scheduler]
          [-p [timeout]] [-M netmask] [-b sched-flags]
ipvsadm -D -t|u|f service-address
ipvsadm -C
ipvsadm -R
ipvsadm -S [-n]
ipvsadm -a|e -t|u|f service-address -r server-address
          [-g|i|m] [-w weight] [-x upper] [-y lower]
ipvsadm -d -t|u|f service-address -r server-address
ipvsadm -L|l [options]
ipvsadm -Z [-t|u|f service-address]
ipvsadm --set tcp tcpfin udp
ipvsadm -h
V. lvs-nat Model Construction
1. lvs-nat Model
The lvs-nat model is built as follows. All servers and the test client are simulated as VMware virtual machines running CentOS 7.
The VS kernel supports ipvs, and ipvsadm is installed as the tool for writing and managing the LVS rules.
The two RS servers share the request load for the httpd service.
Notes:
1) The client could use a Windows browser, but browsers cache results, so it is more intuitive to send HTTP requests with the curl command on CentOS.
2) Do not configure iptables rules on the Director (DIP).
2. VS NIC configuration
(1) Add a NIC
Add a second network adapter in "Virtual Machine Settings" and set its network to VMnet2. In this example it simulates the load balancer's two NICs sitting on different network segments.
(2) Configure the IP addresses of the two NICs
[root@localhost ~]# nmtui    # CentOS 7 text UI for configuring NICs
[root@localhost ~]# systemctl start network.service
Note:
Network adapter 1 (172.16.249.57) simulates the Internet-facing NIC; network adapter 2 (192.168.100.1) simulates the intranet NIC. The intranet NIC's IP must be on the same network segment as the IPs of the RS servers, because the DIP acts as the gateway for the RIPs; no gateway needs to be configured on the DIP interface itself.
[root@localhost ~]# ifconfig
3. RS Nic Configuration
Here, two CentOS 7 virtual machines are used as the Server Load balancer backend real response host, the RPM package format httpd service is installed, and the service is started. The nmtui command configures the NIC information. The IP address of RS1 is 192.168.100.2 and the IP address of RS2 is 192.168.100.3. The RIP and DIP are in the same network segment. The VM Nic and DIP match the value in VMnet2 mode at the same time, and the two RS server host gateways point to DIP: 192.168.100.1
[Root @ localhost ~] # Yum install-y httpd
[Root @ localhost ~] # Systemctl start httpd. service
Note: After installation, configure the test page on each httpd server,/var/www/html/index.html.
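To make the scheduling result visible from the client, it helps to give each RS a distinct page; a minimal sketch (the page text is arbitrary):
On RS1: [root@localhost ~]# echo "RS1 192.168.100.2" > /var/www/html/index.html
On RS2: [root@localhost ~]# echo "RS2 192.168.100.3" > /var/www/html/index.html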
[root@localhost ~]# nmtui    # configured the same way as above, omitted here
... ...
[root@localhost ~]# systemctl start network.service
[root@localhost ~]# ifconfig
4. Test whether all hosts can communicate with each other
Use the ping command between the nodes, for example from RIP1 to the VIP, the DIP, and RIP2.
[root@localhost ~]# ping IPADDR
5. VS host: enable core forwarding and install ipvsadm
(1) Install the ipvsadm package: [root@localhost ~]# yum install -y ipvsadm
(2) Enable IP forwarding between the NICs: [root@localhost ~]# sysctl -w net.ipv4.ip_forward=1
[root@localhost ~]# cat /proc/sys/net/ipv4/ip_forward
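To keep forwarding enabled across reboots, the setting can also be written to the sysctl configuration (a common approach, shown as a sketch):
[root@localhost ~]# echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
[root@localhost ~]# sysctl -p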
6. VS host: define and configure the lvs-nat service (the rr algorithm is used here)
(1) Define the ipvsadm load-balancing cluster rules and view them
Here the service on the VIP uses the rr round-robin algorithm, and -m selects the lvs-nat mode. The configuration commands are as follows:
[root@localhost ~]# ipvsadm -A -t 172.16.249.57:80 -s rr
[root@localhost ~]# ipvsadm -a -t 172.16.249.57:80 -r 192.168.100.2:80 -m
[root@localhost ~]# ipvsadm -a -t 172.16.249.57:80 -r 192.168.100.3:80 -m
[root@localhost ~]# ipvsadm -L -n
(2) Client test
Use the curl command on the client host to send requests to the VIP. The Director schedules the requests to the backend hosts in turn according to the rr algorithm, so successive requests are answered alternately by 192.168.100.2 and 192.168.100.3.
[root@localhost ~]# curl http://172.16.249.57
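To see the alternation at a glance, the request can be repeated in a small loop (assuming the distinct test pages created earlier):
[root@localhost ~]# for i in {1..4}; do curl -s http://172.16.249.57; done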
7. VS host: define and configure the lvs-nat service (the wrr algorithm is used here)
(1) Modify the ipvsadm load-balancing cluster rules and view them
Change the lvs-nat service from rr to wrr (weighted round robin); set the weight of 192.168.100.2 to 1 and the weight of 192.168.100.3 to 3.
[root@localhost ~]# ipvsadm -E -t 172.16.249.57:80 -s wrr
[root@localhost ~]# ipvsadm -e -t 172.16.249.57:80 -r 192.168.100.2 -w 1 -m
[root@localhost ~]# ipvsadm -e -t 172.16.249.57:80 -r 192.168.100.3 -w 3 -m
[root@localhost ~]# ipvsadm -L -n
(2) Client test
When the client sends requests with curl, the Director forwards them to the hosts according to their weights: three out of every four requests are answered by 192.168.100.3 and one by 192.168.100.2. This algorithm performs weighted round-robin of the load.
[root@localhost ~]# curl http://172.16.249.57
VI. lvs-dr Model Construction
1. The lvs-dr model
All three hosts run CentOS 7, each with a single NIC, bridged and pointing to the external gateway 172.16.100.1.
2. Configure the VIP on the VS and RS servers
The VIP is configured as an alias address: on the VS it is added to the externally facing (DIP) NIC, and on each RS it is added to the lo loopback interface.
Note: the VIP's netmask must be 255.255.255.255 and its broadcast address must be the VIP itself.
VS: [root@localhost ~]# ifconfig eno16777736:0 172.16.50.50 netmask 255.255.255.255 broadcast 172.16.50.50 up
RS: [root@localhost ~]# ifconfig lo:0 172.16.50.50 netmask 255.255.255.255 broadcast 172.16.50.50 up
3. Configure routes on the RS Server
[root@localhost ~]# route add -host 172.16.50.50 dev lo:0
4. Modify the ARP kernel parameters on each RS server
[root@localhost ~]# ll /proc/sys/net/ipv4/conf
(1) Kernel parameters for ARP announcement and ARP response:
1) arp_announce defines the announcement level
0: the default; addresses configured on any local interface are announced on the network
1: try to avoid announcing addresses that belong to other NICs of the host; in special cases other interfaces may still be used
2: always use the best matching local address, i.e. only the address of the defined interface
2) arp_ignore defines the response level
0: respond to all requests
1: respond only if the target IP is a local address configured on the interface that received the request
... ...
Note: set arp_announce = 2 and arp_ignore = 1
(2) Configure the parameters on each RS host
Note: the "all" entry must be configured; for the specific interfaces, eno16777736 (the physical NIC) and lo can both be configured.
RealServer kernel parameters:
# echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
# echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
# echo 1 > /proc/sys/net/ipv4/conf/INTERFACE/arp_ignore
# echo 2 > /proc/sys/net/ipv4/conf/INTERFACE/arp_announce
Note: INTERFACE is your physical interface; here it refers to eno16777736 and lo.
5. VS host: define the lvs-dr service (the rr algorithm is used here)
(1) Configure and view
[root@localhost ~]# ipvsadm -A -t 172.16.50.50:80 -s rr
[root@localhost ~]# ipvsadm -a -t 172.16.50.50:80 -r 172.16.200.10 -g
[root@localhost ~]# ipvsadm -a -t 172.16.50.50:80 -r 172.16.200.11 -g
[root@localhost ~]# ipvsadm -L -n
(2) Test
[root@localhost ~]# curl http://172.16.50.50
Requests are distributed to the RS hosts in turn according to the rr scheduling algorithm.
VII. Defining LVS cluster services with firewall marks
1. The FWM firewall mark function
Firewall marking allows several cluster services to be bound to the same mark and scheduled in a unified way, so cluster services that share the same group of RS can be defined once.
FWM is implemented with the mangle table of iptables; the mark, otherwise used for policy routing, is set on matching packets and then used to define the cluster service.
2. Defining a cluster service with FWM
(1) On the Director, define a "marking" rule in the PREROUTING chain of the netfilter mangle table:
~]# iptables -t mangle -A PREROUTING -d $vip -p $protocol --dport $port -j MARK --set-mark #
$vip: the VIP address
$protocol: the protocol
$port: the protocol port
(2) Define the cluster service based on the FWM:
~]# ipvsadm -A -f # -s scheduler
3. Example
[root@localhost ~]# iptables -t mangle -A PREROUTING -d 172.16.50.50 -p tcp --dport 80 -j MARK --set-mark 5
[root@localhost ~]# ipvsadm -A -f 5 -s rr
[root@localhost ~]# ipvsadm -a -f 5 -r 172.16.200.10 -g
[root@localhost ~]# ipvsadm -a -f 5 -r 172.16.200.11 -g
VIII. LVS persistent connections: lvs persistence
1. The lvs persistence function
Whatever scheduling algorithm is in use, requests from the same client IP can always be sent to the same RS within a specified time window. The mechanism is independent of the ten lvs scheduling algorithms and is implemented with the lvs persistent-connection template (a hash table); once the configured persistence time has expired, scheduling again follows the normal LVS algorithm.
In ipvsadm this is the -p option. If no number (in seconds) follows -p, the default is 300 seconds, and the timer is automatically extended by 2 minutes; for the web, 15 seconds is usually enough.
2. Modes
(1) Persistent per port (PPC)
Requests from a client to the same service port are bound to the same RS server for a period of time (a hedged definition is sketched below);
For example, in a cluster where two hosts act as RS for both the http and ssh services, per-port persistence binds only the client's http requests; its requests to port 22 are not necessarily bound to the same RS.
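A hedged sketch of a PPC definition, reusing the lvs-dr addresses from section VI (-p with no value uses the 300-second default):
~]# ipvsadm -A -t 172.16.50.50:80 -s rr -p
~]# ipvsadm -a -t 172.16.50.50:80 -r 172.16.200.10 -g
~]# ipvsadm -a -t 172.16.50.50:80 -r 172.16.200.11 -g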
(2) Persistent per client (PCC): define port 0 of tcp or udp as the cluster service port
The Director then treats all of a client's requests as belonging to the cluster service and schedules them to the RS: requests from the same client to any port are sent to the RS that was selected the first time (a sketch follows).
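A hedged PCC sketch with the same assumed addresses; port 0 stands for "all ports", and persistence (-p) is required for such a service:
~]# ipvsadm -A -t 172.16.50.50:0 -s rr -p
~]# ipvsadm -a -t 172.16.50.50:0 -r 172.16.200.10 -g
~]# ipvsadm -a -t 172.16.50.50:0 -r 172.16.200.11 -g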
(3) Persistent per FWM (PFWMC)
Two or more services are bound together with a firewall mark; requests for these services are then directed to the same RS server at the same time, so the bundled services stick to one RS.
Instance:
Use the rr algorithm to bind http and https services in lvs-dr Mode
~]# iptables -t mangle -A PREROUTING -d 172.16.100.9 -p tcp --dport 80 -j MARK --set-mark 99
~]# iptables -t mangle -A PREROUTING -d 172.16.100.9 -p tcp --dport 443 -j MARK --set-mark 99
~]# ipvsadm -A -f 99 -s rr -p
~]# ipvsadm -a -f 99 -r 172.16.100.68 -g
~]# ipvsadm -a -f 99 -r 172.16.100.69 -g
Appendix: LVS-DR type RS script example
#!/bin/bash
#
vip=172.16.50.50
interface="lo:0"

case $1 in
start)
    echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
    echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
    ifconfig $interface $vip broadcast $vip netmask 255.255.255.255 up
    route add -host $vip dev $interface
    ;;
stop)
    echo 0 > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo 0 > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo 0 > /proc/sys/net/ipv4/conf/all/arp_announce
    echo 0 > /proc/sys/net/ipv4/conf/lo/arp_announce
    ifconfig $interface down
    ;;
status)
    if ifconfig lo:0 | grep $vip &> /dev/null; then
        echo "RS is running."
    else
        echo "RS is stopped."
    fi
    ;;
*)
    echo "Usage: `basename $0` {start|stop|status}"
    exit 1
esac
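A brief usage note: run the script on each RS before adding it to the cluster; the file name set-lvs-dr-rs.sh below is only an assumed example:
[root@localhost ~]# bash set-lvs-dr-rs.sh start
[root@localhost ~]# bash set-lvs-dr-rs.sh status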