LVS theory summary

Source: Internet
Author: User

Linux Virtual Server (LVS) is a mechanism in the Linux kernel that provides load balancing at Layer 4 of the OSI model. To use LVS, you need to compile it into the kernel or load it as a kernel module. LVS resembles netfilter, Linux's firewall mechanism: the kernel provides the functional framework, and you define rule sets through a corresponding userspace application.

Saying that LVS works on the INPUT chain of netfilter may be inaccurate. The kernel places five hooks (PREROUTING, INPUT, FORWARD, OUTPUT, and POSTROUTING) in the packet processing path, and LVS works on the INPUT hook.
"A data packet that arrives at the Director first passes through PREROUTING, and then routing finds that its destination address is the address of a local interface. Therefore, the packet is sent to INPUT (the LOCAL_IN hook). At this point, the kernel's ipvs code, which always monitors the LOCAL_IN hook, checks the destination IP address and port, or the firewall mark, and finds that this packet requests a cluster service. The packet, which would otherwise be delivered to the local machine (the Director), is therefore redirected to the RealServer through the POSTROUTING hook. This change to the normal route of a packet is made according to the IPVS table (defined by the administrator through ipvsadm)."

Therefore, on the director of LVS, you should be careful when configuring firewall rules to avoid conflicts.

An LVS cluster is divided into the director and the realservers. A virtual IP address (VIP) is configured on the director to provide the external service. The director forwards client requests to pre-defined realservers according to a scheduling policy. The IP address that the director uses to contact the backend realservers is the DIP, and the IP address of a realserver is the RIP.

LVS has four working modes, or types:

NAT: network address translation
The director selects a realserver, rewrites the destination IP address of the request packet from the VIP to the RIP of the selected realserver, and forwards the request to it. The realserver then constructs a response packet and returns it toward the client. Because the source IP address of the realserver's response is the RIP, the director must change it back to the VIP as the response passes through; otherwise, the client will treat it as an unrequested response and discard it.
Hooks used by the director in the NAT model:
Inbound message: PREROUTING ---> INPUT ---> POSTROUTING
Response message: PREROUTING ---> FORWARD ---> POSTROUTING

Features of the NAT model:
The realserver must use the director's DIP as its gateway.
Because every response packet must be processed by the director, and response packets are usually much larger than request packets, the load on the director is heavier.
Port mapping is supported.
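
As a rough illustration of the NAT model above, the following sketch prints the commands one might run on the director and on a realserver. All addresses (VIP 10.0.0.1, DIP 192.168.10.1, RIP 192.168.10.11) and the port mapping from 80 to 8080 are made-up values, and the function names are hypothetical. The script only echoes the commands, so nothing changes until you run them yourself as root.

```shell
#!/bin/sh
# Hypothetical addresses, chosen only for illustration.
VIP=10.0.0.1          # virtual IP on the director's public interface
DIP=192.168.10.1      # director's address on the realserver network
RIP=192.168.10.11     # one realserver

# Commands for the director: enable forwarding, define the virtual
# service, then add the realserver in masquerading (NAT) mode with -m.
nat_director_cmds() {
    echo "sysctl -w net.ipv4.ip_forward=1"
    echo "ipvsadm -A -t ${VIP}:80 -s wlc"
    # Port mapping: requests to VIP:80 are rewritten to RIP:8080.
    echo "ipvsadm -a -t ${VIP}:80 -r ${RIP}:8080 -m"
}

# Command for the realserver: responses must leave via the director,
# so the DIP becomes the default gateway.
nat_realserver_cmds() {
    echo "route add default gw ${DIP}"
}

nat_director_cmds
nat_realserver_cmds
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; on a test machine the output could be reviewed and then pasted into a root shell.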

DR: direct routing
To spare the director the burden of processing response packets, you can let the realserver build the response directly, using the VIP as the source IP address. This is the so-called direct routing.
In the DR model, the backend realserver is also configured with the VIP. The director receives a request packet from the client with the VIP as the destination IP address, modifies only the layer-2 frame header, setting the destination MAC address to the MAC address corresponding to the RIP of the selected realserver, and forwards the frame. Because the destination MAC address is its own, the realserver strips the layer-2 frame header; because the destination IP address (the VIP) is also configured locally, it continues decapsulating and then responds. The response packet is sent to the client with the VIP as the source IP address.

Features of the DR model:
Both the director and the realservers are configured with the VIP.
Special measures are required to avoid conflicts between the VIPs configured on multiple machines.
The realserver's response is sent directly to the client without passing through the director.
Port mapping is not supported.

TUN: IP tunneling
The director encapsulates the request IP packet inside another IP packet whose destination IP address is the RIP of the selected realserver (that is, an IP tunnel). After the realserver strips the outer IP header, it finds that the inner destination IP address is the VIP, which is already configured locally. The realserver then constructs a response packet with the VIP as the source IP address and returns it to the client through its own route.

Features of the TUN model:
The TUN model provides great scalability for LVS clusters distributed across regions.
The VIP must also be configured on the realserver.
Port mapping is not supported.


FULL NAT
An extension of the NAT model. In addition to what the NAT model does, the director also rewrites the source IP address when forwarding a request to the backend realserver, which ensures that the response packet returns through the director even when the director is not the realserver's gateway. After the director receives the response packet, it rewrites the destination IP address and forwards the packet to the client.

Features of the FULL NAT model:
The director processes the response packets of the realservers.
The director and the realservers need not be in the same IP network.
Port mapping is supported.
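
The port-mapping differences among the modes can be illustrated with a small helper. This is only a sketch: add_real is a hypothetical function and the addresses are invented, while -m, -g, and -i are the ipvsadm forwarding options for NAT, DR, and TUN. FULL NAT is left out because, as far as I know, it depends on a patched kernel and ipvsadm rather than the stock ones.

```shell
#!/bin/sh
# Print the ipvsadm command that adds a realserver to a TCP virtual service.
# usage: add_real MODE VIP VPORT RIP RPORT
add_real() {
    mode=$1 vip=$2 vport=$3 rip=$4 rport=$5
    case $mode in
        nat) flag=-m ;;   # masquerading
        dr)  flag=-g ;;   # gatewaying (direct routing)
        tun) flag=-i ;;   # IP-in-IP tunneling
        *)   echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # Only NAT rewrites the destination port. DR and TUN deliver the
    # packet unchanged, so the realserver must listen on the virtual port.
    if [ "$mode" != nat ] && [ "$vport" != "$rport" ]; then
        echo "port mapping is not supported in $mode mode" >&2
        return 1
    fi
    echo "ipvsadm -a -t ${vip}:${vport} -r ${rip}:${rport} ${flag}"
}

add_real nat 10.0.0.1 80 192.168.10.11 8080
add_real dr  10.0.0.1 80 10.0.0.11 80
add_real tun 10.0.0.1 80 172.16.5.11 80
```

Rejecting a DR or TUN request with mismatched ports in the helper mirrors the feature lists above: the mismatch is caught before a rule is created rather than showing up later as unanswered connections.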


ARP problems:
Several problems need to be solved in the DR model. First, the front-end gateway must not deliver packets whose destination IP address is the VIP to a realserver instead of the director. There are several ways to achieve this:

1: Prohibit the RealServers from responding to ARP requests for the VIP;
2: Hide the VIP addresses on the RealServers so that ARP requests on the network cannot discover them;
3: Use a "transparent proxy" or "fwmark" (firewall mark);
4: Disable ARP requests from the RealServers;

In practice, this is usually done by setting two kernel parameters, arp_announce and arp_ignore.
These two kernel parameters are described as follows:
arp_announce: Define different restriction levels for announcing the local source IP address from IP packets in ARP requests sent on interface.
0 - (default) Use any local address, configured on any interface.
1 - Try to avoid local addresses that are not in the target's subnet for this interface.
2 - Always use the best local address for this target.

arp_ignore: Define different modes for sending replies in response to received ARP requests that resolve local target IP addresses.
0 - (default) reply for any local target IP address, configured on any interface.
1 - reply only if the target IP address is a local address configured on the incoming interface.
2 - reply only if the target IP address is a local address configured on the incoming interface and both it and the sender's IP address are part of the same subnet on this interface.
3 - do not reply for local addresses configured with scope host; only resolutions for global and link addresses are replied.
4-7 - reserved
8 - do not reply for all local addresses


arp_announce sets the ARP announcement level, while arp_ignore sets the ARP response mode. Here we need to set arp_announce = 2. In fact, I don't quite understand this value. As I read it, an IP address is announced only through an interface in the same subnet, so when we later configure the VIP on lo, if the VIP and RIP are in the same network, the VIP must be configured with a 32-bit mask.
arp_ignore = 1 is simpler. When a network interface receives an ARP request whose target IP address is not configured on that interface, it does not respond, even if the target IP address exists on another interface of the local machine.
Generally, the realserver's VIP should be configured on the local loopback interface lo. If the RIP interface is eth0, ARP announcements and responses for the VIP should be restricted on eth0. Therefore, you need the following configuration in the sysctl.conf file:
# vim /etc/sysctl.conf
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.eth0.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2

Configure the VIP
ifconfig lo:0 172.16.9.33 netmask 255.255.255.255 broadcast 172.16.9.33 up
Note: As mentioned above, the VIP configured on the realserver must not fall in the same subnet as the RIP, so the netmask must be 32 bits long. I made this mistake in my experiment, which left the director unable to communicate with the realserver.
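
The realserver-side steps for the DR model can be collected into one sketch. It assumes the RIP sits on eth0 and reuses the example VIP 172.16.9.33; dr_realserver_cmds is a hypothetical helper that only prints the commands. It uses sysctl -w for a runtime change and the iproute2 ip command in place of ifconfig, which is my substitution rather than something the text above prescribes.

```shell
#!/bin/sh
VIP=172.16.9.33   # the example VIP from the text
RIP_IF=eth0       # assumed interface carrying the RIP

dr_realserver_cmds() {
    # Restrict ARP replies and announcements before the VIP is added,
    # so the realserver never advertises it, even briefly.
    for scope in "$RIP_IF" all; do
        echo "sysctl -w net.ipv4.conf.${scope}.arp_ignore=1"
        echo "sysctl -w net.ipv4.conf.${scope}.arp_announce=2"
    done
    # The /32 prefix keeps the VIP from sharing a subnet with the RIP.
    echo "ip addr add ${VIP}/32 dev lo label lo:0"
}

dr_realserver_cmds
```

Setting the ARP parameters first and adding the address last is a deliberate ordering choice in this sketch; values set with sysctl -w do not survive a reboot, which is why the sysctl.conf entries are still needed.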




Connection tracking:
When multiple realservers exist, some applications require the director to keep sending requests from the same client to the realserver chosen for its first request, so that the session stays intact. The director does this through connection tracking, which is implemented with a hash table. To inspect the table, run the following command:

ipvsadm -Lcn
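
To get a quick view of how tracked connections are spread across realservers, the listing can be summarized with awk. The sketch below assumes the usual six-column layout of the connection listing (protocol, expire, state, source, virtual, destination); the sample rows are invented for illustration, and count_by_realserver is a hypothetical helper.

```shell
#!/bin/sh
# Count connection entries per realserver from connection-listing text
# read on stdin. Assumes data rows start with the protocol and carry the
# destination (realserver) in the sixth column.
count_by_realserver() {
    awk '$1 == "TCP" || $1 == "UDP" { n[$6]++ }
         END { for (d in n) print d, n[d] }' | sort
}

# Made-up sample in the assumed layout. On a director you would run:
#   ipvsadm -Lcn | count_by_realserver
sample='IPVS connection entries
pro expire state       source             virtual            destination
TCP 14:57  ESTABLISHED 192.168.1.10:51234 10.0.0.1:80        192.168.10.11:80
TCP 01:12  TIME_WAIT   192.168.1.10:51230 10.0.0.1:80        192.168.10.11:80
TCP 14:59  ESTABLISHED 192.168.1.20:40022 10.0.0.1:80        192.168.10.12:80'

printf '%s\n' "$sample" | count_by_realserver
```

With persistence enabled, all entries from one client should land on the same destination, which this summary makes easy to check.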

LVS persistent connections have two aspects:

1. Requests from the same client are recorded in the LVS hash table. persistence_timeout controls how long the record is kept, in seconds. The persistence_granularity parameter works together with persistence_timeout and is particularly useful in some cases. Its value is a subnet mask that sets the granularity of persistent connections. The default, 255.255.255.255, treats each client IP address separately; 255.255.255.0 treats the client's whole network segment as one unit, so all of its clients are assigned to the same realserver.

2. To keep the connection tracking information in the hash table current, each entry is given a lifetime. LVS defines three timers for connection timeouts:
the TCP idle timeout;
the timeout after LVS receives a TCP FIN from the client;
the timeout for connectionless UDP packets (the allowed interval between two packets).
The values of these three timers can be changed with a command like the following, where the values correspond to the three timers above, in order:
# ipvsadm --set 28800 30 600
To view the tcp, tcpfin, and udp timeouts:
ipvsadm -L --timeout
Timeout (tcp tcpfin udp): 900 120 30
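
The effect of the persistence_granularity mask described in item 1 can be shown with a little arithmetic. The sketch below is not how the kernel implements it; persistence_key is a hypothetical helper that ANDs a client address with the mask, octet by octet, to show which clients would share one persistence record. On the ipvsadm command line the corresponding option is -M (netmask), used together with -p.

```shell
#!/bin/sh
# Apply a dotted-quad netmask to a client IP address.
# usage: persistence_key CLIENT_IP NETMASK
# The body runs in a subshell so the IFS change stays local.
persistence_key() (
    IFS=.
    # Unquoted on purpose: word splitting on "." yields 8 octets,
    # $1..$4 from the address and $5..$8 from the mask.
    set -- $1 $2
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

persistence_key 192.168.1.10 255.255.255.255   # 192.168.1.10
persistence_key 192.168.1.10 255.255.255.0     # 192.168.1.0
persistence_key 192.168.1.77 255.255.255.0     # 192.168.1.0
```

The last two calls produce the same key, so with a 255.255.255.0 granularity both clients would be sent to the same realserver, while the default mask keeps them separate.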


Implementation of LVS persistent connections:
There are usually three types: PCC, PPC, and firewall-mark-based persistent connections.

PCC
PCC directs all requests from a user, on any port, to the same realserver within persistence_timeout (port 0 means all ports).
ipvsadm -A -t $VIP:0 -s wlc -p 600

PPC
PPC directs a user's requests for one service to the same realserver within the timeout period.
ipvsadm -A -t $VIP:PORT -s wlc -p 600

Firewall-mark-based persistent connections
By tagging packets bound for two different ports with the same firewall mark, you tie the two ports together. You then define the virtual service on LVS by the firewall mark instead of an IP address:port pair, so that services on different ports form a single virtual server, and connection persistence can be set on that virtual server.

iptables -t mangle -A PREROUTING -d $VIP -p tcp --dport 80 -j MARK --set-mark 10
iptables -t mangle -A PREROUTING -d $VIP -p tcp --dport 443 -j MARK --set-mark 10
ipvsadm -A -f 10 -s wlc -p
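
The firewall-mark commands above define the virtual service but do not yet attach any realservers. A fuller sketch might look like the following; the VIP, the realserver addresses, the 600-second timeout, and the choice of DR forwarding (-g) are all assumptions made for illustration, and fwmark_cmds is a hypothetical helper that only prints the commands.

```shell
#!/bin/sh
VIP=10.0.0.1                        # hypothetical virtual IP
MARK=10                             # firewall mark used in the text
REALSERVERS="10.0.0.11 10.0.0.12"   # hypothetical realservers (DR mode)

fwmark_cmds() {
    # Tag HTTP and HTTPS traffic to the VIP with the same mark.
    for port in 80 443; do
        echo "iptables -t mangle -A PREROUTING -d ${VIP} -p tcp --dport ${port} -j MARK --set-mark ${MARK}"
    done
    # One virtual service keyed on the mark, persistent for 600 seconds.
    echo "ipvsadm -A -f ${MARK} -s wlc -p 600"
    # Realservers are attached to the mark rather than to VIP:PORT.
    for rip in $REALSERVERS; do
        echo "ipvsadm -a -f ${MARK} -r ${rip} -g"
    done
}

fwmark_cmds
```

Because both ports share one persistent virtual service, a client that starts on port 80 and then switches to 443 should stay on the same realserver for the duration of the timeout.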





Open questions:
1. It is said that, by default, Linux builds a response packet using the IP address of the interface that sends the packet, so a route is configured:
route add -host $VIP dev lo:0
But I can't figure out what this route does. Does it make the response packet go out through the lo interface? In practice, things work without this route.

2. The ARP page on the official website (http://www.linuxvirtualserver.org/docs/arp.html) describes how to solve the ARP problem in the DR/TUN models. But I would think that, in the TUN model, the realservers and the director are generally on different networks, so there is no ARP problem. Even if multiple realservers are in the same network, client request packets are forwarded through the IP tunnel, so the realservers do not compete for them. Or perhaps, if a realserver detects that the same IP address is configured on other realservers, the IP address cannot be used normally?

References:
http://www.linuxvirtualserver.org/zh/lvs1.html
http://kb.linuxvirtualserver.org/wiki/Main_Page
http://kb.linuxvirtualserver.org/wiki/ARP_Issues_in_LVS/DR_and_LVS/TUN_Clusters
http://mageedu.blog.51cto.com/
http://w.gdu.me/wiki/sysadmin/lvs_persistence.html
