Clusters and high availability

Last Update:2017-10-20 Source: Internet

Author: User


1. Cluster:

As the Internet has grown, servers face ever larger volumes of client requests and their load keeps rising. A single server can only carry so much load; beyond that point it takes longer and longer to respond to clients and may even suffer a denial of service (DoS). In addition, most websites now provide uninterrupted service 24 hours a day, 7 days a week. If a single server provides that service, a single point of failure takes the whole service offline. The answer is to deploy a cluster architecture, in which hundreds of hosts are combined to carry the massive access load of the big data era. Many products are available for building a cluster, some implemented in hardware and some in software. Hardware load balancers include F5 BIG-IP, Radware AppDirector, and Barracuda load balancers. Software options include the Linux-based LVS, Nginx, HAProxy, and others. The core functions of cluster software are load balancing and high availability.

2. Introduction to LVS load balancing:

LVS stands for Linux Virtual Server. It is now integrated into the Linux kernel as a module. The project implements IP-based load-balancing scheduling of requests inside the Linux kernel. Internet users access the company's public-facing load balancer from outside, so their web requests arrive at the LVS scheduler. The scheduler decides which backend web server receives each request according to a preset algorithm. For example, the round-robin algorithm distributes external requests evenly across all backend servers. Although requests sent to the LVS scheduler are forwarded to real backend servers, the real servers provide identical services as long as they share the same storage. Whichever real server a user reaches, the content is the same, and the whole cluster is transparent to the user. Finally, depending on the LVS working mode, the real server returns the requested data to the user in different ways. LVS has three working modes: NAT mode, TUN mode, and DR mode.

1> NAT-based LVS load balancing:

NAT stands for network address translation. It rewrites packet headers so that hosts with private IP addresses inside the enterprise can access the Internet and external users can reach hosts with private IP addresses inside the company. In VS/NAT mode, the LVS load balancer uses two NICs configured with different IP addresses. eth0 is given a private IP address and connects to the internal network through a switch; eth1 is given a public IP address and connects to the external network.

Step 1: The user resolves the public IP address of the load balancer through an Internet DNS server. In contrast to the addresses of the real servers, this public IP address of LVS is called the virtual IP address (VIP). By accessing the VIP, users reach the real backend servers, and the whole process is transparent to them. Users believe they are accessing the real server; they do not know that the VIP is only a scheduler, nor where the real backend servers are or how many there are.

Step 2: The user sends a request to the VIP. LVS selects a real backend server according to the preset algorithm and forwards the request packet to it. Before forwarding, LVS rewrites the packet's destination address and destination port to the IP address and port of the selected real server.

Step 3: The real server returns the response packet to the LVS scheduler. The scheduler rewrites the source address and source port to the VIP and the corresponding port on the scheduler, then sends the response to the user. In addition, the LVS scheduler keeps a connection hash table that records connections and their forwarding information. When the next packet of the same connection reaches the scheduler, the earlier record is found directly in the hash table, and the same real server and port are selected from it.
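The three NAT steps can be sketched in a few lines of Python. This is only an illustrative model, not the kernel implementation: the Packet and NatScheduler names are hypothetical, round robin stands in for the "preset algorithm", the client address 203.0.113.7 is made up, and the connection table is keyed by client address and port only, which is a simplification.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: tuple  # (ip, port)
    dst: tuple  # (ip, port)


class NatScheduler:
    """Toy model of an LVS (NAT) scheduler (hypothetical class, for illustration)."""

    def __init__(self, vip, real_servers):
        self.vip = vip
        self._next = cycle(real_servers)  # round robin as the preset algorithm
        self.connections = {}             # client (ip, port) -> real server (ip, port)

    def inbound(self, pkt):
        # Step 2: choose a real server once per connection, then rewrite
        # the destination address and port.
        if pkt.src not in self.connections:
            self.connections[pkt.src] = next(self._next)
        return replace(pkt, dst=self.connections[pkt.src])

    def outbound(self, pkt):
        # Step 3: rewrite the source address and port back to the VIP.
        return replace(pkt, src=self.vip)


vip = ("124.126.147.168", 80)
lvs = NatScheduler(vip, [("192.168.0.1", 80), ("192.168.0.2", 80)])

request = Packet(src=("203.0.113.7", 40000), dst=vip)
forwarded = lvs.inbound(request)
reply = lvs.outbound(Packet(src=forwarded.dst, dst=request.src))
print(forwarded.dst, reply.src)
```

Because the choice is stored in the connection table, a later packet from the same client address and port is sent to the same real server without running the scheduling algorithm again.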

2> TUN-based LVS load balancing:

In an LVS (NAT) cluster, all request and response packets must pass through the LVS scheduler, so once the number of backend servers is greater than 10, the scheduler becomes the bottleneck of the whole cluster. Request packets are usually far smaller than response packets, because the response carries the actual data the client asked for. The idea of LVS (TUN) is therefore to separate requests from responses: the scheduler handles only incoming requests, and the real servers return response packets directly to the client. The IP tunneling used in VS/TUN mode is a packet encapsulation technique. The scheduler takes the client's original packet, whose destination is the scheduler's VIP, wraps it in a new IP header whose destination is the IP address of the real server selected by the scheduler, and forwards it to that server through the tunnel. LVS (TUN) mode requires that the real servers can connect directly to the external network, so that a real server can respond directly to the client after receiving the request packet.
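The encapsulation idea can be illustrated with a small Python model. It is a sketch under simplifying assumptions: IPPacket, encapsulate, and respond_from_real_server are hypothetical names, the addresses other than the VIP are invented, and a real tunnel operates on raw IP packets rather than Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPPacket:
    src: str
    dst: str
    payload: object  # application data, or an inner IPPacket when tunneled


def encapsulate(original, scheduler_ip, real_server_ip):
    # The scheduler leaves the client's packet untouched and wraps it in a
    # new packet addressed to the selected real server.
    return IPPacket(src=scheduler_ip, dst=real_server_ip, payload=original)


def respond_from_real_server(tunneled, vip):
    # The real server unwraps the packet, sees that it is addressed to the
    # VIP, and answers the client directly, bypassing the scheduler.
    original = tunneled.payload
    if original.dst != vip:
        raise ValueError("inner packet is not addressed to the VIP")
    return IPPacket(src=vip, dst=original.src, payload="response data")


VIP = "124.126.147.168"
request = IPPacket(src="203.0.113.7", dst=VIP, payload="GET /")
tunneled = encapsulate(request, scheduler_ip="192.168.0.254",
                       real_server_ip="192.168.0.1")
response = respond_from_real_server(tunneled, VIP)
print(tunneled.dst, response.src, response.dst)
```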

3> DR-based LVS load balancing:

In LVS (TUN) mode, a tunnel must be set up between the LVS scheduler and each real server, which also adds load on the servers. DR mode, also called direct routing mode, resembles LVS (TUN): LVS still handles only inbound requests and selects a suitable real server according to the algorithm, and the real backend server sends the response packet to the client itself. Unlike tunnel mode, direct routing mode requires the scheduler and the backend servers to be on the same LAN, and the VIP must be shared by the scheduler and all backend servers. This is because the real server sets the source address to the VIP and the destination address to the client IP when it replies. The client accesses the scheduler's VIP, the source address of the response is still the VIP (the one configured on the real server), and the client never notices the backend servers. Because multiple hosts hold the same VIP, the scheduler's VIP must be the only one visible to the outside, so that clients send their request packets to the scheduler. On every real server the VIP must be configured on a non-ARP network device, that is, a device that does not announce its MAC address for that IP address. The VIP on the real servers is then invisible to the outside world, yet the real servers still accept requests whose destination address is the VIP and use the VIP as the source address of their responses. After selecting a real server, the scheduler leaves the IP packet unmodified and only rewrites the destination MAC address of the data frame to that of the selected server; the switch then delivers the frame to the real server. Throughout this process, the VIP on the real servers never needs to be visible to the outside world.
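The difference from NAT mode is that only the frame's destination MAC address changes. A minimal sketch, assuming a hypothetical Frame type and made-up MAC and client addresses:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_ip: str
    dst_ip: str


def dr_forward(frame, real_server_mac):
    # Direct routing: the IP header stays as the client sent it; only the
    # destination MAC address is replaced.
    return replace(frame, dst_mac=real_server_mac)


incoming = Frame(dst_mac="00:00:5e:00:53:01",   # scheduler's MAC (made up)
                 src_ip="203.0.113.7",          # client (made up)
                 dst_ip="124.126.147.168")      # VIP
forwarded = dr_forward(incoming, real_server_mac="00:00:5e:00:53:11")
print(forwarded)
```

Since the destination IP address is still the VIP when the frame arrives, the real server only accepts it if the VIP is configured locally, which is why every real server needs the VIP on a non-ARP device.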

Note: because the VIP is configured on both the scheduler and the real servers, every real server must be prevented from answering ARP requests for the VIP. This is done with arp_ignore and arp_announce:

vim /etc/sysctl.conf

net.ipv4.conf.eth0.arp_ignore = 1

net.ipv4.conf.eth0.arp_announce = 2

net.ipv4.conf.all.arp_ignore = 1

net.ipv4.conf.all.arp_announce = 2

sysctl -p

3. LVS scheduling algorithms:

The previous section introduced the three LVS working modes. Whichever mode is used in practice, the scheduler's scheduling policies and algorithms are the core technology of LVS. LVS implements the following eight scheduling algorithms in the kernel:

Round Robin Scheduling;

Weighted Round Robin Scheduling;

Least-Connection Scheduling;

Weighted Least-Connection Scheduling;

Locality-Based Least-Connection Scheduling;

Locality-Based Least-Connection with Replication Scheduling;

Destination Hashing Scheduling;

Source Hashing Scheduling;

Round-robin algorithm (RR): requests are scheduled to the servers one after another in a repeating cycle. The main advantage of this algorithm is its simplicity. It assumes that all servers have the same processing capacity, and the scheduler distributes requests evenly across the real servers.
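As a rough illustration of round robin (not the kernel code), the cycle can be modeled with itertools.cycle; the server list is taken from the example addresses used later in this article.

```python
from itertools import cycle

servers = ["192.168.0.1", "192.168.0.2", "192.168.0.3"]
next_server = cycle(servers)  # endless sequential loop over the servers

# Six consecutive requests visit every server exactly twice.
picks = [next(next_server) for _ in range(6)]
print(picks)
```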

Weighted round-robin algorithm (WRR): an optimization of and supplement to round robin. LVS takes the performance of each server into account and assigns each one a weight. If server A has weight 1 and server B has weight 2, the scheduler sends server B twice as many requests as server A. The higher a server's weight, the more requests it handles.
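One way to obtain this 1:2 split is a threshold that steps down by the greatest common divisor of the weights, so that heavier servers are picked in more rounds. The generator below is a sketch of that idea; whether the kernel produces exactly the same ordering is an assumption, and the function name is hypothetical.

```python
from collections import Counter
from functools import reduce
from itertools import islice
from math import gcd


def weighted_round_robin(weights):
    """Yield server names in proportion to their integer weights."""
    if not weights or max(weights.values()) == 0:
        return
    names = list(weights)
    step = reduce(gcd, weights.values())  # how far the threshold drops per round
    top = max(weights.values())
    i, threshold = -1, 0
    while True:
        i = (i + 1) % len(names)
        if i == 0:
            # Start of a new round: lower the threshold, wrapping to the top.
            threshold -= step
            if threshold <= 0:
                threshold = top
        if weights[names[i]] >= threshold:
            yield names[i]


picks = list(islice(weighted_round_robin({"A": 1, "B": 2}), 6))
print(Counter(picks))
```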

Least-connection algorithm (LC): schedules requests to the server with the fewest connections. The weighted least-connection algorithm (WLC) additionally assigns each server a weight, and the scheduler tries to keep each server's number of connections in proportion to its weight.
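Both rules reduce to picking a minimum. In this sketch the connection counts and weights are invented sample values, and comparing connections divided by weight is one plausible reading of "in proportion to its weight", not necessarily the kernel's exact formula.

```python
def least_connection(active):
    # LC: the server with the fewest active connections wins.
    return min(active, key=active.get)


def weighted_least_connection(active, weights):
    # WLC: compare connections relative to weight; servers with weight 0
    # are skipped.
    candidates = [s for s in active if weights[s] > 0]
    return min(candidates, key=lambda s: active[s] / weights[s])


active = {"A": 10, "B": 15}   # current connections per server (sample values)
weights = {"A": 1, "B": 2}    # B is assumed to be twice as capable

print(least_connection(active), weighted_least_connection(active, weights))
```

With these numbers LC prefers A because it has fewer connections, while WLC prefers B because 15 connections at weight 2 is a lighter relative load than 10 at weight 1.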

Locality-based least-connection algorithm (LBLC): schedules according to the destination IP address of the request packet. The algorithm first looks up the server most recently used for the request's destination IP address. If that server is still available and able to handle the request, the scheduler selects it again; otherwise it selects another suitable server. The locality-based least-connection algorithm with replication (LBLCR) does not record a mapping from a destination IP address to a single server. It maintains a mapping from a destination IP address to a group of servers, which prevents a high load on any single server.
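The LBLC lookup can be sketched as follows. The function name and the limit parameter are hypothetical: the article only says the server must be "able to handle the request", so a fixed connection limit is used here as a stand-in for that condition.

```python
def lblc_pick(dest_ip, last_server, connections, limit):
    """Prefer the server last used for dest_ip, else fall back to least connections."""

    def usable(server):
        return server in connections and connections[server] < limit

    server = last_server.get(dest_ip)
    if server is None or not usable(server):
        candidates = [s for s in connections if usable(s)]
        if not candidates:
            return None
        server = min(candidates, key=connections.get)
        last_server[dest_ip] = server  # remember the choice for this destination
    return server


last_server = {}
connections = {"rs1": 3, "rs2": 1}  # available servers and their connection counts

first = lblc_pick("10.0.0.5", last_server, connections, limit=5)
connections["rs2"] = 4   # busier than rs1 now, but still under the limit
second = lblc_pick("10.0.0.5", last_server, connections, limit=5)
connections["rs2"] = 5   # limit reached, so another server is chosen
third = lblc_pick("10.0.0.5", last_server, connections, limit=5)
print(first, second, third)
```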

Destination hashing algorithm (DH): maps the destination IP address to a server by applying a hash function to the destination IP address. If that server is available and not overloaded, requests for the destination IP address are sent to it.

Source hashing algorithm (SH): works like destination hashing, but hashes the source address, so that each source is statically assigned to a fixed server.
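Both hashing algorithms share one mechanism and differ only in which address is hashed. In the sketch below, taking the address as an integer modulo the number of servers is a stand-in for the kernel's actual hash function, and the server names are hypothetical.

```python
import ipaddress


def hash_pick(ip, servers):
    # Static mapping: the same address always lands on the same server.
    return servers[int(ipaddress.ip_address(ip)) % len(servers)]


servers = ["rs1", "rs2", "rs3"]

# DH hashes the packet's destination address, SH its source address.
by_destination = hash_pick("10.0.0.5", servers)
by_source = hash_pick("10.0.0.6", servers)
print(by_destination, by_source)
```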

4. Deploy the LVS service:

LVS is now integrated into the Linux kernel as a module, but the LVS environment as a whole is divided into a kernel layer and a user layer. The kernel layer is responsible for implementing the core algorithms. The user layer requires the ipvsadm tool, whose commands pass the working mode and scheduling algorithm chosen by the administrator to the kernel. The LVS kernel module is named ip_vs. ipvsadm can be installed with YUM, or the source code can be downloaded from the official website and compiled.

1> YUM installation:

YUM installation requires that the local machine can reach a YUM repository from which the RPM package can be downloaded.

yum -y install ipvsadm

2> source code installation:

To install from source, first use YUM to install the required dependency packages. The ipvsadm source code can be downloaded from the official website. After downloading, build and install it with the standard make and make install steps.

# yum -y install gcc popt-devel popt-static libnl libnl-devel

# wget http://www.linuxvirtualserver.org/software/kernel-2.6/ipvsadm-1.26.tar.gz

# tar -xvf ipvsadm-1.26.tar.gz -C /usr/src

# cd /usr/src/ipvsadm-1.26/

# make

# make install

3> related commands:

Whichever installation method is used, it produces a command-line tool named ipvsadm. This command is used to manage and configure LVS virtual services, their real servers, and the scheduling algorithm.

The ipvsadm command is described and used as follows:

Description: Linux virtual server management tool;

Usage: ipvsadm option service-address -s algorithm

ipvsadm option service-address -r real-server-address [working mode] [weight]

Option:

-A: adds a virtual service. A virtual service is uniquely defined by its IP address, port number, and protocol.

-E: edits a virtual service.

-D: deletes a virtual service.

-C: clears the virtual service table.

-R: restores virtual service rules from standard input.

-S: saves the virtual service rules to standard output. The output can be imported again with -R.

-a: adds a real server to a virtual service.

-e: edits a real server in a virtual service.

-d: deletes a real server from a virtual service.

-L: displays the virtual service list.

-t: uses a TCP service. This parameter must be followed by host and port information.

-u: uses a UDP service. This parameter must be followed by host and port information.

-s: specifies the scheduling algorithm used by LVS.

-r: sets the real server IP address and port information.

-g: sets the LVS working mode to DR (direct routing).

-i: sets the LVS working mode to TUN (tunnel mode).

-m: sets the LVS working mode to NAT (address translation).

-w: sets the weight of the specified server.

-c: displays connection status; must be used together with -L.

-n: displays output in numeric format.

Command example:

Add a virtual service and set the scheduling algorithm to round-robin. All requests that use TCP to access port 80 of 124.126.147.168 are ultimately forwarded to port 80 of the three hosts 192.168.0.1, 192.168.0.2, and 192.168.0.3:
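The commands for this step can be assembled from the options listed above. The Python sketch below only builds and prints the command lines, because running ipvsadm requires root privileges and the ip_vs module. The use of -m (NAT mode) is an assumption inferred from the IP forwarding step that follows; the description itself does not name a working mode.

```python
import shlex

VIP = "124.126.147.168:80"
REAL_SERVERS = ["192.168.0.1:80", "192.168.0.2:80", "192.168.0.3:80"]

# -A creates the TCP virtual service with the round-robin (rr) scheduler.
commands = [["ipvsadm", "-A", "-t", VIP, "-s", "rr"]]

# -a attaches each real server; -m (NAT) is an assumed working mode.
commands += [["ipvsadm", "-a", "-t", VIP, "-r", rs, "-m"] for rs in REAL_SERVERS]

for argv in commands:
    print(shlex.join(argv))
```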

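Save the scheduling rules:

[root@hadoop-master ~]# service ipvsadm save

Enable IP forwarding in /etc/sysctl.conf, then reload sysctl.conf:

[root@hadoop-master ~]# sed -i '/ip_forward/s/0/1/' /etc/sysctl.conf

[root@hadoop-master ~]# sysctl -p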

View the LVS rule table:

View the scheduling status of the current IPVS:

Delete the real server 192.168.0.3, which provides web service for the virtual service:

[root@hadoop-master ~]# ipvsadm -d -t 124.126.147.168:80 -r 192.168.0.3

Backup and restoration of the virtual service rule table:

Modify the scheduling algorithm of a virtual service:

[root@hadoop-master ~]# ipvsadm -E -t 124.126.147.168:80 -s wrr

Create a virtual service using the WRR algorithm. The working mode is direct routing (DR). Add two real servers to the virtual service and set the weights for each Real Server:
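A possible set of commands for this example, again built from the options described earlier and only printed rather than executed. The VIP 192.168.0.100, the two real server addresses, and the weights 1 and 2 are all hypothetical values chosen for illustration; the article does not specify them.

```python
import shlex

VIP = "192.168.0.100:80"                              # hypothetical VIP
WEIGHTS = {"192.168.0.1:80": 1, "192.168.0.2:80": 2}  # hypothetical weights


def dr_wrr_commands(vip, weights):
    # -s wrr selects weighted round robin for the virtual service.
    cmds = [["ipvsadm", "-A", "-t", vip, "-s", "wrr"]]
    for real_server, weight in weights.items():
        # -g selects direct routing (DR); -w sets the server's weight.
        cmds.append(["ipvsadm", "-a", "-t", vip, "-r", real_server,
                     "-g", "-w", str(weight)])
    return cmds


lines = [shlex.join(cmd) for cmd in dr_wrr_commands(VIP, WEIGHTS)]
print("\n".join(lines))
```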

5. FAQs:

1> route forwarding:

In LVS (NAT) mode, the LVS scheduler must forward packets, but route forwarding is disabled by default. It has to be enabled manually in the /etc/sysctl.conf file.

2> In LVS (NAT) mode, the scheduler acts as a router in addition to its scheduling role, but the system's default firewall rules block forwarding. The forwarding rules therefore need to be cleared:

iptables -F

iptables -X

service iptables save

3> In LVS (DR) mode, every real server is configured with the VIP, so the real servers must be prevented from answering or announcing the VIP over ARP. On Linux this can be done directly with arp_ignore and arp_announce.

arp_ignore defines how an interface responds to incoming ARP requests:

0: default value. An ARP request received on any interface is answered if the requested IP address is configured on any interface of the local machine;

1: an ARP request is answered only if the requested IP address is configured on the interface that received it. In an LVS (DR) cluster, the scheduler forwards client requests to the eth0 interface of the real server while the VIP of the real server is configured on the loopback device, so the real server does not answer ARP requests for the VIP;

arp_announce defines which source IP address an interface uses in the ARP requests it sends:

0: default value. Any local address, configured on any interface, may be used;

1: tries to avoid source addresses that are not in the target's subnet on the sending interface;

2: always uses the best local address for the target, so an address configured on another interface, such as the VIP, is not announced;
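The effect of arp_ignore on a real server can be modeled as a simple decision function. This is a sketch of levels 0 and 1 only; the function name and the address layout (real IP on eth0, VIP on lo) are illustrative assumptions matching the DR setup described in this article.

```python
def should_reply(arp_ignore, target_ip, incoming_iface, addresses):
    """Decide whether an ARP request for target_ip is answered."""
    if arp_ignore == 0:
        # Reply if any local interface holds the requested address.
        return any(target_ip in ips for ips in addresses.values())
    if arp_ignore == 1:
        # Reply only if the receiving interface itself holds the address.
        return target_ip in addresses.get(incoming_iface, set())
    raise ValueError("only levels 0 and 1 are modeled in this sketch")


VIP = "124.126.147.168"
addresses = {"eth0": {"192.168.0.1"}, "lo": {"127.0.0.1", VIP}}

print(should_reply(0, VIP, "eth0", addresses))            # answers for the VIP
print(should_reply(1, VIP, "eth0", addresses))            # stays silent
print(should_reply(1, "192.168.0.1", "eth0", addresses))  # still reachable
```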

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
