Cluster and LVS basics: a summary

Last Update:2015-05-07 Source: Internet

Author: User

Tags app service


Contents

1. Introduction to common cluster environments

2. Introduction to the three LVS types

3. Introduction to the LVS scheduling algorithms

1. Introduction to common cluster environments

Before discussing LVS, a few words about clusters. Depending on the scenario, IT staff may want servers to run as long as possible without interruption, ideally 24 hours a day, 365 days a year. They may want applications to run faster, and some numerical fields need to perform large-scale computation. All of these involve computer clusters.

The three most common cluster types are load-balancing clusters (LB: Load Balance), high-availability clusters (HA: High Availability), and high-performance clusters (HP: High Performance).

LB clusters are implemented mainly at two levels, the application layer and the transport layer, in both hardware and open source software. Hardware LB devices include F5 BIG-IP, Citrix NetScaler, A10, Array, and Radware. Open source implementations include LVS (works at layer 4), HAProxy (supports both layer 4 and layer 7, but is mainly used at layer 7), and Nginx (application layer).

Open source solutions for HA clusters include Heartbeat, Keepalived, Corosync + Pacemaker, cman + rgmanager, and cman + Pacemaker.

HP clusters are used in specialized scenarios and are not discussed here.

2. Introduction to the three LVS types

First, some LVS basics:

LVS stands for Linux Virtual Server. It works at layer 4 of the seven-layer OSI model, so it is a layer-4 switching technology: code running in the kernel that forwards data based on the destination address and destination port, built on the Netfilter framework. LVS consists of two parts. The part that works in the kernel is called IPVS; the part that works in user space is called ipvsadm, the tool used to write scheduling rules.

LVS architectures fall broadly into three types: VS/NAT, VS/DR, and VS/TUN. Each is described below.
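To make the role of the user-space tool concrete, the following Python sketch only builds the ipvsadm command lines for one TCP cluster service; it does not run them. The helper name `build_ipvsadm_commands`, the addresses, and the weights are hypothetical values chosen for illustration. The flags are the usual ipvsadm ones: `-A` adds a service, `-a` adds a real server, `-s` selects the scheduler, and `-m`, `-g`, `-i` select NAT, direct routing, and tunneling respectively.

```python
# Forwarding-method flags of ipvsadm: -m (NAT), -g (direct routing), -i (IP tunnel).
FORWARD_FLAGS = {"nat": "-m", "dr": "-g", "tun": "-i"}


def build_ipvsadm_commands(vip, port, scheduler, mode, real_servers):
    """Return the ipvsadm command lines for one TCP cluster service.

    real_servers is a list of (rip, weight) pairs. All values are examples.
    """
    service = f"{vip}:{port}"
    # One command defines the cluster service and its scheduling algorithm.
    commands = [f"ipvsadm -A -t {service} -s {scheduler}"]
    # One command per real server attaches it with a forwarding method and weight.
    for rip, weight in real_servers:
        flag = FORWARD_FLAGS[mode]
        commands.append(f"ipvsadm -a -t {service} -r {rip}:{port} {flag} -w {weight}")
    return commands


commands = build_ipvsadm_commands(
    "192.0.2.10", 80, "wlc", "nat",
    [("10.0.0.11", 1), ("10.0.0.12", 3)],
)
for line in commands:
    print(line)
```

Switching `mode` to `"dr"` or `"tun"` changes only the forwarding flag, which mirrors how the three types differ in how the Director hands packets to the real servers rather than in how services are defined.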

2.1. VS/NAT (Virtual Server via Network Address Translation)

The network topology of this type is as follows:

[Figure: VS/NAT topology. Image: http://s3.51cto.com/wyfs02/M00/6C/7F/wKiom1VKwyXxPViZAAMtOT1Qf5w060.jpg]

Terms:

Director is the scheduler, here the LVS server;

Real server is a back-end server behind the scheduler that actually provides the application service;

CIP is the IP address of the client accessing the service;

VIP is the IP address on which the Director offers the service to the outside;

DIP is the Director's IP address on the network it shares with the back-end servers;

RIP is the IP address of a back-end server.

The approximate scheduling process of the VS/NAT type:

First, the client sends a service request to the Director's VIP. On the INPUT chain of the Netfilter framework, IPVS recognizes that the client is requesting a cluster service it has defined, and the Director picks a real server from its rules using a static or dynamic algorithm. The Director then applies DNAT to the request: it rewrites the destination address to the IP address of the selected real server and forwards the message to that real server.

When the real server receives the request from the Director, it examines it to see which resource was requested, prepares the resource, and encapsulates it in a response message sent to the Director. The Director applies SNAT, rewriting the source address to its own VIP, and sends the message to the client. From the client's point of view, it accessed the Director directly and the Director responded. The back-end real servers are invisible to the client, that is, transparent to the user.
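The two address rewrites in the process above can be sketched in a few lines of Python. This is only a model of the header changes, not how IPVS is implemented; the `Packet` class, the function names, and all addresses are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str  # "ip:port" of the sender
    dst: str  # "ip:port" of the receiver


VIP = "192.0.2.10:80"                               # example service address
REAL_SERVERS = ["10.0.0.11:8080", "10.0.0.12:8080"]  # example RIPs (private)


def dnat_request(packet, rip):
    """Director, inbound: rewrite the destination from the VIP to the chosen RIP."""
    return replace(packet, dst=rip)


def snat_response(packet):
    """Director, outbound: rewrite the source from the RIP back to the VIP."""
    return replace(packet, src=VIP)


request = Packet(src="203.0.113.5:40000", dst=VIP)    # client -> VIP
forwarded = dnat_request(request, REAL_SERVERS[0])    # Director -> real server
reply = Packet(src=forwarded.dst, dst=forwarded.src)  # real server -> Director (its gateway)
returned = snat_response(reply)                       # Director -> client

print(forwarded)
print(returned)
```

Because the Director rewrites the full address, the example can map port 80 on the VIP to port 8080 on the real server, which is the port-mapping property of VS/NAT.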

The working characteristics of the VS/NAT type are summarized as follows:

a, the real servers use private addresses, and each real server's gateway must point to the DIP;

b, both request and response messages pass through the Director, which puts heavy load on it; under high load the Director easily becomes a performance bottleneck;

c, the external service port and the real server's service port can differ, that is, port mapping is supported;

d, the real server's operating system needs no modification, so any operating system can be used.

2.2. VS/DR (Virtual Server via Direct Routing)

This type has roughly two topologies. The first:

[Figure: VS/DR topology, first variant. Image: http://s3.51cto.com/wyfs02/M00/6C/7B/wKioL1VKxLzhti1oAAMtXaQFAfY315.jpg]

Note: In this VS/DR architecture the Director connects to the network with a single network card. Both the VIP and the DIP are configured on that card and are distinguished by using sub-interfaces.

The second topology:

[Figure: VS/DR topology, second variant. Image: http://s3.51cto.com/wyfs02/M02/6C/7B/wKioL1VKxO_hmbShAAPob7GB8nM905.jpg]

The approximate scheduling process of the VS/DR type:

After the client's request arrives at the Director's VIP, the Director recognizes that it targets a cluster service and selects a real server using a static or dynamic algorithm. Unlike VS/NAT, the Director does not modify the destination IP in the IP header. Instead it rewrites the destination MAC address to the MAC address of the selected real server's RIP interface. Because the DIP and the RIP are on the same layer-2 network, the MAC address alone is enough for addressing, so the modified message reaches the real server.

The real server sees that the destination MAC is indeed the MAC of its RIP interface, so it processes the message. When the response is ready, the real server forces it out through a sub-interface such as "lo:0", which is configured with the VIP address. Every real server is therefore configured with both its RIP and the VIP. To keep the VIP from causing address conflicts, each real server must use some mechanism so that the VIP on "lo:0" is not known to other devices on the network. The address is used only as the source address of response messages.

This lets the response go directly from the real server to the client without being forwarded again by the Director. Incoming messages pass through the Director, response messages are sent straight to the client, and the client sees a reply that appears to come from the VIP (it actually comes from the real server). The VS/DR model relieves the Director: it handles only inbound messages, so it is under less pressure, can drive more real servers, and can handle more concurrency. The trade-off is that configuring the back-end real servers is cumbersome.
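A minimal Python model of the process above shows what changes and what stays the same in VS/DR. The `Frame` class, the function names, and all MAC and IP addresses are hypothetical; only the destination MAC is modeled, since that is the field the text says the Director rewrites.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str  # layer-2 destination
    src_ip: str   # layer-3 source
    dst_ip: str   # layer-3 destination


VIP = "192.0.2.10"                 # example addresses, not from the article
CIP = "203.0.113.5"
DIRECTOR_MAC = "02:00:00:00:00:01"
RS_MAC = "02:00:00:00:00:11"
GATEWAY_MAC = "02:00:00:00:00:fe"  # the real server's own gateway, not the Director


def director_forward(frame, rs_mac):
    """Director: leave the IP header untouched, rewrite only the destination MAC."""
    return replace(frame, dst_mac=rs_mac)


def real_server_reply(frame, gateway_mac):
    """Real server: answer with the VIP as source and send via its own gateway."""
    return Frame(dst_mac=gateway_mac, src_ip=frame.dst_ip, dst_ip=frame.src_ip)


request = Frame(dst_mac=DIRECTOR_MAC, src_ip=CIP, dst_ip=VIP)
forwarded = director_forward(request, RS_MAC)
reply = real_server_reply(forwarded, GATEWAY_MAC)

print(forwarded)
print(reply)
```

The forwarded frame still carries the VIP as its destination IP, which is why the real server needs the VIP on a local interface, and the reply never touches the Director's MAC address.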

The working characteristics of the VS/DR type are summarized as follows:

a, each real server must configure a sub-interface on its lo loopback interface whose IP is the VIP, and must ensure that this IP does not respond to any external ARP request;

b, the system must ensure that the front-end router sends messages destined for the VIP to the VIP on the Director, and not to the VIP on a real server;

Solutions: Option one, bind the address statically on the front-end router; however, the router often belongs to the carrier, so you have no management rights over it. Option two, use arptables on the real server so that it does not answer ARP requests for the address on the lo interface and does not announce that address to the network. Option three, modify kernel parameters on the real server so that the system neither answers ARP requests for the address on the lo interface nor announces it; this is easy to do on kernels 2.6 and later.

c, the real servers can use private addresses or public addresses;

d, the real servers and the Director must be on the same physical network. When the Director forwards a message to a real server it leaves the destination IP unchanged (still the VIP) and rewrites only the destination MAC to the MAC address of the selected real server's RIP interface. MAC-based communication happens at layer 2 and cannot cross network segments, so a VS/DR system can only be deployed within a single machine room;

e, request messages pass through the Director, but response messages must not pass through it; port mapping is not supported;

f, the real server can run most common operating systems, as long as they support suppressing ARP replies and announcements for the VIP;

g, the real server's gateway must never point to the DIP.
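For the kernel-parameter approach to hiding the VIP from ARP, the settings commonly used on Linux real servers are `arp_ignore` and `arp_announce`. The sketch below only renders the lines one might place in a sysctl configuration file; the helper name `arp_sysctl_lines` is hypothetical, and the values 1 and 2 are the ones usually recommended for this setup rather than something stated in this article.

```python
def arp_sysctl_lines(interfaces=("all", "lo")):
    """Render sysctl settings that keep a VIP on lo out of ARP traffic.

    arp_ignore = 1: answer an ARP request only if the target IP is configured
    on the interface the request arrived on (so the VIP on lo stays silent).
    arp_announce = 2: always use the best local address for the target as the
    ARP source, so the VIP is not announced from the real server.
    """
    lines = []
    for name in interfaces:
        lines.append(f"net.ipv4.conf.{name}.arp_ignore = 1")
        lines.append(f"net.ipv4.conf.{name}.arp_announce = 2")
    return lines


settings = arp_sysctl_lines()
for line in settings:
    print(line)
```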

2.3. VS/TUN (Virtual Server via IP Tunneling)

The approximate topology is as follows:

[Figure: VS/TUN topology. Image: http://s3.51cto.com/wyfs02/M02/6C/80/wKiom1VKxLaBJ5avAAUwgfl0TZA654.jpg]

Clusters based on the VS/TUN type are used mainly to build cache server clusters. The scheduler and the real servers do not have to be on the same physical network, so they can be deployed across regions and across machine rooms.

The approximate scheduling process of the VS/TUN type:

The VS/TUN type uses IP tunneling, which encapsulates one IP packet inside another so that a packet addressed to one IP can be wrapped and forwarded to a different IP address. After the request arrives at the Director, the Director selects a real server at the far end of a tunnel using a static or dynamic algorithm. The original IP packet, whose source address is the CIP and whose destination address is the VIP, is encapsulated in another IP packet and sent through the tunnel to the real server. The real server unwraps the packet, sees that the destination address is the VIP, and, because its own "lo:0" interface is also configured with the VIP, processes the request normally. Once the response is ready, it can be sent directly to the CIP.
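The encapsulation step can be sketched as one packet object carried as the payload of another. This is an illustrative model, not the kernel's IP-in-IP code; the `IPPacket` class, the function names, and the addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPPacket:
    src: str
    dst: str
    payload: object  # application data, or another IPPacket when tunneled


VIP = "192.0.2.10"     # example public addresses, not from the article
CIP = "203.0.113.5"
DIP = "198.51.100.1"
RIP = "198.51.100.77"


def encapsulate(inner, dip, rip):
    """Director: wrap the untouched client packet in an outer DIP -> RIP packet."""
    return IPPacket(src=dip, dst=rip, payload=inner)


def decapsulate(outer):
    """Real server: strip the outer header and recover the original packet."""
    return outer.payload


request = IPPacket(src=CIP, dst=VIP, payload="GET /")
tunneled = encapsulate(request, DIP, RIP)
received = decapsulate(tunneled)

# The real server owns the VIP locally, so it replies straight to the client.
reply = IPPacket(src=received.dst, dst=received.src, payload="200 OK")

print(tunneled)
print(reply)
```

Since the outer packet is routed like any other IP packet between DIP and RIP, the Director and the real server no longer need to share a layer-2 network.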

The working characteristics of the VS/TUN type:

a, the RIP, DIP, and VIP all use public network addresses;

c, request messages pass through the Director, but response messages do not; port mapping is not supported;

d, the real server's operating system must support tunneling.

3. Introduction to the LVS scheduling algorithms

3.1. Static scheduling algorithms

Static scheduling algorithms dispatch based only on the scheduling algorithm itself, without looking at server load.

a), RR: Round Robin. This algorithm is fair at the starting point, but over time the real servers can end up with unbalanced loads;

b), WRR: Weighted Round Robin, for scenarios where the real servers differ in hardware performance;

c), SH: Source Hashing. Requests from the same CIP are always directed to the same real server. It is used for session persistence, but it works against the original intent of load balancing;

d), DH: Destination Hashing. Requests for the same destination address are always directed to the same real server. A typical scenario is an internal network that reaches the external network through two exits (firewalls).
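The static algorithms above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the weighted variant simply repeats each server according to its weight rather than reproducing the interleaving IPVS uses, and the hash uses CRC32 as a stand-in for the kernel's hash function. Server names and addresses are hypothetical.

```python
import itertools
import zlib

SERVERS = ["rs1", "rs2", "rs3"]  # example real servers


def round_robin(servers):
    """RR: hand out servers in a fixed rotation, ignoring load."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """WRR (naive): a server with weight w appears w times per cycle."""
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def source_hash(cip, servers):
    """SH: the same client IP always maps to the same server."""
    return servers[zlib.crc32(cip.encode()) % len(servers)]


rr = round_robin(SERVERS)
rr_picks = [next(rr) for _ in range(4)]

wrr = weighted_round_robin({"rs1": 3, "rs2": 1})
wrr_picks = [next(wrr) for _ in range(8)]

print(rr_picks)
print(wrr_picks)
print(source_hash("203.0.113.5", SERVERS))
```

Destination hashing works the same way as `source_hash` with the destination address as the key. None of these functions looks at connection counts, which is what distinguishes them from the dynamic algorithms.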

3.2. Dynamic scheduling algorithms

Dynamic scheduling algorithms dispatch based on both the algorithm and the current load of each real server.

e), LC: Least Connection. The LC algorithm schedules a new connection to the real server with the fewest current connections;

Overhead (load) = active * 256 + inactive; the smaller the value, the sooner the server is scheduled.

f), WLC: Weighted Least Connection. This algorithm supplements LC: each real server has a weight that represents its processing capacity, and the Director tries to dispatch connections in proportion to those weights. This is the default algorithm of IPVS;

Overhead (load) = (active * 256 + inactive) / weight. This algorithm has a side effect. In the initial state of the cluster no real server has an active connection, so if a low-weight real server is listed first in the Director's scheduling rules, the Director dispatches the first request to that low-weight server, whereas in practice we want it to go to the real server with the greatest processing capacity;

g), SED: Shortest Expected Delay. In the initial state the Director selects the real server with the highest weight, even if a low-weight real server is listed first in the Director's rules. When the weights differ greatly, however, this algorithm can leave the low-weight real servers without any connection requests for some time.

Overhead = (active + 1) * 256 / weight

h), NQ: Never Queue. In the initial state the Director assigns one connection to each real server in order of weight, so that every real server receives a connection request as soon as possible. This solves the problem in SED where a real server may not be assigned any connection request for some time;

i), LBLC: Locality-Based Least Connection. This algorithm load-balances on the destination IP address of the request message and is used mainly in cache cluster systems.

j), LBLCR: Locality-Based Least Connection with Replication. This algorithm also load-balances on the destination IP address of the request message and is used mainly in cache cluster systems.
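The overhead formulas for LC, WLC, and SED can be compared directly in a small Python sketch. The server names, weights, and connection counts are hypothetical, and `nq_pick` reflects one common reading of NQ (send to an idle server if there is one, otherwise fall back to SED), which is an assumption rather than something spelled out above.

```python
def lc_overhead(active, inactive):
    return active * 256 + inactive


def wlc_overhead(active, inactive, weight):
    return (active * 256 + inactive) / weight


def sed_overhead(active, weight):
    return (active + 1) * 256 / weight


def pick(servers, overhead):
    """Return the name of the first server with the smallest overhead."""
    return min(servers, key=overhead)["name"]


def nq_pick(servers):
    """NQ (assumed behavior): prefer any idle server, otherwise use SED."""
    idle = [s for s in servers if s["active"] == 0]
    if idle:
        return idle[0]["name"]
    return pick(servers, lambda s: sed_overhead(s["active"], s["weight"]))


# Initial state: no connections yet, the low-weight server is listed first.
servers = [
    {"name": "rs-small", "weight": 1, "active": 0, "inactive": 0},
    {"name": "rs-big", "weight": 4, "active": 0, "inactive": 0},
]
wlc_choice = pick(servers, lambda s: wlc_overhead(s["active"], s["inactive"], s["weight"]))
sed_choice = pick(servers, lambda s: sed_overhead(s["active"], s["weight"]))

# Later state: rs-big already has two active connections, rs-small is still idle.
busy = [
    {"name": "rs-small", "weight": 1, "active": 0, "inactive": 0},
    {"name": "rs-big", "weight": 4, "active": 2, "inactive": 0},
]
sed_busy_choice = pick(busy, lambda s: sed_overhead(s["active"], s["weight"]))
nq_busy_choice = nq_pick(busy)

print(wlc_choice, sed_choice, sed_busy_choice, nq_busy_choice)
```

In the initial state every WLC overhead is 0, so the first listed server wins, while SED's `active + 1` term gives the weight-4 server an overhead of 64 against 256. In the later state SED still prefers the heavier server (192 against 256), and the NQ rule is what finally sends a connection to the idle low-weight server.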

This article is from the "focus on operations, and Linux Dances" blog. Please keep this source attribution: http://zhaochj.blog.51cto.com/368705/1643712
