LVS: Load Balancing (LB) in Linux Clustering Technology


Date: 2018.3.1
Author: Li Qiang
Reference: man, info, magedu handouts, the Internet
Lab environment: VMware Workstation Pro, CentOS 6.9, CentOS 7.4, SecureCRT version 8.1.4
Statement: The English below is my own translation and my English is limited, so corrections are welcome. Everything here is my personal understanding, offered as a reference rather than as a statement of what is right or wrong. Copying will not be pursued. My ability is limited, and I hope not to mislead anyone.
Version: v1-2018.3.11

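1. Cluster-related concepts

Technical concepts for high-capacity websites. For any website, registered users > online users > concurrent users. Once concurrency reaches a certain level, the user experience comes down to how long a page takes to open. When a single server can no longer deliver an acceptable experience, there are two options:

scale up: move to a more powerful machine. This has an upper limit and is expensive.
scale out: add more machines to share the load.

Everything has two sides. Scaling out raises its own questions: how are requests distributed across the servers, and how is data kept in sync between them? Solutions keep being upgraded and changed until they settle into an architecture, and different scenarios and requirements lead to different architectures and solutions. Solving an old problem inevitably creates a new one.

A front-end device, variously called a scheduler or a load balancer, solves the problem of forwarding requests to multiple servers. The LB itself then becomes a SPOF, so the LB needs HA. If one of the servers fails and cannot respond, the LB needs a health check mechanism that removes the failed server from the service pool promptly and adds it back automatically once it recovers.

Problems to solve:
1. How to distribute requests to multiple servers
2. What to do when a server cannot respond
3. HTTP is stateless, so how are sessions preserved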
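    • Cluster classification:

      LB (Load Balancing)
      Weight

      Implementations of an LB cluster
          Hardware:
              F5
              Citrix NetScaler
              A10
              Chinese vendors such as Inspur and Sangfor
          Software:
              lvs: Linux Virtual Server
              nginx: supports layer-4 scheduling
              haproxy: supports layer-4 scheduling
              ats
              perlbal
              pound
          Each product suits different platforms and requirements, but the principle behind them is the same. They differ only in how they process traffic, in features, and in performance. Once the first solution has taught you how to think about the problem, most of the others turn out to be extensions or enhancements of it.
          Session persistence
          Scheduling algorithms
          Health checks

      HA (High Availability)

      SPOF: single point of failure
      heartbeat
      keepalived

      HPC (High-Performance Computing)

Distributed deployment

CDN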

2. LVS

2.1 LVS Introduction
    • Website

      http://www.linuxvirtualserver.org

    • Working principle

      The VS forwards each request to an RS based on the destination IP address, protocol, and port of the request message, and chooses the RS according to the scheduling algorithm.
      VS stands for Virtual Server, RS for Real Server.

      An analogy: LVS is like the manager of a house with a group of staff, some more skilled than others. When a customer arrives with a request, the manager decides who serves them.
      Computers are a product of human thought, and so are social systems, so the same patterns apply to both.

      LVS is similar to iptables.
      In the kernel it is implemented by ipvs.
      The related scheduling policies are configured through ipvsadm.
      Like iptables, ipvs hooks into netfilter, at INPUT. When a packet arrives at INPUT, ipvs intercepts it based on its destination IP address and port and, following the forwarding rules defined with ipvsadm, forwards the request to an RS.

      LVS therefore works at layer 4 of the OSI model.
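The interception step can be pictured as a table lookup. The sketch below is only an illustration in Python, not ipvs code: the `services` table, the addresses, and the function names are all assumptions made up for this example.

```python
from typing import Dict, List, Optional, Tuple

# (protocol, VIP, port) -> pool of real servers. All values are invented.
ServiceKey = Tuple[str, str, int]
services: Dict[ServiceKey, List[str]] = {
    ("tcp", "192.0.2.10", 80): ["10.0.0.11", "10.0.0.12"],
}


def match_service(proto: str, dst_ip: str, dst_port: int) -> Optional[List[str]]:
    """Return the RS pool if the packet targets a defined cluster service."""
    return services.get((proto, dst_ip, dst_port))


def handle_packet(proto: str, dst_ip: str, dst_port: int) -> str:
    pool = match_service(proto, dst_ip, dst_port)
    if pool is None:
        return "local"  # not a cluster service: normal INPUT processing
    return pool[0]      # placeholder; a scheduling algorithm would choose here


print(handle_packet("tcp", "192.0.2.10", 80))
print(handle_packet("tcp", "192.0.2.10", 22))
```

Only the layer-4 tuple is inspected, which is why the decision never depends on anything inside the HTTP request.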

    • Advantages and disadvantages of LVS

      Advantages:
      1. Good performance
      2. Well suited to scenarios with a large number of concurrent connections
      Disadvantages:
      1. Limited functionality: scheduling is layer-4 only
      2. Cannot check the health of the back-end RSs on its own

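2.2 LVS-related concepts
VS: virtual server
RS: real server
CIP: client IP
VIP: virtual server IP
DIP: director IP
RIP: real server IP

LVS: ipvsadm/ipvs

ipvsadm: the user-space command-line tool, a rule manager used to manage cluster services and real servers
ipvs: the framework that works in kernel space on the netfilter INPUT hook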
    • Types of LVS clusters:

      LVS-NAT: modifies the destination IP of the request message (DNAT with multiple target IPs)
      LVS-DR: re-encapsulates the request in a frame with a new destination MAC address
      LVS-TUN: adds a new IP header outside the original request IP message
      LVS-FULLNAT: modifies both the source and destination IP of the request message

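    • 1. LVS-NAT mode

Working principle:

When a packet reaches the VS, the VS uses the VIP and the scheduling algorithm to find which nodes in the service pool provide that service. It then rewrites the packet's destination IP to the RIP, and possibly the destination port as well. The gateway on each RS must point to the DIP, so the response returns through the VS, and it is the VS that finally answers the user.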
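The address rewriting in NAT mode can be sketched as two small functions. This is a toy model, not kernel code: the `Packet` type, the function names, and every address are assumptions for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def nat_request(pkt: Packet, rip: str, rport: int) -> Packet:
    """Director, inbound: destination VIP:port becomes RIP:port."""
    return replace(pkt, dst_ip=rip, dst_port=rport)


def nat_response(pkt: Packet, vip: str, vport: int) -> Packet:
    """Director, outbound: source RIP:port is restored to VIP:port."""
    return replace(pkt, src_ip=vip, src_port=vport)


request = Packet("198.51.100.7", 40000, "192.0.2.10", 80)
to_rs = nat_request(request, "10.0.0.11", 8080)
reply = Packet(to_rs.dst_ip, to_rs.dst_port, to_rs.src_ip, to_rs.src_port)
to_client = nat_response(reply, "192.0.2.10", 80)
print(to_rs)
print(to_client)
```

The RS sees the real CIP as the source, which is why its reply must be routed back through the director for the second rewrite.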

Application Scenarios:

1. This mode requires few changes on the RS.
    • 2. LVS-DR mode

Working principle:

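The key point is that return traffic does not pass through the VS.

The client's packet (source CIP) reaches the gateway, and the gateway looks for the VIP. When the packet arrives at the VS that holds the VIP, the VS picks an RS according to the scheduling algorithm and builds a frame whose destination MAC is the MAC of that RS. The IP header is left untouched, so the frame is not routed at the IP layer and does not go through a gateway. The switch looks up its MAC table and delivers the frame to the RS. The VS and RS must therefore sit behind the same switch, although they do not have to be in the same IP network.

The RS receives the frame, sees that the destination MAC is its own and that the destination IP is the VIP, which is configured on its lo interface, and processes the request. For the response it consults its own routing table: the source IP is the VIP, the source MAC is the RS MAC, the destination IP is the CIP, and the destination MAC is that of the next hop toward the client (the client's own MAC, CMAC, if the client is on the same segment).

In other words, only the VIP on the VS is real and answers ARP externally. The VIP on the RS is a virtual value, used only to accept the packets forwarded by the VS and to route the responses locally. The response does not need to return along the path the request took.

The prerequisite is that the VS and RS are connected to physical devices in the same broadcast domain. Otherwise layer-2 forwarding cannot deliver packets from the VS to the RS. It is the RS that finally answers the user.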
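A toy model of the DR path may make the layer-2 rewrite easier to follow. The `Frame` type, the function names, and all MAC and IP values are invented for this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def dr_forward(frame: Frame, director_mac: str, rs_mac: str) -> Frame:
    """Director: change only the MAC addresses; the IP header stays as it is."""
    return replace(frame, src_mac=director_mac, dst_mac=rs_mac)


def rs_reply(request: Frame, rs_mac: str, next_hop_mac: str) -> Frame:
    """RS: answer directly, with the VIP (the request's destination) as source IP."""
    return Frame(rs_mac, next_hop_mac, request.dst_ip, request.src_ip)


incoming = Frame("gw-mac", "vs-mac", "198.51.100.7", "192.0.2.10")
to_rs = dr_forward(incoming, "vs-mac", "rs1-mac")
reply = rs_reply(to_rs, "rs1-mac", "gw-mac")
print(to_rs)
print(reply)
```

The reply never carries the director's MAC, which matches the statement that responses bypass the VS.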

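Suppressing ARP for the VIP on the RS:

To make the VIP on the RS a virtual VIP that neither answers ARP requests nor announces itself, there are three options:
1. The arptables tool
2. Changing kernel parameters with sysctl
3. Binding a static ARP entry on the router, so that packets for the VIP are sent only to the VS. The kernel parameters on the RS still have to be changed; otherwise a host such as a Windows machine will raise an IP address conflict warning, along with other unnecessary side effects.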

Application Scenarios:

1. More configuration is needed on the servers.
2. The VS and RS cannot span broadcast domains.
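    • 3. LVS-TUN mode

Working principle:

This mode is similar to DR. Instead of rewriting the destination MAC address, the VS adds a new IP header in front of the original IP header (source IP is the DIP, destination IP is the RIP) and sends the packet to the scheduled RS. The RS responds directly to the client.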
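The encapsulation can be sketched as wrapping one packet inside another. The `IPPacket` type, the helper names, and the addresses are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class IPPacket:
    src_ip: str
    dst_ip: str
    payload: Any


def tun_encapsulate(inner: IPPacket, dip: str, rip: str) -> IPPacket:
    """Director: wrap the untouched request in an outer header DIP -> RIP."""
    return IPPacket(dip, rip, inner)


def tun_decapsulate(outer: IPPacket) -> IPPacket:
    """RS: strip the outer header and recover the original CIP -> VIP packet."""
    return outer.payload


original = IPPacket("198.51.100.7", "192.0.2.10", "GET /")
outer = tun_encapsulate(original, "10.0.0.1", "203.0.113.20")
inner = tun_decapsulate(outer)
print(outer.src_ip, outer.dst_ip)
print(inner.src_ip, inner.dst_ip)
```

Because the outer header is an ordinary routable IP header, the RS can sit in a different network from the director.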

Application Scenarios:

1. The VS and RS do not have to be in the same broadcast domain.
2. The VS exposes the real VIP address externally.
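    • 4. LVS-FULLNAT mode

Working principle:

The kernel does not support this type by default. It forwards by rewriting both the source and the destination IP address of the request message: the CIP becomes the DIP and the VIP becomes the RIP.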
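Since both addresses are rewritten, the director has to remember each mapping so it can undo it on the way back. The sketch below models that with a per-port session table. The class, its fields, the port allocation scheme, and the addresses are assumptions, not the actual FULLNAT implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class FullNatDirector:
    def __init__(self, vip: str, dip: str, first_port: int = 20000) -> None:
        self.vip = vip
        self.dip = dip
        self.next_port = first_port
        # local port on the DIP -> (CIP, client port, VIP port)
        self.sessions: Dict[int, Tuple[str, int, int]] = {}

    def request(self, pkt: Packet, rip: str, rport: int) -> Packet:
        """Inbound: CIP -> DIP (source) and VIP -> RIP (destination)."""
        local_port = self.next_port
        self.next_port += 1
        self.sessions[local_port] = (pkt.src_ip, pkt.src_port, pkt.dst_port)
        return Packet(self.dip, local_port, rip, rport)

    def response(self, pkt: Packet) -> Packet:
        """Outbound: look up the session and restore VIP -> CIP."""
        cip, cport, vport = self.sessions[pkt.dst_port]
        return Packet(self.vip, vport, cip, cport)


d = FullNatDirector(vip="192.0.2.10", dip="10.0.0.1")
out = d.request(Packet("198.51.100.7", 40000, "192.0.2.10", 80), "10.0.0.11", 8080)
back = d.response(Packet("10.0.0.11", 8080, "10.0.0.1", 20000))
print(out)
print(back)
```

The RS only ever sees the DIP as the peer, so it needs no special gateway or ARP settings.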

Application Scenarios:

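1. The VS and RS can be in different network segments. On the Inspur load balancer I worked with, the modes in use were LVS-NAT and LVS-FULLNAT. FULLNAT requires a NAT address pool for the VIP, which is used to rewrite the CIP to an address from the pool. The RS can keep its original gateway, as long as it can reach the addresses in the NAT pool, that is, the DIP. The RS itself needs neither the ARP kernel parameter changes nor a VIP address.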
    • 5. Comparison and analysis of LVS working mode

      LVS-NAT and LVS-FULLNAT: both request and response messages go through the VS, because the VS holds the NAT session

      LVS-DR and LVS-TUN: only request messages go through the VS; the RS responds directly to the client

2.3 Scheduling algorithm

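ipvs scheduler

Depending on whether the current load of the servers is taken into account when scheduling, the methods fall into two groups, static and dynamic, 10 in total.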
    • Static methods:

      1. RR: round robin, polling

      2. WRR: weighted round robin

      3. SH: source hash, hashing on the source address

      4. DH: destination hash, hashing on the destination address; applied when there is more than one firewall in front of the RSs
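Three of the static methods can be sketched in a few lines of Python. These are simplified illustrations, not the ipvs algorithms: the weighted version simply repeats servers by weight, and `zlib.crc32` is an arbitrary stand-in for whatever hash the kernel uses. Server names and weights are made up.

```python
import itertools
import zlib
from typing import Dict, Iterator, List


def rr(servers: List[str]) -> Iterator[str]:
    """Round robin: hand out servers in order, forever."""
    return itertools.cycle(servers)


def wrr(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighted round robin: a server with weight w appears w times per cycle."""
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def sh(servers: List[str], client_ip: str) -> str:
    """Source hash: the same client address always maps to the same server."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


pool = ["rs1", "rs2"]
print(list(itertools.islice(rr(pool), 4)))
print(list(itertools.islice(wrr({"rs1": 2, "rs2": 1}), 6)))
print(sh(pool, "198.51.100.7") == sh(pool, "198.51.100.7"))
```

None of these functions looks at how busy a server is, which is exactly what makes them static.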

    • Dynamic methods:

The scheduler computes an overload value for each RS; the smaller the overload, the higher the forwarding priority.

1. LC: Least Connections
    overload = active*256 + inactive
2. WLC: Weighted LC, the default scheduling algorithm
    overload = (active*256 + inactive) / weight
3. SED: Shortest Expected Delay
    overload = (active + 1)*256 / weight
4. NQ: Never Queue
5. LBLC: Locality-Based LC
6. LBLCR: LBLC with Replication
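The three overload formulas above translate directly into code. The selection helper `pick`, the dictionary layout, and the sample numbers are assumptions added for this sketch; the formulas themselves are the ones listed in the text.

```python
from typing import Callable, Dict, List


def lc(active: int, inactive: int, weight: int) -> float:
    return active * 256 + inactive


def wlc(active: int, inactive: int, weight: int) -> float:
    return (active * 256 + inactive) / weight


def sed(active: int, inactive: int, weight: int) -> float:
    return (active + 1) * 256 / weight


def pick(servers: List[Dict], overload: Callable[[int, int, int], float]) -> str:
    """Choose the server with the smallest overload (the first one wins a tie)."""
    best = min(servers, key=lambda s: overload(s["active"], s["inactive"], s["weight"]))
    return best["name"]


idle = [
    {"name": "A", "active": 0, "inactive": 0, "weight": 1},
    {"name": "B", "active": 0, "inactive": 0, "weight": 4},
]
print(pick(idle, wlc))
print(pick(idle, sed))
```

With every server idle, WLC sees a tie at 0 and cannot tell the servers apart, while SED's `active + 1` term already favors the heavier-weighted server.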
    • An alternative explanation

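Let me put it another way: the scheduling algorithm is like assigning work in a factory.
A batch of work comes in and there is a group of workers, so how is the work handed out? The work is the client requests, the workers are the RS servers, and whoever decides the assignment is the VS scheduler. If the boss ignores how busy the staff are, there are 4 static ways to distribute the work.

1. Round robin: an assembly line.
2. Weighted: this employee works fast, so give them more work and a raise.
3. Source hash: you have done company A's work before and know it well, so you take it.
4. Destination hash: this work is for company B, which you have done before and know well, so you take it.

A boss who ignores how overworked the employees are will go bankrupt sooner or later.
So there are also ways to allocate work according to the staff's current workload.

1. Whoever has the least work right now gets the next job first. This ignores the possibility that the worker you give it to is slow.
2. A refinement of the previous method therefore appeared, WLC: the worker with a high weight and a low workload goes first.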
LVS implementation: ipvsadm/ipvs
ipvs is code in the kernel; ipvsadm is the user-space command used to configure LVS.
yum install ipvsadm
    • Format
Usage:
  ipvsadm -A|E -t|u|f service-address [-s scheduler] [-p [timeout]] [-M netmask] [--pe persistence_engine] [-b sched-flags]
  ipvsadm -D -t|u|f service-address
  ipvsadm -C
  ipvsadm -R
  ipvsadm -S [-n]
  ipvsadm -a|e -t|u|f service-address -r server-address [options]
  ipvsadm -d -t|u|f service-address -r server-address
  ipvsadm -L|l [options]
  ipvsadm -Z [-t|u|f service-address]
  ipvsadm --set tcp tcpfin udp
  ipvsadm --start-daemon state [--mcast-interface interface] [--syncid sid]
  ipvsadm --stop-daemon state
  ipvsadm -h

Commands:
Either long or short options are allowed.
  --add-service     -A        add virtual service with options
  --edit-service    -E        edit virtual service with options
  --delete-service  -D        delete virtual service
  --clear           -C        clear the whole table
  --restore         -R        restore rules from stdin
  --save            -S        save rules to stdout
  --add-server      -a        add real server with options
  --edit-server     -e        edit real server with options
  --delete-server   -d        delete real server
  --list            -L|-l     list the table
  --zero            -Z        zero counters in a service or all services
  --set tcp tcpfin udp        set connection timeout values
  --start-daemon              start connection sync daemon
  --stop-daemon               stop connection sync daemon
  --help            -h        display this help message
    • Options

      Cluster Service Related:

      -A|E: add/modify a cluster service

      -t: TCP
      -u: UDP
      -f: firewall mark
      service_address:
          -t ip:port
          -u ip:port
          -f firewall_mark
      -s: scheduling algorithm, default wlc
      -p: persistent connection timeout

      -D: delete a cluster service
      -C: clear all defined cluster services
      -L|l: list cluster services
      -n: numeric output
      --stats: statistics
      --rate: rate information
      -c: connections

      -Z: zero the statistics counters of cluster services

      RS related:

      -a|e: add/modify a node of the specified cluster service

      -t|u|f service_address: the cluster service the node is added to
      -r server_address: the address of the RS, ip[:port]; a port can be given when port mapping is allowed
      -g|i|m: the LVS mode type: -g gateway, DR mode; -i ipip, TUN mode; -m masquerade, NAT mode; the default is -g
      -w weight: used when the scheduling algorithm takes weights into account

      -d: delete a node (real server) from the specified cluster service

      Backup (to standard output)

      ipvsadm -S
      ipvsadm-save

      Recovery (from standard input)

      ipvsadm -R
      ipvsadm-restore
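To see how the options combine, the small helper below assembles ipvsadm command lines as strings. It only prints them and does not run anything. The helper name `build_commands`, the addresses, and the weights are assumptions for illustration; the flags are the ones described above.

```python
from typing import List, Tuple

# Forwarding-mode flags as listed in the options: -g DR, -i TUN, -m NAT.
MODE_FLAGS = {"dr": "-g", "tun": "-i", "nat": "-m"}


def build_commands(
    service: str, scheduler: str, mode: str, real_servers: List[Tuple[str, int]]
) -> List[str]:
    """Return one -A command for the service and one -a command per real server."""
    cmds = [f"ipvsadm -A -t {service} -s {scheduler}"]
    for address, weight in real_servers:
        cmds.append(
            f"ipvsadm -a -t {service} -r {address} {MODE_FLAGS[mode]} -w {weight}"
        )
    return cmds


for line in build_commands(
    "192.0.2.10:80", "wrr", "dr", [("10.0.0.11", 2), ("10.0.0.12", 1)]
):
    print(line)
```

Reviewing the generated lines before running them as root on a director is a cheap way to catch a wrong mode flag or weight.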

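2.4 Session persistence
1. Session binding: connections from the same client are always scheduled to the same server. There is no fault tolerance, and the balancing effect suffers.
2. Session synchronization: sessions are synchronized between the RSs, so every RS holds all sessions of the cluster. Not suitable for large clusters.
3. Session server: a dedicated server stores the session information. It is a SPOF and needs HA.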
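2.5 LVS persistent connections

When multiple cluster services share the same set of RSs and need to be bound together, the LVS SH algorithm cannot do the job.

Persistent connections: regardless of the scheduling algorithm in use, for a period of time (default 360s) requests from the same address are always sent to the same RS.

PCC: per-client persistence. All connections from the same client to a given VIP are forwarded to one RS. The scope is too broad.
PPC: per-port persistence. Connections from the same client to a given port of a given VIP are forwarded to one RS. The scope is too narrow.
PFMC: per-firewall-mark persistence. Based on the firewall mark, connections from the same client to several ports of a given VIP are forwarded to one RS.

ipvsadm -A -t 172.18.0.1:0 -p 200
ipvsadm -A -t 172.18.0.1:80 -p 200
ipvsadm -A -f 1 -p 200

These three commands correspond to the three persistence types above, in order.

ipvsadm -vnL --persistent-conn
View persistent connection information.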
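The persistence behaviour can be modelled as a small table that is consulted before the scheduler. This is a sketch of the idea only: the class, the refresh-on-hit behaviour, and the use of caller-supplied timestamps are assumptions, not a description of the ipvs internals.

```python
import itertools
from typing import Callable, Dict, Tuple


class PersistenceTable:
    def __init__(self, timeout: int = 360) -> None:
        self.timeout = timeout
        self.entries: Dict[str, Tuple[str, float]] = {}  # client IP -> (RS, last seen)

    def lookup(self, client_ip: str, now: float, schedule: Callable[[], str]) -> str:
        """Reuse the remembered RS while the entry is fresh, else schedule anew."""
        entry = self.entries.get(client_ip)
        if entry is not None and now - entry[1] < self.timeout:
            rs = entry[0]
        else:
            rs = schedule()
        self.entries[client_ip] = (rs, now)
        return rs


servers = itertools.cycle(["rs1", "rs2", "rs3"])  # round robin as the scheduler
table = PersistenceTable(timeout=360)
results = [
    table.lookup("198.51.100.7", 0, lambda: next(servers)),
    table.lookup("198.51.100.7", 100, lambda: next(servers)),
    table.lookup("203.0.113.9", 100, lambda: next(servers)),
    table.lookup("198.51.100.7", 1000, lambda: next(servers)),
]
print(results)
```

The first client stays on the same RS inside the window even though the scheduler is round robin, and is rescheduled only after the entry has expired.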
2.6 LVS high availability

1. If the Director is unavailable, the entire system is unavailable: SPOF (single point of failure)

Solution: high availability
    keepalived, heartbeat/corosync

2. If an RS is unavailable, the Director will still dispatch requests to it

Solution: the Director checks the health of each RS, disabling it on failure and enabling it again on success
    keepalived, heartbeat/corosync, ldirectord
Detection methods:
    (a) Network-layer check: ICMP
    (b) Transport-layer check: port probing
    (c) Application-layer check: requesting a key resource
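A transport-layer check, method (b), can be sketched with a plain TCP connect. The function names, the pool representation, and the addresses are assumptions; real tools such as keepalived and ldirectord have their own mechanisms.

```python
import socket
from typing import Callable, Iterable, Set, Tuple

Server = Tuple[str, int]


def tcp_check(host: str, port: int, timeout: float = 3.0) -> bool:
    """Port probe: the RS counts as healthy if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def update_pool(
    pool: Set[Server],
    candidates: Iterable[Server],
    check: Callable[[str, int], bool] = tcp_check,
) -> Set[Server]:
    """Remove servers that fail the check and (re)add servers that pass it."""
    for host, port in candidates:
        if check(host, port):
            pool.add((host, port))
        else:
            pool.discard((host, port))
    return pool


# Demonstration with a fake check so that no network access is needed.
pool = {("10.0.0.11", 80)}
all_servers = [("10.0.0.11", 80), ("10.0.0.12", 80)]
update_pool(pool, all_servers, check=lambda host, port: host == "10.0.0.12")
print(pool)
```

Running `update_pool` on a timer with the default `tcp_check` gives the remove-on-failure, re-add-on-recovery behaviour described above.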

3. When all RSs are unavailable: a backup server (sorry server)

    • ldirectord

ldirectord: a daemon that monitors and controls LVS and manages the LVS rules

Package name: ldirectord-3.9.6-0rc1.1.1.x86_64.rpm
yum install ldirectord

Files:

/etc/ha.d/ldirectord.cf    main configuration file

/usr/share/doc/ldirectord-3.9.6/ldirectord.cf    configuration template

/usr/lib/systemd/system/ldirectord.service    service unit

/usr/sbin/ldirectord    main program

/var/log/ldirectord.log    log

/var/run/ldirectord.ldirectord.pid    pid file

ldirectord configuration file example

checktimeout=3
checkinterval=1
autoreload=yes
logfile="/var/log/ldirectord.log"    # log file
quiescent=no    # when an RS is down: yes sets its weight to 0, no removes it
virtual=5    # the VS, given as a FWM or as IP:port
    real=172.16.0.7:80 gate 2
    real=172.16.0.8:80 gate 1
    fallback=127.0.0.1:80 gate    # sorry server
    service=http
    scheduler=wrr
    checktype=negotiate
    checkport=80
    request="index.html"
    receive="Test Ldirectord"
    • Instance

      For a worked example, refer to the DR and NAT mode configuration notes for LVS.
