In day-to-day use, a single server is adequate for most work, but when many users visit at the same time it starts to struggle. There are two ways to solve this. The first is to keep improving the performance of that one server, but there is always an upper limit and the gains diminish. The other is to use multiple servers and divide the work among them; this is a cluster. Many servers are combined in a certain way so that users cannot tell the difference when using the service. There are currently four ways of combining them, namely LVS-NAT, LVS-DR, LVS-TUN, and LVS-FULLNAT (the last contributed by enthusiastic community members); the Linux kernel only supports the first three, and if you want to use the last one you must recompile the kernel yourself. Before we start, we need to know a few common terms:
VS: Virtual Server, the scheduler (central node) that fronts the cluster
RS: Real Server; users can only see the VS, but the real service is provided by the RS
CIP: Client IP, the IP address of the client that sends the request
VIP: Virtual Server IP, the virtual IP address of the virtual server, i.e. the destination address the client accesses
DIP: Director IP, the scheduler's IP address used when forwarding client requests to the back-end real servers
RIP: Real Server IP, the IP address of a back-end real server
Simply put, when a user accesses the service, the request first reaches the scheduler, which checks which server in the cluster is relatively idle and then passes the request to that server for processing. The client's address is the CIP, the address the client accesses is the VIP, the address the scheduler uses to reach the cluster is the DIP, and the address of each server in the cluster is its RIP. The following sections walk through the various clustering methods.
First, LVS-NAT
In this mode, the cluster is built with the following architecture:
[Figure: LVS-NAT cluster architecture]
First, the client (CIP) initiates a request; the scheduler distributes it to a back-end server for processing, and the result is returned once processing is done. Which host is selected is decided by the scheduler according to a scheduling algorithm. Depending on whether LVS takes the current load of the RS into account when scheduling, the algorithms fall into two categories, static and dynamic: a static algorithm assigns requests according to a pre-arranged pattern, while a dynamic algorithm distributes them in real time according to how busy each server is. (These algorithms apply to every architecture described below, so they are not repeated later.)
Static algorithms schedule purely according to the characteristics of the algorithm itself and focus on fairness of the starting point. They include the following:
RR: Round Robin, polling (one request per server in turn; since some servers perform well and others poorly, this allocation method is not ideal on mixed hardware)
WRR: Weighted RR, weighted polling (the weights influence the distribution: the higher the weight, the stronger the server is assumed to be and the more requests it is assigned)
SH: Source Hashing; requests from the same source IP address are always sent to the RS picked the first time, which achieves session binding (operations such as shopping carts are best kept on the same server)
DH: Destination Hashing; requests for the same destination address are always sent to the RS picked the first time, generally used for clusters of forward proxy servers
Dynamic algorithms schedule mainly according to the current load of each RS and focus on fairness of the result. They include the following:
(The load of a back-end RS is expressed by an Overhead value; the scheduler computes it in real time from the number of active connections and the number of inactive connections to decide which server a request is best sent to.)
LC: Least Connections, fewest connections
Calculation formula: Overhead = ActiveConnections * 256 + InactiveConnections
Note: on the first scheduling pass, requests are assigned top to bottom in the order the servers were configured in ipvsadm
WLC: Weighted LC, weighted least connections (this is the default if no algorithm is specified at deployment time)
Calculation formula: Overhead = (ActiveConnections * 256 + InactiveConnections) / Weight
Note: on the first scheduling pass, servers are picked top to bottom in the order configured in ipvsadm; the weights have no effect on that first pass
SED: Shortest Expected Delay
Calculation formula: Overhead = (ActiveConnections + 1) * 256 / Weight
Note: SED fixes the unfair starting point of WLC, but when the gap between weights is large it can itself become unfair.
For example: server A has weight 1 and server B has weight 10; B will keep receiving requests for quite a while before A gets any (see the worked calculation after this list).
NQ: Never Queue, an improved version of SED; on the first scheduling pass each RS is handed one connection according to its weight, and only afterwards does scheduling follow the SED formula, which guarantees that every back-end RS holds at least one active connection
LBLC: Locality-Based Least Connections, a dynamic version of the DH algorithm
LBLCR: LBLC with Replication, LBLC with a replication capability
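To make the weight-gap example concrete, here is a small shell calculation (a sketch only; servers "A" with weight 1 and "B" with weight 10 are hypothetical). An idle A always has Overhead = (0+1)*256/1 = 256, while B's Overhead only climbs back up to 256 once it already holds about 9 active connections, so B receives roughly 10 requests before A gets its first:

~]# for n in 0 4 9; do echo "B with $n active connections: Overhead = $(( (n + 1) * 256 / 10 ))"; done
B with 0 active connections: Overhead = 25
B with 4 active connections: Overhead = 128
B with 9 active connections: Overhead = 256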
The hardest part of deploying a cluster is understanding the architecture; once that is clear, the deployment itself is simple. In LVS-NAT mode we only need to configure the individual network segments so that hosts on the same segment can ping each other, and then use the ipvsadm command on the scheduler. (The kernel contains IPVS, the component of the TCP/IP protocol stack that implements the cluster function; it hooks mainly into the INPUT chain. ipvsadm is the user-space tool used to write rules into IPVS, much as iptables is the user-space tool for netfilter. To check whether the kernel supports IPVS: ~]# grep -i -C 10 "ipvs" /boot/config-$(uname -r) )
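As an extra sanity check you can also confirm that the ip_vs kernel module is present (a sketch; the module is normally auto-loaded the first time ipvsadm is run):

~]# lsmod | grep ip_vs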
ipvsadm format:
ipvsadm -A|E -t|u|f service-address [-s scheduler] [-p [timeout]]
ipvsadm -D -t|u|f service-address
ipvsadm -C
ipvsadm -R
ipvsadm -S [-n]
ipvsadm -a|e -t|u|f service-address -r server-address [-g|i|m] [-w weight]
ipvsadm -d -t|u|f service-address -r server-address
ipvsadm -L|l [options]
ipvsadm -Z [-t|u|f service-address]
ipvsadm --set tcp tcpfin udp
ipvsadm --start-daemon state [--mcast-interface interface] [--syncid syncid]
ipvsadm --stop-daemon state
ipvsadm -h
Manage cluster services: add, modify, delete (these operate on the cluster service as a whole)
Add an LVS cluster service: ipvsadm -A -t|u|f service-address [-s scheduler] [-p [timeout]]
  -t: cluster service based on the TCP protocol; service-address is the socket bound to the VIP
  -u: cluster service based on the UDP protocol; service-address is the socket bound to the VIP
  -f: cluster service based on a firewall mark (FWM), which is set on packets via the mangle table of iptables
  -s: scheduling algorithm for the cluster service (defaults to wlc if omitted)
  -p: whether to enable persistent connections
Modify an LVS cluster service: ipvsadm -E -t|u|f service-address [-s scheduler] [-p [timeout]] (the parameters are similar to adding)
Delete an LVS cluster service: ipvsadm -D -t|u|f service-address
Clear all LVS cluster services: ipvsadm -C (similar to "-F" in iptables; best used with care)
Manage the RS in a cluster: add, delete, modify (these commands operate on individual servers inside the cluster)
Add an RS to the cluster: ipvsadm -a -t|u|f service-address -r server-address [-g|i|m] [-w weight]
  -r: IP address and port of the back-end RS to add to the cluster
  -g: select the DR cluster type (the default)
  -i: select the TUN cluster type
  -m: select the NAT cluster type
  -w: set the weight of the RS (real server); do not make the gap between weights too large, or the distribution becomes unreasonable
Modify an RS in the cluster: ipvsadm -e -t|u|f service-address -r server-address [-g|i|m] [-w weight]
Remove an RS from the cluster: ipvsadm -d -t|u|f service-address -r server-address
View LVS cluster status and attribute information: ipvsadm -L|l [options], where options include:
  -c, --connection: show the state of the current connections in IPVS
  --stats: show statistics for the LVS cluster
  --rate: show rate-related information for the LVS cluster
  --timeout: show the timeout values for TCP, TCP FIN, and UDP sessions
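In practice the most common form is a numeric listing; a small sketch (the -n flag simply suppresses name resolution):

~]# ipvsadm -L -n            # list the virtual services and their real servers
~]# ipvsadm -L -n --stats    # the same listing with connection, packet and byte counters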
Save and reload IPVS rules (either method works):
Save (ipvsadm -S only prints the rules without saving them; redirect the output to a file yourself):
  ipvsadm -S [-n] > /path/to/ipvs_rule_file
  ipvsadm-save > /path/to/ipvs_rule_file
Reload:
  ipvsadm -R < /path/to/ipvs_rule_file
  ipvsadm-restore < /path/to/ipvs_rule_file
Save the rules and load them automatically at boot:
  CentOS 6 and earlier: save the rules to /etc/sysconfig/ipvsadm, then # chkconfig ipvsadm on
  CentOS 7: save the rules to /etc/sysconfig/ipvsadm, then # systemctl enable ipvsadm.service
Clear the counters (use this when you want to reset them): ipvsadm -Z [-t|u|f service-address]; if no service is given, all counters are cleared
After understanding the commands, we can start deploying the LVS-NAT cluster. First configure the IP addresses as shown in the architecture figure above, then run the following commands on the scheduler:
~]# yum install ipvsadm -y                                      # install the management tool ipvsadm
~]# ipvsadm -A -t 172.15.0.2:80 -s rr                           # create a cluster service
~]# ipvsadm -a -t 172.15.0.2:80 -r 172.16.128.18:80 -m -w 1     # add a server to the cluster just created; "-m" selects NAT mode
~]# ipvsadm -a -t 172.15.0.2:80 -r 172.16.128.19:80 -m -w 2     # add another server; "-w" sets the weight (weights have no effect under rr, but we set them now and use them later)
Then install the httpd service on the RS (on all of the servers; it may already be installed by default):
~]# yum install httpd -y
~]# route add default gw 172.16.128.17      # point the default gateway at the DIP, otherwise replies cannot find their way back even if requests reach the server
~]# echo "This is 172.16.128.18" > /var/www/html/index.html     # executed on 172.16.128.18
~]# echo "This is 172.16.128.19" >/var/www/html/index.html executed on 172.16.128.19
Finally, start the service:
~]# service httpd start    # CentOS 6
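On CentOS 7 the equivalent would be the systemd command:

~]# systemctl start httpd.service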
Go back to the scheduler to verify:
[Figure: verification output on the scheduler]
After the validation passes, turn on the forwarding function on the scheduler:
~]# echo 1 > /proc/sys/net/ipv4/ip_forward
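This echo does not survive a reboot; a small sketch of making it persistent through sysctl (adjust the existing net.ipv4.ip_forward entry in /etc/sysctl.conf if one is already there):

~]# echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
~]# sysctl -p    # apply the setting immediately and keep it across reboots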
With that, the LVS-NAT cluster is configured, and we can test it with a command from the host acting as the client:
~]# for i in {1..10}; do curl http://172.15.0.2; done    # access the service 10 times in a loop
[Figure: the 10 client requests alternate between the two back-end pages]
We can see how the scheduler carries out the polling: one request goes to 172.16.128.19, the next to 172.16.128.18, and so on. Here I deliberately set up two different pages; if both pages were identical, users would not notice any difference, and the work would be averaged across the servers, reducing the burden on each one.
Next, we can switch the algorithm, for example to WLC:
~]# ipvsadm -E -t 172.15.0.2:80 -s wlc
Running the same test as above then shows the two servers being assigned requests at a 2:1 ratio, because 172.16.128.18 was added with weight 1 and 172.16.128.19 with weight 2:
[Figure: with wlc, requests are distributed to the two servers at a 2:1 ratio]
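If a different ratio is wanted, the weight of an individual RS can be edited in place; a sketch with a hypothetical weight of 5:

~]# ipvsadm -e -t 172.15.0.2:80 -r 172.16.128.19:80 -m -w 5    # raise this RS's weight to 5 (example value)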
Second, LVS-DR
LVS-DR improves on the LVS-NAT configuration to reduce the scheduler's workload. In LVS-NAT mode every packet, whether entering or leaving the cluster, must pass through the scheduler, so the scheduler becomes the efficiency bottleneck; if the scheduler is not powerful enough, the throughput of the whole cluster suffers. The LVS-DR model eases this: only packets entering the cluster pass through the scheduler, while packets leaving the cluster are sent back to the client directly by the server that handled them. This raises a problem with IP addresses, however: if the reply were sent directly from the real server with its own address, the source address would not be the address the client requested, and the client would discard such a packet. LVS-DR therefore has the real servers also carry the address that users access (the VIP), disguised so it does not answer ARP. Architecture (the client's IP here is not realistic, it is just for a simple verification; if you did the LVS-NAT experiment above, the host IP configuration needs to be reset):
[Figure: LVS-DR cluster architecture]
At this point the client can communicate with all of the IPs in the figure:
[Figure: ping tests from the client to each IP succeed]
In the DR model every host needs the VIP configured, which creates an address conflict; there are typically three ways to resolve it:
1. Statically bind the VIP to the scheduler's MAC address on the front-end gateway;
2. Use arptables on each RS to filter ARP messages;
3. Modify the corresponding kernel parameters on each RS to restrict the scope of ARP announcements and replies;
Here we use the third method; the effect is achieved by changing the following two values in the kernel:
arp_ignore:
  0: the default; reply to ARP requests for any local IP address, received on any interface
  1: reply only when the target IP address is a local address configured on the interface the request came in on
  2: reply only when the target IP address is a local address configured on the incoming interface and the sender's IP address is in the same subnet as that interface
  3: do not reply for addresses scoped to this interface; reply only for addresses with global scope
  4-7: reserved
  8: do not reply to any ARP request
arp_announce:
  0: the default; announce any of this machine's local addresses on the networks attached to any interface
  1: try to avoid announcing addresses that do not belong to the network of the sending interface
  2: always use the most appropriate local address; never announce addresses outside the interface's own network
Run the following commands on both servers:
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore     # set all interfaces ("all" includes the lo interface)
echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore      # set the lo interface again, just to be safe
echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
ifconfig lo:0 172.16.128.128 netmask 255.255.255.255 broadcast 172.16.128.128 up
route add -host 172.16.128.128 dev lo:0
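These echo commands are also lost at reboot; a sketch of making the ARP settings persistent through sysctl.conf:

cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
EOF
sysctl -p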
With that, the configuration on the servers is complete. Enable the httpd service and set up the home pages exactly as in the LVS-NAT example, then go back to the scheduler and execute:
~]# ifconfig eth0:0 172.16.128.128 netmask 255.255.255.255 broadcast 172.16.128.128 up
~]# ipvsadm -C                                                   # clear the existing rules
~]# ipvsadm -A -t 172.16.128.128:80 -s rr                        # add the cluster service
~]# ipvsadm -a -t 172.16.128.128:80 -r 172.16.128.18 -g -w 1     # add a server to the cluster
~]# ipvsadm -a -t 172.16.128.128:80 -r 172.16.128.19 -g -w 3     # add as many as you need; "-g" selects the DR cluster type and can be omitted since it is the default
The LVS-DR model is now configured, and it can be tested from the client (172.16.128.17) with the command:
~]# for i in {1..10}; do curl http://172.16.128.128; done
[Figure: client test output for the LVS-DR cluster]
The algorithm can be changed in the same way as with LVS-NAT.
Third, firewall-mark-based forwarding
The two sections above covered the architectures; next comes a forwarding method based on firewall marks, which can be combined with either architecture mode. Packets destined for the scheduler are tagged in the PREROUTING chain of the firewall's mangle table; for example, traffic to port 80 can be marked with 6, and that mark can then be used when configuring the cluster. This approach is more flexible than the others. To configure it, first apply the mark with iptables:
iptables -t mangle -F     # clear the existing rules so other rules cannot interfere; this step can be skipped
iptables -t mangle -A PREROUTING -d 172.16.128.128 -p tcp --dport 80 -j MARK --set-mark 6
Then use the ipvsadm command to schedule the servers based on the mark:
ipvsadm -C
ipvsadm -A -f 6 -s rr                              # "-f" is followed by the mark set with iptables above
ipvsadm -a -f 6 -r 172.16.128.18:80 -g -w 1
ipvsadm -a -f 6 -r 172.16.128.19:80 -g -w 3
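One reason the mark-based approach is more flexible is that several ports can share a single cluster service; a sketch assuming the same servers also served HTTPS on port 443:

iptables -t mangle -A PREROUTING -d 172.16.128.128 -p tcp -m multiport --dports 80,443 -j MARK --set-mark 6    # both ports now map to the fwmark-6 service defined above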
[Figure: the resulting IPVS configuration on the scheduler]
All of the commands above are run on the scheduler; the servers need no changes. Now check from the client:
[Figure: client test output for the mark-based cluster]
Fourth, persistent connections
Building on the setup above, run the following command on the scheduler:
~]# ipvsadm -E -f 6 -s wrr -p
The purpose of this command is to enable persistent connections: once a client has connected to a server in the cluster, it keeps being sent to that same server for the time specified after the "-p" option (300 seconds by default).
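The timeout can also be given explicitly; a sketch using a hypothetical one-hour window:

~]# ipvsadm -E -f 6 -s wrr -p 3600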
Then check the result on the scheduler:
[Figure: ipvsadm output on the scheduler showing the persistent service]
Finally, repeating the connection 10 times on the client shows that all 10 requests go to the same server:
[Figure: all 10 client requests answered by the same server]
If you switch to a different client host, it will be served by another server (chosen according to the algorithm).