HA cluster configuration

Last Update:2016-12-20 Source: Internet

Author: User

First, what is an HA cluster?

HA (high availability), also known as dual-machine hot standby, is used for critical services. Put simply, there are two machines, A and B. In normal operation A provides the service and B stands by; when A stops serving, the service switches to B, which continues to provide it. Commonly used open source high-availability software includes heartbeat and keepalived; keepalived also has load-balancing capabilities.

Second, the working principle of heartbeat

The core of heartbeat consists of two parts: heartbeat monitoring and resource takeover. Heartbeat monitoring can run over network links and serial ports, and supports redundant links. The nodes send each other messages reporting their current state. If a node does not receive a message from its peer within the specified time, it considers the peer failed and starts the resource takeover module, which takes over the resources or services that were running on the peer.
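
The timeout decision described here can be sketched as a small shell function. This is only an illustration of the idea, not heartbeat's actual code: the function name peer_state and its arguments are assumptions made for the example, and the two thresholds stand in for the warning and dead times that are set later in ha.cf.

```shell
# Classify a peer from the age of its last heartbeat message.
# peer_state LAST_SEEN NOW WARNTIME DEADTIME  (seconds; hypothetical helper)
peer_state() {
    last_seen=$1 now=$2 warntime=$3 deadtime=$4
    silence=$((now - last_seen))
    if [ "$silence" -ge "$deadtime" ]; then
        echo dead      # nothing received within the dead time: start resource takeover
    elif [ "$silence" -ge "$warntime" ]; then
        echo late      # heartbeat is overdue: only log a warning
    else
        echo alive
    fi
}

# Last message seen at t=100; warn after 10 s of silence, declare dead after 30 s.
peer_state 100 105 10 30
peer_state 100 112 10 30
peer_state 100 130 10 30
```

The three calls print alive, late, and dead in that order.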

Heartbeat is only HA software: it performs heartbeat monitoring and resource takeover, and does not monitor the resources or applications it controls. To monitor whether resources and applications are working properly, you need plug-ins such as ipfail and ldirectord. Both come with heartbeat and are described below.

ipfail is included in heartbeat itself. It detects network faults and responds to them: it uses a ping node or a ping node group to detect a loss of network connectivity and triggers a timely failover.

ldirectord is a plug-in that monitors the state of the service nodes in a cluster. If it detects that a service on a node has failed, it stops directing connections to that node, and subsequent requests go to the healthy nodes. It is often used in LVS load-balancing clusters.

Similarly, heartbeat does not monitor the operating system itself. If the operating system on the primary node hangs, the service may be interrupted. In addition, the primary node cannot release its resources while the backup node takes them over, so two nodes end up competing for the same resource.

To deal with this problem, enable the watchdog module in the Linux kernel. Watchdog is a Linux kernel module that judges whether the system is running properly from writes made to the /dev/watchdog device file at regular intervals. If the writes stop, watchdog concludes that the system has hung and restarts it, which frees the node's resources.
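
As a rough sketch of that write loop: a process keeps the device open and writes to it at intervals shorter than the watchdog timeout. The function name pet_watchdog and its arguments are made up for this example. The demonstration deliberately runs against a temporary file, because opening the real /dev/watchdog arms the timer and the machine reboots if the writes stop. On drivers that support the "magic close" feature, writing the character V just before closing disarms the timer.

```shell
# pet_watchdog DEVICE COUNT INTERVAL  (hypothetical helper)
# Keeps DEVICE open, writes one byte every INTERVAL seconds, COUNT times,
# then writes 'V' (magic close) before closing the descriptor.
pet_watchdog() {
    dev=$1 count=$2 interval=$3
    exec 3>"$dev"
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '.' >&3          # any write resets the watchdog timer
        i=$((i + 1))
        sleep "$interval"
    done
    printf 'V' >&3              # ask the driver to disarm when the device is closed
    exec 3>&-
}

# Safe demonstration against a scratch file instead of /dev/watchdog.
scratch=$(mktemp)
pet_watchdog "$scratch" 3 0
cat "$scratch"; echo
rm -f "$scratch"
```

In practice heartbeat or a watchdog daemon does this job; the loop is only meant to show why a hung system, which can no longer write, gets restarted.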

Third, configuration example

Working environment
Both nodes run as VirtualBox 4.1 virtual machines
Master node: CentOS 5.5-i386
Slave node: CentOS 5.5-i386

1) Network environment setting
Each host has two Ethernet cards, one for network communication and the other for heartbeat functions. The network settings for the two nodes are as follows:

Node1: host name server01 (HA01)
eth0: 10.8.50.1 255.255.0.0 // external IP address
eth1: 192.168.50.1 255.255.255.0 // HA heartbeat address

Node2: host name server02 (HA02)
eth0: 10.8.50.2 255.255.0.0 // external IP address
eth1: 192.168.50.2 255.255.255.0 // HA heartbeat address

2) Firewall settings
By default heartbeat uses UDP port 694 for heartbeat monitoring. If the system runs an iptables firewall, this port must be opened.
# vi /etc/sysconfig/iptables
Add the following on the master node:
-A RH-Firewall-1-INPUT -p udp -m udp --dport 694 -d 192.168.50.1 -j ACCEPT
Add the following on the slave node:
-A RH-Firewall-1-INPUT -p udp -m udp --dport 694 -d 192.168.50.2 -j ACCEPT
Reload iptables:
# service iptables restart

In the actual test, the firewall was simply disabled through the setup interface.

3) Set the host names in /etc/hosts
On server01:
# cat /etc/hosts
127.0.0.1       localhost.localdomain localhost
10.8.50.1       server01 HA01
192.168.50.1    HA01
10.8.50.2       server02
192.168.50.2    HA02

On server02:
# cat /etc/hosts
127.0.0.1       localhost.localdomain localhost
10.8.50.2       server02 HA02
192.168.50.2    HA02
10.8.50.1       server01
192.168.50.1    HA01

After the change, each machine should be able to ping the other by host name.
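
Before moving on, it can help to confirm that every name the cluster relies on is present in the hosts file. The sketch below looks names up in a hosts-format file with awk. lookup_host is a name chosen for this example, and the sample file is modelled on the server01 entries above; on the real nodes, point it at /etc/hosts and follow up with ping.

```shell
# lookup_host FILE NAME: print the first address mapped to NAME (hypothetical helper).
lookup_host() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                       # skip comment lines
        { for (i = 2; i <= NF; i++)
              if ($i == name) { print $1; exit } }
    ' "$1"
}

hosts=$(mktemp)
cat > "$hosts" <<'EOF'
127.0.0.1       localhost.localdomain localhost
10.8.50.1       server01 HA01
192.168.50.1    HA01
10.8.50.2       server02
192.168.50.2    HA02
EOF

for name in server01 server02 HA01 HA02; do
    echo "$name -> $(lookup_host "$hosts" "$name")"
done
rm -f "$hosts"
```

Note that HA01 appears on two lines of this file. The lookup returns the first match, 10.8.50.1, so the name does not point at the heartbeat address 192.168.50.1.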

4) Install the HA components
Install the heartbeat software:
# yum install heartbeat-stonith
# yum install heartbeat-pils
# yum install heartbeat
# yum install heartbeat-devel
# yum install heartbeat-gui
# yum install libnet

Check the installed packages:
-----------------------------------------------
# rpm -qa | grep heartbeat
heartbeat-stonith-2.1.3-3.el5.centos
heartbeat-2.1.3-3.el5.centos
heartbeat-gui-2.1.3-3.el5.centos
heartbeat-pils-2.1.3-3.el5.centos
heartbeat-devel-2.1.3-3.el5.centos
# rpm -qa | grep libnet
libnet-1.1.2.1-2.rf
# rpm -qa | grep ipvsadm
ipvsadm-1.24-12.el5
-----------------------------------------------

Do the same on the slave node.

5) Configure heartbeat on the master node
Heartbeat's main configuration files are ha.cf, haresources, and authkeys, all in the /etc/ha.d directory. After heartbeat is installed through yum, these three files do not exist by default. They can be taken from the extracted source directory; here they are created and edited by hand.

1. Main configuration file: ha.cf
This file configures heartbeat's detection mechanism.
In this example, its content is as follows:
-----------------------------------------------
# cat /etc/ha.d/ha.cf
debugfile /var/log/ha-debug             # file for heartbeat debug messages
logfile /var/log/ha-log                 # file for heartbeat log messages
logfacility local0                      # syslog facility
keepalive 2                             # heartbeat (monitoring) interval; the default unit is seconds
warntime 10                             # warning time, usually half of deadtime
deadtime 30                             # if no heartbeat arrives from the peer for 30 seconds, the peer is considered dead
initdead                                # network startup time; must be set to at least twice deadtime
hopfudge 1                              # optional: for a ring topology, the total number of hops in the cluster
udpport 694                             # use UDP port 694 for heartbeat monitoring
ucast eth1 192.168.50.2                 # send heartbeats by unicast on eth1; the IP is that of the peer host
auto_failback on                        # on: resources move back to their owner once it has recovered
node server01                           # cluster node; the name must match uname -n
node server02                           # node 2
ping 10.8.1.254                         # ping a node outside the cluster, here the gateway, to check network connectivity
respawn root /usr/lib/heartbeat/ipfail
apiauth ipfail gid=root uid=root        # permissions of the process started above
-----------------------------------------------
Note: of the two heartbeat hosts, one is the primary node and the other is the slave node. Under normal circumstances the primary node holds the resources and runs all of the services. When it fails, it hands the resources over to the slave node, which then runs the services.
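
The comments in ha.cf give two rules of thumb: warntime is usually about half of deadtime, and initdead should be at least twice deadtime. A small check like the following can catch a file that leaves out a value, sets warntime at or above deadtime, or breaks the initdead rule. The function name check_timings is invented for this example, it only understands whole-second values, and the sample values (including initdead 60) are illustrative rather than taken from the configuration above.

```shell
# check_timings FILE: sanity-check the timing directives of an ha.cf-style file.
check_timings() {
    awk '
        { sub(/#.*/, "") }                        # drop trailing comments
        $1 == "warntime" { warn = $2 }
        $1 == "deadtime" { dead = $2 }
        $1 == "initdead" { init = $2 }
        END {
            if (warn == "" || dead == "" || init == "") { print "missing value"; exit 1 }
            if (warn + 0 >= dead + 0) { print "warntime must be below deadtime"; exit 1 }
            if (init + 0 < 2 * dead)  { print "initdead must be at least 2 x deadtime"; exit 1 }
            print "timings ok"
        }
    ' "$1"
}

conf=$(mktemp)
cat > "$conf" <<'EOF'
keepalive 2
warntime 10
deadtime 30
initdead 60   # illustrative value
EOF
check_timings "$conf"
rm -f "$conf"
```

The function returns a non-zero status when a rule is broken, so it can be used as a guard before restarting heartbeat.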

2. Resource file: haresources
The ha.cf file configures heartbeat's detection mechanism but not what to do when a failure is detected. The haresources file defines what heartbeat does when the primary server has a problem, that is, how to switch over when the primary server goes down. A switchover usually covers the IP address, the services, and shared storage, so that the slave server ends up with the same IP, services, and shared storage as the primary server and clients do not notice the change. The file must be identical on both HA nodes.
In this example, its content is as follows:
-----------------------------------------------
# cat /etc/ha.d/haresources
server01 IPaddr::10.8.50.0/16 httpd
-----------------------------------------------
Note: server01 has priority. Heartbeat binds the virtual IP 10.8.50.0 to eth0:0 on server01 and manages the httpd service on that machine. If server01 goes down, server02 automatically starts the httpd service and brings up the same virtual IP 10.8.50.0 on its own eth0:0.
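
To make the line format concrete: the first word is the preferred node, and each following word names a resource script, with "::" separating the script name from its arguments. The sketch below only prints the calls implied by a line, in the left-to-right order in which resources are started. expand_resources is a hypothetical helper written for this example and is not part of heartbeat.

```shell
# expand_resources "LINE": show the start calls implied by one haresources line.
expand_resources() {
    set -- $1                 # split the line into words: node first, then resources
    node=$1; shift
    echo "preferred node: $node"
    for res in "$@"; do
        # "::" separates the script name from its arguments
        echo "run: $(printf '%s' "$res" | sed 's/::/ /g') start"
    done
}

expand_resources "server01 IPaddr::10.8.50.0/16 httpd"
```

For the line used in this example it prints the preferred node, then "run: IPaddr 10.8.50.0/16 start" and "run: httpd start".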

3. Authentication file: authkeys
In this example, its content is as follows:
-----------------------------------------------
# cat /etc/ha.d/authkeys
auth 1
1 crc
-----------------------------------------------
Note: the file's permissions must be changed to 600, otherwise heartbeat will fail to start.
# chmod 600 /etc/ha.d/authkeys
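
The crc method only detects corrupted packets and provides no authentication, so it is suitable only on a trusted link such as a dedicated heartbeat cable. The authkeys format also accepts md5 and sha1 with a shared key. As a sketch under those assumptions, the following generates a random sha1 key and writes the file with mode 600. make_authkeys is a hypothetical helper, it assumes the GNU sha1sum tool is available, and the demonstration writes to a temporary directory rather than /etc/ha.d.

```shell
# make_authkeys FILE: write an authkeys file with a random sha1 key, mode 600.
make_authkeys() {
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | cut -d' ' -f1)
    # The subshell keeps the restrictive umask from leaking into the caller.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$1" )
    chmod 600 "$1"
}

dir=$(mktemp -d)
make_authkeys "$dir/authkeys"
cat "$dir/authkeys"
rm -rf "$dir"
```

The same key must be present on both nodes, so generate the file once and copy it to the other node rather than running the function on each machine.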

6) Configure heartbeat on the slave node
Copy the heartbeat configuration files from the master node to the slave node, and make sure the file permissions are the same on both nodes:
-----------------------------------------------
# scp /etc/ha.d/ha.cf root@server02:/etc/ha.d/
# scp /etc/ha.d/haresources root@server02:/etc/ha.d/
# scp /etc/ha.d/authkeys root@server02:/etc/ha.d/
-----------------------------------------------
In ha.cf on the slave node, change the ucast line so that it points to the master node:
ucast eth1 192.168.50.1 # IP of the peer
The other files need no changes.
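
The ucast change can also be scripted instead of edited by hand. The following is a minimal sketch, assuming the directive is written in lower case at the start of a line; set_ucast_peer is a name made up for this example, and the demonstration works on a temporary copy rather than /etc/ha.d/ha.cf. Note that it replaces the whole line, so any trailing comment on the ucast line is dropped.

```shell
# set_ucast_peer FILE IFACE PEER_IP: rewrite the ucast line of an ha.cf-style file.
set_ucast_peer() {
    tmp=$(mktemp) &&
    sed "s|^ucast[[:space:]].*|ucast $2 $3|" "$1" > "$tmp" &&
    cat "$tmp" > "$1" &&       # overwrite in place, keeping the file's owner and mode
    rm -f "$tmp"
}

conf=$(mktemp)
printf 'udpport 694\nucast eth1 192.168.50.2\nnode server01\n' > "$conf"
set_ucast_peer "$conf" eth1 192.168.50.1
cat "$conf"
rm -f "$conf"
```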

7) Test heartbeat with the HTTP service
On each host, create a test file index.html in the /var/www/html/ directory, containing "Server01" and "Server02" respectively.

Start the httpd and heartbeat services on both machines:
# service httpd start
# service heartbeat start

Under normal circumstances, heartbeat starts the web service and sets the virtual IP address 10.8.50.0 on the primary node. Running ifconfig on the master node shows an additional eth0:0 interface, with the following details:
-----------------------------------------------
# ifconfig eth0:0
eth0:0 Link encap:Ethernet HWaddr 08:00:27:DA:60:AD
inet addr:10.8.50.0 Bcast:10.8.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
-----------------------------------------------

This article is from the "Linux Operation and Maintenance" blog. Please contact the author before reproducing it.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
