Use Heartbeat to implement hot standby

Heartbeat can be used to implement either "dual-host hot backup" or "dual-host mutual backup".
How Heartbeat works: Heartbeat consists of two core parts, heartbeat monitoring and resource management. Heartbeat monitoring can run over network links and serial ports, and supports redundant links. The nodes send messages to each other to report their current status. If a node receives no message from its peer within the specified time, it considers the peer failed and starts the resource management module to take over the resources or services that were running on the peer.
The two Heartbeat hosts are the master node and the slave node. Normally the master node holds the resources and runs all services. When a fault occurs, the resources are handed over to the slave node, which then runs the services.
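
The timeout logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Heartbeat's actual implementation: the PeerMonitor class and its method names are hypothetical, and the 10-second and 30-second thresholds are borrowed from the warntime and deadtime values used later in this article.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Tracks the last heartbeat seen from the peer node (illustrative only)."""
    warntime: float = 10.0   # seconds of silence before a late-heartbeat warning
    deadtime: float = 30.0   # seconds of silence before the peer is declared dead
    last_seen: float = 0.0   # time of the most recent heartbeat, in seconds

    def heartbeat_received(self, now: float) -> None:
        """Record that a heartbeat message arrived at time `now`."""
        self.last_seen = now

    def status(self, now: float) -> str:
        """Classify the peer by how long it has been silent."""
        silence = now - self.last_seen
        if silence >= self.deadtime:
            return "dead"    # this is the point where resource takeover would start
        if silence >= self.warntime:
            return "late"
        return "alive"


monitor = PeerMonitor()
monitor.heartbeat_received(now=100.0)
for t in (102.0, 112.0, 131.0):
    print(t, monitor.status(now=t))
```

The peer is only declared dead after a full deadtime of silence, so a single lost packet does not trigger a takeover.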

I. Network Environment Settings
Each host has two Ethernet cards, one for network communication and the other for heartbeat. The network settings of the two nodes are as follows:
Node1: host name srv5.localdomain (NodeA)
eth0: 192.168.8.5 255.255.255.0 // external IP address
eth1: 192.168.9.5 255.255.255.0 // HA heartbeat address
Node2: host name srv6.localdomain (NodeB)
eth0: 192.168.8.6 255.255.255.0 // external IP address
eth1: 192.168.9.6 255.255.255.0 // HA heartbeat address
VIP: 192.168.8.100
In addition, another machine, 192.168.9.7, is connected to the same network and is used as a ping target to check network connectivity.
The network topology is as follows:

The following entries must be added to the /etc/hosts file on both machines:
192.168.8.5 srv5.localdomain
192.168.8.6 srv6.localdomain
In the /etc/sysconfig/network file on the master node, set HOSTNAME as follows:
HOSTNAME=srv5.localdomain
In the /etc/sysconfig/network file on the slave node, set HOSTNAME as follows:
HOSTNAME=srv6.localdomain

II. Installation and Configuration

2.1. Install heartbeat on both machines.
yum -y install heartbeat-stonith heartbeat-pils heartbeat-devel heartbeat-gui libnet
2.2 Configure heartbeat
The main configuration files of Heartbeat are ha.cf, haresources, and authkeys, all of which must be placed in the /etc/ha.d directory.
After Heartbeat is installed with yum, these three files do not exist there by default.
Sample copies are available in /usr/share/doc/heartbeat-2.1.3; copy them to /etc/ha.d and edit them.
cp /usr/share/doc/heartbeat-2.1.3/ha.cf /etc/ha.d/
cp /usr/share/doc/heartbeat-2.1.3/haresources /etc/ha.d/
cp /usr/share/doc/heartbeat-2.1.3/authkeys /etc/ha.d/

2.3 Main configuration file: ha.cf
The content is set as follows:
-----------------------------------------------
debugfile /var/log/ha-debug # records heartbeat debugging information
logfile /var/log/ha-log # records heartbeat log information
logfacility local0 # syslog facility
keepalive 2 # heartbeat (monitoring) interval; the default unit is seconds
warntime 10 # warning threshold for late heartbeats, usually about half of deadtime or less
deadtime 30 # if no heartbeat is received from the peer for 30 seconds, the peer is considered dead
initdead 120 # time allowed for the network to come up at startup, at least twice deadtime
hopfudge 1 # optional: used for ring topologies, number of hops to allow beyond the number of nodes in the cluster
udpport 694 # use UDP port 694 for heartbeat monitoring
ucast eth1 192.168.9.6 # use unicast for heartbeat monitoring; the IP address is that of the peer host
auto_failback on # on: when the node that normally owns the resources recovers, the resources migrate back to it
node srv5.localdomain # nodes in the cluster; each node name must match the output of uname -n
node srv6.localdomain # node 2
ping 192.168.8.2 192.168.9.7 # ping hosts outside the cluster (here the gateway and another machine) to check network connectivity
respawn root /usr/lib/heartbeat/ipfail
apiauth ipfail gid=root uid=root # permissions for the process started above
-----------------------------------------------
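
The relationships between the timing directives can be checked mechanically. The sketch below is a hypothetical helper, not part of Heartbeat. It assumes the values are plain integers in seconds (ha.cf also accepts other forms, which this parser does not handle). The rule that initdead is at least twice deadtime comes from the comments above, while the ordering keepalive < warntime < deadtime is an assumption made here for illustration.

```python
SAMPLE_HA_CF = """\
keepalive 2
warntime 10
deadtime 30
initdead 120
"""


def parse_timings(text: str) -> dict:
    """Extract the integer timing directives from ha.cf-style text."""
    wanted = {"keepalive", "warntime", "deadtime", "initdead"}
    timings = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments, then tokenize
        if len(fields) == 2 and fields[0] in wanted:
            timings[fields[0]] = int(fields[1])
    return timings


def check_timings(t: dict) -> list:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    if not t["keepalive"] < t["warntime"] < t["deadtime"]:
        problems.append("expected keepalive < warntime < deadtime")
    if t["initdead"] < 2 * t["deadtime"]:
        problems.append("initdead should be at least twice deadtime")
    return problems


print(check_timings(parse_timings(SAMPLE_HA_CF)))
```

With the values used in this article the list of problems is empty.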

2.4. Resource file haresources
The ha.cf file configures how heartbeats are checked, but not what to do when a check fails. haresources defines what Heartbeat does when the master server has a problem, that is, how to switch over when the master server goes down. A switchover usually involves moving the IP address, the services, and the shared storage, so that the slave server presents the same IP address, services, and storage volumes as the master server and clients do not notice the change. The file must be identical on the two HA nodes.
cat /etc/ha.d/haresources
srv5.localdomain IPaddr::192.168.8.100/32
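
To make the haresources syntax concrete: each line names the preferred node followed by resources, and "::" separates a resource script from its arguments. The following Python sketch shows how such a line maps to the start command that appears in the logs in the test section. The function names are hypothetical, and the sketch only considers scripts under /etc/ha.d/resource.d, which is a simplification.

```python
def parse_haresources_line(line: str):
    """Split one haresources line into (preferred_node, [(script, args), ...])."""
    node, *resources = line.split()
    parsed = []
    for res in resources:
        script, *args = res.split("::")   # "::" separates a script from its arguments
        parsed.append((script, args))
    return node, parsed


def start_commands(line: str, script_dir: str = "/etc/ha.d/resource.d"):
    """Render the start command for each resource on the line."""
    _, resources = parse_haresources_line(line)
    return [" ".join([f"{script_dir}/{script}", *args, "start"])
            for script, args in resources]


print(start_commands("srv5.localdomain IPaddr::192.168.8.100/32"))
```

On release, the same script is called with stop instead of start.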

2.5 Authentication file authkeys
This file configures how heartbeat messages are authenticated between the two nodes in the cluster. The algorithm and key must be the same on every node in the cluster. Three algorithms are provided: md5, sha1, and crc. crc provides no authentication; it can only detect corrupted packets. sha1 and md5 require a key for authentication.
In this example, the content is set as follows:
-----------------------------------------------
cat /etc/ha.d/authkeys
auth 1
1 crc
-----------------------------------------------
Note: the file permissions must be changed to 600, otherwise heartbeat will fail to start.
[root@server01 ~]# chmod 600 /etc/ha.d/authkeys
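
Since crc offers no authentication, a keyed method may be preferable outside a test setup. The sketch below, written under stated assumptions, generates authkeys content with a random key and writes it with mode 600. The helper names are hypothetical, the key length is an arbitrary choice, and the demonstration writes to a scratch file rather than the real /etc/ha.d/authkeys.

```python
import os
import secrets
import stat
import tempfile


def render_authkeys(method: str = "sha1", key_id: int = 1) -> str:
    """Build authkeys content; sha1 and md5 get a random key, crc takes none."""
    if method == "crc":
        return f"auth {key_id}\n{key_id} crc\n"
    key = secrets.token_hex(16)   # random shared secret (32 hex characters)
    return f"auth {key_id}\n{key_id} {method} {key}\n"


def write_private(path: str, content: str) -> None:
    """Write the file so that only its owner can read or write it (mode 600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.chmod(path, 0o600)   # enforce 600 even if the file already existed


# Demonstration on a scratch file rather than the real /etc/ha.d/authkeys.
demo_path = os.path.join(tempfile.mkdtemp(), "authkeys")
write_private(demo_path, render_authkeys("sha1"))
print(oct(stat.S_IMODE(os.stat(demo_path).st_mode)))
```

The generated file must be copied unchanged to the other node, because both nodes need the same key.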

2.6 Configure heartbeat on the slave node
Copy the heartbeat configuration files from the master node to the slave node, and make sure the files have the same permissions on both nodes:
-----------------------------------------------
scp /etc/ha.d/ha.cf root@srv6.localdomain:/etc/ha.d/
scp /etc/ha.d/haresources root@srv6.localdomain:/etc/ha.d/
scp /etc/ha.d/authkeys root@srv6.localdomain:/etc/ha.d/
-----------------------------------------------
On the slave node, the ucast line in ha.cf must be changed to point to the master node:
ucast eth1 192.168.9.5 # IP address of the peer
The other files do not need to be modified. The haresources and authkeys files must be the same on the master and slave nodes.
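
Because the ucast line is the only per-node difference, it can be derived from a table of heartbeat addresses. This is a small hypothetical helper for a two-node cluster, using the eth1 addresses from section I.

```python
HEARTBEAT_IPS = {
    "srv5.localdomain": "192.168.9.5",
    "srv6.localdomain": "192.168.9.6",
}


def ucast_line(local_node: str, interface: str = "eth1") -> str:
    """Return the ucast directive for local_node, which targets the other node."""
    peers = [ip for name, ip in HEARTBEAT_IPS.items() if name != local_node]
    if local_node not in HEARTBEAT_IPS or len(peers) != 1:
        raise ValueError("expected a known node in a two-node cluster")
    return f"ucast {interface} {peers[0]}"


for node in HEARTBEAT_IPS:
    print(node, "->", ucast_line(node))
```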

III. Test
Use the http service to test heartbeat.
Create a test file index.html in the /var/www/html/ directory on each node, with the content "NodeA" and "NodeB" respectively.

Start the httpd service and the heartbeat service on both machines.
# service httpd start
# service heartbeat start
Start them on the master node first.
Heartbeat configures the virtual IP address 192.168.8.100 on the master node. Running ifconfig on the master node shows that eth0:0 has been added. The details are as follows:
-----------------------------------------------
[root@srv5 ha.d]# ifconfig eth0:0
eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:D8:F1:9C
          inet addr:192.168.8.100  Bcast:192.168.8.100  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:67 Base address:0x2000
-----------------------------------------------
View the logs:
heartbeat[23230]: 2014/10/04_01:28:24 info: Local status now set to: 'up'
heartbeat[23230]: 2014/10/04_01:28:25 info: Link 192.168.8.2:192.168.8.2 up.
heartbeat[23230]: 2014/10/04_01:28:25 info: Status update for node 192.168.8.2: status ping
heartbeat[23230]: 2014/10/04_01:28:25 info: Status update for node 192.168.9.7: status ping
heartbeat[23230]: 2014/10/04_01:28:25 info: Link 192.168.9.7:192.168.9.7 up.
heartbeat[23230]: 2014/10/04_01:30:24 WARN: node srv6.localdomain: is dead // the slave node has not started yet
heartbeat[23230]: 2014/10/04_01:30:24 info: Comm_now_up(): updating status to active
heartbeat[23230]: 2014/10/04_01:30:24 info: Local status now set to: 'active' // take over resources from the other node
heartbeat[23230]: 2014/10/04_01:30:24 info: starting child client "/usr/lib/heartbeat/ipfail" (23230)
heartbeat[]: 2014/10/04_01:30:24 WARN: No STONITH device configured.
heartbeat[23230]: 2014/10/04_01:30:24 WARN: Shared disks are not protected.
heartbeat[23230]: 2014/10/04_01:30:24 info: Resources being acquired from srv6.localdomain.
heartbeat[23317]: 2014/10/04_01:30:24 info: Starting "/usr/lib/heartbeat/ipfail" as uid 0 gid 0 (pid 23317)
harc[23318]: 2014/10/04_01:30:24 info: running /etc/ha.d/rc.d/status status
mach_down[23364]: 2014/10/04_01:30:25 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
IPaddr[23385]: 2014/10/04_01:30:25 INFO: resource is stopped // the resource is then started on the local machine
ResourceManager[23473]: 2014/10/04_01:30:25 info: Acquiring Resource group: srv5.localdomain IPaddr::192.168.8.100/32
IPaddr[23500]: 2014/10/04_01:30:25 INFO: Resource is stopped
ResourceManager[23473]: 2014/10/04_01:30:25 info: Running /etc/ha.d/resource.d/IPaddr 192.168.8.100/32 start

Other machines can open http://192.168.8.100/index.html in a browser; the page content is "NodeA".
Disconnect eth1 on the master node and refresh the page. The content is now "NodeB", which shows that the http service has been handed over to the slave node and is now served by it.
At this point, ifconfig on the master node no longer shows the eth0:0 alias or the VIP.
The log of the master node shows it releasing the resources:
[root@srv5 ha.d]# tail /var/log/ha-log
ResourceManager[21049]: 2014/10/04_00:52:00 info: Releasing resource group: srv5.localdomain IPaddr::192.168.8.100/32
ResourceManager[21049]: 2014/10/04_00:52:00 info: Running /etc/ha.d/resource.d/IPaddr 192.168.8.100/32 stop
IPaddr[21113]: 2014/10/04_00:52:00 INFO: ifconfig eth0:0 down
IPaddr[21087]: 2014/10/04_00:52:00 INFO: Success
The slave node takes over the resources, as its log shows:
IPaddr[20183]: 2014/10/04_01:43:53 INFO: eval ifconfig eth0:0 192.168.8.100 netmask 255.255.255.255 broadcast 192.168.8.100
IPaddr[20157]: 2014/10/04_01:43:53 INFO: Success
mach_down[20038]: 2014/10/04_01:43:53 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[20038]: 2014/10/04_01:43:53 info: mach_down takeover complete for node srv5.localdomain.
heartbeat[19467]: 2014/10/04_01:43:53 info: mach_down takeover complete.
ipfail[19495]: 2014/10/04_01:43:53 info: NS: We are still alive!
ipfail[19495]: 2014/10/04_01:43:53 info: Link Status update: Link srv5.localdomain/eth1 now has status dead
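
Instead of refreshing a browser, the failover test above can be watched from a client with a small polling script. The following Python sketch is one possible way to do this. The function names are hypothetical, and the demonstration at the end uses canned responses in place of a real cluster so that it can run anywhere.

```python
import time
import urllib.request


def fetch_page(url: str, timeout: float = 2.0) -> str:
    """Return the stripped body of the page, or "unreachable" on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace").strip()
    except OSError:   # covers connection errors, timeouts, and HTTP errors
        return "unreachable"


def watch(fetch, attempts: int, delay: float = 1.0) -> list:
    """Poll `fetch` and record each change in what the VIP serves."""
    changes, previous = [], None
    for _ in range(attempts):
        current = fetch()
        if current != previous:
            changes.append(current)
            previous = current
        time.sleep(delay)
    return changes


# Offline demonstration with canned responses standing in for a real failover.
canned = iter(["NodeA", "NodeA", "unreachable", "NodeB", "NodeB"])
print(watch(lambda: next(canned), attempts=5, delay=0))
```

Against the cluster in this article, the call would be watch(lambda: fetch_page("http://192.168.8.100/index.html"), attempts=60), started before eth1 is disconnected on the master node.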
