There are many kinds of Linux high-availability cluster software, such as heartbeat, corosync, RHCS and keepalived, all of which provide high-availability guarantees for production environments. This article briefly shows how to use version 2 of heartbeat to set up a simple high-availability HTTP cluster.
Before implementing an HTTP high-availability cluster, you need at least 2 hosts and 3 pieces of basic preparatory work:
1. Set the node names so that every node in the cluster can resolve every other node by its node name. So that the cluster does not depend on an external name service, use /etc/hosts for resolution, and make sure that the output of uname -n matches the configured hostname.
2. Set up mutual SSH trust between the two machines.
3. Synchronize the time.
Complete the basic work above before installing the heartbeat software. Here we use 2 hosts (192.168.1.201 and 192.168.1.202) for our high-availability cluster service.
1. First log in to 192.168.1.201 and set the hostname to test1.qiguo.com, then set HOSTNAME=test1.qiguo.com in /etc/sysconfig/network so that the host name does not change the next time the server starts. Add the two lines 192.168.1.201 test1.qiguo.com test1 and 192.168.1.202 test2.qiguo.com test2 to /etc/hosts. Then perform the same operation on the 192.168.1.202 host, using test2.qiguo.com as its hostname.
2. Run ssh-keygen -t rsa on 192.168.1.201, then use ssh-copy-id -i to copy the public key to the other node to establish SSH trust. Perform this step on both machines.
3. Run the time synchronization command ntpdate 133.100.11.8 on both servers (any available NTP server IP address will do).
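The name-resolution requirement from step 1 is easy to verify with a short script. The following is only a minimal sketch: the function name check_hosts is made up for illustration, and it assumes the plain /etc/hosts layout (address first, names after it).

```shell
# check_hosts FILE NAME...
# Succeeds only if every NAME appears as a host name or alias in FILE.
# check_hosts is a hypothetical helper, not part of heartbeat.
check_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, skip the address field, then compare each
        # remaining field with the wanted name.
        if ! awk -v want="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file"; then
            echo "missing in $hosts_file: $name" >&2
            return 1
        fi
    done
}

# Typical use on each node (names taken from this article):
#   check_hosts /etc/hosts test1.qiguo.com test2.qiguo.com "$(uname -n)"
```

Passing the output of uname -n as one of the names also covers the rule that the node name must be resolvable.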
Once the above three steps are complete, you can start installing heartbeat. You can download heartbeat's installation packages from EPEL. Four packages are required: heartbeat-2.1.4-11.el5.i386, heartbeat-gui-2.1.4-11.el5.i386, heartbeat-pils-2.1.4-11.el5.i386 and heartbeat-stonith-2.1.4-11.el5.i386. These 4 packages depend on two other packages, perl-MailTools-1.77-1.el5.noarch and libnet-1.1.6-7.el5.i386, so install those 2 first. Installing with rpm -ivh perl-MailTools-1.77-1.el5.noarch reports dependency errors, so use yum --nogpgcheck localinstall perl-MailTools-1.77-1.el5.noarch instead. Then install the remaining packages together in the same way. Note: this software must be installed on both servers.
After heartbeat is installed, its default configuration directory is /etc/ha.d. In that directory, rc.d holds the resource-management scripts and resource.d holds the resource-agent scripts; the service scripts are in /etc/ha.d/heartbeat. A default installation does not include any configuration files, but you can copy the three files ha.cf, authkeys and haresources from /usr/share/doc/heartbeat-2.1.4/ into /etc/ha.d. The functions of these 3 configuration files are:
authkeys: the key file. Its permissions must be 600, otherwise the heartbeat service cannot be started.
ha.cf: the configuration file for the heartbeat service itself.
haresources: the resource (resource agent) configuration file.
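Before starting the service, it can help to confirm that all three files are in place and that authkeys has the required permissions. This is a rough sketch under the assumptions stated in the comments; check_ha_files is a hypothetical helper name, and stat -c is the GNU coreutils form.

```shell
# check_ha_files DIR
# Verifies that ha.cf, authkeys and haresources exist in DIR and that
# authkeys has mode 600. Returns non-zero on the first problem found.
# check_ha_files is a hypothetical helper, not part of heartbeat.
check_ha_files() {
    dir=$1
    for f in ha.cf authkeys haresources; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    # stat -c %a prints the octal permission bits (GNU coreutils).
    mode=$(stat -c %a "$dir/authkeys")
    if [ "$mode" != "600" ]; then
        echo "authkeys has mode $mode, expected 600" >&2
        return 1
    fi
}

# Typical use:
#   check_ha_files /etc/ha.d
```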
Configuring these 3 files is all we need to do to implement our HTTP high-availability cluster. First look at the authkeys file:
#auth 1          selects which of the keys below is used for authentication
#1 crc           CRC (cyclic redundancy check) authentication
#2 sha1 HI!      SHA1 authentication
#3 md5 Hello!    MD5 authentication
It is best to use SHA1 or MD5 authentication; CRC only detects corrupted packets and offers no real security. A configuration file using MD5 authentication looks like this:
auth 1 # use the line below that starts with 1 as the authentication key
1 md5 9adc3f50d9bb9e9c795fce0a839aa766
To generate the MD5 string, simply run echo "Qiguo" | md5sum on the shell command line.
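Generating the key and writing the file can be combined into one step. The sketch below assumes the md5 method described above; write_authkeys is a hypothetical helper name, and the passphrase is whatever string you choose.

```shell
# write_authkeys DIR PASSPHRASE
# Writes DIR/authkeys using md5 with a key derived from PASSPHRASE,
# then restricts the file to mode 600 as heartbeat requires.
# write_authkeys is a hypothetical helper, not part of heartbeat.
write_authkeys() {
    dir=$1
    passphrase=$2
    # md5sum prints "<hash>  -"; keep only the hash field.
    key=$(echo "$passphrase" | md5sum | awk '{ print $1 }')
    (
        umask 077   # create the file without group/other access
        printf 'auth 1\n1 md5 %s\n' "$key" > "$dir/authkeys"
    )
    chmod 600 "$dir/authkeys"
}

# Typical use:
#   write_authkeys /etc/ha.d "some passphrase"
```

The same authkeys file has to be present on both nodes, so generate it once and copy it rather than running the function separately with different passphrases.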
The second configuration file, ha.cf, contains many options. The main ones are briefly described as follows:
#debugfile /var/log/ha-debug    # whether to enable the debug log
logfile /var/log/ha-log    # location of the log file
#logfacility local0    # syslog facility; if logfile is enabled, do not enable this option
keepalive 2    # interval between heartbeats
#deadtime 30    # how long a node may go undetected before it is considered dead
#warntime    # how long before a late-heartbeat warning is issued
#initdead    # how long after the first node starts to wait for the second node before the cluster start is considered unsuccessful
#udpport 694    # port to listen on
#baud 19200    # transmission rate of the serial line
bcast eth0    # send heartbeats by broadcast (we use broadcast here, so bcast eth0 is enabled directly; on a LAN with many hosts this consumes a lot of resources)
#mcast eth0 225.0.0.1 694 1 0    # send heartbeats by multicast
#ucast eth0 192.168.1.2    # send heartbeats by unicast
#auto_failback on    # whether resources move back to the primary node after it fails and recovers; on means they move back
#stonith baytech /etc/ha.d/conf/stonith.baytech    # defines stonith, i.e. how to fence nodes that are no longer working
#node ken3    # names of the nodes in the cluster; each node needs its own node entry, and the value must match the output of uname -n
node test1.qiguo.com
node test2.qiguo.com
#ping 10.10.10.254    # address to ping
ping 192.168.1.1    # gateway address
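Because each node entry must match uname -n exactly, a quick check against ha.cf can catch a mismatch before heartbeat is started. This is a minimal sketch; hacf_nodes and hacf_has_node are hypothetical helper names, and the parsing assumes the simple "node name" form shown above.

```shell
# hacf_nodes FILE
# Prints every node name declared by an active (uncommented) node directive.
# Hypothetical helper, not part of heartbeat.
hacf_nodes() {
    awk '
        # drop comments, including trailing ones
        { sub(/#.*/, "") }
        # print each name, in case several names share one line
        $1 == "node" { for (i = 2; i <= NF; i++) print $i }
    ' "$1"
}

# hacf_has_node FILE NAME
# Succeeds if NAME is exactly one of the declared nodes.
hacf_has_node() {
    hacf_nodes "$1" | grep -Fxq -- "$2"
}

# Typical use on each node:
#   hacf_has_node /etc/ha.d/ha.cf "$(uname -n)" || echo "this host is not listed in ha.cf"
```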
The third configuration file, haresources, is the cluster resource configuration file. It contains a number of configuration examples; take one of them as an illustration:
#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
node1 is the name of the primary node, 10.0.0.170 is the VIP, and Filesystem is the resource agent. Resource agents are looked up in /etc/ha.d/resource.d and /etc/init.d/, and "::" separates the parameters passed to the resource agent. Here we are making HTTP highly available, so the configuration is as follows:
test1.qiguo.com IPaddr::192.168.1.210/24/eth0 httpd
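Resource agent names are file names, so they are case-sensitive: the IP address agent is IPaddr, and a misspelled name simply will not be found. The sketch below looks a name up in the two directories mentioned above. find_agent is a hypothetical helper name, and searching resource.d before init.d is an assumption of this sketch.

```shell
# find_agent NAME [RESOURCE_D] [INIT_D]
# Prints the path of the executable script for NAME, looking in resource.d
# first and then in init.d (the search order is an assumption of this sketch).
# find_agent is a hypothetical helper, not part of heartbeat.
find_agent() {
    name=$1
    resource_d=${2:-/etc/ha.d/resource.d}
    init_d=${3:-/etc/init.d}
    for dir in "$resource_d" "$init_d"; do
        if [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    echo "no resource agent named $name" >&2
    return 1
}

# Typical use for the haresources line in this article:
#   find_agent IPaddr && find_agent httpd
```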
Once the three configuration files above are complete, copy them to the 192.168.1.202 host. After copying, install the httpd service on both hosts, and make sure it is not set to start automatically at boot, because heartbeat has to control when it starts. When everything is configured, stop the httpd service and then start the heartbeat service.
heartbeat[4825]: 2014/05/11_23:54:35 info: Version 2 support: false
heartbeat[4825]: 2014/05/11_23:54:35 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[4825]: 2014/05/11_23:54:35 info: **************************
heartbeat[4825]: 2014/05/11_23:54:35 info: Configuration validated. Starting heartbeat 2.1.4
heartbeat[4826]: 2014/05/11_23:54:35 info: heartbeat: version 2.1.4
heartbeat[4826]: 2014/05/11_23:54:35 info: Heartbeat generation: 1399811242
heartbeat[4826]: 2014/05/11_23:54:35 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
heartbeat[4826]: 2014/05/11_23:54:35 info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
heartbeat[4826]: 2014/05/11_23:54:35 info: glib: ping heartbeat started.
heartbeat[4826]: 2014/05/11_23:54:35 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[4826]: 2014/05/11_23:54:35 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[4826]: 2014/05/11_23:54:35 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[4826]: 2014/05/11_23:54:35 info: Local status now set to: 'up'
heartbeat[4826]: 2014/05/11_23:54:36 info: Link test1.qiguo.com:eth0 up.
heartbeat[4826]: 2014/05/11_23:54:36 info: Link 192.168.1.1:192.168.1.1 up.
heartbeat[4826]: 2014/05/11_23:54:36 info: Status update for node 192.168.1.1: status ping
heartbeat[4826]: 2014/05/11_23:54:41 info: Link test2.qiguo.com:eth0 up.
heartbeat[4826]: 2014/05/11_23:54:41 info: Status update for node test2.qiguo.com: status up
harc[4835]: 2014/05/11_23:54:41 info: Running /etc/ha.d/rc.d/status status
heartbeat[4826]: 2014/05/11_23:54:42 info: Comm_now_up(): updating status to active
heartbeat[4826]: 2014/05/11_23:54:42 info: Local status now set to: 'active'
heartbeat[4826]: 2014/05/11_23:54:42 info: Status update for node test2.qiguo.com: status active
harc[4853]: 2014/05/11_23:54:42 info: Running /etc/ha.d/rc.d/status status
heartbeat[4826]: 2014/05/11_23:54:53 info: remote resource transition completed.
heartbeat[4826]: 2014/05/11_23:54:53 info: remote resource transition completed.
heartbeat[4826]: 2014/05/11_23:54:53 info: Initial resource acquisition complete (T_RESOURCES(us))
IPaddr[4907]: 2014/05/11_23:54:53 INFO: Resource is stopped
heartbeat[4871]: 2014/05/11_23:54:53 info: Local Resource acquisition completed.
harc[4957]: 2014/05/11_23:54:53 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[4957]: 2014/05/11_23:54:53 received ip-request-resp IPaddr::192.168.1.210/24/eth0 OK yes
ResourceManager[4976]: 2014/05/11_23:54:53 info: Acquiring resource group: test1.qiguo.com IPaddr::192.168.1.210/24/eth0 httpd
IPaddr[5002]: 2014/05/11_23:54:53 INFO: Resource is stopped
ResourceManager[4976]: 2014/05/11_23:54:53 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.210/24/eth0 start
IPaddr[5097]: 2014/05/11_23:54:53 INFO: Using calculated netmask for 192.168.1.210: 255.255.255.0
IPaddr[5097]: 2014/05/11_23:54:53 INFO: eval ifconfig eth0:0 192.168.1.210 netmask 255.255.255.0 broadcast 192.168.1.255
IPaddr[5068]: 2014/05/11_23:54:53 INFO: Success
ResourceManager[4976]: 2014/05/11_23:54:53 info: Running /etc/init.d/httpd start
The logs show that the high-availability HTTP cluster has started. The resources are now running on test1, so run shutdown -h on that machine and watch how the log changes. (You can also switch over with heartbeat's own hb_standby script, which is in the /usr/lib/heartbeat directory by default.)
heartbeat[11796]: 2014/05/11_20:56:46 info: Received shutdown notice from 'test1.qiguo.com'.
heartbeat[11796]: 2014/05/11_20:56:46 info: Resources being acquired from test1.qiguo.com.
heartbeat[11862]: 2014/05/11_20:56:46 info: acquire local HA resources (standby).
heartbeat[11863]: 2014/05/11_20:56:46 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test2.qiguo.com] to acquire.
heartbeat[11862]: 2014/05/11_20:56:46 info: local HA resource acquisition completed (standby).
heartbeat[11796]: 2014/05/11_20:56:46 info: Standby resource acquisition done [all].
harc[11888]: 2014/05/11_20:56:46 info: Running /etc/ha.d/rc.d/status status
mach_down[11903]: 2014/05/11_20:56:46 info: Taking over resource group IPaddr::192.168.1.210/24/eth0
ResourceManager[11928]: 2014/05/11_20:56:46 info: Acquiring resource group: test1.qiguo.com IPaddr::192.168.1.210/24/eth0 httpd
IPaddr[11954]: 2014/05/11_20:56:46 INFO: Resource is stopped
ResourceManager[11928]: 2014/05/11_20:56:46 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.210/24/eth0 start
IPaddr[12049]: 2014/05/11_20:56:46 INFO: Using calculated netmask for 192.168.1.210: 255.255.255.0
IPaddr[12049]: 2014/05/11_20:56:46 INFO: eval ifconfig eth0:0 192.168.1.210 netmask 255.255.255.0 broadcast 192.168.1.255
IPaddr[12020]: 2014/05/11_20:56:46 INFO: Success
ResourceManager[11928]: 2014/05/11_20:56:46 info: Running /etc/init.d/httpd start
mach_down[11903]: 2014/05/11_20:56:46 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[11903]: 2014/05/11_20:56:46 info: mach_down takeover complete for node test1.qiguo.com.
heartbeat[11796]: 2014/05/11_20:56:46 info: mach_down takeover complete.
The log on the standby server shows that it has taken over all the resources. If you visit 192.168.1.210 now, you will see the content served by the test2 host. When test1 comes back online, the resources are taken back again because we set auto_failback to on; that log is not reproduced here. At this point a simple high-availability httpd service has been set up.
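When following a takeover in /var/log/ha-log, the lines of most interest are the ones where ResourceManager runs a resource script. The filter below is a small sketch that assumes the log layout of heartbeat 2.1.4 as seen in this article (process tag, timestamp, level, message); resource_actions is a hypothetical helper name.

```shell
# resource_actions LOGFILE
# Prints the timestamp, script path and action (the last word, e.g. start)
# of every resource script that ResourceManager reports running.
# resource_actions is a hypothetical helper, not part of heartbeat.
resource_actions() {
    awk '
        $1 ~ /^ResourceManager\[/ && $4 == "Running" { print $2, $5, $NF }
    ' "$1"
}

# Typical use on the node that took over:
#   resource_actions /var/log/ha-log
```

For the takeover in this article, the filter would list the IPaddr script first and the httpd init script second, which is the order in which the resources appear in haresources.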
Because a high-availability httpd service often also relies on shared file storage, it is sometimes necessary to mount a shared file system. You only need to add a Filesystem resource to haresources:
test1.qiguo.com IPaddr::192.168.1.210/24/eth0 Filesystem::192.168.1.230:/html::/var/www/html::nfs httpd
This mounts the shared directory using the NFS file system.
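Note that the NFS device 192.168.1.230:/html contains a single colon, while the parameters are separated by a double colon. The sketch below splits one resource entry on "::" only, to show how the fields line up; split_resource is a hypothetical helper name.

```shell
# split_resource ENTRY
# Prints the agent name and then each parameter of one haresources
# entry, one per line. Only "::" is treated as a separator, so a single
# colon inside a parameter (as in an NFS export) is left alone.
# split_resource is a hypothetical helper, not part of heartbeat.
split_resource() {
    printf '%s\n' "$1" | awk -F'::' '{ for (i = 1; i <= NF; i++) print $i }'
}

# Typical use:
#   split_resource 'Filesystem::192.168.1.230:/html::/var/www/html::nfs'
```

For the entry above this yields the agent Filesystem followed by the device, the mount point and the file system type.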