Linux 6.3 RHCS Installation and Cluster Configuration Documentation
Environment:
We have two blades in a Huawei E6000 chassis on which to install RHCS. Each blade has two service network ports and one management network port; the physical NICs are not directly visible because they connect to the switch board carried by the blade chassis itself. The two blades mainly implement a floating server address: when the machine running the service fails, the service switches smoothly to the other machine. You can of course add other resources (such as Apache or a script) to extend the cluster, but the procedure is the same; only the resource added in the configuration interface differs.
The hosts file on the blades:
172.16.32.1 host1
172.16.32.2 host2
172.16.14.21 HOSTHB1
172.16.14.22 HOSTHB2
172.16.14.1 HOST1_IPMI
172.16.14.2 HOST2_IPMI
172.16.32.15 service
A brief explanation of the addresses: the first group is the service addresses, configured on NIC eth0; the second group is the heartbeat addresses, configured on NIC eth1 (these are also tied to the qdisk); the third group is the management addresses of the blades, configured on the BMC chip and tied to the fence devices; the last address is the service address, the address that actually provides the service in the cluster.
First, some key concepts in RHCS: what these things are and how they work.
Fence device: an isolation device. In a two-node cluster, one machine holds the resources and provides the service; when it fails, the other machine takes over the resources and the service. But if the failed machine is merely hung, it may still hold the resources (shared storage and so on), and the two machines could then read and write the storage at the same time and corrupt it; this is the so-called split-brain phenomenon. A fence device prevents this: when a node misbehaves, its BMC chip is instructed to cut its power and restart it, so the resources are released. The isolation device can be understood as the BMC chip, since the BMC chip can power the server on and off and restart it. As mentioned above, each blade has a BMC port (management port), but this port is not visible at the operating-system level; unlike eth0 and eth1, ifconfig shows no details for it. To manage this port, install the IPMI-related packages that ship with the operating system:
freeipmi-0.7.16-3.el6.i686.rpm
freeipmi-0.7.16-3.el6.x86_64.rpm
freeipmi-bmc-watchdog-0.7.16-3.el6.x86_64.rpm
freeipmi-ipmidetectd-0.7.16-3.el6.x86_64.rpm
ipmitool-1.8.11-13.el6.x86_64.rpm
openipmi-2.0.16-12.el6.x86_64.rpm
openipmi-libs-2.0.16-12.el6.x86_64.rpm
On a 64-bit machine, install the 64-bit packages; the specific usage of the IPMI interface is described later. Once the packages are installed, the system can talk to the BMC chip through IPMI. The BMC network port also has a management address of its own: it must be set in the IPMI-related options of the blade BIOS, together with a user name and password, and this information is used later when configuring the cluster. We put the BMC addresses in the same network segment as the eth1 addresses, so the systems can reach the BMC chips and control server start and stop.
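As a quick local check that the IPMI stack works (a sketch; the channel number 1 is an assumption, adjust to your hardware), start the IPMI service shipped with OpenIPMI and print the BMC LAN settings:
# service ipmi start
# ipmitool lan print 1
The output should show the IP address and LAN parameters configured for the BMC in the BIOS.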
Qdisk: commonly called the quorum (arbitration) disk; together with the heartbeat addresses it is used to determine whether the devices in the cluster are healthy. It is usually a small piece of shared storage, around 1 GB.
Failover domain: defines the set of nodes a service may run on and the node switchover policy.
Configure the IP addresses before installing the software, including the IPMI configuration inside the BIOS.
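For reference, a minimal sketch of the NIC configuration on host1 (the netmask is an assumption; repeat per NIC and per node with the matching addresses):
# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
IPADDR=172.16.32.1
NETMASK=255.255.255.0
ONBOOT=yes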
RHCS software installation:
Installing RHCS requires some package groups from the installation DVD that ships with the system, as follows:
HighAvailability LoadBalancer Packages ResilientStorage ScalableFileSystem Server
Copy these directories to /mnt; alternatively, skip the copy and simply point the yum sources at the mounted DVD.
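For example (the ISO path is an assumption; a physical DVD in /dev/cdrom works the same way):
# mount -o loop /rhel-server-6.3-x86_64-dvd.iso /media
# cp -a /media/Server /media/HighAvailability /media/LoadBalancer /media/ResilientStorage /media/ScalableFileSystem /media/Packages /mnt/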
Setting up the yum sources:
# vi /etc/yum.repos.d/rhel-source.repo
[root@host1 yum.repos.d]# cat rhel-source.repo
[rhel-source]
name=Red Hat Enterprise Linux $releasever - $basearch - Source
baseurl=file:///mnt/Packages
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
[rhel-source-beta]
name=Red Hat Enterprise Linux $releasever Beta - $basearch - Source
baseurl=file:///mnt/Packages
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-beta,file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
[Server]
name=Server
baseurl=file:///mnt/Server
enabled=1
gpgcheck=0
[HighAvailability]
name=HighAvailability
baseurl=file:///mnt/HighAvailability
enabled=1
gpgcheck=0
[LoadBalancer]
name=LoadBalancer
baseurl=file:///mnt/LoadBalancer
enabled=1
gpgcheck=0
[ScalableFileSystem]
name=ScalableFileSystem
baseurl=file:///mnt/ScalableFileSystem
enabled=1
gpgcheck=0
[ResilientStorage]
name=ResilientStorage
baseurl=file:///mnt/ResilientStorage
enabled=1
gpgcheck=0
Make sure the baseurl path of each repository is correct.
Install the software:
# yum install cluster-glue resource-agents pacemaker
Enter y when prompted.
# yum install luci ricci cman openais rgmanager lvm2-cluster gfs2-utils
Enter y when prompted.
Start the HA services:
# service luci start
# service ricci start
# service rgmanager start
# service cman start
Set the services to start at the appropriate runlevels:
cman       0:off 1:off 2:on 3:on 4:on 5:on 6:off
rgmanager  0:off 1:off 2:on 3:on 4:on 5:on 6:off
luci       0:off 1:off 2:on 3:on 4:on 5:on 6:off
ricci      0:off 1:off 2:on 3:on 4:on 5:on 6:off
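For example, the runlevels above can be enabled with chkconfig:
# chkconfig cman on
# chkconfig rgmanager on
# chkconfig luci on
# chkconfig ricci on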
Set a password for the ricci user:
# passwd ricci
Here it is set to the same password as the root user.
Create the qdisk:
Usage: mkqdisk -L | -f <label> | -c <device> -l <label>
# mkqdisk -c /dev/sdb -l qdisk
Here a 1 GB shared device, /dev/sdb, is formatted as a quorum disk labeled qdisk.
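To verify, the quorum disks that are visible can be listed with:
# mkqdisk -L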
Ping the heartbeat and BMC port addresses from each node; all of them should respond.
Check whether host1 can manage host2's power module:
# ipmitool -I lan -H 172.16.14.2 -U root -P <password> power status
(-H takes the BMC port address of the second machine, -U the user name and -P the password set in the BIOS; the real password is omitted here.)
[root@host1 ~]# ipmitool -I lan -H 172.16.14.2 -U root -P <password> power status
Chassis Power is on
Creating a Cluster
After making sure that the luci and ricci services on the primary node are up, open the cluster configuration page at https://172.16.32.1:8084.
Log in, then start creating the cluster using the ricci password you set:
Add nodes:
The JYAPP_01HB and JYAPP_02HB here are our HOSTHB1 and HOSTHB2.
The configuration is basically the defaults; nothing special to say.
Add fence devices:
Select IPMI LAN; fence devices differ per vendor, so configure according to your hardware.
Enter the user name, password, and IP address configured earlier in the BIOS.
Two blades, two fence devices.
After creation, associate each fence device with its node:
Add a fence method to the node first, then associate it with the fence instance you created, one to one. A sketch of the result follows.
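For reference, luci writes this into /etc/cluster/cluster.conf on both nodes; a sketch of the resulting sections (the device names and the placeholder password are assumptions):
<fencedevices>
  <fencedevice agent="fence_ipmilan" name="host1_fence" ipaddr="172.16.14.1" login="root" passwd="***"/>
  <fencedevice agent="fence_ipmilan" name="host2_fence" ipaddr="172.16.14.2" login="root" passwd="***"/>
</fencedevices>
<clusternode name="JYAPP_01HB" nodeid="1">
  <fence>
    <method name="ipmi">
      <device name="host1_fence"/>
    </method>
  </fence>
</clusternode>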
Failover domain configuration:
In the failure domain configuration, "prioritized" enables node priorities, "restricted" limits the service to the listed nodes, and "no failback" means that when the failed machine recovers the service does not fail back to it. The lower the priority value, the higher the priority; 0 is not a valid value.
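A sketch of the corresponding cluster.conf section (the domain name jyapp_fd is an assumption):
<failoverdomains>
  <failoverdomain name="jyapp_fd" ordered="1" restricted="1" nofailback="1">
    <failoverdomainnode name="JYAPP_01HB" priority="1"/>
    <failoverdomainnode name="JYAPP_02HB" priority="2"/>
  </failoverdomain>
</failoverdomains>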
Add resources:
Here we add only a service address (the floating address).
Note: we later added a script resource. The script starts a middleware product and then the application, and the two must start in that order; that ordering is enforced inside the script itself. Starting the script and bringing up the service address are also ordered, so these two resources are added a bit differently: add the service address resource first, then add the script as a child resource of it; the parent exists first, then the child. If there were no ordering requirement, you would simply add them as two sibling resources. The script contents and explanations are at the end of the document; a sketch of the nesting follows.
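A sketch of what the nested resources look like in cluster.conf (the script path and resource name are shown for illustration only; at this step only the IP resource exists):
<service autostart="1" domain="jyapp_fd" name="jysg" recovery="relocate">
  <ip address="172.16.32.15" monitor_link="on">
    <script file="/etc/init.d/tuxedo.sh" name="jysg_app"/>
  </ip>
</service>
rgmanager starts a parent resource before its children and stops them in reverse order, which is what enforces the ordering described above.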
Add a service group:
In the service configuration, choose our failover domain and tick "automatically start this service". "Run exclusive" means the service will only run on a node not running other services; it can be left unticked. For the recovery policy select relocate, i.e. switch to the other node.
The remaining defaults need no changes. It is best to set the multicast address explicitly; leaving the default also works, but if several RHCS clusters run in one business network their multicast addresses will collide.
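In cluster.conf the multicast address sits under the cman element, for example (using the address visible in the cman_tool status output later):
<cman expected_votes="3">
  <multicast addr="239.192.14.111"/>
</cman>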
Qdisk configuration: identify the disk by the label we set. The working mechanism is a heuristic that pings the gateway every 2 seconds, scoring 1 point on success and 0 on failure, over 10 tries, with a minimum score of 1; as long as the node keeps the minimum score, no switchover is needed. In other words, of 10 consecutive pings of the gateway, at least one must succeed.
ping -c1 -t1 172.16.14.254
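A sketch of the resulting quorumd stanza in cluster.conf under these assumptions (attribute placement per the RHEL 6 qdisk documentation; verify against your version):
<quorumd interval="2" label="qdisk" min_score="1" votes="1">
  <heuristic program="ping -c1 -t1 172.16.14.254" score="1" interval="2" tko="10"/>
</quorumd>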
The configuration is done here.
We can look at the state of this cluster.
[root@JYAPP_01 ~]# clustat
Cluster Status for jyapp_cluster @ Tue Dec 24 14:25:50 2013
Member Status: Quorate
 Member Name            ID   Status
 ------ ----            ---- ------
 JYAPP_01HB             1    Online, Local, rgmanager
 JYAPP_02HB             2    Online, rgmanager
 /dev/block/8:16        0    Online, Quorum Disk
 Service Name           Owner (Last)    State
 ------- ----           ----- ------    -----
 service:jysg           JYAPP_01HB      started
All members are online and the service is running normally.
[root@JYAPP_01 ~]# cman_tool status
Version: 6.2.0
Config Version: 19
Cluster Name: jyapp_cluster
Cluster Id: 469
Cluster Member: Yes
Cluster Generation: 28
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
Quorum device votes: 1
Total votes: 3
Node votes: 1
Quorum: 2
Active subsystems: 11
Flags:
Ports Bound: 0 11 177 178
Node name: JYAPP_01HB
Node ID: 1
Multicast addresses: 239.192.14.111
Node addresses: 172.16.14.21
cman_tool status shows the vote counts are normal: one vote per node plus one vote for the qdisk, three votes in total, matching the expected value of 3. If a node dies, the count is one vote lower.
Finally, verify switchover:
[root@JYAPP_01 ~]# clusvcadm -r jysg
Trying to relocate service:jysg...Success
service:jysg is now running on JYAPP_02HB
Manual node switchover succeeded. Further testing showed that NIC failure and power loss also trigger a normal switchover.
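Fencing itself can also be exercised by hand from one node with the configured fence agent (note this will power-cycle the target):
# fence_node JYAPP_02HB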
Script:
#!/bin/bash
start() {
    # Start the middleware first, then the trading application.
    su - tuxedo -c "tmboot -y"
    RETVAL=$?
    su - trade -c "cd /home/trade/app/bin && tmboot -y"
    RETVAL=$?
    return $RETVAL
}
stop() {
    # Stop in reverse order: the application first, then the middleware.
    su - trade -c "cd /home/trade/app/bin && tmshutdown -y"
    RETVAL=$?
    su - tuxedo -c "tmshutdown -y"
    RETVAL=$?
    return $RETVAL
}
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        RETVAL=0
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo $"Usage: tuxedo.sh {start|stop|status|restart}"
        RETVAL=2
esac
exit $RETVAL
The script first defines two functions, one to start the application and one to stop it; each command's return value is captured. The main body then performs different operations according to the argument passed in: start, stop, report the current status, or restart.
From: Zhou Wen Yu