I. Introduction to the installation environment
This example describes how to build a Web + MySQL cluster. The RHCS cluster consists of four servers: two hosts form the Web cluster and two hosts form the MySQL cluster. In this architecture, if either Web server fails, the other Web server takes over its service; likewise, if either MySQL server fails, the other MySQL server takes over, so that the application as a whole keeps running without interruption. As shown in the following illustration:
II. Preparatory work before installation
CentOS is a clone of RHEL, and all RHCS functional components are available free of charge, so CentOS is used in the steps below.
Operating system: CentOS 5.3 on all hosts. To make installing the RHCS suite easier, it is recommended to select the following package groups when installing the operating system:
• Desktop environment: X Window System, GNOME Desktop Environment.
• Development tools: Development Tools, X Software Development, GNOME Software Development, KDE Software Development.
Address planning is as follows:
The installation and use of iscsi-target was covered in a previous article and is not repeated here; the shared disk is assumed to be /dev/sdb.
III. Installing luci
luci is the web-based cluster configuration and management tool of RHCS. The luci installation package can be found on the system CD and is installed as follows:
[root@storgae-server ~]# rpm -ivh luci-0.12.2-12.el5.centos.1.i386.rpm
After installation, initialize luci:
[root@storgae-server ~]# luci_admin init
Initializing the Luci server
Creating the 'admin' user
Confirm password:
Please wait...
The admin password has been successfully set.
Generating SSL certificates...
Luci server has been successfully initialized
Enter the password twice when prompted; this creates admin, the default user for logging in to luci.
Finally, start the luci service:
[root@storgae-server ~]# /etc/init.d/luci start
Once the service has started successfully, luci can be reached at https://ip:8084.
To allow luci to reach the other nodes of the cluster, you also need to add the following entries to /etc/hosts:
At this point, the setup on the storgae-server host is complete.
IV. Installing the RHCS packages on the cluster nodes
To ensure that the cluster nodes can communicate with one another, the hostname of every node must be added to /etc/hosts. The completed /etc/hosts file reads as follows:
Copy this file to /etc/hosts on each node of the cluster in turn.
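The copy step can be scripted. The following is a minimal sketch, assuming root SSH access from the host that holds the finished file to the four nodes named in this article (web1, web2, mysql1, mysql2). The helper name push_hosts and the DRY_RUN switch are illustrative and not part of RHCS; by default the script only prints the scp commands it would run.

```shell
#!/bin/bash
# Sketch: distribute /etc/hosts to every cluster node.
# NODES uses the node names from this article; adjust to your own hosts.
NODES="web1 web2 mysql1 mysql2"

# push_hosts FILE: copy FILE to /etc/hosts on each node.
# With DRY_RUN=1 (the default) the scp commands are printed, not executed.
push_hosts() {
    local file="$1" node
    for node in $NODES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp $file root@$node:/etc/hosts"
        else
            scp "$file" "root@$node:/etc/hosts" || return 1
        fi
    done
}

push_hosts /etc/hosts
```

Set DRY_RUN=0 in the environment to perform the copy once the printed commands look right.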
The RHCS packages can be installed in two ways: automatically through online download when a cluster is created in the luci management interface, or manually from the packages found on the operating system CD. Online installation depends on network availability and speed and is not recommended, so the RHCS packages are installed manually here.
The main RHCS components to install are cman, gfs2 and rgmanager. Installing these packages may require other dependent system packages; install them as prompted. The following is the installation checklist, to be executed on all four nodes of the cluster:
# Install cman
rpm -ivh openais-0.80.6-16.el5_5.2.i386.rpm
rpm -ivh gfs2-utils-0.1.62-20.el5.i386.rpm
V. Installing and configuring the iSCSI client on the cluster nodes
The iSCSI client is installed so that each node can communicate with the iscsi-target server and import the shared disk. The cluster node web1 is used as the example below; the remaining nodes are installed and configured in exactly the same way.
The installation and configuration of iSCSI clients is simple and can be accomplished in the following steps:
[root@web1 rhcs]# rpm -ivh iscsi-initiator-utils-6.2.0.871-0.16.el5.i386.rpm
[root@web1 rhcs]# /etc/init.d/iscsi restart
[root@web1 rhcs]# iscsiadm -m discovery -t sendtargets -p 192.168.12.246
[root@web1 rhcs]# /etc/init.d/iscsi restart
[root@web1 rhcs]# fdisk -l
Disk /dev/sdb: 10.7 GB, 10737418240 bytes
Heads, Sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk /dev/sdb doesn't contain a valid partition table
The fdisk output shows that /dev/sdb is the disk shared from iscsi-target.
At this point, the installation work is all over.
VI. Configuring the RHCS high-availability cluster
The core of configuring RHCS is the /etc/cluster/cluster.conf file. The following uses the web management interface to show how a cluster.conf file is built.
Start the luci service on the storgae-server host, then open https://192.168.12.246:8084/ in a browser to reach the luci login page, as shown in Figure 1:
After a successful login, luci offers three configuration tabs: homebase, cluster, and storage. cluster is used mainly to create and configure cluster systems, and storage to create and manage shared storage. homebase is used mainly to add, update, and delete cluster systems and storage settings, and also to create and delete luci login users. As shown in Figure 2:
1. Create a cluster
After you log in to luci, switch to the cluster tab and click "Create a New Cluster" in the clusters box on the left to add a cluster, as shown in Figure 3:
In Figure 3, the cluster being created is named mycluster, "Node Hostname" is the host name of each node, and "Root Password" is the root password of each node. The root passwords of the nodes may be the same or different.
Of the five options below, "Download packages" downloads and installs the RHCS packages online, while "Use locally installed packages" uses the packages already installed locally. Since the RHCS packages were installed manually above, choose the local option. The remaining three check boxes are "Enable Shared Storage Support", "Reboot nodes before joining cluster", and "Check if node passwords are identical". These cluster creation settings are optional, and none of them is selected here.
"View SSL cert fingerprints" verifies that the cluster nodes and luci can communicate properly and that each node's configuration allows a cluster to be created. If the check fails, an error message is shown; if it succeeds, a success message is shown. When all options are set, click "Submit"; luci then starts creating the cluster, as shown in Figure 4:
After the four stages Install, Reboot, Configure, and Join complete without error, mycluster has been created. In effect, creating a cluster means that luci writes the configured cluster information to the configuration file of each cluster node. After the cluster is created, the global properties of mycluster are displayed by default; click cluster --> Cluster list to view the status of mycluster, as shown in Figure 5:
As Figure 5 shows, the mycluster cluster contains four nodes. In the normal state, the node names and the cluster name are shown in green; if there is a problem, they are shown in red.
Click any node name under Nodes to view the running state of that node, as shown in Figure 6:
As Figure 6 shows, the cman and rgmanager services run on every node and must be configured to start automatically; they are the core daemons of RHCS. If the two services are not running on a node, they can be started manually from the command line, as follows:
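A minimal sketch of the manual start, assuming the standard init scripts /etc/init.d/cman and /etc/init.d/rgmanager and the chkconfig tool on CentOS 5. The run and start_rhcs helpers and the DRY_RUN switch are illustrative names; by default the commands are only printed.

```shell
#!/bin/bash
# Sketch: start the two core RHCS daemons in order and enable them at boot.
# cman is started first because rgmanager depends on cluster membership.
SERVICES="cman rgmanager"

# run CMD...: print the command when DRY_RUN=1 (default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# start_rhcs: start each service and register it for automatic start.
start_rhcs() {
    local svc
    for svc in $SERVICES; do
        run /etc/init.d/"$svc" start || return 1
        run chkconfig "$svc" on || return 1
    done
}

start_rhcs
```

Run it with DRY_RUN=0 on the affected node to execute the commands instead of printing them.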
After the services have started, click the "Update node daemon properties" button in Figure 6 to refresh the status of the node.
With the steps above, a basic cluster has been created, but it is not yet usable: a Failover Domain, Resources, a Service, a Shared Fence Device, and so on still need to be created for it. These are introduced in turn below.
2. Create Failover Domain
A failover domain defines the set of cluster nodes to which a failed service may be transferred; it lets you restrict the switching of services and resources to specified nodes. The following steps create two failover domains, webserver-failover and mysql-failover.
Click cluster, then click mycluster in the cluster list, then click Failover Domains --> Add a Failover Domain in the mycluster column at the lower left to add a failover domain, as shown in Figure 7:
In Figure 7, the meanings of each parameter are as follows:
• Failover Domain Name: the name of the failover domain being created; choose a name that is easy to remember.
• Prioritized: whether to enable member priority settings in the failover domain. Enable it here.
• Restrict failover to this domain's members: whether service failover is restricted to the members of this failover domain. Enable it here.
• Do not fail back services in this domain: controls failback within the domain. With failback in effect, when the primary node fails, the standby node automatically takes over its services and resources, and when the primary node returns to normal, the services and resources are automatically switched back from the standby node to the primary node. Selecting this option disables that automatic switch back.
Then, in the Failover domain membership area, tick the nodes that join this domain, here web1 and web2, and set the priority in the Priority field; here the value entered for web1 is 10. Note that a node whose Priority is 1 has the highest priority; the larger the value, the lower the node's priority.
When all settings are complete, click the Submit button to create the failover domain.
Following the same steps, add the second failover domain, mysql-failover. In the Failover domain membership area, select the mysql1 and mysql2 nodes, and in the Priority field set the priority of mysql1 to 2 and the priority of mysql2 to 8.
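For orientation, the choices made in luci end up as a failoverdomain element in /etc/cluster/cluster.conf. The sketch below prints such an element for mysql-failover using the priorities given above. The helper name failover_domain is hypothetical, and the assumption is that "Prioritized" maps to ordered="1" and "Restrict failover" maps to restricted="1"; the file luci actually writes may carry additional attributes.

```shell
#!/bin/bash
# Sketch: print a <failoverdomain> element as it might appear in
# /etc/cluster/cluster.conf. Attribute values mirror the choices made above
# (Prioritized -> ordered="1", Restrict failover -> restricted="1").

# failover_domain NAME NODE:PRIORITY [NODE:PRIORITY ...]
failover_domain() {
    local name="$1" pair
    shift
    echo "<failoverdomain name=\"$name\" ordered=\"1\" restricted=\"1\">"
    for pair in "$@"; do
        # Split "node:priority" at the colon.
        echo "  <failoverdomainnode name=\"${pair%%:*}\" priority=\"${pair##*:}\"/>"
    done
    echo "</failoverdomain>"
}

failover_domain mysql-failover mysql1:2 mysql2:8
```

The output is meant for comparison with the generated file, not as a replacement for configuring the domain through luci.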
3. Create Resources
Resources are the core of the cluster; they include service scripts, IP addresses, file systems, and so on. The resource types that RHCS provides are shown in Figure 8:
Add an IP resource, an HTTP service resource, a MySQL management script resource, and an ext3 file system resource, as shown in Figure 9:
4. Create service
Click cluster, then click mycluster in the cluster list, and then click Services --> Add a Service in the mycluster column at the lower left to add a service to the cluster, as shown in Figure 10:
After all services have been added, they start automatically if the applications are set up correctly. Click cluster; the cluster list then shows the startup status of the two services, which is green in the normal state. As shown in Figure 11:
VII. Configuring the GFS storage cluster
In the previous sections, a disk was shared with the four cluster nodes through the storgae-server host. The next steps are to partition the disk, format it, and create the file systems.
(1) Partitioning the disk
The shared disk can be partitioned and formatted on any node of the cluster; node web1 is used here. First partition the shared disk, as follows:
[root@web1 ~]# fdisk /dev/sdb
This divides the shared disk into three usable partitions: /dev/sdb5 for the GFS file system, /dev/sdb6 for the ext3 file system, and /dev/sdb7 for the voting disk, which is described later.
(2) Format disk
Next, format the partitions as ext3 and GFS2 file systems on the web1 node, as follows:
[root@web1 ~]# mkfs.ext3 /dev/sdb6
[root@web1 ~]# mkfs.gfs2 -p lock_dlm -t mycluster:my-gfs2 -j 4 /dev/sdb5
• -p lock_dlm: selects DLM as the locking protocol. Without this parameter, a partition mounted on two systems at the same time behaves like ext3, and the data seen by the two systems is not synchronized.
• -t mycluster:my-gfs2: specifies the name of the lock table used by DLM. mycluster is the name of the RHCS cluster and must match the name in the cluster tag of the cluster.conf file.
• -j 4: sets the maximum number of nodes that can mount the GFS2 file system at the same time. This value can be adjusted while the file system is in use with the gfs2_jadd command.
• /dev/sdb5: the partition device to be formatted.
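To make the mapping between the four inputs and the command line explicit, here is a small sketch that only assembles and prints the mkfs.gfs2 command; nothing is formatted. The helper name gfs2_mkfs_cmd is hypothetical, and the values in the example call are the ones used in this article.

```shell
#!/bin/bash
# Sketch: assemble the mkfs.gfs2 command line from its four inputs.
# Nothing is formatted; the command is only printed for review.

# gfs2_mkfs_cmd CLUSTER FSNAME JOURNALS DEVICE
gfs2_mkfs_cmd() {
    local cluster="$1" fsname="$2" journals="$3" device="$4"
    # One journal is needed per node that mounts the file system at the same time.
    if [ "$journals" -lt 1 ]; then
        echo "journal count must be at least 1" >&2
        return 1
    fi
    echo "mkfs.gfs2 -p lock_dlm -t $cluster:$fsname -j $journals $device"
}

gfs2_mkfs_cmd mycluster my-gfs2 4 /dev/sdb5
```

Printing the command first is a deliberate choice: formatting the wrong device on shared storage affects every node, so the line is worth reviewing before it is run.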
After all operations are complete, restart all nodes of the cluster so that every node recognizes the new partitions.
(3) Mount disk
Once all nodes have been restarted, the file system can be mounted. On each node of the cluster, mount the shared file system on the /gfs2 directory:
[root@web1 ~]# mount -t gfs2 /dev/sdb5 /gfs2 -v
/sbin/mount.gfs2: parse_opts: opts = "rw"
/sbin/mount.gfs2: clear flag 1 for "rw", flags = 0
/sbin/mount.gfs2: parse_opts: flags = 0
/sbin/mount.gfs2: write "join /gfs2 gfs2 lock_dlm mycluster:my-gfs2 rw /dev/sdb5"
/sbin/mount.gfs2: mount(2) ok
/sbin/mount.gfs2: lock_dlm_mount_result: write "mount_result /gfs2 gfs2 0"
/sbin/mount.gfs2: read_proc_mounts: device = "/dev/sdb5"
/sbin/mount.gfs2: read_proc_mounts: opts = "rw,hostdata=jid=3:id=65540:first=0"
The "-v" parameter prints the steps taken while mounting the GFS2 file system, which helps in understanding GFS2 and in troubleshooting.
To have the shared file system mounted automatically at boot, add the following to the /etc/fstab file on each cluster node:
#GFS MOUNT POINTS
/dev/sdb5 /gfs2 gfs2 defaults 1 1
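After a reboot it is worth confirming that the mount actually happened. The following sketch checks a mounts table for a gfs2 entry at a given mount point; the helper name gfs2_mounted is hypothetical, and it assumes the usual /proc/mounts field order (device, mount point, file system type, options).

```shell
#!/bin/bash
# gfs2_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds when MOUNTPOINT is listed with type gfs2 in MOUNTS_FILE
# (default /proc/mounts; fields: device mountpoint fstype options ...).
gfs2_mounted() {
    local mountpoint="$1" mounts="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" \
        '$2 == mp && $3 == "gfs2" { found = 1 } END { exit !found }' "$mounts"
}

if gfs2_mounted /gfs2; then
    echo "/gfs2 is mounted as gfs2"
else
    echo "/gfs2 is not mounted as gfs2"
fi
```

The optional second argument exists so the check can be pointed at a saved copy of the mounts table, for example one collected from another node.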
VIII. Configuring the voting disk
(1) The necessity of using a voting disk
In a multi-node RHCS cluster, when a node fails, the services and resources of the cluster can be transferred automatically to other nodes, but only under certain conditions. For example, in a four-node cluster, once two nodes fail, the entire cluster hangs and the cluster service stops. If a GFS storage cluster file system is configured, the GFS file system mounted on all nodes hangs as soon as one node fails. Shared storage is then unavailable, which is unacceptable for a high-availability cluster. This problem is solved with a voting disk.
(2) Voting disk operation mechanism
The voting disk, that is, the quorum disk (qdisk for short in RHCS), is a disk-based cluster arbitration service. To solve the voting problem in small clusters, RHCS introduces the quorum mechanism: quorum denotes the number of votes required for the cluster to be valid, and the corresponding state is quorate, meaning that this number has been reached. In the normal state, the total number of votes is the sum of the votes of each node plus the votes of the qdisk partition.
Qdisk uses a small shared disk partition of about 10 MB. The qdiskd process runs on all nodes of the cluster; through it, each cluster node periodically evaluates its own health and writes its state information to the specified shared disk partition. qdiskd can also read the state information of other nodes and pass information to them.
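The effect of the qdisk votes can be worked through with a little arithmetic. The sketch below assumes the usual majority rule, in which the cluster is quorate when at least floor(expected votes / 2) + 1 votes are present, one vote per node, and a qdisk holding 3 votes; the qdisk vote count and the helper names are assumptions for illustration, not values taken from this setup.

```shell
#!/bin/bash
# quorum_threshold EXPECTED_VOTES: votes needed for the cluster to be quorate,
# using the majority rule floor(expected / 2) + 1.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# is_quorate PRESENT_VOTES EXPECTED_VOTES: succeeds when enough votes are present.
is_quorate() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Four nodes with one vote each, no qdisk: expected votes = 4.
echo "threshold without qdisk: $(quorum_threshold 4)"
is_quorate 2 4 && echo "2 of 4 nodes: quorate" || echo "2 of 4 nodes: not quorate"

# Same nodes plus a qdisk holding 3 votes (assumed value): expected votes = 7.
echo "threshold with qdisk: $(quorum_threshold 7)"
# One surviving node (1 vote) plus the qdisk (3 votes) = 4 votes.
is_quorate 4 7 && echo "1 node + qdisk: quorate" || echo "1 node + qdisk: not quorate"
```

With these numbers, two surviving nodes fall short of the threshold of 3 when there is no qdisk, whereas a single surviving node together with the qdisk reaches the threshold of 4.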
(3) The concept of voting disk in RHCs
The tools and concepts associated with qdisk include mkqdisk and heuristics.
mkqdisk is the cluster quorum disk utility; you can use it to create a qdisk shared disk or to view the status information of a shared disk. mkqdisk can only create voting space for 16 nodes, so qdisk currently supports RHCS high-availability clusters of at most 16 nodes.
Sometimes checking the qdisk partition alone is not enough to determine the state of a node, and applications are needed to make node state detection more accurate. Heuristics is such an extension: it allows third-party programs to assist in determining node state, commonly a ping of the gateway or a router, or a script. If the heuristic check fails, qdiskd considers the node failed and tries to restart it so that it returns to a normal state.
(4) Create a voting disk
Several shared disk partitions were created in the previous section. The shared disk partition /dev/sdb7 is used as the qdisk partition here, and it is created as follows:
[root@web1 ~]# mkqdisk -c /dev/sdb7 -l myqdisk
[root@web1 ~]# mkqdisk -L  # view voting disk information
(5) Configure Qdisk
Qdisk is configured here through the Conga web interface. First log in to luci, then click cluster, click mycluster in the cluster list, and then select Quorum Partition, as shown in Figure 12:
The meaning of each option in Figure 12 is explained as follows:
• Interval: how often a check is performed, in seconds.
• Votes: the number of votes assigned to the qdisk partition.
• TKO: the number of failed checks that is tolerated. If a node cannot reach the qdisk partition within TKO * Interval seconds, the node is considered failed and is isolated from the cluster.
• Minimum Score: the minimum score required.
• Label: the volume label of the qdisk partition, that is, the "myqdisk" specified when the qdisk was created. Using the label is recommended because the device name may change after a system restart, but the label does not.
• Device: the device name of the shared storage on the node.
• Path to Program: a third-party program used to make node state detection more accurate; the ping command is configured here.
• Score: the score assigned to the ping command.
• Interval: how often the ping command is run.
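The TKO * Interval rule above is easy to check with numbers. The sketch below computes the time after which a node that cannot reach the qdisk partition is considered failed; the helper name qdisk_timeout and the example values (interval 2, TKO 10) are hypothetical, since the values entered in Figure 12 are not given in the text.

```shell
#!/bin/bash
# qdisk_timeout INTERVAL TKO: seconds of failed qdisk checks after which
# a node is declared failed (interval * tko).
qdisk_timeout() {
    echo $(( $1 * $2 ))
}

# Hypothetical settings: check every 2 seconds, tolerate 10 failed checks.
echo "node declared failed after $(qdisk_timeout 2 10) seconds"
```

A shorter window detects failures sooner but leaves less room for a briefly overloaded storage path, so the two values are usually tuned together.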
(6) Start the Qdisk service
On each node of the cluster, execute the following command to start the qdiskd service:
[root@web1 ~]# /etc/init.d/qdiskd start
After startup, if the configuration is correct, the qdisk disk automatically enters the Online state:
[root@web1 ~]# clustat -l
Cluster Status for mycluster @ Sat Aug 01:25:40
Member Name     ID   Status
web2            1    Online, rgmanager
mysql1          2    Online, rgmanager
mysql2          3    Online, rgmanager
web1            4    Online, Local, rgmanager
/dev/sdb7       0    Online, Quorum Disk
At this point, qdisk is up and running.
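The same check can be automated, for example in a monitoring script. The sketch below reads clustat output on standard input and succeeds when a member line is marked as an online quorum disk. The helper name qdisk_online is hypothetical, and the match assumes the status text looks like the output shown above.

```shell
#!/bin/bash
# qdisk_online: read clustat output on stdin and succeed when a member line
# is marked both "Online" and "Quorum Disk".
qdisk_online() {
    grep -Eq 'Online,.*Quorum Disk'
}

# Only query the cluster when clustat is available on this host.
if command -v clustat >/dev/null 2>&1; then
    if clustat | qdisk_online; then
        echo "quorum disk is online"
    else
        echo "quorum disk is NOT online" >&2
    fi
fi
```

Because the function reads standard input, it can also be fed saved clustat output when investigating a problem after the fact.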
IX. Configuring fence devices
Configuring fence devices is an essential part of an RHCS cluster. Fence devices prevent cluster resources (such as file systems) from being used by several nodes at the same time, protect the safety and consistency of shared data, and prevent split-brain between nodes.
GFS relies on the underlying cluster infrastructure to pass lock information; in other words, it is a cluster file system built on RHCS, so using the GFS file system also requires a fence device.
RHCS supports two kinds of fence device, internal and external. Common internal fence devices are:
• RSA II card provided by IBM servers
• iLO card provided by HP servers
• DRAC card provided by Dell servers
• Intelligent Platform Management Interface (IPMI)
Common external fence devices are a UPS, a SAN switch, and a network switch; if shared storage is implemented through a GNBD server, the GNBD fence feature can also be used.
Click cluster, then click mycluster in the Cluster list, and select Shared Fence Devices --> Add a Sharable Fence Device in the mycluster column at the lower left. The fence device selected here is "WTI Power Switch" and its name is "Wti-fence"; then enter the IP address and password in turn, as shown in Figure 13:
At this point, the RHCS configuration through the web interface is complete.