Web-based dual-machine hot-standby cluster building with RHCS


Operating principles and components of an RHCS Web dual-machine hot-standby cluster

1. Distributed Cluster Manager (CMAN)
Cluster Manager (CMAN) is a distributed cluster management tool that runs on every node of the cluster and carries out cluster management tasks for RHCS. CMAN is responsible for cluster membership, messaging, and notification. By monitoring the running state of each node it tracks the membership relationships between nodes; when a node in the cluster fails, the membership changes, and CMAN promptly notifies the underlying layers, which then make the corresponding adjustments.
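Once the cluster is up (the setup steps follow later in this article), CMAN's view of the membership can be checked from any node with the cman_tool utility; the commands below are only a minimal illustration.

# show overall cluster status, including quorum and vote counts
cman_tool status
# list the nodes CMAN currently considers members of the cluster
cman_tool nodes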

2. Lock Management (DLM)
The Distributed Lock Manager (DLM) is a basic component of RHCS that provides a common locking mechanism for the cluster; in an RHCS cluster it runs on every node. GFS synchronizes access to file system metadata through the lock manager, and CLVM synchronizes updates to LVM volumes and volume groups through it. DLM needs no dedicated lock-management server: it uses a peer-to-peer lock management model, which greatly improves processing performance and avoids the performance bottleneck of an overall recovery when a single node fails. DLM requests are local and require no network round trip, so they take effect immediately. Finally, through a layered mechanism, DLM can run multiple lock spaces in parallel.

3. Configuration file Management (CCS)
The Cluster Configuration System (CCS) is primarily used to manage the cluster configuration file and keep it synchronized across nodes. CCS runs on every node of the cluster and monitors the single configuration file /etc/cluster/cluster.conf; whenever the file changes, CCS propagates the update to every node in the cluster so that all copies stay in sync at all times. For example, if the administrator updates the cluster configuration file on node A, CCS detects the change and immediately propagates it to the other nodes. The RHCS configuration file is cluster.conf, an XML file that contains the cluster name, cluster node information, cluster resources and services, fence devices, and so on.
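As a rough illustration (assuming the cluster is already running and that config_version in cluster.conf has been incremented after editing), the ccs_tool utility can be used to inspect the local copy of the configuration and push it out to the other nodes:

# list the nodes defined in the local copy of cluster.conf
ccs_tool lsnode
# propagate the locally edited cluster.conf (with a higher config_version) to all nodes
ccs_tool update /etc/cluster/cluster.conf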

4. Fence Devices (FENCE)
Fence devices are an essential part of an RHCS cluster; they prevent the unpredictable consequences of the "split-brain" phenomenon. A fence device works through the hardware management interface of the server or storage itself, or through an external power management device, to issue hardware-level commands that restart or shut down the server, or disconnect it from the network. Fencing works as follows: when a host hangs or goes down unexpectedly, the standby machine first invokes the fence device to restart the failed host or isolate it from the network; once the fence operation has completed successfully, the result is reported back to the standby, which, on receiving confirmation of the successful fence, begins to take over the services and resources of the failed host. In this way the fence device releases the resources held by the failed node and ensures that resources and services always run on only one node at a time. RHCS fence devices fall into two categories, internal and external: common internal fence devices include IBM RSA II cards, HP iLO cards, and IPMI devices; common external fence devices include UPS, SAN switches, and network switches.
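As a sketch only (the IPMI address and credentials below are placeholders, not part of this setup), a fence action can be triggered manually either through the agent configured in cluster.conf or directly against a node's management interface:

# fence a node using the fence agent configured for it in cluster.conf
fence_node realserver2
# or drive an IPMI management interface directly (placeholder address and credentials)
fence_ipmilan -a 192.168.10.200 -l admin -p secret -o reboot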

5. Highly Available Service Manager
The high-availability service manager is primarily used to monitor, start, and stop the cluster's applications, services, and resources. It provides management of cluster services: when a service on a node fails, the high-availability service manager can move the service from the failed node to a healthy node, and this transfer is automatic and transparent. RHCS manages cluster services through rgmanager, which runs on every cluster node; the corresponding process on each server is clurgmgrd. In an RHCS cluster, a high-availability service involves two things, cluster services and cluster resources. A cluster service is really an application service, such as Apache or MySQL; cluster resources come in many kinds, for example an IP address, a script, or an ext3/GFS file system. In an RHCS cluster, a high-availability service is combined with a failover domain, which is a set of cluster nodes entitled to run a particular service. Within a failover domain, each node can be given a priority that determines the order in which the service is transferred when a node fails; if no priorities are assigned, the service may move to any node in the domain. By creating a failover domain you can therefore not only set the order in which a service moves between nodes, but also restrict the service to the nodes specified in the domain.
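For example (using the service and node names configured later in this article), rgmanager's services can be started, relocated between the members of a failover domain, and stopped with clusvcadm:

clusvcadm -e httpd_srv -m realserver1   # enable (start) the service on realserver1
clusvcadm -r httpd_srv -m realserver2   # relocate the service to realserver2
clusvcadm -d httpd_srv                  # disable (stop) the service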

6. Cluster Configuration management Tools
RHCS provides a variety of cluster configuration and management tools: commonly used GUI-based tools such as system-config-cluster and Conga, as well as command-line management tools.
system-config-cluster is a graphical tool for creating clusters and configuring cluster nodes. It has two components, one for cluster node configuration and one for cluster management, used respectively to create the node configuration file and to watch node status; it is generally used with earlier RHCS versions. Conga is a newer, network-based cluster configuration tool; unlike system-config-cluster, Conga configures and manages cluster nodes through the web. Conga consists of two parts, luci and ricci: luci is installed on a separate machine and is used to configure and manage the cluster, ricci runs on every cluster node, and luci communicates with each node in the cluster through ricci. RHCS also provides a number of powerful command-line management tools, commonly clustat, cman_tool, ccs_tool, fence_tool, and clusvcadm; the use of these commands is shown below.
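This article uses system-config-cluster, but if Conga were used instead, the setup would look roughly like the sketch below (package and service names as shipped with the RHEL 5 era of RHCS; these steps are not part of the procedure that follows):

# on the dedicated management host
yum install luci -y
luci_admin init          # set the luci admin password
service luci restart
# on every cluster node
yum install ricci -y
service ricci start
chkconfig ricci on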

7. Red Hat GFS
GFS is the storage solution RHCS provides for cluster systems. It allows multiple nodes in the cluster to share storage at the block level, with every node accessing the shared storage space and seeing consistent data. More precisely, GFS is the cluster file system provided by RHCS: multiple nodes can mount the same file system partition simultaneously without corrupting the file system data, something a single-node file system such as ext3 or ext2 cannot do.
To allow multiple nodes to read and write a file system concurrently, GFS uses a lock manager to coordinate I/O: when a process writes a file, the file is locked and no other process may read or write it; only after the write finishes and the lock is released can other read and write processes operate on the file. When a node modifies data on a GFS file system, the modification is immediately visible to the other nodes through the underlying RHCS communication mechanism. When building an RHCS cluster, GFS is generally mounted on every node as shared storage, and it can be configured and managed through the RHCS management tools. It is worth clarifying the relationship between RHCS and GFS, which beginners often confuse: GFS is not required to run RHCS; it is only needed when shared storage is needed. Building a GFS cluster file system, however, requires the underlying support of RHCS, so every node on which the GFS file system is installed must also have the RHCS components installed.
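For illustration (using the logical volume and mount point configured later in this article, and assuming the cluster and clvmd are already running), a GFS2 volume can be mounted on a node by hand and given an extra journal if another node ever joins:

# mount the shared GFS2 volume (in this setup the clusterfs resource normally does this)
mount -t gfs2 /dev/webvg/webvg_lv1 /var/www/html
# add one more journal so a third node could mount the file system
gfs2_jadd -j 1 /var/www/html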
Introduction to cluster environment

Master node, realserver1:192.168.10.121

Standby node, realserver2:192.168.10.122

Storage node, node1:192.168.10.130

Cluster floating IP: 192.168.10.254
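The node names above are assumed to resolve on every host; a minimal /etc/hosts along the following lines (an assumption, not shown in the original environment) makes that explicit:

192.168.10.121   realserver1
192.168.10.122   realserver2
192.168.10.130   node1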

Configure SSH trust between hosts

① Execute the following on each host

/usr/bin/ssh-keygen -t rsa
/usr/bin/ssh-keygen -t dsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys


② Establish SSH trust between RealServer1 and RealServer2

Execute on RealServer1:

ssh server2 cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
ssh server2 cat /root/.ssh/id_dsa.pub >> /root/.ssh/authorized_keys

Execute on RealServer2:

ssh server1 cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
ssh server1 cat /root/.ssh/id_dsa.pub >> /root/.ssh/authorized_keys


Configure the target storage (on the storage node, node1)
yum install scsi-target-utils -y
service tgtd restart
chkconfig tgtd on
HOSTNAME="iqn.2014-08-25.edu.nuist.com:storage.disk"
tgtadm --lld iscsi --mode target --op new --tid 1 --targetname $HOSTNAME
tgtadm --lld iscsi --op new --mode logicalunit --lun 1 --tid 1 -b /dev/sdb
tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL
# tgtadm --lld iscsi --op show --mode target | grep Target
tgt-admin -s
Configure the initiator on all nodes
yum install iscsi-initiator* -y
service iscsi start
service iscsid start
chkconfig iscsid on
iscsiadm -m discovery -p 192.168.10.130:3260 -t sendtargets
iscsiadm -m node --targetname iqn.2014-08-25.edu.nuist.com:storage.disk -p 192.168.10.130:3260 --login
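After logging in, it is worth confirming on each node that the session is established and that the shared disk is visible (the device name seen on the initiator side may differ from /dev/sdb):

iscsiadm -m session      # should list the iqn.2014-08-25.edu.nuist.com:storage.disk target
fdisk -l                 # the exported LUN should appear as a new disk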


Create the LVM volume for the GFS2 file system (on one node only; here 192.168.10.121)

pvcreate /dev/sdb
vgcreate webvg /dev/sdb
lvcreate -L 2G -n webvg_lv1 webvg


Install the cluster software on each node

yum -y install cman*
yum -y install rgmanager*
yum -y install gfs2-utils
yum -y install system-config-cluster*
yum -y install lvm2-cluster


Format the file system (only once)

mkfs.gfs2 -p lock_dlm -t httpd_cluster:webvg_lv1 -j 2 /dev/webvg/webvg_lv1

Use the system-config-cluster graphical tool to generate the cluster.conf configuration file

① Start system-config-cluster and create the cluster httpd_cluster

② Adding a new node

③ Adding fence

④ Binding fence devices to nodes

⑤ Adding resources: adding IP Resources

⑥ Adding resources: Adding GFS Resources

⑦ Adding a resource: Adding a Script resource

⑧ Creating a failover domain

⑨ Creating a Cluster service

The contents of the file after configuration are as follows:

<?xml version="1.0"?>
<cluster config_version="2" name="httpd_cluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="realserver1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="fence1" nodename="realserver1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="realserver2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="fence2" nodename="realserver2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="fence1"/>
    <fencedevice agent="fence_manual" name="fence2"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="httpd_fail" ordered="0" restricted="1">
        <failoverdomainnode name="realserver1" priority="1"/>
        <failoverdomainnode name="realserver2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="192.168.10.254" monitor_link="1"/>
      <script file="/etc/init.d/httpd" name="httpd"/>
      <clusterfs device="/dev/webvg/webvg_lv1" force_unmount="1" fsid="8669" fstype="gfs2" mountpoint="/var/www/html" name="docroot" options=""/>
    </resources>
    <service autostart="1" domain="httpd_fail" name="httpd_srv" recovery="relocate">
      <ip ref="192.168.10.254"/>
      <script ref="httpd"/>
      <clusterfs ref="docroot"/>
    </service>
  </rm>
</cluster>

Copy the resulting cluster.conf to the other node with scp. (Manual copying is only needed the first time; once the cluster service is running, system-config-cluster can distribute the configuration file to all nodes.)
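For example (assuming the standard configuration path):

scp /etc/cluster/cluster.conf realserver2:/etc/cluster/cluster.conf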


Installing Apache on RealServer1 and RealServer2

yum install httpd

Configure RealServer1 (RealServer2 is similar)

NameVirtualHost 192.168.10.121:80
ServerName www.example.com
<VirtualHost 192.168.10.121:80>
    DocumentRoot /var/www/html
    ServerName www.example.com
</VirtualHost>

Prevent Apache from starting automatically at boot (the cluster service will start it)

chkconfig httpd off


Start the Cluster service

service cman start
service rgmanager start
lvmconf --enable-cluster
service clvmd start
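To have the cluster stack come back up automatically after a reboot, the same services can also be enabled at boot (a reasonable addition, not part of the original steps):

chkconfig cman on
chkconfig clvmd on
chkconfig rgmanager on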

After the cluster is started, open system-config-cluster again. The cluster configuration file can be modified there, and once the modification is complete it can be distributed to all nodes via "Send to Cluster". The tool has a built-in version control feature that automatically generates a new version each time the configuration file is modified. After the cluster service starts, a "Cluster Manager" tab appears, showing the nodes in the cluster, the status of the cluster service, and which node is currently the master.


Test

The cluster is now complete.

View the cluster status: clustat

View the mount status: mount

Manually switch the master node: clusvcadm -r httpd_srv -m <standby node name>

Check which host holds the floating IP: ip addr

Access Apache through a browser: http://192.168.10.254






