Building a Web dual-machine hot-standby cluster with RHCS

Source: Internet
Author: User
Tags: failover

Implementation principles and components of RHCS, as background for building a Web dual-machine hot-standby cluster

1. Distributed Cluster Manager (CMAN)
Cluster Manager, abbreviated CMAN, is a distributed cluster management tool. It runs on every node of the cluster and provides cluster management services for RHCS.

CMAN manages cluster membership, messaging, and notification. It tracks the membership relationships between nodes by monitoring the running state of each node. When a node in the cluster has a problem, the membership changes; CMAN passes this change down to the lower layers, which then adjust accordingly.
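
As a quick sanity check, cluster membership as CMAN sees it can be inspected from any node. A minimal sketch using the cman command-line tool (output format varies by release):

cman_tool status   # quorum state, vote counts, and cluster name
cman_tool nodes    # membership list with node IDs and join status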

2. Lock Management (DLM)
Distributed Lock Manager, abbreviated DLM, is an underlying building block of RHCS that provides a common lock mechanism for the whole cluster. In an RHCS cluster, DLM runs on every node, and GFS synchronizes access to file system metadata through the lock manager's locks.

CLVM likewise synchronizes updates to LVM volumes and volume groups through the lock manager. DLM does not require a dedicated lock management server; it uses a peer-to-peer lock management model, which greatly improves processing performance.

At the same time, DLM avoids the performance bottleneck of a full recovery when a single node fails, and DLM requests are local, requiring no network round trip, so they take effect immediately.

Finally, DLM can implement parallel locking across multiple lock spaces through a layered mechanism.
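
The choice of lock manager shows up when a GFS2 file system is created: the lock protocol is selected with -p. A small sketch for comparison (the cluster built later in this article uses lock_dlm; lock_nolock is shown only to illustrate the difference and would restrict the file system to a single node):

mkfs.gfs2 -p lock_dlm -t httpd_cluster:webvg_lv1 -j 2 /dev/webvg/webvg_lv1   # cluster-wide locking via DLM
mkfs.gfs2 -p lock_nolock -j 1 /dev/webvg/webvg_lv1                           # single-node, no DLM, for comparison only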

3. Configuration File Management (CCS)
Cluster Configuration System, abbreviated CCS, is primarily used to manage the cluster configuration file and keep it synchronized across nodes.

CCS runs on every node of the cluster and monitors the single configuration file /etc/cluster/cluster.conf on each cluster node. Whenever the file changes, CCS propagates the change to every node in the cluster, keeping each node's copy of the configuration in sync.

For example, when the administrator updates the cluster configuration file on node A, CCS detects that the file has changed and immediately propagates the change to the other nodes. The RHCS configuration file is cluster.conf, an XML file whose contents include the cluster name, cluster node information, cluster resources and services, fence devices, and so on.
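
In practice, after editing cluster.conf by hand the administrator raises config_version and pushes the new file out. A minimal sketch using the CCS command-line tool (applies to the ccsd-based RHCS used in this article):

# first increase config_version="..." inside /etc/cluster/cluster.conf, then:
ccs_tool update /etc/cluster/cluster.conf   # distribute the updated configuration to all cluster nodes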


4. Fence Devices (FENCE)
Fence devices are an indispensable part of an RHCS cluster; they prevent the unpredictable consequences of a "split-brain" situation. A fence device is essentially a hardware management interface provided by the server or storage itself, or an external power management device, which issues hardware management commands directly to the server or storage to restart it, shut it down, or disconnect it from the network. Fencing works as follows: when an unexpected cause makes a host hang or crash, the standby machine first calls the fence device, which restarts the abnormal host or isolates it from the network; once the fence operation completes successfully, the result is returned to the standby machine, which then takes over the services and resources of the failed host.

In this way the fence device releases the resources held by the abnormal node, ensuring that resources and services always run on only one node at a time.

RHCS fence devices fall into two categories: internal and external. Commonly used internal fence devices include the IBM RSA II card, HP iLO card, and IPMI devices; external fence devices include UPS, SAN switches, network switches, and so on.
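
A fence agent can be exercised by hand to verify that out-of-band power control works before the cluster has to rely on it. A hedged sketch using the IPMI agent; the address and credentials below are placeholders, and option spellings can differ slightly between releases:

fence_ipmilan -a 192.168.10.201 -l admin -p secret -o status   # query the power state of the peer's BMC
fence_ipmilan -a 192.168.10.201 -l admin -p secret -o reboot   # power-cycle the peer node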

5. Highly Available Service Manager
High-availability service management is primarily used to monitor, start, and stop the cluster's applications, services, and resources. It provides management of cluster services: when a service on one node fails, the high-availability cluster service manager can transfer the service from the failed node to another healthy node, and this transfer is automatic and transparent. RHCS manages cluster services through rgmanager, which runs on every cluster node; the corresponding process on the server is clurgmgrd.

In an RHCS cluster, a high-availability service involves two things: cluster services and cluster resources. A cluster service is an application service such as Apache or MySQL. Cluster resources come in many forms, for example an IP address, a startup script, or an ext3/GFS file system.

In an RHCS cluster, a high-availability service is combined with a failover domain. A failover domain is the set of cluster nodes that may run a particular service. Within a failover domain, each node can be assigned a priority, which determines the order of service transfer when a node fails; if no priorities are assigned, the cluster high-availability service will move to an arbitrary node. Therefore, by creating a failover domain you can both set the order in which a service moves between nodes and restrict a service to switching only among the nodes specified in the failover domain.
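
As an illustration of an ordered failover domain (a sketch only; the configuration generated later in this article uses ordered="0" with equal priorities), the snippet below prefers realserver1, so the service runs on realserver2 only while realserver1 is unavailable. A lower priority number means a more preferred node:

<failoverdomain name="httpd_fail" ordered="1" restricted="1">
    <failoverdomainnode name="realserver1" priority="1"/>
    <failoverdomainnode name="realserver2" priority="2"/>
</failoverdomain>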

6. Cluster Configuration and Management Tools
RHCS provides a variety of cluster configuration and management tools. Commonly used GUI tools are system-config-cluster and Conga; command-line management tools are also provided.
system-config-cluster is a graphical tool for creating clusters and configuring cluster nodes. It consists of two parts, cluster node configuration and cluster management, used respectively to create the cluster node configuration file and to maintain and monitor node running status. It is generally used with earlier versions of RHCS. Conga is a newer, web-based cluster configuration tool; unlike system-config-cluster, Conga configures and manages cluster nodes through a web interface. Conga consists of two components, luci and ricci: luci is installed on a separate machine and is used to configure and manage the cluster, while ricci is installed on every cluster node; luci communicates with each node in the cluster through ricci.

RHCS also provides a number of powerful command-line cluster management tools. Commonly used ones are clustat, cman_tool, ccs_tool, fence_tool, and clusvcadm; how these commands are used is described below.

7. Red Hat GFS
GFS is the storage solution RHCS provides for cluster systems. It allows multiple nodes in a cluster to share storage at the block level: each node sees the same storage space, and consistency of data access is guaranteed. More concretely, GFS is the cluster file system provided by RHCS: multiple nodes can mount the same file system partition at the same time without corrupting the file system data, something a single-node file system such as ext3 or ext2 cannot do.

To allow multiple nodes to read and write one file system at the same time, GFS uses the lock manager to coordinate I/O operations. When a process is writing a file, the file is locked and other processes may not read or write it; the lock is released only when the write completes successfully, and only then can other readers and writers operate on the file. In addition, when one node changes data on the GFS file system, the change becomes visible on the other nodes immediately through the RHCS underlying communication mechanism. When building an RHCS cluster, GFS is generally used as the shared storage and runs on every node; GFS can also be configured and managed through the RHCS management tools. It is worth clarifying the relationship between RHCS and GFS, because beginners easily confuse the two: running RHCS does not require GFS, which is needed only when shared storage is required; but a GFS cluster file system does require RHCS support, so any node with a GFS file system must have the RHCS components installed.
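
Once the cluster is running, each node mounts the shared file system in the usual way. A minimal sketch using the device and mount point from the build below (in this article the mount is actually handled by rgmanager as a clusterfs resource rather than by hand):

mount -t gfs2 /dev/webvg/webvg_lv1 /var/www/html   # cluster-wide GFS2 mount; cman and clvmd must be running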

Introduction to cluster environment

Master node: realserver1, 192.168.10.121

Standby node: realserver2, 192.168.10.122

Storage node: node1, 192.168.10.130

Cluster floating IP: 192.168.10.254

Configure SSH trust between hosts

① Run the following commands on each host:

/usr/bin/ssh-keygen -t rsa
/usr/bin/ssh-keygen -t dsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys


② Establish SSH trust between realserver1 and realserver2

Run on realserver1:

ssh realserver2 cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
ssh realserver2 cat /root/.ssh/id_dsa.pub >> /root/.ssh/authorized_keys

Run on realserver2:

ssh realserver1 cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
ssh realserver1 cat /root/.ssh/id_dsa.pub >> /root/.ssh/authorized_keys
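
A quick way to confirm the trust works in both directions (each command should print the peer's hostname without prompting for a password):

ssh realserver2 hostname   # run from realserver1
ssh realserver1 hostname   # run from realserver2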


Configure the target storage (on the storage node, 192.168.10.130)

yum install scsi-target-utils -y

service tgtd restart
chkconfig tgtd on
HOSTNAME="iqp.2014-08-25.edu.nuist.com:storage.disk"
tgtadm --lld iscsi --mode target --op new --tid 1 --targetname $HOSTNAME
tgtadm --lld iscsi --op new --mode logicalunit --lun 1 --tid 1 -b /dev/sdb
tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL
# tgtadm --lld iscsi --op show --mode target | grep Target
tgt-admin -s
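
The tgtadm settings above live only in memory and are lost when tgtd restarts. A hedged sketch of making the export persistent in /etc/tgt/targets.conf, using the target name and backing device from the commands above:

# /etc/tgt/targets.conf -- all initiators are allowed unless restricted here
<target iqp.2014-08-25.edu.nuist.com:storage.disk>
    backing-store /dev/sdb
</target>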
Configure the initiator on all cluster nodes

yum install iscsi-initiator* -y
service iscsi start
service iscsid start
chkconfig iscsid on
iscsiadm -m discovery -p 192.168.10.130:3260 -t sendtargets
iscsiadm -m node --targetname iqp.2014-08-25.edu.nuist.com:storage.disk -p 192.168.10.130:3260 --login
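
To confirm the LUN is visible on each cluster node (a quick check; the shared disk usually appears as /dev/sdb on the initiators, but the device name may differ):

iscsiadm -m session   # should list the session to 192.168.10.130:3260
fdisk -l              # the shared LUN should now show up as a new disk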


Create the LVM volume for the GFS2 file system (on one node only, here 192.168.10.121)

pvcreate /dev/sdb
vgcreate webvg /dev/sdb
lvcreate -L 2G -n webvg_lv1 webvg


Install the cluster software on each of the nodes

yum -y install cman*
yum -y install rgmanager*
yum -y install gfs2-utils
yum -y install system-config-cluster*
yum -y install lvm2-cluster


Format the file system (only once, on one node)

mkfs.gfs2 -p lock_dlm -t httpd_cluster:webvg_lv1 -j 2 /dev/webvg/webvg_lv1
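
The -j 2 option creates two journals, one for each node that will mount the file system. If a third node were ever added, an extra journal could be added later; a sketch, run against the mounted file system:

gfs2_jadd -j 1 /var/www/html   # add one more journal for an additional node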

Generate the cluster.conf configuration file with the system-config-cluster graphical tool

① Start system-config-cluster and create the cluster httpd_cluster

② Add the cluster nodes

③ Add the fence devices

④ Bind the fence devices to the nodes

⑤ Add a resource: an IP address resource

⑥ Add a resource: a GFS resource

⑦ Add a resource: a script resource

⑧ Create a failover domain

⑨ Create the cluster service

After configuration, the contents of the file are as follows:

<?xml version="1.0"?>
<cluster config_version="2" name="httpd_cluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="realserver1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="fence1" nodename="realserver1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="realserver2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="fence2" nodename="realserver2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="fence1"/>
    <fencedevice agent="fence_manual" name="fence2"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="httpd_fail" ordered="0" restricted="1">
        <failoverdomainnode name="realserver1" priority="1"/>
        <failoverdomainnode name="realserver2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="192.168.10.254" monitor_link="1"/>
      <script file="/etc/init.d/httpd" name="httpd"/>
      <clusterfs device="/dev/webvg/webvg_lv1" force_unmount="1" fsid="8669" fstype="gfs2" mountpoint="/var/www/html" name="docroot" options=""/>
    </resources>
    <service autostart="1" domain="httpd_fail" name="httpd_srv" recovery="relocate">
      <ip ref="192.168.10.254"/>
      <script ref="httpd"/>
      <clusterfs ref="docroot"/>
    </service>
  </rm>
</cluster>

Copy the generated cluster.conf to the other node with scp.

(A manual copy is required only the first time. After the cluster service is started, the configuration file can be distributed to all nodes from system-config-cluster.)
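
A minimal sketch of that initial copy, run on realserver1 where the file was created:

scp /etc/cluster/cluster.conf realserver2:/etc/cluster/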


Install Apache on realserver1 and realserver2

yum install httpd

Configure realserver1 (realserver2 is similar):

NameVirtualHost 192.168.10.121:80
ServerName www.example.com
<VirtualHost 192.168.10.121:80>
    DocumentRoot /var/www/html
    ServerName www.example.com
</VirtualHost>
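
Since DocumentRoot points at the GFS2 mount point, a test page only needs to be created once, from whichever node currently has /var/www/html mounted (index.html here is just an arbitrary example file):

echo "RHCS httpd cluster test page" > /var/www/html/index.html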

Prevent Apache from starting at boot (the cluster manages it instead)

chkconfig httpd off


Start the Cluster service

service cman start
service rgmanager start
lvmconf --enable-cluster
service clvmd start
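
To have the cluster daemons come up automatically after a reboot (a small sketch using the standard init scripts installed by the packages above):

chkconfig cman on
chkconfig clvmd on
chkconfig rgmanager on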

After the cluster is started, open system-config-cluster again. The cluster configuration file can now be modified in the tool and, once the changes are complete, distributed to all nodes with "Send to Cluster". The tool has built-in version control: each time the configuration file is changed, a new version number is generated automatically. Once the cluster service is running, a "Cluster Management" tab also appears; it shows the nodes in the cluster, the status of the cluster service, and which node is currently the master.


Test

This completes the cluster construction.

View cluster status: clustat

View mount status: mount

Manually switch the master node: clusvcadm -r httpd_srv -m <standby node name>

Check which host the floating IP is attached to: ip addr

Visit Apache in a browser: http://192.168.10.254
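
A simple failover test (a sketch): take the current master down, for example by powering it off, then watch the service relocate from the standby node:

clustat -i 2                      # refresh cluster status every 2 seconds
ip addr | grep 192.168.10.254     # the floating IP should appear on the standby after takeover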




