Linux HA Cluster

Last Update:2014-09-16 Source: Internet

Author: User

Tags failover

The main purpose of an HA (high availability) cluster is to increase the online rate of a service, that is, to shorten the average downtime caused by failures. When one service node goes offline, another node that provides the same service takes over and continues to serve requests, which avoids a single point of failure.

HA cluster related concepts

First, terminology

1. Online rate

The online rate (availability) is an important metric for evaluating HA clusters. It is calculated as follows:

Online rate = mean time between failures / (mean time between failures + mean time to repair)

So there are two ways to improve the availability of the system:

1), increase the mean time between failures (the average trouble-free time)

2), shorten the mean time to repair (can be achieved through a redundancy mechanism)
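
As a worked example of the online rate formula, the following Python sketch computes availability from a mean time between failures and a mean time to repair. The function name and the sample figures are illustrative assumptions and do not come from any HA tool.

```python
def online_rate(mtbf_hours: float, mttr_hours: float) -> float:
    """Online rate = MTBF / (MTBF + MTTR), as a fraction between 0 and 1."""
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("times must be non-negative")
    total = mtbf_hours + mttr_hours
    if total == 0:
        raise ValueError("at least one of the times must be positive")
    return mtbf_hours / total


# Hypothetical figures: on average one failure every 1000 hours.
slow_repair = online_rate(1000, 10)    # manual repair takes 10 hours
fast_repair = online_rate(1000, 0.1)   # a redundant node takes over in 6 minutes
print(f"manual repair: {slow_repair:.4%}, redundant node: {fast_repair:.4%}")
```

With the failure frequency unchanged, shortening the repair time alone raises the online rate, which is why redundancy is the usual lever.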

2. Resources

Resources here are the things an HA cluster needs in order to provide a service. For example, a MySQL service requires an IP address (the interface through which clients access the database), the MySQL service script (which provides the database service), and a file system (which stores the data; it can be a local file system or a shared one such as NFS).

The resources required for different HA clusters are also different.

3. Resource type

Primitive resource (primitive/native): can run on only one node at a time
Group resource: a collection of multiple resources
Clone: a resource that can run on multiple nodes at once. Its definition specifies the total number of clone copies and the number of copies that may run on each node
Master/slave: a special kind of clone resource with master and slave roles (for example, DRBD)

4. Resource switching

Failover: when a node fails, its resources are transferred to another node.

Failback: when the failed node comes back online, the cluster decides whether that node takes its resources back.

5. Resource constraints

Resource constraints define the stickiness of resources and the relationships between resources. The common constraints are location constraints, colocation constraints, and order constraints.

Second, HA cluster architecture

The architecture described here achieves high availability in software. Software that implements a highly available cluster should provide the following layers.

1. Messaging layer

The primary purpose of this layer is to pass "heartbeat" information. Heartbeat information (also known as status information) is a broadcast, multicast, or unicast packet of a fixed size. Each node can be configured with the frequency at which it sends heartbeat information to the other nodes, and with how long the other nodes wait without a heartbeat before they conclude that the node has failed.
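
The waiting-time logic can be illustrated with a minimal Python sketch. It only models the timeout decision, not the network transport, and the class name, the `dead_time` parameter, and the node names are hypothetical rather than taken from Heartbeat or Corosync.

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Tracks the time of the last heartbeat received from each peer node."""

    def __init__(self, dead_time: float) -> None:
        # Seconds of silence after which a node is considered failed.
        self.dead_time = dead_time
        self.last_seen: Dict[str, float] = {}

    def record(self, node: str, now: Optional[float] = None) -> None:
        """Note that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = time.monotonic() if now is None else now

    def dead_nodes(self, now: Optional[float] = None) -> List[str]:
        """Return the nodes whose last heartbeat is older than dead_time."""
        now = time.monotonic() if now is None else now
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.dead_time
        )


# Explicit timestamps keep the example deterministic.
monitor = HeartbeatMonitor(dead_time=30.0)
monitor.record("node1", now=0.0)
monitor.record("node2", now=0.0)
monitor.record("node1", now=25.0)   # node1 keeps sending, node2 goes silent
print(monitor.dead_nodes(now=40.0))
```

A short waiting time detects failures sooner but risks declaring a merely slow node dead, so the two settings are usually tuned together.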

The software that can implement this feature is:

Heartbeat v1
Heartbeat v2
Heartbeat v3
Corosync
Cman

2. CRM (Cluster Resource Manager)

The CRM runs on every node in the HA cluster and is the core component that provides high availability; it holds the definitions of resources and their attributes. Each node also maintains a CIB (Cluster Information Base, an XML document) and an LRM (Local Resource Manager). Only the copy of the CIB on the DC (Designated Coordinator, the master node) can be modified. The LRM is the component that actually carries out, on the local node, the start and stop operations handed down by the CRM. When a node fails, the DC decides through the PE (Policy Engine) and the TE (Transition Engine) whether to take over that node's resources.

The software that implements this layer's functionality is:

1), Heartbeat v1: comes with its own resource manager, haresources. It requires a configuration file, which is also named haresources.

2), Heartbeat v2: comes with its own resource manager, CRM. crmd must run on each node. Configuration interfaces: command line: crmsh; GUI: hb_gui

3), Heartbeat v3 = heartbeat + pacemaker + cluster-glue

Pacemaker configuration interfaces: CLI: crm (SUSE), pcs; GUI: Hawk, pacemaker-mgmt

4), rgmanager (with cman as the messaging layer): manages the cluster through mechanisms such as failover domains and node priority.

Configuration interfaces: CLI: clustat, cman_tool; GUI: Conga (luci + ricci)

3. RA (Resource Agent)

A resource agent is a script that manages a cluster resource on the local node: it can start the resource, stop it, and report its status.
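
To make the start, stop, and status contract concrete, here is a toy Python sketch of an agent that dispatches those three actions. It keeps its state in memory instead of controlling a real service, and the class and resource names are hypothetical. The exit codes follow the LSB init script convention (status returns 0 when running and 3 when not running; 2 signals an invalid argument).

```python
# Exit codes defined by the LSB init script convention.
LSB_OK = 0                # action succeeded, or status: service is running
LSB_INVALID_ARGUMENT = 2  # invalid or excess argument(s)
LSB_NOT_RUNNING = 3       # status: service is not running


class ToyResourceAgent:
    """In-memory stand-in for a script that manages one resource."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.running = False

    def start(self) -> int:
        self.running = True    # a real agent would launch the service here
        return LSB_OK

    def stop(self) -> int:
        self.running = False   # a real agent would shut the service down here
        return LSB_OK

    def status(self) -> int:
        return LSB_OK if self.running else LSB_NOT_RUNNING

    def dispatch(self, action: str) -> int:
        """Run the requested action and return its exit code."""
        handlers = {"start": self.start, "stop": self.stop, "status": self.status}
        if action not in handlers:
            return LSB_INVALID_ARGUMENT
        return handlers[action]()


agent = ToyResourceAgent("mysqld")
codes = [agent.dispatch(a) for a in ("status", "start", "status", "stop", "status")]
print(codes)
```

The cluster only ever sees the exit codes, which is what lets one resource manager drive many different services through the same interface.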

Common resource agent types:

LSB: all scripts in the /etc/init.d/ directory

OCF (Open Cluster Framework): more versatile than LSB.

Heartbeat legacy: all files under /etc/ha.d/resource.d/

Third, resource constraints

Resource placement tendency (how readily a resource moves between nodes):
Resource stickiness: a score for how strongly the resource prefers to stay on its current node, ranging from -oo to +oo
-oo: the resource leaves this node whenever another node is available.
+oo: the resource stays on this node as long as the node is available.

Resource constraints:
Location constraint: how strongly a resource prefers to run on a given node
Scores range from -inf to inf
Colocation constraints: how strongly resources prefer to run together on the same node
inf: the resources must run together
-inf: the resources must never run together
Order constraints: the order in which multiple resources are started and stopped when they run on the same node
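
How location scores and stickiness interact can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm any cluster manager actually runs; the function names, node names, and score values are assumptions. The one borrowed rule is that -inf overrides +inf when scores are added, which is how Pacemaker documents its score arithmetic.

```python
from typing import Dict, Optional

INF = float("inf")


def add_scores(a: float, b: float) -> float:
    """Add two scores; -inf always wins, so a ban can never be overridden."""
    if a == -INF or b == -INF:
        return -INF
    return a + b


def choose_node(location: Dict[str, float], stickiness: float,
                current_node: Optional[str]) -> Optional[str]:
    """Return the node with the highest total score, or None if none is usable."""
    totals = {}
    for node, score in location.items():
        # Stickiness only counts for the node the resource is running on now.
        if node == current_node:
            score = add_scores(score, stickiness)
        totals[node] = score
    if not totals:
        return None
    best = max(totals, key=totals.get)
    return None if totals[best] == -INF else best


# The resource failed over to node2 earlier; node1 is back online now.
location = {"node1": 100, "node2": 50}
print(choose_node(location, stickiness=0, current_node="node2"))    # fails back
print(choose_node(location, stickiness=INF, current_node="node2"))  # stays put
```

With zero stickiness the location preference decides and the resource moves back to the preferred node; with infinite stickiness it stays where it is, which avoids a second service interruption.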

To be Continued ...

This article is an English version of an article originally published in Chinese on aliyun.com.