Linux HA Cluster

Source: Internet
Author: User
Tags: failover

The main purpose of an HA (high availability) cluster is to increase the availability (online rate) of a service, that is, to shorten its average downtime. When one service node goes offline, another node that provides the same service takes over and continues serving, which avoids a single point of failure.

HA cluster concepts

I. Terminology

1. Availability

Availability, also called the online rate, is an important metric for evaluating HA clusters. It is calculated as follows:

Availability = mean time between failures / (mean time between failures + mean time to repair)

So there are two ways to improve the availability of the system:

1) increase the mean time between failures

2) shorten the mean time to repair (can be achieved through redundancy mechanisms)
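The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `availability` and the hour figures are made up for the example, not taken from any real cluster.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction between 0 and 1.

    mtbf_hours: mean time between failures
    mttr_hours: mean time to repair
    """
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("times must be non-negative")
    total = mtbf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTBF and MTTR cannot both be zero")
    return mtbf_hours / total


# Hypothetical figures: a failure every 1000 hours on average.
single_node = availability(1000, 10)     # 10 hours to repair by hand
with_failover = availability(1000, 0.5)  # 30 minutes for a standby to take over

print(f"single node:   {single_node:.4%}")
print(f"with failover: {with_failover:.4%}")
```

With the same mean time between failures, the shorter repair time gives the higher availability, which is why redundancy is the usual lever.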

2. Resources

Resources here are the components an HA cluster needs in order to provide a service. For example, a MySQL service requires an IP address (the endpoint through which clients access the database), the MySQL service script (which provides the database service), and a file system (which stores the data; it can be a local file system or a shared one such as NFS).

Different HA clusters require different resources.

3. Resource types
    • Primitive (native) resource: the basic resource type; it can run on only one node at a time

    • Group: a collection of multiple resources managed as a unit

    • Clone: a resource that can run on multiple nodes at once. Its configuration specifies the total number of copies and the maximum number of copies per node

    • Master/slave: a special kind of clone whose copies take master and slave roles (for example, DRBD)

4. Resource switching

Failover: when a node fails, its resources are transferred to another node.

Failback: when the failed node comes back online, whether it takes its resources back.
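A minimal sketch of the failover and failback decision, assuming a single resource with one preferred node. The function `choose_node` and the `auto_failback` flag are hypothetical names for this illustration and do not correspond to any particular cluster manager's API.

```python
from typing import Optional, Set


def choose_node(preferred: str,
                current: Optional[str],
                online: Set[str],
                auto_failback: bool) -> Optional[str]:
    """Pick the node that should run the resource, or None if no node is online."""
    if not online:
        return None
    if current in online:
        # The current node is healthy: move only if failback is enabled
        # and the preferred node is available again.
        if auto_failback and preferred in online:
            return preferred
        return current
    # Failover (or first start): the current node is gone.
    if preferred in online:
        return preferred
    return sorted(online)[0]  # deterministic choice among the survivors


# node1 is preferred; it fails, then comes back.
print(choose_node("node1", None, {"node1", "node2"}, False))     # first start
print(choose_node("node1", "node1", {"node2"}, False))           # failover
print(choose_node("node1", "node2", {"node1", "node2"}, False))  # no failback
print(choose_node("node1", "node2", {"node1", "node2"}, True))   # failback
```

Disabling failback avoids a second service interruption when the failed node returns; enabling it restores the preferred layout.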

5. Resource constraints

Constraints define resource stickiness and the relationships between resources. The common constraints are location constraints, colocation constraints, and order constraints.

II. HA cluster architecture

The architecture described here achieves high availability in software. Software that implements a highly available cluster needs to provide the following layers.

1. Messaging layer

The primary purpose of this layer is to pass "heartbeat" information. Heartbeat information (also known as status information) is a packet of a certain size sent by broadcast, multicast, or unicast. Each node can be configured with how often it sends heartbeats to the other nodes, and with how long the other nodes wait without hearing from a node before declaring that it has failed.
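The detection logic can be sketched as follows. This is a toy model, not how Heartbeat or Corosync is implemented: the class `HeartbeatMonitor` and its `deadtime` parameter are names chosen for the illustration, and timestamps are passed in explicitly instead of being read from the network or the clock.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Track the last heartbeat seen from each node."""

    def __init__(self, deadtime: float) -> None:
        # Seconds of silence after which a node is considered failed.
        self.deadtime = deadtime
        self.last_seen: Dict[str, float] = {}

    def record(self, node: str, now: float) -> None:
        """Note that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def dead_nodes(self, now: float) -> List[str]:
        """Return the nodes that have been silent for longer than deadtime."""
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.deadtime)


monitor = HeartbeatMonitor(deadtime=10.0)
monitor.record("node1", now=0.0)
monitor.record("node2", now=0.0)
monitor.record("node2", now=8.0)     # node2 keeps sending, node1 goes quiet
print(monitor.dead_nodes(now=9.0))   # nobody has exceeded deadtime yet
print(monitor.dead_nodes(now=12.0))  # node1 has been silent for 12 seconds
```

A shorter deadtime detects failures sooner but risks declaring a slow, healthy node dead.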

The software that can implement this feature is:

    • Heartbeat v1

    • Heartbeat v2

    • Heartbeat v3

    • Corosync

    • Cman

2. CRM (Cluster Resource Manager)

The CRM runs on every node in the HA cluster and is the core component of a highly available cluster; it holds the definitions of resources and their attributes. Each node also maintains a copy of the CIB (Cluster Information Base, an XML document describing the cluster) and runs an LRM (Local Resource Manager). Only the CIB on the DC (Designated Coordinator, the master node) can be modified; changes are then synchronized to the other nodes. The LRM is the component that actually carries out resource operations locally, starting and stopping resources as instructed by the CRM. When a node fails, the DC uses the PE (Policy Engine) and TE (Transition Engine) to decide whether to take over its resources.

Software that implements this layer includes:

1) Heartbeat v1: ships with the haresources resource manager, which is configured through a required file named haresources

2) Heartbeat v2: ships with its own resource manager, CRM, which requires crmd to run on every node. Configuration interfaces: command line: crmsh; GUI: hb_gui

3) Heartbeat v3 = heartbeat + pacemaker + cluster-glue

Pacemaker configuration interfaces: CLI: crm (SUSE), pcs; GUI: Hawk, pacemaker-mgmt

4) rgmanager (with cman as the messaging layer): manages the cluster through mechanisms such as failover domains and node priority.

Configuration interfaces: CLI: clustat, cman_tool; GUI: Conga (luci + ricci)

3. RA (Resource Agent)

A resource agent is a script that starts, stops, and reports the status of a cluster resource on the local node.

Common resource agent types:

LSB: all scripts in the /etc/init.d/ directory

OCF (Open Cluster Framework): more general-purpose than LSB

Heartbeat legacy: all scripts under /etc/ha.d/resource.d/
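To make the start, stop, and status contract concrete, here is a toy agent in Python. Real agents are usually shell scripts; the class `ToyResourceAgent` is invented for this sketch and manages nothing but an in-memory flag. The return codes follow the LSB init-script convention as assumed here (0 for success, 3 from status when the service is not running, 2 for an invalid argument).

```python
class ToyResourceAgent:
    """In-memory stand-in for a resource agent script."""

    OK = 0                # action succeeded, or status: running
    INVALID_ARGUMENT = 2  # unknown action
    NOT_RUNNING = 3       # status: service is not running

    def __init__(self) -> None:
        self.running = False

    def start(self) -> int:
        # Starting an already running resource still counts as success.
        self.running = True
        return self.OK

    def stop(self) -> int:
        # Stopping an already stopped resource still counts as success.
        self.running = False
        return self.OK

    def status(self) -> int:
        return self.OK if self.running else self.NOT_RUNNING

    def handle(self, action: str) -> int:
        """Dispatch an action name the way a script dispatches its first argument."""
        actions = {"start": self.start, "stop": self.stop, "status": self.status}
        if action not in actions:
            return self.INVALID_ARGUMENT
        return actions[action]()


agent = ToyResourceAgent()
for action in ("status", "start", "status", "stop", "status"):
    print(action, agent.handle(action))
```

The cluster manager only ever sees the action name and the return code, which is what lets it manage very different services through one interface.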

III. Resource constraints

Resource placement tendencies (where a resource tends to run or move):
Resource stickiness: how strongly the resource tends to stay on its current node; the value ranges from -inf to +inf
-inf: the resource moves away from its current node whenever another node is available.
+inf: the resource stays on its current node for as long as it can run there.

Resource constraints:
Location constraint: how strongly a resource tends to run on a particular node
Score from -inf to +inf
Colocation constraint: how strongly resources tend to run together on the same node
inf: the resources must run on the same node
-inf: the resources must never run on the same node
Order constraint: the order in which multiple resources are started and stopped when they run on the same node
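How stickiness and location scores interact can be sketched with a small scoring function. This is a simplified model written for illustration, not Pacemaker's actual placement algorithm: `place` and `add_scores` are hypothetical helpers, and the model assumes that -inf overrides any other score, including +inf.

```python
import math
from typing import Dict, Optional

INF = math.inf


def add_scores(a: float, b: float) -> float:
    """Add two scores; -inf wins over everything, then +inf."""
    if a == -INF or b == -INF:
        return -INF
    if a == INF or b == INF:
        return INF
    return a + b


def place(location: Dict[str, float],
          current: Optional[str],
          stickiness: float) -> Optional[str]:
    """Return the node with the highest total score, or None if all are banned."""
    best_node: Optional[str] = None
    best_score = -INF
    for node in sorted(location):
        score = location[node]
        if node == current:
            score = add_scores(score, stickiness)
        # A score of -inf never passes this test, so banned nodes are skipped.
        if score > best_score:
            best_node, best_score = node, score
    return best_node


scores = {"node1": 100, "node2": 50}
print(place(scores, current="node2", stickiness=0))    # location preference wins
print(place(scores, current="node2", stickiness=200))  # stickiness outweighs it
print(place({"node1": 100, "node2": -INF}, current="node2", stickiness=INF))
```

In the last call the resource is banned from node2, so even infinite stickiness cannot keep it there.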

To be Continued ...
