Cluster HA Theory

Source: Internet
Author: User
Tags: failover

Prerequisites for high-availability clusters:

1. Each node must continuously send heartbeat messages to the other nodes so that the cluster can determine each node's health status.

2. Nodes must authenticate each other (for example, with a shared key) so that arbitrary hosts cannot join the cluster.

3. A voting (quorum) mechanism decides which hosts may run services. A high-availability cluster usually has an odd number of nodes (at least three). When a fault splits the cluster into several partitions, the partition holding more than half of the votes continues as the cluster; partitions with half or fewer votes give up all services.
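As a rough sketch of points 2 and 3, assuming a corosync-based stack (node names are placeholders): the messaging layer authenticates peers with a shared key, and the quorum state can be checked from the command line.

```
# Generate the shared authentication key read by corosync
# (/etc/corosync/authkey); secauth must be enabled in corosync.conf.
corosync-keygen

# The key must be identical on every node.
scp /etc/corosync/authkey node2:/etc/corosync/
scp /etc/corosync/authkey node3:/etc/corosync/

# With three nodes, at least two votes are needed for a partition
# to keep quorum and continue running services (corosync 2.x votequorum).
corosync-quorumtool -s
```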

High-availability cluster resources:

Services running in the cluster, such as httpd and sshd.

Failover: when a node fails, the resources running on it are transferred to a standby node.

Failback: the process of transferring resources back after the faulty node recovers.

Resource stickiness: once a resource has started, how strongly it prefers to keep running on its current node.
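For example, with the pcs command line mentioned later, a cluster-wide default stickiness can be set; this is only a sketch and the value 100 is arbitrary.

```
# Every resource gains a score of 100 for the node it is currently on,
# so it stays put unless a stronger constraint pulls it elsewhere.
pcs resource defaults resource-stickiness=100
```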

Resource constraints: (constraints)

Location constraints: locations

Define the relationship between a resource and the nodes, that is, which node the resource tends to run on.

Scores range from negative infinity (-INF) to positive infinity (INF).

Positive infinity means the resource runs on that node whenever it possibly can; negative infinity means the opposite: it never runs there if it can avoid it.

Note the score arithmetic: adding positive infinity and negative infinity gives negative infinity, while adding any finite score to an infinite one leaves it infinite.
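A location-constraint sketch with pcs (the resource and node names are made up):

```
# Prefer node1 with a score of 200; never run on node3 if it can be avoided.
pcs constraint location WebIP prefers node1=200
pcs constraint location WebIP avoids node3=INFINITY
```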

Order constraint: Order

When there are multiple resources, define the order in which they are started and stopped.

For example, for a web service the IP resource must come up first, then the filesystem, and finally httpd.
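That ordering could be written with pcs roughly as follows (resource names are illustrative):

```
# Start the IP, then the filesystem, then the web server; stop in reverse order.
pcs constraint order WebIP then WebFS
pcs constraint order WebFS then WebSite
```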

Colocation constraints: colocation

Define the relationship between resources, that is, whether two resources tend to run together on the same node.

For example, the IP address and the filesystem must be on the same node.

Colocation is likewise scored from positive infinity (INF) to negative infinity (-INF).
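A colocation sketch for the same web service (again with made-up resource names):

```
# Keep the filesystem on the same node as the IP, and httpd with the filesystem.
pcs constraint colocation add WebFS with WebIP INFINITY
pcs constraint colocation add WebSite with WebFS INFINITY
```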

When a cluster has many hosts, a resource that is allowed to drift freely between them soon becomes hard to locate. Therefore you define a failover policy that limits the failover domain of each resource.

Left-symmetric (whitelist): resources may be transferred only to the hosts on the list.

Right-symmetric (blacklist): resources may be transferred to any host except those on the list.
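In a pacemaker-based cluster these two policies correspond to the symmetric-cluster property; a sketch with pcs (resource and node names are placeholders):

```
# Whitelist (opt-in): resources run only where a location constraint allows them.
pcs property set symmetric-cluster=false
pcs constraint location WebIP prefers node1=INFINITY

# Blacklist (opt-out): resources may run anywhere except where they are forbidden.
pcs property set symmetric-cluster=true
pcs constraint location WebIP avoids node3=INFINITY
```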

Hierarchy of highly available clusters:

[Figure: the layered architecture of a high-availability cluster]

1. Messaging and membership layer, also called the heartbeat layer: it transmits heartbeat information between nodes, which may be sent by broadcast, multicast, or unicast. The most important role of this layer is that the master node (the DC) uses the information provided by the messaging layer, through the Cluster Consensus Membership service (CCM or CCS), to build a complete membership view. The layer thus connects the layers above and below it: it passes the membership view produced from the lower-level messages up to the higher layers, so that every node knows the working status of the others and a failed node can be isolated by the upper layer. Software that runs directly on top of the messaging layer is called high-availability software, but writing such software is difficult and slow, which is why the second layer exists.


2. Cluster Resource Manager (CRM) layer: implements the cluster's resource-management services. Every node in this layer runs a CRM, which provides the core high-availability functions, including resource definitions and their attributes. On each node the CRM maintains a CIB (Cluster Information Base, an XML document) and an LRM (Local Resource Manager). Only the CIB on the DC (Designated Coordinator) may be modified; the CIBs on the other nodes are copies of the DC's. The LRM is the component that actually starts and stops a resource on the local node as instructed by the CRM. When a node fails, the DC decides whether and where resources should be taken over using the PE (Policy Engine) and the TE (Transition Engine). The DC is elected automatically by the cluster and does not need to be set manually. (A few inspection commands are sketched after the layer list below.)

 

3. Resource Agent (RA) layer: resource agents are scripts on each node that can start, stop, and report the status of a resource managed by the cluster. Resource agent classes include LSB (/etc/init.d/*), OCF (more capable and more portable than LSB), and the legacy heartbeat v1 agents.
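A few commands for poking at these layers on a pacemaker-based stack (a sketch; crmsh equivalents are shown as comments):

```
# Dump the CIB, the XML cluster information base replicated from the DC.
cibadmin --query

# One-shot status: which node is the DC, node states, resource placement.
crm_mon -1        # or: pcs status

# List resource agent classes, providers, and the OCF agents of one provider.
pcs resource standards              # e.g. lsb, ocf, service, stonith
pcs resource providers              # e.g. heartbeat, pacemaker
pcs resource agents ocf:heartbeat
# crmsh: crm ra classes / crm ra list ocf heartbeat
```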

Implementation tools at different layers:

Messaging layer: heartbeat v1, heartbeat v2, heartbeat v3, cman (RHCS, RHEL 5), corosync

CRM: haresources (heartbeat v1's own), crm (heartbeat v2's own CRM), pacemaker (split out of heartbeat v3 as an independent project that works with several messaging layers), rgmanager (RHCS, RHEL 6)

RA: LSB (/etc/rc.d/init.d/*), OCF (Open Cluster Framework), STONITH (power-switch fencing that cuts a node's power to prevent resource contention when the cluster partitions)

On RHEL 6 several messaging layer + CRM combinations are available:

1. cman + rgmanager

2. CMAN + pacemaker

3. heartbeat v3 + pacemaker

4. corosync + pacemaker

When corosync is installed, its RPM depends on some of heartbeat's resource agents, so heartbeat gets pulled in as a dependency; heartbeat itself does not need to be started, however. Heartbeat, on the other hand, does not depend on corosync.
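For combination 4 (corosync + pacemaker) on RHEL 6, a minimal corosync configuration might look roughly like the sketch below; the bind network, multicast address, and the corosync 1.x-style pacemaker service block are assumptions to adapt to your environment.

```
cat > /etc/corosync/corosync.conf <<'EOF'
totem {
    version: 2
    secauth: on                   # require the shared authkey
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0  # heartbeat network (placeholder)
        mcastaddr: 239.255.1.1    # multicast group (placeholder)
        mcastport: 5405
    }
}
logging {
    to_syslog: yes
}
service {
    ver: 0                        # load pacemaker as a corosync plugin (corosync 1.x)
    name: pacemaker
}
EOF
```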

Interfaces for editing the HA cluster configuration:

1. directly edit the configuration file

2. CLI: crmsh (SUSE) or pcs (Red Hat) command-line tools

3. GUI: heartbeat-gui, LCMC, Hawk, pygui, and Conga (RHCS)
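With pcs, bringing a cluster up looks roughly like this (pcs 0.9-era syntax; node names and the password are placeholders):

```
# Authenticate pcsd on all nodes as the hacluster user.
pcs cluster auth node1 node2 node3 -u hacluster -p 'secret'

# Create the cluster, push corosync.conf to every node, and start it.
pcs cluster setup --name webcluster node1 node2 node3
pcs cluster start --all
pcs status
```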

Fencing:

When a service node suddenly goes down, we must ensure it does not keep holding cluster resources while it is down. For example, if it is still attached to block-level shared storage (a SAN), the filesystem is very likely to be corrupted, so the node must be removed from the cluster completely and immediately.

Node-level fencing: uses a STONITH device, typically a power switch that accepts commands over the network. Once the resources have come up on the new host, a signal is sent to cut the power of the old host. If the old host is a virtual machine, its process can simply be killed instead.

Resource-level fencing: blocks the old host's access at the SAN, for example by disabling the port it uses to reach the storage.
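A node-level fencing sketch using an IPMI-controlled power device (the address, credentials, and host list are placeholders):

```
# Define a STONITH resource that can power-cycle node1 through its IPMI interface.
pcs stonith create fence-node1 fence_ipmilan \
    pcmk_host_list="node1" ipaddr="10.0.0.11" \
    login="admin" passwd="secret" lanplus=1 \
    op monitor interval=60s

# Fencing must be switched on for the cluster to actually use the device.
pcs property set stonith-enabled=true
```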

XML (Extensible Markup Language): the format of the cluster configuration (the CIB); it can be edited directly, but it is hard to configure by hand and not user-friendly.

Resource Type:

Primitive (native): a basic resource that runs on only one node at a time.

Group: a set of resources managed as a unit; a group runs on only one node.

Clone: a resource that runs on multiple nodes at the same time, for example a STONITH resource, so that wherever resources fail over, the old node can be fenced ("shot") at any time.

Master/slave: also a clone resource, but its instances are divided into master and slave roles, for example DRBD.
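Sketches of the four resource types with pcs (all names and parameters are illustrative, and WebFS/WebSite are assumed to exist already):

```
# primitive: an ordinary resource that runs on one node at a time.
pcs resource create WebIP ocf:heartbeat:IPaddr2 ip=192.168.1.100 cidr_netmask=24

# group: members live on the same node, start in order, stop in reverse.
pcs resource group add WebGroup WebIP WebFS WebSite

# clone: one instance per node, e.g. a connectivity check.
pcs resource create ping ocf:pacemaker:ping host_list=192.168.1.1 --clone

# master/slave: a clone whose instances take master and slave roles, e.g. DRBD.
pcs resource create WebData ocf:linbit:drbd drbd_resource=r0
pcs resource master WebDataClone WebData master-max=1 clone-max=2
```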

Cluster mode:

Active/passive: one active node and one standby node (master/standby model).

Active/active: both nodes carry services; with the facilities the cluster stack provides, a resource can run on multiple nodes at the same time.

N to M: N nodes and M services.

N to N: N nodes and N services.

Two-node clusters:

An HA cluster normally has at least three nodes, but a two-node cluster can work if additional arbitration is configured:

1. A ping node, for example one or more routers that each node pings to prove its own connectivity.

2. qdisk (quorum disk): an arbitration disk, i.e. a small shared disk.

The master node keeps writing to the qdisk while the other node keeps watching it. If no new data appears within the configured time, the other node concludes that the master has crashed, fences it via STONITH, and takes over the resources.
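On a pacemaker-based two-node cluster the usual sketch is to disable quorum-based shutdown and rely on fencing instead (with corosync 2.x, two_node: 1 in the quorum section achieves much the same):

```
# Two nodes can never hold "more than half" of the votes after a split,
# so do not stop services on quorum loss; let STONITH decide who survives.
pcs property set no-quorum-policy=ignore
pcs property set stonith-enabled=true
```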









