First, the concept of heartbeat
Linux-HA stands for High-Availability Linux, an open-source project whose goal is to provide, through the joint efforts of community developers, a cluster solution that enhances the reliability, availability, and serviceability (RAS) of Linux. Heartbeat is a component of the Linux-HA project and the most successful of the current open-source HA projects. It provides the basic functionality required of all HA software, such as heartbeat detection and resource takeover, monitoring of system services in the cluster, and transferring ownership of a shared IP address between cluster nodes. Since 1999, Heartbeat has been widely used in the industry and has gone through a number of releases. The latest version can be downloaded from the official Linux-HA website, www.linux-ha.org.
Second, the relevant terminology in the HA cluster
1. Nodes (node)
A stand-alone host that runs the heartbeat process is called a node. Nodes are the core components of an HA cluster, and each node runs an operating system and the Heartbeat software services. In a Heartbeat cluster, nodes are divided into primary and secondary roles, called the master node and the standby (backup) node, respectively. Each node has a unique host name and its own set of resources, such as disks, file systems, network addresses, and application services. The master node typically runs one or more application services, while the standby node is generally in a monitoring state.
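The monitoring role of the standby node can be sketched roughly as follows. This is a minimal illustration, not Heartbeat's actual implementation: the class `StandbyMonitor`, its methods, and the 30-second value are hypothetical, and the name `deadtime` is borrowed here only as a label for "how long the master may stay silent before it is considered failed".

```python
from dataclasses import dataclass


@dataclass
class StandbyMonitor:
    """Standby-side view of the master: tracks when a heartbeat last arrived."""
    deadtime: float          # seconds of silence tolerated (illustrative value)
    last_seen: float = 0.0   # timestamp of the most recent heartbeat

    def heartbeat(self, now: float) -> None:
        # Record a heartbeat received from the master at time `now`.
        self.last_seen = now

    def master_is_dead(self, now: float) -> bool:
        # The master is considered failed once the silence exceeds deadtime.
        return (now - self.last_seen) > self.deadtime


monitor = StandbyMonitor(deadtime=30.0)
monitor.heartbeat(now=100.0)
print(monitor.master_is_dead(now=110.0))  # False: only 10 s of silence
print(monitor.master_is_dead(now=131.0))  # True: 31 s exceeds the 30 s limit
```

Passing timestamps in explicitly, instead of reading the system clock, keeps the sketch deterministic and easy to test.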
2. Resources (Resource)
A resource is an entity that a node can control. When a node fails, its resources can be taken over by other nodes. In Heartbeat, the following entities can serve as resources:
Disk partitions, file systems
IP Address
Application Services
NFS File System
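The idea of resources moving from a failed node to another node can be sketched as a small data model. This is only an illustration under stated assumptions: the `Node` class, the `take_over` function, the node names, and the example resource strings (including the IP address) are all hypothetical and do not reflect Heartbeat's own configuration format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    resources: List[str] = field(default_factory=list)


def take_over(failed: Node, survivor: Node) -> None:
    """Move every resource from the failed node to the survivor, preserving order."""
    survivor.resources.extend(failed.resources)
    failed.resources.clear()


# Hypothetical resource set: one of each kind listed above.
master = Node("node1", ["ip 192.168.0.100", "filesystem /data", "service httpd"])
standby = Node("node2")

take_over(master, standby)
print(master.resources)   # the failed node no longer owns anything
print(standby.resources)  # the standby now owns all three resources
```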
3. Events (Event)
An event is something that can happen in a cluster, such as a node system failure, a network connectivity failure, a NIC failure, or an application failure. These events can cause a node's resources to be transferred, and HA testing is also based on these events.
4. Action (Action)
An action is how HA responds when an event occurs. Actions are controlled by shell scripts. For example, when a node fails, the backup node shuts down or starts services by executing predefined scripts, and then takes over the resources of the failed node.
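The relationship between events and actions can be sketched as a dispatch table. This is a rough illustration only: in Heartbeat the actions are scripts, while here plain Python functions stand in for them. The event names, the function names, and the particular mapping (for example, restarting a service on an application failure) are assumptions made for the example.

```python
from typing import Callable, Dict, List

log: List[str] = []  # records which action ran, in order


def take_over_resources(node: str) -> None:
    # Stand-in for a takeover script run on the backup node.
    log.append(f"take over resources of {node}")


def restart_service(node: str) -> None:
    # Stand-in for a script that restarts a failed application.
    log.append(f"restart service on {node}")


# Hypothetical event-to-action mapping.
ACTIONS: Dict[str, Callable[[str], None]] = {
    "node_failure": take_over_resources,
    "network_failure": take_over_resources,
    "nic_failure": take_over_resources,
    "application_failure": restart_service,
}


def handle_event(event: str, node: str) -> None:
    """Run the action registered for `event`; log and skip unknown events."""
    action = ACTIONS.get(event)
    if action is None:
        log.append(f"ignore unknown event {event}")
        return
    action(node)


handle_event("node_failure", "node1")
handle_event("application_failure", "node1")
print(log)
```

Keeping the mapping in a table rather than in nested conditionals makes it easy to see, and to test, which event triggers which action.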
Third, the composition and principle of heartbeat
1. Composition of the Heartbeat
Heartbeat provides the most basic functions of a high-availability cluster, such as internal communication between nodes, a cluster cooperation management mechanism, monitoring tools, and failover. The current version is Heartbeat 2.x, which is what this article mainly covers. The internal composition of Heartbeat 2.0 is introduced below. It is divided into the following major parts:
heartbeat: Communication and detection module between nodes
ha-logd: Cluster event log service
CCM (Consensus Cluster Membership): Cluster membership consistency management module
LRM (Local Resource Manager): Local resource management module
Stonith Daemon: Forces a problematic node out of the cluster environment
CRM (Cluster Resource Manager): Cluster resource management module
Cluster Policy Engine: Cluster policy engine
Cluster Transition Engine: Cluster transition engine
Figure 1 shows the internal structure of Heartbeat 2.0: