Explanation of Heartbeat v1 and NFS file sharing
High Availability Basics
I. Definition of high-availability clusters
A high-availability cluster (HA cluster) is a group of computers that, acting as a whole, provides users with a set of network resources; the individual computer systems are the cluster's nodes.
High-availability clusters exist to keep the cluster's services available as continuously as possible, reducing the losses caused by hardware and software failures. If a node fails, its standby node takes over its duties within seconds, so the service as a whole never stops. The main job of high-availability cluster software is to automate failure detection and service failover: when one server fails, another takes over its workload, and the system keeps serving clients without manual intervention. Hot standby is only one form of high availability; a high-availability cluster can have more than two nodes and offers capabilities beyond hot standby, so it adapts better to changing requirements.
II. Measuring high availability
High availability (HA) is measured by the system's reliability and maintainability. In engineering, reliability is usually expressed as MTTF (mean time to failure) and maintainability as MTTR (mean time to repair), so availability is defined as HA = MTTF / (MTTF + MTTR) × 100%.
Common availability targets and the yearly downtime they allow:
99%: at most about 3.65 days of downtime per year
99.9%: at most about 8.76 hours per year
99.99%: at most about 53 minutes per year
99.999%: at most about 5.3 minutes per year
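The availability formula and the "nines" table above are easy to check with a few lines of Python; this is just an illustration, and the MTTF/MTTR figures in it are made-up examples:

```python
# Availability from MTTF (mean time to failure) and MTTR (mean time to repair),
# plus the maximum yearly downtime a given number of "nines" allows.

HOURS_PER_YEAR = 365 * 24  # 8760

def availability(mttf_hours: float, mttr_hours: float) -> float:
    """HA = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

def yearly_downtime_hours(avail: float) -> float:
    """Downtime allowed per year at a given availability level."""
    return (1 - avail) * HOURS_PER_YEAR

# e.g. a node that fails on average every 1000 h and takes 1 h to repair:
print(f"availability = {availability(1000, 1):.4%}")

for nines in (0.99, 0.999, 0.9999, 0.99999):
    print(f"{nines:.3%} -> {yearly_downtime_hours(nines):.2f} h/year")
```

Running this reproduces the table: 99.9% availability, for instance, allows 8.76 hours of downtime per year.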
III. The working layers of HA
Layer 1, Messaging Layer: the heartbeat-message transport layer; it learns the online status of the underlying server resources and reports that status upward.
Layer 2, Cluster Resource Manager: the cluster resource management layer (CRM for short).
Layer 3, Resource Agents: the resource agent layer, where the resources themselves are defined and operated.
1. ccm component (Cluster Consensus Membership Service): tracks cluster membership and passes the result upward so the upper layers can decide what action to take. ccm can also build a topology map of every node's status from its own node's perspective, ensuring the node can act appropriately in special situations.
2. crmd component (Cluster Resource Manager, e.g. pacemaker): performs resource allocation. Every resource-allocation action must go through the CRM, which makes it the core component. The crmd on each node maintains a CIB that defines the specific attributes of resources and which resources are defined on which node.
3. cib component (Cluster Information Base): the cluster resource configuration, expressed in XML. It is persisted to a file but resident in memory while the cluster is running, and changes must be propagated to the other nodes. Only the CIB on the DC (Designated Coordinator) may be modified; the CIBs on the other nodes are replicas copied from the DC. The CIB can be edited from the command line or through a GUI front end.
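For illustration only, a heavily trimmed sketch of what a resource definition in the XML CIB can look like (the resource id and address are made up, and a real CIB contains many more required sections, such as node definitions and cluster properties):

```xml
<!-- Hypothetical, trimmed CIB fragment: one primitive resource (a floating IP)
     inside the <resources> section of the cluster-wide XML configuration. -->
<cib>
  <configuration>
    <resources>
      <primitive id="webip" class="ocf" provider="heartbeat" type="IPaddr">
        <instance_attributes id="webip-attrs">
          <nvpair id="webip-ip" name="ip" value="192.168.1.100"/>
        </instance_attributes>
      </primitive>
    </resources>
  </configuration>
</cib>
```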
4. lrmd component (Local Resource Manager): queries the status of local resources and manages them, for example starting or stopping local service processes as instructed by the CRM.
5. pengine component:
PE (Policy Engine): the policy engine, which computes a complete set of instructions for moving resources. It is only the decision maker; it does not carry out the transfer itself but hands its policy to the TE for execution.
TE (Transition Engine): executes the policy produced by the PE. PE and TE run only on the DC.
6. stonithd component
STONITH (Shoot The Other Node In The Head) operates directly on the power switch: if one node fails and another node detects it, the healthy node issues a command over the network to the faulty node's power switch, power-cycling it so that it restarts. This technique requires hardware support, such as a remotely controllable power switch.
A typical STONITH scenario (master/slave servers): the master is so busy that it fails to answer heartbeats for a while, and the slave seizes the service resources even though the master is not actually down. The result is resource contention, with clients reaching both the master and the slave. Reads alone might be tolerable, but concurrent writes from both nodes will corrupt the file system, and then everything is lost. To prevent such contention, an isolation mechanism is used: before the slave takes over the resources, it first STONITHs the master, i.e. "shoots it in the head" by cutting its power.
IV. High-availability cluster software
Messaging and Membership Layer:
Heartbeat (v1, v2, v3); in v3, heartbeat was split into heartbeat, pacemaker, and cluster-glue
Corosync
Cman
Keepalived
Ultra Monkey
Cluster Resource Manager (CRM) Layer:
haresources (heartbeat v1), crm (heartbeat v2)
Pacemaker (heartbeat v3/corosync)
Rgmanager (cman)
Common combinations:
Heartbeat v2 + haresources (or crm) (note: typically used on CentOS 5.x)
Heartbeat v3 + pacemaker (note: typically used on CentOS 6.x)
Corosync + pacemaker (note: the most common combination)
Cman + rgmanager (note: components of the Red Hat Cluster Suite, which also includes gfs2 and clvm)
Keepalived + lvs (note: the usual way to make lvs highly available)
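Since this article's topic is heartbeat v1 with NFS, here is a minimal sketch of the v1-style resource configuration: in heartbeat v1, resources are listed per node in /etc/ha.d/haresources. The hostname, addresses, and export path below are made-up examples:

```
# /etc/ha.d/haresources (heartbeat v1) -- illustrative sketch only.
# Format: <preferred-node> <resource>[::arg1::arg2...] ...
# Here: a floating IP, an NFS mount, and httpd, taken over together on failover.
node1.example.com IPaddr::192.168.1.100/24/eth0 Filesystem::192.168.1.10:/web::/var/www/html::nfs httpd
```

On failover, heartbeat starts the listed resources left to right on the surviving node (and stops them right to left when releasing them).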
Summary: technical blogs often describe making MySQL highly available with heartbeat + pacemaker, or with corosync + pacemaker, and readers ask which combination to use. After the explanation above, you should now have a good idea!