Linux HA Cluster: Principles, Models, and Instance Creation (1)
I. What Is a High-Availability Cluster?

A high-availability (HA) cluster is one in which, when a node or server fails, another node automatically and immediately takes over: the resources on the failed node are transferred to a surviving node, which then continues to provide the service externally. The purpose of an HA cluster is to switch resources and services automatically when a single node fails, so that the service is always online; the switchover is transparent to the client.

II. Measuring High Availability

High availability is generally measured in terms of system reliability and system maintainability. Reliability is usually expressed as MTTF (mean time to failure) and maintainability as MTTR (mean time to repair), so availability can be defined as:

HA = MTTF / (MTTF + MTTR) * 100%

Common availability levels:
99%: at most about 4 days of downtime per year
99.9%: at most about 10 hours of downtime per year
99.99%: at most about 1 hour of downtime per year
99.999%: at most about 5 minutes of downtime per year

III. HA Cluster Features

1. Redundant system: an HA cluster improves service availability by combining multiple hosts into a single cluster.

2. Voting system: when the nodes of an HA cluster can no longer detect one another's heartbeat, they must not continue to work uncoordinated; this state is called a partitioned cluster. Voting principles:
(1) The minority follows the majority (quorum):
    with quorum: votes > total / 2
    without quorum: votes <= total / 2
When the number of nodes is odd, the vote count alone can arbitrate; when it is even, an additional arbitration device is required.
(2) Arbitration devices:
quorum disk (qdisk): a small shared disk device (less than about 10 MB) that is mapped into all cluster nodes.
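The quorum rule above (a partition may keep working only with more than half of the total votes) can be sketched in a few lines. This is a minimal illustration of the principle, not the behavior of any particular stack such as corosync's votequorum:

```python
def has_quorum(votes_held: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return votes_held > total_votes / 2

# Odd-sized cluster: the vote count alone can arbitrate.
assert has_quorum(2, 3)        # the 2-node partition of a 3-node cluster wins
assert not has_quorum(1, 3)    # the isolated node must stop its services

# Even-sized cluster: a 1-1 split leaves neither side with quorum,
# which is why an arbitration device (qdisk or ping node) is needed.
assert not has_quorum(1, 2)
```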
qdiskd is a daemon that runs on every cluster node, periodically evaluating the health of its own node and regularly writing that status to the qdisk. After posting its own information, each qdiskd then reads the status of the other nodes from the qdisk; the qdisk carries a vote weight of (N/2).
ping node: all nodes ping a common gateway or other device at the same time and arbitrate based on that communication.

3. Failover and failback: failover is the transfer of a service away from a failed node; failback is the return of the service to the original node once it recovers. In heartbeat's ha.cf configuration file, failback is enabled with auto_failback on.

4. Heartbeat transmission mechanisms:
(1) Serial cable: serial-port connection; limited in range and not recommended;
(2) Ethernet cable: hosts connected through network interfaces (usually via a switch in between);
(3) UDP Unicast, UDP Multicast (the most common), and UDP Broadcast.
Note on multicast addresses: a multicast address identifies an IP multicast group. IANA (Internet Assigned Numbers Authority) allocates the Class D address space for IP multicast: 224.0.0.0-239.255.255.255. Permanent multicast addresses: 224.0.0.0-224.0.0.255; temporary multicast addresses: 224.0.1.0-238.255.255.255; locally scoped multicast addresses: 239.0.0.0-239.255.255.255, valid only within a specific local range.

IV. HA Cluster Working Models

1. Active/Passive (A/P, asymmetric): a two-node cluster working in master/slave mode. The cluster contains two nodes and one or more services. The passive node continuously checks the health of the active node; when the active node fails, the service automatically switches to the passive node so that it keeps running. The passive node runs nothing in normal operation, so its resources sit largely idle.

2. Active/Active (A/A, symmetric): a two-node cluster working in dual-master mode. The cluster contains two nodes and one or more services; each node runs different services, and the nodes act as backups for each other.
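The failover behavior these models share — a backup node noticing that the heartbeat has gone silent and taking over — can be sketched as follows. The class and method names are illustrative, not a real cluster API:

```python
import time

class PassiveNode:
    """Minimal sketch of the passive node in an HA pair: it tracks the
    active node's heartbeat and takes over once the heartbeat times out."""

    def __init__(self, heartbeat_timeout: float = 3.0):
        self.heartbeat_timeout = heartbeat_timeout
        self.last_heartbeat = time.monotonic()
        self.active = False

    def on_heartbeat(self) -> None:
        # Called whenever a heartbeat message arrives from the active node.
        self.last_heartbeat = time.monotonic()

    def check(self, now: float) -> None:
        # Failover: become active once the heartbeat has been silent too long.
        if not self.active and now - self.last_heartbeat > self.heartbeat_timeout:
            self.active = True  # take over the IP, storage, and service here
```

In a real deployment the takeover step would claim the virtual IP, mount shared storage (after fencing), and start the service; with auto_failback on, the reverse transfer happens when the original node returns.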
The two nodes monitor each other's health; when one node fails, the services on it automatically switch to the other node so that they keep running.

3. Multi-node model: M-N (M nodes, N services, M > N) or M-M (M nodes, M services). The cluster contains multiple nodes and multiple services; any given node may or may not be running a service, and each node monitors several specified services. When a node fails, its services automatically switch to one of the monitoring nodes.

V. Architecture Layers and Solutions of HA Clusters

1. Messaging Layer: the information layer. It transmits the heartbeat of the local node so that other nodes know whether this node is online; if it is not, resource transfer can be triggered through the relevant mechanisms. It also carries cluster transaction messages. (Each node installs the heartbeat software, the nodes are connected over the network, and they listen to one another on the agreed IP addresses and ports.) Solutions:
(1) heartbeat v1, v2 (the stable version), v3
(2) corosync (split out of the OpenAIS project for separate development; very capable)
(3) keepalived
(4) cman

2. CRM (Cluster Resource Manager): the cluster resource manager. It provides high availability for services that are not themselves highly available, and it calls on the Messaging Layer to do its work, so it sits on top of the Messaging Layer. Its main tasks are to decide, based on the health information passed up by the Messaging Layer, when services start and stop, how resources are transferred, and how resources are defined and allocated. Each node runs a CRM, and each CRM maintains a CIB (Cluster Information Base). Only the CIB on the master node may be modified; the CIBs on the other nodes are copies replicated from the master.
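This master-only-write, replicate-to-everyone handling of the CIB can be sketched as a toy model (illustrative only; pacemaker's real CIB is an XML document managed by its cib daemon, not a Python dictionary):

```python
class Cluster:
    """Toy model of CIB handling: only the master's CIB may be modified;
    every other node's CIB is a copy replicated from the master."""

    def __init__(self, master: str, others: list):
        self.master = master
        self.cib = {master: {}}
        for node in others:
            self.cib[node] = {}

    def configure(self, key: str, value: str) -> None:
        # All modifications go to the master's CIB first...
        self.cib[self.master][key] = value
        # ...and the change is then replicated to every other node.
        for node in self.cib:
            if node != self.master:
                self.cib[node] = dict(self.cib[self.master])

cluster = Cluster("node1", ["node2"])
cluster.configure("vip", "172.16.100.1")
print(cluster.cib["node2"])  # {'vip': '172.16.100.1'}
```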
The CRM also contains components such as the LRM and the DC. CRM implementations:
(1) heartbeat v1: haresources (configuration interface: a configuration file named haresources);
(2) heartbeat v2: crm (a crmd process runs on each node; configuration interfaces: the command-line client crmsh and the GUI client hb_gui);
(3) heartbeat v3: pacemaker (pacemaker can run in plug-in or standalone mode; configuration interfaces: CLI: crmsh, pcs; GUI: hawk (a web GUI), LCMC, pacemaker-mgmt);
(4) rgmanager (configuration interfaces: CLI: clustat, cman_tool; GUI: Conga (luci + ricci)).

Common combinations:
(1) heartbeat v1
(2) heartbeat v2
(3) heartbeat v3 + pacemaker
(4) corosync + pacemaker
(5) cman + rgmanager (RHCS)
(6) cman + pacemaker

3. LRM (Local Resource Manager): the local resource manager, a component of the CRM. It obtains the status of resources and manages local resources; for example, it starts local services when no heartbeat information is detected.

4. DC: can be understood as the transaction coordinator. When cluster nodes fail and the cluster splits into partitions, the partitions may fight over resources because services may be running in each of them; the DC therefore decides, based on which partition holds the legal number of votes, which nodes start services and which stop them.

5. Resource isolation components: if the master node merely hangs, the backup node will immediately seize its resources while the master may still be in the middle of a write operation; once the backup node also writes, the file system becomes inconsistent and the servers crash. Resources therefore require an isolation (fencing) mechanism:
(1) Node-level isolation: STONITH (Shoot The Other Node In The Head) restarts or shuts down the node by controlling its power switch;
(2) Resource-level isolation: for example, an FC SAN switch can deny a node access at the level of the storage resource.

6.
RA (Resource Agent): an RA is, in practice, a script that operates a resource — starting, stopping, and monitoring it — and one node can have many RAs. Classes:
(1) heartbeat legacy: traditional RAs, usually located in the /etc/ha.d/haresources.d/ directory;
(2) LSB (Linux Standard Base): the scripts in the /etc/rc.d/init.d/ directory; each script must support at least four arguments: {start|stop|restart|status};
(3) OCF (Open Cluster Framework): further subdivided by provider;
(4) STONITH: dedicated to resources that drive the functions of STONITH devices; such resources are generally of the clone type.

7. Resource: a resource is one of the items needed to bring up a service. For example, starting an httpd service requires an IP address, a service script, and a file system (used to store data); all of these are collectively called resources.
(1) Resource types:
(a) primitive: a primary resource (also called native); it can run on only one node in the cluster;
(b) group: a group resource, a container holding one or more resources, which can then be scheduled together as a unit;
(c) clone: a cloned resource; multiple clones can run on multiple nodes in the same cluster;
(d) master/slave: a master/slave resource; two copies run on two nodes in the same cluster, one as the master and the other as the slave.
(2) Resource constraints:
(a) location: a location constraint defines the resource's preference for a node, expressed as a score from -oo to +oo;
(b) colocation: a colocation constraint defines how strongly resources prefer to be "together" (-oo to +oo); a group can also be used to bind multiple resources together;
(c) order: an order constraint defines startup order; for example, mount the shared storage first, then start the httpd or mysqld service.
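The order constraint in (2)(c) amounts to starting resources in dependency order, which can be sketched as a small topological sort. The resource names here are hypothetical, and real deployments declare such constraints through crmsh or pcs rather than code:

```python
def start_order(deps: dict) -> list:
    """Return a start sequence in which every resource comes after the
    resources it depends on (simple depth-first topological sort)."""
    order, seen = [], set()

    def visit(res: str) -> None:
        if res in seen:
            return
        seen.add(res)
        for dep in deps.get(res, []):   # start dependencies first
            visit(dep)
        order.append(res)

    for res in deps:
        visit(res)
    return order

# httpd needs the virtual IP and the shared file system to be up first.
deps = {"httpd": ["vip", "shared_fs"], "vip": [], "shared_fs": []}
print(start_order(deps))  # ['vip', 'shared_fs', 'httpd']
```

Stopping resources follows the reverse of this sequence, which is why an order constraint implies both a start order and a stop order.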