LINUX Cluster Learning One

Last Update:2018-04-21 Source: Internet

Author: User

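Server clusters and network clusters both exist to provide redundant service capacity, but they are implemented differently. Network redundancy only has to give traffic a redundant path: when the primary link fails, traffic passes over the backup link. For stateful protocols such as TCP, some devices such as firewalls perform stateful inspection, and for those it is enough to enable session synchronization between the primary and backup devices. Servers are more complicated, because a server is the endpoint of the traffic: it has to provide the service directly, and the data stored on it must be accessible to the service. Letting traffic pass through is no longer enough. I have always worked in networking, and when I first encountered server clusters I did not understand how they worked at all; only after repeated study did I gain a rough understanding. This article briefly introduces the principles of server clustering, and later articles will summarize the experiments from my studies.

Server clusters mainly provide service redundancy and high performance. As far as I understand so far, there are two main types: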

Load balancing: a load-balancing device forwards service requests to multiple real servers behind it, so that several servers serve clients at the same time and the service achieves high concurrency.
High availability: several servers provide the same service redundantly and form a high-reliability cluster. Normally only the primary node responds to requests; when the primary node fails, a backup node must take over immediately and continue to provide the service.
The two types can be combined, as the figure in the original article shows. The front end is a cluster of the second type: two devices, master and backup, form a virtual node VS that provides the service, and the node that actually carries the VS function is the master. Under normal circumstances the master receives requests from outside and forwards them to the back end. When the master fails, the backup node takes over, continues to receive requests, and forwards them to the back end. This gives highly reliable request acceptance at the front end and high-concurrency service response at the back end.
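
The combined design can be sketched in a few lines of Python. This is only an illustrative model, not how any particular load balancer is implemented: the class `VirtualServer`, the node names, and the choice of round-robin scheduling are all assumptions made for the example.

```python
from itertools import cycle


class VirtualServer:
    """Hypothetical model of the virtual node VS: a master/backup pair
    in front of a pool of real servers."""

    def __init__(self, front_nodes, real_servers):
        self.front_nodes = list(front_nodes)  # listed in priority order
        self.healthy = set(front_nodes)
        self._pool = cycle(real_servers)      # assumed round-robin scheduling

    def active_node(self):
        # The first healthy front-end node carries the VS function.
        for node in self.front_nodes:
            if node in self.healthy:
                return node
        raise RuntimeError("no front-end node available")

    def fail(self, node):
        self.healthy.discard(node)

    def handle(self, request):
        # The active front-end node forwards the request to the next real server.
        return self.active_node(), next(self._pool), request


vs = VirtualServer(["master", "backup"], ["rs1", "rs2", "rs3"])
before = [vs.handle(f"req{i}") for i in range(3)]
vs.fail("master")                 # the master fails
after = vs.handle("req3")         # the backup now accepts and forwards requests
```

Clients always talk to VS, so they do not notice which front-end node is currently doing the forwarding.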

Here we focus on the second type, in which multiple devices form a cluster that provides redundancy. In plain terms, such a cluster works like this: while the primary node provides the service, it enables the service-related resources and the standby node keeps them disabled; when the primary node fails, the standby node enables the resources and provides the service, while the failed primary stops them.
In a clustered system these resources are no longer managed by each server on its own but by unified software called the CRM (cluster resource manager). Its main role is to collect information from every member node and calculate which node should provide the service. The LRM (local resource manager) on each node then carries out the specific operations on the resources, and the components that perform those operations on a particular resource are called RAs (resource agents).
Where does the CRM get the information its calculation depends on? This is the job of the cluster messaging layer, through which the nodes communicate with each other and gather information about the cluster. The structure of a cluster can therefore be roughly described as a stack with the messaging layer at the bottom, the CRM above it, and the LRM and RAs on each node doing the actual work (the original article shows this as a figure).
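
The division of labour between CRM, LRM and RA can be illustrated with a small simulation. All class and method names below are hypothetical, and the set `alive_nodes` merely stands in for whatever the messaging layer reports; real cluster software is far more involved.

```python
class ResourceAgent:
    """Hypothetical RA: knows how to start and stop one resource."""

    def __init__(self, name):
        self.name = name
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


class LocalResourceManager:
    """Hypothetical LRM: runs on one node and drives that node's RAs."""

    def __init__(self, node, resource_names):
        self.node = node
        self.agents = {name: ResourceAgent(name) for name in resource_names}

    def apply(self, should_run):
        for agent in self.agents.values():
            if should_run:
                agent.start()
            else:
                agent.stop()

    def running(self):
        return sorted(n for n, a in self.agents.items() if a.running)


class ClusterResourceManager:
    """Hypothetical CRM: decides which node owns the resources."""

    def __init__(self, lrms):
        self.lrms = lrms  # listed in priority order

    def reconcile(self, alive_nodes):
        # Pick the first node that the messaging layer reports as alive.
        owner = next((l.node for l in self.lrms if l.node in alive_nodes), None)
        # Simplification: a dead node's LRM is "stopped" directly here.
        for lrm in self.lrms:
            lrm.apply(lrm.node == owner)
        return owner


resources = ["vip", "httpd"]
crm = ClusterResourceManager([
    LocalResourceManager("node1", resources),
    LocalResourceManager("node2", resources),
])
owner_normal = crm.reconcile({"node1", "node2"})   # both nodes alive
owner_failed = crm.reconcile({"node2"})            # node1 has failed
```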

How should these resources be combined? Can they run together? Must they run together? This is where resource stickiness and constraints come in. Stickiness also covers the case where the primary and standby nodes differ in performance: resources should run on the more capable node, so they should be made stickier to it.
Resource stickiness: how strongly a resource is tied to a node, expressed as a score.
Resource group: resources are grouped so that they move between nodes together.
Resource constraints: besides groups, constraints can define the relationships between resources.
  Colocation constraint: a dependency between resources that defines whether they can run on the same node, expressed as a score
    Positive value: the resources can run together
    Negative value: the resources cannot run together
  Location constraint: a score that measures how strongly a resource prefers a node
    Positive value: the resource tends to run on this node
    Negative value: the resource tends to avoid this node
  Order constraint: defines the order in which resources are started or stopped
A score is a positive or negative integer, or one of the following:
  -INF: negative infinity
  INF: positive infinity
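
To make the role of scores concrete, here is a minimal sketch of score-based placement. It borrows two conventions from Pacemaker (INFINITY is treated as 1,000,000, and adding INF to -INF yields -INF); everything else, including the function names, the sample scores, and the rule that only a -INF total forbids a node, is a simplification for illustration.

```python
INF = 1_000_000  # Pacemaker treats INFINITY as 1,000,000


def add_scores(a, b):
    """Add two scores with saturating infinity; -INF always wins."""
    if a <= -INF or b <= -INF:
        return -INF
    if a >= INF or b >= INF:
        return INF
    return max(-INF, min(INF, a + b))


def place(scores_by_node):
    """Return (chosen node, totals). Nodes whose total is -INF are excluded."""
    totals = {}
    for node, scores in scores_by_node.items():
        total = 0
        for score in scores:
            total = add_scores(total, score)
        totals[node] = total
    allowed = {n: t for n, t in totals.items() if t > -INF}
    if not allowed:
        return None, totals
    # sorted() makes ties deterministic: the first name in order wins.
    return max(sorted(allowed), key=allowed.get), totals


# node1 has a location preference of 100; the resource currently runs on
# node2, which has a location preference of 50 plus a stickiness of 200.
chosen, totals = place({"node1": [100], "node2": [50, 200]})

# A -INF location constraint on node2 forces the resource away from it.
chosen_banned, totals_banned = place({"node1": [100], "node2": [50, 200, -INF]})
```

In the first call stickiness keeps the resource on node2 even though node1 has the higher location score; in the second call the -INF constraint overrides everything else.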
When the primary and standby switch over, resource isolation (fencing) is needed to keep a failed device from flapping between normal and abnormal states and switching the service back and forth.
Resource isolation:
  Node level: STONITH (shoot the other node in the head), which cuts off the failed node itself
  Resource level: for example, an FC SAN switch can deny a node access at the storage level

Split brain: when cluster nodes cannot reliably obtain each other's state information, the cluster splits and the parts compete for the resources. In networking this is called a dual-master state.
The most serious consequence of split brain is that both sides seize the shared storage and corrupt the data.
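
The following sketch shows why isolating the old owner before takeover protects shared storage. It models resource-level fencing only, in the spirit of the FC SAN example; `SharedStorage` and `take_over` are hypothetical names, not part of any cluster software.

```python
class SharedStorage:
    """Hypothetical shared disk that only accepts writes from allowed nodes."""

    def __init__(self):
        self.allowed = set()
        self.log = []

    def write(self, node, data):
        if node not in self.allowed:
            raise PermissionError(f"{node} is fenced from storage")
        self.log.append((node, data))


def take_over(storage, new_owner, old_owner):
    # Fence first, then grant access, so both nodes can never write at once.
    storage.allowed.discard(old_owner)
    storage.allowed.add(new_owner)


storage = SharedStorage()
storage.allowed.add("node1")
storage.write("node1", "a")

# node2 loses contact with node1 and takes over. node1 may in fact still be
# alive (split brain), but it has been fenced, so its write is rejected.
take_over(storage, "node2", "node1")
try:
    storage.write("node1", "b")
    stale_write_blocked = False
except PermissionError:
    stale_write_blocked = True
storage.write("node2", "c")
```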

When the cluster master fails, the standby device must take over its functions. But how can the standby confirm that it is the peer that has failed, rather than the standby itself?

Two nodes: rely on a third party as a reference, such as the gateway
Or use a quorum disk: the primary node writes data to the disk, and the other nodes monitor whether the primary is still writing
Odd number of nodes: introduce the concept of quorum (legal votes). Each node can have one vote, and when the cluster splits, the part holding the majority of the votes wins. The votes per node need not be equal; a high-performance node can hold several votes.
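
The vote counting can be expressed as a short function. This is a sketch assuming the usual rule that a partition needs strictly more than half of all votes; the function name and the vote tables are made up for the example.

```python
def has_quorum(partition, votes):
    """True if the nodes in `partition` hold more than half of all votes."""
    total = sum(votes.values())
    held = sum(votes[node] for node in partition)
    return 2 * held > total


# Three nodes with one vote each: the two-node side wins after a split.
equal = {"a": 1, "b": 1, "c": 1}
majority_side = has_quorum({"a", "b"}, equal)
minority_side = has_quorum({"c"}, equal)

# Two nodes with one vote each: neither side has a majority after a split,
# which is why a third party or a quorum disk is needed as a tiebreaker.
pair = {"a": 1, "b": 1}
pair_side = has_quorum({"a"}, pair)

# Unequal votes: a high-performance node holds three votes.
weighted = {"big": 3, "n1": 1, "n2": 1, "n3": 1}
big_alone = has_quorum({"big"}, weighted)        # 3 of 6 is not a majority
big_plus_one = has_quorum({"big", "n1"}, weighted)
```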

Specific implementations:
Messaging layer:
heartbeat: v1 (old version), v2 (mature version), v3; listens on UDP port 694
  Heartbeat v3 was split into several separate projects:
    heartbeat v3: the messaging layer
    pacemaker: the CRM
    cluster-glue
corosync: the default software since Red Hat 6.0. It performs better than heartbeat but has no resource management of its own, so it must be combined with pacemaker; the combination is equivalent to heartbeat v3
cman (cluster manager): used in Red Hat 5.0. Red Hat has a dedicated RHCS suite, and cman is its core component
keepalived: designed to provide high availability for LVS directors
ultramonkey:
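
Whatever the implementation, the messaging layer has to decide when a silent peer counts as dead. The sketch below shows the basic timeout logic with an injected clock instead of real UDP traffic; the class name, the `deadtime` parameter name, and the time values are assumptions for illustration and do not reproduce any project's actual protocol.

```python
class HeartbeatMonitor:
    """Hypothetical failure detector: a node is dead if it has been silent
    for longer than `deadtime` seconds."""

    def __init__(self, deadtime):
        self.deadtime = deadtime
        self.last_seen = {}

    def beat(self, node, now):
        # Called whenever a heartbeat message from `node` arrives.
        self.last_seen[node] = now

    def dead_nodes(self, now):
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.deadtime
        )


monitor = HeartbeatMonitor(deadtime=10)
monitor.beat("node1", now=0)
monitor.beat("node2", now=0)
monitor.beat("node2", now=8)             # node1 has stopped sending

dead_early = monitor.dead_nodes(now=9)   # node1 silent for 9 s: still alive
dead_late = monitor.dead_nodes(now=12)   # node1 silent for 12 s: declared dead
```

Passing `now` explicitly keeps the example deterministic; a real monitor would read a clock and receive the heartbeats from the network.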

CRM:
heartbeat v1 has a built-in resource manager, haresources, which provides CRM for heartbeat v1
heartbeat v2 has built-in resource managers that provide CRM for heartbeat v2:
  haresources: compatible with v1
  crm: has to listen on a network interface, and can then be managed through a graphical interface
heartbeat v3:
  pacemaker: the CRM was split out as an independent project during heartbeat v3 development; it provides CRM functionality for heartbeat v3 or corosync
rgmanager: resource group manager, the CRM provided specifically for cman

Resource types:
primitive: a primary resource that can run on only one node at a time, such as the virtual IP address through which the service is provided
clone: a clone of a primary resource that must run on several nodes at the same time, such as the RA that manages STONITH
group: several resources collected into one container so that they move together
master/slave: a special clone resource that can run on only two nodes, whose instances have a master-slave relationship

RA classes: an RA receives instructions from the LRM and carries out the control of a resource
legacy heartbeat v1 RA: dedicated to heartbeat v1
LSB: scripts under /etc/rc.d/init.d/ that follow the Linux init script style and accept four arguments: start|stop|restart|status
OCF (Open Cluster Framework): in addition to the four arguments above, these agents accept others such as monitor, and they may come from different providers:
  pacemaker
  linbit (DRBD)
STONITH: dedicated to managing hardware STONITH devices
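
The way an RA maps an action name to an operation and a return code can be sketched as follows. Real agents are usually shell scripts invoked with the action as an argument; this Python version only illustrates the dispatch. The exit codes follow the OCF convention (0 success, 3 unimplemented, 7 not running), while `DummyService` and `run_action` are hypothetical names.

```python
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


class DummyService:
    """Stand-in for the real resource (a daemon, an IP address, ...)."""

    def __init__(self):
        self.running = False


def run_action(service, action):
    """Dispatch one action the way an OCF-style agent would."""
    if action == "start":
        service.running = True
        return OCF_SUCCESS
    if action == "stop":
        service.running = False
        return OCF_SUCCESS
    if action == "restart":
        run_action(service, "stop")
        return run_action(service, "start")
    if action in ("status", "monitor"):
        return OCF_SUCCESS if service.running else OCF_NOT_RUNNING
    return OCF_ERR_UNIMPLEMENTED


svc = DummyService()
rc_before = run_action(svc, "monitor")    # resource not started yet
rc_start = run_action(svc, "start")
rc_after = run_action(svc, "monitor")
rc_unknown = run_action(svc, "promote")   # not implemented in this sketch
```

The cluster calls monitor periodically, so the distinction between "not running" and a generic error is what lets it tell a cleanly stopped resource from a failed one.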

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
