Linux High Availability Solution: Heartbeat CRM Node Score Calculation


CRM resource score overview
Heartbeat V2 adds resource monitoring and failover, and supports clusters of more than two nodes. To decide which node runs each resource, Heartbeat uses a scoring mechanism: it calculates a total score for every node, and the node with the highest score runs the resource (or resource group) in the active state.
If nothing is configured in the CIB, the score a resource gains for running (resource-stickiness) defaults to 0, and the score applied after each failure (resource-failure-stickiness) also defaults to 0. In that case Heartbeat only restarts the resource in place, no matter how many times it fails. In general, resource-stickiness is set to a positive value and resource-failure-stickiness to a negative value. Two special values are also available: positive infinity (INFINITY) and negative infinity (-INFINITY). If a node's score is negative, that node never takes over the resource under any circumstances (it acts as a cold standby node). Node scores change as the state of the resource changes. As soon as another node's score exceeds the score of the node currently running the resource, Heartbeat switches the resource: the node running it releases it, and the node with the higher score takes it over.
Resource score configuration
In the CIB configuration you can define a score for each resource with resource-stickiness, and a score that is applied after each failure with resource-failure-stickiness, as follows:
<primitive id="mysql_db" class="ocf" type="mysql" provider="heartbeat">
  <meta_attributes id="mysql_db_meta_attr">
    <attributes>
      <nvpair name="resource_stickiness" id="mysql_db_meta_attr_1" value="100"/>
      <nvpair name="resource_failure_stickiness" id="mysql_db_meta_attr_2" value="-100"/>
    </attributes>
  </meta_attributes>
  ...
</primitive>
The preceding configuration sets two scores for the resource mysql_db: the score gained while the resource runs successfully (resource_stickiness) and the score lost on each failure (resource_failure_stickiness). Here the two have the same magnitude: 100 points for success and -100 points for each failure.
Instead of setting the two scores on each resource, you can set the same defaults for all resources, as shown below:
<configuration>
  <crm_config>
    <cluster_property_set id="cib-bootstrap-options">
      <attributes>
        ...
        <nvpair id="default-resource-failure-stickiness" name="default-resource-failure-stickiness" value="-100"/>
        <nvpair id="default-resource-stickiness" name="default-resource-stickiness" value="100"/>
        ...
      </attributes>
    </cluster_property_set>
  </crm_config>
  ...
</configuration>
This configuration sets two default scores for all resources, which saves setting them on each resource separately. If a resource also has its own scores configured, those per-resource scores are used instead of the defaults.
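The override rule can be sketched in Python as follows. This is only an illustration of the precedence described above: the function name `effective_value` and the dictionary layout are hypothetical and are not part of Heartbeat.

```python
# Hypothetical illustration of the precedence rule: a value set on the
# resource itself wins over the cluster-wide default.
CLUSTER_DEFAULTS = {
    "resource_stickiness": 100,           # default-resource-stickiness
    "resource_failure_stickiness": -100,  # default-resource-failure-stickiness
}

def effective_value(resource_attrs, name, defaults=CLUSTER_DEFAULTS):
    """Return the per-resource setting if it is defined, else the default."""
    if name in resource_attrs:
        return resource_attrs[name]
    return defaults[name]

# A resource that overrides only its stickiness (illustrative value).
mysql_db = {"resource_stickiness": 200}
print(effective_value(mysql_db, "resource_stickiness"))
print(effective_value(mysql_db, "resource_failure_stickiness"))
```

The first call returns the per-resource value and the second falls back to the cluster default.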
In addition to resource scores, nodes also have scores. A node score can be set as follows:
<constraints>
  <rsc_location id="rsc_location_group_mysql" rsc="group_mysql">
    <rule id="mysql1_group_mysql" score="200">
      <expression id="mysql1_group_mysql_expr" attribute="#uname" operation="eq" value="mysql1"/>
    </rule>
    <rule id="mysql2_group_mysql" score="150">
      <expression id="mysql2_group_mysql_expr" attribute="#uname" operation="eq" value="mysql2"/>
    </rule>
  </rsc_location>
</constraints>
Note that node scores are set under the constraints section of the configuration, through rules. Each rule matches a node by host name (many Heartbeat settings depend on the host name). The value attribute of the expression is the node's host name, and the score attribute of the rule is that node's score.
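To see how the rules map host names to scores, the constraint can be read with Python's standard xml.etree.ElementTree module. This is a minimal sketch: the helper name `node_scores` is hypothetical, the snippet assumes the lowercase element names used in the CIB, and it handles only integer scores (not INFINITY).

```python
import xml.etree.ElementTree as ET

CONSTRAINTS_XML = """
<constraints>
  <rsc_location id="rsc_location_group_mysql" rsc="group_mysql">
    <rule id="mysql1_group_mysql" score="200">
      <expression id="mysql1_group_mysql_expr" attribute="#uname" operation="eq" value="mysql1"/>
    </rule>
    <rule id="mysql2_group_mysql" score="150">
      <expression id="mysql2_group_mysql_expr" attribute="#uname" operation="eq" value="mysql2"/>
    </rule>
  </rsc_location>
</constraints>
"""

def node_scores(xml_text):
    """Map each host name matched by a '#uname eq' expression to its rule score."""
    scores = {}
    root = ET.fromstring(xml_text)
    for rule in root.iter("rule"):
        for expr in rule.iter("expression"):
            if expr.get("attribute") == "#uname" and expr.get("operation") == "eq":
                scores[expr.get("value")] = int(rule.get("score"))
    return scores

print(node_scores(CONSTRAINTS_XML))
```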
Node score calculation rules
In the CRM, the score of a node is calculated according to the following rule:
Score = node + resource + failcount * failure
Here node is the node score, resource is the resource-stickiness score, failcount is the number of failures of the resource on that node, and failure is the resource-failure-stickiness score.
When HB finds that a resource can no longer run on a node or has been switched away from it, the success score (default-resource-stickiness) previously added to that node is subtracted.
When a resource fails on a node, HB adds the failure score (default-resource-failure-stickiness) to that node.
When a resource starts successfully on a node, HB adds the success score (default-resource-stickiness) to that node.
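The rule above can be written as a one-line Python function. The function and parameter names are chosen here for illustration only; the values come from the configuration shown earlier.

```python
def node_score(node, resource, failcount, failure):
    """Score = node + resource + failcount * failure."""
    return node + resource + failcount * failure

# Node score 200, resource running (stickiness 100), one failure at -100 each.
print(node_score(200, 100, 1, -100))
# Node score 150, resource not running here, no failures.
print(node_score(150, 0, 0, -100))
```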
Score calculation for a single resource in a single resource group
With the above configuration, the scores work out as follows:
A. When heartbeat is first started on both nodes, neither node is running the resource yet. No resource score applies, so only the node scores count:
mysql1 score: node + resource + failcount * failure = 200 + 0 + (0 * (-100)) = 200
mysql2 score: node + resource + failcount * failure = 150 + 0 + (0 * (-100)) = 150
Heartbeat chooses to run the resource mysql_db on mysql1. The score of mysql1 then changes, because the resource score is added:
mysql1 score: node + resource + failcount * failure = 200 + 100 + (0 * (-100)) = 300
mysql2 score: node + resource + failcount * failure = 150 + 0 + (0 * (-100)) = 150
B. After a while, the heartbeat monitor detects that mysql_db has crashed (or has some other problem). The scores change immediately, as shown below:
mysql1 score: node + resource + failcount * failure = 200 + 100 + (1 * (-100)) = 200
mysql2 score: node + resource + failcount * failure = 150 + 0 + (0 * (-100)) = 150
Heartbeat finds that the score of mysql1 is still higher than that of mysql2, so the resource is not migrated; it is restarted on mysql1.
C. After running for a further period, another problem occurs (or the restart in step B fails), and the scores change again:
mysql1 score: node + resource + failcount * failure = 200 + 100 + (2 * (-100)) = 100
mysql2 score: node + resource + failcount * failure = 150 + 0 + (0 * (-100)) = 150
At this point heartbeat finds that mysql2 has a higher score than mysql1, so the resource is migrated. mysql1 releases the resources related to mysql_db, and mysql2 takes them over and runs mysql_db. The node scores then change as follows:
mysql1 score: node + resource + failcount * failure - resource = 200 + 100 + (2 * (-100)) - 100 = 0
mysql2 score: node + resource + failcount * failure = 150 + 100 + (0 * (-100)) = 250
If the resource then fails three times on mysql2, the score of mysql2 drops to -50, which is lower than that of mysql1 (0). The resource is migrated back to mysql1. The score of mysql1 becomes 100, and the score of mysql2 becomes -150, because mysql2 no longer runs the resource and loses its 100 points. The score of mysql2 is now negative. Heartbeat has a further rule that a resource is never migrated to a node with a negative score. That is, no matter how many more times mysql_db fails on mysql1, or what other problems it has, it will not be migrated back to mysql2. A node's scores are reset to their initial state when heartbeat is restarted on that node. You can also view or reset the failcount of a resource or resource group on a node with the following commands:
# crm_failcount -G -U mysql1 -r mysql_db    # view the failcount of the resource mysql_db on node mysql1
# crm_failcount -D -U mysql1 -r mysql_db    # reset the failcount of the resource mysql_db on node mysql1
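The walkthrough above can be replayed with a short Python simulation. It is a sketch of the scoring rules as this article describes them, not of Heartbeat's actual policy engine. The function names are hypothetical, and the migration condition (move only to a strictly higher, non-negative score) is an assumption taken from the text.

```python
NODE_SCORE = {"mysql1": 200, "mysql2": 150}  # from the rsc_location rules
STICKINESS = 100                             # resource_stickiness
FAILURE_STICKINESS = -100                    # resource_failure_stickiness

def score(node, active, failcount):
    """Score = node + resource + failcount * failure, for one node."""
    resource = STICKINESS if node == active else 0
    return NODE_SCORE[node] + resource + failcount[node] * FAILURE_STICKINESS

def fail_once(active, failcount):
    """Count one failure on the active node; return the node that runs the resource next."""
    failcount[active] += 1
    scores = {n: score(n, active, failcount) for n in NODE_SCORE}
    best = max(scores, key=scores.get)
    # Assumption: migrate only to a strictly higher score that is not negative.
    if scores[best] > scores[active] and scores[best] >= 0:
        return best
    return active

def replay(num_failures):
    """Replay consecutive failures of whichever node is active, starting on mysql1."""
    active = "mysql1"
    failcount = {n: 0 for n in NODE_SCORE}
    history = []
    for _ in range(num_failures):
        active = fail_once(active, failcount)
        history.append((active, {n: score(n, active, failcount) for n in NODE_SCORE}))
    return history

for step, (active, scores) in enumerate(replay(5), start=1):
    print(step, active, scores)
```

Each printed line shows the failure number, the node running mysql_db after that failure, and the resulting scores. The second failure moves the resource to mysql2 and the fifth moves it back to mysql1, after which mysql2 stays negative.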
In practice, associated resources are usually put together to form a resource group. When one resource in the group has a problem, all resources in the group need to be migrated. This is not much different from the single-resource case above. You only need to move the mysql_db settings above to the resource group, as shown below:
<group id="group-mysql">
  <meta_attributes id="group-mysql_meta_attr">
    <attributes>
      <nvpair id="group-mysql_meta_attr-1" name="resource_stickiness" value="100"/>
      <nvpair id="group-mysql_meta_attr-2" name="resource_failure_stickiness" value="-100"/>
    </attributes>
  </meta_attributes>
  <primitive>
    ...
  </primitive>
  ...
</group>
In this way, a problem with any resource in the group is treated as a problem with the whole group. If the node's score falls below that of another node, the entire resource group is switched.
The values INFINITY and -INFINITY are mainly used to force or forbid switching. A success score of positive infinity or a failure score of negative infinity outweighs any finite score, so these values give a simple way to configure absolute rules.
In general, the number of failures a resource (or resource group) can have on one node before it is migrated to another can be estimated as follows:
(nodeA score - nodeB score + stickiness) / abs(failure stickiness)
That is, subtract the score of node B from the score of node A, add the resource's stickiness score, and divide the total by the absolute value of the failure stickiness. The resource is migrated once the failcount exceeds this value. In the example above, (200 - 150 + 100) / 100 = 1.5, so migration happens on the second failure.
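The formula above gives a threshold rather than a whole number of failures. A small Python sketch can turn it into the failcount at which the switch actually happens. The function name is hypothetical, and the sketch assumes integer scores, that the resource is running on node A, and that a switch needs node B's score to be strictly higher.

```python
def failures_until_migration(node_a, node_b, stickiness, failure_stickiness):
    """Smallest failcount on node A at which node B's score becomes strictly higher.

    Returns None when failures cost nothing, because the scores then never change.
    """
    penalty = abs(failure_stickiness)
    if penalty == 0:
        return None
    lead = node_a - node_b + stickiness
    # Node A still leads or ties while failcount * penalty <= lead.
    return lead // penalty + 1

# mysql1 (200) runs the resource, mysql2 (150) is the standby.
print(failures_until_migration(200, 150, 100, -100))
```

For the reverse direction in the walkthrough, node B's current score of 0 (which already includes its earlier failures) is used instead of its configured score: `failures_until_migration(150, 0, 100, -100)` gives 3, matching the three failures on mysql2 described earlier.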
Multiple resource groups, single resource: score calculation
The preceding calculation applies only when there is one resource group containing one resource. The following values illustrate the calculation when there are several resource groups.
default-resource-stickiness = 100    default-resource-failure-stickiness = -101
mysql4.ipaddr.score = 350    mysql3.ipaddr.score = 400
mysql4.mysql.score = 350     mysql3.mysql.score = 400

The score of each resource in a resource group is calculated fairly independently, but whether a resource is switched is still decided by comparing the sums of the scores of the resource groups.
Multiple resource groups, multiple resources: score calculation
Resource switching does not compare the scores of individual resources. It compares the sum of the scores of the N resources in the resource group:
NodeX.all.score = NodeX.resource1.score + ... + NodeX.resourceN.score
1. When HB finds that a resource on NodeX can no longer run or has been switched away, the success score (N * default-resource-stickiness) previously added to the node is subtracted:
NodeX.resourceY.score -= N * default-resource-stickiness
NodeX.all.score = NodeX.resource1.score + ... + NodeX.resourceN.score
2. When a resource fails on NodeX, HB adds the failure score (default-resource-failure-stickiness) to the node:
NodeX.resourceY.score += default-resource-failure-stickiness
NodeX.all.score = NodeX.resource1.score + ... + NodeX.resourceN.score
3. When a resource starts successfully on NodeX, HB adds the success score (N * default-resource-stickiness) to the node:
NodeX.resourceY.score += N * default-resource-stickiness
NodeX.all.score = NodeX.resource1.score + ... + NodeX.resourceN.score
Example 1
default-resource-stickiness = 100    default-resource-failure-stickiness = -100
mysql4.ipaddr.score = 150    mysql3.ipaddr.score = 200
mysql4.mysql.score = 350     mysql3.mysql.score = 400
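As a rough illustration of the summing rule, the Python sketch below takes the Example 1 values as the current per-resource scores and applies rule 2 (add the failure score on each failure). The source does not say which node is active or whether these values already include the stickiness score, so treating them as the starting state, with failures occurring on mysql3, is an assumption. The function names are hypothetical.

```python
FAILURE_STICKINESS = -100  # default-resource-failure-stickiness

MYSQL3 = {"ipaddr": 200, "mysql": 400}
MYSQL4 = {"ipaddr": 150, "mysql": 350}

def all_score(resource_scores):
    """NodeX.all.score: the sum of the per-resource scores on one node."""
    return sum(resource_scores.values())

def after_failure(resource_scores, resource, failure_stickiness=FAILURE_STICKINESS):
    """Return a copy of the scores with one failure applied to one resource."""
    updated = dict(resource_scores)
    updated[resource] += failure_stickiness
    return updated

once = after_failure(MYSQL3, "ipaddr")
twice = after_failure(once, "ipaddr")
print(all_score(MYSQL3), all_score(once), all_score(twice), all_score(MYSQL4))
```

Under these assumptions, one failure brings the mysql3 sum level with mysql4, and a second failure drops it below, which is the point at which the sums would favor mysql4.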
 
Example 2
default-resource-stickiness = 0    default-resource-failure-stickiness = -100
mysql4.ipaddr.score = 375    mysql3.ipaddr.score = 400
mysql4.mysql.score = 775     mysql3.mysql.score = 800
With this configuration, as soon as any resource goes down, the resources are switched to the other node, and they can keep switching back and forth until a score becomes negative. However, if a machine is restarted, it takes over the resources again after the restart, because its score is then the higher one.
 
Example 3
default-resource-stickiness = 5    default-resource-failure-stickiness = -23
mysql4.ipaddr.score = 99     mysql3.ipaddr.score = 100
mysql4.mysql.score = 99      mysql3.mysql.score = 100
With this configuration, if HB on the failed node is restarted after each switch, or its score is reset in the CIB, the resources can always switch back and forth. Otherwise, the first switch happens when any one resource fails, and the second switch requires two resource failures.


 
Resource scores with colocation configured
Configure the following in the cib.xml file:
<configuration>
  ...
  <constraints>
    ...
    <rsc_colocation id="colocation1" to="ipaddr_10_225_225" from="mysql" score="INFINITY" symmetrical="true">
    </rsc_colocation>
  </constraints>
</configuration>
With colocation, resource switching again does not compare the scores of individual resources. It compares the sum of the N resource scores on the node, multiplied by N. We call this value NodeX.all.score:
NodeX.all.score = (NodeX.resource1.score + ... + NodeX.resourceN.score) * N
1) When a resource fails on NodeX, HB updates the node as follows:
NodeX.resourceN.score += default-resource-failure-stickiness
NodeX.resourceN.score -= default-resource-stickiness
NodeX.resourceN.score += default-resource-stickiness
NodeX.all.score = (NodeX.resource1.score + ... + NodeX.resourceN.score) * N
NodeX.all.score is then compared among the nodes.
2) When HB finds that the resources on NodeX have been switched to NodeY, it subtracts the success score (default-resource-stickiness) from NodeX and adds it to NodeY:
NodeX.resource[1..N].score -= default-resource-stickiness
NodeY.resource[1..N].score += default-resource-stickiness
NodeX.all.score = NodeX.resource1.score + ... + NodeX.resourceN.score
NodeY.all.score = NodeY.resource1.score + ... + NodeY.resourceN.score
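The colocation rules can be sketched in Python as follows. This only restates the formulas given in this section, using the "sum multiplied by N" form of NodeX.all.score; it is not taken from Heartbeat's source. The function names and the numbers are illustrative assumptions.

```python
STICKINESS = 5             # default-resource-stickiness (illustrative)
FAILURE_STICKINESS = -15   # default-resource-failure-stickiness (illustrative)

def all_score(resource_scores):
    """(resource1.score + ... + resourceN.score) * N for one node."""
    n = len(resource_scores)
    return sum(resource_scores.values()) * n

def after_failure(resource_scores, resource):
    """Apply rule 1) to one resource and return the updated copy."""
    updated = dict(resource_scores)
    updated[resource] += FAILURE_STICKINESS
    updated[resource] -= STICKINESS   # resource stops ...
    updated[resource] += STICKINESS   # ... and is started again on the same node
    return updated

node_x = {"ipaddr": 100, "mysql": 101}
node_y = {"ipaddr": 100, "mysql": 100}
print(all_score(node_x), all_score(node_y))
print(all_score(after_failure(node_x, "mysql")))
```

With these numbers a single failure is enough to bring NodeX below NodeY, because the stickiness subtraction and addition in rule 1) cancel out and only the failure score remains.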
Example:
<rsc_colocation id="colocation1" from="ipaddr_10_225_225" to="mysql" score="INFINITY" symmetrical="true"/>
default-resource-stickiness = 5    default-resource-failure-stickiness = -15
mysql4.ipaddr.score = 100    mysql3.ipaddr.score = 100
mysql4.mysql.score = 100     mysql3.mysql.score = 101

Author: "CzmMiao's blog life"
 
