About Pacemaker cluster configuration versions

In Pacemaker, the CIB carries a version composed of admin_epoch, epoch, and num_updates. When a node joins the cluster, the configuration with the highest version number is adopted as the unified configuration of the entire cluster.

Of the three, admin_epoch is usually never changed. epoch is incremented (and num_updates reset to 0) on every "configuration" change, while num_updates is incremented on every "status" change. "Configuration" refers to the persistent content under the configuration node of the CIB, including cluster attributes, nodes' forever attributes, and resource attributes. "Status" refers to nodes' reboot attributes, whether a node is online, and whether a resource is started.
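For reference, all three fields live on the root cib element itself, so the current version tuple can be read directly. The output below is illustrative only, with the epoch values taken from the test in section 2.2:
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | head -1
  <cib admin_epoch="0" epoch="48304" num_updates="4" ...>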

"Status" can be re-obtained through monitor (unless there is a problem with the RA script design), but "configuration" errors may cause cluster faults, therefore, we need to be more concerned with epoch changes and the impact on cluster configuration after nodes are added. In particular, some RA scripts that support the master-slave architecture will dynamically modify the configuration (for example, mysql's mysql_REPL_INFO
And pgsql-data-status in pgsql.
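As an illustration (the attribute value shown here is hypothetical), such RAs typically rewrite these entries with crm_attribute. The pgsql RA, for example, maintains pgsql-data-status as a forever node attribute, so every rewrite is a "configuration" change that bumps epoch:
  [root@srdsdevapp69 ~]# crm_attribute -N `hostname` -l forever -n pgsql-data-status -v "STREAMING|SYNC"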

1. Manual description
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Pacemaker_Explained/index.html#idm140225199219024

3.2. Configuration Version

When a node joins the cluster, the cluster performs a check to see who has the best configuration based on the fields below. It then asks the node with the highest (admin_epoch, epoch, num_updates) tuple to replace the configuration on all the nodes, which makes setting them, and setting them correctly, very important.

Table 3.1. Configuration Version Properties

Field        Description
admin_epoch  Never modified by the cluster. Use this to make the configurations on any inactive nodes obsolete. Never set this value to zero; in such cases the cluster cannot tell the difference between your configuration and the "empty" one used when nothing is found on disk.
epoch        Incremented every time the configuration is updated (usually by the admin).
num_updates  Incremented every time the configuration or status is updated (usually by the cluster).
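The tuple is compared field by field, so a higher admin_epoch wins regardless of epoch and num_updates. Because the cluster never modifies admin_epoch, it must be set by hand; one way that should work is to modify the root cib element directly (the value 42 below is only an illustration, not taken from the original tests):
  [root@srdsdevapp69 mysql_ha]# cibadmin --modify --xml-text '<cib admin_epoch="42"/>'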



2. Verification

2.1 Environment
Three servers: srdsdevapp69, srdsdevapp71, and srdsdevapp73.
OS: CentOS 6.3
Pacemaker: 1.1.14-1.el6 (Build: 70404b0)
Corosync: 1.4.1-7.el6

2.2 Basic verification

0. Initial state: epoch="48304", num_updates="4"
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | grep epoch

1. Update the cluster configuration; epoch is incremented by 1 and num_updates is reset to 0.
  [root@srdsdevapp69 mysql_ha]# crm_attribute --type crm_config -s set1 --name foo1 -v "1"
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | grep epoch

2. If the updated value is the same as the existing value, epoch remains unchanged.
  [root@srdsdevapp69 mysql_ha]# crm_attribute --type crm_config -s set1 --name foo1 -v "1"
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | grep epoch

3. Setting a new node attribute with the forever lifetime also increments epoch by 1.
  [root@srdsdevapp69 mysql_ha]# crm_attribute -N `hostname` -l forever -n foo2 -v 2
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | grep epoch

4. Setting a new node attribute with the reboot lifetime increments num_updates by 1.
  [root@srdsdevapp69 mysql_ha]# crm_attribute -N `hostname` -l reboot -n foo3 -v 2
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | grep epoch

2.3 Partition verification 1

1. Manually isolate srdsdevapp69 from the other two nodes to form a partition. The DC (Designated Controller) before the partition is srdsdevapp73.
  [root@srdsdevapp69 mysql_ha]# iptables -A INPUT -j DROP -s srdsdevapp71
  [root@srdsdevapp69 mysql_ha]# iptables -A OUTPUT -j DROP -d srdsdevapp71
  [root@srdsdevapp69 mysql_ha]# iptables -A INPUT -j DROP -s srdsdevapp73
  [root@srdsdevapp69 mysql_ha]# iptables -A OUTPUT -j DROP -d srdsdevapp73
The epoch in both partitions remains unchanged at 48306, but srdsdevapp69 now acts as the DC of its own partition.
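The current DC of each partition can be confirmed with crm_mon in one-shot mode (this check is not part of the original test output):
  [root@srdsdevapp69 mysql_ha]# crm_mon -1 | grep "Current DC"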

Partition 1 (srdsdevapp69): does not have quorum
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | grep epoch

Partition 2 (srdsdevapp71, srdsdevapp73): has quorum
  [root@srdsdevapp71 ~]# cibadmin -Q | grep epoch

2. Perform two configuration updates on srdsdevapp69, increasing its epoch by 2 (to 48308).
  [root@srdsdevapp69 mysql_ha]# crm_attribute --type crm_config -s set1 --name foo4 -v "1"
  [root@srdsdevapp69 mysql_ha]# crm_attribute --type crm_config -s set1 --name foo5 -v "1"
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | grep epoch

3. Perform one configuration update on srdsdevapp71, increasing its epoch by 1 (to 48307).
  [root@srdsdevapp71 ~]# crm_attribute --type crm_config -s set1 --name foo6 -v "1"
  [root@srdsdevapp71 ~]# cibadmin -Q | grep epoch

4. Restore the network and check the cluster configuration.
  [root@srdsdevapp69 mysql_ha]# iptables -F
  [root@srdsdevapp69 mysql_ha]# cibadmin -Q | grep epoch

  [root@srdsdevapp69 mysql_ha]# crm_attribute --type crm_config -s set1 --name foo5 -q
  1
  [root@srdsdevapp69 mysql_ha]# crm_attribute --type crm_config -s set1 --name foo4 -q
  1
  [root@srdsdevapp69 mysql_ha]# crm_attribute --type crm_config -s set1 --name foo6 -q
  Error performing operation: No such device or address
The cluster has adopted the configuration from the srdsdevapp69 partition because its version is larger; the updates made in the srdsdevapp71/srdsdevapp73 partition are lost.
This test exposes a problem: the configuration of the partition with quorum can be overwritten by the configuration of a partition without quorum. If you develop your own RA, this deserves attention; see the sketch below.
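One defensive pattern, sketched here and not taken from the actual mysql/pgsql RAs: before an RA writes a dynamic configuration parameter, it can ask crm_node whether its partition has quorum and skip the update otherwise (the attribute name foo and the variable new_value are hypothetical):
  # update the attribute only when this partition has quorum
  if [ "$(crm_node -q)" = "1" ]; then
      crm_attribute --type crm_config -s set1 --name foo -v "$new_value"
  fi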
2.4 Partition verification 2

In the previous test, the pre-partition DC was in the partition that obtained quorum. Now try the scenario where the pre-partition DC is in the partition that does not obtain quorum.

1. Manually isolate the DC (srdsdevapp73) from the other two nodes to form a partition.
  [root@srdsdevapp73 ~]# iptables -A INPUT -j DROP -s srdsdevapp69
  [root@srdsdevapp73 ~]# iptables -A OUTPUT -j DROP -d srdsdevapp69
  [root@srdsdevapp73 ~]# iptables -A INPUT -j DROP -s srdsdevapp71
  [root@srdsdevapp73 ~]# iptables -A OUTPUT -j DROP -d srdsdevapp71
The epoch on srdsdevapp73 does not change:
  [root@srdsdevapp73 ~]# cibadmin -Q | grep epoch

However, the epoch in the other partition (srdsdevapp69, srdsdevapp71) is incremented by 1:
  [root@srdsdevapp69 ~]# cibadmin -Q | grep epoch

After the network is restored, the cluster uses the higher-version configuration, and the DC is still the pre-partition DC (srdsdevapp73).
  [root@srdsdevapp73 ~]# iptables -F
  [root@srdsdevapp73 ~]# cibadmin -Q | grep epoch

This test shows that:
  • A DC election increments epoch by 1.
  • After the partition is restored, Pacemaker tends to keep the pre-partition DC as the new DC.

3. Summary of Pacemaker's behavior
  1. A CIB configuration change increments epoch by 1.
  2. A DC election increments epoch by 1.
  3. After a partition is restored, Pacemaker adopts the configuration with the highest version number as the cluster configuration.
  4. After a partition is restored, Pacemaker tends to keep the pre-partition DC as the new DC.


Notes for RA development
  1. Avoid dynamically modifying the cluster configuration.
  2. If that is not possible, avoid using multiple dynamic cluster configuration parameters; for example, splice several values into a single parameter (mysql's mysql_REPL_INFO does this).
  3. Check crm_attribute for errors and retry on failure (pgsql does this); see the sketch after this list.
  4. When quorum is lost, do not modify the cluster configuration in demote (or stop).
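A minimal retry sketch for point 3, assuming a hypothetical attribute name foo and variable new_value:
  # retry the CIB update a few times; give up if it keeps failing
  for i in 1 2 3; do
      crm_attribute --type crm_config -s set1 --name foo -v "$new_value" && break
      sleep 1
  done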


