Split-brain in HA systems and split-brain in keepalived

Source: Internet
Author: User

In a high-availability (HA) system, when the heartbeat line between the two nodes is disconnected, the HA system, which used to act as one coordinated whole, splits into two independent nodes. Because they have lost contact with each other, each node assumes that the other has failed. The HA software on both nodes then competes for the shared resources and application services, like a "split brain", and the consequences are serious: either the shared resources are divided between the nodes and the service cannot start on either side, or the service starts on both sides and both read and write the shared storage at the same time, which corrupts data (a common example is errors in a database's online logs).
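
The failure mode described above can be illustrated with a toy model. This is only a sketch, not how keepalived or any particular HA product is implemented: the `Node` class, the `check_peer` method, and the `dead_interval` parameter are hypothetical names chosen for the illustration. Each node promotes itself when it has not heard the peer's heartbeat for `dead_interval` seconds, so cutting the link between two healthy nodes leaves both of them active.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    active: bool           # True if this node currently owns the shared resources
    last_heartbeat: float  # time (seconds) the peer's last heartbeat was received

    def check_peer(self, now: float, dead_interval: float) -> None:
        # No heartbeat for longer than dead_interval: assume the peer has
        # failed and take over the resources.
        if now - self.last_heartbeat > dead_interval:
            self.active = True


def active_nodes(nodes):
    """Return the names of all nodes that believe they own the resources."""
    return [n.name for n in nodes if n.active]


primary = Node("node-a", active=True, last_heartbeat=0.0)
standby = Node("node-b", active=False, last_heartbeat=0.0)

# The heartbeat link is cut at t=0. Both nodes are healthy, but neither
# receives the other's heartbeat any more.
for node in (primary, standby):
    node.check_peer(now=5.0, dead_interval=3.0)

print(active_nodes([primary, standby]))  # both nodes now claim the resources
```

The model has no way to tell "the peer is dead" apart from "the link is dead", which is exactly the ambiguity that causes split-brain.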

The generally agreed measures for dealing with split-brain in an HA system are roughly as follows:
1) Add redundant heartbeat lines, for example dual heartbeat links (so that the heartbeat line itself is highly available), to minimize the probability of split-brain.
2) Enable a disk lock. The node that is providing the service locks the shared disk, so that when split-brain occurs the other node cannot seize the shared disk resources at all. However, disk locking also has a serious drawback: if the node holding the shared disk does not release the lock, the other node can never obtain the disk. In practice, if the serving node suddenly hangs or crashes, it cannot execute the unlock command, and the backup node cannot take over the shared resources and application services. For this reason, some HA designs use a "smart" lock: the serving node enables the disk lock only when it finds that all heartbeat lines are down (that is, it can no longer detect the peer). Under normal conditions the disk is not locked.
3) Set up an arbitration mechanism. For example, configure a reference IP address (such as the gateway IP address). When the heartbeat line is completely disconnected, each of the two nodes pings the reference IP address. If the ping fails, the break is on the local side: not only the heartbeat link but also the local network link used to serve clients is down, so starting (or continuing to run) the application service would be pointless. That node gives up the competition and lets the node that can ping the reference IP address start the service. To be safer, the node that cannot ping the reference IP address simply reboots itself, which fully releases any shared resources it may still be holding.
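
The arbitration rule in measure 3 can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not keepalived's own logic: the names `Action`, `can_ping`, and `arbitrate` are hypothetical, and `can_ping` assumes a Linux `ping` that accepts `-c` (packet count) and `-W` (reply timeout in seconds).

```python
import subprocess
from enum import Enum


class Action(Enum):
    NO_ACTION = "no_action"                  # heartbeat is fine, nothing to arbitrate
    START_SERVICE = "start_service"          # peer is unreachable, reference IP answers
    RELEASE_AND_REBOOT = "release_and_reboot"  # this node is the isolated side


def can_ping(ip: str, count: int = 3, timeout_s: int = 1) -> bool:
    """Return True if the reference IP answers (assumes Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def arbitrate(heartbeat_alive: bool, reference_reachable: bool) -> Action:
    """Decide what this node should do, following the rule in measure 3."""
    if heartbeat_alive:
        return Action.NO_ACTION
    if reference_reachable:
        # The break is not on this side: start (or keep running) the service.
        return Action.START_SERVICE
    # The break is on this side: give up the competition and reboot to
    # release any shared resources that may still be held.
    return Action.RELEASE_AND_REBOOT
```

Each node would run this independently when the heartbeat is lost, for example `arbitrate(heartbeat_alive=False, reference_reachable=can_ping("192.0.2.1"))`, where `192.0.2.1` is a placeholder for the gateway address. Keeping the decision separate from the ping call makes the rule easy to test without network access.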
