I. Cause Analysis
1. Current architecture diagram:
[Figure: Dd1.png]
2. Cause analysis
Because the platform's business network is unstable and the DRBD heartbeat is routed through the gateway, network problems lead to split-brain. Split-brain has two possible outcomes:
1. The shared resources are carved up, and the "service" comes up on neither side;
2. The "service" comes up on both sides, and both read and write the "shared storage" at the same time, resulting in data corruption.
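The two outcomes above can be told apart from which nodes are currently running the service. The following is only a minimal sketch: the function name and the returned labels are hypothetical and are not part of the original test.

```python
def classify_cluster_state(service_up_m: bool, service_up_s: bool) -> str:
    """Classify a two-node cluster by which nodes run the service.

    The labels are illustrative names, not Heartbeat or DRBD terminology.
    """
    if service_up_m and service_up_s:
        # Outcome 2: both sides write to the shared storage at once.
        return "both-active"
    if not service_up_m and not service_up_s:
        # Outcome 1: resources are carved up and nobody serves.
        return "none-active"
    # Exactly one node serves: the normal state.
    return "single-active"
```

Of the two abnormal states, "both-active" is the dangerous one, because it is the state in which data corruption can occur.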
3. Causes of split-brain
1. The heartbeat link between the high-availability servers fails, so the servers cannot check each other's heartbeat
2. A firewall on a high-availability server blocks heartbeat detection
3. The network card address or other network settings on a high-availability server are misconfigured, so heartbeats fail to be sent
4. Other causes, such as misconfigured services: mismatched heartbeat modes, heartbeat broadcast conflicts, software bugs, etc.
II. Purpose of Testing
Verify whether either of the two split-brain outcomes still occurs when a dedicated line is used for the heartbeat, and what the impact on the service is. The reworked architecture diagram:
[Figure: Dd2.png]
III. Test Process
1. Disconnect the heartbeat line on M and S
2. Stop the Heartbeat service
IV. Test Observations
1. With the heartbeat line disconnected, resources do not switch, but the external network cannot reach the service through the VIP while the intranet works normally; an ARP update needs to be performed
2. Stopping the services (DRBD, Heartbeat) on S: resources do not switch and the business is not affected
3. Stopping the services on M: resources switch over normally (Autofail=off)
4. When the DB service is stopped, a script detects it and the service resources switch over
5. With a single heartbeat line between M and S, S seizes the resources when its heartbeat recovers after a stop; adding two direct-connect lines would avoid this
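The detection script from observation 4 is not shown in the report. The following is a minimal sketch of how such a check might work, assuming MySQL listens on TCP port 3306 on the local node. The function names, retry count and delay are hypothetical.

```python
import socket
import time
from typing import Callable


def mysql_port_open(host: str = "127.0.0.1", port: int = 3306,
                    timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the (assumed) MySQL port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_fail_over(check: Callable[[], bool], retries: int = 3,
                     delay: float = 1.0) -> bool:
    """Return True only if `check` fails `retries` times in a row.

    Requiring several consecutive failures avoids switching resources
    because of a single transient hiccup.
    """
    for attempt in range(retries):
        if check():
            return False
        if attempt < retries - 1:
            time.sleep(delay)
    return True
```

A cron job or loop could call `should_fail_over(mysql_port_open)` and, when it returns True, stop Heartbeat on the local node so that the peer takes over the resources. How the original script triggers the switch is not stated in the report, so that step is an assumption.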
V. Test results
With a dedicated line as the heartbeat line, the network is more stable. When large amounts of data are written, the data is transmitted to the peer through the layer-3 switch, which avoids data congestion.
VI. Test Recommendations
Based on the current platform business architecture, it is recommended to add two dedicated lines as heartbeat lines. This improves the robustness of the network and avoids resource preemption or resource abandonment caused by heartbeat problems.
VII. Some Measures to Prevent Split-Brain
1. Add redundant heartbeat lines
2. When split-brain is detected, forcibly shut down one heartbeat node (remotely shut down the master node, using a fence device that controls the power circuit)
3. Set up monitoring and alerting for split-brain
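For the monitoring in item 3, one possible approach is to watch the DRBD connection state. The following is a minimal sketch, assuming the DRBD 8.x `/proc/drbd` status format. The function name and alert text are hypothetical. A `StandAlone` or `WFConnection` state only shows that the peers are disconnected, which happens after split-brain but also after a plain link loss, so the kernel log still needs to be checked to confirm.

```python
import re
from typing import List

# Matches a DRBD 8.x status line such as:
#  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
STATUS_RE = re.compile(
    r"^\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)/(\S+)\s+ds:(\S+)/(\S+)"
)

# Connection states treated as suspicious in this sketch.
SUSPECT_STATES = {"StandAlone", "WFConnection"}


def drbd_alerts(proc_drbd_text: str) -> List[str]:
    """Return one alert string per DRBD device that is not connected to its peer."""
    alerts = []
    for line in proc_drbd_text.splitlines():
        match = STATUS_RE.match(line)
        if not match:
            continue  # header or statistics line
        minor, cs, local_role, peer_role, _local_disk, _peer_disk = match.groups()
        if cs in SUSPECT_STATES:
            alerts.append(
                f"drbd{minor}: cs={cs} ro={local_role}/{peer_role} "
                "(peer disconnected, possible split-brain)"
            )
    return alerts
```

A monitoring agent could feed the contents of `/proc/drbd` to `drbd_alerts` at a regular interval and raise an alarm whenever the returned list is not empty.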
This article is from the "DBSpace" blog; please keep this source attribution: http://dbspace.blog.51cto.com/6873717/1868844
DRBD+Heartbeat+MySQL Test Report