Simulating DRBD split-brain behavior

Last Update: 2018-07-26 Source: Internet

Author: User

Tags: failover

This supplement uses two nodes, drbd1 and drbd2. Personally, I think DRBD split brain is usually caused by manual operations or by failover, for example under HA software. Last time I visited a client with a friend. They were using HA for failover and, without knowing exactly what they had done, ended up with the DRBD service on one machine unable to start. The server was very important, and they were not very familiar with the HA and DRBD architecture, so something went wrong during an HA switchover test. Let's simulate the problem here.

1. Power off the primary machine or unplug its network cable.

2. Check the status of the secondary machine.

[root@drbd2 ~]# drbdadm role fs
Secondary/Unknown
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
# Note the cs state on drbd2
 1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
    ns:567256 nr:20435468 dw:21002724 dr:169 al:229 bm:1248 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Promote the secondary to the primary role:

[root@drbd2 ~]# drbdadm primary fs
[root@drbd2 ~]# drbdadm role fs
Primary/Unknown
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:567256 nr:20435468 dw:21002724 dr:169 al:229 bm:1248 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

# Mount the device
[root@drbd2 ~]# mount /dev/drbd1 /mnt/
[root@drbd2 ~]# cd /mnt/
[root@drbd2 mnt]# ll
total 102524
-rw-r--r-- 1 root root 104857600 Jul 8 12:35 100M
drwx------ 2 root root 16384 Jul 8 12:33 lost+found

Once the original primary machine is back up, the cluster is in split brain.
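The status lines in /proc/drbd pack the connection state, the roles, and the disk states into the cs:, ro:, and ds: fields. As a minimal sketch, assuming the DRBD 8.3 line layout shown in this article, the fields can be pulled out with sed so that a waiting or disconnected peer is easy to spot. The helper names drbd_states and drbd_peer_missing are made up for this illustration and are not part of DRBD.

```shell
#!/bin/sh
# Sketch: print "minor cs ro ds" for every device line found in
# /proc/drbd-style text on stdin. Assumes the 8.3 layout
# " 1: cs:... ro:... ds:... ...". drbd_states is a hypothetical name.
drbd_states() {
    sed -n 's/^ *\([0-9][0-9]*\): *cs:\([^ ]*\) *ro:\([^ ]*\) *ds:\([^ ]*\).*$/\1 \2 \3 \4/p'
}

# Exit status 0 when any device on stdin is waiting for its peer
# (WFConnection) or has given up on it (StandAlone).
drbd_peer_missing() {
    drbd_states | grep -Eq ' (WFConnection|StandAlone) '
}

# On a live node, read the real status file; skipped where it does not exist.
if [ -r /proc/drbd ]; then
    drbd_states < /proc/drbd
fi
```

Reading from stdin rather than directly from /proc/drbd keeps the helpers usable on saved output, for example when comparing the state of both nodes after the fact.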
The log on drbd1 shows the split brain being detected:

[root@drbd1 ~]# tail -f /var/log/messages
Jul 8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-1 exit code 0 (0x0)
Jul 8 13:14:01 localhost kernel: block drbd1: Split-Brain detected but unresolved, dropping connection!
Jul 8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm split-brain minor-1
Jul 8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm split-brain minor-1 exit code 0 (0x0)
Jul 8 13:14:01 localhost kernel: block drbd1: conn( NetworkFailure -> Disconnecting )
Jul 8 13:14:01 localhost kernel: block drbd1: error receiving ReportState, l: 4!
Jul 8 13:14:01 localhost kernel: block drbd1: Connection closed
Jul 8 13:14:01 localhost kernel: block drbd1: conn( Disconnecting -> StandAlone )
Jul 8 13:14:01 localhost kernel: block drbd1: receiver terminated
Jul 8 13:14:01 localhost kernel: block drbd1: terminating receiver thread

Both nodes now consider themselves primary:

[root@drbd1 ~]# drbdadm role fs
Primary/Unknown
[root@drbd2 mnt]# drbdadm role fs
Primary/Unknown

# drbd1 is now StandAlone; at this point the primary and the secondary no longer contact each other.
[root@drbd1 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd1.localdomain, 2011-07-08 11:10:38
 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
    ns:20405516 nr:567256 dw:567376 dr:20405706 al:2 bm:1246 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@drbd1 /]# service drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd1.localdomain, 2011-07-08 11:10:38
m:res cs ro ds p mounted fstype
1:fs StandAlone Primary/Unknown UpToDate/DUnknown r----- ext3

At this point, if you try to restart the DRBD service on drbd2, you will find that it does not come up.

[root@drbd2 /]# service drbd start
Starting DRBD resources: [ ]..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 120 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 0 seconds. [wfc-timeout]
   (These values are for resource 'fs'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [ -- ]:[ 13]:[ 15]:[ 16]:[ 18]:[ 19]:[ 20]:[ 22]:

How to handle this on drbd2:

[root@drbd2 /]# drbdadm disconnect fs
[root@drbd2 /]# drbdadm secondary fs
[root@drbd2 /]# drbdadm -- --discard-my-data connect fs

After these three steps, you will find that the DRBD service on drbd2 still cannot be started. I suspect this is the problem that client ran into last time: after DRBD was restarted, it could not be started again.
Their DBAs must have been worried to death. Heh. The resource also has to be reconnected on drbd1:

[root@drbd1 ~]# drbdadm connect fs

Start the DRBD service on drbd2 again, and this time it succeeds.

[root@drbd2 /]# service drbd start
Starting DRBD resources: [ ].

Check the resource synchronization again:

[root@drbd2 /]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
 1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:185532 dw:185532 dr:0 al:0 bm:15 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:299000
    [======>.............] sync'ed: 39.5% (299000/484532)K
    finish: 0:00:28 speed: 10,304 (10,304) want: 10,240 K/sec
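The recovery walked through above comes down to three commands on the node whose changes are thrown away (drbd2 here) and one command on the node whose data is kept (drbd1, because it had gone StandAlone). The following is only a sketch that prints those commands for review and does not run them. The function name recovery_plan and the role labels victim and survivor are chosen for this illustration. The third command is written with an explicit connect, the form DRBD 8.3 expects for --discard-my-data. It also assumes the DRBD device has already been unmounted on the node being demoted.

```shell
#!/bin/sh
# Sketch: print (do not execute) the manual split-brain recovery commands
# from this walkthrough. recovery_plan is a hypothetical helper name.
#   $1 = "victim"   (node whose changes are discarded) or
#        "survivor" (node whose data is kept)
#   $2 = DRBD resource name, for example fs
recovery_plan() {
    role=$1
    res=$2
    case $role in
        victim)
            # Assumption: the DRBD device is already unmounted on this node.
            echo "drbdadm disconnect $res"
            echo "drbdadm secondary $res"
            echo "drbdadm -- --discard-my-data connect $res"
            ;;
        survivor)
            # Only needed when this node is StandAlone, as drbd1 was here.
            echo "drbdadm connect $res"
            ;;
        *)
            echo "usage: recovery_plan victim|survivor RESOURCE" >&2
            return 1
            ;;
    esac
}

# Example: show the plan for both nodes of resource fs.
recovery_plan victim fs
recovery_plan survivor fs
```

Printing the plan first is a deliberate choice: discarding data on the wrong node cannot be undone, so it is worth reading the commands against each node's /proc/drbd state before typing them.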

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
