Simulating DRBD split-brain behavior

Source: Internet
Author: User
Tags: failover
drbd1 is the primary node and drbd2 is the secondary. Personally, I think DRBD split brain is usually caused by earlier manual intervention or by a failover, for example one triggered by HA software.

Last time I visited a client with a friend, they were using HA to do failover. I never found out exactly what they had done, but the DRBD service on one machine went down. The server was very important, and they were not very familiar with the HA and DRBD architecture; the problem appeared during an HA switchover test. Let's simulate that problem here.

1. Disconnect the primary: shut the machine down or unplug its network cable.

2. Check the status of the secondary machine:

[root@drbd2 ~]# drbdadm role fs
Secondary/Unknown
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
# Note the cs state of drbd2
 1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
    ns:567256 nr:20435468 dw:21002724 dr:169 al:229 bm:1248 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Promote the secondary to the primary role:

[root@drbd2 ~]# drbdadm primary fs
[root@drbd2 ~]# drbdadm role fs
Primary/Unknown
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:567256 nr:20435468 dw:21002724 dr:169 al:229 bm:1248 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

# Mount the device
[root@drbd2 ~]# mount /dev/drbd1 /mnt/
[root@drbd2 ~]# cd /mnt/
[root@drbd2 mnt]# ll
total 102524
-rw-r--r-- 1 root root 104857600 Jul  8 12:35 100M
drwx------ 2 root root     16384 Jul  8 12:33 lost+found

Once the original primary machine recovers, a split brain occurs.
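The cs, ro and ds fields in the /proc/drbd output above are the values this walkthrough keeps checking by eye. As a minimal sketch, they could also be pulled out with a small shell function. The name drbd_field is hypothetical and not part of the DRBD tools, and the sketch assumes the 8.3-style /proc/drbd layout shown in this article.

```shell
#!/bin/sh
# Print the value of one field (cs, ro or ds) for a given minor number,
# reading /proc/drbd-style text from standard input.
# drbd_field is a hypothetical helper name, not part of the DRBD tools.
drbd_field() {
    minor="$1"
    field="$2"
    awk -v minor="$minor" -v field="$field" '
        # Device lines start with the minor number followed by a colon,
        # e.g. " 1: cs:WFConnection ro:Secondary/Unknown ..."
        $1 == (minor ":") {
            for (i = 2; i <= NF; i++) {
                n = index($i, ":")
                if (n > 0 && substr($i, 1, n - 1) == field) {
                    # Print everything after the first colon and stop.
                    print substr($i, n + 1)
                    exit
                }
            }
        }
    '
}
```

On a node it might be used as `drbd_field 1 cs < /proc/drbd`, which prints the connection state of minor 1, or prints nothing if that minor is not listed.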
[root@drbd1 ~]# tail -f /var/log/messages
Jul  8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-1 exit code 0 (0x0)
Jul  8 13:14:01 localhost kernel: block drbd1: Split-Brain detected but unresolved, dropping connection!
Jul  8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm split-brain minor-1
Jul  8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm split-brain minor-1 exit code 0 (0x0)
Jul  8 13:14:01 localhost kernel: block drbd1: conn( NetworkFailure -> Disconnecting )
Jul  8 13:14:01 localhost kernel: block drbd1: error receiving ReportState, l: 4!
Jul  8 13:14:01 localhost kernel: block drbd1: Connection closed
Jul  8 13:14:01 localhost kernel: block drbd1: conn( Disconnecting -> StandAlone )
Jul  8 13:14:01 localhost kernel: block drbd1: receiver terminated
Jul  8 13:14:01 localhost kernel: block drbd1: Terminating receiver thread

[root@drbd1 ~]# drbdadm role fs
Primary/Unknown

[root@drbd2 mnt]# drbdadm role fs
Primary/Unknown

drbd1 is now StandAlone. At this point the primary and the secondary no longer contact each other.
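The kernel messages above are the clearest sign that a split brain has happened. A minimal sketch of scanning a log for them follows. The function name find_split_brain is hypothetical, and the match is case-insensitive so that it does not depend on the exact capitalisation of the message.

```shell
#!/bin/sh
# Read syslog-style lines on standard input and print those in which DRBD
# reports a detected split brain. Returns non-zero if nothing matches.
# find_split_brain is a hypothetical helper name.
find_split_brain() {
    grep -i 'split-brain detected'
}
```

For example, `find_split_brain < /var/log/messages` would print the matching lines from the log file used in this article, and its exit status could be tested in a monitoring script.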
[root@drbd1 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd1.localdomain, 2011-07-08 11:10:38
 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:20405516 nr:567256 dw:567376 dr:20405706 al:2 bm:1246 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@drbd1 /]# service drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd1.localdomain, 2011-07-08 11:10:38
m:res  cs          ro               ds                 p       mounted  fstype
1:fs   StandAlone  Primary/Unknown  UpToDate/DUnknown  r-----  ext3

At this point, if you try to restart the DRBD service on drbd2, you will find that it cannot come up:

[root@drbd2 /]# service drbd start
Starting DRBD resources: [ ]..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 120 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 0 seconds. [wfc-timeout]
   (These values are for resource 'fs'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [ -- ]:[  13]:[  15]:[  16]:[  18]:[  19]:[  20]:[  22]:

How to handle it on drbd2:

[root@drbd2 /]# drbdadm disconnect fs
[root@drbd2 /]# drbdadm secondary fs
[root@drbd2 /]# drbdadm -- --discard-my-data connect fs

After these three steps, you will find that the DRBD service on drbd2 still cannot start. I suspect this is the problem that client ran into last time: after DRBD was restarted, it would not start.
Their DBAs must have been worried to death. Hehe.

You also need to reconnect the resource on drbd1:

[root@drbd1 ~]# drbdadm connect fs

Start the DRBD service on drbd2 again, and this time it succeeds:

[root@drbd2 /]# service drbd start
Starting DRBD resources: [ ].

Check the resource synchronization again:

[root@drbd2 /]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
 1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:185532 dw:185532 dr:0 al:0 bm:15 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:299000
        [======>.............] sync'ed: 39.5% (299000/484532)K
        finish: 0:00:28 speed: 10,304 (10,304) want: 10,240 K/sec
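The recovery order used above can be summarised as a short dry-run sketch that only prints the commands, grouped by host, and does not execute anything. The function name print_recovery_plan and its arguments are hypothetical. The commands use the DRBD 8.3 argument order, where --discard-my-data is passed to the connect subcommand after a bare `--`, and the sketch assumes the node whose changes are discarded (here drbd2) has already been chosen by the administrator.

```shell
#!/bin/sh
# Print (not execute) the manual split-brain recovery steps in order.
# Arguments: resource name, victim host (its changes are discarded),
# survivor host (its data is kept).
# print_recovery_plan is a hypothetical helper; the commands follow the
# DRBD 8.3 argument order, matching the version shown in this article.
print_recovery_plan() {
    res="$1"
    victim="$2"
    survivor="$3"
    echo "[$victim] drbdadm disconnect $res"
    echo "[$victim] drbdadm secondary $res"
    echo "[$victim] drbdadm -- --discard-my-data connect $res"
    # The survivor dropped to StandAlone, so it has to reconnect too.
    echo "[$survivor] drbdadm connect $res"
}
```

Calling `print_recovery_plan fs drbd2 drbd1` lists the four commands from this walkthrough. If the DRBD device is still mounted on the victim, as /mnt was on drbd2 earlier, it has to be unmounted before `drbdadm secondary` can succeed.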
