DRBD Introduction
DRBD stands for Distributed Replicated Block Device. It consists of a kernel module and related scripts and is used to build high-availability clusters. It works by mirroring an entire block device over the network, so you can keep a real-time mirror of a local block device on a remote machine. Combined with a heartbeat link, it can be regarded as a form of network RAID.
DRBD Working Mechanism
DRBD receives data on the primary node, writes it to the local disk, and sends it to the other host, which then saves the data to its own disk. By default, DRBD allows read/write access on only one node at a time, which is sufficient for failover clusters. Read/write access on both nodes (dual-primary operation) is available through the allow-two-primaries option.
DRBD protocol description
A: The write operation is considered complete once the data has been written to the local disk and handed to the network for sending.
B: The write operation is considered complete once the peer has confirmed receipt of the data.
C: The write operation is considered complete once the peer has confirmed that the data was written to its disk.
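As a quick reference, the three protocols can be summarized in a small shell helper. This is only an illustrative sketch: drbd_ack_point is a hypothetical name and not part of DRBD, and each message simply paraphrases the descriptions above.

```shell
# drbd_ack_point PROTOCOL
# Print the point at which a write is reported complete for protocol A, B or C.
# Hypothetical helper for illustration only; it does not talk to DRBD.
drbd_ack_point() {
    case "$1" in
        A) echo "local disk write done, data handed to the network" ;;
        B) echo "local disk write done, peer confirmed receipt" ;;
        C) echo "local disk write done, peer confirmed its disk write" ;;
        *) echo "unknown protocol: $1" >&2; return 1 ;;
    esac
}
```

For example, drbd_ack_point C describes the fully synchronous behaviour that the configuration later in this article selects with "protocol C".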
Software download list: http://oss.linbit.com/drbd/
Environment Introduction:
System version: CentOS 6.4 (32-bit), kernel version 2.6.32-358.el6.i686
Software Version: drbd-8.4.3.tar.gz
Master: 10.0.7.103, slave: 10.0.7.104
To simplify the experiment, a 20 GB hard disk is added to each machine.
[root@localhost ~]# vim /etc/hosts
10.0.7.103 node1
10.0.7.104 node2
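Because the same entries are needed on both nodes, a small idempotent helper can avoid duplicate lines. This is a minimal sketch assuming a POSIX shell; add_host_entry is a hypothetical name, and it only checks whether the host name already appears in the file, not whether the address is correct.

```shell
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless some line already maps NAME.
# Hypothetical helper; run it as root when FILE is /etc/hosts.
add_host_entry() {
    file=$1 ip=$2 name=$3
    # A match means NAME is preceded by whitespace and followed by whitespace or end of line.
    if grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        return 0    # NAME is already present; leave the file untouched
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

On each node this could be called as add_host_entry /etc/hosts 10.0.7.103 node1, followed by the corresponding call for node2.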
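Then partition and format the newly added hard disk (sdb).
[root@node1 ~]# mkdir /qq
[root@node1 ~]# mount /dev/sdb1 /qq    (the mount must also be written into /etc/fstab; otherwise it is lost after a restart)
[root@node1 ~]# init 6
Note: /dev/sdb1 is used later as the DRBD backing device. It must be unmounted, and its /etc/fstab entry removed, before drbdadm create-md is run on it.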
Install the build dependencies:
[root@node1 ~]# yum -y install kernel-devel kernel-headers flex
Note: the kernel-devel version must match the kernel version reported by uname -r. We recommend installing kernel-devel from the local installation source so that the versions match.
[root@node1 soft]# tar zxf drbd-8.4.3.tar.gz
[root@node1 soft]# cd drbd-8.4.3
[root@node1 drbd-8.4.3]# ./configure --prefix=/usr/local/drbd --with-km
Note: --with-km enables building the kernel module.
[root@node1 drbd-8.4.3]# make KDIR=/usr/src/kernels/2.6.32-358.el6.i686
(KDIR is the kernel source path; adjust it to match your own system.)
[root@node1 drbd-8.4.3]# make install
[root@node1 ~]# mkdir -p /usr/local/drbd/var/run/drbd
[root@node1 ~]# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d/
[root@node1 ~]# chkconfig --add drbd
[root@node1 ~]# chkconfig drbd on
Install the drbd module. Return to the directory where DRBD was extracted, then:
[root@node1 drbd-8.4.3]# cd drbd
[root@node1 drbd]# make clean
[root@node1 drbd]# make KDIR=/usr/src/kernels/2.6.32-358.el6.i686
[root@node1 drbd]# cp drbd.ko /lib/modules/`uname -r`/kernel/lib/
[root@node1 drbd]# modprobe drbd
Check whether the module has been loaded successfully:
[root@node1 drbd]# lsmod | grep drbd
drbd                  292307  0
libcrc32c                841  1 drbd
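Since the build depends on KDIR pointing at a kernel tree that matches the running kernel, it can help to derive the path from the kernel release and verify it before calling make. The following is a sketch under stated assumptions: kdir_for_kernel is a hypothetical helper, and it assumes the kernel-devel layout used in this article, /usr/src/kernels/RELEASE with a Makefile inside.

```shell
# kdir_for_kernel RELEASE [BASE]
# Print the kernel build directory for RELEASE under BASE (default
# /usr/src/kernels) and succeed only if that directory contains a Makefile.
# Hypothetical helper for illustration.
kdir_for_kernel() {
    release=$1
    base=${2:-/usr/src/kernels}
    kdir="$base/$release"
    if [ -f "$kdir/Makefile" ]; then
        echo "$kdir"
    else
        echo "no kernel build tree at $kdir (kernel-devel missing or version mismatch?)" >&2
        return 1
    fi
}
```

It could then be used as KDIR="$(kdir_for_kernel "$(uname -r)")" && make KDIR="$KDIR", so that a missing or mismatched kernel-devel package is reported before the build starts.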
################ Configuration ################
Official documentation: http://www.drbd.org/users-guide-8.4/
View the main configuration file:
[root@node1 etc]# pwd
/usr/local/drbd/etc
[root@node1 etc]# cat drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
As you can see, the main configuration file includes the global configuration file and every file in the drbd.d directory whose name ends in .res.
Modify the global configuration file:
[root@node1 drbd.d]# pwd
/usr/local/drbd/etc/drbd.d
[root@node1 drbd.d]# ls
global_common.conf
[root@node1 drbd.d]# cat global_common.conf
global {
    usage-count yes;    # whether to participate in DRBD usage statistics; the default is yes
    # minor-count dialog-refresh disable-ip-verification
}
common {
    protocol C;    # use synchronization protocol C
    handlers {
        # These are EXAMPLE handlers only.
        # They may have severe implications,
        # like hard resetting the node under certain circumstances.
        # Be careful when chosing your poison.
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }
    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
    }
    options {
        # cpu-mask on-no-data-accessible
    }
    disk {
        on-io-error detach;    # I/O error handling policy: detach the backing device
        # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
        # disk-drain md-flushes resync-rate resync-after al-extents
        # c-plan-ahead c-delay-target c-fill-target c-max-rate
        # c-min-rate disk-timeout
    }
    net {
        # protocol timeout max-epoch-size max-buffers unplug-watermark
        # connect-int ping-int sndbuf-size rcvbuf-size ko-count
        # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
        # after-sb-1pri after-sb-2pri always-asbp rr-conflict
        # ping-timeout data-integrity-alg tcp-cork on-congestion
        # congestion-fill congestion-extents csums-alg verify-alg
        # use-rle
    }
    syncer {
        rate 1024M;    # network rate for synchronization between the master and slave nodes
    }
}
################ Resource configuration (a new file must be created) ################
[root@node1 ~]# vim /usr/local/drbd/etc/drbd.d/drbd.res
resource r1 {    # r1 is the resource name
    on node1 {    # "on" is followed by the host name
        device /dev/drbd0;    # the DRBD device name
        disk /dev/sdb1;    # the disk partition used by drbd0 is sdb1
        address 10.0.0.105:7789;    # the DRBD listening address and port
        meta-disk internal;
    }
    on node2 {    # "on" is followed by the host name
        device /dev/drbd0;    # the DRBD device name
        disk /dev/sdb1;    # the disk partition used by drbd0 is sdb1
        address 10.0.0.106:7789;    # the DRBD listening address and port
        meta-disk internal;
    }
}
(The two addresses must be the addresses of node1 and node2 on the replication network; replace 10.0.0.105 and 10.0.0.106 with your own values if they differ.)
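Because both nodes need an identical resource file, generating it from a few parameters reduces the chance of typos. The sketch below is an assumption-laden illustration rather than a DRBD tool: print_on_section and print_drbd_resource are hypothetical names, and the device, disk, and port are fixed to the values used in this article.

```shell
# print_on_section HOST IP
# Emit one "on" section. Device, disk and port are the values used in this
# article (/dev/drbd0, /dev/sdb1, 7789); adjust them for other setups.
print_on_section() {
    printf '    on %s {\n' "$1"
    printf '        device    /dev/drbd0;\n'
    printf '        disk      /dev/sdb1;\n'
    printf '        address   %s:7789;\n' "$2"
    printf '        meta-disk internal;\n'
    printf '    }\n'
}

# print_drbd_resource NAME HOST1 IP1 HOST2 IP2
# Emit a complete two-node resource definition on standard output.
print_drbd_resource() {
    printf 'resource %s {\n' "$1"
    print_on_section "$2" "$3"
    print_on_section "$4" "$5"
    printf '}\n'
}
```

A possible use is print_drbd_resource r1 node1 10.0.0.105 node2 10.0.0.106 > /usr/local/drbd/etc/drbd.d/drbd.res, after which drbdadm dump r1 can be used to confirm that drbdadm parses the file.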
################################################################
Initialize the resource on node1 (the same two steps must also be run on node2):
[root@node1 ~]# drbdadm create-md r1
Start the service:
[root@node1 ~]# service drbd start
[root@node1 ~]# netstat -anput | grep 7789
tcp 0 0 10.0.0.105:7789 0.0.0.0:* LISTEN -
tcp 0 0 10.0.0.105:55715 10.0.0.106:7789 TIME_WAIT -
tcp 0 0 10.0.0.105:41264 10.0.0.106:7789 TIME_WAIT -
Note: when DRBD is started for the first time, both nodes are in the Secondary role by default.
[root@node1 ~]# drbdadm role r1
Secondary/Secondary
Because neither node is primary by default, you have to choose the host that is to become the primary node and run the following command on it:
[root@node1 /]# drbdadm --overwrite-data-of-peer primary all
Afterwards you can use the ordinary command drbdadm primary r1 (r1 is the name of the defined resource) or drbdadm primary all.
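Forcing a node to primary with --overwrite-data-of-peer discards whatever is on the peer, so it is worth checking that the resource really is fresh first. The guard below is a sketch of one possible policy, not something DRBD requires: safe_to_force_primary is a hypothetical name, and it assumes the "local/peer" strings printed by drbdadm role and drbdadm dstate. It is deliberately conservative and also refuses when the peer is not connected.

```shell
# safe_to_force_primary ROLE DSTATE
# Succeed only for a freshly created, connected resource: both nodes
# Secondary and both disks Inconsistent. Hypothetical policy helper.
safe_to_force_primary() {
    [ "$1" = "Secondary/Secondary" ] && [ "$2" = "Inconsistent/Inconsistent" ]
}
```

On the chosen primary it could be used as: if safe_to_force_primary "$(drbdadm role r1)" "$(drbdadm dstate r1)"; then drbdadm --overwrite-data-of-peer primary r1; fi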
View the connection state of the resource:
[root@node1 ~]# drbdadm cstate r1
Connected
A resource may be in one of the following connection states:
StandAlone: no network configuration is available. The resource has not yet been connected, has been administratively disconnected (with drbdadm disconnect), or has dropped its connection because of failed authentication or split brain.
Disconnecting: temporary state during disconnection. The next state is StandAlone.
Unconnected: temporary state before a connection attempt. The next state may be WFConnection or WFReportParams.
Timeout: temporary state after the connection to the peer node has timed out. The next state is Unconnected.
BrokenPipe: temporary state after the connection to the peer node has been lost. The next state is Unconnected.
NetworkFailure: temporary state after the connection to the peer node has been lost. The next state is Unconnected.
ProtocolError: temporary state after the connection to the peer node has been lost. The next state is Unconnected.
TearDown: temporary state while the peer node is closing the connection. The next state is Unconnected.
WFConnection: this node is waiting to establish a network connection with the peer node.
WFReportParams: a TCP connection has been established, and this node is waiting for the first network packet from the peer node.
Connected: a DRBD connection has been established and data mirroring is active. This is the normal state.
StartingSyncS: a full synchronization initiated by the administrator is just starting. The next state may be SyncSource or PausedSyncS.
StartingSyncT: a full synchronization initiated by the administrator is just starting. The next state is WFSyncUUID.
WFBitMapS: a partial synchronization is just starting. The next state may be SyncSource or PausedSyncS.
WFBitMapT: a partial synchronization is just starting. The next state is WFSyncUUID.
WFSyncUUID: synchronization is about to begin. The next state may be SyncTarget or PausedSyncT.
SyncSource: synchronization is in progress with the local node as the synchronization source.
SyncTarget: synchronization is in progress with the local node as the synchronization target.
PausedSyncS: the local node is the source of an ongoing synchronization, but the synchronization is paused. This may be because another synchronization is in progress or because it was paused with drbdadm pause-sync.
PausedSyncT: the local node is the target of an ongoing synchronization, but the synchronization is paused. This may be because another synchronization is in progress or because it was paused with drbdadm pause-sync.
VerifyS: online device verification is running with the local node as the verification source.
VerifyT: online device verification is running with the local node as the verification target.
View the resource role:
[root@node1 ~]# drbdadm role r1
Primary/Secondary    (the role before the slash is that of the current node; the one after it is that of the peer)
Primary: the resource is currently in the primary role and may be read from and written to. Unless dual-primary mode is enabled, this role appears on only one of the two nodes.
Secondary: the resource is currently in the secondary role and normally receives updates from the peer node.
Unknown: the resource role is currently unknown. The local resource never shows this state.
View the disk state:
[root@node1 ~]# drbdadm dstate r1
UpToDate/UpToDate
The local and peer disks may each be in one of the following states:
Diskless: no local block device has been assigned to DRBD. This may mean that no suitable device is available, that the device was detached manually with drbdadm, or that it was detached automatically after a lower-level I/O error.
Attaching: transient state while the metadata is being read.
Failed: transient state after the local block device has reported an I/O error. The next state is Diskless.
Negotiating: transient state while an attach is carried out on an already connected DRBD device.
Inconsistent: the data is inconsistent. This state occurs on both nodes immediately after a new resource is created (before the initial full synchronization), and on the synchronization target during synchronization.
Outdated: the data is consistent but out of date.
DUnknown: used for the peer disk when no network connection is available.
Consistent: consistent data on a node without a connection. When the connection is established, DRBD decides whether the data is UpToDate or Outdated.
UpToDate: consistent, up-to-date data. This is the normal state.
Check the synchronization progress:
[root@node1 ~]# cat /proc/drbd    (or run /usr/local/drbd/sbin/drbd-overview)
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: build by root@localhost.localdomain, 20:16:24
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:2767088 nr:0 dw:0 dr:2774680 al:0 bm:168 lo:0 pe:1 ua:7 ap:0 ep:1 wo:f oos:18202972
    [=>......] sync'ed: 13.3% (17776/20476)M
    finish: 0:12:59 speed: 23,344 (22,492) K/sec
The synchronization is 13.3% complete, and the transfer speed is about 22 MB/s.
Note: ds is the disk state, dw is the amount of data written to disk, and dr is the amount of data read from disk.
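For scripting, the cs, ro, and ds fields and the sync percentage can be pulled out of the /proc/drbd text with sed. This is a minimal sketch that assumes the 8.4-style layout shown above and looks only at the first match; drbd_field and drbd_sync_percent are hypothetical helper names.

```shell
# drbd_field KEY
# Read /proc/drbd-style text on standard input and print the value that
# follows "KEY:" (for example cs, ro, ds, ns, oos). Only the first match is
# printed, so with several devices this reports the first one.
drbd_field() {
    sed -n "s/.*[[:space:]]$1:\([^[:space:]]*\).*/\1/p" | head -n 1
}

# drbd_sync_percent
# Read the same text and print the number before "%" on the sync'ed line.
# Prints nothing when no synchronization is in progress.
drbd_sync_percent() {
    sed -n "s/.*sync'ed:[[:space:]]*\([0-9.]*\)%.*/\1/p" | head -n 1
}
```

Typical calls would be drbd_field cs < /proc/drbd and drbd_sync_percent < /proc/drbd, for instance inside a loop that waits until the connection state returns to Connected.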
################ Test and verification ################
Create a file system on the device (choose the file system type to suit your environment) and mount it:
[root@node1 ~]# mkfs.ext4 /dev/drbd0
[root@node1 ~]# mkdir /data
[root@node1 ~]# mount /dev/drbd0 /data/
Create a test file in the mounted data directory, unmount the directory, switch the master and slave roles, and then check on the other node whether the test file exists.
Node1:
[root@node1 ~]# mkdir /data/test
[root@node1 ~]# umount /data/
Change node1 to a slave node:
[root@node1 ~]# drbdadm secondary r1
[root@node1 ~]# drbdadm role r1
Secondary/Secondary
Node2:
Change node2 to the master node:
[root@node2 ~]# drbdadm primary r1
[root@node2 ~]# drbdadm role r1
Primary/Secondary
Mount the device and check whether the file exists:
[root@node2 ~]# mount /dev/drbd0 /mnt/
[root@node2 ~]# cd /mnt/
[root@node2 mnt]# ls
test
OK! Done!
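The manual switchover above follows a fixed order: unmount and demote on the old primary, then promote and mount on the new one. The sketch below wraps the two halves in functions. The names demote_node and promote_node and the RUN variable are assumptions made for illustration; setting RUN=echo prints the commands instead of executing them, which is useful for a dry run.

```shell
# RUN is a hypothetical hook: leave it empty to execute the commands,
# or set RUN=echo to print them instead.
RUN=${RUN:-}

# demote_node RESOURCE MOUNTPOINT
# Run on the current primary: unmount first, then give up the primary role.
demote_node() {
    $RUN umount "$2" && $RUN drbdadm secondary "$1"
}

# promote_node RESOURCE DEVICE MOUNTPOINT
# Run on the peer: take the primary role, then mount the DRBD device.
promote_node() {
    $RUN drbdadm primary "$1" && $RUN mount "$2" "$3"
}
```

With the values from this article, that would be demote_node r1 /data on node1 followed by promote_node r1 /dev/drbd0 /mnt on node2. Each function stops at the first failing step, so a busy mount point prevents the demotion.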
################ Error messages ################
Error 1: configure: error: Cannot build utils without flex, either install flex or pass the --without-utils option.
Solution: yum -y install flex
Error 2: make: [check-kdir] Error 1
Solution: yum install -y kernel-devel
Error 3: SORRY, kernel makefile not found. You need to tell me a correct KDIR, Or install the neccessary kernel source packages.
Solution: check whether the KDIR path is correct.
Error 4: 'drbd' not defined in your config (for this host).
This usually means that the resource name passed to drbdadm ('drbd' here) is not defined in the configuration, or that no "on" section matches this host's name as reported by uname -n.
Error: Command 'drbdmeta 0 v08 /dev/sdb1 internal create-md' terminated with exit code 40
This error is reported because sdb1 already contains a file system, that is, data already exists on it.
Solution (this destroys the existing data on sdb1):
[root@node1 ~]# dd if=/dev/zero bs=1M count=1 of=/dev/sdb1
################################################################
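For the "not defined in your config (for this host)" message, one thing worth checking is whether the resource file has an "on" section for the local host name. The helper below is a rough sketch under that assumption: host_defined_in_res is a hypothetical name, and it only does a simple text match on the file rather than parsing it the way drbdadm does.

```shell
# host_defined_in_res FILE [HOST]
# Succeed if FILE contains a line of the form "on HOST {".
# HOST defaults to the local node name reported by uname -n.
host_defined_in_res() {
    host=${2:-$(uname -n)}
    grep -qE "^[[:space:]]*on[[:space:]]+${host}[[:space:]]*\{" "$1"
}
```

For example, host_defined_in_res /usr/local/drbd/etc/drbd.d/drbd.res || echo "no on-section for $(uname -n)" would flag a mismatch between the host name and the names used in the resource file.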
This article is from the "demon dog" blog. Please keep this source link: http://yangdonglin.blog.51cto.com/5404572/1301166