DRBD Learning Summary


Some of this material is taken directly from other people's blogs, and more of it comes from the man pages. That is not illegal. I make no copyright claim; consider it GPL.

Distributed Replicated Block Device (DRBD) is a software-based, shared-nothing, replicated storage solution that mirrors block devices (whole disks, partitions, or logical volumes) between servers. DRBD works in the kernel, much like a driver module. It sits between the file system's buffer cache and the disk scheduler: data written locally is sent through TCP/IP to the peer host, handed to the peer's DRBD, and finally stored on the peer's local disk, similar to a network-based RAID-1.

DRBD is layered on top of an underlying device and exposes a block device of its own. A DRBD device behaves like a physical disk: you can create a file system on it. DRBD supports the following underlying devices:
1. a disk, or a partition of a disk;
2. a soft RAID device;
3. an LVM logical volume;
4. an EVMS (Enterprise Volume Management System) volume;
5. any other block device.
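For instance, a 100 MB logical volume like the one used in the example below can be prepared as a backing device roughly like this (a sketch; /dev/sdb is an assumption, and the volume group name myvg is taken from the later example):

# pvcreate /dev/sdb
# vgcreate myvg /dev/sdb
# lvcreate -L 100M -n web myvg    # yields /dev/myvg/web as DRBD's backing device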


Because of the restrictions of single-host file systems, a DRBD device can only be mounted on the primary node.
In spite of this limitation, there are still a few ways to access the data on the second node:

1. Use DRBD on logical volumes and use LVM's capability to take snapshots on the standby node, then access the data via the snapshot (see the sketch below).
2. Use DRBD's primary-primary mode with a shared-disk file system (GFS, OCFS2). These systems are very sensitive to failures of the replication network.
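A hedged sketch of the snapshot approach, assuming the backing LV from the example below and enough free space in the volume group for the snapshot:

# on the standby (secondary) node:
# lvcreate -s -L 20M -n web-snap /dev/myvg/web   # snapshot DRBD's backing LV
# mkdir -p /mnt/web-snap
# mount -o ro /dev/myvg/web-snap /mnt/web-snap   # inspect the data read-only via the snapshot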


DRBD configuration:
Example:
Environment:
1. The two nodes are node1 (10.0.9.1) and node2 (10.0.9.2). node1 and node2 are the host names, identical to the output of 'uname -n'.
2. Each node provides a block device of the same size (I use an LV here): /dev/centos_vg/web and /dev/myvg/web, both 100 MB.
3. I am not sure whether the DRBD versions are required to match, but the operating system and DRBD versions of my two virtual machines are identical: CentOS 6.4 x86_64.

Procedure:
1. Ensure that the clocks of the two DRBD hosts are synchronized; this is the same requirement as for cluster nodes.
2. The host names of the two nodes must resolve correctly and must match the output of 'uname -n'; the resolved IP address should be the one used for data synchronization and communication between the two nodes. We recommend configuring the hosts file, which is more reliable than DNS (see the example below).
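A minimal /etc/hosts for this environment, using the addresses given above, on both nodes:

10.0.9.1    node1
10.0.9.2    node2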
3. Install the software packages.
DRBD consists of a kernel module and user-space management tools. The DRBD kernel module has been merged into the mainline kernel since Linux 2.6.33, so if your kernel is at least that version you only need to install the management tools; otherwise you must install both the kernel module and the management tools, and their version numbers must match.
Because the CentOS kernel I use does not ship the DRBD module, I need to install it.
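You can check your kernel version and whether a DRBD module is already present, for example:

# uname -r        # kernel version; 2.6.33 or later ships the module
# modinfo drbd    # succeeds only if a drbd module is installed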

Currently, the DRBD versions applicable to CentOS 5 are mainly 8.0, 8.2, and 8.3; the rpm packages are named drbd, drbd82, and drbd83 respectively, and the corresponding kernel-module packages are named kmod-drbd, kmod-drbd82, and kmod-drbd83. For CentOS 6, version 8.4 corresponds to the rpm packages drbd and drbd-kmdl. When selecting rpm packages, remember two things: the drbd and drbd-kmdl versions must correspond to each other, and the drbd-kmdl version must match the kernel of the current system. Here I use version 8.4 (drbd-8.4.3-33.el6.x86_64.rpm and drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm) from ftp://rpmfind.net/linux/atrpms/
After the download completes, install them directly:
# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm

4. Configuration
The main configuration file of DRBD is /etc/drbd.conf. Its content is:
include "drbd.d/global_common.conf";
include "drbd.d/*.res";

The global and common sections are defined in global_common.conf, and each .res file defines one resource.
The global section may appear only once in the configuration. If all configuration is kept in a single file rather than split across multiple files, the global section must come first. Only the following parameters can be defined in the global section: minor-count, dialog-refresh, disable-ip-verification, and usage-count.

The common section defines parameters that every resource inherits by default; any parameter usable in a resource definition may appear in the common section. In practice the common section is not mandatory, but we recommend placing parameters shared by multiple resources there, to reduce the complexity of the configuration file.

The resource section defines a DRBD resource. Each resource must be given a name, which may contain any non-blank ASCII characters. Every resource definition must contain two (or more) host sub-sections ('on' blocks) defining the nodes.

The configuration of a given DRBD resource should be identical on its two nodes. This is worth saying because there may be multiple DRBD nodes and multiple DRBD resources; for example, node1 carries resource A, node2 carries resources B and C, and node3 carries resource C. That situation is not considered here.
Configure on node1 first:
The global_common.conf file:

global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
}

common {
        protocol C;

        handlers {
                # These are EXAMPLE handlers only.
                # They may have severe implications,
                # like hard resetting the node under certain circumstances.
                # Be careful when choosing your poison.
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }

        startup {
                # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
        }

        options {
                # cpu-mask on-no-data-accessible
        }

        disk {
                on-io-error detach;
                # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
                # disk-drain md-flushes resync-rate resync-after al-extents
                # c-plan-ahead c-delay-target c-fill-target c-max-rate
                # c-min-rate disk-timeout
        }

        net {
                cram-hmac-alg "sha1";
                shared-secret "mydrbd";
                # protocol timeout max-epoch-size max-buffers unplug-watermark
                # connect-int ping-int sndbuf-size rcvbuf-size ko-count
                # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
                # after-sb-1pri after-sb-2pri always-asbp rr-conflict
                # ping-timeout data-integrity-alg tcp-cork on-congestion
                # congestion-fill congestion-extents csums-alg verify-alg
                # use-rle
        }

        syncer {
                rate 30M;
        }
}

Content of web.res:

resource web {
        meta-disk internal;
        on node1 {
                device /dev/drbd0;              # equivalent to "device minor 0"
                disk /dev/centos_vg/web;
                address 10.0.9.1:7789;
        }
        on node2 {
                device /dev/drbd0;              # name of the drbd device created on node2
                disk /dev/myvg/web;             # the device on node2 used as DRBD's backing device
                address 10.0.9.2:7789;          # node2 uses this socket to communicate with the peer node
        }
}

You can also use 'floating <node-ip>' instead of 'on <node-name>'.
Here is an example from the manpage:

resource r2 {
        protocol C;
        device minor 2;
        disk /dev/sda7;
        meta-disk internal;

        # short form, device, disk and meta-disk inherited
        floating 10.1.1.31:7802;

        # longer form, only device inherited
        floating 10.1.1.32:7802 {
                disk /dev/sdb;
                meta-disk /dev/sdc8;
        }
}

5. Copy the above configuration to the peer node.
6. Initialize the resource. Create the /var/lib/drbd directory on both nodes (DRBD places files such as drbd-minor-0.lkbd there), then execute on both nodes:
# drbdadm create-md web
7. Start the DRBD service on both nodes:
# /etc/init.d/drbd start

8. View the startup status:
# cat /proc/drbd
or
# drbd-overview
Both nodes are in the Secondary state.
9. Promote one node to primary as needed by executing:
# drbdadm primary --force web
Watch the DRBD status: the secondary node now synchronizes data bit-by-bit from the primary.
10. On the primary node, create a file system on the DRBD device (or make the DRBD device an LVM physical volume), then mount it (see the sketch below).
In a DRBD cluster, only the primary node's DRBD device can be read and written; the DRBD device on the secondary cannot be read or written. Furthermore, the backing devices underneath DRBD, on every node, must not be touched by any file-system operation while DRBD is running on them, including commands such as dumpe2fs.
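A minimal sketch of this step, assuming an ext4 file system and a mount point of /mnt/drbd (both are my assumptions, not from the original procedure):

# on the primary node only:
# mkfs.ext4 /dev/drbd0
# mkdir -p /mnt/drbd
# mount /dev/drbd0 /mnt/drbd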

11. Switch the primary and secondary roles.
On the primary node:
unmount the mounted DRBD device, then execute
# drbdadm secondary web
On the secondary node:
# drbdadm primary web
Then mount the DRBD device on the new primary. Note: do not format it. DRBD replicates the device bit-by-bit between the nodes, so the file system created earlier was synchronized along with everything else.
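Putting the whole switchover together (a sketch; the mount point /mnt/drbd is the same assumption as above):

# on node1, the current primary:
# umount /mnt/drbd
# drbdadm secondary web

# on node2, the new primary:
# drbdadm primary web
# mount /dev/drbd0 /mnt/drbd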

Some configuration explanations:

protocol prot-id
On the TCP/IP link the specified protocol is used. Valid protocol specifiers are A, B, and C.

Protocol A: write IO is reported as completed if it has reached the local disk and the local TCP send buffer.

Protocol B: write IO is reported as completed if it has reached the local disk and the remote buffer cache.

Protocol C: write IO is reported as completed if it has reached both the local and the remote disk.
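The example configuration above picks the protocol once in the common section. A sketch of trading durability for latency by selecting protocol A instead (whether that trade-off is acceptable depends on your application; this document's example uses protocol C):

common {
        protocol A;    # asynchronous: fastest, least durable
}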

handlers
In this section you can define handlers (executables) that are started by the DRBD system in response to certain events.

device name minor nr
The name of the block device node of the resource being described. You must use this device with your application (file system) and you must not use the low-level block device which is specified with the disk parameter.

One can either omit the name, or omit the minor keyword and the minor number. If you omit the name, a default of /dev/drbd<minor> will be used.

udev will create additional symlinks in /dev/drbd/by-res and /dev/drbd/by-disk.
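Per the rules above, these forms should be equivalent ways of describing minor 0 (a sketch):

device /dev/drbd0 minor 0;    # full form
device /dev/drbd0;            # minor number derived from the name
device minor 0;               # name defaults to /dev/drbd0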

disk name
DRBD uses this block device to actually store and retrieve the data. Never access such a device while DRBD is running on top of it. This also holds true for dumpe2fs(8) and similar commands.

address AF addr:port
A resource needs one IP address per device, which is used to wait for incoming connections from the partner device and, respectively, to reach the partner device. AF must be one of ipv4, ipv6, ssocks or sdp (for compatibility reasons sci is an alias for ssocks). It may be omitted for IPv4 addresses. The actual IPv6 address that follows the ipv6 keyword must be placed inside brackets:
ipv6 [fd01:2345:6789:abcd::1]:7800.

Each DRBD resource needs a TCP port which is used to connect to the node's partner device. Two different DRBD resources may not use the same addr:port combination on the same node.


on-io-error handler
The handler is invoked if the lower-level device reports I/O errors to the upper layers.

handler may be pass_on, call-local-io-error or detach.

pass_on: The node downgrades the disk status to Inconsistent, marks the erroneous block as inconsistent in the bitmap and retries the IO on the remote node.

call-local-io-error: Call the handler script local-io-error.

detach: The node drops its low-level device, and continues in diskless mode.

ping-int time
If the TCP/IP connection linking a DRBD device pair is idle for more than time seconds, DRBD will generate a keep-alive packet to check if its partner is still alive. The default is 10 seconds; the unit is 1 second.

ping-timeout time
The time the peer has to answer a keep-alive packet. In case the peer's reply is not received within this time period, it is considered dead. The default value is 500 ms; the default unit is tenths of a second.

become-primary-on node-name
Sets on which node the device should be promoted to the primary role by the init script. The node-name might either be a host name or the keyword both. When this option is not set, the devices stay in the secondary role on both nodes. Usually one delegates the role assignment to a cluster manager (e.g. heartbeat).
Note: it must be set in the startup section of common; if it is set directly in the resource, an error appears.
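A sketch of how this might look, assuming node1 from the earlier example should come up as primary:

common {
        startup {
                become-primary-on node1;
        }
}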

verify-alg hash-alg
During online verification (as initiated by the verify sub-command), rather than doing a bit-wise comparison, DRBD applies a hash function to the contents of every block being verified, and compares that hash with the peer. This option defines the hash algorithm being used for that purpose. It can be set to any of the kernel's data digest algorithms. In a typical kernel configuration you should have at least one of md5, sha1, and crc32c available. By default this is not enabled; you must set this option explicitly in order to be able to use on-line device verification.
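For instance, enabling online verification for the web resource might look like this (a sketch; in DRBD 8.4 verify-alg belongs in the net section, as the commented-out option list in the example configuration above suggests):

net {
        verify-alg sha1;
}

# drbdadm adjust web    # load the configuration change
# drbdadm verify web    # then start a verification run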

cram-hmac-alg
You need to specify the HMAC algorithm to enable peer authentication at all. You are strongly encouraged to use peer authentication. The HMAC algorithm will be used for the challenge-response authentication of the peer. You may specify any digest algorithm that is named in /proc/crypto.

shared-secret
The shared secret used in peer authentication. May be up to 64 characters. Note that peer authentication is disabled as long as no cram-hmac-alg (see above) is specified.

resync-rate rate
To ensure smooth operation of the application on top of DRBD, it is possible to limit the bandwidth which may be used by background synchronizations. The default is 250 KB/sec; the default unit is KB/sec. Optional suffixes K, M, G are allowed.

Common options and sub-commands of drbdadm:
-d, --dry-run
-c, --config-file

attach
Attaches a local backing block device to the DRBD resource's device.
detach
Removes the backing storage device from a DRBD resource's device.
connect
Sets up the network configuration of the resource's device. If the peer device is already configured, the two DRBD devices will connect. If there are more than two host sections in the resource you need to use the --peer option to select the peer you want to connect to.
disconnect
Removes the network configuration from the resource. The device will then go into StandAlone state.
up
Is a shortcut for attach and connect.

down
Is a shortcut for disconnect and detach.
syncer
Loads the resynchronization parameters into the device.
verify
Starts online verify. During online verify, data on both nodes is compared for equality. See /proc/drbd for online verify progress. If out-of-sync blocks are found, they are not resynchronized automatically. To do that, disconnect and connect the resource when verification has completed.
pause-sync
Temporarily suspend an ongoing resynchronization by setting the local pause flag. Resync only progresses if neither the local nor the remote pause flag is set. It might be desirable to postpone DRBD's resynchronization until after any resynchronization of the backing store's RAID setup.
resume-sync
Unset the local sync pause flag.
dstate
Show the current state of the backing storage devices. (local/peer)

hidden-commands
Shows all commands undocumented on purpose.

Common commands:
drbdadm create-md resource_name             # initialize a DRBD resource (metadata)
drbdadm verify resource_name                # start online verification
drbdsetup /dev/drbd0 syncer -r 100M         # temporarily set the resync speed to 100M
drbdadm adjust resource_name                # restore DRBD's resync speed to the value set in the configuration file
cat /proc/drbd                              # view the status of DRBD
drbdadm primary [--force] resource_name     # [forcibly] promote the current node to primary for resource_name
drbdadm secondary <resource_name>           # demote the current node to secondary for resource_name, after the DRBD device is unmounted
drbdadm dump                                # just parse the configuration file and dump it to stdout; useful for checking syntax
drbdadm disconnect <resource_name>
drbdadm detach <resource_name>


Split brain (split-brain)
Split brain means that, under certain conditions, the two DRBD nodes lose their connection and both continue to run as primary. When one DRBD node reconnects to its peer and prepares to exchange information, it discovers that the peer is also in the primary state; it then immediately drops the connection and declares that split brain has occurred, recording the following message in the system log: "Split-Brain detected, dropping connection!" After split brain has occurred, if you check the connection status, at least one node will be in the StandAlone state; the other may also be StandAlone (if both sides discovered the split brain simultaneously) or in WFConnection.
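You can check for this on each node, for example (the cs: field of /proc/drbd shows the connection state):

# drbd-overview
# cat /proc/drbd    # look for cs:StandAlone or cs:WFConnection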
If the configuration file enables automatic split-brain recovery (which linbit does not recommend), DRBD resolves the split brain automatically, according to one of the following policies:
Discarding modifications made on the "younger" primary. In this mode, when the network is re-established and split brain is discovered, DRBD discards the modifications made on the host that switched to primary last.
Discarding modifications made on the "older" primary. In this mode, DRBD discards the modifications made on the host that switched to primary first.
Discarding modifications on the primary with fewer changes. In this mode, DRBD discards all modifications on whichever host changed less data.
Graceful recovery from split brain if one host has had no intermediate changes. In this mode, if one host made no modifications at all during the split brain, DRBD simply recovers and the split-brain problem is considered solved. (This situation is almost impossible.)

Note: whether automatic split-brain repair is acceptable depends on your application. Consider a DRBD-backed example database: for a web application, "discard the modifications of the primary with fewer changes" may be acceptable, whereas for a financial database the loss of any modification is intolerable, and split brain must be repaired manually no matter the circumstances. So consider your application carefully before enabling automatic split-brain repair.
If automatic split-brain resolution is not configured, we can resolve it manually. First, decide which node's data should survive, i.e. which side will act as primary after the recovery. Once that is decided, we are also accepting the loss of all changes made on the other node during the split brain. Having confirmed this, recover with the following steps:
1. On the node that is to become secondary, switch to secondary and discard the resource's data:
# drbdadm secondary resource_name
# drbdadm connect --discard-my-data resource_name
2. On the node that is to remain primary, reconnect to the secondary (skip this step if this node's connection status is already WFConnection):
# drbdadm connect resource_name
After these actions are completed, the resynchronization from the new primary to the secondary starts automatically.
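Applied to the running example, the whole recovery might look like this (a sketch; it assumes node2 is the side whose split-brain changes are discarded):

# on node2, whose changes are discarded:
# drbdadm secondary web
# drbdadm connect --discard-my-data web

# on node1, the surviving primary (only if it is not already in WFConnection):
# drbdadm connect web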

Metadata
DRBD stores various pieces of information about the data in a dedicated area. This metadata includes:
a. the size of the DRBD device;
b. the generation identifiers (GI);
c. the activity log;
d. the quick-sync bitmap.
Metadata can be stored internally or externally; which one is used is defined in the resource configuration (see the meta-disk examples below).
Internal metadata
Internal metadata is stored at the end of the same disk or partition that holds the data.
Advantage: the metadata is tied to the actual data. If the hard disk is damaged, the metadata is lost together with the data; when the disk is replaced and the data restored, the metadata is restored along with it, with no separate handling needed.
Disadvantage: keeping metadata and data on the same disk hurts write throughput, because application write requests also trigger metadata updates, so a single write can cost two additional head movements.
External metadata
External metadata is stored on an independent block device, separate from the data disk.
Advantage: some write operations may perform better, since metadata updates no longer compete with data writes on the same device.
Disadvantage: the metadata is not tied to the data. If the data disk fails, manual intervention is needed when replacing the disk, to resynchronize the new disk from the surviving node.
If the disk already contains data, and you can neither expand the disk or partition nor shrink the existing file system, you must use external metadata.
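The two placements as they appear in this document's configurations:

meta-disk internal;     # internal metadata, as in the web resource above
meta-disk /dev/sdc8;    # external metadata on a dedicated device, as in the manpage floating example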

Note that disk appears in two roles: as a sub-section of the common (or resource) section, where it defines parameters of the DRBD device's behavior; and as a parameter inside a resource's host section, where it names the underlying block device that backs the DRBD device.


References:

http://www.drbd.org/docs/introduction/
http://czmmiao.iteye.com/blog/1773079
man 5 drbd.conf
man drbdadm
DRBD User's Guide
