1. Adjusting parameters online:
Edit the configuration file of the existing resource and make sure it is identical on both peer nodes, then run drbdadm adjust <resource> on both nodes.
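Before running the adjust step, it can help to confirm that both nodes really hold the same file. The sketch below assumes the peer's copy has already been fetched to a local path (for example with scp). The function name configs_match, the paths, and the resource name r0 are illustrative and not part of DRBD.

```shell
# Hypothetical helper: succeed only if two copies of a DRBD config are byte-identical.
configs_match() {
    local_copy=$1
    peer_copy=$2
    cmp -s "$local_copy" "$peer_copy"
}

# Example (paths and resource name are assumptions):
#   scp peer:/etc/drbd.d/r0.res /tmp/r0.res.peer
#   if configs_match /etc/drbd.d/r0.res /tmp/r0.res.peer; then
#       drbdadm -d adjust r0   # dry run: print what would be changed
#       drbdadm adjust r0
#   fi
```

The dry run is optional; it only prints the low-level calls that adjust would make.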
2. Online verification of data integrity (this has a significant impact on performance):
The node where verification is started (the verification source) computes a cryptographic digest of each block on the resource's underlying storage device, one block at a time, and sends it to the peer node. The peer compares it with the digest of its local copy of the same block. Blocks whose digests do not match are marked out of sync and can be resynchronized afterwards. Replication is not blocked and the system is not interrupted while online verification runs.
Procedure:
Modify the configuration file:
resource <resource> {
    net {
        verify-alg <algorithm>;
    }
}
(Note: this can also be set in the common section, where it applies to all resources.)
Then run the following command:
drbdadm verify <resource>
If out-of-sync blocks are found during verification, resynchronize them after the verification has finished by disconnecting and reconnecting the resource:
drbdadm disconnect <resource>
drbdadm connect <resource>
Online verification is rarely run by hand, but it can be scheduled (for example with cron) to run weekly or monthly.
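To decide whether the disconnect/connect step is needed after a verify run, one option is to read the out-of-sync counter. The sketch below assumes a DRBD 8.x node where /proc/drbd exposes an "oos:" field (in KiB) per device. The function name total_oos and the resource name r0 are assumptions, and the sum covers all devices listed in the input, not a single resource.

```shell
# Sum the "oos:" (out-of-sync) counters found in /proc/drbd-style text on stdin.
# Prints 0 when no counter is present.
total_oos() {
    grep -o 'oos:[0-9][0-9]*' | cut -d: -f2 | awk '{ sum += $1 } END { print sum + 0 }'
}

# Example on a DRBD 8.x node (resource name is an assumption):
#   if [ "$(total_oos < /proc/drbd)" -gt 0 ]; then
#       drbdadm disconnect r0 && drbdadm connect r0
#   fi
```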
3. Configuring the synchronization rate:
Pick a rate that suits the hardware. It depends on disk throughput and network (NIC) I/O. If synchronization saturates the replication link, both replication and the applications using the device suffer.
A good rule of thumb is to set the rate to about 30% of the available replication bandwidth.
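The 30% rule can be worked out with a few lines of shell. This sketch assumes that "available bandwidth" means the slower of disk and network throughput, both given in whole MB/s. The function name suggest_sync_rate is made up for illustration.

```shell
# Suggest a sync rate in MB/s: 30% of the slower of disk and network throughput.
# Arguments are whole numbers in MB/s; the result is rounded down.
suggest_sync_rate() {
    disk_mbps=$1
    net_mbps=$2
    if [ "$disk_mbps" -lt "$net_mbps" ]; then
        limit=$disk_mbps
    else
        limit=$net_mbps
    fi
    echo $(( limit * 30 / 100 ))
}

# suggest_sync_rate 180 110   # prints 33 (network is the bottleneck)
# suggest_sync_rate 80 400    # prints 24 (disk is the bottleneck)
```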
Fixed synchronization rate:
resource <resource> {
    disk {
        resync-rate 40M;
    }
}
(Note: this can also be set in the common section, where it applies to all resources.)
Temporarily adjusting the rate:
This is useful, for example, to speed up synchronization after scheduled maintenance:
drbdadm disk-options --resync-rate=200M <resource>
To return to the configured synchronization rate, run drbdadm adjust <resource> on both nodes.
Configuring a range of synchronization rates for multiple resources is skipped for now.
Congestion policies and suspended replication mostly matter on wide area networks, so they are not covered here.
4. Disk I/O error handling:
resource <resource> {
    disk {
        on-io-error <strategy>;
    }
}
(Note: this can also be set in the common section, where it applies to all resources.)
The available strategies for handling disk errors are:
detach: the default. If an I/O error occurs on the node's underlying disk, DRBD detaches the backing device and the node continues in diskless mode.
pass_on: DRBD reports the error to the upper layers. On the primary node it is passed to the mounted file system. On the secondary node it is ignored, because there is no upper layer to report to.
call-local-io-error: invokes the command defined as the local I/O error handler. This requires a local-io-error command to be defined in the handlers section.
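As a rough idea of what a custom local-io-error handler could contain, the sketch below builds a one-line log message. It assumes DRBD exports the resource name to handlers in the DRBD_RESOURCE environment variable, as the bundled notify scripts appear to rely on. The function name io_error_message and the script path in the usage comment are hypothetical.

```shell
# Hypothetical handler body: build a log line for a local I/O error.
# Falls back to "unknown" if DRBD_RESOURCE is not set (assumed handler variable).
io_error_message() {
    echo "drbd: local I/O error on resource ${DRBD_RESOURCE:-unknown}"
}

# Possible use inside a handler script (the path is an assumption):
#   /usr/local/sbin/drbd-io-error.sh:
#       logger -t drbd "$(io_error_message)"
```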
5. Disk flushing:
DRBD uses disk flushes as long as the disk controller supports them (most do).
In a RAID environment with a battery-backed cache (BBC), you can disable DRBD's disk flushes to get better performance:
resource <resource> {
    disk {
        disk-flushes no;
        ...
    }
}
6. Split-brain notification:
handlers {
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    ...
}
7. Split-brain recovery policies:
In most cases split brain is repaired manually. DRBD also offers the following automatic recovery options.
after-sb-0pri: split brain has been detected, and at that moment neither node is in the primary role. This option accepts the following keywords:
    disconnect: do not recover automatically. Only invoke the split-brain handler script (if configured), drop the connection, and continue in disconnected mode.
    discard-younger-primary: discard and roll back the changes made on the node that became primary last.
    discard-least-changes: discard and roll back the changes on the node where fewer changes were made.
    discard-zero-changes: if one node has made no changes at all, apply the changes made on the other node to it and continue.
after-sb-1pri: split brain has been detected, and at that moment one node is in the primary role. This option accepts the following keywords:
    disconnect: as with after-sb-0pri, invoke the split-brain handler script (if configured), drop the connection, and continue in disconnected mode.
    consensus: apply the same recovery policies as specified for after-sb-0pri. If a split-brain victim can be chosen by those policies, resolve automatically. Otherwise behave as if disconnect had been specified.
    call-pri-lost-after-sb: apply the same recovery policies as specified for after-sb-0pri. If a split-brain victim can be chosen by those policies, invoke the pri-lost-after-sb handler on the victim node. This handler must be configured in the handlers section and is expected to remove the node from the cluster.
    discard-secondary: whichever node is currently in the secondary role becomes the split-brain victim.
after-sb-2pri: split brain has been detected, and at that moment both nodes are in the primary role. This option accepts the same keywords as after-sb-1pri, except discard-secondary and consensus.
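For the manual repair mentioned above, the usual sequence can be sketched as a small helper that only prints the commands so they can be reviewed before being run. This assumes DRBD 8.4 command syntax; the function name split_brain_recovery_cmds is hypothetical, and the survivor step is only needed if that node has also dropped to a standalone connection state.

```shell
# Print (do not run) the commands for manual split-brain recovery, DRBD 8.4 syntax assumed.
# role is "victim" (its changes are discarded) or "survivor" (its data is kept).
split_brain_recovery_cmds() {
    role=$1
    resource=$2
    case $role in
        victim)
            echo "drbdadm disconnect $resource"
            echo "drbdadm secondary $resource"
            echo "drbdadm connect --discard-my-data $resource"
            ;;
        survivor)
            echo "drbdadm connect $resource"
            ;;
        *)
            echo "usage: split_brain_recovery_cmds victim|survivor <resource>" >&2
            return 1
            ;;
    esac
}

# Example: review the victim-side commands for a resource named r0 (an assumed name):
#   split_brain_recovery_cmds victim r0
```

Printing rather than executing is deliberate: choosing the wrong victim discards that node's changes.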
A brief sample configuration:
resource data {
    protocol C;
    handlers {
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }
    startup {
        wfc-timeout 0;
        degr-wfc-timeout 120;
    }
    disk {
        on-io-error detach;
    }
    net {
        cram-hmac-alg sha1;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        max-buffers 8000;
        max-epoch-size 8000;
        sndbuf-size 0;
    }
    syncer {
        rate 90M;
        al-extents 257;
    }
    on BGP-LF-1MS2232 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 192.168.1.104:7788;
        meta-disk internal;
    }
    on BGP-LF-1MS2233 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 192.168.1.105:7788;
        meta-disk internal;
    }
}