DRBD installation configuration, operating principle and failure recovery

Last Update:2016-03-30 Source: Internet

Author: User

Tags failover


I. Introduction to DRBD

DRBD stands for Distributed Replicated Block Device. It consists of a kernel module and related scripts, and is used to build high-availability clusters. It works by mirroring an entire block device over the network, so you can think of it as network RAID. It lets the user keep a real-time mirror of a local block device on a remote machine.

II. How does DRBD work?

One node (the DRBD primary) receives the data, writes it to its local disk, and sends it to the other host (the DRBD secondary). The other host then saves the data to its own disk. In the default single-primary mode, DRBD allows read and write access on only one node at a time, which is sufficient for the usual failover scenario of a high-availability cluster. DRBD 8.0 and later also provide a dual-primary mode in which both nodes can read and write, but it requires a cluster file system such as GFS2 or OCFS2.
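The write path described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not DRBD code: the `Node` class, `make_pair`, and `failover` are hypothetical names invented for the example, a Python dict stands in for each node's disk, and the network transfer is reduced to a direct method call.

```python
class Node:
    """Toy stand-in for one DRBD node: a role plus a dict acting as its disk."""

    def __init__(self, name, role):
        self.name = name
        self.role = role      # "primary" or "secondary"
        self.disk = {}        # block number -> data
        self.peer = None

    def write(self, block, data):
        # Only the primary accepts writes from applications.
        if self.role != "primary":
            raise PermissionError(f"{self.name} is secondary; write refused")
        self.disk[block] = data             # 1. write to the local disk
        self.peer._replicate(block, data)   # 2. send the same block to the peer

    def _replicate(self, block, data):
        # The secondary simply stores whatever the primary sends.
        self.disk[block] = data


def make_pair():
    """Create two nodes that know each other, one primary and one secondary."""
    a, b = Node("node-a", "primary"), Node("node-b", "secondary")
    a.peer, b.peer = b, a
    return a, b


def failover(old_primary, new_primary):
    """Swap roles, as a cluster manager would do after the primary fails."""
    old_primary.role, new_primary.role = "secondary", "primary"


if __name__ == "__main__":
    a, b = make_pair()
    a.write(0, b"hello")
    print(b.disk)            # the secondary holds an identical copy
    failover(a, b)
    b.write(1, b"world")     # the former secondary now accepts writes
    print(a.disk == b.disk)
```

Because every write lands on both disks, the secondary already holds the data when it is promoted, which is why a role swap is all that failover needs in this model.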

III. The relationship between DRBD and HA

A DRBD system consists of two nodes, a primary and a standby, much like an HA cluster. Applications and the operating system run on the node that holds the primary role and access the DRBD device (/dev/drbd*) there. Data written on the primary node passes through the DRBD device to the primary node's disk and is automatically sent to the DRBD device on the standby node, which simply writes it to the standby node's disk.

Most high-availability clusters today use shared storage. DRBD can take the place of a shared storage device without much investment in hardware: because it runs over an ordinary TCP/IP network, it costs far less than a dedicated storage network, and its performance and stability are good.

IV. DRBD replication modes

Protocol A:

Asynchronous replication protocol. A write is considered complete once the local disk write has finished and the replication packet has been placed in the local send queue. If the primary node fails, data loss can occur, because data destined for the remote node may still be sitting in the send queue. The data on the failover node is consistent, but it may not be up to date. This protocol is typically used for geographically separated nodes.

Protocol B:

Memory-synchronous (semi-synchronous) replication protocol. A write is considered complete on the primary node once the local disk write has finished and the replication packet has reached the peer node. Data loss can occur if both nodes fail at the same time, because data that has reached the peer may not yet have been committed to its disk.

Protocol C:

Synchronous replication protocol. A write is considered complete only after the disks on both the local and the remote node have confirmed it. The failure of a single node therefore causes no data loss, which makes this the most popular mode for cluster nodes, but I/O throughput depends on network bandwidth and latency.

Protocol C is the usual choice, but every write then has to wait for a network round trip, so network latency adds directly to write latency. Because the protocol determines how much data can be lost, choose it carefully for a production environment.
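The difference between the three protocols comes down to how far a write must travel before the primary reports it complete. The following is a minimal model of that rule, not DRBD's implementation: the `Replica` class, its three lists, and the `write` and `surviving_on_peer` functions are hypothetical names chosen for illustration, and the local disk write is assumed to have succeeded already.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    send_queue: list = field(default_factory=list)   # still on the primary, not yet sent
    peer_memory: list = field(default_factory=list)  # received by the peer, not yet on its disk
    peer_disk: list = field(default_factory=list)    # durable on the peer's disk


def write(replica, protocol, data):
    """Move `data` forward only as far as `protocol` requires before completion."""
    replica.send_queue.append(data)
    if protocol == "A":          # complete once queued for sending
        return
    replica.send_queue.remove(data)
    replica.peer_memory.append(data)
    if protocol == "B":          # complete once the peer has received it
        return
    replica.peer_memory.remove(data)
    replica.peer_disk.append(data)   # protocol C: complete once on the peer's disk


def surviving_on_peer(replica, both_nodes_failed):
    """Completed writes the peer still has after the primary fails."""
    if both_nodes_failed:
        # The peer lost its memory too; only what reached its disk remains.
        return list(replica.peer_disk)
    # The peer is still running and can flush its memory to disk.
    return replica.peer_memory + replica.peer_disk


if __name__ == "__main__":
    for proto in "ABC":
        r = Replica()
        write(r, proto, "block-1")
        print(proto,
              "primary fails:", surviving_on_peer(r, False),
              "both fail:", surviving_on_peer(r, True))
```

In this model a completed write is lost under protocol A whenever the primary fails, under protocol B only when both nodes fail together, and under protocol C in neither case, which matches the descriptions above.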

V. Working principle diagram of DRBD

DRBD is a distributed storage system that sits in the storage layer of the Linux kernel. It can be used to share block devices, file systems, and data between two Linux servers, and its function is similar to a network RAID-1.

[Figure: DRBD working principle diagram]

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
