Sheepdog is an open-source project developed by three Japanese researchers at NTT. It is mainly used to provide block devices for virtual machines.
The sections below introduce its architecture, modules, and other aspects.
I. Architecture Diagram
It uses a fully symmetric architecture with no central node, so there is no single point of failure, and storage capacity and performance scale linearly;
Nodes can be added through simple configuration (IP:port), and data is load-balanced automatically;
When a node fails, its data is recovered automatically;
It directly supports QEMU/KVM;
II. Modules
Corosync handles cluster membership management and message delivery;
QEMU acts as the Sheepdog client, with NBD/iSCSI protocol support;
The gateway performs DHT routing of data, and the storage server stores the data locally;
III. Detailed Data Storage
VM data is stored as VDI objects, and a block device is exposed to users;
There are four types of objects: VDI objects, data objects, attribute objects, and VM live-state objects used for snapshots;
Object storage is implemented with small 4 MB files, but it is easy to extend on this basis, for example by using a library in place of the 4 MB files;
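Because the virtual disk is cut into fixed 4 MB objects, a byte offset on the block device maps to an object index plus an offset inside that object. The following is a minimal sketch of that arithmetic only; the function names `locate` and `split_request` are hypothetical and are not taken from the Sheepdog source.

```python
from typing import List, Tuple

OBJECT_SIZE = 4 * 1024 * 1024  # 4 MB per object, as stated in the text


def locate(offset: int) -> Tuple[int, int]:
    """Map a byte offset on the virtual disk to (object index, offset inside the object)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, OBJECT_SIZE)


def split_request(offset: int, length: int) -> List[Tuple[int, int, int]]:
    """Split a byte range into per-object pieces: (object index, inner offset, length)."""
    pieces = []
    while length > 0:
        index, inner = locate(offset)
        # Never cross an object boundary within a single piece.
        chunk = min(length, OBJECT_SIZE - inner)
        pieces.append((index, inner, chunk))
        offset += chunk
        length -= chunk
    return pieces
```

A request that straddles an object boundary is split into two pieces, one per object, so each piece can be routed to the nodes that hold that object.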
IV. Cluster Management
1. Cluster communication uses corosync, an open-source implementation of the Totem protocol. Totem is mainly used to manage cluster membership and to deliver messages reliably and in order.
2. corosync exposes its services through the CPG API.
First, bind an FD to cpg_handle and register cpg_dispatch as its callback function;
Then, add the FD to epoll;
A message arriving from corosync makes the FD readable, and epoll then triggers the callback function cpg_dispatch;
cpg_dispatch invokes two main callbacks, cpg_deliver_fn and cpg_confchg_fn, which correspond to sd_deliver and sd_confchg respectively.
sd_deliver handles messages delivered by corosync to the local node, mainly for VDI operations, while sd_confchg handles node operations by monitoring changes in cluster membership.
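The event flow above (FD registered with epoll, readable FD triggers a dispatch function, dispatch calls one of two callbacks) can be illustrated without corosync itself. The sketch below is a simulation under stated assumptions: a local socket pair stands in for the corosync FD, the text-based message format is invented for the example, and `dispatch` only plays the role of cpg_dispatch. It does not use or reproduce the real corosync API.

```python
import selectors
import socket

# Hypothetical message tags; real corosync messages are not plain text lines.
DELIVER, CONFCHG = "deliver", "confchg"

handled = []


def sd_deliver(payload):
    # Stand-in for the handler of ordinary cluster messages (for example VDI operations).
    handled.append((DELIVER, payload))


def sd_confchg(payload):
    # Stand-in for the handler of membership changes (nodes joining or leaving).
    handled.append((CONFCHG, payload))


def dispatch(sock):
    # Plays the role of cpg_dispatch: read pending messages and call the matching callback.
    data = sock.recv(4096).decode()
    for line in data.splitlines():
        kind, _, payload = line.partition(" ")
        if kind == DELIVER:
            sd_deliver(payload)
        elif kind == CONFCHG:
            sd_confchg(payload)


def run_once(selector, timeout=1.0):
    # One iteration of the event loop: wait for readable FDs, then run their callbacks.
    for key, _events in selector.select(timeout):
        callback = key.data
        callback(key.fileobj)


# A socket pair simulates the cluster: cluster_side writes, local_fd is the FD being watched.
cluster_side, local_fd = socket.socketpair()
local_fd.setblocking(False)
selector = selectors.DefaultSelector()  # selects epoll on Linux when it is available
selector.register(local_fd, selectors.EVENT_READ, dispatch)

cluster_side.sendall(b"confchg join 10.0.0.2:7000\ndeliver vdi-create disk0\n")
run_once(selector)
print(handled)

selector.unregister(local_fd)
cluster_side.close()
local_fd.close()
```

The point of the design is that the daemon never blocks on the cluster library: it only calls the dispatch function when the event loop reports that the FD has data.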
V. Storage Object Management
The cluster keeps a version number for objects, called the epoch;
Under the obj folder, a new folder is created for each new epoch;
Data can be recovered from an epoch;
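The per-epoch folder scheme can be sketched as follows. The directory layout (`<store_root>/obj/<zero-padded epoch>/`) and the helper names are assumptions made for illustration; the text only states that each new epoch gets its own folder under the obj folder.

```python
import os
import tempfile


def epoch_dir(store_root, epoch):
    # Assumed layout: <store_root>/obj/<zero-padded epoch>/
    return os.path.join(store_root, "obj", "%08d" % epoch)


def start_epoch(store_root, epoch):
    """Create the folder that holds the objects of the given epoch."""
    path = epoch_dir(store_root, epoch)
    os.makedirs(path, exist_ok=True)
    return path


def latest_epoch(store_root):
    """Return the highest epoch found on disk, or 0 if there is none."""
    obj_root = os.path.join(store_root, "obj")
    if not os.path.isdir(obj_root):
        return 0
    epochs = [int(name) for name in os.listdir(obj_root) if name.isdigit()]
    return max(epochs, default=0)


store_root = tempfile.mkdtemp()
start_epoch(store_root, 1)  # initial cluster membership
start_epoch(store_root, 2)  # membership changed, so a new epoch begins
print(latest_epoch(store_root))
```

Keeping older epoch folders around is what makes recovery from a previous epoch possible in this sketch.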
VI. Consistency Model
Consistency is guaranteed by the epoch mechanism;
Data operations are strongly consistent: a write returns to the client only after all replicas have been written successfully;
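The write rule can be sketched as below: success is reported to the client only when every copy has been stored. The `Replica` class is an in-memory, hypothetical stand-in for a storage node, and the sketch deliberately leaves out what a real system must do after a partial failure (such as recovery or retry).

```python
class Replica:
    """In-memory stand-in for one storage node (hypothetical)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.objects = {}

    def write(self, oid, data):
        if not self.healthy:
            raise IOError("replica %s is unreachable" % self.name)
        self.objects[oid] = data


def replicated_write(replicas, oid, data):
    """Return True only if every replica stored the data, otherwise False."""
    for replica in replicas:
        try:
            replica.write(oid, data)
        except IOError:
            # The client is never told "success" unless all copies were written.
            return False
    return True
```

With three healthy replicas the call returns True; if any one of them is unreachable it returns False, so the client never sees an acknowledged write that exists on only some of the copies.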
VII. DHT Routing
Routing is done by a proxy;
Node IDs are generated from IP:port and used for consistent hashing;
VIII. Replica Placement
Consistent hashing;
Virtual nodes;
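A minimal sketch of consistent hashing with virtual nodes, covering both the routing and the placement points above. Each node is identified by its IP:port string, placed on the ring many times, and the replicas of an object are the first distinct nodes found clockwise from the object's hash. The hash function (SHA-1 truncated to 64 bits), the number of virtual nodes, and the class and method names are all assumptions for illustration, not Sheepdog's actual choices.

```python
import bisect
import hashlib


def hash64(key):
    # Any stable hash works for the sketch; SHA-1 truncated to 64 bits is an assumption.
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, nodes, vnodes_per_node=64):
        # Each physical node ("ip:port") appears on the ring many times as virtual nodes.
        self.points = sorted(
            (hash64("%s#%d" % (node, i)), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self.hashes = [h for h, _ in self.points]
        self.node_count = len(set(nodes))

    def replicas(self, oid, copies):
        """Walk clockwise from the object's hash and collect distinct physical nodes."""
        copies = min(copies, self.node_count)
        if copies <= 0:
            return []
        start = bisect.bisect_right(self.hashes, hash64(str(oid)))
        chosen = []
        for step in range(len(self.points)):
            node = self.points[(start + step) % len(self.points)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == copies:
                    break
        return chosen


nodes = ["10.0.0.1:7000", "10.0.0.2:7000", "10.0.0.3:7000", "10.0.0.4:7000"]
ring = Ring(nodes)
print(ring.replicas(0x1234, copies=3))
```

Virtual nodes spread each physical node over many ring positions, which evens out the load, and adding or removing a node only moves the objects whose clockwise walk passes through that node's positions.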
For more detailed information, visit the official website: http://www.osrg.net/sheepdog/