Distributed Storage System sheepdog

Source: Internet
Author: User
Tags: epoll

Sheepdog is an open-source project developed by three Japanese researchers at NTT. It is mainly used to provide block devices for virtual machines.

The sections below cover its architecture, modules, and other aspects:


I. Architecture Diagram

Key characteristics:

Sheepdog uses a fully symmetric architecture with no central node. It has no single point of failure, and its storage capacity and performance scale linearly;

Nodes can be added through simple configuration (IP:port), and data is rebalanced automatically;

When a node fails, its data is recovered automatically;

QEMU/KVM is supported directly;


II. Modules


The main modules are:

corosync handles cluster membership management and message delivery;

QEMU serves as the Sheepdog client, and NBD/iSCSI protocol support is provided;

The gateway implements DHT routing of data, and the storage server stores the data locally;


III. Detailed Data Storage


Key points:

VM data is stored as VDI objects, and a block device is exposed to users;

There are four object types: the VDI object, the data object, the attribute object, and the VM state object, which holds live VM state for snapshots;

Object storage is implemented with small 4 MB files, but it is easy to extend on this basis. For example, a library could replace the 4 MB files;
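To make the 4 MB object layout concrete, here is a minimal sketch of how a byte range on a virtual disk could be split across fixed-size objects. The function name `split_range` and the tuple layout are hypothetical, chosen for illustration; only the 4 MB object size comes from the text above.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4 MB per object, as stated in the text


def split_range(offset, length):
    """Split a byte range of a virtual disk into per-object pieces.

    Returns a list of (object_index, offset_in_object, piece_length).
    """
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // OBJECT_SIZE           # which 4 MB object
        within = offset % OBJECT_SIZE           # where inside that object
        piece = min(OBJECT_SIZE - within, end - offset)
        pieces.append((index, within, piece))
        offset += piece
    return pieces


# A 300-byte write that starts 100 bytes before an object boundary
# touches two objects.
pieces = split_range(OBJECT_SIZE - 100, 300)
```

Each piece can then be read from or written to its own object independently, which is what allows the objects to be spread across nodes.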


IV. Cluster Management

1. Cluster communication uses corosync, an open-source implementation of the Totem protocol. Totem is mainly used to manage cluster membership and to deliver messages reliably and in order.

2. corosync provides its services through the CPG API.

First, obtain the FD bound to cpg_handle and register a callback for it that calls cpg_dispatch;

Then, add the FD to epoll;

A message on corosync makes the FD readable, and epoll then triggers the callback, which calls cpg_dispatch;
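The three steps above can be sketched as a small event loop. This is only an illustration under stated assumptions: a pipe stands in for the corosync FD, the `dispatch` function stands in for cpg_dispatch, and Python's `selectors` module is used instead of calling epoll directly (on Linux its default selector is backed by epoll). None of these names come from Sheepdog itself.

```python
import os
import selectors


def run_once(selector, timeout=1.0):
    """Wait for ready FDs and invoke the callback registered with each."""
    handled = 0
    for key, _events in selector.select(timeout):
        callback = key.data        # the callback stored at registration time
        callback(key.fileobj)      # fileobj is the FD that became readable
        handled += 1
    return handled


received = []


def dispatch(fd):
    # Stand-in for cpg_dispatch: drain the FD and record the message.
    received.append(os.read(fd, 4096))


read_fd, write_fd = os.pipe()            # the pipe stands in for the corosync FD
selector = selectors.DefaultSelector()   # epoll-backed on Linux
selector.register(read_fd, selectors.EVENT_READ, dispatch)

os.write(write_fd, b"join")              # simulate a cluster message arriving
run_once(selector)

selector.close()
os.close(read_fd)
os.close(write_fd)
```

The point of the design is that the main loop never polls corosync itself. It only waits on FDs, and the callback attached to the FD decides what to do when data arrives.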


There are two main callbacks, cpg_deliver_fn and cpg_confchg_fn, which correspond to sd_deliver and sd_confchg respectively.

sd_deliver handles messages delivered from corosync to the local node, mainly for VDI operations, while sd_confchg handles node operations by monitoring cluster membership changes.
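As a rough illustration of this split between the two callbacks, the sketch below routes events to one of two handlers. The `Node` class, the dictionary event format, and the `dispatch` function are hypothetical; only the handler names sd_deliver and sd_confchg come from the text, and the real functions take different arguments.

```python
class Node:
    """Toy cluster member holding the two handlers named in the text."""

    def __init__(self):
        self.members = set()
        self.vdi_ops = []

    def sd_deliver(self, sender, message):
        # Ordered message from the cluster, for example a VDI operation.
        self.vdi_ops.append((sender, message))

    def sd_confchg(self, joined, left):
        # Membership change: track who is currently in the cluster.
        self.members |= set(joined)
        self.members -= set(left)


def dispatch(node, event):
    """Route one event to the matching handler (hypothetical event format)."""
    kind = event["type"]
    if kind == "deliver":
        node.sd_deliver(event["sender"], event["message"])
    elif kind == "confchg":
        node.sd_confchg(event.get("joined", []), event.get("left", []))
    else:
        raise ValueError(f"unknown event type: {kind}")


node = Node()
events = [
    {"type": "confchg", "joined": ["10.0.0.1:7000", "10.0.0.2:7000"]},
    {"type": "deliver", "sender": "10.0.0.1:7000", "message": "create vdi"},
    {"type": "confchg", "left": ["10.0.0.2:7000"]},
]
for event in events:
    dispatch(node, event)
```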


V. Storage Object Management

Cluster objects are versioned by an epoch number;

In the obj directory, a new directory is created for each new epoch;

Data can be recovered from earlier epochs;
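A minimal sketch of the per-epoch directory idea follows. The directory naming (`obj/<zero-padded epoch>/`), the helper names, and the local-only recovery are assumptions made for illustration; a real cluster would also fetch objects from other nodes.

```python
import os
import shutil
import tempfile


def epoch_dir(store, epoch):
    # Assumed layout: <store>/obj/<zero-padded epoch>/
    return os.path.join(store, "obj", f"{epoch:08d}")


def start_epoch(store, epoch):
    """Create the directory for a new epoch and return its path."""
    path = epoch_dir(store, epoch)
    os.makedirs(path, exist_ok=True)
    return path


def recover_object(store, name, epoch):
    """Copy an object into `epoch` from the latest earlier epoch that has it.

    Returns the epoch it was recovered from, or None if no copy exists.
    """
    target = os.path.join(epoch_dir(store, epoch), name)
    for old in range(epoch - 1, 0, -1):
        source = os.path.join(epoch_dir(store, old), name)
        if os.path.exists(source):
            shutil.copyfile(source, target)
            return old
    return None


with tempfile.TemporaryDirectory() as store:
    first = start_epoch(store, 1)
    with open(os.path.join(first, "object-a"), "wb") as f:
        f.write(b"payload")
    start_epoch(store, 2)              # membership changed, nothing written
    third = start_epoch(store, 3)      # membership changed again
    recovered_from = recover_object(store, "object-a", 3)
    with open(os.path.join(third, "object-a"), "rb") as f:
        recovered_data = f.read()
    missing = recover_object(store, "object-b", 3)
```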


VI. Consistency Model

Guaranteed by the epoch mechanism;

Strong consistency is enforced on writes (the client gets a response only after all replicas have been written successfully);
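The write rule can be sketched as follows. This is an in-memory illustration, not Sheepdog's code: the `Replica` class and `replicated_write` are hypothetical names, and the sketch does not roll back replicas that were already written when a later one fails, which a real system has to handle through retry or recovery.

```python
class Replica:
    """Toy replica that either stores the object or fails."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.objects = {}

    def write(self, oid, data):
        if not self.healthy:
            raise IOError("replica unavailable")
        self.objects[oid] = data


def replicated_write(replicas, oid, data):
    """Report success to the client only if every replica accepted the write."""
    for replica in replicas:
        try:
            replica.write(oid, data)
        except IOError:
            return False
    return True


healthy = [Replica(), Replica(), Replica()]
ok = replicated_write(healthy, "obj-1", b"data")

degraded = [Replica(), Replica(healthy=False), Replica()]
failed = replicated_write(degraded, "obj-1", b"data")
```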


VII. DHT Routing

Proxy-based routing;

Node IDs are generated from IP:port for consistent hashing;
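As a sketch of this idea, the code below hashes each IP:port string to a node ID, sorts the IDs into a ring, and routes an object to the first node at or after the object's own hash. The hash function (SHA-1 truncated to 64 bits) and all function names are assumptions for illustration and are not taken from Sheepdog.

```python
import bisect
import hashlib


def hash64(text):
    """Map a string to a 64-bit integer (SHA-1 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha1(text.encode()).digest()[:8], "big")


def build_ring(addresses):
    """Return the ring as a sorted list of (node_id, address)."""
    return sorted((hash64(addr), addr) for addr in addresses)


def locate(ring, object_id):
    """Find the first node clockwise from the object's hash."""
    ids = [node_id for node_id, _ in ring]
    i = bisect.bisect_left(ids, hash64(object_id)) % len(ring)  # wrap around
    return ring[i][1]


nodes = ["10.0.0.1:7000", "10.0.0.2:7000", "10.0.0.3:7000"]
ring = build_ring(nodes)
owner = locate(ring, "vdi-42/object-7")
```

The useful property is that adding a node moves only the objects that now fall to the new node, while every other object keeps its owner.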


VIII. Replica Placement

Consistent hashing;

Virtual nodes;
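One common way to combine these two ideas is sketched below: each physical node is placed on the ring at several virtual points, and replicas are chosen by walking clockwise from the object's hash until enough distinct physical nodes are found. The number of virtual nodes, the `addr#i` naming of virtual points, the hash function, and the function names are all assumptions for illustration.

```python
import bisect
import hashlib


def hash64(text):
    """Map a string to a 64-bit integer (SHA-1 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha1(text.encode()).digest()[:8], "big")


def build_ring(addresses, vnodes=64):
    """Place `vnodes` points per physical node on the ring."""
    return sorted(
        (hash64(f"{addr}#{i}"), addr)
        for addr in addresses
        for i in range(vnodes)
    )


def place_replicas(ring, object_id, copies):
    """Walk clockwise from the object's hash, collecting distinct nodes."""
    points = [point for point, _ in ring]
    start = bisect.bisect_left(points, hash64(object_id))
    chosen = []
    for step in range(len(ring)):
        addr = ring[(start + step) % len(ring)][1]
        if addr not in chosen:          # skip other vnodes of the same node
            chosen.append(addr)
            if len(chosen) == copies:
                break
    return chosen


nodes = [f"10.0.0.{n}:7000" for n in range(1, 6)]
ring = build_ring(nodes)
replicas = place_replicas(ring, "vdi-42/object-7", copies=3)
```

Virtual nodes spread each physical node's share of the ring over many small segments, so the load evens out and the data of a failed node is taken over by several neighbours rather than one.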


For more detailed information, visit the official website: http://www.osrg.net/sheepdog/
