Things About Cloud Storage (2): The CRUSH Data Distribution Algorithm

Source: Internet
Author: User

In a distributed system, data ultimately lands on physical devices. In Ceph, the abstraction for the underlying storage device is the OSD. So how is it decided which OSD a piece of data is stored on? The answer is the CRUSH algorithm.

Keywords: CRUSH, consistent hashing, Ceph data distribution

In general, data placement is divided into two phases.

1. First, the data item x is hashed with an ordinary hash function, and the hash value is taken modulo the number of PGs to determine which PG the data belongs to.

The PG (placement group) is a virtual storage unit within a Ceph pool; it is a further layer of abstraction on top of the OSDs.

Why add this layer? It is analogous to the role of LVM: it makes all resources "less rigid", so that when a failure occurs the system can adjust quickly with minimal impact on the cluster as a whole.

With PGs in place, adding or removing nodes becomes an adjustment at the logical level; the PG layer encapsulates the physical-layer OSDs.
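The first phase can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD5 is a stand-in for the hash function (it is not claimed to be the one Ceph uses), and `object_to_pg`, `build_pg_table`, and the round-robin PG-to-OSD table are hypothetical names invented here to show the indirection, not Ceph's actual placement logic.

```python
import hashlib


def object_to_pg(object_name: str, pg_count: int) -> int:
    """Phase 1: hash the object name, then take it modulo the PG count."""
    digest = hashlib.md5(object_name.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:4], "big")  # use the first 32 bits
    return hash_value % pg_count


def build_pg_table(pg_count: int, osd_ids: list) -> dict:
    """Hypothetical stand-in for phase 2: assign PGs to OSDs round-robin."""
    return {pg: osd_ids[pg % len(osd_ids)] for pg in range(pg_count)}


PG_COUNT = 8
objects = ["obj-a", "obj-b", "obj-c"]

# The object -> PG mapping depends only on the object name and the PG count.
pg_of = {name: object_to_pg(name, PG_COUNT) for name in objects}

# Adding an OSD changes only the PG -> OSD table, not the object -> PG mapping.
table_before = build_pg_table(PG_COUNT, [0, 1, 2])
table_after = build_pg_table(PG_COUNT, [0, 1, 2, 3])

for name in objects:
    pg = pg_of[name]
    print(name, "-> pg", pg, "-> osd", table_before[pg], "then osd", table_after[pg])
```

Because objects are bound to PGs rather than directly to OSDs, a change in the set of OSDs only requires reassigning PGs; no object has to be rehashed.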

2. Once the PG is determined, it is mapped to specific OSD nodes using the global topology in the cluster map held by the Monitor nodes, together with the data placement rules (such as the number of replicas).

Note that more than one OSD may be selected, and the algorithm guarantees that the selected OSDs are not in the same fault domain.

The fault domain can be chosen at different granularities, from smallest to largest: host, rack, row, root.
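The second phase can be illustrated with a much simplified sketch. It is not the real CRUSH algorithm: it uses a plain highest-score selection (in the spirit of rendezvous hashing) over a flat, hypothetical cluster map in which the fault domain is the host. `CLUSTER_MAP`, `_score`, and `pg_to_osds` are names assumed here for illustration only.

```python
import hashlib

# Hypothetical cluster map: fault domain (host) -> OSD ids located on that host.
CLUSTER_MAP = {
    "host-a": [0, 1],
    "host-b": [2, 3],
    "host-c": [4, 5],
}


def _score(pg_id: int, item: str) -> int:
    """Deterministic pseudo-random score for a (PG, item) pair."""
    data = f"{pg_id}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def pg_to_osds(pg_id: int, cluster_map: dict, replicas: int) -> list:
    """Pick `replicas` OSDs for a PG, each from a different host."""
    if replicas > len(cluster_map):
        raise ValueError("not enough fault domains for the requested replica count")
    # Rank the hosts by score for this PG and keep the top `replicas` hosts,
    # so no two replicas can share a fault domain.
    hosts = sorted(cluster_map, key=lambda h: _score(pg_id, h), reverse=True)
    chosen_hosts = hosts[:replicas]
    # Within each chosen host, pick the OSD with the highest score.
    return [
        max(cluster_map[h], key=lambda osd: _score(pg_id, f"osd.{osd}"))
        for h in chosen_hosts
    ]


for pg in range(4):
    print("pg", pg, "->", pg_to_osds(pg, CLUSTER_MAP, replicas=2))
```

The mapping is computed from the PG id and the cluster map alone, so any party holding the same map arrives at the same OSDs without consulting a central lookup table. Choosing a larger fault domain such as rack or row would mean grouping the OSDs by that level instead of by host.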

The following figure summarizes the process:

[Figure: data distribution with the CRUSH algorithm]
