In a distributed system, data is ultimately stored on physical devices, and in Ceph the abstraction for the underlying device is the OSD. So how is it decided which OSD a given piece of data lands on? The answer is the CRUSH algorithm.
Keywords: CRUSH, consistent hashing, Ceph data distribution
In general, placing data happens in two phases:
1. First, a general-purpose hash algorithm is applied to the data x, and the hash value is taken modulo the number of PGs, which selects a PG.
The PG (placement group) is a virtual storage unit inside a Ceph pool; it is a further layer of abstraction on top of the OSDs.
Why add this layer? It is analogous to the role of LVM: it makes the underlying resources "not so hard",
so that when a failure occurs the system can adjust quickly, with minimal impact on the system as a whole.
With PGs, adding or removing nodes becomes an adjustment at the logical level; the PG encapsulates the physical-layer OSDs.
2. Once the PG is determined, it is mapped to specific OSDs according to the global topology in the cluster map held by the Monitor nodes, combined with the placement rule (for example, the number of replicas).
Note that more than one OSD may be selected, and the algorithm guarantees that the selected OSDs are not in the same failure domain.
The failure domain can be, from small to large: host, rack, row, root.
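The two phases above can be sketched in a few lines of Python. This is only a toy illustration, not Ceph's implementation: the topology, the function names, and the use of SHA-256 are all assumptions made for the example, and phase 2 uses simple rendezvous-style ranking as a stand-in for CRUSH's real bucket selection. The failure domain is assumed to be the host.

```python
import hashlib

# Hypothetical toy topology (host -> OSD ids); a real cluster map is richer.
TOPOLOGY = {
    "host-a": [0, 1],
    "host-b": [2, 3],
    "host-c": [4, 5],
}


def stable_hash(value: str) -> int:
    """Deterministic 32-bit hash. Python's built-in hash() is salted per
    process, so it is not suitable here. Ceph uses its own hash function."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:4], "big")


def object_to_pg(object_name: str, pg_num: int) -> int:
    """Phase 1: hash the object name, then take it modulo the PG count."""
    return stable_hash(object_name) % pg_num


def pg_to_osds(pg_id: int, topology: dict, replicas: int) -> list:
    """Phase 2 (simplified stand-in for CRUSH): choose `replicas` OSDs,
    each on a different host, so no two copies share a failure domain."""
    if replicas > len(topology):
        raise ValueError("not enough hosts for the requested replica count")
    # Rank hosts by a hash of (pg, host) and keep the top `replicas`.
    hosts = sorted(
        topology,
        key=lambda host: stable_hash(f"{pg_id}:{host}"),
        reverse=True,
    )[:replicas]
    # Inside each chosen host, keep the OSD with the highest (pg, osd) hash.
    return [
        max(topology[host], key=lambda osd: stable_hash(f"{pg_id}:osd.{osd}"))
        for host in hosts
    ]


def locate(object_name: str, pg_num: int = 128, replicas: int = 3):
    """Run both phases: object -> PG -> list of OSDs."""
    pg_id = object_to_pg(object_name, pg_num)
    return pg_id, pg_to_osds(pg_id, TOPOLOGY, replicas)


if __name__ == "__main__":
    pg_id, osds = locate("my-object")
    print(f"object 'my-object' -> pg {pg_id} -> osds {osds}")
```

Because the placement is computed purely from the object name and the topology, any client holding the same map arrives at the same OSDs without asking a central lookup table.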
The following picture summarizes the process:
[Figure: Cloud storage (2), the CRUSH data distribution algorithm]