KeyValueStore is another storage engine that Ceph supports (the first is FileStore). I proposed it at the Emperor design summit under the blueprint "Add LevelDB support to Ceph cluster backend store" and implemented a prototype. The integration with ObjectStore was completed in the Firefly release, and the code is now merged into Ceph's master branch.
KeyValueStore is a lightweight implementation compared with FileStore. Its goal is to serve different Ceph scenarios by exploiting the capabilities of different backends. Because the current default backend is LevelDB, it is expected to deliver high write performance.
Main Data Structures
[Figure: KeyValueStore]
GenericObjectMap provides a generic key-value space for objects, and StripObjectMap inherits from GenericObjectMap to encapsulate object data. ObjectStore deals with three kinds of data: data, attr and omap. The latter two are plain key-value pairs, so they can be implemented directly with the native interface of GenericObjectMap. The data interface, however, is POSIX-like and supports writes at arbitrary offsets. It is therefore not appropriate to store the whole data of an object as a single key-value pair. Instead, the data must be striped: the data of an object is divided into multiple key-value pairs of a fixed width, and this is the job of StripObjectMap.
Finally, the KeyValueStore class uses StripObjectMap to implement the methods of ObjectStore.
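The striping idea described above can be sketched as follows. This is only a minimal illustration, not the StripObjectMap code: the StripedObject type, its members and the use of std::map as a stand-in for the key-value backend are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of one object whose data is split into fixed-width strips.
// Each map entry (strip index -> strip bytes) stands in for one key-value pair.
struct StripedObject {
  explicit StripedObject(uint64_t width) : strip_width(width) {}

  // Write `data` at byte `offset`; only the strips overlapping the range change.
  void write(uint64_t offset, const std::string& data) {
    uint64_t pos = 0;
    while (pos < data.size()) {
      uint64_t abs = offset + pos;
      uint64_t idx = abs / strip_width;       // which strip (which key)
      uint64_t in_strip = abs % strip_width;  // offset inside that strip
      uint64_t n = std::min<uint64_t>(strip_width - in_strip, data.size() - pos);
      std::string& strip = strips[idx];
      if (strip.size() < in_strip + n)
        strip.resize(in_strip + n, '\0');     // zero-fill a gap inside the strip
      strip.replace(in_strip, n, data, pos, n);
      pos += n;
    }
    size = std::max<uint64_t>(size, offset + data.size());
  }

  // Read up to `len` bytes starting at `offset`; unwritten ranges read as zeros.
  std::string read(uint64_t offset, uint64_t len) const {
    if (offset >= size) return std::string();
    len = std::min<uint64_t>(len, size - offset);
    std::string out(len, '\0');
    uint64_t pos = 0;
    while (pos < len) {
      uint64_t abs = offset + pos;
      uint64_t idx = abs / strip_width;
      uint64_t in_strip = abs % strip_width;
      uint64_t n = std::min<uint64_t>(strip_width - in_strip, len - pos);
      std::map<uint64_t, std::string>::const_iterator it = strips.find(idx);
      if (it != strips.end() && in_strip < it->second.size()) {
        uint64_t avail = std::min<uint64_t>(n, it->second.size() - in_strip);
        out.replace(pos, avail, it->second, in_strip, avail);
      }
      pos += n;
    }
    return out;
  }

  uint64_t strip_width;
  uint64_t size = 0;
  std::map<uint64_t, std::string> strips;
};
```

With a strip width of 4, writing 6 bytes at offset 2 touches exactly two keys (strips 0 and 1), and a later 2-byte overwrite at offset 3 rewrites only the affected bytes of those strips instead of the whole object.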
Primary IO Path
Similar to FileStore, KeyValueStore creates a request queue. All IO requests from the upper-layer PGs are placed in this queue, and multiple KeyValueStore threads act as consumers, taking requests from it for processing. Because PGs are naturally isolated from one another, KeyValueStore currently uses the PG as its unit of isolation: at any moment only one thread processes requests for a given PG. The thread creates a buffer space for each request. A request may contain several atomic operations that together form a transaction, and to guarantee the atomicity and isolation of that transaction nothing may be written to the persistence layer at an intermediate stage; only the sequence of operations is generated. Any side effects of these operations are kept in the buffer space, where they serve as context for subsequent operations. Finally, the KeyValueStore thread submits the request, which completes the transaction.
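The role of the per-request buffer space can be illustrated with a small sketch. It is an assumption-laden model rather than the actual BufferTransaction code: the names KVBackend, StagedValue and StagedTransaction are hypothetical, and std::map stands in for the persistent key-value backend.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the persistent backend (LevelDB in the article).
typedef std::map<std::string, std::string> KVBackend;

// One buffered change: either a new value or a removal marker.
struct StagedValue {
  bool removed;
  std::string value;
};

// Hypothetical per-request buffer. Operations are only staged in memory;
// nothing reaches the backend until commit() is called.
class StagedTransaction {
 public:
  explicit StagedTransaction(KVBackend& backend) : backend_(backend) {}

  void set(const std::string& key, const std::string& value) {
    staged_[key] = StagedValue{false, value};
  }

  void remove(const std::string& key) {
    staged_[key] = StagedValue{true, std::string()};
  }

  // Later operations in the same request see earlier staged side effects
  // first, and fall back to the backend only for untouched keys.
  bool get(const std::string& key, std::string* out) const {
    std::map<std::string, StagedValue>::const_iterator s = staged_.find(key);
    if (s != staged_.end()) {
      if (s->second.removed) return false;
      *out = s->second.value;
      return true;
    }
    KVBackend::const_iterator b = backend_.find(key);
    if (b == backend_.end()) return false;
    *out = b->second;
    return true;
  }

  // Apply every staged change in one step, then clear the buffer.
  void commit() {
    for (std::map<std::string, StagedValue>::const_iterator it = staged_.begin();
         it != staged_.end(); ++it) {
      if (it->second.removed)
        backend_.erase(it->first);
      else
        backend_[it->first] = it->second.value;
    }
    staged_.clear();
  }

 private:
  KVBackend& backend_;
  std::map<std::string, StagedValue> staged_;
};
```

In this model the backend stays unchanged until commit(), so a request that fails halfway leaves no partial writes behind. With a real key-value store the commit step would presumably be a single batched write so that it is applied atomically; that mapping is an assumption of this sketch, not something the article states.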
The relevant definitions in KeyValueStore.h are as follows:
struct Op {
  utime_t start;
  uint64_t op;
  list<Transaction*> tls;
  Context *ondisk, *onreadable, *onreadable_sync;
  uint64_t ops, bytes;
  TrackedOpRef osd_op;
};

struct OpWQ : public ThreadPool::WorkQueue<OpSequencer>
Request Processing Control
unsigned KeyValueStore::_do_transaction(Transaction& transaction, BufferTransaction &t, ThreadPool::TPHandle *handle)
Ceph Source Code Analysis: KeyValueStore