An introduction to the encoding scheme for combining Redis with KV storage (RocksDB)
Redis is currently the hottest star in the NoSQL world. It is small, sharp, and practical, and is especially suited to problems that are hard to solve with a traditional relational database. As an in-memory database, Redis keeps all of its data in memory, which makes it a good fit for a small amount of hot data. When the data set is huge, exceeds the size of memory, and still has to be persisted, Redis needs to be combined with a KV storage engine.
ardb, the subject of this article, is a NoSQL storage service that is fully compatible with the Redis protocol. Its storage layer is built on existing, mature KV storage engines. In theory, any KV store implemented on a B-tree or LSM tree can serve as the underlying storage for ardb; it currently supports LevelDB, RocksDB, and LMDB.
Taking ardb as an example, this article describes how the codec layer is implemented when Redis is combined with KV storage.
Encoding method
In a scheme that combines Redis with KV storage, the codec layer is a very important part. It hides the differences between KV storage implementations, so that complex Redis data structures such as string, hash, list, set, and sorted set can be implemented on top of any simple KV storage engine.
A string maps naturally onto a single KV pair in the KV store.
For the other container types, we need:
- One KV to store the meta-information of the whole key (such as the number of list members, the expiration time, etc.);
- One KV per member to hold the member's name and value.
A sorted set member has two properties, score and rank, so a sorted set needs:
- One KV to store the meta-information of the whole key;
- One KV per member to store its score;
- One KV per member to store its rank information.
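The rules above can be summarized as a count of KV entries per Redis key. The following is only a minimal sketch of that arithmetic; the function name kv_entries_needed is made up for illustration and is not part of ardb.

```python
def kv_entries_needed(redis_type: str, members: int = 0) -> int:
    """Count the KV entries one Redis key occupies under the rules above."""
    if redis_type == "string":
        return 1                    # a single KV holds the whole value
    if redis_type in ("hash", "list", "set"):
        return 1 + members          # one meta KV plus one KV per member
    if redis_type == "zset":
        return 1 + 2 * members      # meta KV plus a score KV and a rank KV per member
    raise ValueError(f"unknown Redis type: {redis_type}")


print(kv_entries_needed("zset", 2))  # 5
```

A sorted set therefore costs roughly twice as many entries as a hash of the same size, which is the price paid for being able to look up both by member and by score.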
Key encoding format
All keys are encoded with the same prefix layout, defined as follows:
[<namespace>] <key> <type> <element...>
namespace supports a concept similar to Redis databases; it can be any string and is not restricted to numbers.
key is a variable-length binary string.
type is a single byte that identifies the kind of this KV entry, which implicitly indicates the data structure the key belongs to.
For meta-information, the type in the key is fixed to KEY_META, and the specific data type is recorded in the value (see the next section).
Beyond these three parts, keys of different types may carry additional fields; for example, the key of a hash field also carries the field name.
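To make the layout concrete, here is a minimal sketch of a key encoder and decoder. The article does not specify the byte-level format, so the 4-byte big-endian length prefix on each variable-length field is an assumption made for this example. The code KEY_META = 1 appears in the article's example further down, while KEY_HASH_FIELD = 4 is a placeholder value.

```python
import struct

KEY_META = 1        # taken from the article's example
KEY_HASH_FIELD = 4  # placeholder code, assumed for illustration


def _put(field: bytes) -> bytes:
    # Assumed framing: 4-byte big-endian length, then the raw bytes.
    return struct.pack(">I", len(field)) + field


def encode_key(ns: bytes, key: bytes, key_type: int, *elements: bytes) -> bytes:
    """Encode [<namespace>] <key> <type> <element...> into one byte string."""
    out = _put(ns) + _put(key) + bytes([key_type])  # type is exactly one byte
    for element in elements:
        out += _put(element)
    return out


def decode_key(buf: bytes):
    """Inverse of encode_key: returns (ns, key, key_type, elements)."""
    def take(pos):
        (length,) = struct.unpack_from(">I", buf, pos)
        start = pos + 4
        return buf[start:start + length], start + length

    ns, pos = take(0)
    key, pos = take(pos)
    key_type = buf[pos]
    pos += 1
    elements = []
    while pos < len(buf):
        element, pos = take(pos)
        elements.append(element)
    return ns, key, key_type, elements


encoded = encode_key(b"0", b"user:1", KEY_HASH_FIELD, b"name")
print(decode_key(encoded))
```

The length prefix is there only so that a variable-length binary key can be decoded unambiguously; a real engine may frame its fields differently.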
Value encoding format
The internal value is more complex. Its encoding starts with a type, whose value is one of the key types defined in the previous section:
<type> <element...>
The format of what follows depends on the type.
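As a rough illustration of a value that starts with its type, the sketch below puts a one-byte type in front of an opaque payload. The code KEY_ZSET_SCORE = 11 comes from the article's example further down; storing the score as an 8-byte big-endian double is an assumption made here, not something the article states.

```python
import struct

KEY_ZSET_SCORE = 11  # taken from the article's example


def encode_value(value_type: int, payload: bytes = b"") -> bytes:
    """Encode <type> <element...>: one type byte followed by the payload."""
    return bytes([value_type]) + payload


def decode_value(buf: bytes):
    """Split an encoded value back into (type, payload)."""
    return buf[0], buf[1:]


# Assumed payload layout for a score: an 8-byte big-endian double.
score_value = encode_value(KEY_ZSET_SCORE, struct.pack(">d", 1.0))
value_type, payload = decode_value(score_value)
print(value_type, struct.unpack(">d", payload)[0])
```

Reading the first byte is enough to decide how the rest of the value should be parsed.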
Encoding of each data type
Each data type is encoded as follows (ns stands for namespace):
Type        Key object                                      Value object
String      [<ns>] <key> KEY_META                           KEY_STRING <MetaObject>
Hash        [<ns>] <key> KEY_META                           KEY_HASH <MetaObject>
            [<ns>] <key> KEY_HASH_FIELD <field>             KEY_HASH_FIELD <field-value>
Set         [<ns>] <key> KEY_META                           KEY_SET <MetaObject>
            [<ns>] <key> KEY_SET_MEMBER <member>            KEY_SET_MEMBER
List        [<ns>] <key> KEY_META                           KEY_LIST <MetaObject>
            [<ns>] <key> KEY_LIST_ELEMENT <index>           KEY_LIST_ELEMENT <element-value>
Sorted Set  [<ns>] <key> KEY_META                           KEY_ZSET <MetaObject>
            [<ns>] <key> KEY_ZSET_SCORE <member>            KEY_ZSET_SCORE <score>
            [<ns>] <key> KEY_ZSET_SORT <score> <member>     KEY_ZSET_SORT
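To show how a container type expands into entries, the sketch below models a hash as one meta entry plus one entry per field. Keys and values are plain Python tuples rather than serialized bytes, the meta object is reduced to a member count, and the codes KEY_HASH = 3 and KEY_HASH_FIELD = 4 are placeholder values; only KEY_META = 1 appears in the article.

```python
KEY_META = 1        # taken from the article's example
KEY_HASH = 3        # placeholder code, assumed for illustration
KEY_HASH_FIELD = 4  # placeholder code, assumed for illustration


def hash_to_kv(ns, key, fields):
    """Expand a Redis hash into KV entries: one meta entry plus one per field."""
    # The meta key has type KEY_META; the concrete type lives in the value.
    entries = {(ns, key, KEY_META): (KEY_HASH, {"size": len(fields)})}
    for field, value in fields.items():
        entries[(ns, key, KEY_HASH_FIELD, field)] = (KEY_HASH_FIELD, value)
    return entries


kv = hash_to_kv("0", "user:1", {"name": "alice", "age": "30"})
for k, v in kv.items():
    print(k, "=>", v)
```

Set and list follow the same pattern, with the member or the index taking the place of the field name.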
Zset encoding example
Here is an example using the most complex type, the sorted set. Suppose there is a sorted set A: {member=first, score=1}, {member=second, score=2}. It is stored in ardb as follows:
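The storage encoding for key A is:
// In this pseudocode, | marks the boundary between fields; it is not actually stored as "|". In the real serialization, each field is written at a specific position.
Key: ns|1|A (1 is the KEY_META meta-information type)
Value: the encoded meta-information (Redis data type zset, expiration time, number of members, maximum and minimum score, etc.)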
The score information for member first is stored as:
Key: ns|11|A|first (11 is the type KEY_ZSET_SCORE)
Value: 11|1 (11 is the type KEY_ZSET_SCORE; 1 is the score of member first)
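The rank information for member first is stored as:
Key: ns|10|A|1|first (10 is the type KEY_ZSET_SORT; 1 is the score)
Value: 10 (the type KEY_ZSET_SORT; it carries no other meaning. RocksDB keeps keys sorted automatically, so the rank is easy to compute and does not need to be stored or updated)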
The entries for member second follow the same pattern and are omitted here.
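The pseudocode entries above can be generated mechanically. The sketch below reproduces them with | as a purely visual separator, exactly as in the pseudocode; the three type codes come from the example, while the text layout of the meta value is invented here for readability.

```python
KEY_META, KEY_ZSET_SORT, KEY_ZSET_SCORE = 1, 10, 11  # codes from the example above


def zset_to_kv(ns, key, members):
    """Expand a sorted set (member -> score) into pseudocode-style KV entries."""
    scores = list(members.values())
    # Meta value layout is a placeholder: type, member count, min and max score.
    meta = f"zset|size={len(members)}|min={min(scores)}|max={max(scores)}"
    kv = {f"{ns}|{KEY_META}|{key}": meta}
    for member, score in members.items():
        # Score entry: lookup by member, the value carries the score.
        kv[f"{ns}|{KEY_ZSET_SCORE}|{key}|{member}"] = f"{KEY_ZSET_SCORE}|{score}"
        # Sort entry: the score is part of the key, the value is only the type.
        kv[f"{ns}|{KEY_ZSET_SORT}|{key}|{score}|{member}"] = f"{KEY_ZSET_SORT}"
    return kv


kv = zset_to_kv("ns", "A", {"first": 1, "second": 2})
for k, v in kv.items():
    print(k, "=>", v)
```

Running it for the set A yields five entries: one meta entry and a score entry plus a sort entry for each of first and second.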
When the user runs ZCARD A, a direct lookup of namespace_1_A returns the meta-information, which contains the number of members in the sorted set;
When the user runs ZSCORE A first, a direct lookup of namespace_11_A_first returns the score of member first;
When the user runs ZRANK A first, ZSCORE is used to get the score, and the position of namespace_10_A_1_first among the sorted keys then gives the rank.
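The three lookups can be modelled on a small in-memory stand-in for an ordered KV store. This is a sketch under several assumptions: SortedKV and the command functions are hypothetical names, keys are tuples in the ns, type, key order used by the example, and zadd only handles members that are not yet in the set.

```python
import bisect

KEY_META, KEY_ZSET_SORT, KEY_ZSET_SCORE = 1, 10, 11  # codes from the example above


class SortedKV:
    """Tiny in-memory stand-in for a KV store that keeps its keys sorted."""

    def __init__(self):
        self._keys = []   # always kept in sorted order
        self._vals = {}

    def put(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def get(self, key):
        return self._vals.get(key)

    def offset(self, start_key, key):
        # Number of stored keys in the half-open range [start_key, key).
        return bisect.bisect_left(self._keys, key) - bisect.bisect_left(self._keys, start_key)


def zadd(db, ns, key, member, score):
    # Sketch only: assumes `member` is not already present in the set.
    meta = db.get((ns, KEY_META, key)) or {"type": "zset", "size": 0}
    db.put((ns, KEY_META, key), {"type": "zset", "size": meta["size"] + 1})
    db.put((ns, KEY_ZSET_SCORE, key, member), score)
    db.put((ns, KEY_ZSET_SORT, key, score, member), None)


def zcard(db, ns, key):
    meta = db.get((ns, KEY_META, key))          # one direct lookup
    return meta["size"] if meta else 0


def zscore(db, ns, key, member):
    return db.get((ns, KEY_ZSET_SCORE, key, member))  # one direct lookup


def zrank(db, ns, key, member):
    score = zscore(db, ns, key, member)
    if score is None:
        return None
    # Rank is the position of the sort key among this set's sort keys.
    return db.offset((ns, KEY_ZSET_SORT, key), (ns, KEY_ZSET_SORT, key, score, member))


db = SortedKV()
zadd(db, "ns", "A", "second", 2)
zadd(db, "ns", "A", "first", 1)
print(zcard(db, "ns", "A"), zscore(db, "ns", "A", "first"), zrank(db, "ns", "A", "first"))
```

Because the score is part of the sort key, insertion order does not matter. A real engine would typically walk an iterator over the sort keys to count the position rather than compute an index difference as this in-memory model does.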
The specific storage code is available in the full article: http://click.aliyun.com/m/8714/