04 Redis Getting Started Guide notes (Introduction to internal encoding rules)

Last Update:2018-08-21 Source: Internet

Author: User

Tags redis

Redis is an in-memory database: all data is stored in memory. Optimizing storage to reduce the memory footprint is therefore an important topic. Shortening key names and key values is the most direct way to save memory, for example renaming the key very.important.person:20 to vip:20.

Sometimes, however, shortening key names and key values does not save enough space, and you need to take advantage of the Redis internal encoding rules to save more.

Redis provides two internal encodings for each data type. Take the hash type as an example: it is normally implemented with a hash table, which gives O(1) lookups and assignments. When a key holds only a few elements, however, O(1) offers no significant performance advantage over O(n), so in that case Redis uses an internal encoding that saves memory at the cost of slightly worse performance (retrieving an element takes O(n) time).

The choice of internal encoding is transparent to the user; Redis adjusts it automatically according to the actual data. When the number of elements in the key grows, Redis automatically converts the internal encoding of the key to a hash table. To see the internal encoding of a key, use the OBJECT ENCODING command, for example:

127.0.0.1:6379> lpush list a
(integer) 2
127.0.0.1:6379> object encoding list
"ziplist"

Every value in Redis is stored in a redisObject structure, which is defined as follows:

typedef struct redisObject
{
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24; /* LRU time (relative to server.lruclock) */
    int refcount;
    void *ptr;
} robj;

The type field indicates the data type of the value and can take the following values:

/* Object types */
#define REDIS_STRING 0
#define REDIS_LIST 1
#define REDIS_SET 2
#define REDIS_ZSET 3
#define REDIS_HASH 4

The encoding field indicates the internal encoding of the value and can take the following values:

#define REDIS_ENCODING_RAW 0        /* Raw representation */
#define REDIS_ENCODING_INT 1        /* Encoded as integer */
#define REDIS_ENCODING_HT 2         /* Encoded as hash table */
#define REDIS_ENCODING_ZIPMAP 3     /* Encoded as zipmap */
#define REDIS_ENCODING_LINKEDLIST 4 /* Encoded as regular linked list */
#define REDIS_ENCODING_ZIPLIST 5    /* Encoded as ziplist */
#define REDIS_ENCODING_INTSET 6     /* Encoded as intset */
#define REDIS_ENCODING_SKIPLIST 7   /* Encoded as skiplist */
#define REDIS_ENCODING_EMBSTR 8     /* Embedded sds string encoding */
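The name that OBJECT ENCODING prints is derived from this encoding field. The following is a minimal sketch of such a mapping, not the Redis source itself: the helper names encoding_name and encoding_is are hypothetical, only the encodings discussed in this article are covered, and the strings are the names reported by the Redis versions that use these constants.

```c
#include <assert.h>
#include <string.h>

#define REDIS_ENCODING_RAW 0
#define REDIS_ENCODING_INT 1
#define REDIS_ENCODING_HT 2
#define REDIS_ENCODING_LINKEDLIST 4
#define REDIS_ENCODING_ZIPLIST 5
#define REDIS_ENCODING_INTSET 6
#define REDIS_ENCODING_SKIPLIST 7
#define REDIS_ENCODING_EMBSTR 8

/* Hypothetical helper: name reported by OBJECT ENCODING for a constant. */
const char *encoding_name(unsigned encoding) {
    switch (encoding) {
    case REDIS_ENCODING_RAW:        return "raw";
    case REDIS_ENCODING_INT:        return "int";
    case REDIS_ENCODING_HT:         return "hashtable";
    case REDIS_ENCODING_LINKEDLIST: return "linkedlist";
    case REDIS_ENCODING_ZIPLIST:    return "ziplist";
    case REDIS_ENCODING_INTSET:     return "intset";
    case REDIS_ENCODING_SKIPLIST:   return "skiplist";
    case REDIS_ENCODING_EMBSTR:     return "embstr";
    default:                        return "unknown"; /* not covered here */
    }
}

/* Hypothetical helper: 1 when `encoding` would be reported as `expected`. */
int encoding_is(unsigned encoding, const char *expected) {
    assert(expected != NULL);
    return strcmp(encoding_name(encoding), expected) == 0;
}
```

For instance, encoding_is(REDIS_ENCODING_ZIPLIST, "ziplist") is true, which matches the redis-cli session shown earlier.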

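The possible internal encodings for each data type, and the corresponding result of the OBJECT ENCODING command, are shown in the following table:

Data type     Internal encoding            OBJECT ENCODING result
String        REDIS_ENCODING_RAW           "raw"
              REDIS_ENCODING_INT           "int"
              REDIS_ENCODING_EMBSTR        "embstr"
Hash          REDIS_ENCODING_HT            "hashtable"
              REDIS_ENCODING_ZIPLIST       "ziplist"
List          REDIS_ENCODING_LINKEDLIST    "linkedlist"
              REDIS_ENCODING_ZIPLIST       "ziplist"
Set           REDIS_ENCODING_HT            "hashtable"
              REDIS_ENCODING_INTSET        "intset"
Sorted set    REDIS_ENCODING_SKIPLIST      "skiplist"
              REDIS_ENCODING_ZIPLIST       "ziplist"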

One: String type

Redis stores a string in a variable of type sdshdr, and the ptr field of the redisObject points to that variable. sdshdr is defined as follows:

struct sdshdr
{
    unsigned int len;
    unsigned int free;
    char buf[];
};

Note that sizeof(struct sdshdr) == 8, which means that the trailing buf member is not counted.

The len field holds the length of the string, the free field holds the unused space remaining in buf, and buf stores the contents of the string.

So when "set key foobar" is executed, the space required to store the key value is sizeof(redisObject) + sizeof(sdshdr) + strlen("foobar") = 16 + 8 + 6 = 30 bytes, as shown in the following figure:
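The arithmetic above can be checked with a small sketch. It copies the two structure layouts from this article and assumes a typical 64-bit platform (4-byte int, 8-byte pointers); the helper raw_string_footprint is hypothetical and, like the formula in the text, ignores the terminating '\0' byte and any allocator overhead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Layouts copied from the definitions in this article. */
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
} robj;

struct sdshdr {
    unsigned int len;
    unsigned int free;
    char buf[];            /* flexible array member: not counted by sizeof */
};

/* Hypothetical helper: bytes needed for a raw-encoded string value,
 * following the article's formula (header sizes plus string length). */
size_t raw_string_footprint(const char *value) {
    assert(value != NULL);
    return sizeof(robj) + sizeof(struct sdshdr) + strlen(value);
}
```

On such a platform raw_string_footprint("foobar") evaluates to 16 + 8 + 6 bytes; on a 32-bit build the pointer is smaller and the total changes accordingly.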

When the content of a key value can be represented as a 64-bit signed integer, Redis converts it to a long and stores it that way. For "set key 12345", for example, the space actually occupied is sizeof(redisObject) = 16 bytes, roughly half of what storing "foobar" takes, as shown in the following figure:
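One way to decide whether a string qualifies for the integer encoding is to parse it and check that printing the result gives back exactly the same text. The sketch below illustrates that idea only; string_is_int64 is a hypothetical helper, not the routine Redis uses, and the round-trip rule is an assumption chosen so that values such as "0123" keep their original spelling.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 and stores the number in *out when `s`
 * is the canonical decimal form of a signed 64-bit integer, else 0.
 * Canonical means that printing the parsed number reproduces `s`, which
 * rejects inputs such as "0123", " 42", "+7", "12abc" and values that
 * overflow (strtoll clamps them, so the round trip no longer matches). */
int string_is_int64(const char *s, long long *out) {
    char canonical[32];
    long long parsed;

    assert(s != NULL && out != NULL);
    parsed = strtoll(s, NULL, 10);
    snprintf(canonical, sizeof(canonical), "%lld", parsed);
    if (strcmp(canonical, s) != 0)
        return 0;
    *out = parsed;
    return 1;
}
```

With this check, "12345" is accepted and can live directly in the object, while "foobar" falls back to the sdshdr representation described above.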

The refcount field in redisObject stores the number of references to the key value, which means that one key value can be referenced by multiple keys. At startup, Redis creates 10,000 redisObject variables as shared objects holding the numbers 0 to 9999. If the string value being set is one of these 10,000 numbers (for example "set key1 123"), Redis references the shared object directly instead of creating a new redisObject, so storing the key value takes 0 additional bytes. After "set key1 123" and "set key2 123" are executed, key1 and key2 both reference the same existing shared object, which saves storage space, as shown in the following illustration:

This shows that using string keys to store small numbers such as object IDs is very space-efficient: Redis only needs to store the key name and a reference to the shared object.

Note that when the maxmemory parameter in the configuration file is used to limit the memory available to Redis, Redis does not use shared objects, because each key value then needs its own redisObject to record its LRU information.
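The sharing rule, including the maxmemory exception, can be sketched as follows. This is a simplified model rather than Redis code: int_object stands in for redisObject, and init_shared_integers, acquire_shared_integer, and the maxmemory_set flag are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SHARED_INTEGERS 10000   /* the numbers 0..9999, as in the text */

/* Simplified stand-in for redisObject: only what this sketch needs. */
typedef struct {
    int refcount;
    long value;
} int_object;

static int_object shared_integers[SHARED_INTEGERS];

/* Build the shared pool once, as the text says Redis does at startup. */
void init_shared_integers(void) {
    for (long i = 0; i < SHARED_INTEGERS; i++) {
        shared_integers[i].refcount = 1;
        shared_integers[i].value = i;
    }
}

/* Hypothetical lookup: returns the shared object for `value` and bumps
 * its reference count, or NULL when the caller must allocate a private
 * object (value outside 0..9999, or maxmemory is set so that each value
 * needs its own LRU field). */
int_object *acquire_shared_integer(long value, int maxmemory_set) {
    if (maxmemory_set || value < 0 || value >= SHARED_INTEGERS)
        return NULL;
    assert(shared_integers[value].refcount > 0);  /* pool initialised */
    shared_integers[value].refcount++;
    return &shared_integers[value];
}
```

Two calls with the value 123 return the same pointer, which mirrors key1 and key2 referencing one shared object.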

Two: Hash type

The internal encoding of a hash type may be REDIS_ENCODING_HT or REDIS_ENCODING_ZIPLIST. The configuration file defines when a hash type is encoded with REDIS_ENCODING_ZIPLIST:

hash-max-ziplist-entries 512
hash-max-ziplist-value 64

When the number of fields in a hash key does not exceed hash-max-ziplist-entries, and every field name and field value is no longer than hash-max-ziplist-value (in bytes), Redis stores the key with REDIS_ENCODING_ZIPLIST; otherwise it uses REDIS_ENCODING_HT. The conversion is transparent: every time the key changes, Redis automatically checks the conditions and performs the conversion when required.
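The rule reduces to two comparisons. The sketch below is illustrative only: ziplist_limits and hash_fits_ziplist are hypothetical names, and treating both limits as inclusive (a value exactly at the limit still fits) is an assumption about the boundary case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the two configuration parameters. */
typedef struct {
    size_t max_entries;   /* hash-max-ziplist-entries */
    size_t max_value;     /* hash-max-ziplist-value, in bytes */
} ziplist_limits;

/* Returns 1 when a hash with `field_count` fields, whose longest field
 * name or value is `longest_item_bytes` long, may stay ziplist-encoded;
 * returns 0 when it has to be stored as a hash table. */
int hash_fits_ziplist(const ziplist_limits *limits,
                      size_t field_count, size_t longest_item_bytes) {
    assert(limits != NULL);
    return field_count <= limits->max_entries &&
           longest_item_bytes <= limits->max_value;
}
```

With the limits set to 64 bytes per item, a hash with three short fields fits, while a single 65-byte value is enough to force the hash table encoding.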

The REDIS_ENCODING_HT encoding is a hash table, which supports O(1) reads and assignments. Its fields and field values are stored as redisObjects, so the optimizations described above for string key values also apply to the fields and field values of a hash key.

Redis's own key-value storage is also implemented with a hash table, similar to the REDIS_ENCODING_HT encoding, but key names are not stored as redisObjects, so the key name "123456" takes no less space than "abcdef". Key names are not optimized in this way because in most cases they are not purely numeric.

The REDIS_ENCODING_ZIPLIST encoding is a compact format that trades some read performance for very high space efficiency, which makes it suitable when there are few elements. This encoding is also used by the list type and the sorted set type. Its structure is shown in the following illustration:

zlbytes is a uint32_t holding the number of bytes occupied by the entire structure. zltail is also a uint32_t and holds the offset of the last element; it lets the program jump directly to the tail element without traversing the whole structure, which makes popping from the tail (for the list type) faster. zllen is a uint16_t holding the number of stored elements. zlend is a single-byte marker for the end of the structure, and its value is always 255.
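To make the layout concrete, the following sketch writes and reads the header of an empty structure. It is an illustration under stated assumptions, not Redis code: zl_header, zl_write_empty and zl_read_header are hypothetical names, the fields are assumed to be stored little-endian, and for an empty list the tail offset is assumed to equal the header size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZL_HEADER_SIZE 10   /* zlbytes (4) + zltail (4) + zllen (2) */
#define ZL_END 255          /* value of the zlend marker */

typedef struct {
    uint32_t zlbytes;
    uint32_t zltail;
    uint16_t zllen;
} zl_header;

/* Write an empty list (header plus end marker) into buf, which must
 * hold ZL_HEADER_SIZE + 1 bytes. */
void zl_write_empty(unsigned char *buf) {
    uint32_t total = ZL_HEADER_SIZE + 1;
    uint32_t tail = ZL_HEADER_SIZE;

    assert(buf != NULL);
    for (int i = 0; i < 4; i++) {
        buf[i]     = (unsigned char)(total >> (8 * i));
        buf[4 + i] = (unsigned char)(tail >> (8 * i));
    }
    buf[8] = 0;                   /* zllen, low byte */
    buf[9] = 0;                   /* zllen, high byte */
    buf[ZL_HEADER_SIZE] = ZL_END;
}

/* Decode the three header fields from the start of buf. */
zl_header zl_read_header(const unsigned char *buf) {
    zl_header h = {0, 0, 0};

    assert(buf != NULL);
    for (int i = 0; i < 4; i++) {
        h.zlbytes |= (uint32_t)buf[i] << (8 * i);
        h.zltail  |= (uint32_t)buf[4 + i] << (8 * i);
    }
    h.zllen = (uint16_t)(buf[8] | (buf[9] << 8));
    return h;
}
```

Reading zltail from the header is what allows a tail operation to skip straight to the last element instead of walking every entry.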

Each element in a REDIS_ENCODING_ZIPLIST structure consists of four parts. The first part stores the size of the previous element, to allow reverse traversal; the second and third parts are the encoding type and the size of the element; the fourth part is the actual content of the element.

When a hash type is stored with the REDIS_ENCODING_ZIPLIST encoding, the elements are arranged as follows: element 1 stores field 1, element 2 stores field value 1, and so on, as shown in the following illustration:

For example, after the command "hset hkey foo bar" has been executed, a later "hset hkey foo anothervalue" requires Redis to search for foo from the beginning (skipping one element each time, so that only field names are examined). Once it is found, Redis deletes the element that follows it and inserts the new value anothervalue. Both the deletion and the insertion have to move the data that follows in memory, and the lookup itself is a linear traversal, so performance becomes very poor when the hash key holds a lot of data. For this reason, the hash-max-ziplist-entries and hash-max-ziplist-value parameters should not be set too high.
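The lookup described above can be sketched over a flat array that alternates field names and values. This models only the access pattern, not the byte-level element format; flat_hash_find is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* `entries` holds field 1, value 1, field 2, value 2, ... (`count` items).
 * Returns the index of the value stored for `field`, or -1 if the field
 * is absent. Only even positions are compared, so a value is never
 * mistaken for a field name; the cost grows linearly with the number
 * of fields. */
long flat_hash_find(const char *const *entries, size_t count,
                    const char *field) {
    assert(entries != NULL && field != NULL && count % 2 == 0);
    for (size_t i = 0; i + 1 < count; i += 2) {
        if (strcmp(entries[i], field) == 0)
            return (long)(i + 1);
    }
    return -1;
}
```

Searching for "bar" in a hash that maps foo to bar finds nothing, because "bar" sits at a value position and is skipped.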

Three: List type

The internal encoding of a list type may be REDIS_ENCODING_LINKEDLIST or REDIS_ENCODING_ZIPLIST. As before, the configuration file defines when the REDIS_ENCODING_ZIPLIST encoding is used:

list-max-ziplist-entries 512
list-max-ziplist-value 64

The conversion rules are the same as for the hash type and are not repeated here.

The REDIS_ENCODING_LINKEDLIST encoding is a doubly linked list in which each element is stored as a redisObject, so element values under this encoding can be optimized in the same way as string key values.

With the REDIS_ENCODING_ZIPLIST encoding, the performance characteristics are the same as for the hash type. Because the encoding also supports reverse traversal, accessing data at either end of the list is still fast.

Four: Set type

The internal encoding of a set type may be REDIS_ENCODING_HT or REDIS_ENCODING_INTSET. When all elements of the set are integers and the number of elements does not exceed set-max-intset-entries in the configuration file (512 by default), Redis stores the set with the REDIS_ENCODING_INTSET encoding; otherwise it uses REDIS_ENCODING_HT.

The intset structure used by the REDIS_ENCODING_INTSET encoding is defined as:

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;

contents stores the element values of the set, and the number of bytes used per element depends on encoding. The default encoding is INTSET_ENC_INT16 (2 bytes). When a newly added integer cannot be represented in 2 bytes, Redis upgrades the encoding of the set to INTSET_ENC_INT32 (4 bytes) and adjusts the position and length of all existing elements. In the same way, the encoding can be upgraded further to INTSET_ENC_INT64 (8 bytes).
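The upgrade decision depends only on the range of the new value. A minimal sketch follows; the helper names intset_width_for and intset_width_after_add are hypothetical, and widths are expressed directly in bytes rather than through the INTSET_ENC_* constants.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest element width in bytes (2, 4 or 8) that can hold v. */
unsigned intset_width_for(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX)
        return 8;
    if (v < INT16_MIN || v > INT16_MAX)
        return 4;
    return 2;
}

/* Width the whole set needs after v is added. The width only ever
 * grows: adding a small value to a wide set leaves it unchanged. */
unsigned intset_width_after_add(unsigned current_width, int64_t v) {
    unsigned needed = intset_width_for(v);

    assert(current_width == 2 || current_width == 4 || current_width == 8);
    return needed > current_width ? needed : current_width;
}
```

Adding 70000 to a set that currently uses 2 bytes per element therefore moves every element to 4 bytes, which is the rewrite of existing elements that the text describes.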

The REDIS_ENCODING_INTSET encoding stores elements in sorted order (so the results of the smembers command are ordered), which makes it possible to locate an element with binary search. However, adding or removing an element requires Redis to shift the elements that follow it in memory, so performance is poor when the set contains too many elements.
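Both halves of that trade-off show up in a short sketch: the search is logarithmic, but an insert has to shift the tail of the array. The code below works on a plain array of int64_t with a fixed element width for simplicity; sorted_find and sorted_insert are hypothetical helpers, not the Redis functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Binary search in a sorted array. Returns 1 if v is present. *pos is
 * set to its index, or to the index where it would have to be inserted. */
int sorted_find(const int64_t *a, size_t len, int64_t v, size_t *pos) {
    size_t lo = 0, hi = len;

    assert(pos != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < v)
            lo = mid + 1;
        else
            hi = mid;
    }
    *pos = lo;
    return lo < len && a[lo] == v;
}

/* Insert v, keeping the array sorted and free of duplicates. The caller
 * guarantees room for one more element. Returns the new length. */
size_t sorted_insert(int64_t *a, size_t len, int64_t v) {
    size_t pos;

    if (sorted_find(a, len, v, &pos))
        return len;                      /* already a member */
    /* Shift the tail one slot to the right: this is the O(n) part. */
    memmove(&a[pos + 1], &a[pos], (len - pos) * sizeof(a[0]));
    a[pos] = v;
    return len + 1;
}
```

Inserting 30, 10 and 20 in that order leaves the array as 10, 20, 30, which is why the members come back in ascending order.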

When a newly added element is not an integer, or the number of elements in the set exceeds the set-max-intset-entries parameter, Redis automatically converts the storage structure of the set to REDIS_ENCODING_HT.

Note that once the storage structure of a set has been converted to REDIS_ENCODING_HT, Redis does not convert it back to REDIS_ENCODING_INTSET automatically, even if all non-integer elements are removed from the set. Supporting automatic conversion back would mean that every time an element is deleted, Redis has to traverse the set to decide whether it can return to the original encoding, which would turn element deletion into an O(n) operation.

Five: Sorted set type

The internal encoding of a sorted set type may be REDIS_ENCODING_SKIPLIST or REDIS_ENCODING_ZIPLIST. Again, the configuration file defines when the REDIS_ENCODING_ZIPLIST encoding is used:

zset-max-ziplist-entries 128
zset-max-ziplist-value 64

The conversion rules are the same as for the hash and list types and are not repeated here.

When the encoding is REDIS_ENCODING_SKIPLIST, Redis stores a sorted set key value in two data structures, a hash table and a skip list. The hash table stores the mapping from element value to element score, so that zscore and similar commands run in O(1) time. The skip list stores the mapping from element score to element value, which provides the ordering. Redis makes a few changes to the standard skip list implementation: it allows elements in the skip list (that is, scores) to be equal, and it adds to each node a pointer to the previous element to support reverse traversal.
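The two modifications mentioned here, duplicate scores and a backward pointer, can be illustrated on the lowest level of the list alone. The sketch below is deliberately reduced: it omits the upper skip list levels and the hash table, and zs_node, zs_link_after and zs_collect_reverse are hypothetical names rather than the Redis definitions.

```c
#include <assert.h>
#include <stddef.h>

/* One node of the lowest level. A full skip list keeps an array of
 * forward pointers, one per level; only the first is modelled here. */
typedef struct zs_node {
    const char *member;
    double score;               /* equal scores are allowed */
    struct zs_node *forward;    /* next node on the lowest level */
    struct zs_node *backward;   /* previous node, for reverse traversal */
} zs_node;

/* Link `node` right after `prev` (prev may be NULL for the first node). */
void zs_link_after(zs_node *prev, zs_node *node) {
    assert(node != NULL);
    node->backward = prev;
    node->forward = prev ? prev->forward : NULL;
    if (node->forward)
        node->forward->backward = node;
    if (prev)
        prev->forward = node;
}

/* Walk from the tail towards the head, writing members in descending
 * order. Returns the number of members written (at most `cap`). */
size_t zs_collect_reverse(const zs_node *tail, const char **out, size_t cap) {
    size_t n = 0;

    for (const zs_node *cur = tail; cur != NULL && n < cap; cur = cur->backward)
        out[n++] = cur->member;
    return n;
}
```

Without the backward pointer, producing results in descending order would require a fresh search from the head for every step.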

With this encoding, element values are stored as redisObjects, so they can be optimized in the same way as string key values; element scores are stored as doubles.

With the REDIS_ENCODING_ZIPLIST encoding, a sorted set is stored in the order "value of element 1, score of element 1, value of element 2, score of element 2, ...", sorted by score.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
