In-depth introduction to Redis: underlying data structures (I)

Source: Internet
Author: User
Tags rehash redis cluster

1. Overview

Anyone who has used Redis knows that it is a high-performance key-value storage system, similar to Memcached but with a richer feature set.

    

Redis Design and Implementation describes the database as follows:

Every key-value pair in the database is composed of objects:

The database key is always a string object;

The database value can be one of five objects: a string object, a list object, a hash object, a set object, or a sorted set object.

 

Why do we say that Redis is better than Memcached? Redis makes up for the limitations of plain key-value stores such as Memcached by offering richer value types, and in some cases it can complement a relational database. These data types support push/pop, add/remove, set intersection and difference, and many other operations, all of which are atomic.

    

What we discuss today is not the value data types in Redis but the underlying data structures that implement them.

Redis builds these value types from several underlying data structures, including the simple dynamic string, the linked list, and the dictionary covered in this part.

          

Next, we explore the characteristics of these data structures and how they form the value data types we use.

 

2. Simple dynamic string (SDS)

2.1 Overview

Redis is an open-source key-value database written in ANSI C. We might expect strings in Redis to be plain C strings, but they are not. Instead of using the traditional C string representation directly, Redis builds an abstract type named simple dynamic string (SDS) and uses it as its default string representation:

redis> SET msg "hello world"
OK

This creates a new key-value pair with key "msg" and value "hello world". The underlying data structures are:

The key is a string object whose underlying implementation is an SDS that stores the string "msg";

The value is also a string object whose underlying implementation is an SDS that stores the string "hello world".

 

This example shows which underlying types are created when we use Redis. Besides storing string values, SDS is also used as a buffer, for example the AOF buffer in the AOF module.

 

2.2 SDS Definition

Redis defines the structure of a dynamic string as follows:

/* Structure that holds a string object */
struct sdshdr {
    // number of bytes used in buf
    int len;
    // number of unused bytes in buf
    int free;
    // data space
    char buf[];
};

   

 

1. len records the number of bytes used in buf (for the string "Redis", len is 5).

2. free records the number of unused bytes in buf. (When the string is first created there is usually no unused space; unused space appears once the string is modified.)

3. buf is the character array that holds the string itself (here, "Redis").
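A minimal sketch of how these three fields could be filled in for a new string. The helper names sds_new and sds_len are hypothetical and are not the Redis API; the struct layout is the one shown above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header layout as given in the article. */
struct sdshdr {
    int len;    /* bytes used in buf, not counting the trailing '\0' */
    int free;   /* unused bytes in buf */
    char buf[]; /* string data, kept null-terminated */
};

/* Hypothetical helper: build an SDS holding a copy of init, with no spare space. */
static struct sdshdr *sds_new(const char *init) {
    assert(init != NULL);
    size_t n = strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + n + 1); /* +1 for '\0' */
    if (sh == NULL) return NULL;
    sh->len = (int)n;
    sh->free = 0;
    memcpy(sh->buf, init, n + 1); /* copies the terminator too */
    return sh;
}

/* O(1) length lookup: read the stored field instead of scanning for '\0'. */
static int sds_len(const struct sdshdr *sh) {
    return sh->len;
}
```

Calling sds_new("Redis") yields len = 5 and free = 0, and sds_len returns the length without touching buf.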

 

 

2.3 Differences between SDS and C strings

A traditional C string uses a character array of length N + 1 to represent a string of length N. This makes operations such as getting the string length or extending the string inefficient, and this simple representation does not meet Redis's requirements for string safety, efficiency, and functionality.

2.3.1 Obtaining the string length (SDS: O(1), C string: O(N))

Because a C string does not record its own length, the program must traverse the entire string, counting characters until it reaches the terminating null character, to obtain the length.

Unlike a C string, the SDS structure has a field dedicated to storing the string length, so we can read the len attribute to get the length directly.

    

 

2.3.2 Preventing buffer overflow

Because a C string does not record its own length, it not only makes length lookups expensive but also makes buffer overflows easy to cause.

Assume the program has two strings, s1 and s2, that are adjacent in memory. s1 stores the string "redis" and s2 stores the string "MongoDB":

      

If we change the content of s1 to "redis cluster" but forget to allocate enough space for s1 first, the following problem occurs:

      

The original content of s2 has been overwritten by the content of s1: s2 now reads "cluster" instead of "MongoDB".
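The effect can be reproduced with a small sketch. To keep the behavior well defined, both strings are placed in one 16-byte array here, so the overrun stays inside a valid object. The function name overflow_demo and the buffer layout are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: s1 occupies mem[0..5] and s2 occupies mem[6..13]. */
static void overflow_demo(char mem[16]) {
    assert(mem != NULL);
    char *s1 = mem;
    char *s2 = mem + 6;
    memcpy(s1, "redis", 6);    /* 5 characters + '\0' */
    memcpy(s2, "MongoDB", 8);  /* 7 characters + '\0' */
    /* The result needs 14 bytes, but only 6 were reserved for s1,
     * so the write runs straight into the bytes that belong to s2. */
    strcat(s1, " cluster");
}
```

After the call, the bytes starting at mem + 6 spell "cluster": the data that s2 pointed to is gone.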

 

The space allocation policy of SDS eliminates this kind of buffer overflow:

When an SDS needs to be modified, Redis first checks whether the SDS has enough space for the modification. If not, it expands the SDS to the required size first and only then performs the concatenation.
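The check-then-expand idea can be sketched as follows. The helper sds_cat is hypothetical and grows the buffer to an exact fit only; it does not model any extra pre-allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical helper: append t to sh, expanding the buffer first when needed.
 * Returns the (possibly moved) header, or NULL if the allocation fails,
 * in which case the original SDS is left untouched. */
static struct sdshdr *sds_cat(struct sdshdr *sh, const char *t) {
    assert(sh != NULL && t != NULL);
    int add = (int)strlen(t);
    if (sh->free < add) {
        /* Not enough room: expand before writing a single byte. */
        size_t need = sizeof(*sh) + (size_t)(sh->len + add) + 1;
        struct sdshdr *grown = realloc(sh, need);
        if (grown == NULL) return NULL;
        sh = grown;
        sh->free = add;
    }
    memcpy(sh->buf + sh->len, t, (size_t)add + 1); /* includes '\0' */
    sh->len += add;
    sh->free -= add;
    return sh;
}
```

Because the space check happens inside the append routine, the caller cannot forget it, which is the point of the policy.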

 

 

2.3.3 Reducing the number of memory reallocations when a string is modified

Every time a C string grows or shrinks, the memory holding it has to be reallocated:

1. Concatenation grows the string, so the array must be expanded first. If the program forgets to reallocate before concatenating, the result is a buffer overflow.

2. Truncation shrinks the string, so the array should be shrunk afterwards. If the program forgets to reallocate after truncating, the space that is no longer used becomes a memory leak.

SDS avoids this with space pre-allocation: when it has to expand, it allocates extra unused space as well. For example, suppose an SDS has to grow and its buffer must be expanded. If the string is 13 bytes long after the modification, Redis sets len to 13 and also allocates 13 bytes of unused space.

  

Because extra space was allocated during the previous modification, the buffer is already large enough when the string is modified again, so no further expansion is needed.

  

 

With this pre-allocation policy, SDS reduces the number of memory reallocations needed to grow a string N times in a row from exactly N to at most N.
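A sketch of pre-allocation under stated assumptions: the spare space equals the new length while that length is under 1 MB and is capped at 1 MB otherwise. This threshold follows the policy described in Redis Design and Implementation and is not stated in this article. The helper names and the realloc_count counter are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PREALLOC_LIMIT (1024 * 1024) /* assumed 1 MB cap on spare space */

struct sdshdr {
    int len;
    int free;
    char buf[];
};

static int realloc_count = 0; /* instrumentation for this example only */

/* Hypothetical helper: ensure at least `add` unused bytes are available. */
static struct sdshdr *sds_make_room(struct sdshdr *sh, int add) {
    assert(sh != NULL && add >= 0);
    if (sh->free >= add) return sh; /* spare space is enough, no reallocation */
    int newlen = sh->len + add;
    int spare = newlen < PREALLOC_LIMIT ? newlen : PREALLOC_LIMIT;
    size_t need = sizeof(*sh) + (size_t)newlen + (size_t)spare + 1;
    struct sdshdr *grown = realloc(sh, need);
    if (grown == NULL) return NULL;
    realloc_count++;
    grown->free = add + spare; /* everything beyond the current len */
    return grown;
}

/* Hypothetical helper: append t, using pre-allocated space when possible. */
static struct sdshdr *sds_cat(struct sdshdr *sh, const char *t) {
    int add = (int)strlen(t);
    sh = sds_make_room(sh, add);
    if (sh == NULL) return NULL;
    memcpy(sh->buf + sh->len, t, (size_t)add + 1);
    sh->len += add;
    sh->free -= add;
    return sh;
}
```

Appending " cluster" to "redis" gives len = 13 and free = 13 after one reallocation. A following append of " tutorial" (9 bytes) fits in the spare space, so the reallocation count stays at one.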

 

2.3.4 Lazy space release

The free attribute in the SDS structure records unused space. It is used not only when a string grows: when a string is shortened, the bytes that are no longer needed are not returned to the allocator immediately but are recorded in free. The benefit is that the next modification may not need to expand the buffer.

This does not mean the unused space of an SDS can never be released. SDS provides an API for releasing the unused space when it is really needed.

With lazy space release, SDS avoids the memory reallocation otherwise required when shortening a string, and it also optimizes for growth operations that may follow.
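A sketch of both halves of this policy. The helper names sds_truncate and sds_remove_free_space are hypothetical stand-ins for the shortening and releasing operations described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical helper: keep only the first `keep` bytes. No reallocation:
 * the trimmed bytes are simply recorded as unused space. */
static void sds_truncate(struct sdshdr *sh, int keep) {
    assert(sh != NULL && keep >= 0 && keep <= sh->len);
    sh->free += sh->len - keep;
    sh->len = keep;
    sh->buf[keep] = '\0';
}

/* Hypothetical helper: hand the unused space back to the allocator on demand. */
static struct sdshdr *sds_remove_free_space(struct sdshdr *sh) {
    assert(sh != NULL);
    struct sdshdr *shrunk = realloc(sh, sizeof(*sh) + (size_t)sh->len + 1);
    if (shrunk == NULL) return sh; /* shrinking failed, keep the old block */
    shrunk->free = 0;
    return shrunk;
}
```

Truncating "redis cluster" to 5 bytes leaves len = 5 and free = 8 with no allocator call; the memory is only returned when sds_remove_free_space is invoked explicitly.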

 

 

2.3.5 Binary safety

The characters in a C string must conform to some encoding, and the string cannot contain a null character anywhere except at the end. Otherwise, the first null character the program reads is mistaken for the end of the string. These restrictions mean that C strings can only store text, not binary data such as images, audio, video, or compressed files.

In Redis, however, the end of a string is determined by the len attribute rather than by a null character. So even if a null character appears in the middle of the data, SDS still reads the whole value correctly.

For example, an SDS can store the 13 bytes "Redis\0Cluster" with len set to 13, and every byte, including the embedded null character, is read back.
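A small sketch of the difference between a length-based and a terminator-based view of the same bytes. The constructor sds_new_len is a hypothetical helper that copies an explicit number of bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical helper: copy exactly n bytes, embedded '\0' bytes included. */
static struct sdshdr *sds_new_len(const void *data, int n) {
    assert(data != NULL && n >= 0);
    struct sdshdr *sh = malloc(sizeof(*sh) + (size_t)n + 1);
    if (sh == NULL) return NULL;
    memcpy(sh->buf, data, (size_t)n);
    sh->buf[n] = '\0'; /* terminator kept for C string compatibility */
    sh->len = n;
    sh->free = 0;
    return sh;
}
```

For the 13 bytes "Redis\0Cluster", len reports 13 while strlen on buf stops at the embedded null character and reports 5. Code that trusts len sees the whole value.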

 

 

2.3.6 Compatibility with some C string functions

Although the SDS APIs are binary-safe, SDS still follows the C convention of ending the string with a null character. This lets an SDS that holds text data reuse some of the functions in <string.h>.

 

2.3.7 Summary

 

C string | SDS
Getting the string length is O(N) | Getting the string length is O(1)
The API is unsafe and may cause buffer overflows | The API is safe and does not cause buffer overflows
Changing the string length N times requires exactly N memory reallocations | Changing the string length N times requires at most N memory reallocations
Can only store text data | Can store both text and binary data
Can use all functions in <string.h> | Can use some functions in <string.h>

 

 

3. Linked List

 

 

3.1 Overview

The linked list provides efficient node rearrangement and sequential node access, and its length can be adjusted flexibly by adding or deleting nodes.

Linked lists are widely used in Redis. For example, the linked list is one of the underlying implementations of the list key: when a list key contains a large number of elements, or its elements are all long strings, Redis uses a linked list as the underlying implementation.

  

3.2 Data structure of the linked list

Each linked list node is represented by a listNode structure (adlist.h/listNode):

typedef struct listNode {
    // previous node
    struct listNode *prev;
    // next node
    struct listNode *next;
    // node value
    void *value;
} listNode;

 

Multiple listNode structures linked through their prev and next pointers form a doubly linked list.

  

    

Although a linked list can be built from listNode structures alone, holding it in a list structure (adlist.h/list) makes it more convenient to operate on:

typedef struct list {
    // head node
    listNode *head;
    // tail node
    listNode *tail;
    // number of nodes in the list
    unsigned long len;
    // node value duplication function
    void *(*dup)(void *ptr);
    // node value release function
    void (*free)(void *ptr);
    // node value comparison function
    int (*match)(void *ptr, void *key);
} list;

The list structure keeps the head pointer, the tail pointer, and the length counter of the list, together with the dup, free, and match functions.

 

 

3.3 Features of the linked list
  • Doubly linked: every node carries prev and next pointers, so getting the node before or after a given node takes O(1) time.
  • Acyclic: the prev pointer of the head node and the next pointer of the tail node both point to NULL, so traversal of the list ends at NULL.
  • Head and tail pointers: because the list has a head pointer and a tail pointer, getting the head or tail node takes O(1) time.
  • Length counter: the list uses the len attribute to record the number of nodes it contains.
  • Polymorphic: list nodes store their values through void * pointers, and type-specific functions for the node values can be set through the dup, free, and match attributes of the list structure.
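The features above can be sketched with two small operations. The helper names list_add_tail and list_del_node are hypothetical, and the dup, free, and match function pointers are left out to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;

typedef struct list {
    listNode *head;
    listNode *tail;
    unsigned long len;
} list; /* dup/free/match omitted in this sketch */

/* Hypothetical helper: append a value at the tail in O(1) via the tail pointer. */
static listNode *list_add_tail(list *l, void *value) {
    assert(l != NULL);
    listNode *node = malloc(sizeof(*node));
    if (node == NULL) return NULL;
    node->value = value;
    node->next = NULL;
    node->prev = l->tail;
    if (l->tail != NULL) l->tail->next = node;
    else l->head = node; /* first node: it is also the head */
    l->tail = node;
    l->len++;
    return node;
}

/* Hypothetical helper: unlink a node in O(1) using its prev/next pointers. */
static void list_del_node(list *l, listNode *node) {
    assert(l != NULL && node != NULL);
    if (node->prev != NULL) node->prev->next = node->next;
    else l->head = node->next;
    if (node->next != NULL) node->next->prev = node->prev;
    else l->tail = node->prev;
    free(node);
    l->len--;
}
```

Neither operation walks the list: the tail pointer and the per-node prev/next pointers give direct access to every node that has to change.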

 

 

4. Dictionary

 

  

 

4.1 Overview

A dictionary, also known as a symbol table, an associative array, or a map, is an abstract data structure used to store key-value pairs.

In a dictionary, each key is associated with a value, and every key is unique. C has no built-in data structure of this kind, so Redis provides its own dictionary implementation.

A simple example:

redis> SET msg "hello world"
OK

The key-value pair ("msg", "hello world") created here is stored in the dictionary that represents the database.

 

 

4.2 Definition of the dictionary

4.2.1 Hash table

The hash table used by the Redis dictionary is defined by the dict.h/dictht structure:

typedef struct dictht {
    // hash table array
    dictEntry **table;
    // hash table size
    unsigned long size;
    // hash table size mask, used to compute the index value
    unsigned long sizemask;
    // number of nodes currently stored in the hash table
    unsigned long used;
} dictht;

 

The table attribute points to an array of dictEntry pointers, and each dictEntry structure stores one key-value pair. In an empty hash table, every slot of this array is NULL and used is 0.

4.2.2 Hash table node (dictEntry)

The dictEntry structure is defined as follows:

typedef struct dictEntry {
    // key
    void *key;
    // value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    // next node in the same bucket, forming a linked list
    struct dictEntry *next;
} dictEntry;

 

Keys in a dictionary are unique, but a key is not located by comparing strings across the whole table. Instead, a hash algorithm converts the key into a hash value, the hash value is turned into an index, and the dictEntry is stored at that index of the table array.

This raises a question: what happens when two keys map to the same index? Redis uses separate chaining:

   

When k0 and k1 map to the same index, the next pointer of k1 points to k0, so the two nodes form a linked list.

 

4.2.3 Dictionary

typedef struct dict {
    // type-specific functions
    dictType *type;
    // private data
    void *privdata;
    // hash tables
    dictht ht[2];
    // rehash index
    int rehashidx;
} dict;

 

The type and privdata attributes exist to support key-value pairs of different types, which makes the dictionary polymorphic.

The ht attribute is an array of two hash tables. In the normal state, with no rehash in progress, the dictionary uses only ht[0]; ht[1] is used only while ht[0] is being rehashed.

  

 

 

4.3 Resolving hash conflicts

As mentioned in the discussion of hash table nodes, the hash value of a key is computed when a new key-value pair is inserted. When two or more keys are assigned to the same index, Redis resolves the conflict with separate chaining. Every hash table node has a next pointer, so the nodes assigned to the same index are linked through next into a singly linked list.

For example:

The hash table currently holds two nodes, k0 and k1.

We now want to insert k2, and the hash algorithm gives k2 the index 2, so k2 must be placed in bucket 2 of the table array (table[2]), which already holds k1.

After the insertion, table[2] points to k2 and the next pointer of k2 points to k1. The new node is inserted at the head of the list because the hash table node does not record the tail of the list, so head insertion is the O(1) choice.
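A sketch of head insertion and chain lookup under a few assumptions: keys are C strings, the value union is reduced to a single pointer, the caller supplies the hash value, and sizemask equals size - 1 so that the index is hash & sizemask. The helper names ht_insert and ht_find are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct dictEntry {
    const char *key;        /* simplified: string keys only */
    void *val;              /* simplified: the union is reduced to one pointer */
    struct dictEntry *next;
} dictEntry;

typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask; /* assumed to equal size - 1 */
    unsigned long used;
} dictht;

/* Hypothetical helper: insert at the head of the bucket chain in O(1).
 * Duplicate keys are not checked in this sketch. */
static dictEntry *ht_insert(dictht *ht, unsigned long hash,
                            const char *key, void *val) {
    assert(ht != NULL && ht->table != NULL);
    unsigned long idx = hash & ht->sizemask;
    dictEntry *e = malloc(sizeof(*e));
    if (e == NULL) return NULL;
    e->key = key;
    e->val = val;
    e->next = ht->table[idx]; /* old head (or NULL) follows the new node */
    ht->table[idx] = e;
    ht->used++;
    return e;
}

/* Hypothetical helper: walk the chain of one bucket until the key matches. */
static dictEntry *ht_find(const dictht *ht, unsigned long hash, const char *key) {
    for (dictEntry *e = ht->table[hash & ht->sizemask]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) return e;
    }
    return NULL;
}
```

With a table of size 4, hash values 2 and 6 both map to index 2, so inserting k1 and then k2 leaves table[2] pointing to k2, whose next pointer leads to k1.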

 

 

4.4 Rehash

As operations are performed on the hash table, the number of key-value pairs it stores grows or shrinks. To keep the load factor of the hash table within a reasonable range, the table must be expanded or shrunk accordingly, and this is done by a rehash operation.

 

4.4.1 Current hash table state

Every bucket in the hash table is already in use, so the table needs to be expanded.

  

4.4.2 Allocating space for the new hash table

Space allocation rules:

For an expansion, the size of ht[1] is the first power of 2 (2^n) that is greater than or equal to ht[0].used * 2.

For a contraction, the size of ht[1] is the first power of 2 (2^n) that is greater than or equal to ht[0].used.

Therefore, in this example, ht[1] is allocated with a size of 8.
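The sizing rule can be sketched as below, assuming the rule given in Redis Design and Implementation: the first power of two that is at least used * 2 when expanding, and at least used when shrinking. The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Smallest power of two that is >= n (returns 1 for n == 0). */
static unsigned long next_power(unsigned long n) {
    assert(n <= ULONG_MAX / 2 + 1); /* otherwise no such power fits */
    unsigned long size = 1;
    while (size < n) size *= 2;
    return size;
}

/* Size of ht[1] when expanding: first 2^n >= used * 2. */
static unsigned long expand_size(unsigned long used) {
    assert(used <= ULONG_MAX / 4); /* keeps used * 2 from overflowing */
    return next_power(used * 2);
}

/* Size of ht[1] when shrinking: first 2^n >= used. */
static unsigned long shrink_size(unsigned long used) {
    return next_power(used);
}
```

For a table with 4 stored nodes, expand_size(4) gives 8, which matches the size allocated in the example above. Keeping the size a power of two is what allows a bit mask to be used for the index.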

  

 

4.4.3 Data transfer

Move all key-value pairs from ht[0] to ht[1]. During the transfer, the hash value and index of every key must be recomputed against the new table.

After the transfer, ht[0] is empty and all key-value pairs are stored in ht[1].

  

 

4.4.4 Releasing ht[0]

Release ht[0], make ht[1] the new ht[0], and create a new empty hash table at ht[1] in preparation for the next rehash.

  

 

4.4.5 Progressive rehash

The steps above move all key-value pairs to ht[1] in one go, which is fine when the amount of data is small. In practice, when the table holds a large number of key-value pairs, the rehash is not done in a single, centralized pass but progressively, over many steps.

Detailed steps of progressive rehash:

1. Allocate space for ht[1] so that the dictionary holds both the ht[0] and ht[1] hash tables.

2. Maintain an index counter variable, rehashidx, in the dictionary and set it to 0 to indicate that the rehash has started.

3. While the rehash is in progress, every add, delete, find, or update operation on the dictionary also moves the key-value pairs in the ht[0] bucket at index rehashidx to ht[1], in addition to performing the requested operation, and then increments rehashidx by one.

4. When all key-value pairs in ht[0] have been moved to ht[1], set rehashidx to -1 to indicate that the rehash has finished.

    

The advantage of progressive rehash is its divide-and-conquer approach: the work of rehashing is spread across many operations, which avoids the heavy computation of a centralized rehash.
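The four steps can be sketched as follows. This is a simplified model and not the Redis implementation: keys are integers that act as their own hash, each step moves exactly one bucket (empty or not), and the rehash finishes once rehashidx reaches the size of ht[0]. The function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct dictEntry {
    unsigned long key;      /* integer keys keep the sketch short */
    struct dictEntry *next;
} dictEntry;

typedef struct dictht {
    dictEntry **table;
    unsigned long size, sizemask, used;
} dictht;

typedef struct dict {
    dictht ht[2];
    int rehashidx;          /* -1 when no rehash is in progress */
} dict;

/* Hypothetical hash function: the key is its own hash value. */
static unsigned long hash_key(unsigned long key) { return key; }

/* Steps 1 and 2: allocate ht[1] and start the rehash at bucket 0. */
static int rehash_start(dict *d, unsigned long new_size) {
    assert(d->rehashidx == -1);
    assert(new_size > 0 && (new_size & (new_size - 1)) == 0);
    d->ht[1].table = calloc(new_size, sizeof(dictEntry *));
    if (d->ht[1].table == NULL) return 0;
    d->ht[1].size = new_size;
    d->ht[1].sizemask = new_size - 1;
    d->ht[1].used = 0;
    d->rehashidx = 0;
    return 1;
}

/* Steps 3 and 4: move one bucket; meant to be called by every dictionary
 * operation while a rehash is in progress. */
static void rehash_step(dict *d) {
    if (d->rehashidx == -1) return;
    dictEntry *e = d->ht[0].table[d->rehashidx];
    while (e != NULL) {
        dictEntry *next = e->next;
        unsigned long idx = hash_key(e->key) & d->ht[1].sizemask;
        e->next = d->ht[1].table[idx]; /* head insertion into the new bucket */
        d->ht[1].table[idx] = e;
        d->ht[0].used--;
        d->ht[1].used++;
        e = next;
    }
    d->ht[0].table[d->rehashidx] = NULL;
    d->rehashidx++;
    if ((unsigned long)d->rehashidx == d->ht[0].size) {
        /* Everything has moved: ht[1] becomes ht[0] and ht[1] is reset. */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        d->ht[1] = (dictht){NULL, 0, 0, 0};
        d->rehashidx = -1;
    }
}
```

With a 4-bucket ht[0], four calls to rehash_step complete the migration, and each call only pays for the entries of a single bucket.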
