Redis is a key-value storage system that is used in more and more systems, mostly as a cache because of its high performance. Thanks to its rich data structures, Redis can also be applied to other scenarios. Redis is a key-value (k-v) NoSQL database. Common kinds of NoSQL databases include k-v databases such as Redis and Memcached, column databases such as the big data component HBase, and document databases such as MongoDB.
Redis has a number of advantages:
(1) High read and write performance: about 100,000 reads per second and about 80,000 writes per second.
(2) Rich data types: the value of a k-v pair can be one of 5 different data types: string, list (queue), hash, set (Sets), and sorted set (Sorted Sets).
(3) Atomicity: Redis executes commands on a single thread, so every individual operation is atomic.
(4) Rich features: Redis supports the publish-subscribe pattern, notifications, key expiration, and more.
(5) Redis Cluster was introduced in Redis 3.0 and can be used for distributed deployment.
Redis data types and their underlying implementation
Redis is written in C. Redis supports 5 data types and stores data in k-v form: K is of type string, and V can be one of 5 different data types (string, list, hash, set, sorted set), each of which has its own application scenarios. Internally, the question is how best to implement these data types. The underlying data structures in Redis are: simple dynamic strings (SDS), linked lists, dictionaries, skip lists (jump tables), integer sets, compressed lists, and objects. Next, let's explore how Redis uses these data structures to implement the 5 kinds of value.
Simple dynamic string (SDS)
The string data type is implemented with SDS. Instead of using the C string representation, Redis builds an abstract type called SDS and uses it as its default string representation.
redis> SET msg "Hello World"
OK
The command above sets a key-value pair with key msg and value "Hello World". In the underlying storage, the key is a string object implemented as an SDS that holds "msg", and the value is a string object implemented as an SDS that holds "Hello World". Note: besides implementing the string type, SDS is also used as the buffer for AOF persistence. The definition of SDS is:
/* Structure that holds a string object */
struct sdshdr {
    // length of the used part of buf
    int len;
    // length of the remaining free space in buf
    int free;
    // data space
    char buf[];
};
Why use SDS:
We might wonder why Redis goes to the trouble of building SDS instead of using C strings directly. The reason is that C represents a string of length n with a character array of n+1 bytes, which makes operations such as getting the length or extending the string inefficient, and which cannot meet Redis's requirements for string safety, efficiency, and functionality.
Getting the string length (O(1) with SDS)
With a C string, getting the length requires traversing the entire string, so the time complexity is O(n). SDS has a dedicated field, len, that holds the string length, so the length can be obtained in O(1) time.
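The difference can be sketched in a few lines of C. The header follows the sdshdr layout shown earlier; sds_new and sds_len are hypothetical helper names chosen for this illustration, not the actual Redis API, and error handling is reduced to an assert.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header modeled on the sdshdr layout described in this article. */
struct sdshdr {
    int len;     /* bytes of buf currently in use */
    int free;    /* unused bytes remaining in buf */
    char buf[];  /* the data, followed by a terminating '\0' */
};

/* Hypothetical helper: build an SDS-style string from initlen bytes of init. */
struct sdshdr *sds_new(const void *init, size_t initlen) {
    struct sdshdr *sh = malloc(sizeof(*sh) + initlen + 1);
    assert(sh != NULL);          /* sketch only: no real error handling */
    sh->len = (int)initlen;
    sh->free = 0;
    memcpy(sh->buf, init, initlen);
    sh->buf[initlen] = '\0';     /* keeps buf usable by C string functions */
    return sh;
}

/* O(1): the length is read from the header, no scan is needed. */
size_t sds_len(const struct sdshdr *sh) {
    return (size_t)sh->len;
}
```

By contrast, strlen(sh->buf) has to walk the buffer until it meets a '\0', which costs O(n) and stops early if the data itself contains a null byte.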
Preventing buffer overflow
C strings easily cause buffer overflows. Suppose a program has two strings, s1 and s2, that are adjacent in memory, where s1 holds "Redis" and s2 holds "MongoDB", as shown in the following figure:
If we now change the contents of s1 to "Redis Cluster" but forget to allocate enough space for s1 first, the following problem occurs:
Because s1 and s2 are adjacent, the data written past the end of s1 overwrites the original contents of s2, so s2 now reads "Cluster" instead of "MongoDB". SDS in Redis eliminates the possibility of this kind of buffer overflow.
When an SDS needs to be modified, Redis checks whether the SDS has enough space before performing the concatenation (the free field records the remaining available space). If the space is insufficient, Redis first expands the SDS and then performs the concatenation.
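A minimal sketch of this check-then-append behavior is shown below. The names sds_new and sds_cat are hypothetical and chosen for this illustration; the sketch grows the buffer by exactly the missing amount and leaves out the pre-allocation policy, so it should not be read as the real Redis code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;     /* bytes of buf currently in use */
    int free;    /* unused bytes remaining in buf */
    char buf[];
};

/* Hypothetical helper: create an SDS-style string with no spare room. */
struct sdshdr *sds_new(const char *init) {
    size_t n = strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + n + 1);
    assert(sh != NULL);                /* sketch only: no real error handling */
    sh->len = (int)n;
    sh->free = 0;
    memcpy(sh->buf, init, n + 1);      /* copies the trailing '\0' as well */
    return sh;
}

/* Hypothetical helper: append t, expanding the buffer first if free is too
 * small. The header may move, so the caller must use the returned pointer. */
struct sdshdr *sds_cat(struct sdshdr *sh, const char *t) {
    size_t add = strlen(t);
    if ((size_t)sh->free < add) {
        /* Not enough room: expand before writing anything. */
        size_t newlen = (size_t)sh->len + add;
        struct sdshdr *grown = realloc(sh, sizeof(*sh) + newlen + 1);
        assert(grown != NULL);
        sh = grown;
        sh->free = (int)add;           /* exactly enough in this sketch */
    }
    memcpy(sh->buf + sh->len, t, add + 1);
    sh->len += (int)add;
    sh->free -= (int)add;
    return sh;
}
```

Calling s1 = sds_cat(sds_new("Redis"), " Cluster") leaves s1 holding "Redis Cluster" with len equal to 13, and no neighboring memory is touched because the write only happens after the space check.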
Reducing the number of memory reallocations caused by growing or shrinking strings
When a C string grows or shrinks, its memory has to be reallocated:
1. Concatenation makes the string grow. The original buffer is very likely smaller than the concatenated string, so if we forget to allocate more space first, the result is a buffer overflow.
2. Truncation makes the string shrink. If the memory is not reallocated after the string is cut, the extra space becomes a memory leak.
SDS avoids this with space pre-allocation. For example, concatenating "Redis" and " Cluster" gives a string of length 5+8=13. The len of the SDS is set to 13 and free is also set to 13, which means space is pre-allocated and the buffer grows to 26 bytes. If another concatenation is performed later and the appended string fits in those 13 free bytes, no memory needs to be reallocated.
With this pre-allocation strategy, the number of memory reallocations needed for N consecutive append operations drops from exactly N to at most N. With lazy space release, SDS also avoids the reallocation otherwise needed when a string is shortened: the freed bytes are kept and recorded in free, which provides an optimization for possible future growth operations.
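The effect of pre-allocation can be checked with a small simulation that only does the bookkeeping and allocates nothing. Both function names are hypothetical. The rule "reserve as much free space as the new length" comes from the example above; the 1 MB cap on the reserved space is an assumption added here and is not stated in this article.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cap on pre-allocated free space (not stated in the article). */
#define PREALLOC_LIMIT (1024 * 1024)

/* Free space to reserve once a string has grown to newlen bytes. */
size_t prealloc_free(size_t newlen) {
    return newlen < PREALLOC_LIMIT ? newlen : PREALLOC_LIMIT;
}

/* Count the reallocations needed to append `chunk` bytes `times` times,
 * with or without pre-allocation. Pure bookkeeping, no memory is touched. */
size_t count_reallocs(size_t chunk, size_t times, int preallocate) {
    assert(chunk > 0);
    size_t len = 0, free_bytes = 0, reallocs = 0;
    for (size_t i = 0; i < times; i++) {
        if (free_bytes < chunk) {
            /* Buffer too small: grow it, optionally reserving extra room. */
            size_t newlen = len + chunk;
            free_bytes = chunk + (preallocate ? prealloc_free(newlen) : 0);
            reallocs++;
        }
        len += chunk;
        free_bytes -= chunk;
    }
    return reallocs;
}
```

Under these assumptions, appending one byte 100 times costs 100 reallocations without pre-allocation and 6 with it, and prealloc_free(13) returns 13, which matches the "Redis Cluster" example.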
Binary safety
The characters in a C string must conform to some encoding, and apart from the end of the string, the string cannot contain null characters; otherwise the first null character the program reads is mistaken for the end of the string. As a result, a C string can only hold text data and cannot hold binary data such as images, audio, video, or compressed files.
In Redis, however, the end of a string is determined not by a null character but by the len attribute. So even if there is a null character in the middle, SDS can still read the data that follows it. At the same time, SDS remains compatible with some of the C string functions.
Linked list
A linked list is one of the ways a list is implemented. When a list contains a large number of elements, or its elements are relatively long strings, Redis uses a linked list as the underlying implementation of the list. This linked list is a doubly linked list:
typedef struct listNode {
    // previous node
    struct listNode *prev;
    // next node
    struct listNode *next;
    // node value
    void *value;
} listNode;
In general, we operate on a linked list through the list structure:
typedef struct list {
    // head node
    listNode *head;
    // tail node
    listNode *tail;
    // number of nodes in the list
    unsigned long len;
    // node value copy function
    void *(*dup)(void *ptr);
    // node value free function
    void (*free)(void *ptr);
    // node value comparison function
    int (*match)(void *ptr, void *key);
} list;
The characteristic of the linked list structure is that elements can be inserted and deleted quickly at the head and tail, but lookup is slow, with O(n) complexity. Because the linked list is one of the underlying implementations of the list type, and lookup in a linked list is relatively expensive, lists do not provide an interface for checking whether an element is in the list.
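The following sketch illustrates why the two ends are cheap and lookup is not. It reuses the listNode and list layout from above (without the dup and free callbacks); the list_* function names and match_str are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;

typedef struct list {
    listNode *head;
    listNode *tail;
    unsigned long len;
    int (*match)(void *ptr, void *key);
} list;

list *list_create(int (*match)(void *ptr, void *key)) {
    list *l = malloc(sizeof(*l));
    assert(l != NULL);               /* sketch only: no real error handling */
    l->head = l->tail = NULL;
    l->len = 0;
    l->match = match;
    return l;
}

/* O(1): only the tail pointer and its neighbor are touched. */
void list_add_tail(list *l, void *value) {
    listNode *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->value = value;
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail) l->tail->next = n; else l->head = n;
    l->tail = n;
    l->len++;
}

/* O(1): only the head pointer and its neighbor are touched. */
void list_add_head(list *l, void *value) {
    listNode *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->value = value;
    n->prev = NULL;
    n->next = l->head;
    if (l->head) l->head->prev = n; else l->tail = n;
    l->head = n;
    l->len++;
}

/* O(n): every node may have to be visited before a match is found. */
listNode *list_search(list *l, void *key) {
    for (listNode *n = l->head; n != NULL; n = n->next) {
        if (l->match ? l->match(n->value, key) : n->value == key) return n;
    }
    return NULL;
}

/* Example match callback that compares values as C strings. */
int match_str(void *ptr, void *key) {
    return strcmp((const char *)ptr, (const char *)key) == 0;
}
```

Storing the match callback in the list itself is what lets the same structure hold values of different types.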
Dictionaries
A dictionary, also known as a symbol table, associative array, or map, is an abstract data structure used to hold key-value pairs.
In a dictionary, each key is associated with a value, and every key in the dictionary is unique. The C language has no such built-in data structure, so Redis builds its own dictionary implementation.
redis> SET msg "Hello World"
OK
Redis's own k-v storage uses the dictionary data structure, and the hash value type is also implemented with it. The hash table dictht is defined as:
typedef struct dictht {
    // hash table array
    dictEntry **table;
    // hash table size
    unsigned long size;
    // hash table size mask, used to compute the index value
    unsigned long sizemask;
    // number of nodes already in the hash table
    unsigned long used;
} dictht;
We can compare this with the implementation of Java's HashMap. In dictht, the elements of the table array have the type:
typedef struct dictEntry {
    // key
    void *key;
    // value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    // next entry in the same bucket, forming a linked list
    struct dictEntry *next;
} dictEntry;
The key is not used directly to locate an entry. A hash algorithm first converts the key into a hash value, and the hash value, combined with sizemask, gives the index of the corresponding position in the table array of dictEntry pointers.
This raises a question: what happens when two keys hash to the same position? Redis resolves hash collisions with separate chaining: entries that land in the same position are linked together through the next pointer. This is similar to the HashMap implementation.
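A minimal sketch of the index computation and of chaining follows. It uses string keys and values instead of void pointers to stay short. The hash function is a simple stand-in (djb2) picked for this example and is not the hash Redis uses; the ht_* names are likewise hypothetical, and the table size is assumed to be a power of two so that sizemask equals size - 1.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct dictEntry {
    const char *key;
    const char *val;
    struct dictEntry *next;   /* next entry in the same bucket */
} dictEntry;

typedef struct dictht {
    dictEntry **table;
    unsigned long size;       /* assumed to be a power of two */
    unsigned long sizemask;   /* size - 1 */
    unsigned long used;
} dictht;

/* Stand-in hash function (djb2); an assumption for this sketch. */
unsigned long hash_str(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

void ht_init(dictht *ht, unsigned long size) {
    ht->table = calloc(size, sizeof(dictEntry *));
    assert(ht->table != NULL);   /* sketch only: no real error handling */
    ht->size = size;
    ht->sizemask = size - 1;
    ht->used = 0;
}

/* Hash the key, mask it into a bucket index, then walk that bucket's chain. */
dictEntry *ht_find(dictht *ht, const char *key) {
    unsigned long idx = hash_str(key) & ht->sizemask;
    for (dictEntry *e = ht->table[idx]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) return e;
    }
    return NULL;
}

/* Update an existing key, or push a new entry at the head of its chain. */
void ht_set(dictht *ht, const char *key, const char *val) {
    dictEntry *e = ht_find(ht, key);
    if (e != NULL) {
        e->val = val;
        return;
    }
    unsigned long idx = hash_str(key) & ht->sizemask;
    e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = key;
    e->val = val;
    e->next = ht->table[idx];    /* colliding entries share one chain */
    ht->table[idx] = e;
    ht->used++;
}
```

With a table of size 2 and three keys, at least two keys must share a bucket, yet all three remain reachable through the chain.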
Note: on top of dictht, Redis abstracts one more layer, the dictionary dict, which is defined as:
typedef struct dict {
    // type-specific functions
    dictType *type;
    // private data
    void *privdata;
    // hash tables
    dictht ht[2];
    // rehash index; -1 when no rehash is in progress
    int rehashidx;
} dict;
The type and privdata properties are set up for different types of key-value pairs, so that polymorphic dictionaries can be created.
The HT property is an array that contains two items (two hash tables), as shown in the figure:
Resolving hash collisions: implemented with separate chaining.
Expansion and rehash: as operations on the hash table continue, the number of key-value pairs it holds gradually changes. To keep the load factor of the hash table within a reasonable range, the hash table has to be expanded or shrunk, and this is done through a rehash (re-hashing) operation. The implementation differs slightly from HashMap: because dict has two dictht hash tables, the data is transferred between these two tables. For example:
In this case the table needs to be expanded, so the data in ht[0] has to be moved to ht[1]. ht[1] is created with a size of 2*ht[0].size, as shown in the following figure:
Then ht[0] is released, ht[1] is set as ht[0], and a blank hash table is assigned to ht[1]:
In fact, the expansion process above is quite similar to the way Java's HashMap expands.
Progressive rehash: in practice, the rehash operation is not completed all at once in a centralized way, but step by step over multiple operations. The advantage of progressive rehash is that it takes a divide-and-conquer approach and avoids the large amount of computation that a centralized rehash would bring.
Steps of progressive rehash:
1. Allocate space for ht[1], so that the dictionary holds both hash tables, ht[0] and ht[1].
2. Maintain an index counter variable, rehashidx, in the dictionary and set it to 0 to indicate that rehash has started.
3. During rehash, every time an add, delete, lookup, or update operation is performed on the dictionary, the program, in addition to the requested operation, also rehashes data from ht[0] to ht[1] and increments rehashidx by one.
4. When all the data in ht[0] has been transferred to ht[1], rehashidx is set to -1, which marks the end of rehash.
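The four steps can be sketched as follows. This is a simplified model, not the Redis code: keys are unsigned integers that serve as their own hash value, the type and function names are hypothetical, and dict_add does not check for duplicate keys. Two details are assumptions not spelled out in the steps above: during a rehash, lookups check both tables, and new entries go straight into ht[1].

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry {
    unsigned long key;        /* the key doubles as its own hash value */
    int val;
    struct entry *next;
} entry;

typedef struct table {
    entry **buckets;
    unsigned long size;       /* power of two */
    unsigned long used;
} table;

typedef struct dict {
    table ht[2];
    long rehashidx;           /* -1 when no rehash is in progress */
} dict;

static void table_init(table *t, unsigned long size) {
    t->buckets = calloc(size, sizeof(entry *));
    assert(t->buckets != NULL);   /* sketch only: no real error handling */
    t->size = size;
    t->used = 0;
}

static void table_reset(table *t) {
    t->buckets = NULL;
    t->size = 0;
    t->used = 0;
}

void dict_init(dict *d, unsigned long size) {
    table_init(&d->ht[0], size);
    table_reset(&d->ht[1]);
    d->rehashidx = -1;
}

/* Steps 1 and 2: allocate ht[1] at twice the size and set rehashidx to 0. */
void dict_start_rehash(dict *d) {
    table_init(&d->ht[1], d->ht[0].size * 2);
    d->rehashidx = 0;
}

/* Step 3: move one bucket from ht[0] to ht[1], then advance rehashidx. */
void dict_rehash_step(dict *d) {
    if (d->rehashidx < 0) return;
    table *src = &d->ht[0], *dst = &d->ht[1];
    entry *e = src->buckets[d->rehashidx];
    while (e != NULL) {
        entry *next = e->next;
        unsigned long idx = e->key & (dst->size - 1);
        e->next = dst->buckets[idx];
        dst->buckets[idx] = e;
        src->used--;
        dst->used++;
        e = next;
    }
    src->buckets[d->rehashidx] = NULL;
    d->rehashidx++;
    /* Step 4: everything moved, so ht[1] becomes ht[0] and rehash ends. */
    if ((unsigned long)d->rehashidx == src->size) {
        free(src->buckets);
        d->ht[0] = d->ht[1];
        table_reset(&d->ht[1]);
        d->rehashidx = -1;
    }
}

entry *dict_find(dict *d, unsigned long key) {
    dict_rehash_step(d);                      /* piggyback one step */
    int tables = d->rehashidx >= 0 ? 2 : 1;   /* check both while rehashing */
    for (int i = 0; i < tables; i++) {
        table *t = &d->ht[i];
        for (entry *e = t->buckets[key & (t->size - 1)]; e; e = e->next) {
            if (e->key == key) return e;
        }
    }
    return NULL;
}

void dict_add(dict *d, unsigned long key, int val) {
    dict_rehash_step(d);                      /* piggyback one step */
    table *t = d->rehashidx >= 0 ? &d->ht[1] : &d->ht[0];
    entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = key;
    e->val = val;
    unsigned long idx = key & (t->size - 1);
    e->next = t->buckets[idx];
    t->buckets[idx] = e;
    t->used++;
}
```

Each call to dict_find or dict_add pays for moving only one bucket, so the cost of the rehash is spread across ordinary operations instead of being paid in one pause.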
Skip list
Redis uses skip lists (also translated as jump tables) in only two places: to implement sorted set keys (Sorted Sets), and as an internal data structure in cluster nodes.
The skip list mainly serves as a replacement for the balanced binary tree. Compared with a balanced tree, a skip list is simpler and more intuitive to implement.
A skip list (skiplist) is an ordered data structure that keeps multiple pointers to other nodes in each node in order to reach nodes quickly. A skip list is a randomized data structure that stores its elements in order in hierarchical linked lists. Its efficiency is comparable to that of a balanced tree: search, deletion, insertion, and similar operations can be completed in O(log n) expected time.
The Redis skip list consists mainly of two parts: zskiplist (the list) and zskiplistNode (the node):
typedef struct zskiplistNode {
    // backward pointer
    struct zskiplistNode *backward;
    // score
    double score;
    // member object
    robj *obj;
    // layers (a flexible array member must come last in a C struct)
    struct zskiplistLevel {
        // forward pointer
        struct zskiplistNode *forward;
        // span
        unsigned int span;
    } level[];
} zskiplistNode;
1. Layers: the level array can contain multiple elements, each of which holds a pointer to another node. Each element of the level array contains a forward pointer, which points to a node in the direction of the tail of the list, and a span, which records the distance between the two nodes.
2. Backward pointer: used to access nodes from the tail of the list toward the head.
3. Score and member: all nodes in the skip list are sorted by score from small to large (the score plays the role that the node key plays in a balanced binary search tree). The member object points to a string object that holds an SDS value (the value actually stored).
typedef struct zskiplist {
    // header and tail nodes
    struct zskiplistNode *header, *tail;
    // number of nodes in the list
    unsigned long length;
    // number of layers of the node with the most layers in the list
    int level;
} zskiplist;
From the structure diagram we can see clearly that header and tail point to the head node and the tail node of the skip list respectively, level records the largest number of layers, and length records the number of nodes.
The skip list is one of the underlying implementations of sorted sets. To summarize:
1. The skip list consists mainly of two structures, zskiplist and zskiplistNode.
2. The layer height of each skip list node is a random number between 1 and 32.
3. In the same skip list, multiple nodes can have the same score, but the member object of each node must be unique.
4. Nodes are sorted by score from small to large; if the scores are the same, they are sorted by the size of the member object.
How does a skip list achieve O(log n) insertion, deletion, update, and lookup?
The principle behind the skip list can be understood together with binary search.
Suppose, as in the figure above, that we are looking for 55. If we simply traverse the list, we have to walk all the way to the end to find it. With an array we could use binary search, but in a linked list we cannot access elements directly by index, so we would normally store the elements in a binary search tree or a balanced tree. We know that the skip list is meant to replace the balanced tree, so how does a skip list query quickly? Look at the figure below:
As the figure shows, we can find 55 in a single step through the 4th layer, and the most time-consuming lookup, 46, requires 6 accesses: L4 visits 55; L3 visits 21 and 55; L2 visits 37 and 55; L1 visits 46. Intuitively, such a structure makes querying an element of an ordered linked list faster. The approach is similar to binary search, and its time complexity is O(log n). Insertion and deletion are also O(log n).
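The layered search can be sketched with a stripped-down skip list. The node here keeps only a score and its forward pointers (no span, backward pointer, or member object), and all sl_* names are hypothetical. sl_insert takes the node height as a parameter so the example is deterministic; a real insert would call something like sl_random_level, whose promotion probability of 1/4 is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LEVEL 32

typedef struct node {
    double score;
    int level;                  /* number of layers in this node */
    struct node *forward[];     /* forward[i]: next node on layer i */
} node;

typedef struct skiplist {
    node *header;               /* sentinel node with MAX_LEVEL layers */
    int level;                  /* highest layer currently in use */
} skiplist;

static node *node_create(int level, double score) {
    node *n = malloc(sizeof(*n) + (size_t)level * sizeof(node *));
    assert(n != NULL);          /* sketch only: no real error handling */
    n->score = score;
    n->level = level;
    for (int i = 0; i < level; i++) n->forward[i] = NULL;
    return n;
}

skiplist *sl_create(void) {
    skiplist *sl = malloc(sizeof(*sl));
    assert(sl != NULL);
    sl->header = node_create(MAX_LEVEL, 0);
    sl->level = 1;
    return sl;
}

/* Random height between 1 and MAX_LEVEL; each extra layer has an assumed
 * probability of 1/4. */
int sl_random_level(void) {
    int level = 1;
    while (level < MAX_LEVEL && (rand() & 3) == 0) level++;
    return level;
}

/* Insert a score with an explicit height (1..MAX_LEVEL). */
void sl_insert(skiplist *sl, double score, int level) {
    node *update[MAX_LEVEL];
    node *x = sl->header;
    for (int i = MAX_LEVEL - 1; i >= 0; i--) {
        while (x->forward[i] && x->forward[i]->score < score) x = x->forward[i];
        update[i] = x;          /* last node before the insert point on layer i */
    }
    node *n = node_create(level, score);
    for (int i = 0; i < level; i++) {
        n->forward[i] = update[i]->forward[i];
        update[i]->forward[i] = n;
    }
    if (level > sl->level) sl->level = level;
}

/* Search from the top layer down; *steps counts the nodes moved onto. */
node *sl_search(const skiplist *sl, double score, int *steps) {
    node *x = sl->header;
    *steps = 0;
    for (int i = sl->level - 1; i >= 0; i--) {
        while (x->forward[i] && x->forward[i]->score <= score) {
            x = x->forward[i];
            (*steps)++;
            if (x->score == score) return x;
        }
    }
    return NULL;
}
```

The search moves right while the next score is not too large and drops one layer otherwise, which is what lets it skip over most of the nodes on the bottom layer. Note that steps counts only the nodes the search moves onto, not every comparison, so it is not the same measure as the access count quoted above.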
We can see that Redis implements the process above by defining exactly this structure. Its maximum height is 32 layers, which is enough for on the order of 2^32 elements, and the search process is similar to the figure above.
Integer set (intset)
The book Redis Design and Implementation defines an integer set as follows: "An integer set is one of the underlying implementations of set keys. When a set contains only integer elements and the number of elements is not large, Redis uses the integer set intset as the underlying implementation of the set."
We can understand an integer set like this: it is really a special kind of set in which the stored data can only be integers and the amount of data cannot be too large.
typedef struct intset {
    // encoding
    uint32_t encoding;
    // number of elements in the set
    uint32_t length;
    // array that holds the elements
    int8_t contents[];
} intset;
An integer set is one of the underlying implementations of set keys.
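As a rough sketch of how such a set can behave, the code below keeps the elements in a sorted, duplicate-free byte array and widens the element type when a new value does not fit. It reuses the intset layout from above, but the function names, the use of the element width in bytes (2, 4, or 8) as the encoding value, and the binary search are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct intset {
    uint32_t encoding;   /* bytes per element in this sketch: 2, 4 or 8 */
    uint32_t length;     /* number of elements */
    int8_t contents[];   /* elements, sorted ascending, no duplicates */
} intset;

/* Smallest element width that can hold v. */
static uint32_t encoding_for(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return 8;
    if (v < INT16_MIN || v > INT16_MAX) return 4;
    return 2;
}

/* Read element pos, interpreting the array with element width enc. */
static int64_t get_at(const intset *is, uint32_t pos, uint32_t enc) {
    if (enc == 8) { int64_t v; memcpy(&v, is->contents + pos * 8, 8); return v; }
    if (enc == 4) { int32_t v; memcpy(&v, is->contents + pos * 4, 4); return v; }
    int16_t v; memcpy(&v, is->contents + pos * 2, 2); return v;
}

/* Write element pos using the set's current encoding. */
static void set_at(intset *is, uint32_t pos, int64_t value) {
    if (is->encoding == 8) { memcpy(is->contents + pos * 8, &value, 8); }
    else if (is->encoding == 4) { int32_t v = (int32_t)value; memcpy(is->contents + pos * 4, &v, 4); }
    else { int16_t v = (int16_t)value; memcpy(is->contents + pos * 2, &v, 2); }
}

intset *intset_new(void) {
    intset *is = malloc(sizeof(*is));
    assert(is != NULL);          /* sketch only: no real error handling */
    is->encoding = 2;
    is->length = 0;
    return is;
}

/* Binary search. Returns 1 if found; *pos is the index or the insert point. */
int intset_search(const intset *is, int64_t value, uint32_t *pos) {
    uint32_t lo = 0, hi = is->length;
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        int64_t cur = get_at(is, mid, is->encoding);
        if (cur == value) { *pos = mid; return 1; }
        if (cur < value) lo = mid + 1; else hi = mid;
    }
    *pos = lo;
    return 0;
}

/* Add value, keeping order and uniqueness; widens the encoding if needed.
 * The set may move, so the caller must use the returned pointer. */
intset *intset_add(intset *is, int64_t value) {
    uint32_t pos;
    if (intset_search(is, value, &pos)) return is;   /* already present */
    uint32_t old_enc = is->encoding, new_enc = encoding_for(value);
    if (new_enc < old_enc) new_enc = old_enc;
    intset *grown = realloc(is, sizeof(*is) + (size_t)(is->length + 1) * new_enc);
    assert(grown != NULL);
    is = grown;
    is->encoding = new_enc;
    /* Rewrite from back to front so that widened elements never overwrite
     * elements not yet read; elements at or after pos shift one slot right. */
    for (uint32_t i = is->length; i > 0; i--) {
        int64_t v = get_at(is, i - 1, old_enc);
        set_at(is, (i - 1 >= pos) ? i : i - 1, v);
    }
    set_at(is, pos, value);
    is->length++;
    return is;
}
```

Starting with 16-bit elements keeps small sets compact, and the price of widening is paid only when a value that needs more room actually arrives.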
The underlying implementation of an integer set is an array that stores the set elements in sorted order without duplicates. When necessary, the program changes the type of the array according to the type of the newly added element.
Compressed list