Inside Redis, the values of every data type are backed by efficient data structures and algorithms, and these same structures are used extensively in the implementation of Redis itself.
This section describes the in-memory data structures and algorithms that Redis uses internally.
Dynamic string
SDS (Simple Dynamic String)
SDS plays two main roles in Redis:
1. It implements the string object (StringObject);
2. It replaces the char* type inside the Redis program.
Compared with C strings, SDS has the following characteristics:
– it can compute the length efficiently (strlen), because the length is stored in the header;
– it can perform append operations efficiently (append);
– it is binary safe.
SDS is optimized for appends: it speeds up append operations and reduces the number of memory allocations, at the cost of using more memory, and that extra memory is not proactively freed.
typedef char *sds;
struct sdshdr {
    // length of buf currently in use
    int len;
    // remaining free length in buf
    int free;
    // where the string data is actually stored
    char buf[];
};
# If the total length of the new string is less than SDS_MAX_PREALLOC,
# allocate twice the required length for the string;
# otherwise, allocate the required length plus SDS_MAX_PREALLOC.
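The preallocation rule can be sketched in C roughly as follows. This is a minimal illustration, not the Redis source: the function names (sds_newlen, sds_catlen, and so on) and the 1 MB value of SDS_MAX_PREALLOC are assumptions made for the example, and error handling is kept to a minimum.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed threshold for this sketch (1 MB). */
#define SDS_MAX_PREALLOC (1024 * 1024)

typedef char *sds;

struct sdshdr {
    int len;   /* bytes of buf in use */
    int free;  /* bytes of buf still available */
    char buf[];
};

/* Recover the header that sits just before the buf pointer. */
struct sdshdr *sds_hdr(const sds s) {
    assert(s != NULL);
    return (struct sdshdr *)(s - sizeof(struct sdshdr));
}

/* Create an SDS holding a copy of init[0..initlen). */
sds sds_newlen(const void *init, size_t initlen) {
    struct sdshdr *sh = malloc(sizeof(struct sdshdr) + initlen + 1);
    if (sh == NULL) return NULL;
    sh->len = (int)initlen;
    sh->free = 0;
    if (initlen && init) memcpy(sh->buf, init, initlen);
    sh->buf[initlen] = '\0';
    return sh->buf;
}

/* O(1): both values are read from the header, no scan for '\0'. */
size_t sds_len(const sds s) { return (size_t)sds_hdr(s)->len; }
size_t sds_avail(const sds s) { return (size_t)sds_hdr(s)->free; }

/* Append t[0..len) to s, growing with the preallocation rule above. */
sds sds_catlen(sds s, const void *t, size_t len) {
    struct sdshdr *sh = sds_hdr(s);
    size_t curlen = (size_t)sh->len;
    if ((size_t)sh->free < len) {
        size_t newlen = curlen + len;
        if (newlen < SDS_MAX_PREALLOC) newlen *= 2;
        else newlen += SDS_MAX_PREALLOC;
        sh = realloc(sh, sizeof(struct sdshdr) + newlen + 1);
        if (sh == NULL) return NULL;
        sh->free = (int)(newlen - curlen);
    }
    memcpy(sh->buf + curlen, t, len);
    sh->len = (int)(curlen + len);
    sh->free -= (int)len;
    sh->buf[curlen + len] = '\0';
    return sh->buf;
}

void sds_free(sds s) {
    if (s != NULL) free(sds_hdr(s));
}
```

Appending "bar" to "foo" allocates 12 bytes of buf, so a later short append fits in the free space without another allocation. Since the length comes from the header, the content may also contain '\0' bytes.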
Doubly linked list
Most C programs implement their own list type, and Redis is no exception. The doubly linked list is also one of the underlying implementations of the Redis list type.
Note: The Redis list type uses two data structures as its underlying implementation:
1. Doubly linked list
2. Compressed list (ziplist)
Because a doubly linked list uses more memory than a compressed list, Redis prefers the compressed list as the underlying implementation when a new list key is created, and converts it to a doubly linked list only when necessary.
Besides implementing the list type, the doubly linked list is also used by many internal Redis modules:
• The transaction module uses a doubly linked list to save the queued commands in order;
• The server module uses a doubly linked list to hold its clients;
• The publish/subscribe module uses a doubly linked list to hold the clients subscribed to patterns;
• The event module uses a doubly linked list to hold time events.
typedef struct list {
    // head pointer
    listNode *head;
    // tail pointer
    listNode *tail;
    // number of nodes
    unsigned long len;
    // copy function
    void *(*dup)(void *ptr);
    // free function
    void (*free)(void *ptr);
    // comparison function
    int (*match)(void *ptr, void *key);
} list;
Redis implements an iterator for the doubly linked list that can traverse the list in both directions.
The performance characteristics of the doubly linked list and its nodes are as follows:
– Each node has a predecessor and a successor pointer, so accessing either neighbor is O(1), and the list can be iterated in both directions, from head to tail and from tail to head;
– The list holds pointers to its head and tail, so operations at the head and tail are O(1);
– The list has an attribute recording the number of nodes, so the node count (length) can be returned in O(1).
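These properties can be illustrated with a small sketch. It is a simplified stand-in rather than the Redis list code: the node layout, the listIter type, and the names list_add_tail, list_iter, and list_next are assumptions for the example, and the dup/free/match callbacks are left out.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;

typedef struct list {
    listNode *head;
    listNode *tail;
    unsigned long len;
} list;

/* Iteration directions (names are assumptions for this sketch). */
enum { ITER_FROM_HEAD = 0, ITER_FROM_TAIL = 1 };

typedef struct listIter {
    listNode *next;
    int direction;
} listIter;

list *list_create(void) {
    list *l = malloc(sizeof(*l));
    if (l == NULL) return NULL;
    l->head = l->tail = NULL;
    l->len = 0;
    return l;
}

/* O(1) append: the tail pointer avoids walking the list. */
list *list_add_tail(list *l, void *value) {
    assert(l != NULL);
    listNode *n = malloc(sizeof(*n));
    if (n == NULL) return NULL;
    n->value = value;
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail != NULL) l->tail->next = n;
    else l->head = n;
    l->tail = n;
    l->len++;  /* keeps the length available in O(1) */
    return l;
}

/* Start an iteration at the head or at the tail. */
listIter list_iter(const list *l, int direction) {
    listIter it;
    it.next = (direction == ITER_FROM_HEAD) ? l->head : l->tail;
    it.direction = direction;
    return it;
}

/* Return the current node and advance; NULL once the list is exhausted. */
listNode *list_next(listIter *it) {
    listNode *cur = it->next;
    if (cur != NULL)
        it->next = (it->direction == ITER_FROM_HEAD) ? cur->next : cur->prev;
    return cur;
}

void list_release(list *l) {
    listNode *n = l->head;
    while (n != NULL) {
        listNode *next = n->next;
        free(n);
        n = next;
    }
    free(l);
}
```

The same list_next function serves both directions; only the starting node and the pointer it follows differ.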
Dictionary
The dictionary, also known as a map or associative array, is widely used in Redis, on a par with SDS and the doubly linked list; virtually every functional module makes use of dictionaries.
The two main uses of the dictionary are:
1. Implementing the database key space;
2. Serving as one of the underlying implementations of hash type keys.
The following two subsections describe each of these uses.
Hash type keys in Redis use the following two data structures as their underlying implementation:
1. Dictionary;
2. Compressed list (ziplist).
Because a compressed list is more memory-efficient than a dictionary, the program uses a compressed list when it creates a new hash key and converts the underlying implementation to a dictionary when needed.
Redis chooses an efficient and easy-to-implement hash table as the underlying implementation of the dictionary.
/*
 * Dictionary
 *
 * Each dictionary uses two hash tables to implement progressive rehash
 */
typedef struct dict {
    // type-specific functions
    dictType *type;
    // private data for the type-specific functions
    void *privdata;
    // hash tables (2 of them)
    dictht ht[2];
    // rehash progress marker; -1 means no rehash is in progress
    int rehashidx;
    // number of safe iterators currently running
    int iterators;
} dict;
Hash Table Implementation
The hash table implementation used by the dictionary is defined by the dict.h/dictht type:
/*
 * Hash table
 */
typedef struct dictht {
    // array of pointers to hash table nodes (commonly called buckets)
    dictEntry **table;
    // size of the pointer array
    unsigned long size;
    // size mask of the pointer array, used to compute index values
    unsigned long sizemask;
    // number of nodes currently in the hash table
    unsigned long used;
} dictht;
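As a small illustration of how a size mask can be used to compute an index: the sketch below assumes the table size is a power of two and that sizemask equals size - 1, so a bitwise AND gives the same result as a modulo. The helper name bucket_index is made up for the example.

```c
#include <assert.h>

/* Map a hash value to a bucket index. Valid only when size is a power of
 * two, so that sizemask == size - 1 has all of its low bits set. */
unsigned long bucket_index(unsigned long hash, unsigned long size) {
    unsigned long sizemask = size - 1;
    assert(size != 0 && (size & sizemask) == 0);
    return hash & sizemask;
}
```

With size 8 the mask is 7, so a hash of 13 lands in bucket 5, the same as 13 % 8.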
Each dictEntry holds a key-value pair and a pointer to another dictEntry structure:
/*
 * Hash table node
 */
typedef struct dictEntry {
    // key
    void *key;
    // value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    // next node in the chain
    struct dictEntry *next;
} dictEntry;
Redis currently uses two different hashing algorithms:
1. The MurmurHash2 algorithm: it offers very good distribution and speed; for details see the MurmurHash homepage: http://code.google.com/p/smhasher/.
2. A case-insensitive hashing algorithm based on the djb algorithm: for details see http://www.cse.yorku.ca/~oz/hash.html.
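The second algorithm can be sketched as below. This is a minimal version of a case-insensitive djb-style hash (hash * 33 + c, with each byte lowercased first); the function name djb_case_hash and the fixed starting value 5381 are assumptions for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive djb-style hash over buf[0..len). */
unsigned int djb_case_hash(const unsigned char *buf, size_t len) {
    unsigned int hash = 5381; /* conventional djb starting value */
    assert(buf != NULL || len == 0);
    while (len--) {
        /* (hash << 5) + hash is hash * 33 */
        hash = ((hash << 5) + hash) + (unsigned int)tolower(*buf++);
    }
    return hash;
}
```

Because every byte passes through tolower, "Key" and "kEY" hash to the same value.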
The collision resolution strategy used by the dictionary's hash table is called separate chaining: entries that hash to the same bucket are linked into a list through their next pointers.
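A toy sketch of this kind of collision handling follows. It is not the Redis dict code: the table is fixed at 4 buckets to force collisions, keys are C strings, and the names table_set, table_find, and toy_hash are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 4 /* tiny power-of-two table to force collisions */

typedef struct entry {
    const char *key;
    int val;
    struct entry *next; /* next entry in the same bucket */
} entry;

typedef struct table {
    entry *buckets[NBUCKETS];
    unsigned long used;
} table;

/* Simple djb-style string hash, used only for this sketch. */
unsigned long toy_hash(const char *key) {
    unsigned long h = 5381;
    while (*key != '\0') h = h * 33 + (unsigned char)*key++;
    return h;
}

/* Return the entry for key, or NULL. Walks only one bucket's chain. */
entry *table_find(table *t, const char *key) {
    entry *e = t->buckets[toy_hash(key) & (NBUCKETS - 1)];
    for (; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) return e;
    return NULL;
}

/* Insert or update. Returns 1 on insert, 0 on update, -1 on failure.
 * New entries go to the head of the chain, which is O(1). */
int table_set(table *t, const char *key, int val) {
    entry *e = table_find(t, key);
    if (e != NULL) {
        e->val = val;
        return 0;
    }
    e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    unsigned long idx = toy_hash(key) & (NBUCKETS - 1);
    e->key = key;
    e->val = val;
    e->next = t->buckets[idx];
    t->buckets[idx] = e;
    t->used++;
    return 1;
}

/* Free every entry and reset the table. */
void table_clear(table *t) {
    for (int i = 0; i < NBUCKETS; i++) {
        entry *e = t->buckets[i];
        while (e != NULL) {
            entry *next = e->next;
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
    t->used = 0;
}
```

With six keys and four buckets at least two keys must share a bucket, yet each key stays reachable by walking its bucket's chain.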
One difference between shrinking and expanding a dictionary is:
• Expansion is triggered automatically (whether it is an automatic or a forced expansion);
• Shrinking is performed manually by the program.
• A dictionary is an abstract data structure made up of key-value pairs.
• The database and hash keys in Redis are both built on dictionaries.
• The underlying implementation of the Redis dictionary is a hash table. Each dictionary has two hash tables; normally only table 0 is used, and tables 0 and 1 are used together only while a rehash is in progress.
• Hash tables use separate chaining to resolve key collisions.
• Rehash can be used to expand or shrink a hash table.
• The rehash of a hash table is carried out progressively, in multiple steps.
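The idea of a progressive rehash over two tables can be sketched as follows. This is a simplified model under stated assumptions, not the Redis implementation: keys are plain unsigned long values that double as their own hash, each step migrates exactly one non-empty bucket, and the names dict_start_expand, dict_rehash_step, dict_add, and dict_find are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry { unsigned long key; struct entry *next; } entry;
typedef struct htable { entry **table; unsigned long size, sizemask, used; } htable;
typedef struct dict { htable ht[2]; int rehashidx; /* -1: not rehashing */ } dict;

static void ht_init(htable *h, unsigned long size) {
    h->table = size ? calloc(size, sizeof(entry *)) : NULL;
    h->size = size;
    h->sizemask = size ? size - 1 : 0;
    h->used = 0;
}

/* size must be a power of two; allocation failures are not handled here. */
void dict_init(dict *d, unsigned long size) {
    ht_init(&d->ht[0], size);
    ht_init(&d->ht[1], 0);
    d->rehashidx = -1;
}

/* Allocate table 1 at twice the size; entries migrate later, step by step. */
void dict_start_expand(dict *d) {
    assert(d->rehashidx == -1);
    ht_init(&d->ht[1], d->ht[0].size * 2);
    d->rehashidx = 0;
}

/* Move one non-empty bucket from table 0 to table 1.
 * Returns 1 while work remains, 0 once the rehash is finished. */
int dict_rehash_step(dict *d) {
    if (d->rehashidx == -1) return 0;
    htable *src = &d->ht[0], *dst = &d->ht[1];
    while (src->used > 0 && src->table[d->rehashidx] == NULL) d->rehashidx++;
    if (src->used > 0) {
        entry *e = src->table[d->rehashidx];
        while (e != NULL) {
            entry *next = e->next;
            unsigned long idx = e->key & dst->sizemask;
            e->next = dst->table[idx];
            dst->table[idx] = e;
            src->used--;
            dst->used++;
            e = next;
        }
        src->table[d->rehashidx++] = NULL;
    }
    if (src->used == 0) { /* done: table 1 becomes the new table 0 */
        free(src->table);
        d->ht[0] = d->ht[1];
        ht_init(&d->ht[1], 0);
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}

/* While rehashing, new keys go to table 1 so table 0 only ever shrinks. */
int dict_add(dict *d, unsigned long key) {
    htable *h = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    entry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    e->key = key;
    e->next = h->table[key & h->sizemask];
    h->table[key & h->sizemask] = e;
    h->used++;
    return 0;
}

/* Lookups check table 0, and also table 1 while a rehash is in progress. */
entry *dict_find(dict *d, unsigned long key) {
    for (int i = 0; i < 2; i++) {
        htable *h = &d->ht[i];
        if (h->size != 0)
            for (entry *e = h->table[key & h->sizemask]; e != NULL; e = e->next)
                if (e->key == key) return e;
        if (d->rehashidx == -1) break;
    }
    return NULL;
}
```

Spreading the migration over many small steps means no single operation pays for moving the whole table, which is the point of making the rehash progressive.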
Skip list
A skip list's efficiency is comparable to that of a balanced tree: lookup, deletion, insertion, and similar operations complete in expected logarithmic time, and a skip list is much simpler and more straightforward to implement than a balanced tree.
• Header (head): maintains the node pointers of the skip list.
• Skip list node: holds the element value and multiple layers.
• Layer: holds pointers to other elements. A pointer on a higher layer skips over at least as many elements as a pointer on a lower layer. To make lookups efficient, the program always starts at the top layer and gradually descends to lower layers as the range of candidate values narrows.
• Footer: consists entirely of NULL and marks the end of the skip list.
Looking at the figure:
1) Search is simple: for example, to find 5, the first layer does not locate it, the second layer narrows it to between 4 and 6, and one more layer down finds 5.
2) What about the insertion algorithm? I am still not sure how it is done.
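For reference, search and insertion in a textbook skip list can be sketched as below. This follows the general scheme of Pugh's paper rather than the Redis code: integer values, a fixed MAX_LEVEL of 8, a coin-flip probability of 1/2, and the names sl_insert and sl_contains are all assumptions for the example. Insertion is a search that remembers, on every layer, the last node visited before the insert position, and then splices the new node in after those nodes.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LEVEL 8

typedef struct slnode {
    int value;
    struct slnode *forward[MAX_LEVEL]; /* forward[i]: next node on layer i */
} slnode;

typedef struct skiplist {
    slnode header; /* sentinel; holds no value */
    int level;     /* number of layers currently in use */
} skiplist;

void sl_init(skiplist *sl) {
    for (int i = 0; i < MAX_LEVEL; i++) sl->header.forward[i] = NULL;
    sl->header.value = 0;
    sl->level = 1;
}

/* Coin flips: each extra layer is kept with probability 1/2. */
static int random_level(void) {
    int lvl = 1;
    while (lvl < MAX_LEVEL && (rand() & 1)) lvl++;
    return lvl;
}

/* Start at the top layer, move right while the next value is smaller,
 * then drop one layer; the candidate is the next node on layer 0. */
int sl_contains(const skiplist *sl, int value) {
    const slnode *x = &sl->header;
    for (int i = sl->level - 1; i >= 0; i--)
        while (x->forward[i] != NULL && x->forward[i]->value < value)
            x = x->forward[i];
    x = x->forward[0];
    return x != NULL && x->value == value;
}

/* Returns 0 on success, -1 if allocation fails. Duplicates are allowed. */
int sl_insert(skiplist *sl, int value) {
    slnode *update[MAX_LEVEL]; /* last node before the position, per layer */
    slnode *n = malloc(sizeof(*n));
    if (n == NULL) return -1;
    slnode *x = &sl->header;
    for (int i = sl->level - 1; i >= 0; i--) {
        while (x->forward[i] != NULL && x->forward[i]->value < value)
            x = x->forward[i];
        update[i] = x;
    }
    int lvl = random_level();
    /* Layers not used so far start directly at the header. */
    for (int i = sl->level; i < lvl; i++) update[i] = &sl->header;
    if (lvl > sl->level) sl->level = lvl;
    n->value = value;
    for (int i = 0; i < MAX_LEVEL; i++) n->forward[i] = NULL;
    for (int i = 0; i < lvl; i++) { /* splice in on each of its layers */
        n->forward[i] = update[i]->forward[i];
        update[i]->forward[i] = n;
    }
    return 0;
}
```

Whatever heights the random levels produce, layer 0 always links every node in sorted order, so a walk along forward[0] visits the values in ascending order.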
The only role of the skip list in Redis is to implement the sorted set data type.
The skip list uses the sorted set's score together with a pointer to the member field as its element, and sorts the sorted set's elements using the score as the index.
To suit its own needs, Redis modifies the skip list described in William Pugh's paper as follows:
1. Scores can be duplicated.
2. Comparing elements requires checking both the score and the member.
3. Each node has a backward pointer, only on layer 1, used to iterate from the tail toward the head.
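Points 2 and 3 can be illustrated with a short sketch. The znode layout, the use of strcmp on C-string members, and the function names are assumptions for the example; they stand in for whatever comparison Redis applies to its member objects.

```c
#include <assert.h>
#include <string.h>

typedef struct znode {
    double score;
    const char *member;
    struct znode *backward; /* single backward pointer, lowest layer only */
} znode;

/* Order by score first; equal scores are ordered by member, which is
 * what makes duplicate scores workable. */
int znode_cmp(double s1, const char *m1, double s2, const char *m2) {
    if (s1 < s2) return -1;
    if (s1 > s2) return 1;
    return strcmp(m1, m2);
}

/* Walk from the tail toward the head through the backward pointers. */
unsigned long count_backward(const znode *tail) {
    unsigned long n = 0;
    for (const znode *x = tail; x != NULL; x = x->backward) n++;
    return n;
}
```

Two elements with the same score still have a well-defined order as long as their members differ.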
Redisbook notes: Redis internal data structures