DICT Data Structure
Dict is really a hash table, but Redis already has a data structure called "Hash", so the hash table was named Dict instead.
Dict is the soul of Redis's key-value processing. No matter how large the data set is, operations stay O(1) on average (leaving aside long chains under a single bucket).
All globally stored keys live in a Dict.
Other data structures, such as set and hash, may also use Dict.
Dict is implemented in dict.h and dict.c.
Its types are defined as follows:
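1. dict: a standalone dict structure, the one exposed for external use.
    typedef struct dict {
        dictType *type;
        void *privdata;
        dictht ht[2];
        int rehashidx;   /* -1 when no rehash is in progress */
        int iterators;   /* number of iterators currently running */
    } dict;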
2. dictht: a standalone hash table container for internal use. External programs should not operate on this structure directly.
    typedef struct dictht {
        dictEntry **table;
        unsigned long size;
        unsigned long sizemask;
        unsigned long used;
    } dictht;
3. dictEntry: a data node. It is a key-value pair that also holds a next pointer.
    typedef struct dictEntry {
        void *key;
        void *val;
        struct dictEntry *next;
    } dictEntry;
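4. dictType: defines a set of callback functions for operating on data nodes.
    typedef struct dictType {
        unsigned int (*hashFunction)(const void *key);                          // hash the key
        void *(*keyDup)(void *privdata, const void *key);                       // copy key
        void *(*valDup)(void *privdata, const void *obj);                       // copy val
        int (*keyCompare)(void *privdata, const void *key1, const void *key2);  // compare keys
        void (*keyDestructor)(void *privdata, void *key);                       // destroy key
        void (*valDestructor)(void *privdata, void *obj);                       // destroy val
    } dictType;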
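To make the role of the callback table in item 4 more concrete, here is a minimal sketch of the idea. It is not Redis code: key_ops, hash_str, same_key and the two ops tables are hypothetical names, and the struct keeps only a hash and a comparison callback. It assumes keys are NUL-terminated C strings.

```c
#include <ctype.h>
#include <string.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical, cut-down analogue of dictType: only hashing and comparison. */
typedef struct key_ops {
    unsigned int (*hash)(const void *key);
    int (*equal)(const void *a, const void *b);
} key_ops;

/* djb2-style string hash; fold_case != 0 lowercases each byte first. */
static unsigned int hash_str(const char *s, int fold_case) {
    unsigned int h = 5381;
    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;
        h = h * 33 + (fold_case ? (unsigned int)tolower(c) : c);
    }
    return h;
}

static unsigned int hash_exact(const void *k) { return hash_str(k, 0); }
static unsigned int hash_nocase(const void *k) { return hash_str(k, 1); }
static int equal_exact(const void *a, const void *b) { return strcmp(a, b) == 0; }
static int equal_nocase(const void *a, const void *b) { return strcasecmp(a, b) == 0; }

/* Two "types" of table behaviour, selected purely by the callback set. */
static const key_ops exact_ops = { hash_exact, equal_exact };
static const key_ops nocase_ops = { hash_nocase, equal_nocase };

/* Generic table code only ever goes through the callbacks. */
static int same_key(const key_ops *ops, const void *a, const void *b) {
    return ops->hash(a) == ops->hash(b) && ops->equal(a, b);
}
```

With this layout, same_key(&exact_ops, "GET", "get") treats the two keys as different, while same_key(&nocase_ops, "GET", "get") treats them as the same key. The table logic itself never changes; only the callback set does.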
DICT operations
In Redis, dict is a standard "bucket + chain" (separate chaining) hash table.
There is no more sophisticated processing on top of that.
In particular, nothing prevents a chain from growing too long because of hash collisions.
Someone who carefully crafts a set of colliding keys and sends them to Redis can easily bring it down.
Therefore, if your Redis service is open to users, don't use the stock source code as-is. Change the hash function first!
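The bucket-plus-chain layout, and the way colliding keys pile up in one chain, can be illustrated with a small sketch. This is not the Redis implementation: node, chained_table, weak_hash and the chain_* functions are hypothetical names, the table has a fixed size, and the hash function is deliberately weak (a byte sum) so that collisions are easy to construct.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 8 /* power of two, so "hash & (NBUCKETS - 1)" selects a bucket */

typedef struct node {
    const char *key;   /* not copied; the caller keeps the string alive */
    int val;
    struct node *next; /* next node in the same bucket's chain */
} node;

typedef struct chained_table {
    node *buckets[NBUCKETS];
} chained_table;

/* Deliberately weak hash: any permutation of the same bytes collides. */
static unsigned int weak_hash(const char *key) {
    unsigned int h = 0;
    while (*key) h += (unsigned char)*key++;
    return h;
}

/* Walk the chain under the key's bucket; cost grows with chain length. */
static node *chain_find(chained_table *t, const char *key) {
    node *n = t->buckets[weak_hash(key) & (NBUCKETS - 1)];
    while (n && strcmp(n->key, key) != 0) n = n->next;
    return n;
}

/* Insert at the head of the chain, or overwrite an existing key. */
static void chain_set(chained_table *t, const char *key, int val) {
    node *n = chain_find(t, key);
    if (n) { n->val = val; return; }
    n = malloc(sizeof(*n));
    if (!n) abort();
    unsigned int idx = weak_hash(key) & (NBUCKETS - 1);
    n->key = key;
    n->val = val;
    n->next = t->buckets[idx];
    t->buckets[idx] = n;
}

/* Length of the chain that a given key hashes into. */
static int chain_length(chained_table *t, const char *key) {
    int len = 0;
    for (node *n = t->buckets[weak_hash(key) & (NBUCKETS - 1)]; n; n = n->next)
        len++;
    return len;
}
```

Inserting the six permutations of "abc" puts all six nodes into one chain, so every lookup for one of them degrades to a linear scan. That is the attack described above in miniature; a stronger or secret-keyed hash function makes such key sets much harder to construct.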
Redis uses two dictht structures so that data can be migrated incrementally, which avoids blocking for too long during rehash.
memcached already uses the same approach, except that it runs the rehash in a dedicated thread.
By comparison, the thread-free approach needs no locking and leaves less room for bugs.
Namespace
Dict in Redis is divided into two types:
1. System-level Dict, with a global namespace, is defined as follows:
    typedef struct redisDb {
        dict *dict;
        dict *expires;
        dict *blocking_keys;
        dict *io_keys;
        dict *watched_keys;
        int id;
    } redisDb;
2. Application-level Dict, owned and maintained by individual data structures, mainly the dict inside some set and hash structures.
For example:
Rehash
Rehash is triggered when either of the following conditions is met.
    // when the effective space usage is < 10%
    int htNeedsResize(dict *dict) {
        long long size, used;

        size = dictSlots(dict);
        used = dictSize(dict);
        return (size && used && size > DICT_HT_INITIAL_SIZE &&
                (used*100/size < REDIS_HT_MINFILL));
    }
    // when the effective space usage is >= 100%
    static int _dictExpandIfNeeded(dict *d)
    {
        ......
        if (d->ht[0].used >= d->ht[0].size &&
            (dict_can_resize ||
             d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
        {
            return dictExpand(d, ((d->ht[0].size > d->ht[0].used) ?
                                  d->ht[0].size : d->ht[0].used)*2);
        }
        return DICT_OK;
    }
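To see what the two conditions mean in numbers, here is a small worked sketch that restates them as pure functions. The function names and the constant values (initial size 4, minimum fill 10%, forced-resize ratio 5) are assumptions made for illustration; check them against the Redis version you are reading. The size == 0 guard is also an assumption, standing in for the elided part of the original function.

```c
/* Assumed values, for illustration only. */
#define HT_INITIAL_SIZE 4
#define HT_MINFILL 10          /* percent */
#define FORCE_RESIZE_RATIO 5

/* Shrink when fewer than HT_MINFILL percent of the slots are in use. */
static int should_shrink(unsigned long size, unsigned long used) {
    return size && used && size > HT_INITIAL_SIZE &&
           (used * 100 / size < HT_MINFILL);
}

/* Grow when the table is full and either resizing is allowed, or the
 * load factor (integer division) has exceeded the forced-resize ratio. */
static int should_grow(unsigned long size, unsigned long used, int can_resize) {
    if (size == 0) return 1; /* assumption: an empty table is always allocated */
    return used >= size && (can_resize || used / size > FORCE_RESIZE_RATIO);
}
```

For example, a table with 1024 slots and 50 entries has a fill of 50 * 100 / 1024 = 4 percent, so it qualifies for shrinking. A table with 16 slots and 16 entries grows as long as resizing is allowed. When resizing is not allowed, the same 16-slot table only grows once it holds 96 entries, because 80 / 16 = 5 is not greater than 5 while 96 / 16 = 6 is.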
Once one of these conditions is met, rehash begins.
However, rehash is very expensive. Especially once the capacity exceeds 10 million entries, a full rehash usually takes tens of seconds (depending on machine performance).
Therefore, Redis uses progressive rehash, splitting the work into small steps so that it does not block responses to users.
Different Rehash policies are used based on different Dict types:
1. The global DICT structure (that is, the keys in the global namespace) is rehashed periodically, for 1 ms each time.
In addition, this rehash runs without interference from the SafeIterator described later. (Code that iterates over the global dict still uses Safe mode so that the iteration itself is not disturbed, so you will see many SafeIterators on the global dict in the source code; keep this in mind when reading it.)
After all, the global dict is the important one: however tight things get, 1 ms has to be squeezed out for it, and nothing is allowed to get in its way.
2. For application-level DICT structures (some user-defined DICTs), Redis adopts a lazy rehash policy.
Lazy rehash means that the more a dict is used, the faster its rehash proceeds; the less it is used, the slower it proceeds.
What counts as "use"?
Simply put, any add, delete, or lookup counts. The corresponding functions in the source code are dictAdd, dictGenericDelete, dictFind, and dictGetRandomKey, each of which calls _dictRehashStep to advance the rehash.
But don't get too excited: each call performs only one step. So, keep at it ~~
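Both policies can be sketched on top of one small step function. The following is a simplified model, not the Redis source: toy_dict, table, entry and all toy_* functions are hypothetical names, keys are plain integers used directly as hash values, table sizes are assumed to be powers of two, and freeing the entries is omitted. toy_rehash_ms models the time-bounded policy for the global dict, while toy_add and toy_find model the lazy policy by performing one step per operation.

```c
#include <stdlib.h>
#include <sys/time.h>

typedef struct entry { unsigned long key; struct entry *next; } entry;
typedef struct table { entry **buckets; unsigned long size, used; } table;
typedef struct toy_dict {
    table ht[2];    /* ht[0]: current table, ht[1]: target table while rehashing */
    long rehashidx; /* next ht[0] bucket to migrate, -1 when not rehashing */
} toy_dict;

static void table_alloc(table *t, unsigned long size) {
    t->buckets = calloc(size, sizeof(entry *));
    if (!t->buckets) abort();
    t->size = size;
    t->used = 0;
}

static void toy_init(toy_dict *d, unsigned long size) {
    table_alloc(&d->ht[0], size);
    d->ht[1].buckets = NULL;
    d->ht[1].size = d->ht[1].used = 0;
    d->rehashidx = -1;
}

static void toy_start_rehash(toy_dict *d, unsigned long new_size) {
    table_alloc(&d->ht[1], new_size);
    d->rehashidx = 0;
}

/* Migrate up to n non-empty buckets. Returns 1 if work remains, 0 if done. */
static int toy_rehash(toy_dict *d, int n) {
    if (d->rehashidx == -1) return 0;
    while (n--) {
        if (d->ht[0].used == 0) { /* old table drained: ht[1] becomes ht[0] */
            free(d->ht[0].buckets);
            d->ht[0] = d->ht[1];
            d->ht[1].buckets = NULL;
            d->ht[1].size = d->ht[1].used = 0;
            d->rehashidx = -1;
            return 0;
        }
        while (d->ht[0].buckets[d->rehashidx] == NULL) d->rehashidx++;
        entry *e = d->ht[0].buckets[d->rehashidx];
        while (e) { /* move the whole chain into the new table */
            entry *next = e->next;
            unsigned long idx = e->key & (d->ht[1].size - 1);
            e->next = d->ht[1].buckets[idx];
            d->ht[1].buckets[idx] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].buckets[d->rehashidx++] = NULL;
    }
    return 1;
}

static long long now_ms(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* Time-bounded policy: rehash in batches of 100 steps until about ms elapsed. */
static int toy_rehash_ms(toy_dict *d, int ms) {
    long long start = now_ms();
    int steps = 0;
    while (toy_rehash(d, 100)) {
        steps += 100;
        if (now_ms() - start > ms) break;
    }
    return steps;
}

/* Lazy policy: every operation pays for exactly one rehash step. */
static void toy_add(toy_dict *d, unsigned long key) {
    if (d->rehashidx != -1) toy_rehash(d, 1);
    table *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    entry *e = malloc(sizeof(*e));
    if (!e) abort();
    unsigned long idx = key & (t->size - 1);
    e->key = key;
    e->next = t->buckets[idx];
    t->buckets[idx] = e;
    t->used++;
}

static int toy_find(toy_dict *d, unsigned long key) {
    if (d->rehashidx != -1) toy_rehash(d, 1);
    for (int i = 0; i < 2; i++) { /* during rehash a key may be in either table */
        table *t = &d->ht[i];
        for (entry *e = t->buckets[key & (t->size - 1)]; e; e = e->next)
            if (e->key == key) return 1;
        if (d->rehashidx == -1) break;
    }
    return 0;
}
```

Note two consequences of this design. While a rehash is running, new keys go only into ht[1], so ht[0] can only shrink and the migration is guaranteed to finish. Lookups, on the other hand, have to consult both tables until the swap has happened.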
Iterator
Because the internal structure of Dict is complex, it needs to provide an iterator for traversing all of its data.
Dict provides two types of Iterator:
1. dictGetIterator: an ordinary iterator. You must not modify the dict while iterating with it; otherwise entries may be skipped or returned twice.
2. dictGetSafeIterator: a safe iterator, which allows any operation on the dict during iteration. It's safe, you know.
I won't go into more detail on this point.
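For readers who want a concrete picture of how a safe iterator can coexist with lazy rehash, here is a toy model. It assumes that a safe iterator simply pins the dict by incrementing a counter, and that the per-operation rehash step is skipped while that counter is non-zero. All names (toy_dict, toy_iter, toy_iter_get, toy_iter_next, toy_iter_release, toy_rehash_step) are hypothetical, and the actual traversal and bucket migration are omitted.

```c
typedef struct toy_dict {
    long rehashidx; /* next bucket to migrate, -1 when no rehash is running */
    int iterators;  /* number of safe iterators currently in use */
} toy_dict;

typedef struct toy_iter {
    toy_dict *d;
    int safe;    /* 1 for a safe iterator, 0 for an ordinary one */
    int started; /* set on the first call to toy_iter_next */
} toy_iter;

static toy_iter toy_iter_get(toy_dict *d, int safe) {
    toy_iter it = { d, safe, 0 };
    return it;
}

/* A safe iterator registers itself with the dict on first use. */
static void toy_iter_next(toy_iter *it) {
    if (!it->started) {
        it->started = 1;
        if (it->safe) it->d->iterators++;
    }
    /* advancing to the next entry is omitted in this model */
}

static void toy_iter_release(toy_iter *it) {
    if (it->started && it->safe) it->d->iterators--;
}

/* The lazy rehash step is suppressed while any safe iterator is active.
 * Returns 1 if a step was performed, 0 otherwise. */
static int toy_rehash_step(toy_dict *d) {
    if (d->rehashidx == -1 || d->iterators > 0) return 0;
    d->rehashidx++; /* stands in for migrating one bucket */
    return 1;
}
```

In this model an ordinary iterator does not stop the rehash, so entries can move between tables underneath it, which is how entries end up skipped or visited twice. A safe iterator freezes the lazy rehash until it is released.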
DictType
DictType defines how a dict behaves. Redis predefines a set of dictType instances that govern the operations of the various kinds of dict.
The related code is as follows: