Redis' memory storage structure is one large dictionary, which is what we usually call a hash table. As a cache, Redis can hold anywhere from tens of thousands of records up to tens of millions or even hundreds of millions (depending on available memory), which shows how powerful it is in that role. The core data structure behind this is the dictionary (dict). As the data volume grows, hash(key) collisions become more likely: if the dict is not large enough, the probability of collision increases, more and more elements pile up in a single hash bucket, and query efficiency degrades. Conversely, if the data volume keeps shrinking, say from tens of millions of records down to tens of thousands, the dict's memory is needlessly wasted. Redis's dict design fully accounts for both expansion and contraction, implementing a process called rehash.
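For reference, the structures involved look roughly like the following (simplified from Redis's dict.h; the exact field layout varies a little between versions). A dict holds two hash tables, ht[0] and ht[1]; the second one only comes into play during a rehash:

typedef struct dictEntry {
    void *key;
    void *val;                /* simplified: the real struct uses a union here */
    struct dictEntry *next;   /* chains entries that collide in the same bucket */
} dictEntry;

typedef struct dictht {
    dictEntry **table;        /* array of buckets */
    unsigned long size;       /* number of buckets, always a power of two */
    unsigned long sizemask;   /* size - 1, used to map a hash to a bucket */
    unsigned long used;       /* number of stored entries */
} dictht;

typedef struct dict {
    dictht ht[2];             /* ht[1] is used only while rehashing */
    int rehashidx;            /* -1 when no rehash is in progress */
    /* type/privdata/iterator fields omitted for brevity */
} dict;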
There are two conditions that trigger a dict rehash (a minimal sketch of both checks follows the list):
1) Expansion: divide the total number of elements by the number of dict buckets to get the average number of elements per bucket (pre_num). If pre_num > dict_force_resize_ratio, a forced expansion is triggered (dict_force_resize_ratio = 5). Under normal conditions, when dict_can_resize is true, expansion already starts once the element count reaches the bucket count.
2) Contraction: if total elements * 10 < number of buckets, i.e. the fill rate drops below 10%, the dict shrinks, bringing total / bk_num back close to 1:1.
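As a minimal illustration (a hypothetical helper, not part of Redis's source), the two triggers can be written as predicates over the element count and bucket count, mirroring the logic of _dictExpandIfNeeded and htNeedsResize shown later:

#include <stdbool.h>

#define DICT_FORCE_RESIZE_RATIO 5
#define DICT_HT_INITIAL_SIZE 4

/* Hypothetical sketch: should the table grow? */
static bool needs_expand(unsigned long used, unsigned long size, bool can_resize) {
    if (size == 0) return true;   /* empty table: allocate the initial size */
    return used >= size &&
           (can_resize || used / size > DICT_FORCE_RESIZE_RATIO);
}

/* Hypothetical sketch: should the table shrink? (fill rate below 10%) */
static bool needs_shrink(unsigned long used, unsigned long size) {
    return size > DICT_HT_INITIAL_SIZE && used * 100 / size < 10;
}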
dict rehash expansion process:
Source code call chain and analysis:
dictAddRaw -> _dictKeyIndex -> _dictExpandIfNeeded -> dictExpand. This call chain is the path through which a dict expansion is triggered when a key is added.
_dictKeyIndex function code:
static int _dictKeyIndex(dict *d, const void *key)
{
    unsigned int h, idx, table;
    dictEntry *he;

    // Expand the dictionary if needed
    if (_dictExpandIfNeeded(d) == DICT_ERR)
        return -1;
    // Calculate the hash value of the key
    h = dictHashKey(d, key);
    // Look for the given key in both hash tables
    for (table = 0; table <= 1; table++) {
        // Use the hash value and the table's sizemask to
        // compute which index of the table array the key may occupy
        idx = h & d->ht[table].sizemask;
        // Search the bucket's linked list for the given key.
        // Since a bucket usually holds one element, or a very small number,
        // this operation can be treated as O(1)
        he = d->ht[table].table[idx];
        while (he) {
            // The key already exists
            if (dictCompareKeys(d, key, he->key))
                return -1;
            he = he->next;
        }
        // Reaching this point on the first iteration means d->ht[0] has been
        // searched. If the hash table is not rehashing, there is no need to
        // search d->ht[1].
        if (!dictIsRehashing(d)) break;
    }
    return idx;
}
_dictExpandIfNeeded function code analysis:
static int _dictExpandIfNeeded(dict *d)
{
    // Already in the middle of a progressive rehash; return directly
    if (dictIsRehashing(d)) return DICT_OK;
    // If the hash table is empty, expand it to the initial size
    // O(N)
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);
    // If the number of used nodes >= the size of the hash table,
    // and either of the following conditions holds:
    //   1) dict_can_resize is true
    //   2) the ratio of used nodes to table size is greater than
    //      dict_force_resize_ratio
    // then call dictExpand to expand the hash table.
    // The expanded size is at least twice the number of used nodes.
    // O(N)
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used / d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used * 2);
    }
    return DICT_OK;
}
The serverCron function is Redis's heartbeat function. The part of it that calls tryResizeHashTables is:
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    ....
    if (server.rdb_child_pid == -1 && server.aof_child_pid == -1) {
        // Keep the hash table's fill ratio close to 1:1
        tryResizeHashTables();
        if (server.activerehashing) incrementallyRehash(); // The rehash action
    }
    ....
}
Code analysis of tryResizeHashTables function:
void tryResizeHashTables(void) {
    int j;

    for (j = 0; j < server.dbnum; j++) {
        // Shrink the key space dictionary
        if (htNeedsResize(server.db[j].dict))
            dictResize(server.db[j].dict);
        // Shrink the expiration time dictionary
        if (htNeedsResize(server.db[j].expires))
            dictResize(server.db[j].expires);
    }
}
The htNeedsResize function decides whether the dict needs to shrink: the fill rate must stay above 10% (REDIS_HT_MINFILL); if it falls below that, the table is shrunk. The code is as follows:
int htNeedsResize(dict *dict) {
    long long size, used;

    // Hash table size
    size = dictSlots(dict);
    // Number of used nodes in the hash table
    used = dictSize(dict);
    // Return 1 when the hash table size is greater than DICT_HT_INITIAL_SIZE
    // and the dictionary's fill rate is below REDIS_HT_MINFILL
    return (size && used && size > DICT_HT_INITIAL_SIZE &&
            (used * 100 / size < REDIS_HT_MINFILL));
}
dictResize function code:
int dictResize(dict *d)
{
    int minimal;

    // Cannot be called when dict_can_resize is false,
    // or while the dictionary is rehashing
    if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;
    minimal = d->ht[0].used;
    if (minimal < DICT_HT_INITIAL_SIZE)
        minimal = DICT_HT_INITIAL_SIZE;
    return dictExpand(d, minimal);
}
Both of the paths above finally call the dictExpand function. This function mainly allocates a new hash table (dictht) and sets dict.rehashidx = 0, which marks the start of the rehash action. The rehash itself re-inserts the data of ht[0] into ht[1] according to the new table's hashing rules. The code is as follows:
int dictExpand(dict *d, unsigned long size)
{
    dictht n; /* The new hash table that data will be migrated to */

    // Calculate the real size of the hash table (next power of two)
    unsigned long realsize = _dictNextPower(size);

    if (dictIsRehashing(d) || d->ht[0].used > size || d->ht[0].size == realsize)
        return DICT_ERR;

    // Create and initialize the new hash table
    n.size = realsize;
    n.sizemask = realsize - 1;
    n.table = zcalloc(realsize * sizeof(dictEntry *));
    n.used = 0;

    // If ht[0] is empty, this is the first-time creation of a hash table:
    // set the new hash table as ht[0] and return
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing */
    // If ht[0] is not empty, this is an expansion of the dictionary:
    // set the new hash table as ht[1] and turn on the rehash flag
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}
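dictExpand rounds the requested size up with _dictNextPower, which returns the smallest power of two that is >= size, starting from DICT_HT_INITIAL_SIZE. A minimal sketch of that helper's logic (the real function also guards against overflow, omitted here):

static unsigned long _dictNextPower(unsigned long size)
{
    unsigned long i = DICT_HT_INITIAL_SIZE;

    // Double until the first power of two that can hold `size` entries
    while (1) {
        if (i >= size) return i;
        i *= 2;
    }
}

Keeping the size a power of two is what makes sizemask = size - 1 work: a bucket index can then be computed with a cheap bitwise AND instead of a modulo.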
Once the dictionary's rehashidx is set to 0, the rehash action has started. The heartbeat function checks this flag during its run; if a rehash is in progress, it performs one slice of the progressive rehash. The call chain is:
serverCron -> incrementallyRehash -> dictRehashMilliseconds -> dictRehash
incrementallyRehash function code:
/*
 * Called from Redis's cron: for the first database whose hash table
 * is rehashing, perform 1 ms of progressive rehash.
 */
void incrementallyRehash(void) {
    int j;

    for (j = 0; j < server.dbnum; j++) {
        /* Keys dictionary */
        if (dictIsRehashing(server.db[j].dict)) {
            dictRehashMilliseconds(server.db[j].dict, 1);
            break; /* The allotted CPU milliseconds are already spent */
        }
        ...
    }
}
The dictRehashMilliseconds function runs the rehash for the given number of CPU milliseconds, in batches of 100 steps at a time. The code is as follows:
/*
 * Rehash the dictionary in units of 100 steps, within the given number of milliseconds.
 */
int dictRehashMilliseconds(dict *d, int ms) {
    long long start = timeInMilliseconds();
    int rehashes = 0;

    while (dictRehash(d, 100)) { /* 100 steps of data at a time */
        rehashes += 100;
        if (timeInMilliseconds() - start > ms) break; /* Time is up; pause the rehash */
    }
    return rehashes;
}
/*
 * Perform N steps of progressive rehash.
 *
 * Returns 1 if there are still elements in the hash table that need rehashing,
 * or 0 if all elements have already been migrated.
 *
 * Each rehash step moves the entire linked list hanging off one index of the
 * hash table array, so a single step may migrate more than one key from
 * ht [0] to ht [1].
 */
int dictRehash(dict *d, int n) {
    if (!dictIsRehashing(d)) return 0;

    while (n--) {
        dictEntry *de, *nextde;

        // If ht[0] is already empty, the migration is complete:
        // replace the original ht[0] with ht[1]
        if (d->ht[0].used == 0) {
            // Free ht[0]'s bucket array
            zfree(d->ht[0].table);
            // Point ht[0] at ht[1]
            d->ht[0] = d->ht[1];
            // Reset ht[1]'s pointers
            _dictReset(&d->ht[1]);
            // Turn off the rehash flag
            d->rehashidx = -1;
            // Notify the caller that the rehash is complete
            return 0;
        }

        assert(d->ht[0].size > (unsigned)d->rehashidx);
        // Advance to the first index in the array whose linked list is not NULL
        while (d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;
        // Point at the head of the list
        de = d->ht[0].table[d->rehashidx];
        // Migrate all elements of this linked list from ht[0] to ht[1].
        // Since a bucket usually holds one element, or no more than a small
        // number, this operation can be treated as O(1)
        while (de) {
            unsigned int h;

            nextde = de->next;
            /* Get the index in the new hash table */
            // Compute the element's bucket in ht[1]
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            // Insert the node at the head of ht[1]'s bucket and adjust the pointers
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        // Set the emptied bucket to NULL so the next rehash step skips it
        d->ht[0].table[d->rehashidx] = NULL;
        // Advance to the next index
        d->rehashidx++;
    }
    // Notify the caller that there are still elements waiting to be rehashed
    return 1;
}
To sum up, Redis's rehash is a core operation of its memory and data management. Because Redis handles data and events on a single thread, its rehash migrates data progressively, in small slices; this prevents a long rehash from blocking the data-processing thread. Redis does not use the multi-threaded migration model of memcached; memcached's rehash process will be introduced later.
Redis's rehash process is clever and elegant. It is worth noting that while a migration is in progress, a lookup searches both the old ht[0] and the new ht[1] at the same time, so no key is missed mid-migration.
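As a minimal sketch of that dual-table lookup (a hypothetical function following the same pattern as _dictKeyIndex above; Redis's own dictFind has this shape):

/* Sketch of a lookup during rehash: both tables must be searched. */
dictEntry *dictFindSketch(dict *d, const void *key) {
    unsigned int h = dictHashKey(d, key);
    int table;

    for (table = 0; table <= 1; table++) {
        dictEntry *he = d->ht[table].table[h & d->ht[table].sizemask];
        while (he) {
            if (dictCompareKeys(d, key, he->key)) return he;
            he = he->next;
        }
        /* Only consult ht[1] when a rehash is in progress */
        if (!dictIsRehashing(d)) break;
    }
    return NULL; /* not found in either table */
}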