Redis Dictionary (dict) Rehash Process: Source Code Analysis


Redis's in-memory storage is built around one large dictionary, i.e. a hash table. Redis can hold anywhere from tens of thousands to tens of millions or even hundreds of millions of cached records (limited only by available memory), which is what makes it such a powerful cache. The core data structure behind this is the dictionary (dict). As the data volume grows, the dict runs into hash(key) collisions: if the dict is not large enough, the probability of collision rises, more elements pile up in a single hash bucket, and lookups get slower. Conversely, if the data volume drops from tens of millions to tens of thousands of records, an oversized dict wastes memory. Redis's dict was therefore designed to grow and shrink automatically through a rehash process. There are two conditions under which a dict starts a rehash (a simplified sketch of both checks follows the list):

1) Expansion: the average number of elements per bucket (pre_num = total elements / number of buckets) is computed; if pre_num > dict_force_resize_ratio, an expansion of the dict is triggered. dict_force_resize_ratio is 5.

2) Contraction: if total elements * 10 < number of buckets, i.e. the fill ratio is below 10%, the dict is shrunk so that the number of buckets becomes close to the number of elements.
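The real checks live in _dictExpandIfNeeded and htNeedsResize, both shown later in this article. Purely as an illustration of the two thresholds described above, a minimal sketch could look like the following; the helper names should_expand/should_shrink and the standalone macros are hypothetical and are not part of Redis's API:

/* Illustrative only: the actual logic is in _dictExpandIfNeeded and
 * htNeedsResize (see below). Names and constants here are hypothetical. */
#define FORCE_RESIZE_RATIO 5    /* corresponds to dict_force_resize_ratio */
#define MIN_FILL_PERCENT   10   /* corresponds to REDIS_HT_MINFILL */

static int should_expand(unsigned long used, unsigned long size) {
    /* average chain length per bucket exceeds the force-resize ratio */
    return size > 0 && used / size > FORCE_RESIZE_RATIO;
}

static int should_shrink(unsigned long used, unsigned long size) {
    /* fill ratio has dropped below 10% */
    return size > 0 && used * 100 / size < MIN_FILL_PERCENT;
}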


Dict rehash expansion process:

Source code call chain and analysis:

dictAddRaw -> _dictKeyIndex -> _dictExpandIfNeeded -> dictExpand. This is the call chain through which an insertion can trigger a dict expansion.
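For context, the entry point of that chain, dictAddRaw, looks roughly like the following (abridged from the dict.c of the same Redis era as the code below; details may differ between Redis versions). Note how it performs one incremental rehash step on every call, and inserts the new entry into ht[1] while a rehash is in progress:

dictEntry *dictAddRaw(dict *d, void *key)
{
    int index;
    dictEntry *entry;
    dictht *ht;

    /* Do one step of incremental rehashing on every add */
    if (dictIsRehashing(d)) _dictRehashStep(d);

    /* Get the index of the new element, or -1 if the key already exists */
    if ((index = _dictKeyIndex(d, key)) == -1)
        return NULL;

    /* While rehashing, new entries always go into ht[1] */
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry));
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;

    /* Store the key */
    dictSetKey(d, entry, key);
    return entry;
}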
_dictKeyIndex function code:

static int _dictKeyIndex(dict *d, const void *key)
{
    unsigned int h, idx, table;
    dictEntry *he;

    /* Expand the dictionary if needed */
    if (_dictExpandIfNeeded(d) == DICT_ERR)
        return -1;

    /* Compute the hash value of the key */
    h = dictHashKey(d, key);

    /* Search for the given key in both hash tables */
    for (table = 0; table <= 1; table++) {
        /* Compute the index where the key may appear in the table array,
         * from the hash value and the table's sizemask */
        idx = h & d->ht[table].sizemask;

        /* Walk the bucket's linked list looking for the key.
         * A bucket usually holds one element, or only a very small number,
         * so this can be treated as O(1). */
        he = d->ht[table].table[idx];
        while (he) {
            /* The key already exists */
            if (dictCompareKeys(d, key, he->key))
                return -1;
            he = he->next;
        }

        /* Reaching this point the first time means d->ht[0] has been
         * searched; if the dict is not rehashing there is no need to
         * look in d->ht[1]. */
        if (!dictIsRehashing(d)) break;
    }
    return idx;
}
_dictExpandIfNeeded function code:

static int _dictExpandIfNeeded(dict *d)
{
    /* An incremental rehash is already in progress */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty, expand it to the initial size. O(N) */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* If the number of used nodes >= the size of the hash table,
     * and either of the following holds:
     *   1) dict_can_resize is true
     *   2) used/size is greater than dict_force_resize_ratio
     * then call dictExpand to grow the hash table.
     * The new size is at least twice the number of used nodes. O(N) */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used*2);
    }
    return DICT_OK;
}

Dict rehash contraction process:


Source code call chain and analysis:

serverCron -> tryResizeHashTables -> dictResize -> dictExpand

The serverCron function is Redis's heartbeat function. The part that calls tryResizeHashTables looks like this:

int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    ...
    if (server.rdb_child_pid == -1 && server.aof_child_pid == -1) {
        /* keep the hash table fill ratio within a sane range */
        tryResizeHashTables();
        /* progressive rehash */
        if (server.activerehashing) incrementallyRehash();
    }
    ...
}
tryResizeHashTables function code:

void tryResizeHashTables(void) {
    int j;

    for (j = 0; j < server.dbnum; j++) {
        /* shrink the keyspace dictionary if needed */
        if (htNeedsResize(server.db[j].dict))
            dictResize(server.db[j].dict);

        /* shrink the expires dictionary if needed */
        if (htNeedsResize(server.db[j].expires))
            dictResize(server.db[j].expires);
    }
}


The htNeedsResize function decides whether the dict should be shrunk: it returns 1 only when the table is larger than DICT_HT_INITIAL_SIZE and the fill ratio has dropped below REDIS_HT_MINFILL (10%); otherwise no shrink is performed. The code is as follows:

int htNeedsResize(dict *dict) {
    long long size, used;

    size = dictSlots(dict);   /* total number of slots in the hash table */
    used = dictSize(dict);    /* number of nodes in use */

    /* Return 1 when the hash table is larger than DICT_HT_INITIAL_SIZE
     * and its fill ratio is below REDIS_HT_MINFILL */
    return (size && used &&
            size > DICT_HT_INITIAL_SIZE &&
            (used*100/size < REDIS_HT_MINFILL));
}

Code of the dictResize function:

int dictResize(dict *d)
{
    int minimal;

    /* Cannot resize while resizing is disabled or a rehash is in progress */
    if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;

    minimal = d->ht[0].used;
    if (minimal < DICT_HT_INITIAL_SIZE)
        minimal = DICT_HT_INITIAL_SIZE;

    return dictExpand(d, minimal);
}


Both of the preceding paths ultimately call the dictExpand function, which allocates a new hash table (dictht) and sets d->rehashidx = 0 to mark that a rehash has started. The rehash itself then moves the data from ht[0] into ht[1] bucket by bucket, re-hashing each key against the new table. The code is as follows:

int dictExpand(dict *d, unsigned long size)
{
    dictht n; /* the new hash table that data will be moved into */

    /* Compute the real size of the hash table (next power of two) */
    unsigned long realsize = _dictNextPower(size);

    /* Refuse to resize while a rehash is in progress, if size is smaller
     * than the number of elements already in ht[0], or if the size would
     * not change at all. */
    if (dictIsRehashing(d) || d->ht[0].used > size ||
        d->ht[0].size == realsize)
        return DICT_ERR;

    /* Allocate and initialize the new hash table */
    n.size = realsize;
    n.sizemask = realsize-1;
    n.table = zcalloc(realsize*sizeof(dictEntry*));
    n.used = 0;

    /* If ht[0] is empty this is a first initialization:
     * install the new table as ht[0] and return. */
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing:
     * ht[0] is not empty, so this is a real expansion. Install the new
     * table as ht[1] and turn on the rehash flag. */
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}
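The helper _dictNextPower used above rounds the requested size up to the next power of two, which is what keeps sizemask = size - 1 usable as a bit mask. For reference, it looks roughly like this in dict.c (minor details may vary between Redis versions):

static unsigned long _dictNextPower(unsigned long size)
{
    unsigned long i = DICT_HT_INITIAL_SIZE;

    /* Cap the size; LONG_MAX comes from <limits.h> */
    if (size >= LONG_MAX) return LONG_MAX;
    while (1) {
        if (i >= size)
            return i;   /* first power of two >= size */
        i *= 2;
    }
}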

Once the dict's rehashidx has been set to 0, a rehash is considered in progress. The heartbeat function checks this flag on each tick and, if a rehash is pending, performs one slice of the progressive rehash. The call chain is as follows:

serverCron -> incrementallyRehash -> dictRehashMilliseconds -> dictRehash

incrementallyRehash function code:

/* Called from serverCron: performs one millisecond of incremental rehash
 * on the first database hash table found to be rehashing. */
void incrementallyRehash(void) {
    int j;

    for (j = 0; j < server.dbnum; j++) {
        /* Keys dictionary */
        if (dictIsRehashing(server.db[j].dict)) {
            dictRehashMilliseconds(server.db[j].dict, 1);
            break; /* the allotted CPU millisecond has been used for this loop */
        }
        ...
    }
}


The dictRehashMilliseconds function runs rehash steps for roughly the specified number of milliseconds of CPU time, calling dictRehash in batches of 100 buckets at a time. The code is as follows:

/* Rehash the dictionary in steps of 100 buckets, for at most the given
 * number of milliseconds. */
int dictRehashMilliseconds(dict *d, int ms) {
    long long start = timeInMilliseconds();
    int rehashes = 0;

    while (dictRehash(d, 100)) {   /* 100 buckets per batch */
        rehashes += 100;
        if (timeInMilliseconds()-start > ms) break; /* time is up, pause the rehash */
    }
    return rehashes;
}
/* Performs N steps of incremental rehashing.
 *
 * Returns 1 if there are still elements to move from ht[0] to ht[1],
 * 0 if the whole table has been migrated.
 *
 * Each rehash step moves the entire linked list stored at one index of the
 * hash table array, so more than one key may be moved from ht[0] to ht[1]
 * per step. */
int dictRehash(dict *d, int n) {
    if (!dictIsRehashing(d)) return 0;

    while (n--) {
        dictEntry *de, *nextde;

        /* If ht[0] is already empty, the migration is complete:
         * replace the old ht[0] with ht[1]. */
        if (d->ht[0].used == 0) {
            /* Free ht[0]'s hash table array */
            zfree(d->ht[0].table);
            /* Point ht[0] at ht[1] */
            d->ht[0] = d->ht[1];
            /* Reset ht[1]'s pointers */
            _dictReset(&d->ht[1]);
            /* Turn off the rehash flag */
            d->rehashidx = -1;
            /* Tell the caller the rehash is finished */
            return 0;
        }

        assert(d->ht[0].size > (unsigned)d->rehashidx);

        /* Advance to the index of the first non-NULL linked list in the array */
        while (d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;

        /* Point at the head of the bucket's linked list */
        de = d->ht[0].table[d->rehashidx];

        /* Migrate every element of this list from ht[0] to ht[1].
         * A bucket usually holds only one element, or no more than a small
         * number, so this can be treated as O(1). */
        while (de) {
            unsigned int h;

            nextde = de->next;

            /* Compute the element's index in ht[1] */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;

            /* Add the node to ht[1] and adjust the pointers */
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;

            /* Update the counters */
            d->ht[0].used--;
            d->ht[1].used++;

            de = nextde;
        }

        /* Set the pointer to NULL so the next rehash step skips this bucket */
        d->ht[0].table[d->rehashidx] = NULL;

        /* Move forward to the next index */
        d->rehashidx++;
    }

    /* Tell the caller there is still data waiting to be rehashed */
    return 1;
}

In summary, rehashing is a core part of Redis's memory and data management. Because Redis handles data and events in a single thread, the rehash migration is done progressively, so that a long rehash cannot block the worker thread; this differs from memcached's multi-threaded migration approach, whose rehash process will be covered in a later article. Redis's rehash design is clean and elegant. Note in particular that while a rehash is in progress, lookups consult both the ht[0] table being migrated and the new ht[1] table, so no key can be missed during migration.
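That dual-table lookup, and the fact that every normal dict operation also contributes one small rehash step, can be seen in _dictRehashStep and dictFind. The following is an abridged sketch based on the same era of dict.c as the code above; exact details may differ between Redis versions:

/* Single incremental rehash step: only performed when no safe
 * iterators are active on the dictionary. */
static void _dictRehashStep(dict *d) {
    if (d->iterators == 0) dictRehash(d, 1);
}

dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    unsigned int h, idx, table;

    if (d->ht[0].size == 0) return NULL;        /* empty dictionary */
    if (dictIsRehashing(d)) _dictRehashStep(d); /* help the rehash along */

    h = dictHashKey(d, key);
    /* Look in ht[0] first, then in ht[1] if a rehash is in progress */
    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        while (he) {
            if (dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}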








