An in-depth look at the Redis key expiration principle and its implementation
Key expiration is an important mechanism for periodically cleaning up stale data. Most caching systems have it, and Redis is no exception. Among the commands Redis provides, EXPIRE, EXPIREAT, PEXPIRE, PEXPIREAT, SETEX, and PSETEX can all set an expiration time on a key-value pair. Once a key-value pair has an expiration time, it is automatically deleted after it expires (or, more precisely, it becomes inaccessible). The concept is easy to grasp, but how is it actually implemented in Redis? I recently had several questions about the Redis key expiration mechanism and dug into them carefully. The findings are summarized below.
I. Controlling the expiration time
Apart from calling the PERSIST command, are there other situations in which a key's expiration time is removed? Yes. First, when a key is deleted with the DEL command, its expiration time is naturally removed along with it. Second, when a key that has an expiration time is overwritten, its expiration time is also removed. Note the distinction, however: this applies when the key itself is overwritten, not when the value stored at the key is modified. SET, MSET, and GETSET can overwrite a key, whereas INCR, DECR, LPUSH, HSET, and similar commands only update the value stored at the key and do not touch its expiration time. RENAME is a special case. When a key is renamed with RENAME, its expiration time is carried over to the new key. If an existing key is overwritten by RENAME (for example, the key hello is overwritten by the command RENAME world hello), the expiration time of the overwritten key is removed, and the new key keeps the properties of the source key (world), including its expiration time.
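The rules above can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the ToyKey type and the toy_* functions are hypothetical names invented for this example and are not part of Redis, values are plain integers, and incrementing a missing key simply fails instead of creating it.

```c
#include <string.h>

#define NO_EXPIRE (-1LL)   /* -1 means "no expiration time set" */
#define MAX_KEYS 8

/* Hypothetical toy key slot; none of these names come from the Redis source. */
typedef struct {
    char name[16];
    long value;
    long long expire_at;   /* absolute expiration time in ms, or NO_EXPIRE */
    int used;
} ToyKey;

static ToyKey keys[MAX_KEYS];

static ToyKey *toy_find(const char *name) {
    for (int i = 0; i < MAX_KEYS; i++)
        if (keys[i].used && strcmp(keys[i].name, name) == 0) return &keys[i];
    return NULL;
}

/* SET replaces the whole key, so any previous expiration time is dropped. */
static void toy_set(const char *name, long value) {
    ToyKey *k = toy_find(name);
    if (k == NULL) {
        for (int i = 0; i < MAX_KEYS && k == NULL; i++)
            if (!keys[i].used) k = &keys[i];
        if (k == NULL) return;               /* table full: ignored in this sketch */
        strncpy(k->name, name, sizeof(k->name) - 1);
        k->name[sizeof(k->name) - 1] = '\0';
        k->used = 1;
    }
    k->value = value;
    k->expire_at = NO_EXPIRE;
}

/* EXPIREAT-like helper: returns 1 if the key exists and the timeout was set. */
static int toy_expire_at(const char *name, long long when) {
    ToyKey *k = toy_find(name);
    if (k == NULL) return 0;
    k->expire_at = when;
    return 1;
}

/* INCR only changes the stored value; the expiration time is left alone. */
static int toy_incr(const char *name) {
    ToyKey *k = toy_find(name);
    if (k == NULL) return 0;
    k->value++;
    return 1;
}

/* RENAME: an existing destination is overwritten (losing its own timeout),
 * and the renamed key keeps the value and timeout of the source. */
static int toy_rename(const char *src, const char *dst) {
    ToyKey *s = toy_find(src);
    if (s == NULL) return 0;
    ToyKey *d = toy_find(dst);
    if (d == s) return 1;                    /* same name: nothing to do here */
    if (d != NULL) d->used = 0;
    strncpy(s->name, dst, sizeof(s->name) - 1);
    s->name[sizeof(s->name) - 1] = '\0';
    return 1;
}

/* Returns the stored expiration time, or NO_EXPIRE. */
static long long toy_deadline(const char *name) {
    ToyKey *k = toy_find(name);
    return k ? k->expire_at : NO_EXPIRE;
}
```

For instance, after toy_set("hello", 1) and toy_expire_at("hello", 5000), a call to toy_incr("hello") leaves the deadline at 5000, while a second toy_set("hello", 10) resets it to NO_EXPIRE.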
II. Internal implementation of expiration
How is key expiration implemented in Redis, that is, how are expired keys deleted? Redis removes expired keys in two main ways:
- Passive approach (passive way): when a key is accessed and found to have expired, it is deleted.
- Active approach (active way): periodically, some expired keys are selected and deleted from among the keys that have an expiration time set.
Internal representation of expiration
Next we walk through the code to see how both approaches are implemented. Before that, let's look at how Redis manages and maintains keys (note: all source code in this post comes from Redis-2.6.12).
"Code snippet One" gives the structure definition of a database in Redis. Apart from id, every member of this structure is a pointer to a dictionary. Here we look only at dict and expires. The former maintains all key-value pairs contained in a Redis database (its structure can be understood as dict[key]:value, a mapping from key to value). The latter maintains the keys in a Redis database that have an expiration time set (its structure can be understood as expires[key]:timeout, a mapping from key to expiration time). When we insert data with the SETEX or PSETEX command, Redis first adds the key and value to the dict dictionary, then adds the key and expiration time to the expires dictionary. When we set the expiration time of a key with EXPIRE, EXPIREAT, PEXPIRE, or PEXPIREAT, Redis first looks up the key in the dict dictionary, and if it exists, adds the key and the expiration time to the expires dictionary. In short, keys that have an expiration time, together with their expiration times, are all maintained in the expires dictionary.
"Code snippet One":
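typedef struct redisDb {
    dict *dict;            // All key-value pairs in this database
    dict *expires;         // Expiration times of keys that have a timeout set
    dict *blocking_keys;   // Keys with clients waiting for data (BLPOP)
    dict *ready_keys;      // Blocked keys that received a PUSH
    dict *watched_keys;    // WATCHED keys for MULTI/EXEC CAS
    int id;
} redisDb;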
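The relationship between the two dictionaries can be sketched with two small fixed-size tables. This is an illustrative model only: Slot, ToyDb, table_find, table_put, toy_setex, toy_expire, and toy_get_expire are hypothetical names chosen for this example, values are reduced to integers, and a linear array stands in for the real hash table.

```c
#include <string.h>

#define TABLE_SIZE 8

/* Hypothetical fixed-size table entry standing in for a dict entry. */
typedef struct {
    char key[16];
    long long val;
    int used;
} Slot;

typedef struct {
    Slot dict[TABLE_SIZE];      /* key -> value (an integer in this sketch) */
    Slot expires[TABLE_SIZE];   /* key -> absolute expiration time in ms    */
} ToyDb;

static Slot *table_find(Slot *t, const char *key) {
    for (int i = 0; i < TABLE_SIZE; i++)
        if (t[i].used && strcmp(t[i].key, key) == 0) return &t[i];
    return NULL;
}

/* Inserts or updates an entry; returns 0 only when the table is full. */
static int table_put(Slot *t, const char *key, long long val) {
    Slot *s = table_find(t, key);
    for (int i = 0; i < TABLE_SIZE && s == NULL; i++)
        if (!t[i].used) s = &t[i];
    if (s == NULL) return 0;
    strncpy(s->key, key, sizeof(s->key) - 1);
    s->key[sizeof(s->key) - 1] = '\0';
    s->val = val;
    s->used = 1;
    return 1;
}

/* SETEX-like: store the value first, then record the expiration time. */
static int toy_setex(ToyDb *db, const char *key, long long val, long long when) {
    if (!table_put(db->dict, key, val)) return 0;
    return table_put(db->expires, key, when);
}

/* EXPIRE-like: only keys that exist in dict may get an entry in expires. */
static int toy_expire(ToyDb *db, const char *key, long long when) {
    if (table_find(db->dict, key) == NULL) return 0;
    return table_put(db->expires, key, when);
}

/* Returns the expiration time, or -1 when none is set. */
static long long toy_get_expire(ToyDb *db, const char *key) {
    Slot *s = table_find(db->expires, key);
    return s ? s->val : -1;
}
```

With a zero-initialized ToyDb, toy_expire on a key that was never stored returns 0 and leaves expires untouched, which mirrors the lookup in dict that the text describes.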
Passive approach
Now that we have a rough idea of how Redis tracks keys with an expiration time, let's see how it passively deletes expired keys. "Code snippet Two" gives a function named expireIfNeeded, which is called by every function that accesses data. In other words, Redis calls it when implementing all commands that read data, such as GET, MGET, HGET, and LRANGE. Its purpose is to check whether a key has expired before the data is read, and to delete the key if it has. "Code snippet Two" contains a full description of expireIfNeeded, so its implementation is not repeated here. What does need explaining is another function called from expireIfNeeded: propagateExpire. It broadcasts the fact that a key has expired before the key is actually deleted. This information is propagated to two destinations. One is the AOF file, where the deletion of the expired key is recorded in the standard command format DEL key. The other is every slave of the current Redis server, which receives the same deletion in the standard command format DEL key and is thereby told to delete its own copy of the expired key. As a result, Redis servers running as slaves do not need to delete expired keys passively themselves; they simply follow the master.
"Code snippet Two":
int expireIfNeeded(redisDb *db, robj *key) {
    // Get the expiration time of the key
    long long when = getExpire(db, key);
    // A negative value means no expiration time is set for the key
    // (the default is -1), so return 0 directly
    if (when < 0) return 0;
    // While the server is loading data from an RDB file, expired keys are
    // not deleted for now, so return 0 directly
    if (server.loading) return 0;
    // If the server is running as a slave, it does not delete the expired
    // key, because deletion of expired keys on a slave is controlled by the
    // master. The expiration time is still compared with the current time
    // to tell the caller whether the key has expired
    if (server.masterhost != NULL) {
        return mstime() > when;
    }
    // Otherwise compare the expiration time with the current time, and
    // return 0 if the key has not expired yet
    if (mstime() <= when) return 0;
    // The key has indeed expired: update the expired-key statistics,
    // broadcast the expiration, and finally delete the key from the database
    server.stat_expiredkeys++;
    propagateExpire(db, key);
    return dbDelete(db, key);
}
"Code snippet Three":
void propagateExpire(redisDb *db, robj *key) {
    robj *argv[2];
    // shared.del is a shared Redis object, initialized when the server
    // starts, that represents the DEL command
    argv[0] = shared.del;
    argv[1] = key;
    incrRefCount(argv[0]);
    incrRefCount(argv[1]);
    // If AOF is enabled, log a DEL for the expired key
    if (server.aof_state != REDIS_AOF_OFF)
        feedAppendOnlyFile(server.delCommand, db->id, argv, 2);
    // If the server has slaves, send the DEL command for the expired key to
    // all of them. This is why expireIfNeeded above does not delete expired
    // keys when running as a slave: the slave only needs to follow the
    // commands sent by the master
    if (listLength(server.slaves))
        replicationFeedSlaves(server.slaves, db->id, argv, 2);
    decrRefCount(argv[0]);
    decrRefCount(argv[1]);
}
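The decision flow of the passive approach can be condensed into a few lines. The following is a minimal sketch, not Redis code: ToyKey, ToyServer, and toy_expire_if_needed are hypothetical names, the current time is passed in as a parameter instead of being read from the clock, and propagation is reduced to writing the DEL command into a buffer.

```c
#include <stdio.h>

/* Hypothetical miniature of the checks made before a key is read. */
typedef struct {
    const char *name;
    long long expire_at;       /* absolute ms timestamp, -1 when no timeout is set */
    int present;               /* 1 while the key is still stored */
} ToyKey;

typedef struct {
    int is_slave;              /* plays the role of server.masterhost != NULL */
    int expired_keys;          /* plays the role of server.stat_expiredkeys */
    char last_propagated[64];  /* last command "sent" to the AOF and the slaves */
} ToyServer;

/* Returns 1 if the key must be treated as expired, 0 otherwise. */
static int toy_expire_if_needed(ToyServer *srv, ToyKey *key, long long now) {
    if (!key->present || key->expire_at < 0) return 0;  /* nothing to expire */
    if (srv->is_slave) return now > key->expire_at;     /* report only, never delete */
    if (now <= key->expire_at) return 0;                /* still valid */

    srv->expired_keys++;
    /* Propagate an explicit DEL before removing the key locally. */
    snprintf(srv->last_propagated, sizeof(srv->last_propagated),
             "DEL %s", key->name);
    key->present = 0;
    return 1;
}
```

On a server with is_slave set to 1, the function reports an expired key as expired but leaves present at 1, since in this model the key disappears only when the master's DEL arrives.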
Active approach
Through expireIfNeeded we have seen how Redis deletes expired keys passively. That alone is clearly not enough: if an expired key is never accessed again, Redis would never learn that it has expired and would never delete it, which wastes memory. Redis therefore also has an active deletion method. It relies on Redis time events, which interrupt normal processing at regular intervals to carry out certain tasks, including checking for and deleting expired keys. The callback for this time event is serverCron. It is created when the Redis server starts, and the number of times it runs per second is set by the macro REDIS_DEFAULT_HZ; by default it runs 10 times per second. "Code snippet Four" shows the code that creates the time event, located in the initServer function in redis.c. serverCron does more than check for and delete expired keys: it also updates statistics, enforces client connection timeouts, triggers BGSAVE and AOF, and so on. Here we focus only on the deletion of expired keys, which is the function activeExpireCycle.
"Code snippet Four":
if (aeCreateTimeEvent(server.el, 1, serverCron, NULL, NULL) == AE_ERR) {
    redisPanic("create time event failed");
    exit(1);
}
"Code snippet Five" gives the implementation of activeExpireCycle with detailed comments. The main idea is to walk through the expires dictionary of each database in the Redis server and randomly sample up to REDIS_EXPIRELOOKUPS_PER_CRON (default 10) keys that have an expiration time, check whether they have expired, and delete those that have. If expired keys make up more than 25% of the sample, Redis assumes that the current database still contains many expired keys and runs another round of random sampling and deletion. This repeats until the proportion no longer exceeds 25%, at which point Redis stops processing the current database and moves on to the next one. Note that activeExpireCycle does not try to process all databases in one call; it processes at most REDIS_DBCRON_DBS_PER_CALL (default 16). The function also has a time limit, so it cannot run for as long as it likes. All of this serves a single purpose: to keep the deletion of expired keys from consuming too much CPU. The comments in "Code snippet Five" describe all of the activeExpireCycle code in detail.
"Code snippet Five":
void activeExpireCycle(void) {
    // A single call to activeExpireCycle does not check all Redis databases,
    // so the number of the last database processed must be remembered. The
    // next call can then continue from that database, which is why
    // current_db is declared static. The other variable, timelimit_exit,
    // records whether the previous call hit the time limit, so it must be
    // static as well
    static unsigned int current_db = 0;
    static int timelimit_exit = 0;
    unsigned int j, iteration = 0;
    // Each call processes REDIS_DBCRON_DBS_PER_CALL databases
    unsigned int dbs_per_call = REDIS_DBCRON_DBS_PER_CALL;
    long long start = ustime(), timelimit;
    // If the server has fewer than REDIS_DBCRON_DBS_PER_CALL databases,
    // process all of them. If the previous call hit the time limit, there
    // are probably many expired keys, so process all databases as well
    if (dbs_per_call > server.dbnum || timelimit_exit)
        dbs_per_call = server.dbnum;
    // Maximum execution time of activeExpireCycle in microseconds.
    // REDIS_EXPIRELOOKUPS_TIME_PERC is the percentage of CPU time per unit
    // of time that may be given to activeExpireCycle (default 25), and
    // server.hz is the number of calls per second, so the formula reads
    // more clearly (as real-number arithmetic) when written as
    // (1000000 * (REDIS_EXPIRELOOKUPS_TIME_PERC / 100)) / server.hz
    timelimit = 1000000*REDIS_EXPIRELOOKUPS_TIME_PERC/server.hz/100;
    timelimit_exit = 0;
    if (timelimit <= 0) timelimit = 1;
    // Walk through the expired data in each Redis database
    for (j = 0; j < dbs_per_call; j++) {
        int expired;
        redisDb *db = server.db+(current_db % server.dbnum);
        // Increment current_db right away, so that even if not all expired
        // keys in the current database can be deleted within the time limit,
        // the next call starts from the next database. This guarantees that
        // every database gets a chance to be processed
        current_db++;
        // Start processing expired keys in the current database
        do {
            unsigned long num, slots;
            long long now;
            // If the expires dictionary is empty, no key in this database
            // has an expiration time, so move on to the next database
            if ((num = dictSize(db->expires)) == 0) break;
            slots = dictSlots(db->expires);
            now = mstime();
            // If the expires dictionary is not empty but its fill rate is
            // below 1%, random sampling would be too expensive, so move on
            // to the next database
            if (num && slots > DICT_HT_INITIAL_SIZE &&
                (num*100/slots < 1)) break;
            expired = 0;
            // If the expires dictionary holds fewer entries than the sample
            // size, use the number of entries as the sample size
            if (num > REDIS_EXPIRELOOKUPS_PER_CRON)
                num = REDIS_EXPIRELOOKUPS_PER_CRON;
            while (num--) {
                dictEntry *de;
                long long t;
                // Pick a random key that has an expiration time and check
                // whether it has expired
                if ((de = dictGetRandomKey(db->expires)) == NULL) break;
                t = dictGetSignedIntegerVal(de);
                if (now > t) {
                    // The key has indeed expired, so delete it
                    sds key = dictGetKey(de);
                    robj *keyobj = createStringObject(key,sdslen(key));
                    // Broadcast the expiration before deleting the key
                    propagateExpire(db,keyobj);
                    dbDelete(db,keyobj);
                    decrRefCount(keyobj);
                    expired++;
                    server.stat_expiredkeys++;
                }
            }
            // Increment iteration after each sampling round, and every 16
            // rounds check whether the time limit has been reached. If it
            // has, record that this call hit the time limit and return
            iteration++;
            if ((iteration & 0xf) == 0 &&
                (ustime()-start) > timelimit)
            {
                timelimit_exit = 1;
                return;
            }
            // If more than 25% of the sample had expired, sample again
        } while (expired > REDIS_EXPIRELOOKUPS_PER_CRON/4);
    }
}
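To make the time budget and the sampling rule concrete, here is a stripped-down sketch of the same idea on a plain array. It rests on several assumptions: toy_time_limit_us and toy_active_expire are hypothetical names, rand() stands in for dictGetRandomKey, a single array of deadlines replaces the expires dictionary, and the per-database loop, the fill-rate check, and the time-limit check are left out.

```c
#include <stdlib.h>

#define LOOKUPS_PER_ROUND 10   /* mirrors REDIS_EXPIRELOOKUPS_PER_CRON */
#define TIME_PERC 25           /* mirrors REDIS_EXPIRELOOKUPS_TIME_PERC */

/* Time budget of one call in microseconds, never less than 1. */
static long long toy_time_limit_us(int hz, int time_perc) {
    long long limit = 1000000LL * time_perc / hz / 100;
    return limit > 0 ? limit : 1;
}

/* Samples up to LOOKUPS_PER_ROUND deadlines per round, removes the expired
 * ones, and repeats while more than a quarter of a full sample was expired.
 * deadlines[0..*count-1] holds absolute expiration times in ms.
 * Returns the total number of removed entries. */
static int toy_active_expire(long long *deadlines, int *count, long long now) {
    int removed = 0;
    int expired = 0;
    do {
        int num = *count;
        if (num == 0) break;                    /* nothing left to sample */
        if (num > LOOKUPS_PER_ROUND) num = LOOKUPS_PER_ROUND;
        expired = 0;
        while (num-- && *count > 0) {
            int i = rand() % *count;            /* pick a random entry */
            if (now > deadlines[i]) {
                /* Delete by moving the last entry into the freed position. */
                deadlines[i] = deadlines[*count - 1];
                (*count)--;
                expired++;
            }
        }
        removed += expired;
    } while (expired > LOOKUPS_PER_ROUND / 4);
    return removed;
}
```

With the defaults quoted in the text (25 percent, 10 calls per second), toy_time_limit_us(10, TIME_PERC) yields 25000 microseconds per call. Because 10 / 4 is 2 in integer arithmetic, another round is started whenever at least 3 of the sampled keys had expired.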
III. How does Memcached delete expired keys compared with Redis?
First, Memcached also deletes expired keys passively: internally it does not monitor whether keys have expired, and instead checks for expiration only when a key is accessed through get. Second, the biggest difference between Memcached and Redis in key expiration is that Memcached does not actually delete expired keys the way Redis does; it simply reclaims the space they occupy. When new data is written, Memcached uses the space of expired keys first. If that space is exhausted, Memcached can also use its LRU mechanism to reclaim space that has not been accessed for a long time. Memcached therefore does not need a periodic deletion pass like the one in Redis, which follows from the memory management scheme Memcached uses. It should also be noted that when Redis runs out of memory, the maxmemory-policy parameter determines whether an LRU mechanism is used to reclaim memory (thanks to @jonathan_dai for the correction in "LRU mechanism of Redis"). According to the Redis 2.8 configuration file, LRU is the default mechanism. You might ask: if no key has an expiration time set and Redis's memory usage reaches maxmemory, what happens when a key is added or modified? If no suitable key can be evicted, Redis returns an error on writes. See the Redis 2.8 configuration file for details.
IV. Does the Redis key expiration mechanism affect system performance?
From the above, we know that Redis periodically checks keys that have an expiration time and deletes those that have expired. Several limits keep the cost down: the number of databases processed per call, the number of times activeExpireCycle runs per second, the CPU time allotted to activeExpireCycle, and the percentage of expired keys required to continue deleting. Together these greatly reduce the impact of key expiration on overall system performance. However, if a large number of keys expire at the same moment within a short period in a real application, the system's responsiveness can still drop, so this situation should be avoided.
Reference links:
- http://redis.io/commands/expire
- http://redis.io/topics/latency
- http://www.cppblog.com/richbirdandy/archive/2011/11/29/161184.html
- http://www.cnblogs.com/tangtianfly/archive/2012/05/02/2479315.html