Analysis of the AOF Principle of the Redis Data Persistence Mechanism

Source: Internet
Author: User

/* Function called at startup to load RDB or AOF file in memory. */
void loadDataFromDisk(void) {
    long long start = ustime();
    if (server.aof_state == REDIS_AOF_ON) {
        if (loadAppendOnlyFile(server.aof_filename) == REDIS_OK)
            redisLog(REDIS_NOTICE,"DB loaded from append only file: %.3f seconds",(float)(ustime()-start)/1000000);
    } else {
        if (rdbLoad(server.rdb_filename) == REDIS_OK) {
            redisLog(REDIS_NOTICE,"DB loaded from disk: %.3f seconds",
                (float)(ustime()-start)/1000000);
        } else if (errno != ENOENT) {
            redisLog(REDIS_WARNING,"Fatal error loading the DB: %s. Exiting.",strerror(errno));
            exit(1);
        }
    }
}

The server checks the AOF state first: if AOF is enabled, it loads the AOF file instead of the RDB file, because the data in the AOF file is more recent than the data in the RDB file.

int loadAppendOnlyFile(char *filename) {
    struct redisClient *fakeClient;
    FILE *fp = fopen(filename,"r");
    struct redis_stat sb;
    int old_aof_state = server.aof_state;
    long loops = 0;

    // redis_fstat is the fstat64 function. The file descriptor is obtained
    // through fileno(fp) and the file state is stored in sb. See the stat
    // documentation for details; st_size is the file size in bytes.
    if (fp && redis_fstat(fileno(fp),&sb) != -1 && sb.st_size == 0) {
        server.aof_current_size = 0;
        fclose(fp);
        return REDIS_ERR;
    }

    if (fp == NULL) {
        redisLog(REDIS_WARNING,"Fatal error: can't open the append log file for reading: %s",strerror(errno));
        exit(1);
    }

    /* Temporarily disable AOF, to prevent EXEC from feeding a MULTI
     * to the same file we're about to read. */
    server.aof_state = REDIS_AOF_OFF;

    fakeClient = createFakeClient(); // create a fake client
    startLoading(fp);                // defined in rdb.c; updates the server's loading status

    while(1) {
        int argc, j;
        unsigned long len;
        robj **argv;
        char buf[128];
        sds argsds;
        struct redisCommand *cmd;

        /* Serve the clients from time to time */
        // Handle external requests at intervals. ftello() returns the
        // current position in the file.
        if (!(loops++ % 1000)) {
            loadingProgress(ftello(fp)); // record how far into the AOF file we have read
            aeProcessEvents(server.el, AE_FILE_EVENTS|AE_DONT_WAIT); // process events
        }

        // read the AOF data line by line
        if (fgets(buf,sizeof(buf),fp) == NULL) {
            if (feof(fp)) // reached the end of the file
                break;
            else
                goto readerr;
        }
        // read the command stored in the AOF file
        if (buf[0] != '*') goto fmterr;
        argc = atoi(buf+1); // number of arguments
        if (argc < 1) goto fmterr;

        argv = zmalloc(sizeof(robj*)*argc); // argument values
        for (j = 0; j < argc; j++) {
            if (fgets(buf,sizeof(buf),fp) == NULL) goto readerr;
            if (buf[0] != '$') goto fmterr;
            len = strtol(buf+1,NULL,10);  // length of this bulk argument
            argsds = sdsnewlen(NULL,len); // create an empty sds
            // read the argument according to the bulk length
            if (len && fread(argsds,len,1,fp) == 0) goto fmterr;
            argv[j] = createObject(REDIS_STRING,argsds);
            if (fread(buf,2,1,fp) == 0) goto fmterr; /* discard CRLF */
        }

        /* Command lookup */
        cmd = lookupCommand(argv[0]->ptr);
        if (!cmd) {
            redisLog(REDIS_WARNING,"Unknown command '%s' reading the append only file", (char*)argv[0]->ptr);
            exit(1);
        }

        /* Run the command in the context of a fake client */
        fakeClient->argc = argc;
        fakeClient->argv = argv;
        cmd->proc(fakeClient); // execute the command

        /* The fake client should not have a reply */
        redisAssert(fakeClient->bufpos == 0 && listLength(fakeClient->reply) == 0);
        /* The fake client should never get blocked */
        redisAssert((fakeClient->flags & REDIS_BLOCKED) == 0);

        /* Clean up. Command code may have changed argv/argc so we use the
         * argv/argc of the client instead of the local variables. */
        for (j = 0; j < fakeClient->argc; j++)
            decrRefCount(fakeClient->argv[j]);
        zfree(fakeClient->argv);
    }

    /* This point can only be reached when EOF is reached without errors.
     * If the client is in the middle of a MULTI/EXEC, log error and quit. */
    if (fakeClient->flags & REDIS_MULTI) goto readerr;

    fclose(fp);
    freeFakeClient(fakeClient);
    server.aof_state = old_aof_state;
    stopLoading();
    aofUpdateCurrentSize(); // update server.aof_current_size, the AOF file size
    server.aof_rewrite_base_size = server.aof_current_size;
    return REDIS_OK;
    ............
}
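The loop above reads each command in the Redis protocol layout: a `*<argc>` line, then for every argument a `$<len>` line followed by the argument bytes and a trailing CRLF. The following is a minimal standalone sketch of that parsing step only, assuming plain `malloc` strings in place of Redis's `sds` and `robj` types. The names `read_aof_command` and `free_args` are hypothetical and do not exist in Redis.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: release an argument vector built below.
 * free(NULL) is a no-op, so partially filled vectors are fine. */
void free_args(char **argv, int argc) {
    for (int j = 0; j < argc; j++) free(argv[j]);
    free(argv);
}

/* Hypothetical helper, not part of Redis: read one command record from fp.
 * Returns argc (>= 1) on success, 0 on a clean end of file, and -1 on a
 * read or format error. On success *out_argv holds argc NUL-terminated
 * copies of the arguments, to be released with free_args(). */
int read_aof_command(FILE *fp, char ***out_argv) {
    char buf[128];
    assert(fp != NULL && out_argv != NULL);

    if (fgets(buf, sizeof(buf), fp) == NULL)
        return feof(fp) ? 0 : -1;
    if (buf[0] != '*') return -1;          /* every record starts with '*' */
    int argc = atoi(buf + 1);              /* number of arguments */
    if (argc < 1) return -1;

    char **argv = calloc((size_t)argc, sizeof(char *));
    if (argv == NULL) return -1;

    for (int j = 0; j < argc; j++) {
        if (fgets(buf, sizeof(buf), fp) == NULL || buf[0] != '$') goto fail;
        long len = strtol(buf + 1, NULL, 10);   /* length of this argument */
        if (len < 0) goto fail;
        argv[j] = malloc((size_t)len + 1);
        if (argv[j] == NULL) goto fail;
        if (len > 0 && fread(argv[j], (size_t)len, 1, fp) != 1) goto fail;
        argv[j][len] = '\0';
        if (fread(buf, 2, 1, fp) != 1) goto fail;   /* discard trailing CRLF */
    }
    *out_argv = argv;
    return argc;

fail:
    free_args(argv, argc);
    return -1;
}
```

Feeding it a file containing `*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nhello\r\n` yields three arguments: `SET`, `key`, and `hello`. Unlike the Redis loader, this sketch does not execute anything; it only shows how a record is delimited.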
The previous post on AOF parameter configuration left one question open: how the server.aof_current_size parameter is initialized. The aofUpdateCurrentSize() call at the end of loadAppendOnlyFile() answers it.

void aofUpdateCurrentSize(void) {
    struct redis_stat sb;

    if (redis_fstat(server.aof_fd,&sb) == -1) {
        redisLog(REDIS_WARNING,"Unable to obtain the AOF file length. stat: %s",
            strerror(errno));
    } else {
        server.aof_current_size = sb.st_size;
    }
}

redis_fstat is the name Redis gives to the fstat64 function on Linux; it retrieves information about a file (see the fstat manual page for details). sb.st_size is the size of the current AOF file in bytes. server.aof_fd is the AOF file descriptor, which is initialized in the initServer() function:

    /* Open the AOF file if needed. */
    if (server.aof_state == REDIS_AOF_ON) {
        server.aof_fd = open(server.aof_filename,O_WRONLY|O_APPEND|O_CREAT,0644);
        if (server.aof_fd == -1) {
            redisLog(REDIS_WARNING, "Can't open the append-only file: %s",strerror(errno));
            exit(1);
        }
    }
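The two pieces above, opening the file for appending and reading its size from the descriptor, can be tried outside Redis. This is a minimal sketch that assumes a POSIX system and uses plain `fstat` rather than `fstat64`. The helper names `open_append_only` and `file_size` are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) a file with the same flags that
 * initServer() passes for the AOF file. O_APPEND makes every write() land
 * at the current end of the file. Returns the descriptor, or -1 on error. */
int open_append_only(const char *path) {
    assert(path != NULL);
    return open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
}

/* Hypothetical helper mirroring aofUpdateCurrentSize(): ask the kernel for
 * the file's metadata and return its size in bytes, or -1 on error. */
long long file_size(int fd) {
    struct stat sb;
    if (fstat(fd, &sb) == -1) return -1;
    return (long long)sb.st_size;
}
```

Calling `file_size()` on a freshly created file returns 0, and after a successful 5-byte `write()` on the same descriptor it returns 5, which is the same bookkeeping Redis keeps in server.aof_current_size.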

At this point, the Redis server has finished loading the AOF file data from disk at startup.

When a client executes a command such as SET that modifies the database, the data in the server's database changes. The change must also be appended to the AOF file promptly, and the file must be flushed to the hard disk according to the configured fsync policy so that data is not lost.

/* This function gets called every time Redis is entering the
 * main loop of the event driven library, that is, before to sleep
 * for ready file descriptors. */
void beforeSleep(struct aeEventLoop *eventLoop) {
    REDIS_NOTUSED(eventLoop);
    listNode *ln;
    redisClient *c;

    /* Run a fast expire cycle (the called function will return
     * ASAP if a fast cycle is not needed). */
    if (server.active_expire_enabled && server.masterhost == NULL)
        activeExpireCycle(ACTIVE_EXPIRE_CYCLE_FAST);

    /* Try to process pending commands for clients that were just unblocked. */
    while (listLength(server.unblocked_clients)) {
        ln = listFirst(server.unblocked_clients);
        redisAssert(ln != NULL);
        c = ln->value;
        listDelNode(server.unblocked_clients,ln);
        c->flags &= ~REDIS_UNBLOCKED;

        /* Process remaining data in the input buffer. */
        // handle requests the client sent while it was blocked
        if (c->querybuf && sdslen(c->querybuf) > 0) {
            server.current_client = c;
            processInputBuffer(c);
            server.current_client = NULL;
        }
    }

    /* Write the AOF buffer on disk */
    // append the data in server.aof_buf to the AOF file and fsync it to disk
    flushAppendOnlyFile(0);
}

From the code and comments above, the beforeSleep function does three things: 1. it runs a fast expire cycle for expired keys; 2. it processes the requests that clients sent while they were blocked; 3. it appends the data in server.aof_buf to the AOF file and fsyncs it to the hard disk. The flushAppendOnlyFile function takes a force parameter that indicates whether the write to the AOF file must happen now; 1 means a forced write.
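The third step follows a buffer-then-flush pattern: commands accumulate in memory and are written out in one `write()` call per pass through the event loop. Below is a minimal sketch of that pattern under simplifying assumptions: a plain growable byte buffer stands in for server.aof_buf, and there is no fsync handling. The type `append_buf` and the functions `buf_feed` and `buf_flush` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for server.aof_buf. */
typedef struct {
    char *data;
    size_t len;   /* bytes currently buffered */
    size_t cap;   /* bytes allocated */
} append_buf;

/* Append n bytes to the in-memory buffer, growing it when needed.
 * Returns 0 on success, -1 if memory could not be allocated. */
int buf_feed(append_buf *b, const char *bytes, size_t n) {
    assert(b != NULL);
    if (n == 0) return 0;
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 64;
        while (cap < b->len + n) cap *= 2;
        char *p = realloc(b->data, cap);
        if (p == NULL) return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, bytes, n);
    b->len += n;
    return 0;
}

/* Write everything buffered so far with a single write() call, then empty
 * the buffer. Returns the number of bytes written, 0 if there was nothing
 * to write, or -1 on an error or short write (the buffer is kept then). */
ssize_t buf_flush(append_buf *b, int fd) {
    assert(b != NULL);
    if (b->len == 0) return 0;
    ssize_t nwritten = write(fd, b->data, b->len);
    if (nwritten != (ssize_t)b->len) return -1;
    b->len = 0;
    return nwritten;
}
```

A caller would invoke `buf_feed()` once per executed write command and `buf_flush()` once before going back to sleep. Redis treats a short write as fatal and exits; this sketch only reports it.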

void flushAppendOnlyFile(int force) {
    ssize_t nwritten;
    int sync_in_progress = 0;

    if (sdslen(server.aof_buf) == 0) return;

    // check whether an fsync job is still pending in the background
    if (server.aof_fsync == AOF_FSYNC_EVERYSEC)
        sync_in_progress = bioPendingJobsOfType(REDIS_BIO_AOF_FSYNC) != 0;

    // the policy is fsync every second and this is not a forced write
    if (server.aof_fsync == AOF_FSYNC_EVERYSEC && !force) {
        /* With this append fsync policy we do background fsyncing.
         * If the fsync is still in progress we can try to delay
         * the write for a couple of seconds. */
        // a job is waiting in the background fsync queue
        if (sync_in_progress) {
            // the flush has not been postponed before: record when the
            // postponement started, then return
            if (server.aof_flush_postponed_start == 0) {
                /* No previous write postponing, remember that we are
                 * postponing the flush and return. */
                server.aof_flush_postponed_start = server.unixtime;
                return;
            } else if (server.unixtime - server.aof_flush_postponed_start < 2) {
                // postponed for less than two seconds: keep waiting
                /* We were already waiting for fsync to finish, but for less
                 * than two seconds this is still ok. Postpone again. */
                return;
            }
            /* Otherwise fall through, and go write since we can't wait
             * over two seconds. */
            server.aof_delayed_fsync++;
            redisLog(REDIS_NOTICE,"Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.");
        }
    }
    /* If you are following this code path, then we are going to write so
     * set reset the postponed flush sentinel to zero. */
    server.aof_flush_postponed_start = 0;

    /* We want to perform a single write. This should be guaranteed atomic
     * at least if the filesystem we are writing is a real physical one.
     * While this will save us against the server being killed I don't think
     * there is much to do about the whole server stopping for power problems
     * or alike */
    // write the AOF buffer to the file; if all goes well the write is atomic
    nwritten = write(server.aof_fd,server.aof_buf,sdslen(server.aof_buf));
    if (nwritten != (signed)sdslen(server.aof_buf)) { // error
        /* Ooops, we are in troubles. The best thing to do for now is
         * aborting instead of giving the illusion that everything is
         * working as expected. */
        if (nwritten == -1) {
            redisLog(REDIS_WARNING,"Exiting on error writing to the append-only file: %s",strerror(errno));
        } else {
            redisLog(REDIS_WARNING,"Exiting on short write while writing to "
                                   "the append-only file: %s (nwritten=%ld, "
                                   "expected=%ld)",
                                   strerror(errno),
                                   (long)nwritten,
                                   (long)sdslen(server.aof_buf));

            if (ftruncate(server.aof_fd, server.aof_current_size) == -1) {
                redisLog(REDIS_WARNING, "Could not remove short write "
                         "from the append-only file.  Redis may refuse "
                         "to load the AOF the next time it starts.  "
                         "ftruncate: %s", strerror(errno));
            }
        }
        exit(1);
    }
    server.aof_current_size += nwritten;

    /* Re-use AOF buffer when it is small enough. The maximum comes from the
     * arena size of 4k minus some overhead (but is otherwise arbitrary). */
    // reuse the AOF buffer if it is small; otherwise free it and start a new one
    if ((sdslen(server.aof_buf)+sdsavail(server.aof_buf)) < 4000) {
        sdsclear(server.aof_buf);
    } else {
        sdsfree(server.aof_buf);
        server.aof_buf = sdsempty();
    }

    /* Don't fsync if no-appendfsync-on-rewrite is set to yes and there are
     * children doing I/O in the background. */
    // if fsync is disabled during rewrites and an AOF or RDB child process is
    // running, return here: the data has been written to the AOF file but
    // has not yet been flushed to the hard disk
    if (server.aof_no_fsync_on_rewrite &&
        (server.aof_child_pid != -1 || server.rdb_child_pid != -1))
            return;

    /* Perform the fsync if needed. */
    if (server.aof_fsync == AOF_FSYNC_ALWAYS) { // always policy: fsync right here
        /* aof_fsync is defined as fdatasync() for Linux in order to avoid
         * flushing metadata. */
        aof_fsync(server.aof_fd); /* Let's try to get this data on the disk */
        server.aof_last_fsync = server.unixtime;
    } else if ((server.aof_fsync == AOF_FSYNC_EVERYSEC &&
                server.unixtime > server.aof_last_fsync)) {
        if (!sync_in_progress) aof_background_fsync(server.aof_fd); // hand the fsync to the background thread
        server.aof_last_fsync = server.unixtime;
    }
}

Note the server.aof_fsync parameter in the code above: it selects the policy Redis uses to fsync the AOF file to the hard disk. If it is set to AOF_FSYNC_ALWAYS, fsync is called directly in the main thread. If it is set to AOF_FSYNC_EVERYSEC, the fsync is handed to a background thread, whose code is in bio.c.
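The postponement rule for the every-second policy is easy to lose in the full function, so here it is isolated as a pure function. This is a sketch of the decision only, assuming the behaviour described above: postpone while a background fsync is pending, but for less than two seconds. The names `flush_decision` and `everysec_decide` are hypothetical, and `*postponed_start` plays the role of server.aof_flush_postponed_start.

```c
#include <assert.h>
#include <time.h>

typedef enum { FLUSH_WRITE_NOW, FLUSH_POSTPONE } flush_decision;

/* Hypothetical pure-function version of the postponement check under the
 * every-second policy. *postponed_start is 0 when no flush is currently
 * being postponed; otherwise it holds the time the postponement began. */
flush_decision everysec_decide(int force, int sync_in_progress,
                               time_t now, time_t *postponed_start) {
    assert(postponed_start != NULL);
    if (!force && sync_in_progress) {
        if (*postponed_start == 0) {
            *postponed_start = now;     /* first postponement: remember when */
            return FLUSH_POSTPONE;
        }
        if (now - *postponed_start < 2)
            return FLUSH_POSTPONE;      /* still inside the two-second window */
        /* waited two seconds or more: fall through and write anyway */
    }
    *postponed_start = 0;               /* we are writing: reset the marker */
    return FLUSH_WRITE_NOW;
}
```

With a background fsync pending, calls at seconds 100 and 101 postpone the write, while a call at second 102 writes anyway. A forced call, or a call with no fsync pending, always writes.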






