Redis Source Analysis (19): replication.c, the implementation of master-slave data replication

Source: Internet
Author: User
Tags auth prepare readable redis sha1


The English word "replication" literally means "copy". replication.c is the last file I analyze in the Data directory, which says something about its importance. At 1800+ lines of code it is genuinely hard to digest, so I can only give the general impression I formed while reading it. Drawing a structural diagram that relates all the APIs to one another is something I cannot do yet.

Master-slave replication is one of the best and most common ways to achieve read/write separation. Once the user count reaches a certain level and a single server can no longer withstand tens of millions of page views, a master-slave database setup is something any architect would think of. Here I call the two sides of Redis replication the master client and the slave client, because each client holds the db it belongs to and replication is carried out through the client itself. In other words, a Redis deployment has one master client and multiple slave clients, and the slaves replicate from the master. Because there are so many APIs, I have grouped them roughly:


/* ---------------------------------- MASTER -------------------------------- */
void createReplicationBacklog(void) /* Create the backlog buffer */
void resizeReplicationBacklog(long long newsize) /* Resize the replication backlog when its configured size changes */
void freeReplicationBacklog(void) /* Free the backlog */
void feedReplicationBacklog(void *ptr, size_t len) /* Append data to the backlog; this advances the master_repl_offset offset */
void feedReplicationBacklogWithObject(robj *o) /* Append data to the backlog, taking a Redis string object as the parameter */
void replicationFeedSlaves(list *slaves, int dictid, robj **argv, int argc) /* Propagate commands executed on the master to the slaves */
void replicationFeedMonitors(redisClient *c, list *monitors, int dictid, robj **argv, int argc) /* Send data to the MONITOR clients */
long long addReplyReplicationBacklog(redisClient *c, long long offset) /* Send the slave the backlog contents starting at the given offset */
int masterTryPartialResynchronization(redisClient *c) /* Master side: attempt a partial resynchronization */
void syncCommand(redisClient *c) /* Implementation of the SYNC and PSYNC commands */
void replconfCommand(redisClient *c) /* REPLCONF: lets the slave set parameters while replication is being configured */
void sendBulkToSlave(aeEventLoop *el, int fd, void *privdata, int mask) /* Send the bulk data to the slave client */
void updateSlavesWaitingBgsave(int bgsaveerr) /* Called when a background save ends, to update the slaves waiting for it */

/* ----------------------------------- SLAVE -------------------------------- */
void replicationAbortSyncTransfer(void) /* Abort the synchronization transfer with the master */
void replicationSendNewlineToMaster(void) /* Send a single newline from the slave to the master; it is still valid protocol and keeps the master from deciding that the slave has timed out */
void replicationEmptyDbCallback(void *privdata) /* Callback used while emptying the database, when old data is flushed before loading the new data */
void readSyncBulkPayload(aeEventLoop *el, int fd, void *privdata, int mask) /* Slave side: read the bulk data sent during synchronization */
char *sendSynchronousCommand(int fd, ...) /* Send a command synchronously from the slave to the master, used for authentication and parameter configuration */
int slaveTryPartialResynchronization(int fd) /* Slave side: attempt a partial resynchronization */
void syncWithMaster(aeEventLoop *el, int fd, void *privdata, int mask) /* Synchronize with the master over the socket connection, including port number confirmation and so on */
int connectWithMaster(void) /* Connect to the master client */
void undoConnectWithMaster(void) /* Undo the connection to the master client */
int cancelReplicationHandshake(void) /* Abort a non-blocking replication attempt if one is in progress */
void replicationSetMaster(char *ip, int port) /* Set the IP address and port number of the master client */
void replicationUnsetMaster(void)
void slaveofCommand(redisClient *c)
void roleCommand(redisClient *c)
void replicationSendAck(void) /* Send an ACK to the master client reporting the offset processed so far */

/* ---------------------- MASTER CACHING FOR PSYNC -------------------------- */
void replicationCacheMaster(redisClient *c) /* Cache the master client's state */
void replicationDiscardCachedMaster(void) /* Free the cached master client once it will no longer be used */
void replicationResurrectCachedMaster(int newfd) /* Bring the cached master client back to life on a new socket */

/* ------------------------- MIN-SLAVES-TO-WRITE ---------------------------- */
void refreshGoodSlavesCount(void) /* Refresh the count of "good" slave clients */
void replicationScriptCacheInit(void)
void replicationScriptCacheFlush(void)
void replicationScriptCacheAdd(sds sha1)
int replicationScriptCacheExists(sds sha1)
void replicationCron(void)
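To make the backlog functions in the MASTER group more concrete, here is a minimal sketch of a circular backlog whose global offset grows as data is fed in. It is not the Redis implementation: the `demo_backlog` type, its field names, the tiny 16-byte buffer, and the convention that an offset means "number of bytes the slave already has" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the backlog state that the real
 * server keeps globally. All names here are illustrative only. */
typedef struct {
    char buf[16];            /* fixed-size circular buffer (tiny for the demo) */
    size_t size;             /* capacity of buf */
    size_t idx;              /* next write position inside buf */
    size_t histlen;          /* number of valid bytes currently stored */
    long long master_offset; /* total number of bytes ever fed */
} demo_backlog;

void demo_backlog_init(demo_backlog *b) {
    memset(b, 0, sizeof(*b));
    b->size = sizeof(b->buf);
}

/* Append len bytes, wrapping around and overwriting the oldest data.
 * Feeding always advances the global offset, even when old bytes are lost. */
void demo_backlog_feed(demo_backlog *b, const void *ptr, size_t len) {
    const char *p = ptr;
    b->master_offset += (long long)len;
    while (len > 0) {
        size_t chunk = b->size - b->idx;   /* room left before wrapping */
        if (chunk > len) chunk = len;
        memcpy(b->buf + b->idx, p, chunk);
        b->idx += chunk;
        if (b->idx == b->size) b->idx = 0; /* wrap around */
        p += chunk;
        len -= chunk;
        b->histlen += chunk;
    }
    if (b->histlen > b->size) b->histlen = b->size;
}

/* A slave that already holds `offset` bytes can be served from the backlog
 * only if every byte after that point is still stored. Returns 1 or 0. */
int demo_backlog_can_serve(const demo_backlog *b, long long offset) {
    long long oldest = b->master_offset - (long long)b->histlen;
    return offset >= oldest && offset <= b->master_offset;
}
```

Feeding 20 bytes into this 16-byte buffer leaves only the last 16 available, so a slave that stopped at offset 3 would need a full resynchronization while one that stopped at offset 4 could still catch up from the backlog.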
Here is a typical operation, the slave client synchronizing with the master client:
/* Synchronize with the master client over the socket connection, including port number confirmation and so on */
void syncWithMaster(aeEventLoop *el, int fd, void *privdata, int mask) {
    char tmpfile[256], *err;
    int dfd, maxtries = 5;
    int sockerr = 0, psync_result;
    socklen_t errlen = sizeof(sockerr);
    REDIS_NOTUSED(el);
    REDIS_NOTUSED(privdata);
    REDIS_NOTUSED(mask);

    /* If this event fired after the user turned the instance into a master
     * with SLAVEOF NO ONE we must just return ASAP. */
    if (server.repl_state == REDIS_REPL_NONE) {
        close(fd);
        return;
    }

    /* Check for errors in the socket. */
    /* Is the socket connection healthy? */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &sockerr, &errlen) == -1)
        sockerr = errno;
    if (sockerr) {
        aeDeleteFileEvent(server.el, fd, AE_READABLE|AE_WRITABLE);
        redisLog(REDIS_WARNING, "Error condition on socket for SYNC: %s",
            strerror(sockerr));
        goto error;
    }

    /* If we were connecting, it's time to send a non blocking PING, we want to
     * make sure the master is able to reply before going into the actual
     * replication process where we have long timeouts in the order of
     * seconds (in the meantime the slave would block). */
    /* Connection test: the slave client sends a PING to the master client and checks whether a reply arrives within the timeout */
    if (server.repl_state == REDIS_REPL_CONNECTING) {
        redisLog(REDIS_NOTICE, "Non blocking connect for SYNC fired the event.");
        /* Delete the writable event so that the readable event remains
         * registered and we can wait for the PONG reply. */
        aeDeleteFileEvent(server.el, fd, AE_WRITABLE);
        server.repl_state = REDIS_REPL_RECEIVE_PONG;
        /* Send the PING, don't check for errors at all, we have the timeout
         * that will take care about this. */
        // Send the PING command
        syncWrite(fd, "PING\r\n", 6, 100);
        return;
    }

    /* Receive the PONG command. */
    // A reply has arrived
    if (server.repl_state == REDIS_REPL_RECEIVE_PONG) {
        char buf[1024];

        /* Delete the readable event, we no longer need it now that there is
         * the PING reply to read. */
        aeDeleteFileEvent(server.el, fd, AE_READABLE);

        /* Read the reply with explicit timeout. */
        buf[0] = '\0';
        if (syncReadLine(fd, buf, sizeof(buf),
            server.repl_syncio_timeout*1000) == -1)
        {
            redisLog(REDIS_WARNING,
                "I/O error reading PING reply from master: %s",
                strerror(errno));
            goto error;
        }

        /* We accept only two replies as valid, a positive +PONG reply
         * (we just check for "+") or an authentication error.
         * Note that older versions of Redis replied with "operation not
         * permitted" instead of using a proper error code, so we test
         * both. */
        if (buf[0] != '+' &&
            strncmp(buf, "-NOAUTH", 7) != 0 &&
            strncmp(buf, "-ERR operation not permitted", 28) != 0)
        {
            redisLog(REDIS_WARNING, "Error reply to PING from master: '%s'", buf);
            goto error;
        } else {
            redisLog(REDIS_NOTICE,
                "Master replied to PING, replication can continue...");
        }
    }

    /* AUTH with the master if required. */
    // AUTH authentication
    if (server.masterauth) {
        err = sendSynchronousCommand(fd, "AUTH", server.masterauth, NULL);
        if (err[0] == '-') {
            redisLog(REDIS_WARNING, "Unable to AUTH to MASTER: %s", err);
            sdsfree(err);
            goto error;
        }
        sdsfree(err);
    }

    /* Set the slave port, so that Master's INFO command can list the
     * slave listening port correctly. */
    /* Report the slave client's listening port */
    {
        sds port = sdsfromlonglong(server.port);
        err = sendSynchronousCommand(fd, "REPLCONF", "listening-port", port,
                                     NULL);
        sdsfree(port);
        /* Ignore the error if any, not all the Redis versions support
         * REPLCONF listening-port. */
        if (err[0] == '-') {
            redisLog(REDIS_NOTICE, "(Non critical) Master does not understand REPLCONF listening-port: %s", err);
        }
        sdsfree(err);
    }

    /* Try a partial resynchronization. If we don't have a cached master
     * slaveTryPartialResynchronization() will at least try to use PSYNC
     * to start a full resynchronization so that we get the master run id
     * and the global offset, to try a partial resync at the next
     * reconnection attempt. */
    psync_result = slaveTryPartialResynchronization(fd);
    if (psync_result == PSYNC_CONTINUE) {
        redisLog(REDIS_NOTICE, "MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.");
        return;
    }

    /* Fall back to SYNC if needed. Otherwise psync_result == PSYNC_FULLRESYNC
     * and the server.repl_master_runid and repl_master_initial_offset are
     * already populated. */
    if (psync_result == PSYNC_NOT_SUPPORTED) {
        redisLog(REDIS_NOTICE, "Retrying with SYNC...");
        if (syncWrite(fd, "SYNC\r\n", 6, server.repl_syncio_timeout*1000) == -1) {
            redisLog(REDIS_WARNING, "I/O error writing to MASTER: %s",
                strerror(errno));
            goto error;
        }
    }

    /* Prepare a suitable temp file for bulk transfer */
    while (maxtries--) {
        snprintf(tmpfile, 256,
            "temp-%d.%ld.rdb", (int)server.unixtime, (long int)getpid());
        dfd = open(tmpfile, O_CREAT|O_WRONLY|O_EXCL, 0644);
        if (dfd != -1) break;
        sleep(1);
    }
    if (dfd == -1) {
        redisLog(REDIS_WARNING, "Opening the temp file needed for MASTER <-> SLAVE synchronization: %s", strerror(errno));
        goto error;
    }

    /* Setup the non blocking download of the bulk file. */
    if (aeCreateFileEvent(server.el, fd, AE_READABLE, readSyncBulkPayload, NULL)
            == AE_ERR)
    {
        redisLog(REDIS_WARNING,
            "Can't create readable event for SYNC: %s (fd=%d)",
            strerror(errno), fd);
        goto error;
    }

    server.repl_state = REDIS_REPL_TRANSFER;
    server.repl_transfer_size = -1;
    server.repl_transfer_read = 0;
    server.repl_transfer_last_fsync_off = 0;
    server.repl_transfer_fd = dfd;
    server.repl_transfer_lastio = server.unixtime;
    server.repl_transfer_tmpfile = zstrdup(tmpfile);
    return;

error:
    close(fd);
    server.repl_transfer_s = -1;
    server.repl_state = REDIS_REPL_CONNECT;
    return;
}
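The function above is called repeatedly by the event loop and behaves like a small state machine. The sketch below isolates two pieces of it: the check on the PING reply and the state transitions. It is a simplification under stated assumptions: the `demo_` names are hypothetical, the AUTH, REPLCONF and PSYNC steps are collapsed into the single move to the transfer state, and the early return on a successful partial resynchronization is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the replication states used in syncWithMaster(). */
typedef enum {
    DEMO_REPL_NONE,         /* not a slave at all */
    DEMO_REPL_CONNECT,      /* must (re)connect to the master */
    DEMO_REPL_CONNECTING,   /* non-blocking connect in progress */
    DEMO_REPL_RECEIVE_PONG, /* PING sent, waiting for the reply */
    DEMO_REPL_TRANSFER      /* receiving the bulk payload */
} demo_repl_state;

/* Returns 1 if the first line of the master's reply to PING lets the
 * handshake continue: any "+" reply, or one of the two authentication
 * errors (AUTH is attempted afterwards). Returns 0 otherwise. */
int demo_ping_reply_ok(const char *reply) {
    if (reply[0] == '+') return 1;
    if (strncmp(reply, "-NOAUTH", 7) == 0) return 1;
    if (strncmp(reply, "-ERR operation not permitted", 28) == 0) return 1;
    return 0;
}

/* One callback invocation: given the current state and, when waiting for
 * the PONG, the reply line that was read, return the next state. */
demo_repl_state demo_handshake_step(demo_repl_state state, const char *reply) {
    switch (state) {
    case DEMO_REPL_CONNECTING:
        /* The connection is up: PING is sent and we wait for the answer. */
        return DEMO_REPL_RECEIVE_PONG;
    case DEMO_REPL_RECEIVE_PONG:
        /* A bad reply takes the "error:" path, which schedules a reconnect. */
        return demo_ping_reply_ok(reply) ? DEMO_REPL_TRANSFER
                                         : DEMO_REPL_CONNECT;
    default:
        /* Other states are not advanced by this sketch. */
        return state;
    }
}
```

Keeping the reply check as a pure function makes the accepted cases easy to see: a master that requires a password still counts as alive, because the authentication error itself proves that it answered.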
Replication also needs the concept of a cached master, which temporarily keeps the state of the master client. It is generally used when the master and slave are suddenly disconnected, so that replication can be restored quickly the next time they synchronize:
/* Cache the master client's state */
void replicationCacheMaster(redisClient *c) {
    listNode *ln;

    redisAssert(server.master != NULL && server.cached_master == NULL);
    redisLog(REDIS_NOTICE, "Caching the disconnected master state.");

    /* Remove from the list of clients, we don't want this client to be
     * listed by CLIENT LIST or processed in any way by batch operations. */
    // Remove this client from the list first
    ln = listSearchKey(server.clients, c);
    redisAssert(ln != NULL);
    listDelNode(server.clients, ln);

    /* Save the master. Server.master will be set to null later by
     * replicationHandleMasterDisconnection(). */
    // Save it as the cached master...
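Since the listing stops here, the following minimal sketch shows the general idea of caching the master and bringing it back later. It is not the Redis code. The `demo_master` and `demo_server` types are hypothetical, the run id is stored on the master object for simplicity, caching and clearing the current master happen in one step, and the PSYNC argument format (run id plus the next wanted offset, or `? -1` when nothing is cached) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified view of the master as seen from the slave. */
typedef struct {
    char runid[41];    /* run id of the master instance */
    long long reploff; /* replication offset applied so far */
    int fd;            /* socket, or -1 while disconnected */
} demo_master;

typedef struct {
    demo_master *master;        /* currently connected master, if any */
    demo_master *cached_master; /* kept after a disconnection */
} demo_server;

/* On disconnection, keep the master state instead of freeing it.
 * (The real code clears server.master later, in a separate function.) */
void demo_cache_master(demo_server *s) {
    assert(s->master != NULL && s->cached_master == NULL);
    s->cached_master = s->master;
    s->cached_master->fd = -1;
    s->master = NULL;
}

/* Build the resynchronization request for the next connection attempt.
 * With a cached master we ask to continue from the next byte; otherwise
 * we ask for a full resynchronization. Returns snprintf's result. */
int demo_build_psync(const demo_server *s, char *out, size_t outlen) {
    if (s->cached_master != NULL)
        return snprintf(out, outlen, "PSYNC %s %lld",
                        s->cached_master->runid,
                        s->cached_master->reploff + 1);
    return snprintf(out, outlen, "PSYNC ? -1");
}

/* If the master accepts the partial resync, reuse the cached state on
 * the new socket instead of starting from scratch. */
void demo_resurrect_cached_master(demo_server *s, int newfd) {
    assert(s->cached_master != NULL);
    s->master = s->cached_master;
    s->cached_master = NULL;
    s->master->fd = newfd;
}
```

The point of the design is that the offset survives the disconnection, so a short network interruption costs only the bytes that were missed, not a whole new bulk transfer.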
