Redis Master-Slave Replication

Source: Internet
Author: User
 

Redis currently provides asynchronous master-slave replication, and it offers several methods for doing this. Master-slave replication mainly corresponds to the replication.c file in the Redis source. The source code framework consists of three parts: master, slave, and replication.

In fact, I personally think the harder part of master-slave replication is the initial synchronization, that is, how data is transferred between master and slave after each restart.

First, the slave actively connects to its master.

int connectWithMaster(void) {
    int fd;
    fd = anetTcpNonBlockConnect(NULL,server.masterhost,server.masterport);
    if (fd == -1) {
        redisLog(REDIS_WARNING,"Unable to connect to MASTER: %s",
            strerror(errno));
        return REDIS_ERR;
    }
    if (aeCreateFileEvent(server.el,fd,AE_READABLE|AE_WRITABLE,syncWithMaster,NULL) ==
            AE_ERR)
    {
        close(fd);
        redisLog(REDIS_WARNING,"Can't create readable event for SYNC");
        return REDIS_ERR;
    }
    server.repl_transfer_lastio = server.unixtime;
    server.repl_transfer_s = fd;
    server.repl_state = REDIS_REPL_CONNECTING;
    return REDIS_OK;
}

aeCreateFileEvent(server.el,fd,AE_READABLE|AE_WRITABLE,syncWithMaster,NULL) registers an event that fires when the socket becomes readable or writable. Note that the event handler is syncWithMaster.

repl_state is then set to REDIS_REPL_CONNECTING, and the related replication fields of server (repl_transfer_lastio and repl_transfer_s) are updated accordingly.
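The states the slave passes through in this walkthrough can be summarized as a small state machine. The following is only a minimal sketch: the type and event names are hypothetical, and REPL_CONNECTED is an assumed final state that the excerpts in this article do not show.

```c
/* Slave-side replication states named in this walkthrough.
 * REPL_CONNECTED is an assumed final state, not quoted in the article. */
typedef enum {
    REPL_CONNECTING,     /* non-blocking connect() issued */
    REPL_RECEIVE_PONG,   /* PING sent, waiting for the reply */
    REPL_TRANSFER,       /* SYNC sent, receiving the RDB payload */
    REPL_CONNECTED       /* assumption: payload loaded, link is live */
} repl_state;

/* Events that move the slave forward (hypothetical names). */
typedef enum {
    EV_SOCKET_WRITABLE,  /* the connect completed */
    EV_PONG_RECEIVED,    /* a reply to PING was read */
    EV_PAYLOAD_LOADED    /* the whole RDB file was received and loaded */
} repl_event;

/* Return the next state, or the same state if the event does not apply. */
repl_state repl_next(repl_state s, repl_event ev) {
    switch (s) {
    case REPL_CONNECTING:
        return ev == EV_SOCKET_WRITABLE ? REPL_RECEIVE_PONG : s;
    case REPL_RECEIVE_PONG:
        return ev == EV_PONG_RECEIVED ? REPL_TRANSFER : s;
    case REPL_TRANSFER:
        return ev == EV_PAYLOAD_LOADED ? REPL_CONNECTED : s;
    default:
        return s;
    }
}
```

Each step only advances on the one event it is waiting for, which is why a single handler such as syncWithMaster can be re-entered by the event loop and simply branch on the current state.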

Now let's look at syncWithMaster:

if (server.repl_state == REDIS_REPL_CONNECTING) {
    redisLog(REDIS_NOTICE,"Non blocking connect for SYNC fired the event.");
    /* Delete the writable event so that the readable event remains
     * registered and we can wait for the PONG reply. */
    aeDeleteFileEvent(server.el,fd,AE_WRITABLE);
    server.repl_state = REDIS_REPL_RECEIVE_PONG;
    /* Send the PING, don't check for errors at all, we have the timeout
     * that will take care about this. */
    syncWrite(fd,"PING\r\n",6,100);
    return;
}

First, the slave, acting as a client, sends a PING to the master. The write is done with a timeout. The state is changed to REDIS_REPL_RECEIVE_PONG, which means the slave now waits for the master to act on the request. For the slave, the next step therefore runs in the REDIS_REPL_RECEIVE_PONG state, where it is ready to read a reply.

buf[0] = '\0';
if (syncReadLine(fd,buf,sizeof(buf),
    server.repl_syncio_timeout*1000) == -1)
{
    redisLog(REDIS_WARNING,
        "I/O error reading PING reply from master: %s",
        strerror(errno));
    goto error;
}

This is the core task of the RECEIVE_PONG state: read the reply and check whether it contains what is expected.
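The excerpt shows only the read, not the check. As a minimal sketch of what such a check might look like, the hypothetical helper below assumes that any status reply ('+') or error reply ('-') is good enough to prove that a Redis server is answering; an error may simply mean that the master requires authentication. This is an assumption for illustration, not a quotation of the Redis source.

```c
#include <stddef.h>

/* Return 1 if `line` looks like a usable reply to PING, 0 otherwise.
 * Assumption: a status ('+') or error ('-') reply both show that a Redis
 * server is on the other end of the connection. */
int ping_reply_ok(const char *line) {
    if (line == NULL || line[0] == '\0') return 0;   /* nothing was read */
    return line[0] == '+' || line[0] == '-';
}
```

With this rule, "+PONG" is accepted, an empty buffer (which is why the excerpt sets buf[0] to '\0' before reading) is rejected, and so is anything that does not start with a reply-type byte.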

if (syncWrite(fd,"SYNC\r\n",6,server.repl_syncio_timeout*1000) == -1) {
    redisLog(REDIS_WARNING,"I/O error writing to MASTER: %s",
        strerror(errno));
    goto error;
}
……
if (aeCreateFileEvent(server.el,fd, AE_READABLE,readSyncBulkPayload,NULL)
        == AE_ERR)

Send a SYNC to the master and register readSyncBulkPayload as the handler for the readable event. After the handler is registered, the corresponding state is set:

server.repl_state = REDIS_REPL_TRANSFER;
server.repl_transfer_size = -1;
server.repl_transfer_read = 0;
server.repl_transfer_last_fsync_off = 0;
server.repl_transfer_fd = dfd;
server.repl_transfer_lastio = server.unixtime;
server.repl_transfer_tmpfile = zstrdup(tmpfile);

Setting repl_transfer_size to -1 indicates that the size of the file to be received from the master is not yet known. The state changes to REDIS_REPL_TRANSFER. Now go to readSyncBulkPayload to see how the data is received:

server.repl_transfer_size = strtol(buf+1,NULL,10);

First, determine the size of the file the master is about to send, then read the data into buf and write it to the temporary RDB file.
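The strtol(buf+1,NULL,10) call above skips the leading '$' of the length header and parses the number after it. A slightly more defensive version of that step could look like the sketch below. parse_bulk_len is a hypothetical helper, not a Redis function, and the validation rules are my own assumptions.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a bulk-length header such as "$1048576" (with or without the
 * trailing CR/LF). Returns 0 and stores the length in *out on success,
 * -1 on malformed input. Hypothetical helper, not part of Redis. */
int parse_bulk_len(const char *line, long *out) {
    char *end;
    long v;

    if (line == NULL || out == NULL || line[0] != '$') return -1;
    if (line[1] < '0' || line[1] > '9') return -1;   /* need at least one digit */
    errno = 0;
    v = strtol(line + 1, &end, 10);
    if (errno == ERANGE) return -1;                  /* value does not fit in long */
    if (*end != '\0' && *end != '\r' && *end != '\n') return -1;
    *out = v;
    return 0;
}
```

Once the header has been parsed, repl_transfer_size stops being -1, and every later invocation of the handler knows how many bytes are still outstanding.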

left = server.repl_transfer_size - server.repl_transfer_read;
readlen = (left < (signed)sizeof(buf)) ? left : (signed)sizeof(buf);
nread = read(fd,buf,readlen);
……
if (write(server.repl_transfer_fd,buf,nread) != nread) { …… }
server.repl_transfer_read += nread;
/* Check if the transfer is now complete */
if (server.repl_transfer_read == server.repl_transfer_size) {
    if (rename(server.repl_transfer_tmpfile,server.rdb_filename) == -1) {
        redisLog(REDIS_WARNING,"Failed trying to rename the temp DB into dump.rdb in MASTER <-> SLAVE synchronization: %s", strerror(errno));
        replicationAbortSyncTransfer();
        return;
    }
    redisLog(REDIS_NOTICE, "MASTER <-> SLAVE sync: Loading DB in memory");
    signalFlushedDb(-1);
    emptyDb();
    /* Before loading the DB into memory we need to delete the readable
     * handler, otherwise it will get called recursively since
     * rdbLoad() will call the event loop to process events from time to
     * time for non blocking loading. */
    aeDeleteFileEvent(server.el,server.repl_transfer_s,AE_READABLE);
    if (rdbLoad(server.rdb_filename) != REDIS_OK) {
        redisLog(REDIS_WARNING,"Failed trying to load the MASTER synchronization DB from disk");
        replicationAbortSyncTransfer();
        return;
    }
    ……

If repl_transfer_read equals repl_transfer_size, the transfer is complete: the temporary file is renamed to rdb_filename, the database is cleared with emptyDb(), the readable event is deleted, and rdbLoad is called to rebuild a local copy of the key-value database. That completes all the operations on the slave side. The amount of data sent has to be agreed between the two sides, so next we look for the corresponding events on the master.
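The bookkeeping in the slave's read handler is easier to see in isolation. The sketch below reproduces the left/readlen computation and the completion test, using memory buffers in place of the socket and the temporary RDB file. The function names are hypothetical and the error handling is reduced to an argument check.

```c
#include <string.h>

/* How many bytes to request next: never more than the buffer holds and
 * never more than the master still owes us. Mirrors the left/readlen
 * computation in the excerpt. */
long next_read_len(long total, long already_read, long bufsize) {
    long left = total - already_read;
    return left < bufsize ? left : bufsize;
}

/* Copy `total` bytes from src to dst in chunks of at most `bufsize` bytes
 * and return the number of chunks used, or -1 on invalid arguments.
 * Memory buffers stand in for the socket and the temporary file. */
long receive_in_chunks(const char *src, char *dst, long total, long bufsize) {
    long done = 0, chunks = 0;

    if (total < 0 || bufsize <= 0) return -1;
    while (done != total) {                 /* same test as "transfer complete" */
        long n = next_read_len(total, done, bufsize);
        memcpy(dst + done, src + done, (size_t)n);
        done += n;
        chunks++;
    }
    return chunks;
}
```

In the real handler each loop iteration is a separate callback from the event loop, so the slave keeps serving other events between chunks; only the arithmetic is the same.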

For the master, apart from answering the PING that the slave sends, what actually matters for master-slave replication is this operation performed by the slave:

syncWrite(fd,"SYNC\r\n",6,server.repl_syncio_timeout*1000). When the master receives this command, it calls syncCommand() to carry out the corresponding copy. So first, go to the syncCommand() function to see what happens:

To complete the replication, the master runs a BGSAVE at the right time, which makes sure that the RDB file is up to date. The master first checks rdb_pid != -1. If the condition holds, a BGSAVE is already in progress, and the master only needs to wait until it finishes before acting. If no BGSAVE is in progress, one is triggered. The state that the master keeps for this slave is then set to REDIS_REPL_WAIT_BGSAVE_END. At this point syncCommand is done, and the actual transfer has not started yet.

When the background save completes, a function called updateSlavesWaitingBgsave is invoked, so the master never has to wait synchronously.
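The decision described above can be sketched as follows. This is a simplified illustration under my own assumptions: the struct and function names are hypothetical, fake_start_bgsave is a stand-in for forking a real save process, and the extra cases that the real syncCommand handles are left out.

```c
#define SLAVE_NEW              0
#define SLAVE_WAIT_BGSAVE_END  1

typedef struct {
    int child_pid;               /* -1 when no background save is running */
    int (*start_bgsave)(void);   /* returns the child pid, or -1 on failure */
} master_ctx;

typedef struct { int replstate; } slave_ctx;

/* Handle SYNC from one slave: start a background save only if none is
 * running, then park the slave until the save ends.
 * Returns 0 on success, -1 if the save could not be started. */
int handle_sync(master_ctx *m, slave_ctx *s) {
    if (m->child_pid == -1) {
        int pid = m->start_bgsave();
        if (pid == -1) return -1;
        m->child_pid = pid;
    }
    s->replstate = SLAVE_WAIT_BGSAVE_END;
    return 0;
}

/* Demo stand-in for a real BGSAVE: counts calls and returns a fake pid. */
static int bgsave_calls = 0;
static int fake_start_bgsave(void) { bgsave_calls++; return 4242; }
```

A second slave that sends SYNC while the first save is still running does not trigger another save in this sketch; it is parked in the same waiting state.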
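As a rough sketch of what happens when the save finishes, the function below walks the list of slaves and moves every waiting one on to the transfer step. The state names and the array representation are assumptions made for illustration; opening the RDB file and installing the write handler, which the next excerpts show, are omitted.

```c
#include <stddef.h>

/* Hypothetical state names for the master's view of each slave. */
enum { ST_WAIT_BGSAVE_END, ST_SEND_BULK, ST_ONLINE };

typedef struct { int state; } slave_entry;

/* Called once when the background save finishes. Every slave that was
 * waiting for it moves on to the bulk-transfer step. Returns the number
 * of slaves that were moved. */
size_t on_bgsave_done(slave_entry *slaves, size_t n) {
    size_t moved = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (slaves[i].state == ST_WAIT_BGSAVE_END) {
            slaves[i].state = ST_SEND_BULK;
            moved++;
        }
    }
    return moved;
}
```

Because this runs as a completion callback, the master's main loop is free to serve ordinary clients while the save is still in progress.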

if ((slave->repldbfd = open(server.rdb_filename,O_RDONLY)) == -1 ||

Open the RDB file as repldbfd and prepare to send it:

aeCreateFileEvent(server.el, slave->fd, AE_WRITABLE, sendBulkToSlave, slave) == AE_ERR)

Register sendBulkToSlave as the write handler, which sends the file to the slave. Two questions are worth noting:

1) How is the size of the send buffer chosen, and does it have to be the same as the slave's? 2) Is sending a synchronous or an asynchronous process?

REDIS_IOBUF_LEN is 1024*16. This constant is the number of bytes of the RDB file read at a time. Go into sendBulkToSlave() to see:

{
    redisClient *slave = privdata;
    REDIS_NOTUSED(el);
    REDIS_NOTUSED(mask);
    char buf[REDIS_IOBUF_LEN];
    ssize_t nwritten, buflen;
    if (slave->repldboff == 0) {
        /* Write the bulk write count before to transfer the DB. In theory here
         * we don't know how much room there is in the output buffer of the
         * socket, but in practice SO_SNDLOWAT (the minimum count for output
         * operations) will never be smaller than the few bytes we need. */
        sds bulkcount;
        bulkcount = sdscatprintf(sdsempty(),"$%lld\r\n",(unsigned long long)
            slave->repldbsize);
        if (write(fd,bulkcount,sdslen(bulkcount)) != (signed)sdslen(bulkcount))
        {
            sdsfree(bulkcount);
            freeClient(slave);
            return;
        }
        sdsfree(bulkcount);
    }
    lseek(slave->repldbfd,slave->repldboff,SEEK_SET);
    buflen = read(slave->repldbfd,buf,REDIS_IOBUF_LEN);
    if (buflen <= 0) {
        redisLog(REDIS_WARNING,"Read error sending DB to slave: %s",
            (buflen == 0) ? "premature EOF" : strerror(errno));
        freeClient(slave);
        return;
    }
    if ((nwritten = write(fd,buf,buflen)) == -1) {
        redisLog(REDIS_VERBOSE,"Write error sending DB to slave: %s",
            strerror(errno));
        freeClient(slave);
        return;
    }
    slave->repldboff += nwritten;
    if (slave->repldboff == slave->repldbsize) {
        close(slave->repldbfd);
        slave->repldbfd = -1;
        aeDeleteFileEvent(server.el,slave->fd,AE_WRITABLE);
        slave->replstate = REDIS_REPL_ONLINE;
        if (aeCreateFileEvent(server.el, slave->fd, AE_WRITABLE,
            sendReplyToClient, slave) == AE_ERR) {
            freeClient(slave);
            return;
        }
        redisLog(REDIS_NOTICE,"Synchronization with slave succeeded");
    }
}

If repldboff == 0, this is the first invocation, so the master first sends the slave the length of the data that follows. If that write succeeds, execution continues.

lseek(slave->repldbfd,slave->repldboff,SEEK_SET) repositions the file on every invocation. I find this annoying because it looks like a random access on the disk, and I would expect performance to suffer when the file is large, although the offset only ever moves forward. buflen = read(slave->repldbfd,buf,REDIS_IOBUF_LEN) then reads a chunk into memory, and write sends it.

Because the writable event is not removed, epoll keeps reporting the socket as ready and the function keeps being called, so the transfer is essentially an asynchronous operation. Therefore there is no service interruption, only the cost of the lseek on each call.

After the file has been copied completely, replstate is set to REDIS_REPL_ONLINE, which is the command-propagation state. The previous write handler is deleted and a new one, sendReplyToClient, is registered. The master therefore sends the file in chunks of at most REDIS_IOBUF_LEN bytes per event, whatever buffer size the slave reads with. This is the core of master-slave replication.
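To make the offset handling concrete, here is a minimal sketch of one step of the transfer. It is not the Redis code: the payload lives in memory instead of a file, `sink` is a hypothetical callback that models write() and may accept fewer bytes than offered, and limited_sink is a demo sink that takes at most 10000 bytes per call.

```c
#include <stdio.h>
#include <string.h>

#define IOBUF_LEN (1024*16)   /* the value the text gives for REDIS_IOBUF_LEN */

/* Format the "$<size>\r\n" header that precedes the payload.
 * Returns the header length, or -1 if it does not fit in `out`. */
int format_bulk_header(char *out, size_t cap, long long size) {
    int n = snprintf(out, cap, "$%lld\r\n", size);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Models write(): returns how many bytes were accepted, or -1 on error. */
typedef long (*sink_fn)(const char *buf, long len, void *ctx);

/* One invocation of the write handler: offer up to IOBUF_LEN bytes
 * starting at *off and advance *off by what the sink accepted.
 * Returns 1 when the payload is fully sent, 0 if more calls are needed,
 * -1 on error. */
int send_step(const char *payload, long size, long *off, sink_fn sink, void *ctx) {
    long len = size - *off;
    long n;

    if (len < 0) return -1;                 /* offset ran past the end */
    if (len == 0) return 1;                 /* nothing left to send */
    if (len > IOBUF_LEN) len = IOBUF_LEN;
    n = sink(payload + *off, len, ctx);
    if (n < 0) return -1;
    *off += n;                              /* a short write resumes from here */
    return *off == size;
}

/* Demo sink: appends to a buffer but accepts at most 10000 bytes per call. */
typedef struct { char *dst; long used; } mem_sink;

long limited_sink(const char *buf, long len, void *ctx) {
    mem_sink *m = ctx;
    if (len > 10000) len = 10000;
    memcpy(m->dst + m->used, buf, (size_t)len);
    m->used += len;
    return len;
}
```

Advancing the offset by the number of bytes actually written, as the original does with repldboff += nwritten, is what makes a short write harmless: the next invocation simply starts again from the new offset.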

 
