Replication is how a master server synchronizes its data to slave servers. Redis's implementation is a useful reference, especially when you need master-slave synchronization and the data set is small enough to keep in memory. The clever part is the pairing of a replication offset with a buffer: it lets the master quickly work out which data still has to be sent, and the same procedure works no matter how many slaves there are.
The book describes replication by comparing how the old and new versions handle the copy process and, in particular, a dropped connection.
Replication in the old version works in two ways:
1. The slave actively sends SYNC to the master. The master runs BGSAVE to generate an RDB file and sends it to the slave, which loads it. The master then sends the write commands it received in the meantime to the slave, and the slave executes them too. Synchronization is then complete.
SYNC is very resource intensive: CPU, memory, and bandwidth are all consumed at different stages.
2. Once the slave's database matches the master's, every write command the master executes is actively sent on to the slave. This is called command propagation.
The flaw in the old version concerns the second way: if the connection between master and slave drops, the master is not aware of it and keeps propagating commands. The slave cannot receive the write commands sent during that time, so after reconnecting its data is inconsistent with the master's.
Of course, the slave can simply run SYNC again after reconnecting, but that costs too many resources.
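The old behaviour described above can be illustrated with a small simulation. This is only a sketch under assumptions: the `Master` and `Replica` classes, the `connected` flag, and the dictionary standing in for the RDB snapshot are hypothetical names invented for illustration, not Redis code.

```python
class Replica:
    """Hypothetical slave: holds a copy of the data and a connection flag."""

    def __init__(self):
        self.data = {}
        self.connected = True

    def load_snapshot(self, snapshot):
        # Stands in for loading the RDB file received from the master.
        self.data = dict(snapshot)

    def apply(self, key, value):
        # Stands in for executing one propagated write command.
        self.data[key] = value


class Master:
    """Hypothetical master using only full SYNC plus command propagation."""

    def __init__(self):
        self.data = {}
        self.replicas = []

    def sync(self, replica):
        # Full resynchronization: ship the whole data set (BGSAVE + transfer).
        replica.load_snapshot(self.data)
        replica.connected = True
        if replica not in self.replicas:
            self.replicas.append(replica)

    def write(self, key, value):
        self.data[key] = value
        # Command propagation: a command sent while the link is down is lost.
        for replica in self.replicas:
            if replica.connected:
                replica.apply(key, value)
```

In this sketch, a write made while `replica.connected` is `False` never reaches the replica, so the two data sets diverge until `master.sync(replica)` copies everything again, which is exactly the expensive step the new version tries to avoid.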
New version:
With the PSYNC command, once the connection is restored the slave sends PSYNC to the master, and the master sends it only the write commands executed while the connection was down.
The master's +CONTINUE reply indicates that partial resynchronization is about to begin.
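As a rough sketch of the slave's side of this exchange, the request can be built and the reply interpreted as below. The helper names `build_psync` and `parse_reply` are hypothetical. The `+FULLRESYNC <run id> <offset>` reply and the `PSYNC ? -1` form for a first-time sync are not mentioned in these notes; they are taken from the Redis replication protocol and shown here as assumptions, so check them against the Redis version you use.

```python
def build_psync(master_run_id=None, offset=-1):
    """Build the PSYNC request line a slave would send.

    With no remembered master (first sync), ask for a full resync.
    """
    if master_run_id is None:
        return "PSYNC ? -1"
    return f"PSYNC {master_run_id} {offset}"


def parse_reply(reply):
    """Classify the master's reply as (mode, run_id, offset)."""
    parts = reply.strip().split()
    if parts and parts[0] == "+CONTINUE":
        # Partial resynchronization: missing commands will follow.
        return ("partial", None, None)
    if len(parts) == 3 and parts[0] == "+FULLRESYNC":
        # Full resynchronization: remember the master's run ID and offset.
        return ("full", parts[1], int(parts[2]))
    # Anything else, e.g. an error from a master that lacks PSYNC.
    return ("unsupported", None, None)
```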
Implementation details of partial resynchronization:
1. First, the master and the slave each record how many bytes have been synchronized, the so-called replication offset. The master records how many bytes it has sent to the slave, and the slave records how many bytes it has received from the master (in Figure 15-7 of the book, 10086 bytes have been synchronized).
After the connection is restored, the slave sends PSYNC to the master along with its local replication offset. The master compares it with its own offset; if they differ, the master and slave data are inconsistent and data must be synchronized.
2. When the master finds that its offset differs from the slave's, it has to work out which data to send.
The master keeps a fixed-length (1 MB by default) FIFO replication backlog buffer. Every byte propagated to the slaves is also stored in this buffer, and each byte has a corresponding offset.
When the slave sends PSYNC with its own offset, the master looks in the backlog: if the data from offset+1 onward is still in the buffer and has not been squeezed out (the fixed-length FIFO holds only the most recent commands; once it is full, the oldest are pushed out), partial resynchronization can be performed. Otherwise a full resynchronization is performed, as with SYNC.
3. The server run ID. The slave stores the master's run ID, which is globally unique. When the slave reconnects, it sends the stored run ID along with PSYNC. If that ID differs from the master's own, a full resynchronization is performed; if it matches, partial resynchronization can be attempted.
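The three pieces above (replication offset, backlog buffer, run ID) can be combined into one decision on the master's side. The following is a minimal sketch, not Redis's implementation: `ReplicationBacklog`, `handle_psync`, and their parameters are hypothetical names, a `deque` of bytes stands in for the real circular buffer, and the `+FULLRESYNC` reply name comes from the Redis protocol rather than from these notes.

```python
from collections import deque


class ReplicationBacklog:
    """Fixed-length FIFO of propagated bytes, each with an implicit offset."""

    def __init__(self, size=1024 * 1024):
        self.buf = deque(maxlen=size)  # oldest bytes drop out when full
        self.master_offset = 0         # offset of the newest byte written

    def feed(self, data: bytes):
        # Called for every chunk propagated to the slaves.
        self.buf.extend(data)
        self.master_offset += len(data)

    def first_offset(self):
        # Offset of the oldest byte still held in the buffer.
        return self.master_offset - len(self.buf) + 1

    def read_from(self, replica_offset):
        """Return the bytes after replica_offset, or None if unavailable."""
        if replica_offset == self.master_offset:
            return b""  # slave is already up to date
        start = replica_offset + 1
        if start < self.first_offset() or start > self.master_offset:
            return None  # squeezed out of the FIFO, or an offset from the future
        skip = start - self.first_offset()
        return bytes(list(self.buf)[skip:])


def handle_psync(master_run_id, backlog, replica_run_id, replica_offset):
    """Decide between partial and full resynchronization."""
    if replica_run_id != master_run_id:
        return ("+FULLRESYNC", None)  # slave was following a different master
    missing = backlog.read_from(replica_offset)
    if missing is None:
        return ("+FULLRESYNC", None)  # needed data is no longer in the backlog
    return ("+CONTINUE", missing)     # send only the missing bytes
```

For example, with a 10-byte backlog that has received 16 bytes, a slave at offset 6 can still be served from the buffer, while a slave at offset 5 needs a byte that has already been pushed out and falls back to a full resynchronization. The buffer size is therefore a trade-off between memory and how long a slave may stay disconnected.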
[Redis reading notes] Part 3, multi-machine database implementation: replication