How Redis Master-Slave Replication Works

1. Overview

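The overall process is as follows:
1. Initialization
Once master and slave are configured, the slave sends a PSYNC command to the master, whether it is connecting for the first time or reconnecting.
On a reconnect that meets the conditions for partial resynchronization (detailed in 3.1), the master sends the commands held in its in-memory backlog to the slave, completing a partial resynchronization. Otherwise a full resynchronization is performed.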
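2. Normal synchronization begins
Every write operation on the master is sent over the network to the slave as a Redis command.

2. Full resynchronization

2.1 Process

1. The slave sends PSYNC.
2. The master runs bgsave to produce an RDB snapshot file, and from that point on records new write commands in a buffer.
3. The master sends the snapshot file to the slave and keeps recording write commands.
4. The slave receives and saves the snapshot.
5. The slave loads the snapshot file into memory.
6. The slave starts receiving the commands from the master's buffer, completing the synchronization.

2.2 Example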

Environment:
- master 127.0.0.1:7779
- slave 127.0.0.1:9303, process ID 10967, holding only one key

strace -p 10967 -s 1024 -o redis.strace.full

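Then connect to the slave and run slaveof 127.0.0.1 7779. The strace file shows the following actions on the slave side during synchronization (only the important parts are excerpted):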

/* The client issues the slaveof command */
read(6, "*3\r\n$7\r\nslaveof\r\n$9\r\n127.0.0.1\r\n$4\r\n7779\r\n", 16384) = 42
/* Reply OK to the client */
write(6, "+OK\r\n", 5)
/* Connect to the master */
connect(7, {sa_family=AF_INET, sin_port=htons(7779), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
/* Check that the master is alive */
write(7, "PING\r\n", 6)
read(7, "+", 1)                         = 1
read(7, "P", 1)                         = 1
read(7, "O", 1)                         = 1
read(7, "N", 1)                         = 1
read(7, "G", 1)                         = 1
read(7, "\r", 1)                        = 1
read(7, "\n", 1)                        = 1
/* Synchronization starts: send PSYNC to the master */
write(7, "PSYNC ? -1\r\n", 12)          = 12
/* The master tells the slave to perform a full resynchronization */
read(7, "+", 1)                         = 1
read(7, "F", 1)                         = 1
read(7, "U", 1)                         = 1
read(7, "L", 1)                         = 1
read(7, "L", 1)                         = 1
read(7, "R", 1)                         = 1
read(7, "E", 1)                         = 1
read(7, "S", 1)                         = 1
read(7, "Y", 1)                         = 1
read(7, "N", 1)                         = 1
read(7, "C", 1)                         = 1
/* Open a local temporary rdb file */
open("temp-1472206877.10967.rdb", O_WRONLY|O_CREAT|O_EXCL, 0644) = 8
/* Receive the rdb file sent by the master */
read(7, "REDIS0006\376\0\0\4name\4xuan\376\1\r\16HOTEL_JUMP_NUM\33\33\0\0\0\30\0\0\0\4\0\0\320\325\2\220\6\6\365\2\320\334\230(\7\6\370\377\377\336\260\222\330\261\317\371\345", 77) = 77
/* Write the received rdb data to the temporary rdb file */
write(8, "REDIS0006\376\0\0\4name\4xuan\376\1\r\16HOTEL_JUMP_NUM\33\33\0\0\0\30\0\0\0\4\0\0\320\325\2\220\6\6\365\2\320\334\230(\7\6\370\377\377\336\260\222\330\261\317\371\345", 77) = 77
/* Rename the temporary rdb file */
rename("temp-1472206877.10967.rdb", "dump.rdb") = 0
/* Open the local rdb file */
open("dump.rdb", O_RDONLY) = 9
/* Load the data from the rdb file into the slave */
read(9, "REDIS0006\376\0\0\4name\4xuan\376\1\r\16HOTEL_JUMP_NUM\33\33\0\0\0\30\0\0\0\4\0\0\320\325\2\220\6\6\365\2\320\334\230(\7\6\370\377\377\336\260\222\330\261\317\371\345", 4096) = 77
/* The sync completed successfully; write the log entry */
open("/tmp/redis.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 8
fstat(8, {st_mode=S_IFREG|0644, st_size=7627, ...}) = 0
write(8, "[10967] 26 Aug 18:21:17.450 * MASTER <-> SLAVE sync: Finished with success\n", 75) = 75

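The whole process matches the description in 2.1. Because we performed no writes on the master during the synchronization, the strace output does not show step 6 of 2.1.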
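To make step 6 of 2.1 concrete, here is a minimal Python sketch of the full resynchronization flow. It is a toy model, not Redis code: the `Master` and `Slave` classes, their method names, and the dict standing in for the RDB file are all assumptions made for illustration (a real master forks a child process for bgsave).

```python
class Master:
    """Toy master: a dict of keys plus a buffer for writes made during a snapshot."""

    def __init__(self):
        self.data = {}
        self.snapshotting = False
        self.buffer = []

    def write(self, key, value):
        self.data[key] = value
        if self.snapshotting:
            # Step 2/3: writes arriving after the snapshot started are recorded.
            self.buffer.append((key, value))

    def begin_snapshot(self):
        self.snapshotting = True
        self.buffer = []
        return dict(self.data)  # a copy stands in for the RDB snapshot file

    def end_snapshot(self):
        self.snapshotting = False
        pending, self.buffer = self.buffer, []
        return pending


class Slave:
    """Toy slave: replaces its data with the snapshot, then replays commands."""

    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        self.data = dict(snapshot)  # steps 4/5: old data is discarded

    def apply(self, commands):
        for key, value in commands:  # step 6: replay buffered writes
            self.data[key] = value


def full_resync(master, slave, writes_during_transfer=()):
    snapshot = master.begin_snapshot()
    for key, value in writes_during_transfer:
        master.write(key, value)  # writes that happen while the snapshot is in flight
    slave.load_snapshot(snapshot)
    slave.apply(master.end_snapshot())
```

When `writes_during_transfer` is empty, as in the trace above, the buffer is empty and step 6 has nothing to replay.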

The slave's redis.log also reflects the process above.

[10967] 26 Aug 18:21:17.250 * SLAVE OF 127.0.0.1:7779 enabled (user request)
[10967] 26 Aug 18:21:17.410 * Connecting to MASTER 127.0.0.1:7779
[10967] 26 Aug 18:21:17.413 * MASTER <-> SLAVE sync started
[10967] 26 Aug 18:21:17.415 * Non blocking connect for SYNC fired the event.
[10967] 26 Aug 18:21:17.418 * Master replied to PING, replication can continue...
[10967] 26 Aug 18:21:17.421 * Partial resynchronization not possible (no cached master)
[10967] 26 Aug 18:21:17.432 * Full resync from master: 1d13fbd06f644eeb4b50d65f11e65bffd9e596f6:43774
[10967] 26 Aug 18:21:17.444 * MASTER <-> SLAVE sync: receiving 77 bytes from master
[10967] 26 Aug 18:21:17.446 * MASTER <-> SLAVE sync: Flushing old data
[10967] 26 Aug 18:21:17.447 * MASTER <-> SLAVE sync: Loading DB in memory
[10967] 26 Aug 18:21:17.450 * MASTER <-> SLAVE sync: Finished with success
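The first read in the strace excerpt of this section shows how the slaveof command looks on the wire: an array header, then each argument as a length-prefixed string. The following Python sketch reproduces that framing; the function name `encode_command` is an assumption chosen for illustration and is not part of any Redis client library.

```python
def encode_command(*args):
    """Encode a command as an array of bulk strings, as seen in the strace read."""
    parts = [b"*%d\r\n" % len(args)]  # array header: number of arguments
    for arg in args:
        data = arg.encode() if isinstance(arg, str) else arg
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length, then payload
    return b"".join(parts)


wire = encode_command("slaveof", "127.0.0.1", "7779")
print(wire)
print(len(wire))  # 42, the byte count returned by read(6, ...) in the trace
```

Note that the slave's own PING and PSYNC messages in the trace are sent in the simpler inline form (plain text terminated by \r\n), not in this array form.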

3. Partial resynchronization

3.1 Conditions for partial resynchronization

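Some key concepts:
- In-memory backlog: a fixed-size buffer on the master that holds the most recent write commands, so that writes received while a slave is disconnected can be replayed to it
- Replication offset: the master and the slave each keep an offset that records their current position in the replication stream
- Master run ID: the unique identifier of a master. The 1d13fbd06f644eeb4b50d65f11e65bffd9e596f6 in the redis.log of 2.2 is a master run ID.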
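The relationship between the backlog and the replication offset can be sketched in Python as follows. This is an illustrative model only: the `Backlog` class, its method names, and the zero-based byte counting are assumptions, and the offset arithmetic in the real server differs in detail.

```python
class Backlog:
    """Fixed-size buffer holding the most recent bytes of the replication stream."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0  # total bytes ever written to the stream

    def feed(self, data):
        """Append new stream bytes, discarding the oldest ones when full."""
        self.buf += data
        self.master_offset += len(data)
        overflow = len(self.buf) - self.size
        if overflow > 0:
            del self.buf[:overflow]

    @property
    def start_offset(self):
        """Offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buf)

    def covers(self, slave_offset):
        """True if everything after slave_offset is still available."""
        return self.start_offset <= slave_offset <= self.master_offset

    def read_from(self, slave_offset):
        """Return the bytes the slave is missing."""
        if not self.covers(slave_offset):
            raise ValueError("offset no longer in backlog")
        return bytes(self.buf[slave_offset - self.start_offset:])
```

A slave that has processed 5 bytes of an 8-byte stream gets the last 3 bytes back from `read_from(5)`. Once enough new writes have pushed byte 5 out of the buffer, `covers(5)` becomes false.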

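After the network connection is lost, the slave tries to reconnect to the master. A partial resynchronization takes place after the reconnect when all of the following conditions hold:
1. The master run ID recorded by the slave is the same as the run ID of the master it is now connecting to.
2. The slave's replication offset is not ahead of the master's. For example, the slave is at 1000 and the master is at 1100.
3. The data at the slave's replication offset is still held in the master's in-memory backlog.

3.2 Synchronization process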

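Once a partial resynchronization is confirmed, the master sends the commands in its in-memory backlog, starting from the slave's offset, to the slave over the network, completing the partial resynchronization.

3.3 Example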

Environment:
- master 10.136.30.144:7779
- slave 10.136.31.213:9303, holding one key "h"

First we attach strace to the slave process. Then, to simulate a network outage, we add an iptables rule on the master machine that drops every packet sent to the slave.

/sbin/iptables -A OUTPUT -d 10.136.31.213 -j DROP

Then we delete key h on the master:

del h

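Finally, we flush the iptables rules to simulate the network recovering. Note that -F removes every rule in the chain, not only the one added above.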

/sbin/iptables -F

Let's look at the slave's log first.

[25667] 26 Aug 15:29:33.241 # Connection with master lost.
[25667] 26 Aug 15:29:33.241 * Caching the disconnected master state.
[25667] 26 Aug 15:29:33.241 * Connecting to MASTER 10.136.30.144:7779
[25667] 26 Aug 15:29:33.241 * MASTER <-> SLAVE sync started
[25667] 26 Aug 15:29:54.240 # Error condition on socket for SYNC: Connection timed out
[25667] 26 Aug 15:29:54.262 * Connecting to MASTER 10.136.30.144:7779
[25667] 26 Aug 15:29:54.263 * MASTER <-> SLAVE sync started
[25667] 26 Aug 15:30:15.270 # Error condition on socket for SYNC: Connection timed out
[25667] 26 Aug 15:30:15.726 * Connecting to MASTER 10.136.30.144:7779
[25667] 26 Aug 15:30:15.726 * MASTER <-> SLAVE sync started
[25667] 26 Aug 15:30:36.728 # Error condition on socket for SYNC: Connection timed out
[25667] 26 Aug 15:30:37.272 * Connecting to MASTER 10.136.30.144:7779
[25667] 26 Aug 15:30:37.279 * MASTER <-> SLAVE sync started
[25667] 26 Aug 15:30:37.282 * Non blocking connect for SYNC fired the event.
[25667] 26 Aug 15:30:37.289 * Master replied to PING, replication can continue...
[25667] 26 Aug 15:30:37.293 * Trying a partial resynchronization (request 1d13fbd06f644eeb4b50d65f11e65bffd9e596f6:29265).
[25667] 26 Aug 15:30:37.300 * Successful partial resynchronization with master.
[25667] 26 Aug 15:30:37.302 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.

After detecting that it has lost the master, the slave keeps trying to reconnect. Once the connection succeeds, it attempts a partial resynchronization and completes it.

The strace output reflects the same process. A summary follows:

/* Reconnect to the master */
connect(6, {sa_family=AF_INET, sin_port=htons(7779), sin_addr=inet_addr("10.136.30.144")}, 16) = -1 EINPROGRESS (Operation now in progress)
/* Check that the master is alive */
write(6, "PING\r\n", 6)
read(6, "+", 1)                         = 1
read(6, "P", 1)                         = 1
read(6, "O", 1)                         = 1
read(6, "N", 1)                         = 1
read(6, "G", 1)                         = 1
read(6, "\r", 1)                        = 1
read(6, "\n", 1)                        = 1
/* The slave requests a partial resynchronization and the master accepts */
write(6, "PSYNC 1d13fbd06f644eeb4b50d65f11"..., 54) = 54
read(6, "+", 1)                         = 1
read(6, "C", 1)                         = 1
read(6, "O", 1)                         = 1
read(6, "N", 1)                         = 1
read(6, "T", 1)                         = 1
read(6, "I", 1)                         = 1
read(6, "N", 1)                         = 1
read(6, "U", 1)                         = 1
read(6, "E", 1)                         = 1
read(6, "\r", 1)                        = 1
read(6, "\n", 1)                        = 1
/* Read the incremental commands issued during the outage: del h */
read(6, "*1\r\n$4\r\nPING\r\n*2\r\n$3\r\ndel\r\n$1\r\nh"..., 16384) = 188
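The two traces differ in what the slave sends (PSYNC ? -1 versus PSYNC with a run ID and offset) and in what the master answers (+FULLRESYNC versus +CONTINUE). The Python sketch below models that exchange under the three conditions of 3.1. The function names and the `backlog_start` parameter are assumptions for illustration, the offset comparison is simplified relative to the real server, and the numbers in the usage note are hypothetical except for the run ID and offset taken from the log.

```python
def build_psync(run_id=None, offset=-1):
    """Build the inline PSYNC request a slave sends; '?' and -1 mean 'no cached master'."""
    if run_id is None:
        return "PSYNC ? -1\r\n"
    return "PSYNC %s %d\r\n" % (run_id, offset)


def master_reply(master_run_id, master_offset, backlog_start, req_run_id, req_offset):
    """Decide between partial and full resynchronization (simplified model).

    backlog_start is the oldest offset still held in the master's backlog.
    """
    same_master = req_run_id == master_run_id            # condition 1
    not_ahead = req_offset <= master_offset               # condition 2
    still_buffered = req_offset >= backlog_start          # condition 3
    if same_master and not_ahead and still_buffered:
        return "+CONTINUE"
    return "+FULLRESYNC %s %d" % (master_run_id, master_offset)
```

For instance, `build_psync("1d13fbd06f644eeb4b50d65f11e65bffd9e596f6", 29265)` yields a 54-byte request, which matches the length of the write seen in the trace, and `build_psync()` yields the 12-byte request from 2.2. With a hypothetical master offset of 29452 and a backlog starting at 29000, `master_reply` returns +CONTINUE for that request; if the backlog started at 29300 instead, it would fall back to +FULLRESYNC.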

4. Notes

This article describes the replication process in Redis 2.8 and later. Versions before 2.8 differ slightly.

Reference

http://redis.io/topics/replication