Redis Persistence in Practice and a Disaster Recovery Simulation (Part 2)


II. Disaster Recovery Simulation

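Since the whole point of persisted data is to restore state after a restart, it is well worth running a disaster recovery drill. It is often said that if you need persistence and also want stability, you should leave about half of the physical memory free. During a snapshot, the child process forked to perform the dump shares the parent's memory through copy-on-write, and in the worst case, when the parent keeps modifying pages during the dump, the child can end up holding a copy as large as the parent's memory. Both the performance impact and the memory cost can be significant.
A common design today is to use replication to make up for the performance cost of AOF and snapshots while still keeping the data persistent.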
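The "leave half of physical memory free" rule of thumb can be turned into a quick check. The following is a minimal sketch, not part of the original setup: the function name has_fork_headroom is hypothetical, and it assumes you pass in the used_memory_rss value from INFO and the machine's total memory, both in bytes.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption, not a Redis tool).
# Succeeds when the resident set size is at most half of total memory,
# i.e. a worst-case copy-on-write duplicate of the data set would still fit.
has_fork_headroom() {
  local rss_bytes=$1 total_bytes=$2
  [ $((rss_bytes * 2)) -le "$total_bytes" ]
}
```

For example, has_fork_headroom 34717696 8589934592 succeeds for a 33 MB resident set on an 8 GB machine. On Linux the total could be read from MemTotal in /proc/meminfo, which is reported in kB and would need to be multiplied by 1024.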

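That is, the Master runs with neither snapshots nor AOF, which protects its read/write performance, while the Slave enables both snapshots and AOF to persist the data and keep it safe.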

First, change the following settings on the Master:
$ sudo vim /opt/redis/etc/redis.conf

# Disable snapshots
#save 900 1
#save 300 10
#save 60 10000
# Disable AOF
appendonly no
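Before restarting, the edited file can be checked for leftover persistence settings. This is a rough sketch under stated assumptions: persistence_disabled is a hypothetical helper that only inspects the text of the config file, so the CONFIG GET save check on the running server remains the real confirmation, and whether commented-out save lines are sufficient may depend on the Redis version.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the given redis.conf has no active
# "save <seconds> <changes>" line and no active "appendonly yes" line.
# Lines starting with "#" do not match either pattern.
persistence_disabled() {
  local conf=$1
  if grep -Eq '^[[:space:]]*save[[:space:]]+[0-9]' "$conf"; then
    return 1
  fi
  if grep -Eq '^[[:space:]]*appendonly[[:space:]]+yes' "$conf"; then
    return 1
  fi
  return 0
}
```

Usage would look like persistence_disabled /opt/redis/etc/redis.conf && echo "no snapshot or AOF settings active".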

Next, change the following settings on the Slave:
$ sudo vim /opt/redis/etc/redis.conf

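# Enable snapshots
save 900 1
save 300 10
save 60 10000
# Enable AOF
appendonly yes
# Name of the AOF file
appendfilename appendonly.aof
# appendfsync always
# Force a write to disk once per second
appendfsync everysec
# appendfsync no
# Do not fsync the AOF while a log rewrite is in progress
no-appendfsync-on-rewrite yes
# Growth percentage that automatically starts a new log rewrite
auto-aof-rewrite-percentage 100
# Minimum AOF size before a new log rewrite is started
auto-aof-rewrite-min-size 64mb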
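The auto-aof-rewrite-percentage setting is compared against how much the AOF has grown since the last rewrite. As a small illustration, the sketch below computes that growth from the aof_base_size and aof_current_size fields that INFO reports on the Slave. The function name aof_growth_percent is hypothetical, and returning 0 for a zero base is a simplification made here, not necessarily what Redis does internally.

```shell
#!/bin/bash
# Hypothetical helper: integer percentage by which the AOF has grown
# since the last rewrite. Arguments are aof_base_size and aof_current_size
# in bytes. A base of 0 is reported as 0 to avoid dividing by zero.
aof_growth_percent() {
  local base=$1 current=$2
  if [ "$base" -le 0 ]; then
    echo 0
    return
  fi
  echo $(( (current - base) * 100 / base ))
}
```

With the Slave's values from this article, aof_growth_percent 16787337 17908619 prints 6, well below the configured threshold of 100.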

Start the Master and the Slave:
$ /etc/init.d/redis start

Once both are up, confirm on the Master that no snapshot rules are active:
redis 127.0.0.1:6379> CONFIG GET save
1) "save"
2) ""

Then use the following script to generate 250,000 records on the Master:
dongguo@redis:/opt/redis/data/6379$ cat redis-cli-generate.temp.sh

#!/bin/bash
# Writes 5 keys per instance ID for 50,000 IDs (250,000 keys in total) into DB 1.
REDISCLI="redis-cli -a slavepass -n 1 SET"
ID=1
while (( ID < 50001 ))
do
  INSTANCE_NAME="i-2-$ID-VM"
  UUID=$(cat /proc/sys/kernel/random/uuid)
  # Random private address of the form 10.x.y.z, each octet in 1..255
  PRIVATE_IP_ADDRESS="10.$((RANDOM % 255 + 1)).$((RANDOM % 255 + 1)).$((RANDOM % 255 + 1))"
  CREATED=$(date "+%Y-%m-%d %H:%M:%S")
  $REDISCLI "vm_instance:$ID:instance_name" "$INSTANCE_NAME"
  $REDISCLI "vm_instance:$ID:uuid" "$UUID"
  $REDISCLI "vm_instance:$ID:private_ip_address" "$PRIVATE_IP_ADDRESS"
  $REDISCLI "vm_instance:$ID:created" "$CREATED"
  $REDISCLI "vm_instance:$INSTANCE_NAME:id" "$ID"
  ID=$((ID + 1))
done

dongguo@redis:/opt/redis/data/6379$ ./redis-cli-generate.temp.sh
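To sanity-check the key layout before touching a real server, the same loop can be run as a dry run. This is only a sketch: gen_set_commands is a hypothetical function, and the UUID, IP and timestamp values are placeholders rather than the real generated data.

```shell
#!/bin/bash
# Dry-run sketch: print the SET commands for the first N instance IDs
# instead of sending them to Redis. Values are placeholders, not real data.
gen_set_commands() {
  local count=$1 id name
  for ((id = 1; id <= count; id++)); do
    name="i-2-$id-VM"
    echo "SET vm_instance:$id:instance_name $name"
    echo "SET vm_instance:$id:uuid UUID_PLACEHOLDER"
    echo "SET vm_instance:$id:private_ip_address IP_PLACEHOLDER"
    echo "SET vm_instance:$id:created CREATED_PLACEHOLDER"
    echo "SET vm_instance:$name:id $id"
  done
}
```

Each ID produces five commands, so 50,000 IDs account for the 250,000 keys the script is expected to create.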

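While the data is being generated, you can clearly see that the Master created dump.rdb only once, for the Slave's initial synchronization. After that it passed the commands on to the Slave incrementally.
The dump.rdb file did not grow any further.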
dongguo@redis:/opt/redis/data/6379$ ls -lh
total 4.0K
-rw-r--r-- 1 root root 10 Sep 27 00:40 dump.rdb

On the Slave, by contrast, both dump.rdb and the AOF file keep growing, and the AOF file grows noticeably faster than dump.rdb.
dongguo@redis-slave:/opt/redis/data/6379$ ls -lh
total 24M
-rw-r--r-- 1 root root 15M Sep 27 12:06 appendonly.aof
-rw-r--r-- 1 root root 9.2M Sep 27 12:06 dump.rdb

Once the inserts have finished, first confirm the current data volume.
redis 127.0.0.1:6379> info

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:27623
run_id:e00757f7b2d6885fa9811540df9dfed39430b642
uptime_in_seconds:1541
uptime_in_days:0
lru_clock:650187
used_cpu_sys:69.28
used_cpu_user:7.67
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33055824
used_memory_human:31.52M
used_memory_rss:34717696
used_memory_peak:33055800
used_memory_peak_human:31.52M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:250000
bgsave_in_progress:0
last_save_time:1348677645
bgrewriteaof_in_progress:0
total_connections_received:250007
total_commands_processed:750019
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:246
vm_enabled:0
role:master
slave0:10.6.1.144,6379,online
db1:keys=250000,expires=0

The data set currently holds 250,000 keys and uses 31.52M of memory.
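Rather than scanning the whole INFO dump by eye, a single field can be pulled out with a small filter. The sketch below is an illustration only: info_field is a hypothetical helper that reads INFO text on standard input, and it assumes the plain "name:value" line format shown above.

```shell
#!/bin/bash
# Hypothetical helper: print the value of one INFO field read from stdin.
# Carriage returns are stripped first; only the first matching line is
# printed, and everything after the first colon counts as the value.
info_field() {
  local field=$1
  tr -d '\r' | awk -F: -v f="$field" '
    !found && $1 == f { print substr($0, length(f) + 2); found = 1 }'
}
```

A typical call would be redis-cli info | info_field used_memory_human, which for the output above would print 31.52M.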

Next, we kill the Redis process on the Master outright to simulate a disaster.
dongguo@redis:/opt/redis/data/6379$ sudo killall -9 redis-server

Check the status on the Slave:
redis 127.0.0.1:6379> info

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:1627
uptime_in_days:0
lru_clock:654181
used_cpu_sys:29.69
used_cpu_user:1.21
used_cpu_sys_children:1.70
used_cpu_user_children:1.23
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047696
used_memory_human:31.52M
used_memory_rss:34775040
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:3308
bgsave_in_progress:0
last_save_time:1348718951
bgrewriteaof_in_progress:0
total_connections_received:4
total_commands_processed:250308
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:694
vm_enabled:0
role:slave
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
master_host:10.6.1.143
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
master_link_down_since_seconds:25
slave_priority:100
db1:keys=250000,expires=0

master_link_status is now down: the Master is no longer reachable.
The Slave, meanwhile, is still running fine and still holds its AOF and RDB files.
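The replication fields that matter here can be condensed into one line, which is handy for a monitoring check. This is a sketch under assumptions: replication_summary is a hypothetical helper, it reads INFO text on standard input, and it prints "n/a" or 0 when a field is absent, which is a choice made for this example.

```shell
#!/bin/bash
# Hypothetical helper: summarise role, link status and link downtime
# from INFO text on stdin. Missing link status is shown as "n/a" and
# missing downtime as 0 (both are conventions of this sketch).
replication_summary() {
  tr -d '\r' | awk -F: '
    $1 == "role"                           { role = $2 }
    $1 == "master_link_status"             { link = $2 }
    $1 == "master_link_down_since_seconds" { down = $2 }
    END {
      if (link == "") link = "n/a"
      if (down == "") down = "0"
      print "role=" role " link=" link " down_for=" down "s"
    }'
}
```

Fed the Slave's INFO output above, it would print role=slave link=down down_for=25s.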

Next we use the AOF and RDB files saved on the Slave to restore the data on the Master.

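First, turn off replication on the Slave. Otherwise, if the Master were restarted before its data had been restored, the Slave would resynchronize from the empty Master and overwrite its own data, and everything would be lost.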
redis 127.0.0.1:6379> SLAVEOF NO ONE
OK

Confirm that the master-related fields are gone:
redis 127.0.0.1:6379> INFO

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:1961
uptime_in_days:0
lru_clock:654215
used_cpu_sys:29.98
used_cpu_user:1.22
used_cpu_sys_children:1.76
used_cpu_user_children:1.42
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047696
used_memory_human:31.52M
used_memory_rss:34779136
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348719252
bgrewriteaof_in_progress:0
total_connections_received:4
total_commands_processed:250311
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:1119
vm_enabled:0
role:master
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
db1:keys=250000,expires=0
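The "no master-related fields left" check can also be scripted, so that the recovery does not proceed while the Slave is still attached. A minimal sketch, assuming INFO text on standard input; is_standalone_master is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when the INFO text on stdin reports
# role:master and contains no master_host field, i.e. the instance is
# no longer configured as anyone's slave.
is_standalone_master() {
  local info
  info=$(tr -d '\r')
  grep -qx 'role:master' <<<"$info" || return 1
  if grep -q '^master_host:' <<<"$info"; then
    return 1
  fi
  return 0
}
```

It could be used as a guard, for example redis-cli info | is_standalone_master || { echo "still a slave, aborting"; exit 1; }.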

Archive the data files on the Slave:
dongguo@redis-slave:/opt/redis/data/6379$ tar cvf /home/dongguo/data.tar *
appendonly.aof
dump.rdb

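Upload data.tar to the Master and try to restore the data.
The Master's data directory contains a tiny data file that was created when the Slave was first initialized; delete it.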
dongguo@redis:/opt/redis/data/6379$ ls -l
total 4
-rw-r--r-- 1 root root 10 Sep 27 00:40 dump.rdb
dongguo@redis:/opt/redis/data/6379$ sudo rm -f dump.rdb

Then unpack the data files:
dongguo@redis:/opt/redis/data/6379$ sudo tar xf /home/dongguo/data.tar
dongguo@redis:/opt/redis/data/6379$ ls -lh
total 29M
-rw-r--r-- 1 root root 18M Sep 27 01:22 appendonly.aof
-rw-r--r-- 1 root root 12M Sep 27 01:22 dump.rdb
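Before starting Redis on copied files, it can be worth confirming that they arrived intact. The article itself does not do this; the following is an optional sketch that assumes GNU sha256sum is available, with data_manifest and same_data_files as hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical helper: print "checksum  filename" lines for the two
# Redis data files in the given directory. Fails if either is missing.
data_manifest() {
  local dir=$1
  (cd "$dir" && sha256sum appendonly.aof dump.rdb)
}

# Hypothetical helper: succeeds only when both directories contain
# byte-identical appendonly.aof and dump.rdb files.
same_data_files() {
  local a b
  a=$(data_manifest "$1") || return 1
  b=$(data_manifest "$2") || return 1
  [ "$a" = "$b" ]
}
```

Since the Slave and the Master are separate hosts, in practice the output of data_manifest would be saved on the Slave, shipped along with data.tar, and compared with the manifest produced on the Master after unpacking.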

Start Redis on the Master:
dongguo@redis:/opt/redis/data/6379$ sudo /etc/init.d/redis start
Starting Redis server...

Check whether the data has been restored:
redis 127.0.0.1:6379> INFO

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:16959
run_id:6e5ba6c053583414e75353b283597ea404494926
uptime_in_seconds:22
uptime_in_days:0
lru_clock:650292
used_cpu_sys:0.18
used_cpu_user:0.20
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047216
used_memory_human:31.52M
used_memory_rss:34623488
used_memory_peak:33047192
used_memory_peak_human:31.52M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348680180
bgrewriteaof_in_progress:0
total_connections_received:1
total_commands_processed:1
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
vm_enabled:0
role:master
db1:keys=250000,expires=0

All 250,000 records have been fully restored on the Master.
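The "fully restored" conclusion rests on comparing the key count before and after. As a sketch, that comparison can be automated against two saved INFO captures; db_keys and same_key_count are hypothetical helpers, and the capture files are assumed to have been written with something like redis-cli info > file.

```shell
#!/bin/bash
# Hypothetical helper: print the key count for one database (e.g. db1)
# from INFO text on stdin, based on the "dbN:keys=<n>,expires=<m>" line.
db_keys() {
  local db=$1
  tr -d '\r' | sed -n "s/^${db}:keys=\([0-9][0-9]*\),.*/\1/p"
}

# Hypothetical helper: succeeds when both INFO capture files report the
# same, non-empty key count for the given database.
same_key_count() {
  local before after
  before=$(db_keys "$3" < "$1")
  after=$(db_keys "$3" < "$2")
  [ -n "$before" ] && [ "$before" = "$after" ]
}
```

For example, same_key_count slave.info master.info db1 would succeed here, since both sides report keys=250000.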

It is now safe to re-enable replication on the Slave.
redis 127.0.0.1:6379> SLAVEOF 10.6.1.143 6379
OK

Check the replication status:
redis 127.0.0.1:6379> INFO

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:2652
uptime_in_days:0
lru_clock:654284
used_cpu_sys:30.01
used_cpu_user:2.12
used_cpu_sys_children:1.76
used_cpu_user_children:1.42
connected_clients:2
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33056288
used_memory_human:31.52M
used_memory_rss:34766848
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348719252
bgrewriteaof_in_progress:1
total_connections_received:6
total_commands_processed:250313
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:12217
vm_enabled:0
role:slave
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
master_host:10.6.1.143
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_priority:100
db1:keys=250000,expires=0

master_link_status shows up, so replication is healthy.

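During this recovery we copied both the AOF and the RDB file, so which one actually restored the data?
When a Redis server goes down and is restarted, it loads data into memory according to the following priority:
1. If only AOF is configured, the AOF file is loaded at startup;
2. If both RDB and AOF are configured, only the AOF file is loaded at startup;
3. If only RDB is configured, the dump file is loaded at startup.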

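In other words, AOF takes priority over RDB, which makes sense, because AOF gives stronger guarantees of data completeness than RDB. Note that the Master in this walkthrough runs with appendonly no (its INFO output shows aof_enabled:0), so rule 3 applies there and the restored Master loaded dump.rdb.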
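The priority rules above boil down to one question: is appendonly set to yes? The sketch below answers it from a config file. It is an illustration, not how Redis decides internally: startup_data_source is a hypothetical helper, and it assumes that when a directive appears more than once the last active line applies.

```shell
#!/bin/bash
# Hypothetical helper: print "aof" if the last active appendonly line in
# the given redis.conf says yes, otherwise print "rdb". Commented-out
# lines are ignored because their first field starts with "#".
startup_data_source() {
  local conf=$1 value
  value=$(awk '$1 == "appendonly" { v = $2 } END { print v }' "$conf")
  if [ "$value" = "yes" ]; then
    echo "aof"
  else
    echo "rdb"
  fi
}
```

Run against the two configurations in this article, it would print rdb for the Master and aof for the Slave.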

In this case study, enabling both AOF and RDB on the Slave kept the data safe and let us restore the Master.

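In our current production environment, however, every key carries an expiry time, which makes AOF impractical: the very frequent writes make the AOF file grow abnormally large, far beyond the actual data volume, and that in turn makes recovery take a long time.
So one option is to enable only snapshots on the Slave for local persistence, and either raise the save frequency or run a scheduled job that triggers bgsave regularly, to keep the local data as complete as possible.
With this architecture, if only the Master goes down and the Slave is intact, the data can be recovered in full.
If the Master and the Slave go down at the same time, the data can still be recovered to an acceptable degree.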
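The scheduled bgsave idea could look roughly like the sketch below. It is only an outline under assumptions: snapshot_is_stale is a hypothetical helper, the 300-second limit is an arbitrary example value, and the commented wiring assumes redis-cli prints LASTSAVE as a bare integer when its output is captured.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the last snapshot is older than
# max_age seconds. All three arguments are Unix timestamps or seconds.
snapshot_is_stale() {
  local last_save=$1 now=$2 max_age=$3
  [ $((now - last_save)) -gt "$max_age" ]
}

# Example wiring for a cron job (needs a running Redis, so it is left
# commented out here):
#   last=$(redis-cli LASTSAVE)
#   if snapshot_is_stale "$last" "$(date +%s)" 300; then
#     redis-cli BGSAVE
#   fi
```

Checking the age first avoids forking a new dump when a save rule has just produced one, which matters given the memory cost of each fork.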

