Redis Persistence in Practice and Disaster Recovery Simulation (Part 2)


II. Disaster recovery simulation

Since persisted data exists to recover from after a restart, it is well worth rehearsing this kind of disaster recovery. A common rule of thumb is that if you persist data and want stable operation, you should leave about half of physical memory free: when a snapshot is taken, the child process forked for the dump shares the parent's memory pages via copy-on-write, so under heavy writes during the dump, memory consumption can approach twice the dataset size, and the performance impact is also significant.
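As a rough illustration of that rule of thumb, this sketch (assuming a Linux host with /proc/meminfo) computes half of physical RAM as a suggested ceiling for Redis memory usage:

```shell
#!/bin/bash
# Sketch of the "leave half of physical memory free" rule of thumb.
# During a snapshot, the forked child shares pages with the parent via
# copy-on-write, so under heavy writes memory usage can approach twice
# the dataset size; capping Redis at about half of RAM leaves headroom.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
suggested_kb=$((total_kb / 2))
echo "physical RAM: ${total_kb} kB"
echo "suggested Redis memory ceiling: ${suggested_kb} kB"
```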
The usual design, then, is to use the replication mechanism to compensate for the performance cost of AOF and snapshots while still achieving durable persistence.

That is: disable both snapshots and AOF on the master to preserve its read/write performance, and enable both snapshots and AOF on the slave to guarantee data safety.

First, modify the following configuration on the master:
$ sudo vim /opt/redis/etc/redis.conf

#save 900 1    # disable snapshots
#save 300 10
#save 60 10000

appendonly no  # disable AOF

Next, modify the following configuration on the slave:
$ sudo vim /opt/redis/etc/redis.conf

save 900 1    # enable snapshots
save 300 10
save 60 10000

appendonly yes                    # enable AOF
appendfilename appendonly.aof     # AOF file name
# appendfsync always
appendfsync everysec              # force a write to disk once per second
# appendfsync no

no-appendfsync-on-rewrite yes     # do not fsync appended commands while the log is being rewritten
auto-aof-rewrite-percentage 100   # automatically start a new log rewrite
auto-aof-rewrite-min-size 64mb    # minimum size to trigger a new log rewrite

Start the master and the slave, respectively:
$ /etc/init.d/redis start

After startup is complete, confirm that the snapshot parameter is not enabled on the master:
redis 127.0.0.1:6379> CONFIG GET save
1) "save"
2) ""

Then generate 250,000 keys on the master with the following script:
dongguo@redis:/opt/redis/data/6379$ cat redis-cli-generate.temp.sh

#!/bin/bash

REDISCLI="redis-cli -a slavepass -n 1 SET"
ID=1

while (($ID<50001))
do
  INSTANCE_NAME="i-2-$ID-VM"
  UUID=`cat /proc/sys/kernel/random/uuid`
  PRIVATE_IP_ADDRESS=10.`echo "$RANDOM % 255 + 1" | bc`.`echo "$RANDOM % 255 + 1" | bc`.`echo "$RANDOM % 255 + 1" | bc`
  CREATED=`date "+%Y-%m-%d %H:%M:%S"`

  $REDISCLI vm_instance:$ID:instance_name "$INSTANCE_NAME"
  $REDISCLI vm_instance:$ID:uuid "$UUID"
  $REDISCLI vm_instance:$ID:private_ip_address "$PRIVATE_IP_ADDRESS"
  $REDISCLI vm_instance:$ID:created "$CREATED"

  $REDISCLI vm_instance:$INSTANCE_NAME:id "$ID"

  ID=$(($ID+1))
done

dongguo@redis:/opt/redis/data/6379$ ./redis-cli-generate.temp.sh

While the data is being generated, you can clearly see that the master creates the dump.rdb file only for the first synchronization with the slave, after which changes are passed to the slave as an incremental stream of commands, and the dump.rdb file no longer grows.
dongguo@redis:/opt/redis/data/6379$ ls -lh
total 4.0K
-rw-r--r-- 1 root root Sep 00:40 dump.rdb

On the slave, you can see that both the dump.rdb file and the AOF file are growing, with the AOF file growing noticeably faster than dump.rdb.
dongguo@redis-slave:/opt/redis/data/6379$ ls -lh
total 24M
-rw-r--r-- 1 root root  15M Sep 12:06 appendonly.aof
-rw-r--r-- 1 root root 9.2M Sep 12:06 dump.rdb

After the data has been inserted, first confirm the current data volume:
redis 127.0.0.1:6379> INFO

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:27623
run_id:e00757f7b2d6885fa9811540df9dfed39430b642
uptime_in_seconds:1541
uptime_in_days:0
lru_clock:650187
used_cpu_sys:69.28
used_cpu_user:7.67
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33055824
used_memory_human:31.52M
used_memory_rss:34717696
used_memory_peak:33055800
used_memory_peak_human:31.52M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:250000
bgsave_in_progress:0
last_save_time:1348677645
bgrewriteaof_in_progress:0
total_connections_received:250007
total_commands_processed:750019
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:246
vm_enabled:0
role:master
slave0:10.6.1.144,6379,online
db1:keys=250000,expires=0
 

The current data volume is 250,000 keys, consuming 31.52M of memory.

Then we kill the master's Redis process directly, simulating a disaster:
dongguo@redis:/opt/redis/data/6379$ sudo killall -9 redis-server

Let's check the status on the slave:
redis 127.0.0.1:6379> INFO

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:1627
uptime_in_days:0
lru_clock:654181
used_cpu_sys:29.69
used_cpu_user:1.21
used_cpu_sys_children:1.70
used_cpu_user_children:1.23
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047696
used_memory_human:31.52M
used_memory_rss:34775040
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:3308
bgsave_in_progress:0
last_save_time:1348718951
bgrewriteaof_in_progress:0
total_connections_received:4
total_commands_processed:250308
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:694
vm_enabled:0
role:slave
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
master_host:10.6.1.143
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
master_link_down_since_seconds:25
slave_priority:100
db1:keys=250000,expires=0

You can see that master_link_status is down, so the master is inaccessible.
At this point, the slave still works well and retains its AOF and RDB files.

Below we will restore the data on the master from the AOF and RDB files saved on the slave.

First, cancel synchronization on the slave. This prevents the main library, if it is restarted before its data is fully restored, from directly overwriting the data on the slave and losing everything.
redis 127.0.0.1:6379> SLAVEOF NO ONE
OK

Confirm that there is no longer any master-related configuration information:
redis 127.0.0.1:6379> INFO

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:1961
uptime_in_days:0
lru_clock:654215
used_cpu_sys:29.98
used_cpu_user:1.22
used_cpu_sys_children:1.76
used_cpu_user_children:1.42
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047696
used_memory_human:31.52M
used_memory_rss:34779136
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348719252
bgrewriteaof_in_progress:0
total_connections_received:4
total_commands_processed:250311
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:1119
vm_enabled:0
role:master
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
db1:keys=250000,expires=0
 

Pack up the data files on the slave:
dongguo@redis-slave:/opt/redis/data/6379$ tar cvf /home/dongguo/data.tar *
appendonly.aof
dump.rdb
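The tar step above can also be wrapped in a small reusable function, which is handy if backups are taken regularly. This is only a sketch; the default paths and the data-&lt;timestamp&gt;.tar naming scheme are assumptions, not part of the original setup:

```shell
#!/bin/bash
# Sketch: bundle the slave's persistence files into a timestamped archive.
# Default paths follow this article; both arguments are optional.
backup_redis() {
  local data_dir=${1:-/opt/redis/data/6379}
  local out_dir=${2:-/home/dongguo}
  local stamp
  stamp=$(date +%Y%m%d-%H%M%S)
  # -C stores only the file names (not full paths) inside the archive
  tar cf "${out_dir}/data-${stamp}.tar" -C "${data_dir}" appendonly.aof dump.rdb || return 1
  echo "${out_dir}/data-${stamp}.tar"
}
```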

Upload data.tar to the master and try to recover the data.
You can see that the master's data directory contains an initial data file created for the slave; it is small, so delete it first:
dongguo@redis:/opt/redis/data/6379$ ls -l
total 4
-rw-r--r-- 1 root root Sep 00:40 dump.rdb
dongguo@redis:/opt/redis/data/6379$ sudo rm -f dump.rdb

Then extract the data files:
dongguo@redis:/opt/redis/data/6379$ sudo tar xf /home/dongguo/data.tar
dongguo@redis:/opt/redis/data/6379$ ls -lh
total 29M
-rw-r--r-- 1 root root 18M Sep 01:22 appendonly.aof
-rw-r--r-- 1 root root 12M Sep 01:22 dump.rdb

Start Redis on the master:
dongguo@redis:/opt/redis/data/6379$ sudo /etc/init.d/redis start
Starting Redis server...

Check whether the data has been restored:
redis 127.0.0.1:6379> INFO

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:16959
run_id:6e5ba6c053583414e75353b283597ea404494926
uptime_in_seconds:22
uptime_in_days:0
lru_clock:650292
used_cpu_sys:0.18
used_cpu_user:0.20
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047216
used_memory_human:31.52M
used_memory_rss:34623488
used_memory_peak:33047192
used_memory_peak_human:31.52M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348680180
bgrewriteaof_in_progress:0
total_connections_received:1
total_commands_processed:1
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
vm_enabled:0
role:master
db1:keys=250000,expires=0

You can see that all 250,000 keys have been fully restored to the master.

At this point, it is safe to restore the slave's synchronization settings.
redis 127.0.0.1:6379> SLAVEOF 10.6.1.143 6379
OK

Check the synchronization status:
redis 127.0.0.1:6379> INFO

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:2652
uptime_in_days:0
lru_clock:654284
used_cpu_sys:30.01
used_cpu_user:2.12
used_cpu_sys_children:1.76
used_cpu_user_children:1.42
connected_clients:2
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33056288
used_memory_human:31.52M
used_memory_rss:34766848
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348719252
bgrewriteaof_in_progress:1
total_connections_received:6
total_commands_processed:250313
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:12217
vm_enabled:0
role:slave
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
master_host:10.6.1.143
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_priority:100
db1:keys=250000,expires=0

master_link_status is now shown as up, and the synchronization status is normal.

During this recovery we copied both the AOF and the RDB file, so which of the two actually restored the data?
In fact, when the Redis server goes down, data is reloaded into memory on restart with the following priority:
1. If only AOF is configured, the AOF file is loaded on restart to recover the data;
2. If both RDB and AOF are configured, only the AOF file is loaded on startup;
3. If only RDB is configured, the dump file is loaded on startup to recover the data.

In other words, AOF takes priority over RDB, which is easy to understand, since the AOF guarantees data integrity better than the RDB does.
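The loading priority above can be sketched as a small shell function; this is illustrative logic only, not the actual Redis source:

```shell
#!/bin/bash
# Sketch: which file Redis loads on startup, per the priority rules above.
# Arguments: data directory, and whether appendonly is enabled (yes/no).
startup_load_source() {
  local dir=$1 appendonly=$2
  if [ "$appendonly" = "yes" ] && [ -f "$dir/appendonly.aof" ]; then
    echo "appendonly.aof"   # AOF wins whenever it is enabled and present
  elif [ -f "$dir/dump.rdb" ]; then
    echo "dump.rdb"         # otherwise fall back to the RDB snapshot
  else
    echo "none"             # nothing to load; start with an empty dataset
  fi
}
```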

In this case, we protected the data by enabling both AOF and RDB on the slave, and used those files to restore the master.

However, in our current production environment, because expiration times are set on all the data, using AOF is not very practical: the overly frequent write operations cause the AOF file to grow abnormally large, far exceeding our actual data size, which in turn makes data recovery take a long time.
Therefore, you can enable only snapshots on the slave for local persistence, and consider tuning the save frequency, or calling a scheduled task that periodically issues BGSAVE to store snapshots, to ensure the integrity of the local data as much as possible.
In such an architecture, if only the master goes down and the slave is intact, data recovery can reach 100%.
If the master and the slave go down at the same time, data can still be recovered to an acceptable degree.
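A scheduled BGSAVE like the one suggested above could look like this hypothetical crontab entry (the interval, redis-cli path, and lack of auth are all assumptions; adjust for your environment):

```shell
# Run on the slave: force a background snapshot every 5 minutes.
*/5 * * * * /usr/local/bin/redis-cli -h 127.0.0.1 -p 6379 BGSAVE > /dev/null 2>&1
```

Because BGSAVE forks in the background, the interval should be chosen so that consecutive dumps do not overlap on large datasets.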

