Redis data backup and restart recovery

Source: Internet
Author: User
Tags redis server

I. Discussion and understanding of Redis persistence

Redis currently offers two persistence methods: RDB and AOF.

First, we should be clear about what persisted data is for: recovering data after a restart.
Redis is an in-memory database, and both RDB and AOF are mechanisms that make this recovery possible.
When Redis recovers from RDB or AOF, it reads the RDB or AOF file and reloads its contents into memory.

RDB is snapshot storage and is the default persistence method.
It can be thought of as semi-persistent: data is saved to disk periodically according to a configured policy.
The corresponding data file is dump.rdb, and the snapshot interval is defined by the save parameter in the configuration file.
The default snapshot settings are:

# Snapshot to disk if at least 1 key changed within 900 seconds.
save 900 1
# Snapshot to disk if at least 10 keys changed within 300 seconds.
save 300 10
# Snapshot to disk if at least 10000 keys changed within 60 seconds.
save 60 10000
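The save rules can be read as "take a snapshot once enough changes have accumulated and enough time has passed since the last save". The following is only a rough shell sketch of that check, not Redis's own logic; the function name snapshot_due, its arguments, and the "seconds:changes" rule notation are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether any "save <seconds> <changes>" rule fires.
# $1 = seconds since the last successful save
# $2 = number of changes since the last save
# remaining arguments = rules written as "<seconds>:<changes>"
snapshot_due() {
  sd_elapsed=$1
  sd_changes=$2
  shift 2
  for sd_rule in "$@"; do
    sd_secs=${sd_rule%%:*}   # part before the colon
    sd_min=${sd_rule##*:}    # part after the colon
    if [ "$sd_elapsed" -ge "$sd_secs" ] && [ "$sd_changes" -ge "$sd_min" ]; then
      echo yes
      return 0
    fi
  done
  echo no
}

# The three default rules: 120 seconds and 15000 changes satisfy "60 10000".
snapshot_due 120 15000 900:1 300:10 60:10000
```

With only 5 changes after 120 seconds, none of the three rules would fire and the sketch prints no.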

The Redis RDB file is not left corrupted by an interrupted write, because the write is performed in a separate process.
When a new RDB file is generated, the child process forked by Redis first writes the data to a temporary file and then replaces the RDB file with it through an atomic rename system call.
As a result, a complete RDB file is available whenever a failure occurs.
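The same write-then-rename pattern can be tried out in the shell. This is a minimal sketch under the assumption that the temporary file and the target live on the same filesystem, so that mv is a plain rename; the helper name atomic_write and the file name dump.demo are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: write stdin to a temporary file next to the target,
# then rename it over the target so readers never see a half-written file.
atomic_write() {
  aw_target=$1
  aw_tmp=$(mktemp "$(dirname "$aw_target")/temp-XXXXXX") || return 1
  if cat > "$aw_tmp"; then
    mv -f "$aw_tmp" "$aw_target"
  else
    rm -f "$aw_tmp"
    return 1
  fi
}

demo_dir=$(mktemp -d)
echo "snapshot v1" | atomic_write "$demo_dir/dump.demo"
echo "snapshot v2" | atomic_write "$demo_dir/dump.demo"
cat "$demo_dir/dump.demo"
```

If the writer is interrupted before the mv, the previous dump.demo is still intact, which is the property the article describes for dump.rdb.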

The RDB file is also part of the internal implementation of Redis master-slave synchronization.
The first synchronization of a slave with the master works as follows:
The slave sends a synchronization request to the master. The master first dumps an RDB file, transfers the full RDB file to the slave, and then forwards the commands it buffered in the meantime to the slave. This completes the first synchronization.
The second and later synchronizations work as follows:
The master forwards each write command to every slave in real time.
However, whenever the slave and master are disconnected and reconnect, for whatever reason, the two steps above are repeated.
Redis master-slave replication is built on memory snapshots. As long as there is a slave, memory snapshots will be taken.

RDB has an obvious shortcoming: if the database fails, the data stored in the RDB file is not fully up to date.
All data written between the last RDB file generation and the Redis failure is lost.

AOF (append-only file) offers stronger durability than RDB.
With AOF persistence, Redis appends every write command it receives to a file through the write function, similar to the MySQL binlog.
When Redis restarts, it re-executes the write commands saved in the file to rebuild the entire database in memory.
The corresponding settings are:
$ vim /opt/redis/etc/redis_62.16.conf

# Enable AOF persistence.
appendonly yes
# AOF file name, default: appendonly.aof
appendfilename appendonly.aof
# Sync to disk immediately after every write command. This gives the strongest durability but is also the slowest, and is generally not recommended.
# appendfsync always
# Sync to disk once per second. This is a good compromise between performance and durability, and is recommended.
appendfsync everysec
# Leave syncing entirely to the OS, which usually flushes about every 30 seconds. This gives the best performance but the weakest durability, and is not recommended.
# appendfsync no

The full durability of AOF brings another problem: the persistence file keeps growing.
For example, if we call incr test 100 times, all 100 commands must be saved in the file, although 99 of them are redundant,
because a single set test 100 in the file is enough to restore the database state.
To compact the AOF file, Redis provides the bgrewriteaof command.
On receiving this command, Redis saves the in-memory data to a temporary file as commands, in a way similar to taking a snapshot, and finally replaces the original file, which keeps the growth of the AOF file under control.
Because the process mimics a snapshot, the rewrite does not read the old AOF file. Instead, the entire in-memory database is written out as commands into a new AOF file.
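The incr example can be reproduced with a toy log in the shell. This is a sketch only: a real AOF stores commands in the Redis protocol format and is rewritten by a forked child process, and the function name aof_rewrite_demo and the file name appendonly.demo are made up for illustration.

```shell
#!/bin/sh
# Toy model of AOF growth and rewrite; it does not talk to a Redis server.
aof_rewrite_demo() {
  ar_count=$1
  ar_dir=$(mktemp -d) || return 1
  ar_log="$ar_dir/appendonly.demo"

  ar_value=0
  ar_i=0
  while [ "$ar_i" -lt "$ar_count" ]; do
    ar_value=$((ar_value + 1))        # state change in "memory"
    echo "INCR test" >> "$ar_log"     # the same command appended to the log
    ar_i=$((ar_i + 1))
  done
  ar_before=$(wc -l < "$ar_log" | tr -d ' ')

  # Rewrite from the in-memory value without reading the old log:
  # write a temporary file, then rename it over the original.
  echo "SET test $ar_value" > "$ar_log.tmp"
  mv -f "$ar_log.tmp" "$ar_log"
  ar_after=$(wc -l < "$ar_log" | tr -d ' ')

  echo "before=$ar_before after=$ar_after content=$(cat "$ar_log")"
  rm -rf "$ar_dir"
}

aof_rewrite_demo 100
```

For 100 increments the sketch prints before=100 after=1 content=SET test 100: one command recreates the state that previously took 100 log entries.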
The corresponding setting parameters are:
$ vim /opt/redis/etc/redis_62.16.conf

# While the log is being rewritten, do not sync newly appended commands to disk; they are only kept in the buffer, which avoids contending with the rewrite for disk I/O.
no-appendfsync-on-rewrite yes
# Start a new log rewrite automatically when the current AOF file is twice the size it had after the previous rewrite.
auto-aof-rewrite-percentage 100
# Minimum AOF file size before a new log rewrite is started. This avoids frequent rewrites while the file is still small just after Redis starts.
auto-aof-rewrite-min-size 64mb
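The interaction of the percentage and the minimum size can be worked through with integer arithmetic. The following is an approximation for illustration, not Redis's own code; the function name rewrite_due and its argument order are assumptions. The first call uses the aof_current_size and aof_base_size values that appear in the slave's INFO output later in this article.

```shell
#!/bin/sh
# Hypothetical helper: would an automatic rewrite be triggered?
# $1 = current AOF size in bytes     $2 = AOF size after the last rewrite
# $3 = auto-aof-rewrite-percentage   $4 = auto-aof-rewrite-min-size in bytes
rewrite_due() {
  rd_cur=$1
  rd_base=$2
  rd_pct=$3
  rd_min=$4
  [ "$rd_base" -gt 0 ] || rd_base=1           # avoid dividing by zero
  rd_growth=$(( rd_cur * 100 / rd_base - 100 ))
  if [ "$rd_cur" -gt "$rd_min" ] && [ "$rd_growth" -ge "$rd_pct" ]; then
    echo yes
  else
    echo no
  fi
}

# About 17 MB, grown about 6% since the last rewrite, below the 64 MB minimum.
rewrite_due 17908619 16787337 100 67108864
# 128 MB, exactly twice the 64 MB base: growth is 100%.
rewrite_due 134217728 67108864 100 67108864
```

The first call prints no because the file is still below the 64 MB minimum; the second prints yes.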

Which one should you choose? The official recommendations are:
In general, if you want a high degree of data safety, use both persistence methods at the same time.
If you can tolerate a few minutes of data loss in case of a disaster, you can use RDB alone.
Many users use only AOF, but since RDB lets you take a complete snapshot of the data from time to time and provides faster restarts, we recommend using RDB as well.
For these reasons, we hope to unify AOF and RDB into a single persistence model in the future (a long-term plan).

In terms of data recovery:
Startup from RDB is faster, for two reasons:
First, the RDB file holds only one record for each piece of data, whereas the AOF log may hold many operation records for the same piece of data. Each piece of data therefore only needs to be written once.
Second, the storage format of the RDB file matches the encoding of Redis data in memory, so no re-encoding is needed and the CPU cost is much lower than when loading the AOF log.

II. Disaster recovery simulation
Since persisted data exists to recover data after a restart, it is well worth simulating such a disaster recovery.
It is said that if data must be persisted and stability ensured, it is advisable to leave half of the physical memory free. The reason is that when a snapshot is taken, the child process forked to perform the dump can come to occupy as much memory as the parent process because of copy-on-write, so the impact on both performance and memory consumption is considerable.
A common design today is to use the replication mechanism to make up for the performance shortcomings of AOF and snapshots while still achieving data persistence.
That is, neither snapshots nor AOF are enabled on the master, to protect its read and write performance, while both snapshots and AOF are enabled on the slave to ensure data safety.
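The "leave half of the physical memory free" rule of thumb amounts to a simple comparison. A minimal sketch, assuming the resident size is taken from the used_memory_rss field of INFO and the total comes from the operating system; the function name fork_headroom_ok and the 1 GiB total are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: does twice the resident size still fit in physical memory?
# $1 = resident memory of the Redis process in bytes (used_memory_rss)
# $2 = total physical memory in bytes
fork_headroom_ok() {
  if [ $(( $1 * 2 )) -le "$2" ]; then
    echo yes
  else
    echo no
  fi
}

# used_memory_rss of the master in this article against an assumed 1 GiB machine.
fork_headroom_ok 34717696 1073741824
```

This is a worst-case estimate: the child only needs its own copy of the pages that the parent modifies while the dump is running.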

First, modify the following configurations on the master:
$ sudo vim /opt/redis/etc/redis_62.16.conf

# Snapshots disabled
# save 900 1
# save 300 10
# save 60 10000
# AOF disabled
appendonly no

Next, modify the following configurations on the slave:
$ sudo vim /opt/redis/etc/redis_62.16.conf

# Snapshots enabled
save 900 1
save 300 10
save 60 10000
# AOF enabled
appendonly yes
# AOF file name
appendfilename appendonly.aof
# appendfsync always
# Sync to disk once per second
appendfsync everysec
# appendfsync no
# During a log rewrite, do not sync appended commands to disk
no-appendfsync-on-rewrite yes
# Start a new log rewrite automatically
auto-aof-rewrite-percentage 100
# Minimum size before a new log rewrite is started
auto-aof-rewrite-min-size 64mb

Start the master and the slave:
$ /etc/init.d/redis start

After startup, confirm on the master that no snapshot rule is in effect:
redis 127.0.0.1:6379> config get save
1) "save"
2) ""

Then use the following script to generate 250,000 keys on the master (50,000 iterations with 5 SET commands each):
/opt/redis/data/6379 $ cat redis-cli-generate.temp.sh

#!/bin/bash
# Each loop iteration issues 5 SET commands against database 1.
REDISCLI="redis-cli -a slavepass -n 1 SET"
ID=1
while (( ID < 50001 ))
do
  INSTANCE_NAME="i-2-$ID-VM"
  UUID=$(cat /proc/sys/kernel/random/uuid)
  PRIVATE_IP_ADDRESS=10.$((RANDOM % 255 + 1)).$((RANDOM % 255 + 1)).$((RANDOM % 255 + 1))
  CREATED=$(date "+%Y-%m-%d %H:%M:%S")
  $REDISCLI vm_instance:$ID:instance_name "$INSTANCE_NAME"
  $REDISCLI vm_instance:$ID:uuid "$UUID"
  $REDISCLI vm_instance:$ID:private_ip_address "$PRIVATE_IP_ADDRESS"
  $REDISCLI vm_instance:$ID:created "$CREATED"
  $REDISCLI vm_instance:$INSTANCE_NAME:id "$ID"
  ID=$((ID + 1))
done

/opt/redis/data/6379 $ ./redis-cli-generate.temp.sh

During data generation we can clearly see that on the master the dump.rdb file is created only when the slave synchronizes for the first time; after that, changes are passed to the slave incrementally as commands.
The dump.rdb file does not grow.
/opt/redis/data/6379 $ ls -lh
total 4.0K
-rw-r--r-- 1 root root 10 Sep 27 00:40 dump.rdb

On the slave, the dump.rdb file and the AOF file keep growing, and the AOF file grows noticeably faster than dump.rdb.
/opt/redis/data/6379 $ ls -lh
total 24M
-rw-r--r-- 1 root root 15M Sep 27 12:06 appendonly.aof
-rw-r--r-- 1 root root 9.2M Sep 27 :06 dump.rdb

After data insertion is complete, check the current data volume.
redis 127.0.0.1:6379> info

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:27623
run_id:e00757f7b2d6885fa9811540df9dfed39430b642
uptime_in_seconds:1541
uptime_in_days:0
lru_clock:650187
used_cpu_sys:69.28
used_cpu_user:7.67
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33055824
used_memory_human:31.52M
used_memory_rss:34717696
used_memory_peak:33055800
used_memory_peak_human:31.52M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:250000
bgsave_in_progress:0
last_save_time:1348677645
bgrewriteaof_in_progress:0
total_connections_received:250007
total_commands_processed:750019
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:246
vm_enabled:0
role:master
slave0:10.6.1.144,6379,online
db1:keys=250000,expires=0

The current data volume is 250,000 keys, occupying 31.52 MB of memory.
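Rather than reading the whole INFO output by eye, individual fields can be picked out with a small filter. This is a minimal sketch; the function name info_field is made up, and the sample text is a shortened copy of the output above. Against a live server the input would presumably come from piping redis-cli info into the function.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "name:value" field read from stdin.
info_field() {
  awk -v key="$1" '{
    sub(/\r$/, "")                 # INFO lines may end in CR LF
    p = index($0, ":")
    if (p > 0 && substr($0, 1, p - 1) == key) print substr($0, p + 1)
  }'
}

sample='used_memory_human:31.52M
role:master
db1:keys=250000,expires=0'

printf '%s\n' "$sample" | info_field used_memory_human
printf '%s\n' "$sample" | info_field db1
```

The two calls print 31.52M and keys=250000,expires=0, the same two figures quoted in the text.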

Next, kill the Redis process on the master to simulate a disaster.
/opt/redis/data/6379 $ sudo killall -9 redis-server

Check the status on the slave:
redis 127.0.0.1:6379> info

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:1627
uptime_in_days:0
lru_clock:654181
used_cpu_sys:29.69
used_cpu_user:1.21
used_cpu_sys_children:1.70
used_cpu_user_children:1.23
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047696
used_memory_human:31.52M
used_memory_rss:34775040
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:3308
bgsave_in_progress:0
last_save_time:1348718951
bgrewriteaof_in_progress:0
total_connections_received:4
total_commands_processed:250308
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:694
vm_enabled:0
role:slave
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
master_host:10.6.1.143
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
master_link_down_since_seconds:25
slave_priority:100
db1:keys=250000,expires=0

We can see that master_link_status is down, so the master is no longer reachable.
The slave is still running normally, and its AOF and RDB files are intact.

Next we restore the data on the master from the AOF and RDB files saved on the slave.

First, cancel replication on the slave. Otherwise, if the master restarts before its data has been restored, its empty dataset would be synchronized to the slave and overwrite the slave's data, and all data would be lost.
redis 127.0.0.1:6379> slaveof no one
OK

Confirm that the master-related configuration is no longer present:
redis 127.0.0.1:6379> info

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:1961
uptime_in_days:0
lru_clock:654215
used_cpu_sys:29.98
used_cpu_user:1.22
used_cpu_sys_children:1.76
used_cpu_user_children:1.42
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047696
used_memory_human:31.52M
used_memory_rss:34779136
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348719252
bgrewriteaof_in_progress:0
total_connections_received:4
total_commands_processed:250311
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:1119
vm_enabled:0
role:master
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
db1:keys=250000,expires=0

Pack the data files on the slave:
/opt/redis/data/6379 $ tar cvf /home/dongguo/data.tar *
appendonly.aof
dump.rdb

Upload data.tar to the master and try to restore the data.
The master's data directory contains a data file that was created for initializing the slave. It is very small; delete it.
/opt/redis/data/6379 $ ls -l
total 4
-rw-r--r-- 1 root root 10 Sep 27 00:40 dump.rdb
/opt/redis/data/6379 $ sudo rm -f dump.rdb

Unpack the data files:
/opt/redis/data/6379 $ sudo tar xf /home/dongguo/data.tar
/opt/redis/data/6379 $ ls -lh
total 29M
-rw-r--r-- 1 root root 18M Sep 27 0:22 appendonly.aof
-rw-r--r-- 1 root root 12M Sep 27 0:22 dump.rdb
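Because the archive travels between machines before it is unpacked, it can be worth comparing a checksum on both sides first. The following sketch is not part of the original procedure: it runs entirely on throwaway files in a temporary directory, and the layout (data, restore, data.tar under a scratch directory) is an assumption made for the demonstration.

```shell
#!/bin/sh
# Demo on throwaway files; nothing here touches a real Redis data directory.
work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
echo "fake aof" > "$work/data/appendonly.aof"
echo "fake rdb" > "$work/data/dump.rdb"

# Pack the two persistence files and record a checksum of the archive.
( cd "$work/data" && tar cf "$work/data.tar" appendonly.aof dump.rdb )
sum_before=$(cksum < "$work/data.tar")

# ... the archive would be transferred to the other machine here ...

# On the receiving side, compare the checksum before unpacking.
sum_after=$(cksum < "$work/data.tar")
if [ "$sum_before" = "$sum_after" ]; then
  ( cd "$work/restore" && tar xf "$work/data.tar" )
  ls "$work/restore"
else
  echo "checksum mismatch, not unpacking" >&2
fi
```

In a real transfer the first checksum would be computed on the slave and the second on the master, so a truncated or corrupted upload is caught before the existing files are replaced.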

Start Redis on the master:
/opt/redis/data/6379 $ sudo /etc/init.d/redis start
Starting redis server...

Check whether the data has been restored:
redis 127.0.0.1:6379> info

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:16959
run_id:6e5ba6c053583414e75353b283597ea404494926
uptime_in_seconds:22
uptime_in_days:0
lru_clock:650292
used_cpu_sys:0.18
used_cpu_user:0.20
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047216
used_memory_human:31.52M
used_memory_rss:34623488
used_memory_peak:33047192
used_memory_peak_human:31.52M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348680180
bgrewriteaof_in_progress:0
total_connections_received:1
total_commands_processed:1
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
vm_enabled:0
role:master
db1:keys=250000,expires=0

We can see that all 250,000 keys have been restored on the master.

At this point it is safe to restore the replication setting on the slave.
redis 127.0.0.1:6379> slaveof 10.6.1.143 6379
OK

Check the synchronization status:
redis 127.0.0.1:6379> info

redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:2652
uptime_in_days:0
lru_clock:654284
used_cpu_sys:30.01
used_cpu_user:2.12
used_cpu_sys_children:1.76
used_cpu_user_children:1.42
connected_clients:2
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33056288
used_memory_human:31.52M
used_memory_rss:34766848
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348719252
bgrewriteaof_in_progress:1
total_connections_received:6
total_commands_processed:250313
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:12217
vm_enabled:0
role:slave
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
master_host:10.6.1.143
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_priority:100
db1:keys=250000,expires=0

master_link_status shows up, so synchronization is working normally.

During the recovery we copied both the AOF and RDB files. Which of the two actually performed the data recovery?
When a Redis server restarts after a failure, data is loaded into memory according to the following priority:
1. If only AOF is configured, the AOF file is loaded on restart to restore the data;
2. If both RDB and AOF are configured, only the AOF file is loaded on restart to restore the data;
3. If only RDB is configured, the dump file is loaded on restart to restore the data.

In other words, AOF takes priority over RDB. This is easy to understand, because AOF guarantees data completeness better than RDB does.
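The three rules collapse into a single decision on the appendonly setting. A minimal sketch of that decision follows; the function name startup_load_file is made up, and the file names are the ones used in this article's configuration.

```shell
#!/bin/sh
# Hypothetical helper: which file does a restart load from?
# $1 = value of the appendonly setting (yes or no)
# $2 = data directory
startup_load_file() {
  if [ "$1" = "yes" ]; then
    echo "$2/appendonly.aof"   # AOF enabled: the AOF wins, with or without RDB
  else
    echo "$2/dump.rdb"         # AOF disabled: fall back to the snapshot
  fi
}

startup_load_file yes /opt/redis/data/6379
startup_load_file no /opt/redis/data/6379
```

Applied to this walkthrough, the master was configured with appendonly no and its INFO output after the restart shows aof_enabled:0, so under these rules it was dump.rdb that restored the data, even though appendonly.aof had been copied over as well.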

In this case, we protected the data and restored the master by enabling both AOF and RDB on the slave.

However, in our current production environment the data carries expiration times, so AOF is not very practical: overly frequent writes inflate the AOF file to an abnormal size that far exceeds our actual data volume, which also makes data recovery take much longer.
Therefore, it is possible to enable only snapshots on the slave for local persistence, and to consider raising the frequency in the save rules or running a scheduled task that takes regular bgsave snapshots, to keep the local data as complete as possible.
With this architecture, if only the master fails and the slave is intact, the data can be recovered 100%.
If both the master and the slave fail, the data can still be restored to an acceptable level.

 

From: http://blog.csdn.net/gzh0222/article/details/8482525
