Original link. Translator: Alexandar Mahone
This article gives a technical description of Redis persistence and is recommended reading for all Redis users. For a broader overview of Redis persistence and the durability guarantees it provides, see the article "Redis persistence demystified".
Redis Persistence
Redis provides several persistence options:
- RDB persistence takes point-in-time snapshots of the dataset at specified intervals.
- AOF persistence logs every write command the server executes and restores the dataset by replaying those commands when the server starts. Commands in the AOF file are stored in the Redis protocol format, and new commands are appended to the end of the file. Redis can also rewrite the AOF file in the background so that it does not grow beyond the size needed to represent the current dataset state.
- You can use AOF and RDB persistence together. In that case, when Redis restarts it uses the AOF file to restore the dataset, because the AOF file usually holds a more complete dataset than the RDB file.
- You can also turn persistence off entirely, so that the data exists only while the server is running.
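The AOF description above says that commands are stored in the Redis protocol format. As a rough illustration, the sketch below encodes one command the way the Redis protocol represents it: an array of bulk strings. The function name `encode_command` is hypothetical, and a real AOF file may contain additional entries that this sketch does not model.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command as a Redis protocol array of bulk strings."""
    # "*<number of arguments>\r\n" opens the array.
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        # Each argument is "$<byte length>\r\n<bytes>\r\n".
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


# What appending "SET counter 1" to an append-only log could look like.
entry = encode_command("SET", "counter", "1")
print(entry)
```

Because every argument is length-prefixed, a reader can tell where each command ends, which is what makes the log easy to parse and to repair.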
It is important to understand the trade-offs between RDB and AOF persistence. The following subsections describe the advantages and disadvantages of each.
Advantages of RDB:
- An RDB file is a compact single file that represents your Redis data at a point in time. RDB files are well suited to backups. For example, you might archive an RDB file every hour for the last 24 hours and keep one RDB snapshot per day for 30 days. This lets you easily restore different versions of the dataset after a disaster.
- RDB is ideal for disaster recovery: it is a single compact file that can be transferred to a remote data center or to Amazon S3 (possibly encrypted).
- RDB maximizes Redis performance, because the only work the Redis parent process has to do to persist is fork a child process, which does all the rest. The parent process never performs disk I/O or similar operations.
- RDB allows faster restarts than AOF for large datasets.
Disadvantages of RDB:
- RDB is not a good choice if you need to minimize data loss when Redis stops working (for example, after a power outage). You can configure different save points at which an RDB file is produced (for example, after at least 5 minutes and 100 writes to the dataset; multiple save points are allowed). However, you will usually create an RDB snapshot every 5 minutes or more, so if Redis stops working without a clean shutdown for any reason, you should be prepared to lose the last few minutes of data.
- RDB needs to fork() often in order to persist to disk using a child process. fork() can be time consuming if the dataset is large, and may cause Redis to stop serving clients for some milliseconds, or even seconds, if the dataset is very large and CPU performance is poor. AOF also needs to fork(), but you can tune how often the log is rewritten without any trade-off in durability.
Advantages of AOF:
- Using AOF makes Redis much more durable. You can choose among different fsync policies: no fsync at all, fsync every second, or fsync on every request. With the default policy of fsync every second, write performance is still good (fsync is performed by a background thread while the main thread keeps serving writes), and you lose at most about one second of writes.
- The AOF log is append-only, so there are no seeks and no corruption problems if power is lost. Even if the log ends with a half-written command for some reason (disk full or otherwise), the redis-check-aof tool can fix it easily.
- When the AOF file gets too big, Redis automatically rewrites it in the background. The rewrite is completely safe: while Redis keeps appending to the old file, a completely new file is produced with the minimal set of operations needed to create the current dataset, and once this second file is ready Redis switches the two and starts appending to the new one.
- The AOF contains a log of all operations, one after the other, in a format that is easy to understand and parse. You can even export an AOF file easily. For example, even if you accidentally flushed everything with the FLUSHALL command, as long as no rewrite of the log has been performed in the meantime you can still save your dataset: stop the server, remove the last command from the file, and restart Redis.
Disadvantages of AOF:
- AOF files are usually bigger than the equivalent RDB files for the same dataset.
- AOF can be slower than RDB, depending on the exact fsync policy. With fsync set to every second, performance is generally still very high, and with fsync disabled it should be as fast as RDB even under high load. Still, RDB can provide better guarantees about maximum latency even under a huge write load.
- In the past we experienced rare bugs in specific commands (for example, blocking commands like BRPOPLPUSH) that caused the AOF to not reproduce exactly the same dataset on reloading. These bugs are rare, and the test suite checks for them by automatically creating random complex datasets and reloading them to verify that everything is OK, but this kind of bug is almost impossible with RDB persistence. To make this point clearer: the Redis AOF works by incrementally updating an existing state, as MySQL or MongoDB do, while RDB snapshotting creates everything from scratch again and again, which is conceptually more robust. However, 1) note that every time Redis rewrites the AOF, it recreates it from scratch starting from the actual data in the current dataset, which makes it more resistant to bugs than an AOF file that is only ever appended to (or one rewritten by reading the old AOF instead of the data in memory). 2) We have never received a report from users of AOF corruption detected in the real world.
RDB or AOF: which one should I use?
In general, you should use both persistence methods if you want a degree of data safety comparable to what PostgreSQL provides.
If you care a lot about your data but can still live with a few minutes of data loss in case of disaster, you can use RDB alone.
Many users use AOF alone, but we discourage it, because taking an RDB snapshot from time to time is very useful for database backups, for faster restarts, and as protection against bugs in the AOF engine.
Note: for all these reasons, we may unify AOF and RDB into a single persistence model in the future (long-term plan).
The following sections describe the details of the two persistence models.
RDB Snapshot
By default, Redis saves snapshots of the dataset to disk in a binary file named dump.rdb. You can configure Redis to save the dataset every N seconds if there have been at least M changes to the dataset, or you can call the SAVE or BGSAVE commands manually.
For example, this configuration makes Redis automatically dump the dataset to disk every 60 seconds if at least 1000 keys have changed:
save 60 1000
This strategy is known as snapshotting.
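The save-point rule above can be sketched as a small predicate. This is only an illustration of the "N seconds and M changes" condition; the function name `should_snapshot` and its parameters are hypothetical and do not come from Redis itself.

```python
from typing import Iterable, Tuple


def should_snapshot(
    seconds_since_last_save: float,
    changes_since_last_save: int,
    save_points: Iterable[Tuple[int, int]],
) -> bool:
    """Return True if any (seconds, changes) save point is satisfied."""
    # A save point fires only when BOTH thresholds are reached.
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


# "save 60 1000" expressed as a single save point.
save_points = [(60, 1000)]
print(should_snapshot(61, 1500, save_points))
print(should_snapshot(61, 10, save_points))
```

With several save points configured, a snapshot is due as soon as any one of them is satisfied.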
How snapshotting works:
Whenever Redis needs to dump the dataset to the dump.rdb file, the server does the following:
- Redis calls fork(). There is now a parent process and a child process.
- The child process writes the dataset to a temporary RDB file.
- When the child process finishes writing the new RDB file, the new file replaces the old one.
This way of working lets Redis benefit from the copy-on-write mechanism.
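The fork, write-to-temp-file, and replace steps above can be sketched in a few lines. This is a simplified illustration and not how Redis is implemented: it uses JSON as a stand-in for the binary RDB format, the function names are hypothetical, the parent blocks while waiting for the child, and it requires a POSIX system because it calls `os.fork()`.

```python
import json
import os


def start_snapshot(dataset: dict, rdb_path: str) -> int:
    """Fork a child that dumps `dataset` to a temp file; return the child's pid."""
    tmp_path = rdb_path + ".tmp"
    pid = os.fork()
    if pid == 0:
        # Child: it sees the dataset exactly as it was at fork() time,
        # no matter what the parent changes afterwards.
        try:
            with open(tmp_path, "w") as f:
                json.dump(dataset, f)
                f.flush()
                os.fsync(f.fileno())
            os._exit(0)
        except BaseException:
            os._exit(1)
    return pid  # Parent: free to keep serving writes.


def finish_snapshot(pid: int, rdb_path: str) -> bool:
    """Wait for the child, then atomically move the temp file into place."""
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
        os.replace(rdb_path + ".tmp", rdb_path)  # atomic rename
        return True
    return False
```

If the parent modifies `dataset` between `start_snapshot` and `finish_snapshot`, the file on disk still reflects the state at the moment of the fork, which is the point-in-time property that copy-on-write provides.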
Append-only file (AOF)
Snapshotting is not very durable: if Redis goes down for some reason, the server loses the data that was written recently and not yet saved to a snapshot. Data durability is not the most important concern for some applications, but snapshotting is a poor fit for applications that require full durability.
Starting with version 1.1, Redis offers a fully durable alternative: AOF persistence.
You can turn on AOF in the configuration file:
appendonly yes
From now on, every time Redis executes a command that changes the dataset (such as SET), the command is appended to the end of the AOF file. When Redis restarts, it rebuilds the dataset by replaying the commands in the AOF file.
Log rewriting
As you can guess, the AOF file gets bigger and bigger as write operations are performed. For example, if you increment a counter 100 times, your dataset ends up with a single key holding the final value, but the AOF contains 100 entries. 99 of those entries are not needed to rebuild the current state.
So Redis supports an interesting feature: it can rebuild the AOF in the background without interrupting service to clients. Whenever you issue BGREWRITEAOF, Redis writes a new AOF file containing the shortest sequence of commands needed to rebuild the current in-memory dataset. If you are using AOF with Redis 2.2, you need to run BGREWRITEAOF from time to time. Redis 2.4 can trigger log rewriting automatically (see the sample configuration file shipped with Redis 2.4 for more information).
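The counter example above can be made concrete with a toy model. The sketch below is an illustration under simplifying assumptions: it supports only SET, INCR, and DEL, represents log entries as tuples rather than in the Redis protocol format, and uses hypothetical function names.

```python
def replay(log):
    """Rebuild the dataset by re-executing (command, key, *args) entries."""
    data = {}
    for cmd, key, *args in log:
        if cmd == "SET":
            data[key] = args[0]
        elif cmd == "INCR":
            data[key] = str(int(data.get(key, "0")) + 1)
        elif cmd == "DEL":
            data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {cmd}")
    return data


def rewrite_from_memory(data):
    """Emit the shortest log that recreates the in-memory dataset."""
    return [("SET", key, value) for key, value in data.items()]


# 100 increments produce 100 log entries but a single key.
log = [("INCR", "counter")] * 100
dataset = replay(log)
compact_log = rewrite_from_memory(dataset)
print(len(log), len(compact_log))
```

Note that the compact log is generated from the dataset in memory, not by post-processing the old log, which matches the description of the rewrite given earlier in this article.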
How durable is the AOF?
You can configure how often Redis fsyncs data to disk. There are three options:
- fsync every time a new command is appended to the AOF file: very slow, very safe.
- fsync every second: fast enough (almost as fast as RDB persistence), and you lose only about 1 second of data in a failure.
- Never fsync: leave the data in the hands of the operating system. The fastest and least safe choice.
The recommended (and default) policy is fsync every second, which balances speed and safety. The always-fsync policy is very slow in practice, even after the improvements made in Redis 2.0: frequent calls to fsync cannot be made fast.
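The three fsync options above can be sketched with a small writer class. This is an illustration only: the class name `AofWriter` is hypothetical, the policy names mirror the values of the `appendfsync` configuration directive (`always`, `everysec`, `no`), and unlike Redis, which performs the every-second fsync in a background thread, this sketch checks the clock inline on each append.

```python
import os
import time


class AofWriter:
    """Append entries to a log file using one of three fsync policies."""

    def __init__(self, path: str, policy: str = "everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown fsync policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_fsync = time.monotonic()
        self.fsync_count = 0

    def append(self, entry: bytes) -> None:
        self.file.write(entry)
        self.file.flush()  # hand the bytes to the operating system
        if self.policy == "always":
            self._fsync()  # force to disk on every write
        elif self.policy == "everysec":
            if time.monotonic() - self.last_fsync >= 1.0:
                self._fsync()  # force to disk at most once per second
        # policy "no": never call fsync; the OS decides when to write.

    def _fsync(self) -> None:
        os.fsync(self.file.fileno())
        self.last_fsync = time.monotonic()
        self.fsync_count += 1

    def close(self) -> None:
        self.file.close()
```

The trade-off is visible in `fsync_count`: `always` pays one fsync per write, `everysec` bounds the cost to roughly one fsync per second, and `no` pays nothing but leaves the durability window up to the operating system.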
What if something goes wrong with the AOF file?
The server may crash while writing the AOF file (this should never lead to data inconsistency), leaving the file corrupted so that Redis refuses to load it. When this happens, you can fix the AOF file as follows:
- Make a backup copy of the existing AOF file.
- Fix the original AOF file using the redis-check-aof tool that ships with Redis:
- $ redis-check-aof --fix <filename>
- Optionally, use diff -u to compare the repaired AOF file with the backup of the original and see the differences between the two files.
- Restart the Redis server and wait for it to load the repaired AOF file and recover the data.
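To give an idea of what repairing a half-written trailing command involves, the sketch below finds how many bytes at the start of a log consist of complete commands in the Redis protocol format. It is not the implementation of redis-check-aof; the function names are hypothetical, and it assumes the file contains only arrays of bulk strings.

```python
def _read_line(data: bytes, pos: int):
    """Return (line, next_pos) for the CRLF-terminated line at pos, or None."""
    end = data.find(b"\r\n", pos)
    if end == -1:
        return None
    return data[pos:end], end + 2


def _parse_command(data: bytes, pos: int):
    """Return the offset just past one complete command, or None."""
    line = _read_line(data, pos)
    if line is None or not line[0].startswith(b"*"):
        return None
    try:
        argc = int(line[0][1:])
    except ValueError:
        return None
    if argc < 1:
        return None
    pos = line[1]
    for _ in range(argc):
        line = _read_line(data, pos)
        if line is None or not line[0].startswith(b"$"):
            return None
        try:
            size = int(line[0][1:])
        except ValueError:
            return None
        pos = line[1]
        # The argument must be fully present and followed by CRLF.
        if size < 0 or data[pos + size:pos + size + 2] != b"\r\n":
            return None
        pos += size + 2
    return pos


def valid_prefix_length(data: bytes) -> int:
    """Length of the longest prefix made only of complete commands."""
    pos = 0
    while pos < len(data):
        end = _parse_command(data, pos)
        if end is None:
            break
        pos = end
    return pos
```

A repair along these lines would truncate the file to `valid_prefix_length(data)` bytes, which is why making a backup first, as the steps above recommend, matters.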
How it works
Log rewriting uses the same copy-on-write mechanism as snapshotting. Here is the procedure:
- Redis calls fork(), so there is now a parent process and a child process.
- The child process starts writing the new AOF to a temporary file.
- The parent process accumulates all new changes in an in-memory buffer (and also writes them to the old AOF file, so we are safe even if the rewrite fails).
- When the child process finishes rewriting the file, the parent process receives a signal and appends the in-memory buffer to the end of the file generated by the child.
- Done! Redis now atomically renames the new file into the place of the old one and starts appending new data to the new file.
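The rewrite procedure above can be simulated in a single process to show why no write is lost. This is a toy model under stated assumptions: the class `ToyServer` is hypothetical, a dictionary copy stands in for fork(), Python lists stand in for the AOF files, and a list assignment stands in for the atomic rename.

```python
class ToyServer:
    """Single-process simulation of the background AOF rewrite."""

    def __init__(self):
        self.data = {}
        self.aof = []               # the "old" AOF file
        self.rewrite_buffer = None  # a list only while a rewrite is running
        self.child_view = None

    def set(self, key, value):
        self.data[key] = value
        entry = ("SET", key, value)
        self.aof.append(entry)  # the old file is always kept up to date
        if self.rewrite_buffer is not None:
            self.rewrite_buffer.append(entry)  # also remembered for the new file

    def start_rewrite(self):
        # Stand-in for fork(): the child gets a frozen view of the dataset.
        self.child_view = dict(self.data)
        self.rewrite_buffer = []

    def finish_rewrite(self):
        # Child's work: the minimal log for the frozen view.
        new_aof = [("SET", k, v) for k, v in self.child_view.items()]
        # Parent's work: append the writes that arrived during the rewrite.
        new_aof.extend(self.rewrite_buffer)
        # Stand-in for the atomic rename: switch to the new file.
        self.aof = new_aof
        self.rewrite_buffer = None
        self.child_view = None


server = ToyServer()
for i in range(100):
    server.set("counter", str(i))
server.start_rewrite()
server.set("other", "x")  # a write that arrives while the rewrite is running
server.finish_rewrite()
print(server.aof)
```

If `finish_rewrite` were never called, for example because the child failed, `server.aof` would still hold every write, which is the safety property the third step describes.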
How do I switch from RDB persistence to AOF persistence?
The procedure differs between Redis 2.0 and Redis 2.2. As you can guess, it is simpler in Redis 2.2 and does not require a restart.
Redis >= 2.2
- Make a backup of your most recent RDB file.
- Store the backup in a safe place.
- Issue the following two commands:
- $ redis-cli config set appendonly yes
- $ redis-cli config set save ""
- Make sure the database contains the same keys as before.
- Make sure writes are being appended to the AOF file correctly.
The first CONFIG command turns on the AOF. To do so, Redis blocks to generate the initial dump, then opens the file for writing and starts appending all subsequent write operations.
The second CONFIG command turns off snapshotting persistence. This step is optional; skip it if you want both persistence methods enabled at the same time.
Important: remember to edit your redis.conf file to turn on the AOF. Otherwise, when you restart the server your configuration changes will be lost and the server will start again with the old configuration.
Redis 2.0
- Make a backup of your most recent RDB file;
- Store the backup in a safe place;
- Stop all writes to the database;
- Issue redis-cli bgrewriteaof to create the AOF file;
- Stop the Redis server once the AOF file has been generated;
- Edit redis.conf and enable AOF persistence;
- Restart the Redis server;
- Make sure the database contains the same keys as before;
- Make sure writes are being appended to the AOF file correctly.
Interaction between AOF and RDB
Redis 2.4 and later make sure that an AOF rewrite is not triggered while an RDB snapshot is being created, and that BGSAVE is not allowed while an AOF rewrite is in progress. This prevents two Redis background processes from doing heavy disk I/O at the same time.
When a snapshot is being created and the user explicitly requests a log rewrite with BGREWRITEAOF, the server immediately replies with an OK status code telling the user that the operation has been scheduled. The rewrite starts only once the snapshot is complete.
When both AOF and RDB persistence are enabled, Redis uses the AOF file to reconstruct the original dataset on restart.
Backing up Redis data
Before you start this section, remember one thing: make sure to back up your database. Disks break, cloud instances disappear, and so on: no backups means a huge risk of losing data.
Redis is very backup friendly, because you can copy RDB files while the database is running: an RDB file is never modified once it has been produced. While it is being produced it is written to a temporary file, and only when the new snapshot is complete is it renamed atomically to its final destination using rename(2).
This means that copying the RDB file is completely safe while the server is running. Here is what we suggest:
- Create a cron job on your server that saves hourly snapshots of the RDB file in one directory and daily snapshots in a different directory.
- Every time the cron script runs, use the find command to delete snapshots that are too old: for example, you can keep hourly snapshots for the last 48 hours and daily snapshots for one or two months. Make sure to include date and time information in the snapshot names.
- At least once a day, transfer an RDB snapshot outside your data center, or at least outside the physical machine running your Redis instance.
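The suggestions above describe a cron job combined with the find command. As an alternative illustration of the same rotation idea, the sketch below copies the RDB file under a timestamped name and deletes copies older than a given age. All function names, the `dump-YYYYMMDD-HHMMSS.rdb` naming scheme, and the directory layout are assumptions made for this example.

```python
import os
import shutil
import time
from datetime import datetime


def take_backup(rdb_path: str, backup_dir: str, now: datetime = None) -> str:
    """Copy the RDB file into backup_dir under a name that includes date and time."""
    now = now or datetime.now()
    os.makedirs(backup_dir, exist_ok=True)
    dest = os.path.join(backup_dir, f"dump-{now:%Y%m%d-%H%M%S}.rdb")
    # Copying is safe while the server runs: a finished RDB file is never modified.
    shutil.copy(rdb_path, dest)
    return dest


def prune_backups(backup_dir: str, max_age_seconds: float, now: float = None) -> list:
    """Delete .rdb backups whose modification time is older than max_age_seconds."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(backup_dir)):
        path = os.path.join(backup_dir, name)
        if name.endswith(".rdb") and now - os.path.getmtime(path) > max_age_seconds:
            os.remove(path)
            removed.append(path)
    return removed
```

An hourly job could call `take_backup(rdb_path, hourly_dir)` followed by `prune_backups(hourly_dir, 48 * 3600)`, and a daily job could do the same with a different directory and a longer retention period.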
Disaster recovery
Disaster recovery in the context of Redis is essentially the same process as data backup, plus transferring those backups to several external data centers. This way the data is safe even if some catastrophic event affects the main data center where Redis is running and producing its snapshots.
Since many Redis users are startups without much budget, we will review the most interesting disaster recovery techniques that do not cost too much.
- Amazon S3 and similar services are a good way to implement your disaster recovery system. Simply transfer your daily or hourly RDB snapshot to S3 in encrypted form. You can encrypt your data with gpg -c (symmetric encryption mode). Make sure to store your password in several different safe places (for example, give a copy to the most important people in your organization). Using multiple storage services is recommended for better data safety.
- Transfer your snapshots to remote servers using scp (part of SSH). This is a fairly simple and safe approach: get a small VPS in a location far from you, install SSH on it, generate an SSH client key without a passphrase, and add it to the authorized_keys file on your VPS (translator's note: this sets up SSH trust; on Linux you can generate a public/private key pair with the ssh-keygen command). You can then transfer backups automatically without entering a password. For best results, get at least two VPSes from different providers.
This system can easily fail if it is not set up correctly. At a minimum, verify the file size after the transfer completes (it should match the file you copied) and, if you use a VPS, check the SHA1 digest as well.
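The size and SHA1 check described above can be sketched as follows. The function names are hypothetical, and the sketch assumes you have already obtained the size and SHA1 digest of the remote copy by some other means (for example, by running a command on the VPS).

```python
import hashlib
import os


def sha1_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_ok(local_path: str, remote_size: int, remote_sha1: str) -> bool:
    """Check that the remote copy matches the local file in size and digest."""
    if os.path.getsize(local_path) != remote_size:
        return False  # cheap check first: a size mismatch is enough to fail
    return sha1_of(local_path) == remote_sha1.lower()
```

Checking the size first avoids hashing a large file when the transfer was obviously truncated.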
You also need an independent alert system that notifies you when the transfer of fresh backups fails for any reason.
Original article. If you reproduce it, please credit the source: Concurrent Programming Network (ifeve.com). Link to this article: "Redis Official Document: Persistence"