Redis Research (11): Data Persistence
Redis's strong performance is largely due to the fact that it keeps all data in memory. To ensure that data is not lost when Redis restarts, the data must be synchronized from memory to the hard disk in some form. This process is called persistence.
Redis supports two persistence methods: RDB and AOF. Either can be used on its own, or the two can be combined.
1. RDB Mode
RDB persistence is implemented through snapshots. When certain conditions are met, Redis automatically takes a snapshot of all data in memory and stores it on the hard disk. The snapshot conditions can be customized by the user in the configuration file. Each condition consists of two parameters: a time window and a number of changed keys. When the number of keys changed within the specified time reaches the specified value, a snapshot is taken. RDB is the default persistence method adopted by Redis. Three conditions are preset in the configuration file:
save 900 1
save 300 10
save 60 10000
The save parameter specifies a snapshot condition. Multiple conditions can be listed, and they are combined with a logical "or": a snapshot is taken as soon as any one of them is met. For example, save 900 1 means that a snapshot is taken if at least one key has changed within 15 minutes (900 seconds).
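The "or" relationship between the save conditions can be illustrated with a minimal Python sketch. The function name should_snapshot and the representation of rules as (seconds, changes) pairs are assumptions made for illustration; this is a simplified model, not the Redis implementation.

```python
# Illustrative model of the three preset save rules: (seconds, min_changes).
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    rules=SAVE_RULES):
    """Return True if any rule is satisfied (the rules are OR-ed together)."""
    return any(
        seconds_since_last_save >= seconds
        and changes_since_last_save >= min_changes
        for seconds, min_changes in rules
    )


print(should_snapshot(120, 5))    # no rule is met yet
print(should_snapshot(301, 10))   # satisfies "save 300 10"
print(should_snapshot(900, 1))    # satisfies "save 900 1"
```

Passing an empty rule list models a configuration with every save line removed: the function then never asks for an automatic snapshot.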
To disable automatic snapshots, you only need to delete all the save parameters.
Redis stores the snapshot in the dump.rdb file in the current directory by default. You can set the storage directory and the file name of the snapshot with the dir and dbfilename parameters.
Understanding how Redis implements snapshots helps a great deal in understanding the properties of snapshot files. The snapshot process is as follows.
(1) Redis uses the fork function to create a copy (the child process) of the current process (the parent process);
(2) The parent process continues to receive and process commands sent by clients, while the child process starts writing the data in memory to a temporary file on the hard disk;
(3) After the child process has written all the data, it replaces the old RDB file with the temporary file. At this point, the snapshot operation is complete.
When fork is executed, the operating system (a Unix-like operating system) uses a copy-on-write policy. At the moment of the fork, the parent and child processes share the same memory. When the parent process needs to change a piece of data (for example, when executing a write command), the operating system first copies that piece of data so that the child process's view is not affected. As a result, the new RDB file stores the in-memory data as it was at the moment fork was executed.
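The fork-based snapshot steps can be sketched in Python on a Unix-like system. This is only an illustration of the idea: the snapshot and demo function names, the JSON file format, and the dump.json file name are assumptions for the example and have nothing to do with the real RDB format.

```python
import json
import os
import tempfile


def snapshot(data, path):
    """Fork a child that writes `data` to `path`; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: sees memory exactly as it was at the moment of the fork.
        status = 1
        try:
            tmp = f"{path}.tmp-{os.getpid()}"
            with open(tmp, "w") as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)  # swap the finished file in atomically
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    return pid


def demo():
    data = {"foo": 1}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dump.json")
        pid = snapshot(data, path)
        data["foo"] = 2        # the parent keeps serving writes
        os.waitpid(pid, 0)     # wait for the child to finish the snapshot
        with open(path) as f:
            return json.load(f)


if __name__ == "__main__":
    print(demo())  # the snapshot holds the value from fork time, not 2
```

Because the child only ever renames a fully written temporary file over the old one, a reader of the snapshot path never sees a half-written file.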
From the process above we can see that Redis never modifies the RDB file while a snapshot is in progress. The old file is replaced by the new one only after the snapshot has completed, which means that the RDB file is complete at any point in time. This allows us to back up a Redis database by backing up the RDB file on a regular basis. RDB files use a compressed binary format (compression can be disabled with the rdbcompression parameter to save CPU), so they occupy less space than the data does in memory and are easier to transfer.
In addition to automatic snapshots, you can send the SAVE or BGSAVE command manually to make Redis take a snapshot. The difference is that SAVE performs the snapshot in the main process and blocks all other requests, whereas BGSAVE forks a child process to take the snapshot.
After Redis is started, it reads the RDB snapshot file and loads the data from the hard disk to the memory. The time varies depending on the data size, structure, and server performance.
It usually takes 20 to 30 seconds to load a snapshot file with 10 million string-type keys and a size of 1 GB into memory.
With RDB persistence, if Redis exits unexpectedly, all data changed since the last snapshot is lost. Developers therefore need to choose automatic snapshot conditions that keep the possible data loss within an acceptable range for the specific application. If the data is so important that no loss can be tolerated, consider using AOF persistence.
2. AOF Mode
By default, Redis does not enable AOF (append-only file) persistence. You can enable it with the appendonly parameter:
appendonly yes
After AOF persistence is enabled, every time you execute a command that changes data in Redis, Redis writes that command to the AOF file on the hard disk. The AOF file is stored in the same location as the RDB file; both are set with the dir parameter. The default file name is appendonly.aof. You can change it with the appendfilename parameter:
appendfilename appendonly.aof
The following describes how AOF persistence is implemented. Assume that the following four commands are executed while AOF persistence is enabled:
SET foo 1
SET foo 2
SET foo 3
GET foo
Redis will write the first three commands to the AOF file. The contents of the AOF file are as follows:
*2
$6
SELECT
$1
0
*3
$3
set
$3
foo
$1
1
*3
$3
set
$3
foo
$1
2
*3
$3
set
$3
foo
$1
3
As you can see, the AOF file is a plain text file whose content is exactly the raw protocol content that the Redis client sends to Redis, and Redis has recorded only the first three commands. One problem, however, is that the first two commands are redundant, because their results are overwritten by the third command. As more and more commands are executed, the AOF file keeps growing, even if the actual data in memory is small. Naturally, we would like Redis to optimize the AOF file automatically; in the example above, that means deleting the first two useless records and keeping only the third. Redis does exactly this: whenever certain conditions are met, it automatically rewrites the AOF file. The conditions can be set in the configuration file:
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
The auto-aof-rewrite-percentage parameter means that a rewrite is triggered again when the current AOF file has grown by at least the given percentage relative to its size at the last rewrite. If no rewrite has taken place yet, the AOF file size at startup is used as the baseline.
The auto-aof-rewrite-min-size parameter sets the minimum AOF file size at which a rewrite is allowed. When the AOF file is small, we generally do not care much about redundant commands.
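How the two parameters interact can be shown with a small Python sketch. The function name should_rewrite and its arguments are hypothetical, and the growth formula is a simplified reading of the behavior described here rather than a copy of the Redis source.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite is due.

    current_size: current AOF size in bytes
    base_size:    AOF size at the last rewrite (or at startup)
    """
    if percentage <= 0 or current_size < min_size:
        return False  # disabled, or the file is still too small to bother
    base = max(base_size, 1)  # avoid division by zero for an empty baseline
    growth = (current_size - base) * 100 / base
    return growth >= percentage


print(should_rewrite(130 * MB, 64 * MB))  # grew by more than 100%
print(should_rewrite(100 * MB, 64 * MB))  # grew by roughly 56%
print(should_rewrite(10 * MB, 1 * MB))    # large growth, but below min_size
```

With the values from the configuration above, a rewrite is due once the file is at least 64 MB and at least twice as large as it was at the last rewrite.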
In addition to letting Redis rewrite automatically, we can also use the BGREWRITEAOF command to trigger an AOF rewrite manually.
In the preceding example, the AOF file contains the following after the rewrite:
*2
$6
SELECT
$1
0
*3
$3
SET
$3
foo
$1
3
The redundant commands have been deleted. The rewrite depends only on the data in memory and has nothing to do with the previous AOF file. In this respect it is very similar to RDB, except that the two file formats are completely different.
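The idea that a rewrite is generated from the data in memory, not from the old file, can be modelled with a toy Python class. ToyAof and encode_command are hypothetical names; the model handles only SET and GET on a single database and is not how Redis itself is structured.

```python
def encode_command(*args):
    """Encode one command in the length-prefixed text format shown above."""
    parts = [f"*{len(args)}\r\n"]
    for arg in args:
        text = str(arg)
        parts.append(f"${len(text.encode())}\r\n{text}\r\n")
    return "".join(parts)


class ToyAof:
    """Toy model of AOF persistence: a dict of data plus a command log."""

    def __init__(self):
        self.data = {}
        self.log = [encode_command("SELECT", 0)]

    def set(self, key, value):
        self.data[key] = str(value)
        self.log.append(encode_command("SET", key, value))

    def get(self, key):
        return self.data.get(key)  # read-only, so nothing is logged

    def rewrite(self):
        """Rebuild the log from the current data, ignoring the old log."""
        self.log = [encode_command("SELECT", 0)]
        self.log += [encode_command("SET", k, v) for k, v in self.data.items()]


aof = ToyAof()
for value in (1, 2, 3):
    aof.set("foo", value)
aof.get("foo")
entries_before = len(aof.log)  # SELECT plus three SET entries
aof.rewrite()
entries_after = len(aof.log)   # SELECT plus one SET entry
```

After the rewrite, the log contains a single SET for foo with the value 3, which mirrors the rewritten file in the example.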
At startup, Redis will execute commands in the AOF file one by one to load the data in the hard disk into the memory, which is slower than RDB.
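Replaying an AOF at startup can be sketched as parsing the recorded commands and applying them in order. The parse_commands and replay functions below are hypothetical helpers that understand only SET; splitting on line endings is a simplification that would break for values containing line breaks, which a real parser handles by reading exactly the announced number of bytes.

```python
def parse_commands(text):
    """Parse length-prefixed commands from AOF-style text into argument lists."""
    lines = text.split("\r\n")
    commands, i = [], 0
    while i < len(lines) and lines[i]:
        assert lines[i].startswith("*"), "expected an argument count"
        argc = int(lines[i][1:])
        i += 1
        args = []
        for _ in range(argc):
            assert lines[i].startswith("$"), "expected an argument length"
            length = int(lines[i][1:])
            arg = lines[i + 1]
            assert len(arg.encode()) == length, "length prefix mismatch"
            args.append(arg)
            i += 2
        commands.append(args)
    return commands


def replay(text):
    """Rebuild an in-memory dict by applying the SET commands in order."""
    data = {}
    for args in parse_commands(text):
        if args[0].upper() == "SET":
            data[args[1]] = args[2]
        # SELECT and any other command is ignored in this toy model.
    return data


aof_text = (
    "*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n"
    "*3\r\n$3\r\nset\r\n$3\r\nfoo\r\n$1\r\n1\r\n"
    "*3\r\n$3\r\nset\r\n$3\r\nfoo\r\n$1\r\n2\r\n"
    "*3\r\n$3\r\nset\r\n$3\r\nfoo\r\n$1\r\n3\r\n"
)
restored = replay(aof_text)
```

Every recorded command has to be executed again, including the two redundant SETs, which is one reason loading from an AOF takes longer than loading a snapshot.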
Note that although AOF records a command in the AOF file every time an operation changes the database content, the operating system's caching mechanism means that the data is not actually written to the hard disk immediately; it first goes into the system's disk cache. By default, the system performs a synchronization every 30 seconds to write the contents of the disk cache to the hard disk. If the system exits abnormally during those 30 seconds, the data in the disk cache is lost. Applications that enable AOF persistence generally cannot tolerate such a loss, so Redis needs to actively ask the system to synchronize the cached content to the hard disk after writing to the AOF file.
In Redis, we can use the appendfsync parameter to set when synchronization takes place:
# appendfsync always
appendfsync everysec
# appendfsync no
By default, Redis uses the everysec rule, which performs a synchronization once per second. always means that synchronization is performed on every write, which is the safest but also the slowest option. no means that Redis does not synchronize actively and leaves it entirely to the operating system (that is, once every 30 seconds); this is the fastest but least safe option. In general, the default everysec offers a good balance of performance and safety.
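The difference between the three appendfsync policies can be sketched in Python with os.fsync, which asks the operating system to flush its cache for a file to disk. The AofWriter class, its fsync_calls counter, and the count_fsyncs helper are hypothetical and exist only for illustration. The sketch checks the one-second interval at write time, which is a simplification and not how Redis schedules its synchronization.

```python
import os
import tempfile
import time


class AofWriter:
    """Append commands to a file and fsync according to a policy."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        assert policy in ("always", "everysec", "no")
        self.file = open(path, "ab")
        self.policy = policy
        self.clock = clock
        self.last_sync = clock()
        self.fsync_calls = 0  # counter kept for illustration only

    def _sync(self):
        os.fsync(self.file.fileno())  # ask the OS to flush its cache to disk
        self.last_sync = self.clock()
        self.fsync_calls += 1

    def append(self, payload):
        self.file.write(payload)
        self.file.flush()  # hand the bytes over to the OS cache
        if self.policy == "always":
            self._sync()
        elif self.policy == "everysec" and self.clock() - self.last_sync >= 1.0:
            self._sync()
        # policy "no": flushing is left entirely to the operating system

    def close(self):
        self.file.close()


def count_fsyncs(policy, writes=3):
    """Write a few commands under `policy` and return how many fsyncs ran."""
    with tempfile.TemporaryDirectory() as d:
        writer = AofWriter(os.path.join(d, "appendonly.aof"), policy)
        for i in range(writes):
            writer.append(b"SET foo %d\r\n" % i)
        writer.close()
        return writer.fsync_calls
```

Under always, every write pays for a disk synchronization; under no, none does; everysec sits in between by synchronizing at most about once per second.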
Redis allows AOF and RDB to be enabled at the same time, which both ensures data safety and makes operations such as backups easy (using the RDB file). In this case, after a restart Redis uses the AOF file to restore the data, because AOF persistence is likely to have lost less data.