Redis memory optimization, persistence, and master-slave replication
This article covers the memory optimization parameters of Redis, the advantages and disadvantages of each persistence mode, and the principle and configuration of master-slave replication.
I. Common memory optimization methods and parameters
Redis performance depends entirely on memory, so we need to know how to control and save memory.
First, do not enable the VM option of Redis, that is, the virtual memory function. It was originally designed as a policy for swapping data between memory and disk so that Redis could store more data than fits in physical memory. However, its memory management cost is very high, so the VM function should be disabled. Check that vm-enabled is set to no in your redis.conf file.
Next, it is best to set the maxmemory option in redis.conf. It tells Redis to start rejecting subsequent write requests once it has used the specified amount of physical memory. This parameter effectively protects Redis from swapping caused by excessive physical memory usage, which seriously degrades performance and can even crash the server.
In addition, Redis provides a set of parameters for different data types to control memory usage. The value of a Redis hash is internally a HashMap. If the map has only a few members, it is stored in a compact format similar to a one-dimensional linear array, which saves the memory overhead of a large number of pointers. This behavior is controlled by the following two items in the redis.conf configuration file:
hash-max-zipmap-entries 64
hash-max-zipmap-value 512
hash-max-zipmap-entries means that when the map inside the value has no more than this many members, it is stored in the linear compact format. The default value is 64: a value with no more than 64 members uses linear compact storage, and once this number is exceeded it is automatically converted to a real HashMap.
hash-max-zipmap-value means that linear compact storage is used only while the length of each member value in the map is no more than this many bytes.
If either of the two limits is exceeded, the value is converted into a real HashMap and no longer saves memory. So is a larger setting always better? The answer is no. The advantage of a HashMap is that lookups and updates take O(1) time, whereas the one-dimensional compact storage gives this up for O(n) time. With few members the impact is small; otherwise performance suffers seriously. The setting therefore has to be weighed carefully: it is the fundamental trade-off between time cost and space cost.
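The two limits above can be read as a simple decision rule. The following is a minimal sketch of that rule, not Redis's actual implementation. The function name `uses_compact_encoding` is hypothetical, and treating the limits as inclusive ("no more than") is an assumption taken from the wording above.

```python
def uses_compact_encoding(hash_value, max_entries=64, max_value_bytes=512):
    """Return True if a hash would stay in the compact linear format.

    Models the rule described in the text: both limits must hold.
    The inclusive comparison (<=) is an assumption.
    """
    if len(hash_value) > max_entries:
        return False
    return all(
        len(value.encode("utf-8")) <= max_value_bytes
        for value in hash_value.values()
    )


small = {"name": "alice", "city": "paris"}        # few members, short values
many = {f"field{i}": "x" for i in range(65)}      # exceeds the entries limit
big_value = {"blob": "x" * 513}                   # one value exceeds the byte limit

print(uses_compact_encoding(small))
print(uses_compact_encoding(many))
print(uses_compact_encoding(big_value))
```

Raising either limit keeps more hashes in the compact format, at the price of O(n) scans over larger structures.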
Similar parameters include:
list-max-ziplist-entries 512
Note: a list with no more than this many nodes uses the pointer-free compact storage format.
list-max-ziplist-value 64
Note: a list whose node values are each smaller than this many bytes uses the compact storage format.
set-max-intset-entries 512
Note: a set whose members are all numeric and that has no more than this many members uses the compact storage format.
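As a rough illustration of the set rule, the sketch below checks both conditions: every member is an integer and the set is small enough. The helper names are hypothetical. Restricting "numeric" to signed 64-bit integers written in plain decimal is an assumption of this sketch, and so is the inclusive limit.

```python
import re

INT64_MIN, INT64_MAX = -(2 ** 63), 2 ** 63 - 1


def is_int64(member):
    """True if the string is a plain decimal integer that fits in 64 bits."""
    if not re.fullmatch(r"-?[0-9]+", member):
        return False
    return INT64_MIN <= int(member) <= INT64_MAX


def set_uses_compact_encoding(members, max_entries=512):
    """Model of the rule: all members numeric and no more than max_entries."""
    return len(members) <= max_entries and all(is_int64(m) for m in members)


numeric_set = {"1", "2", "3"}
mixed_set = {"1", "two"}
large_set = {str(i) for i in range(513)}

print(set_uses_compact_encoding(numeric_set))
print(set_uses_compact_encoding(mixed_set))
print(set_uses_compact_encoding(large_set))
```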
The internal implementation of Redis does not do much to optimize memory allocation, so some memory fragmentation occurs, but in most cases this does not become the performance bottleneck of Redis. However, if most of the data stored in Redis is numeric, Redis uses shared integers internally to save memory allocation overhead. At startup, it allocates n numeric objects in advance and places them in a pool. If a stored value falls within the range covered by the pool, the object is taken directly from the pool and shared through reference counting. When a large number of numeric values are stored, this saves memory and improves performance to some extent. To change n, you have to modify the REDIS_SHARED_INTEGERS macro in the source code and recompile. The default value is 10000, and you can adjust it to your needs.
II. Persistence
Redis is an in-memory database that supports persistence. That is, Redis regularly needs to synchronize the data in memory to disk to ensure persistence. Redis supports two persistence methods: snapshotting (snapshot) and append-only file (AOF).
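The shared integer pool can be pictured with a small model. This is only a sketch of the idea, not Redis source code: the class `RedisObject` and the function `make_integer_object` are hypothetical names, and the pool range of 0 to n-1 is an assumption.

```python
class RedisObject:
    """Toy stand-in for a reference-counted value object."""

    def __init__(self, value):
        self.value = value
        self.refcount = 1


SHARED_INTEGERS = 10000  # mirrors the default of REDIS_SHARED_INTEGERS in the text

# Allocated once at startup; assumed here to cover 0 .. SHARED_INTEGERS - 1.
shared_pool = [RedisObject(i) for i in range(SHARED_INTEGERS)]


def make_integer_object(value):
    """Reuse a pooled object when possible, otherwise allocate a new one."""
    if 0 <= value < SHARED_INTEGERS:
        obj = shared_pool[value]
        obj.refcount += 1  # shared through reference counting
        return obj
    return RedisObject(value)


a = make_integer_object(42)
b = make_integer_object(42)
c = make_integer_object(20000)
d = make_integer_object(20000)

print(a is b)  # both names refer to the single pooled object
print(c is d)  # outside the pool range, each call allocates
```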
Snapshotting
Snapshotting is the default persistence method. It writes the data in memory to a binary file as a snapshot. The default file name is dump.rdb. Snapshot persistence can be configured to run automatically: Redis can create a snapshot automatically if at least m keys have been modified within n seconds. The following is the default snapshot configuration:
# take a snapshot after 900 seconds if at least 1 key was modified
save 900 1
# take a snapshot after 300 seconds if at least 10 keys were modified
save 300 10
# take a snapshot after 60 seconds if at least 10000 keys were modified
save 60 10000
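The save rules can be read as a list of (seconds, changes) pairs, any one of which triggers a snapshot. The sketch below models that check. The function name `should_snapshot` is hypothetical, and how Redis treats the exact boundary of the time interval is not modeled here.

```python
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]  # the default rules listed above


def should_snapshot(seconds_since_last_save, changed_keys, rules=SAVE_RULES):
    """Return True if any rule is satisfied.

    A rule (seconds, changes) is satisfied once that many seconds have
    passed since the last save and at least `changes` keys were modified.
    """
    return any(
        seconds_since_last_save >= seconds and changed_keys >= changes
        for seconds, changes in rules
    )


print(should_snapshot(1000, 1))    # the 900-second rule applies
print(should_snapshot(100, 5))     # too few changes for the 60-second rule
print(should_snapshot(61, 20000))  # heavy write load triggers the 60-second rule
```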
You can also run the following command to enable redis to perform snapshotting:
redis-cli -h ip -p port bgsave
Both the save and bgsave commands tell Redis to take a snapshot. However, save performs the snapshot in the main thread. Because Redis uses a single main thread to process all client requests, save blocks all client requests and is therefore not recommended.
The snapshot generation process is roughly as follows:
Redis calls fork, so there are now a child process and a parent process. The parent process continues to process client requests, while the child process is responsible for writing the memory contents to a temporary file. Thanks to the operating system's copy-on-write mechanism, the parent and child processes share the same physical pages. When the parent process handles a write request, the OS creates a copy of the page to be modified for the parent instead of writing to the shared page. Therefore, the data in the child process's address space is a snapshot of the entire database at the moment of the fork. After the child process has written the snapshot to the temporary file, it replaces the original snapshot file with the temporary file and then exits.
Snapshotting also has a shortcoming: because there is a time interval between two snapshot operations, the data in the snapshot file is not fully up to date when the database fails. Everything written between the last snapshot and the Redis downtime is lost. If your business cannot tolerate such loss, you have to adopt the AOF persistence mechanism.
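The fork-based process above can be reproduced in miniature on a Unix-like system. The following sketch is an illustration only: it uses JSON instead of the rdb format, and the function name `background_save` and the file name `dump.json` are hypothetical.

```python
import json
import os
import tempfile


def background_save(data, path):
    """Fork a child that writes `data`, as it was at fork time, to `path`."""
    pid = os.fork()
    if pid == 0:
        # Child: its address space is a copy-on-write snapshot of the parent's.
        tmp_path = path + ".tmp"
        try:
            with open(tmp_path, "w") as f:
                json.dump(data, f)
            os.replace(tmp_path, path)  # swap in the new snapshot file
            os._exit(0)
        except Exception:
            os._exit(1)
    return pid  # Parent: returns immediately and keeps serving requests


snapshot_path = os.path.join(tempfile.mkdtemp(), "dump.json")
db = {"counter": 1}

child_pid = background_save(db, snapshot_path)
db["counter"] = 2  # the parent handles a write after the fork

_, status = os.waitpid(child_pid, 0)
with open(snapshot_path) as f:
    snapshot = json.load(f)

print(snapshot)  # state at the moment of the fork
print(db)        # state in the parent after the later write
```

The write made by the parent after the fork does not appear in the snapshot, which is the behavior the text attributes to copy-on-write.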
AOF
AOF gives better durability than snapshotting. With AOF persistence, Redis appends every write command it receives to a file (appendonly.aof by default) through the write function. When Redis restarts, it re-executes the write commands saved in the file to rebuild the entire database in memory. Because the OS caches write modifications in the kernel, they may not be written to disk immediately, so AOF persistence may also lose some modifications. However, we can use the configuration file to tell Redis when to force the OS to write to disk through the fsync function. There are three options (the default is to fsync once per second):
# enable the AOF persistence mode
appendonly yes
# force a write to disk as soon as each write command is received; slowest, but guarantees full persistence; not recommended
# appendfsync always
# force a write to disk once per second; a good compromise between performance and persistence; recommended
appendfsync everysec
# rely entirely on the OS; best performance, but persistence is not guaranteed
# appendfsync no
The AOF method brings another problem: the persistence file keeps growing. For example, if we call incr test 100 times, all 100 commands must be saved in the file, although 99 of them are redundant, because a single set test 100 in the file is enough to restore the database state. To compress the AOF file, Redis provides the bgrewriteaof command. After receiving this command, Redis saves the data in memory to a temporary file as commands, in a way similar to taking a snapshot, and finally replaces the original file. The bgrewriteaof command is invoked as follows:
redis-cli -h ip -p port bgrewriteaof
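To make the three fsync options concrete, here is a minimal sketch of an append-only log with replay. It is not how Redis implements AOF: the class `AppendOnlyFile`, the function `replay`, and the one-command-per-line file format are all assumptions made for illustration, and the once-per-second check is done inline rather than in the background.

```python
import os
import tempfile
import time


class AppendOnlyFile:
    """Toy append-only log supporting the three fsync policies."""

    def __init__(self, path, fsync_policy="everysec"):
        self.file = open(path, "a")
        self.fsync_policy = fsync_policy
        self.last_fsync = time.monotonic()

    def append(self, command):
        self.file.write(command + "\n")
        self.file.flush()  # hands the data to the OS, which may still cache it
        now = time.monotonic()
        if self.fsync_policy == "always" or (
            self.fsync_policy == "everysec" and now - self.last_fsync >= 1.0
        ):
            os.fsync(self.file.fileno())  # force the OS to write to disk
            self.last_fsync = now
        # Policy "no": never call fsync; the OS decides when to write.

    def close(self):
        self.file.close()


def replay(path):
    """Rebuild the dataset by re-executing the logged commands."""
    db = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "set":
                db[parts[1]] = parts[2]
            elif parts[0] == "incr":
                db[parts[1]] = str(int(db.get(parts[1], "0")) + 1)
            elif parts[0] == "del":
                db.pop(parts[1], None)
    return db


aof_path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
aof = AppendOnlyFile(aof_path, fsync_policy="always")
for command in ["set name redis", "incr hits", "incr hits", "del name"]:
    aof.append(command)
aof.close()

restored = replay(aof_path)
print(restored)
```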
The bgrewriteaof command execution process is as follows:
Redis calls fork, so there are now two processes: parent and child. Based on the database snapshot in its memory, the child process writes the commands needed to rebuild the database state to a temporary file. The parent process continues to process client requests. Besides writing the incoming write commands to the original AOF file, it also caches them. This ensures that nothing goes wrong if the child process fails to complete the rewrite. When the child process has written the snapshot content to the temporary file in command form, it signals the parent process. The parent process then writes the cached write commands to the temporary file as well. Finally, the parent process replaces the old AOF file with the temporary file by renaming it, and subsequent write commands are appended to the new AOF file.
The two persistence methods each have their own characteristics. Snapshotting has relatively little impact on performance, but a crash can lose a large amount of data. AOF offers high data safety but has a greater impact on performance. Choose according to the characteristics of your business.
III. Master-slave replication
Redis implements master-slave replication through its persistence rdb file. The master first dumps an rdb file and transmits the full rdb file to the slave, and then synchronizes the operations performed after the dump to the slave in real time.
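The rewrite can be sketched with in-memory lists standing in for the files. This is a simplified model under stated assumptions: only integer-valued set and incr commands are handled, no real fork takes place, and the names `apply_command` and `rewrite_aof` are hypothetical.

```python
def apply_command(db, command):
    """Execute one logged command against a dict (integer values only)."""
    parts = command.split()
    if parts[0] == "set":
        db[parts[1]] = int(parts[2])
    elif parts[0] == "incr":
        db[parts[1]] = db.get(parts[1], 0) + 1


def rewrite_aof(old_commands, commands_during_rewrite):
    """Return a compact command list equivalent to the old log.

    The state rebuilt from `old_commands` plays the role of the child's
    snapshot; `commands_during_rewrite` is the parent's cache of writes
    that arrived while the child was working.
    """
    snapshot = {}
    for command in old_commands:
        apply_command(snapshot, command)
    new_log = [f"set {key} {value}" for key, value in snapshot.items()]
    new_log.extend(commands_during_rewrite)  # parent appends its cached writes
    return new_log


old_log = ["incr test"] * 100
cached = ["incr test"]  # arrived while the rewrite was running
new_log = rewrite_aof(old_log, cached)

final_state = {}
for command in new_log:
    apply_command(final_state, command)

print(new_log)
print(final_state)
```

The 100 incr commands collapse into a single set, and the cached write is preserved after it, so replaying the new log yields the same state as replaying the old log plus the cached commands.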
To use the Master/slave function, you must perform simple configuration on the slave side:
# If this machine is a Redis slave instance, enable this setting
slaveof master_ip master_port
# If the slave cannot synchronize with the master, make the slave unreadable so that monitoring scripts can detect the problem easily
slave-serve-stale-data no
For a master-slave(master)-slave chain, in which a slave also acts as the master of another slave, the configuration file can be set up in the same way as in the plain master-slave mode.
After the configuration is complete, start the slave end to perform master-slave replication. The master-slave replication process is roughly as follows:
The slaveof setting has been added to the Slave's configuration file, so the Slave reads it at startup and its initial state is REDIS_REPL_CONNECT.
In the serverCron task (the timer-triggered event inside Redis), the Slave connects to the Master, sends the sync command, and then blocks while waiting for the Master to send back its memory snapshot file (the latest version of Redis no longer needs to block the Slave).
When the Master receives the sync command, it checks whether a memory snapshot child process is already running. If not, it starts a memory snapshot immediately; if so, it waits for that one to finish. When the snapshot is complete, the file is sent to the Slave.
The Slave receives the memory snapshot file from the Master and saves it locally. After receiving it, the Slave clears its in-memory tables, reads the snapshot file, rebuilds the data structures of the entire in-memory dataset, and sets its final state to REDIS_REPL_CONNECTED. The Slave state machine is then complete.
While the Master is sending the snapshot file, any commands that change the dataset are first held in the send buffer queue (a list data structure) of the network connection to the Slave. After the snapshot has been sent, they are sent to the Slave in order, and commands received afterwards are handled in the same way. The Master then sets the status to REDIS_REPL_ONLINE.
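The flow above can be simulated in a few lines. This is a toy model, not the Redis replication protocol: the classes `Master` and `Slave` and their methods are hypothetical, the network transfer is replaced by passing Python objects, and only the two slave-side state names from the text are used.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.slave_buffers = []  # one send buffer queue per attached slave

    def write(self, key, value):
        self.data[key] = value
        for buffer in self.slave_buffers:
            buffer.append((key, value))  # queue the change for each slave

    def start_sync(self):
        """Handle a sync request: snapshot now, buffer later changes."""
        snapshot = dict(self.data)
        buffer = []
        self.slave_buffers.append(buffer)
        return snapshot, buffer


class Slave:
    def __init__(self):
        self.state = "REDIS_REPL_CONNECT"
        self.data = {"stale": "value"}

    def load_snapshot(self, snapshot):
        self.data.clear()  # clear the in-memory tables first
        self.data.update(snapshot)
        self.state = "REDIS_REPL_CONNECTED"

    def apply_buffered(self, buffer):
        while buffer:
            key, value = buffer.pop(0)  # apply changes in arrival order
            self.data[key] = value


master = Master()
master.write("a", "1")

slave = Slave()
snapshot, buffer = master.start_sync()
master.write("b", "2")        # arrives while the snapshot is in transit
slave.load_snapshot(snapshot)
slave.apply_buffered(buffer)  # buffered commands follow the snapshot
master.write("c", "3")        # later commands take the same path
slave.apply_buffered(buffer)

print(slave.state)
print(slave.data == master.data)
```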
The entire replication process is completed, as shown in the following figure:
From the replication process above, we can see that whenever a Slave connects to the Master, the Master takes a memory snapshot and sends the entire snapshot file to the Slave. In other words, there is no replication position as in MySQL, so there is no incremental replication. If multiple slaves connect to one master, the master's performance will be affected.
IV. Data backup policy
The specific backup policy can be flexible. For example, it can be roughly as follows:
To improve the performance of the master node, disable automatic persistence on the master. Instead, use the bgsave command on a schedule to take a snapshot during the low-traffic early morning hours, and save the snapshot file to the backup server.
Enable the AOF mechanism on the slave, compact the file with bgrewriteaof at regular intervals, and save the compacted data file to the backup server.
Regularly check whether the data on the master and the slave is consistent.
When the master fails and needs to be restored from its own backup snapshot, copy the backup dump.rdb directly to the corresponding path and restart. To restore from the slave, take a snapshot on the slave, copy the snapshot file to the master's path, and then restart the master. Note, however, that when the master restarts, the slave's data will be overwritten, so back up the data on the slave before restarting the master.
Persistence disk I/O mode and its problems
People with experience operating Redis online will have noticed that when Redis uses a lot of physical memory, it can become unstable or even crash although it has not exceeded the total physical memory capacity. Some people believe that the fork system call used for snapshot persistence doubles the memory usage. This is inaccurate: the copy-on-write mechanism behind fork works at the granularity of operating system pages, so only dirty pages that have been written are copied, and a typical system does not write to all pages within a short period of time and trigger a full copy. So what causes Redis to crash?
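The file-copy part of this policy could be scripted along the following lines. This is a sketch under several assumptions: the function names and the timestamped naming scheme are hypothetical, the demo uses a placeholder file in a temporary directory rather than a real dump.rdb, and triggering bgsave and stopping or restarting Redis are left out.

```python
import os
import shutil
import tempfile
import time


def backup_snapshot(rdb_path, backup_dir):
    """Copy a finished snapshot file into the backup directory."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(backup_dir, f"dump-{stamp}.rdb")
    shutil.copy2(rdb_path, target)
    return target


def restore_snapshot(backup_path, redis_dir, dbfilename="dump.rdb"):
    """Copy a backup into the working directory before Redis is restarted."""
    target = os.path.join(redis_dir, dbfilename)
    shutil.copy2(backup_path, target)
    return target


# Demo with throwaway directories and a placeholder snapshot file.
root = tempfile.mkdtemp()
redis_dir = os.path.join(root, "redis")
os.makedirs(redis_dir)
rdb_file = os.path.join(redis_dir, "dump.rdb")
with open(rdb_file, "wb") as f:
    f.write(b"placeholder snapshot bytes")

saved = backup_snapshot(rdb_file, os.path.join(root, "backup"))
os.remove(rdb_file)  # simulate losing the snapshot on the master
restored = restore_snapshot(saved, redis_dir)

print(os.path.basename(restored))
```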
The answer is that Redis uses buffered I/O for persistence. Buffered I/O means that Redis goes through the physical-memory page cache when writing and reading persistence files, whereas most database systems use direct I/O to bypass the page cache and maintain their own data cache. When the persistence files are large (especially snapshot files) and are being read or written, the operating system loads the file data into physical memory as a cache layer for the file. The data in this cache layer duplicates the data that Redis already manages in memory. Although the kernel evicts page cache when physical memory is tight, it may decide that some page cache is more important and swap out your process instead. At that point the system becomes unstable or crashes. Experience suggests that it becomes dangerous once Redis uses more than 3/5 of the total physical memory.
1. Snapshot persistence to disk
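The 3/5 rule of thumb is easy to turn into a check. The function below is a hypothetical helper for illustration; how the two byte counts are obtained on a real system is left open.

```python
def memory_usage_is_risky(used_bytes, total_bytes):
    """True once usage exceeds 3/5 of total memory (integer arithmetic)."""
    return used_bytes * 5 > total_bytes * 3


GIB = 1024 ** 3

# On a 16 GiB machine the threshold is 9.6 GiB.
print(memory_usage_is_risky(9 * GIB, 16 * GIB))
print(memory_usage_is_risky(10 * GIB, 16 * GIB))
```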
Automatic persistence rule configuration
save 900 1
save 300 10
save 60 10000
The preceding configuration rules are as follows:
# In the example below the behaviour will be to save:
# After 900 sec (15 min) if at least 1 key changed
# After 300 sec (5 min) if at least 10 keys changed
# After 60 sec if at least 10000 keys changed
To disable automatic persistence, comment out these save lines or use save "".
If an error occurs while saving data to disk in the background, Redis stops accepting write operations.
stop-writes-on-bgsave-error yes
Use LZF to compress rdb files, which costs some CPU but reduces disk usage.
rdbcompression yes
A checksum is written when saving the rdb file and verified when loading it, which guards against errors. It costs about 10% in performance, so you can disable it to improve performance.
rdbchecksum yes
Name of the exported rdb file
dbfilename dump.rdb
Set the working directory. The rdb file is written to this directory, and the append-only file is stored here as well.
dir ./
Whether Redis saves a snapshot to disk automatically or bgsave is called, the work is done by a background process, and other clients can still read from and write to the Redis server. Saving a snapshot to disk in the background can occupy a large amount of memory. Calling save to write the in-memory data to disk blocks client requests until the save is complete.
When the shutdown command is called, the Redis server first calls save, and only exits after all data has been persisted to disk.
On the problem of data loss:
If the server crashes, all data written after the last snapshot is lost. Therefore, when setting the save rules, choose the acceptable window of loss according to the actual business.
For data-sensitive services, write appropriate logs in the application so that data can be restored after a server crash.
2. Append-only file persistence
The other method is incremental: every operation that changes data is persisted to the file. When Redis restarts, the data is restored by replaying the recorded commands.
After each write command is executed, the data is written to server.aofbuf.
# appendfsync always
appendfsync everysec
# appendfsync no
When configured as always, the data in server.aofbuf is written to the file before the result is returned to the client. This ensures that data is not lost, but the frequent I/O operations reduce performance.
everysec writes the data once per second, which may lose the operations from the last second.
The biggest problem with AOF is that the append-only file becomes very large over time. Therefore, we need the bgrewriteaof command to reorganize the file and keep only the latest key-value data.