- Master writing memory snapshots: the SAVE command calls the rdbSave function, which blocks the main thread. When the snapshot is large the performance impact is severe and the service pauses intermittently, so the master should preferably not write memory snapshots.
- Master AOF persistence: if the AOF file is never rewritten, this form of persistence has minimal performance impact, but the AOF file keeps growing, and an overly large AOF file slows down recovery when the master restarts.
- Master calling BGREWRITEAOF to rewrite the AOF file: the rewrite consumes a large amount of CPU and memory, which drives the service load too high and causes brief service pauses.
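As a minimal sketch of the advice in the items above, the following Python helper renders redis.conf lines for a master that avoids snapshot work. The function name and the `keep_aof` switch are hypothetical, made up for this illustration; `save ""` and `appendonly` are the real redis.conf directives.

```python
def master_persistence_config(keep_aof: bool = False) -> str:
    """Render redis.conf lines for a master that avoids heavy persistence work.

    keep_aof is a hypothetical switch for this sketch, not a Redis option.
    """
    lines = [
        'save ""',  # an empty save directive disables automatic RDB snapshots
    ]
    # AOF is either left on (cheap until a rewrite is needed) or turned off.
    lines.append("appendonly yes" if keep_aof else "appendonly no")
    return "\n".join(lines) + "\n"


print(master_persistence_config())
```

Whether AOF stays on is a trade-off between durability on the master and the rewrite cost described above, so the sketch leaves that choice to the caller.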
Here is a case from one of my actual projects. The setup was roughly as follows: one master and 4 slaves, no sharding, only read/write separation. The master handled write operations and the AOF log backup, with an AOF file of about 5 GB, and the slaves handled read operations. When the master called BGREWRITEAOF, the load on both master and slaves spiked suddenly. The master basically stopped responding to write requests for about 5 minutes, and about half of the read requests on the slaves could not be answered in time. The server load graphs for master and slave are as follows:
[Figure: master server load]
[Figure: slave server load]
The situation above would not and should not have happened. It happened because the machine that is now the master used to be a slave, and it had a scheduled shell task that called BGREWRITEAOF every day at 10 a.m. to rewrite the AOF file. Later the original master machine went down and this backup slave was promoted to master, but nobody remembered to delete the scheduled task, which led to the tragedy above. It took several days to track down the cause.
Setting no-appendfsync-on-rewrite to yes can alleviate this problem. With yes, new write operations are not fsynced while a rewrite is in progress; they are held in memory temporarily and written out after the rewrite completes. Best of all is not to enable AOF backup on the master at all.
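One way to guard against a forgotten scheduled task like this is to have the task check the node's current role before it triggers a rewrite. The sketch below parses the text returned by `INFO replication`, which contains a `role:` field, and allows the rewrite only on a slave. The function names and the sample strings, including the host address, are hypothetical; actually running `INFO` and `BGREWRITEAOF` against a server is left out.

```python
def parse_role(info_replication: str) -> str:
    """Extract the role from the text returned by `INFO replication`."""
    for line in info_replication.splitlines():
        line = line.strip()
        if line.startswith("role:"):
            return line.split(":", 1)[1]
    raise ValueError("no 'role:' field found in INFO output")


def may_rewrite_aof(info_replication: str) -> bool:
    """Allow a scheduled BGREWRITEAOF only on a node that is currently a slave."""
    return parse_role(info_replication) == "slave"


# Hypothetical sample outputs, shortened to the fields that matter here.
sample_master = "# Replication\r\nrole:master\r\nconnected_slaves:4\r\n"
sample_slave = "# Replication\r\nrole:slave\r\nmaster_host:10.0.0.1\r\n"

print(may_rewrite_aof(sample_master))  # False
print(may_rewrite_aof(sample_slave))   # True
```

With a check like this, the task becomes a no-op once the machine is promoted to master, instead of depending on someone remembering to delete it.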
- Performance problems of Redis master-slave replication. The first time a slave synchronizes with the master, it works like this: the slave sends a synchronization request to the master, the master first dumps an RDB file, then transfers the whole RDB file to the slave, and finally forwards the commands it buffered in the meantime to the slave, which completes the initial synchronization. The second and subsequent synchronizations work like this: the master sends snapshots of the changed variables directly to each slave, in turn and in real time. Whenever a slave and the master are disconnected and reconnect, for whatever reason, the whole process above is repeated. Redis master-slave replication is built on memory-snapshot persistence, so as long as there is a slave there will be memory snapshots. Although Redis claims that master-slave replication is non-blocking, disk I/O is limited: if the master's snapshot file is large, the dump takes a long time, during which the master may be unable to respond to requests. In other words the service is interrupted, and for a critical service the consequences are dire.
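The initial synchronization steps described above can be illustrated with a toy model. This is a sketch for understanding the order of events only, not Redis code: the `ToyMaster` and `ToySlave` classes and their methods are hypothetical, a dict copy stands in for the RDB file, and a list stands in for the buffered commands.

```python
class ToyMaster:
    """Toy model of the master side of an initial sync; not Redis code."""

    def __init__(self):
        self.data = {}
        self.backlog = []      # writes buffered while a dump is in progress
        self.dumping = False

    def set(self, key, value):
        self.data[key] = value
        if self.dumping:
            self.backlog.append((key, value))

    def begin_dump(self):
        """Take a point-in-time snapshot (stands in for dumping the RDB file)."""
        self.dumping = True
        self.backlog = []
        return dict(self.data)

    def finish_dump(self):
        """Hand over the writes that arrived while the dump was in progress."""
        self.dumping = False
        pending, self.backlog = self.backlog, []
        return pending


class ToySlave:
    """Toy model of the slave side of an initial sync."""

    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        """Replace local data with the master's full snapshot."""
        self.data = dict(snapshot)

    def apply(self, commands):
        """Replay the commands the master buffered during the dump."""
        for key, value in commands:
            self.data[key] = value


master, slave = ToyMaster(), ToySlave()
master.set("a", 1)
snapshot = master.begin_dump()     # slave requested a sync; master dumps
master.set("b", 2)                 # a write that arrives during the dump
slave.load_snapshot(snapshot)      # full transfer of the snapshot
slave.apply(master.finish_dump())  # buffered commands complete the sync
print(slave.data == master.data)   # True
```

The model leaves out the cost that matters in practice: on a real master the dump touches the disk, which is where the long pauses described above come from.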
The root cause of the four problems above all comes down to a system I/O bottleneck: the hard disk cannot read and write fast enough, so the main process blocks in its fsync()/write() calls.
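To get a rough feel for how long a blocking write plus fsync takes on a given disk, a small measurement like the following can help. It is only a sketch: the function name and the 4 MiB payload size are arbitrary choices for illustration, and the result depends entirely on the machine and file system it runs on.

```python
import os
import tempfile
import time


def timed_write_fsync(payload: bytes) -> float:
    """Write payload to a temporary file, fsync it, and return elapsed seconds."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            start = time.perf_counter()
            f.write(payload)
            f.flush()             # push Python's buffer down to the OS
            os.fsync(f.fileno())  # block until the OS reports the data is on disk
            return time.perf_counter() - start
    finally:
        os.remove(path)


elapsed = timed_write_fsync(b"x" * (4 * 1024 * 1024))
print(f"write + fsync of 4 MiB took {elapsed * 1000:.2f} ms")
```

While the caller sits inside `os.fsync`, it can do nothing else, which is the same kind of stall a single-threaded server experiences when its main process has to wait for the disk.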
- Single point of failure. Because Redis master-slave replication is not yet mature enough, there is an obvious single-point-of-failure problem. For now you can only build a solution yourself, for example active replication, or a proxy that promotes a slave to replace the master. This is also one of the Redis author's current priority tasks, and the author's ideas for solving it are simple and elegant.
What are the common performance issues and solutions for Redis?