Copyright notice: This is an original article by Chen Long. Please credit the source when reprinting.
Original article link: https://www.qcloud.com/community/article/127
Source: Tengyun https://www.qcloud.com/community
Redis is a high-performance in-memory database. This article focuses on problems encountered when running Redis in master-slave mode and how to tune for them, including some problems specific to cloud environments. How to use Redis and its data structures is not covered here; plenty of material is available online (for example, via Baidu).
Master node
Outside of a cluster deployment, Redis master-slave mode is used to keep the service highly available. In this mode all reads and writes go to the master, so to keep the master fast, I/O on it must be minimized. At the same time, master-slave disconnections caused by the network, and the frequent full synchronizations (fullsync) that follow, have to be taken into account. The main optimization points for the master are as follows:
Disable AOF on the master node.
Disabling AOF on the master is simple: run the following command on the master node.
config set appendonly no
Disable save on the master node.
The reason to disable save on the master is to avoid the service fluctuations caused by the bgsave runs that the save rules trigger. bgsave is very expensive. Redis's default save rules are "900 1", "300 10" and "60 10000"; under a heavy write load these rules make the master run bgsave frequently, and performance drops sharply. The command config set save "" disables write-triggered bgsave on the master and leaves data persistence to the standby (slave) machine. Even this does not eliminate bgsave completely: a bgsave is still triggered when a slave connects for the first time or reconnects after being disconnected for too long.
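The way save rules lead to a bgsave can be sketched in a few lines of Python. This is a simplified model, not Redis's own code: the function name `should_bgsave` and its parameters are made up for illustration, and the only inputs are the number of writes since the last save and the seconds elapsed since then.

```python
from typing import List, Tuple

# Redis's default save rules as (seconds, changes) pairs.
DEFAULT_SAVE_RULES: List[Tuple[int, int]] = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(rules: List[Tuple[int, int]],
                  writes_since_save: int,
                  seconds_since_save: int) -> bool:
    """Return True if any rule is met: at least `changes` writes happened
    and more than `seconds` seconds passed since the last save."""
    return any(
        writes_since_save >= changes and seconds_since_save > seconds
        for seconds, changes in rules
    )


# Heavy write load: 20000 writes within about a minute meets the "60 10000" rule.
print(should_bgsave(DEFAULT_SAVE_RULES, 20000, 61))
# With the rules cleared, writes alone never trigger a bgsave.
print(should_bgsave([], 20000, 61))
```

An empty rule list corresponds to a master with save disabled: under this model no write volume can trigger a bgsave.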
Inconsistent key counts after master-slave synchronization
Redis performs periodic key expiration only on the master and propagates the resulting deletions to the slave as commands. When there are many keys and many of them carry an expiration time, this expiration mechanism can leave the slave's key count different from the master's after synchronization, because expired keys are not synchronized to the slave. The root cause is that each expiration pass in the master's serverCron function looks at only 20 keys per loop by default. The default is defined in redis.h:
#define ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP 20 /* Loopkups per loop. */
One fix is to change this number and recompile. Alternatively, raise the hz setting in redis.conf so that serverCron runs more often.
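To get a feel for why raising hz helps, here is a rough back-of-the-envelope model in Python. It assumes one expiration loop per serverCron run per database, which is a simplification: the real cycle can repeat a loop when many sampled keys turn out to be expired, and it is bounded by a time budget. The function name is hypothetical.

```python
# Value of ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP in redis.h.
LOOKUPS_PER_LOOP = 20


def min_keys_checked_per_second(hz: int, databases: int = 1) -> int:
    """Rough lower bound on keys checked for expiration per second,
    assuming a single loop per serverCron run for each database."""
    return hz * LOOKUPS_PER_LOOP * databases


# hz defaults to 10 in redis.conf; compare it with a raised value.
for hz in (10, 100):
    print(hz, min_keys_checked_per_second(hz))
```

Under this model, going from hz 10 to hz 100 raises the floor from 200 to 2000 keys per second per database, at the cost of more CPU spent in serverCron.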
Frequent fullsync caused by master-slave disconnections when the send buffer is full
Redis maintains a send buffer for each connected client and limits its size (with a soft and a hard limit). When the send buffer fills up (exceeds the configured value), Redis closes the connection to protect itself. This creates a problem of its own: when the write volume is very large and the limit is set unreasonably, the master and slave disconnect frequently, and because of the large write volume the reconnecting slave cannot do a partial synchronization, so a full synchronization is triggered. To avoid this, adjust the client-output-buffer-limit parameter based on Redis's actual write volume and the network conditions. The specific values depend on those two factors; the parameter is set like this:
config set client-output-buffer-limit "slave 4295000768 4295000768 0"
Here slave denotes slave (replication) connections; ordinary clients use normal, and publish/subscribe clients use pubsub.
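The three numbers in a client-output-buffer-limit value are the hard limit in bytes, the soft limit in bytes, and the soft-limit duration in seconds; a value of 0 disables the corresponding limit. The sketch below models that decision in Python. It is a simplified illustration, not Redis's implementation, and the class and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class OutputBufferLimit:
    hard_bytes: int    # disconnect as soon as the buffer reaches this size
    soft_bytes: int    # disconnect if the buffer stays at or above this size...
    soft_seconds: int  # ...for longer than this many seconds


def should_disconnect(limit: OutputBufferLimit,
                      buffered_bytes: int,
                      seconds_over_soft: int) -> bool:
    """Simplified decision: hard limit first, then a sustained soft limit."""
    if limit.hard_bytes and buffered_bytes >= limit.hard_bytes:
        return True
    if limit.soft_bytes and buffered_bytes >= limit.soft_bytes:
        return seconds_over_soft > limit.soft_seconds
    return False


# The setting from the article: hard and soft limits are both about 4 GB.
slave_limit = OutputBufferLimit(4295000768, 4295000768, 0)
print(should_disconnect(slave_limit, 512 * 1024 * 1024, 0))  # 512 MB buffered
print(should_disconnect(slave_limit, 4295000768, 0))         # limit reached
```

With hard and soft limits equal, as in the command above, the hard limit always decides first and the soft duration has no practical effect.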
Replication backlog buffer (repl-backlog-size)
The replication backlog buffer holds the most recent write commands. It is created when a slave connects and defaults to 1 MB. It determines whether a slave that reconnects after a disconnection gets a full or a partial synchronization: if the slave's replication offset falls within the range covered by the backlog buffer, synchronization is partial; if it is below or above that range, synchronization is full. Adjust repl-backlog-size with the config set command according to actual conditions.
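The offset check and a rough sizing rule can be sketched as follows. This is a simplified Python model under stated assumptions, not Redis's code: offsets are plain byte counters, `backlog_len` is the number of bytes currently held in the backlog, and both function names are hypothetical. The sizing rule (write rate multiplied by the longest outage to tolerate) is a common rule of thumb rather than something the original article prescribes.

```python
def can_partial_resync(master_offset: int, backlog_len: int,
                       slave_offset: int) -> bool:
    """True if the next byte the slave needs is still in the backlog."""
    if backlog_len <= 0:
        return False
    oldest = master_offset - backlog_len + 1  # oldest offset still buffered
    wanted = slave_offset + 1                 # next byte the slave needs
    return oldest <= wanted <= master_offset + 1


def min_backlog_bytes(write_bytes_per_sec: int, max_outage_sec: int) -> int:
    """Rule-of-thumb backlog size that covers an outage of the given length."""
    return write_bytes_per_sec * max_outage_sec


one_mb = 1024 * 1024
# Slave fell behind by less than the 1 MB backlog: partial sync is possible.
print(can_partial_resync(2_000_000, one_mb, 1_500_000))
# Slave fell behind by more than the backlog holds: full sync.
print(can_partial_resync(2_000_000, one_mb, 500_000))
# 2 MB/s of writes and a 60-second outage to tolerate.
print(min_backlog_bytes(2 * one_mb, 60))
```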
Node liveness detection
In a high-availability system, checking whether a node is alive is very important: the detection logic has to find problems quickly and switch over quickly. There are many ways to do the detection. For Redis, node liveness is checked with process detection plus a service ping. Process detection confirms that the target process exists, but the existence of the process does not guarantee that the service is available, so a ping against the service node is added. In practice, some nodes were found to switch over for no obvious reason. Memory, network and I/O usage on those machines were all very low, but some CPU cores were fully loaded at the time of the switchover. Analysis of the slowlog on the switched node showed that a user had submitted a query taking up to several minutes at that moment. Because Redis is single-threaded, one time-consuming command made the ping time out and caused the switchover. The optimization is to increase the ping timeout and the number of ping attempts appropriately. This is a trade-off: problems must still be found quickly, but a slow command must not be misjudged as a failure and trigger a switchover.
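The "more ping attempts" idea can be illustrated with a small Python sketch that only declares a node dead after several consecutive failed pings. The function name and the threshold parameter are hypothetical, and the list of booleans stands in for real ping results; the article does not describe the actual detection code.

```python
from typing import Iterable


def should_failover(ping_results: Iterable[bool],
                    max_consecutive_failures: int) -> bool:
    """Return True once `max_consecutive_failures` pings fail in a row.
    A successful ping resets the failure counter."""
    failures = 0
    for ok in ping_results:
        failures = 0 if ok else failures + 1
        if failures >= max_consecutive_failures:
            return True
    return False


# A slow command makes two pings time out, then the node answers again.
history = [True, False, False, True, True]
print(should_failover(history, max_consecutive_failures=1))  # switches over
print(should_failover(history, max_consecutive_failures=3))  # rides it out
```

A higher threshold tolerates longer stalls but also delays detection of a real failure, which is the trade-off described above.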
Slave node
The slave node is mainly used to keep the data safe and, after the master node dies, to be promoted quickly to master and provide service. A slave node needs both RDB and AOF enabled, and the users' RDB files are shipped to the cold standby center on a fixed schedule.
When the slave is promoted to master, its settings are immediately changed to match the master node's configuration.
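One way to keep the two roles' settings in one place is a small table that is turned into config set commands on promotion. The Python sketch below is only an illustration: the function name is hypothetical, and the slave's save rules are assumed to be Redis's defaults, since the article does not say which rules the slave uses.

```python
# Per-role persistence settings. The master values follow the article
# (AOF off, save rules cleared); the slave save rules are an assumption.
ROLE_SETTINGS = {
    "master": {"appendonly": "no", "save": ""},
    "slave": {"appendonly": "yes", "save": "900 1 300 10 60 10000"},
}


def config_commands(role: str) -> list:
    """Build the config set commands that apply a role's settings."""
    return [
        f'config set {name} "{value}"'
        for name, value in ROLE_SETTINGS[role].items()
    ]


# Commands to run on a slave right after it is promoted to master.
for command in config_commands("master"):
    print(command)
```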
Redis Single-Master high-availability optimization