This article is part of a series by Simon Maynard, co-founder of Bugsnag. Drawing on several years of experience running Redis, the author gives a systematic summary of how to monitor it. It contains a lot of practical material and is well worth reading.
Original article: Redis Masterclass-Part 2, Monitoring
The most direct way to monitor Redis is the info command that Redis itself provides. Run the following command to get a status report for the Redis instance:
redis-cli info
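To feed this report into a monitoring script, it helps to turn it into key/value pairs first. The following is a minimal sketch, assuming the usual layout of the report (one field:value pair per line, with section headers that start with #). The function name parse_info and the sample text are illustrative, not part of Redis.

```python
def parse_info(text):
    """Parse the text printed by `redis-cli info` into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:  # ignore any line that is not a field:value pair
            fields[key] = value
    return fields


# Hypothetical excerpt of a report, used only to exercise the parser.
sample = "# Memory\r\nused_memory:1048576\r\nused_memory_peak:2097152\r\n"
info = parse_info(sample)
```

All values stay as strings, so the caller decides which fields to convert to numbers.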
Memory usage
If the memory used by Redis exceeds the available physical memory, the Redis process may be killed by the OOM killer. Use the info command to monitor used_memory and used_memory_peak, set a threshold for memory usage, and configure a corresponding alarm. Of course, an alarm is only a means to an end. What matters is planning ahead: decide what you will do when memory usage grows too large, whether that is clearing out unused cold data or migrating Redis to a more powerful machine.
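A threshold check of this kind might look like the sketch below. It assumes the info fields are already available as a dict of strings. The names check_memory, limit_bytes and warn_ratio, and the 80% default, are assumptions chosen for illustration. Pick a limit that matches the memory you are actually willing to give Redis.

```python
def check_memory(info, limit_bytes, warn_ratio=0.8):
    """Compare used_memory against a limit and flag when a ratio is crossed."""
    used = int(info["used_memory"])
    peak = int(info["used_memory_peak"])
    ratio = used / limit_bytes
    return {
        "used_bytes": used,
        "peak_bytes": peak,
        "ratio": ratio,
        "alarm": ratio >= warn_ratio,  # True means the threshold was reached
    }


# Hypothetical values: 900 bytes used out of a 1000-byte budget.
memory_report = check_memory(
    {"used_memory": "900", "used_memory_peak": "950"}, limit_bytes=1000
)
```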
Persistence
If Redis crashes because of a fault in the machine or in Redis itself, your only lifeline may be the RDB file written by the last dump. It is therefore important to monitor Redis dump files as well. Monitor rdb_last_save_time to learn when the last dump took place, and monitor rdb_changes_since_last_save to learn how many changes would be lost if a failure happened right now.
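Both fields can be checked together. The sketch below assumes the info fields are available as a dict of strings and that rdb_last_save_time is a Unix timestamp in seconds. The function name and both limits are hypothetical and should be set to whatever data loss you can tolerate.

```python
import time


def check_persistence(info, max_age_seconds, max_pending_changes, now=None):
    """Return the dump age, the unsaved change count, and any alarms."""
    now = time.time() if now is None else now
    age = now - int(info["rdb_last_save_time"])
    pending = int(info["rdb_changes_since_last_save"])
    alarms = []
    if age > max_age_seconds:
        alarms.append("last dump is too old")
    if pending > max_pending_changes:
        alarms.append("too many unsaved changes")
    return {"age_seconds": age, "pending_changes": pending, "alarms": alarms}


# Hypothetical values; `now` is fixed so the result is reproducible.
persistence_report = check_persistence(
    {"rdb_last_save_time": "100", "rdb_changes_since_last_save": "50"},
    max_age_seconds=600,
    max_pending_changes=10,
    now=1000,
)
```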
Master-slave Replication
If you have set up master-slave replication, you should monitor whether replication is healthy, mainly by watching master_link_status in the info output. If the value is up, synchronization is normal. If it is down, look at the other diagnostic fields in the output. For example:
role:slave
master_host:192.168.1.128
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
master_link_down_since_seconds:1356900595
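A check on these fields could be sketched as follows, assuming they have been read into a dict of strings. The function name check_replication and the wording of the message are illustrative only.

```python
def check_replication(info):
    """Return None when replication looks healthy, else a short message."""
    if info.get("role") != "slave":
        return None  # not a slave, so there is no master link to check
    if info.get("master_link_status") == "up":
        return None
    master = "{}:{}".format(info.get("master_host"), info.get("master_port"))
    down_for = info.get("master_link_down_since_seconds", "unknown")
    return "link to master {} is down (down for {} seconds)".format(master, down_for)


# The field values from the example output above.
replication_message = check_replication({
    "role": "slave",
    "master_host": "192.168.1.128",
    "master_port": "6379",
    "master_link_status": "down",
    "master_link_down_since_seconds": "1356900595",
})
```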
Fork Performance
When Redis persists data to disk, it forks a child process and relies on the copy-on-write behavior of fork to take a snapshot of memory at the lowest possible cost. Although the memory pages themselves are copy-on-write, the page table has to be copied at the moment of the fork, so fork stalls the main thread for a short time (all read and write operations stop). The length of this stall depends on how much memory the Redis instance is using. For an instance in the gigabyte range, the fork usually takes on the order of milliseconds. Monitor latest_fork_usec in the info output to see how long the most recent fork took.
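Because the stall grows with memory usage, it can be useful to track the fork time per gigabyte as well as the raw value. The sketch below assumes the info fields are available as a dict of strings. latest_fork_usec is in microseconds. The function name and the 100 ms default limit are assumptions, not recommended values.

```python
def check_fork(info, max_fork_ms=100.0):
    """Convert latest_fork_usec to milliseconds and relate it to memory size."""
    fork_ms = int(info["latest_fork_usec"]) / 1000.0
    used_gb = int(info["used_memory"]) / float(2 ** 30)
    # Avoid dividing by zero on an empty instance.
    ms_per_gb = fork_ms / used_gb if used_gb > 0 else None
    return {"fork_ms": fork_ms, "ms_per_gb": ms_per_gb, "alarm": fork_ms > max_fork_ms}


# Hypothetical values: a 20 ms fork on an instance using 2 GB.
fork_report = check_fork(
    {"latest_fork_usec": "20000", "used_memory": str(2 * 2 ** 30)}
)
```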
Consistent Configuration
Redis lets you change the configuration of a running instance with the config set command. This is convenient, but it also creates a problem: changes made this way are not written back to your configuration file, so when you restart Redis for any reason, the changes you made with config set are lost. Every time you use config set to change a setting, you should therefore make the same change in the configuration file. To guard against human error, we recommend monitoring the configuration: use the config get command to read the current runtime configuration and compare it with the values in redis.conf. If the two do not match, trigger an alarm.
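The comparison itself can be sketched as below. It assumes the runtime settings have already been collected with config get into a dict of strings, and that the configuration file uses the common "name value" layout with # comments. The function names are hypothetical. The sketch compares plain strings, so values that Redis reports in a normalized form (for example a memory size written with a unit suffix in the file) would need extra handling, as would directives that appear more than once.

```python
def parse_conf(text):
    """Read "name value" lines from a configuration file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        name = parts[0].lower()
        settings[name] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def config_drift(file_settings, runtime_settings):
    """Return {name: (file_value, runtime_value)} for settings that differ."""
    drift = {}
    for name, file_value in file_settings.items():
        if name in runtime_settings and runtime_settings[name] != file_value:
            drift[name] = (file_value, runtime_settings[name])
    return drift


# Hypothetical file contents and runtime values.
conf_text = "# network\nmaxclients 10000\ntimeout 0\n"
runtime = {"maxclients": "10000", "timeout": "300"}
drift = config_drift(parse_conf(conf_text), runtime)
```

A non-empty result is the condition on which the alarm would fire.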
Slow log
Redis provides the SLOWLOG command for retrieving the most recent slow log entries. The slow log is kept entirely in memory, so its overhead is small. In practice, we run the SLOWLOG command from a crontab task, write the entries to a file, and use Kibana to build real-time performance charts for monitoring.
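One way to prepare the entries for a file is to write each one as a line of JSON. The sketch below is not the pipeline described above, only an illustration. It assumes each entry starts with the four fields that SLOWLOG GET returns (a unique id, a Unix timestamp, the execution time in microseconds, and the command arguments) and ignores any further fields. The function name and the output field names are hypothetical.

```python
import json


def slowlog_to_json_lines(entries):
    """Turn slow log entries into JSON strings, one per entry."""
    lines = []
    for entry in entries:
        # Only the first four fields are used; later fields are ignored.
        entry_id, timestamp, duration_us, args = entry[:4]
        record = {
            "id": entry_id,
            "timestamp": timestamp,
            "duration_us": duration_us,
            "command": " ".join(str(arg) for arg in args),
        }
        lines.append(json.dumps(record, sort_keys=True))
    return lines


# Hypothetical entries used only to exercise the function.
slow_lines = slowlog_to_json_lines([
    (14, 1309448221, 15, ["ping"]),
    (13, 1309448128, 30, ["slowlog", "get", "100"]),
])
```

Each string can then be appended to the log file on its own line.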
It is worth mentioning that the time recorded in the Redis slow log covers only the time Redis spends executing a command. It does not include I/O time, such as the time spent receiving data from the client or sending data back to it. In addition, the Redis slow log should be read a little differently from the slow logs of other databases. In other databases, an occasional slow entry of 100 ms may be normal: those databases generally execute commands concurrently on multiple threads, so the performance of one command on one thread may not represent overall performance. Redis, however, is single-threaded. Once a slow entry appears, it may need immediate attention, and it is best to track down the specific cause.