See note: https://www.zybuluo.com/sailorxiao/note/136014
Case scenario
In production, one machine was found to be under heavy memory load, and top showed a redis process occupying a large amount of memory. The relevant top output was:
PID   USER PR NI VIRT  RES SHR S %CPU %MEM TIME+
27190 root 20  0 18.6g 18g 600 S  0.3 59.2 926:17.83
Redis was occupying 18.6 GB of virtual memory (18 GB resident). Since Redis is only used to cache some program data, this was surprising. Running Redis's info command showed that the actual data occupied only 112 MB:
# Memory
used_memory:118140384
used_memory_human:112.67M
used_memory_rss:19903766528
used_memory_peak:17871578336
used_memory_peak_human:16.64G
used_memory_lua:31744
mem_fragmentation_ratio:168.48
mem_allocator:libc
So pmap -x 27190 was used to view the memory map of the Redis process; the result was as follows:
27190: ./redis-server ../redis.conf
Address           Kbytes      RSS    Dirty Mode  Mapping
0000000000400000      548      184        0 r-x-- redis-server
0000000000689000       16       16       16 rw--- redis-server
000000000068d000       80       80       80 rw--- [ anon ]
0000000001eb6000      132      132      132 rw--- [ anon ]
0000000001ed7000 19436648 19435752 19435752 rw--- [ anon ]
00007f5862cb2000        4        0        0 ----- [ anon ]
The output showed an enormous number of dirty pages. The cause of the problem was now clear: Redis's dirty pages were consuming a large amount of memory and pushing the system memory load too high. But why did Redis have so many dirty pages?
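The pmap figures above can be double-checked with a short script. This sketch simply sums the Dirty column of the output quoted above (values are in KB):

```python
# Sum the Dirty column (KB) of the pmap output quoted above
pmap_rows = """\
0000000000400000     548     184       0 r-x-- redis-server
0000000000689000      16      16      16 rw--- redis-server
000000000068d000      80      80      80 rw--- [ anon ]
0000000001eb6000     132     132     132 rw--- [ anon ]
0000000001ed7000 19436648 19435752 19435752 rw--- [ anon ]
00007f5862cb2000       4       0       0 ----- [ anon ]"""

# Column 3 (0-indexed) is Dirty
dirty_kb = sum(int(line.split()[3]) for line in pmap_rows.splitlines())
print(f"total dirty: {dirty_kb} KB (~{dirty_kb / 1024 / 1024:.1f} GB)")
# → total dirty: 19435980 KB (~18.5 GB)
```

Nearly all of it comes from the single large anonymous mapping, which matches the used_memory_rss reported by Redis.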
Case analysis
Look at the definition of a Linux dirty page:
A dirty page is a Linux kernel concept. Because disks are far slower than memory, the system keeps frequently read and written data in memory to speed up access; this is the page cache, and Linux manages it in units of pages. When a process modifies data in the page cache, the kernel marks that page as dirty, and at an appropriate time writes the dirty page back to disk so that the data in the cache stays consistent with the data on disk.
In other words, dirty pages are in-memory data that has not yet been written back to disk. Next, look at the Linux dirty-page flushing mechanism:
http://blog.chinaunix.net/uid-17196076-id-2817733.html
The flushing behavior can be tuned with the following settings (under /proc/sys/vm):
- dirty_background_bytes / dirty_background_ratio: when the system's dirty pages reach this threshold (in bytes or as a ratio), a background process starts flushing them to disk; memory reads and writes are not blocked. (When the bytes setting is present, the ratio is computed automatically.)
- dirty_bytes / dirty_ratio: when dirty pages reach this threshold (bytes or ratio), the writing process itself flushes them to disk, and memory writes may be blocked. (When the bytes setting is present, the ratio is computed automatically.)
- dirty_expire_centisecs: how long (in hundredths of a second) data in memory may stay inconsistent with disk before it is considered dirty and cleaned in the next flush. Inconsistency is judged against the file's inode timestamp on disk.
- dirty_writeback_centisecs: the interval (in hundredths of a second) at which the system flushes dirty pages to disk.
Checking the flush configuration of the current system revealed nothing unusual: dirty_background_ratio was 10%, dirty_ratio was 20%, dirty_writeback_centisecs was 5 s, and dirty_expire_centisecs was 30 s. So why were Redis's dirty pages not being flushed to disk?
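These settings can also be read programmatically. A minimal sketch (Linux only; it assumes the /proc/sys/vm files described above exist):

```python
from pathlib import Path

VM = Path("/proc/sys/vm")
NAMES = ["dirty_background_ratio", "dirty_ratio",
         "dirty_expire_centisecs", "dirty_writeback_centisecs"]

def dirty_settings():
    """Return the current dirty-page flush settings as integers."""
    return {name: int((VM / name).read_text()) for name in NAMES}

if __name__ == "__main__":
    for name, value in dirty_settings().items():
        # *_centisecs values are hundredths of a second; the ratios are percent
        print(f"{name} = {value}")
```

On the machine in this case, this would print ratios of 10 and 20, and centisecs values of 3000 and 500 respectively.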
Dirty pages normally arise from in-memory data waiting to be flushed to disk, so could Redis persistence be producing them? The relevant parts of the Redis configuration were reviewed:
# RDB persistence has been turned off
# save 900 1
# save 300 10
# save 60 10000

# AOF persistence is also turned off
appendonly no

# Maximum memory and the eviction policy are at their defaults
# maxmemory <bytes>
# maxmemory-policy volatile-lru
As shown above, Redis has persistence completely turned off (it is used purely as a cache), maxmemory is left at its default (meaning no limit), and the eviction policy is volatile-lru. The Redis documentation describes the eviction policies as follows:
- volatile-lru: evict the least recently used keys from the set of keys with an expire set (server.db[i].expires) (the default)
- volatile-ttl: evict the keys closest to expiration from the set of keys with an expire set (server.db[i].expires)
- volatile-random: evict random keys from the set of keys with an expire set (server.db[i].expires)
- allkeys-lru: evict the least recently used keys from the whole keyspace (server.db[i].dict)
- allkeys-random: evict random keys from the whole keyspace (server.db[i].dict)
- no-enviction: never evict data
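As an illustration of the volatile-lru policy above, here is a minimal sketch (not Redis's actual implementation, which approximates LRU by sampling keys): among keys that have an expire set, evict the one that was least recently used.

```python
def evict_volatile_lru(data, expires, last_access):
    """Evict one key: the least recently used among keys with an expire set.

    data: key -> value; expires: key -> expiry timestamp;
    last_access: key -> time of last read/write (higher = more recent).
    """
    candidates = [k for k in data if k in expires]
    if not candidates:
        return None  # no key carries an expiration, so nothing is evictable
    victim = min(candidates, key=lambda k: last_access[k])
    del data[victim]
    del expires[victim]
    return victim
```

Note that under this policy, keys without a TTL are never evicted; if memory fills up with such keys, nothing can be freed.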
In the current environment the program uses Redis as a cache, setting an expire timeout on the data and relying on Redis to delete it after it expires. Could the dirty pages therefore be caused by problems in the expired-data cleanup mechanism (for example, cleanup not happening promptly)? To answer this, we need to look at the strategy Redis uses when deleting expired data. Reference material:
- Memory release and expired-key deletion in Redis
- Removal of expired keys in Redis
Redis's expired-key removal mechanisms:
- Lazy deletion: a key is not deleted automatically when it expires; instead, every read of the key checks whether it has expired and deletes it if so. This guarantees deletions happen only when strictly necessary.
- Periodic deletion: a deletion pass runs at intervals, and the duration and frequency of these passes are limited to reduce their impact on CPU time.
Redis uses the combination of lazy deletion plus periodic deletion.
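The two mechanisms can be sketched together. This is a toy model under stated assumptions (the class and method names here are made up for illustration), not Redis's actual code:

```python
import random
import time

class LazyExpiringCache:
    """Minimal sketch of lazy + periodic expired-key deletion."""

    def __init__(self):
        self.data = {}     # key -> value
        self.expires = {}  # key -> absolute expiry timestamp

    def set(self, key, value, ttl=None):
        self.data[key] = value
        if ttl is not None:
            self.expires[key] = time.time() + ttl

    def get(self, key):
        # Lazy deletion: expiry is checked only when the key is read.
        exp = self.expires.get(key)
        if exp is not None and time.time() >= exp:
            del self.data[key]
            del self.expires[key]
            return None
        return self.data.get(key)

    def periodic_cleanup(self, sample_size=20, time_budget=0.001):
        # Periodic deletion: sample some keys that carry an expiration and
        # remove the stale ones, bounded by a time budget to limit CPU impact.
        deadline = time.time() + time_budget
        keys = list(self.expires)
        random.shuffle(keys)
        for key in keys[:sample_size]:
            if time.time() > deadline:
                break
            if time.time() >= self.expires[key]:
                del self.data[key]
                del self.expires[key]
```

The important consequence for this case: an expired key that is never read again is only reclaimed if a periodic pass happens to sample it, so expired data can linger in memory.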
Case diagnosis
The analysis above makes the problem fairly clear. The causes are:
- For some reason, Redis uses more and more memory (possibly because lazy deletion leaves a growing amount of expired data in memory, or for other reasons that depend on Redis's internals)
- Redis is used purely as a cache and does not actually read or write any files, so the operating system cannot flush its dirty pages to disk (there is no backing file to flush them to)
- Because maxmemory is not set, Redis can effectively grow until it reaches the machine's memory size, and only when its memory usage hits that limit does Redis clean up data itself (via the volatile-lru policy)
- So Redis's memory keeps growing, and its dirty pages keep accumulating (most of them may well be data that has already expired)
Case resolution
The most convenient and reasonable way to solve this problem is:
- Set a reasonable maxmemory for Redis so that Redis performs its own data cleanup
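As a sketch, the fix might look like the following redis.conf fragment (the 4gb limit is an illustrative value, not from the original case; pick a limit that leaves headroom for the rest of the machine):

```
# Cap Redis memory so eviction kicks in before the machine is exhausted
maxmemory 4gb
# Keep the existing policy: evict least-recently-used keys that have a TTL
maxmemory-policy volatile-lru
```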