A record of troubleshooting a large number of dirty pages in Redis

Source: Internet
Author: User
Tags: allkeys

See note: https://www.zybuluo.com/sailorxiao/note/136014

Case scenario

A production machine was found to be under heavy memory load, and top showed a Redis process occupying a large amount of memory. The top output was as follows:

27190   root    20   0  18.6g   18g  600 S  0.3     59.2    926:17.83   

Redis was occupying 18.6 GB of virtual memory, with 18 GB resident. Since this Redis instance is only used to cache some program data, that seemed strange, so Redis's info command was run, which showed that the actual data occupied only 112 MB:

    # Memory
    used_memory:118140384
    used_memory_human:112.67M
    used_memory_rss:19903766528
    used_memory_peak:17871578336
    used_memory_peak_human:16.64G
    used_memory_lua:31744
    mem_fragmentation_ratio:168.48
    mem_allocator:libc
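The reported fragmentation ratio is simply resident memory divided by logical used memory, which can be checked directly from the numbers above (a quick sanity check, using the values from this INFO output):

```python
# Values taken from the INFO output above
used_memory = 118_140_384          # bytes of live data (112.67M)
used_memory_rss = 19_903_766_528   # bytes resident in RAM (~18.5G)

# mem_fragmentation_ratio = RSS / used_memory; values far above 1 mean
# the process holds much more physical memory than the data it stores
mem_fragmentation_ratio = used_memory_rss / used_memory
print(f"{mem_fragmentation_ratio:.2f}")  # 168.48
```

A healthy cache typically sits close to 1; a ratio of 168 means almost all of the resident memory is not live data.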

So pmap -x 27190 was used to view the memory map of the Redis process; the result was as follows:

  27190:   ./redis-server ../redis.conf
  Address             Kbytes      RSS         Dirty       Mode        Mapping
  0000000000400000    548         184         0           r-x--       redis-server
  0000000000689000    16          16          16          rw---       redis-server
  000000000068d000    80          80          80          rw---       [ anon ]
  0000000001eb6000    132         132         132         rw---       [ anon ]
  0000000001ed7000    19436648    19435752    19435752    rw---       [ anon ]
  00007f5862cb2000    4           0           0           -----       [ anon ]

A huge number of dirty pages was found (about 19 GB in a single anonymous mapping). The symptom is now clear: Redis's dirty pages consume a lot of memory, driving the system memory load up. But why does Redis have so many dirty pages?

Case analysis

Look at the definition of a Linux dirty page:

A dirty page is a concept in the Linux kernel. Because disk read/write speed is far slower than memory, the system keeps frequently accessed data in memory to speed up reads and writes; this is called the page cache, and Linux manages it in units of pages. When a process modifies data in the page cache, the kernel marks that page as dirty, and at a suitable time writes the dirty page's data back to disk so that the page cache stays consistent with the disk.

That is, a dirty page is a page in memory whose contents have not yet been written back to disk. Looking at the Linux dirty-page flush mechanism:
http://blog.chinaunix.net/uid-17196076-id-2817733.html
it turns out that flushing can be tuned via parameters under /proc/sys/vm:

dirty_background_bytes / dirty_background_ratio:
    - When the system's dirty pages reach a threshold (bytes or ratio), a background process starts flushing dirty pages to disk; memory reads and writes are not blocked (when the bytes setting is present, the ratio is computed automatically)
dirty_bytes / dirty_ratio:
    - When the system's dirty pages reach a threshold (bytes or ratio), a process starts flushing dirty pages to disk, and memory writes may be blocked (when the bytes setting is present, the ratio is computed automatically)
dirty_expire_centisecs:
    - How long data in memory may stay inconsistent with disk (in hundredths of a second) before it is considered dirty and cleaned in the next flush; the inconsistency is judged by the timestamp of the file's inode on disk
dirty_writeback_centisecs:
    - The interval (in hundredths of a second) at which the system flushes dirty pages to disk

Checking the current system's flush configuration turned up nothing unusual: dirty_background_ratio is 10%, dirty_ratio is 20%, dirty_writeback_centisecs is 5 s, and dirty_expire_centisecs is 30 s. So why are Redis's dirty pages not being flushed to disk?
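As a quick check on the units, the two centisecond parameters convert to the seconds quoted above like this (values shown are the ones reported on this system):

```python
# /proc/sys/vm values ending in _centisecs are in hundredths of a second
dirty_writeback_centisecs = 500   # interval between flusher wakeups
dirty_expire_centisecs = 3000     # age before a dirty page must be written back

print(dirty_writeback_centisecs / 100)  # 5.0 seconds
print(dirty_expire_centisecs / 100)     # 30.0 seconds
```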

In general, dirty pages come from in-memory data waiting to be flushed to disk, so could Redis persistence be producing the dirty pages? Review the relevant parts of the Redis configuration:

  # RDB persistence is disabled
  # save 900 1
  # save 300 10
  # save 60 10000
  # AOF (append-only) persistence is also disabled
  appendonly no
  # Max memory and the eviction policy are left at their defaults
  # maxmemory <bytes>
  # maxmemory-policy volatile-lru

As shown above, Redis itself has persistence completely turned off (it is used purely as a cache), maxmemory is left at its default (meaning no limit), and the eviction policy is volatile-lru. The Redis documentation describes the eviction policies as follows:

  volatile-lru:      evict the least recently used keys from the set of keys with an expire set (server.db[i].expires) (the default)
  volatile-ttl:      evict the keys closest to expiry from the set of keys with an expire set (server.db[i].expires)
  volatile-random:   evict random keys from the set of keys with an expire set (server.db[i].expires)
  allkeys-lru:       evict the least recently used keys from the whole dataset (server.db[i].dict)
  allkeys-random:    evict random keys from the whole dataset (server.db[i].dict)
  noeviction:        never evict data
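To make the difference between the volatile-* and allkeys-* families concrete, here is a toy sketch (not Redis's actual implementation, which uses approximate LRU sampling over an internal dictionary): volatile-lru only ever considers keys that have an expire set, so a key without one can never be evicted under it.

```python
# Toy keyspace: name -> (last_access_time, has_expire)
keys = {
    "session:1": (100.0, True),   # expirable, least recently used of the expirable keys
    "session:2": (200.0, True),   # expirable, accessed more recently
    "config:x":  (50.0,  False),  # no expire set, even though it is the oldest
}

def pick_victim_volatile_lru(keys):
    """volatile-lru: least recently used key among those with an expire set."""
    candidates = {k: v for k, v in keys.items() if v[1]}
    return min(candidates, key=lambda k: candidates[k][0]) if candidates else None

def pick_victim_allkeys_lru(keys):
    """allkeys-lru: least recently used key across the whole dataset."""
    return min(keys, key=lambda k: keys[k][0])

print(pick_victim_volatile_lru(keys))  # session:1 (config:x is older but has no expire)
print(pick_victim_allkeys_lru(keys))   # config:x
```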

In the current environment the program uses Redis purely as a cache: it sets an expire timeout on the data and relies on Redis to delete it after expiry. Could the dirty pages then be caused by a problem in the expired-data cleanup mechanism (for example, cleanup not being timely)? To answer that, look at the policies Redis uses when deleting expired data. Reference material:
Memory release and expiration key deletion in Redis
Removal of the Redis expiration key

Redis expiration key removal mechanism:

Lazy deletion:
    - A key is not deleted automatically when it expires; it is checked only on each read, and deleted if found expired. This guarantees that a deletion happens only when it cannot be avoided.
Periodic deletion:
    - A deletion pass runs at intervals, and the duration and frequency of these passes are limited to reduce the impact of deletions on CPU time.
Redis uses lazy deletion combined with periodic deletion.
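The two strategies above can be sketched in a few lines (a minimal toy model, not Redis's actual C implementation; the bounded scan stands in for Redis's time-limited periodic pass):

```python
import time

# Toy store: key -> (value, expire_at or None)
store = {
    "a": ("v1", time.time() - 1),   # already expired
    "b": ("v2", time.time() + 60),  # still valid
    "c": ("v3", None),              # never expires
}

def get(store, key):
    """Lazy deletion: expiry is only checked (and the key removed) on read."""
    entry = store.get(key)
    if entry is None:
        return None
    value, expire_at = entry
    if expire_at is not None and expire_at <= time.time():
        del store[key]              # deleted only because it was touched
        return None
    return value

def periodic_delete(store, max_checks=20):
    """Periodic deletion: scan a bounded number of keys to cap CPU cost."""
    now = time.time()
    for key in list(store)[:max_checks]:
        _, expire_at = store[key]
        if expire_at is not None and expire_at <= now:
            del store[key]

print(get(store, "a"))   # None: found expired on read and removed
print(get(store, "b"))   # v2
periodic_delete(store)
print("a" in store)      # False
```

Note the gap this case hit: if expired keys are never read again and the periodic pass cannot keep up, their memory lingers until eviction pressure forces it out.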
Case diagnosis

From the analysis above, the cause is now fairly clear:

    1. For some reason, Redis's memory use keeps growing (possibly because lazy deletion leaves many expired keys in memory, or for other reasons that depend on Redis internals)
    2. Redis is used purely as a cache and does not actually read or write any file, so the operating system has nowhere to flush the dirty pages to
    3. Because maxmemory is not set, it effectively defaults to the machine's memory size, and Redis only cleans up after itself (via the volatile-lru mechanism) once its own memory use reaches that size
    4. So Redis's memory keeps growing, and the dirty pages keep accumulating (most of them probably holding data that has already expired)
Case resolution

To solve this problem, the most convenient and reasonable approach is:

    • Set a reasonable maxmemory for Redis so that Redis performs its own data eviction
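For instance (the 4gb value below is only an illustration; the right cap depends on the machine and workload), the fix amounts to two lines in redis.conf:

```
# Cap memory so Redis starts evicting well before machine memory is exhausted
maxmemory 4gb
# Keep the existing policy: evict LRU keys among those with an expire set
maxmemory-policy volatile-lru
```

The same change can be applied to a running instance with redis-cli: CONFIG SET maxmemory 4gb.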

