Redis Optimization Experience

Source: Internet
Author: User


Common memory optimization methods and parameters

Redis's internal memory management is actually quite expensive, that is, it tends to use a lot of memory. The author of Redis is well aware of this, so he provides a series of parameters and mechanisms to control and save memory. We discuss them one by one below.

First and most important: do not turn on Redis's VM option, the virtual memory feature. It was intended as a persistence strategy that swaps data between memory and disk so that Redis can hold more data than fits in physical memory, but its memory-management cost is also very high, and as we analyze later this persistence strategy is not yet mature. So turn the VM feature off: check that vm-enabled is set to no in your redis.conf.
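
A one-line redis.conf sketch (this directive only exists in the older Redis versions that still ship the VM feature):

    vm-enabled no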

Next, it is best to set the maxmemory option in redis.conf, which tells Redis to start rejecting subsequent write requests once it has used the specified amount of physical memory. This is a good way to protect Redis from using so much physical memory that the machine starts swapping, which would eventually hurt performance severely or even cause a crash.
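
A hedged example: on a machine with, say, 8 GB of RAM, staying well under the 3/5 guideline discussed later might look like the line below (the exact limit is your call; redis.conf accepts plain bytes or unit suffixes such as mb and gb):

    maxmemory 4gb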

In addition, Redis provides a set of parameters for the different data types to control memory usage. A Redis hash stores its value internally as a hashmap, and if the map has only a small number of members it is instead stored in a compact, essentially one-dimensional linear format, which eliminates the memory overhead of a large number of pointers. This behavior is controlled by the following two entries in the redis.conf configuration file:

hash-max-zipmap-entries
hash-max-zipmap-value

hash-max-zipmap-entries means that the compact linear encoding is used as long as the map inside the value has no more than this many members. The default is 64, so a hash with at most 64 members uses the compact linear storage; beyond that it is automatically converted into a real hashmap.

hash-max-zipmap-value means that the compact linear encoding is used as long as every member value inside the map is no larger than this many bytes.

If either of these two thresholds is exceeded, the value is converted into a real HashMap and the memory savings disappear. Does that mean these values should be set as large as possible? Certainly not. The advantage of a HashMap is that lookups and updates are O(1), whereas the one-dimensional compact encoding makes those operations O(n). If the number of members is small the impact is negligible, but with many members it seriously hurts performance, so these settings have to be weighed carefully; fundamentally it is a tradeoff between time cost and space cost.
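
A minimal redis.conf sketch using the entries default of 64 cited above and a 512-byte value limit as an assumed example (check the defaults shipped with your Redis version, and note that newer releases rename these directives to hash-max-ziplist-entries and hash-max-ziplist-value):

    hash-max-zipmap-entries 64
    hash-max-zipmap-value 512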

Similar parameters exist for the other data types:

list-max-ziplist-entries 512
Description: a list with no more than this many nodes is stored in the pointer-free compact format.

list-max-ziplist-value
Description: a list node whose value is no larger than this many bytes is stored in the compact format.

set-max-intset-entries
Description: a set whose members are all integers, and which has no more than this many members, is stored in a compact (intset) format.
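
A quick way to check which encoding a given key is currently using is the OBJECT ENCODING command (available since roughly Redis 2.2); a sketch of a redis-cli session with made-up key names, where the exact replies may vary slightly by version:

    redis> rpush mylist hello
    (integer) 1
    redis> object encoding mylist
    "ziplist"
    redis> sadd myset 100
    (integer) 1
    redis> object encoding myset
    "intset"

Once a key crosses the thresholds above, the same command reports the general-purpose encoding instead (for example linkedlist for lists or hashtable for sets).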

The last thing to mention is that the Redis implementation does not do much to optimize its own memory allocation (compared with memcached), so there is some memory fragmentation, although in most cases this is not a performance bottleneck for Redis. However, if most of the data stored in Redis is numeric, Redis uses a shared-integer mechanism to eliminate the memory-allocation overhead: when the system starts it pre-allocates a pool of integer objects for the values 1 to n, and if a stored value happens to fall within that range the object is taken directly from the pool and shared via reference counting. For a system that stores a large number of numeric values this saves memory and improves performance to some extent. The pool size n is a macro definition in the source code, REDIS_SHARED_INTEGERS, whose default value is 10000; it can be changed to suit your needs, after which Redis just has to be recompiled.
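
A small redis-cli sketch (the key name is illustrative, and the exact refcount reply depends on the Redis version and on what else happens to reference the object) showing that a small integer value is stored as a shared int object:

    redis> set hits 100
    OK
    redis> object encoding hits
    "int"
    redis> object refcount hits
    (integer) 2

A refcount greater than 1 indicates that the value object is shared rather than allocated separately for this key; values outside the shared range get their own object.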

The persistence mechanism of Redis

Since Redis supports a very rich set of in-memory data structures, persisting these complex in-memory structures to disk is harder than for a traditional database, so Redis differs more from traditional databases in how it persists data. Redis supports four persistence modes, namely:

    • Timed snapshot mode (snapshot)
    • Statement-based append mode (AOF)
    • Virtual memory mode (VM)
    • Diskstore mode

In design terms, the first two assume that all data lives in memory, that is, they provide disk persistence for relatively small data sets, while the latter two are the author's attempts at storing data sets larger than physical memory, that is, massive data storage. As of this writing the last two modes are still experimental, and the VM mode has essentially been abandoned by the author, so only the first two can actually be used in production. In other words, Redis is currently only suitable as a store for data sets that fit entirely in memory; massive data storage is not a domain in which Redis excels. Each persistence mode is described below:

Timed snapshot mode (snapshot):

This persistence mode is really just a timer event inside Redis: at a fixed interval it checks whether the number of changes and the elapsed time since the last save satisfy the configured trigger conditions, and if so it calls the operating system's fork to create a child process. The child process initially shares the same address space as the parent, so it can walk the entire memory space and write it out to disk while the main process continues to serve requests; when writes occur, the operating system copies memory on a per-page basis (copy-on-write) so that parent and child do not affect each other.
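
The trigger conditions are the save lines in redis.conf; the values below are the defaults shipped with Redis, each pair meaning "seconds elapsed" followed by "number of changed keys":

    save 900 1
    save 300 10
    save 60 10000

With this configuration a snapshot is written if at least 1 key changed within 900 seconds, 10 keys within 300 seconds, or 10000 keys within 60 seconds.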

The main disadvantage of this persistence mode is that a timed snapshot only represents the in-memory image at a single point in time, so a restart loses all of the data modified between the last snapshot and the restart.

Statement-based append mode (AOF):

The AOF mode is actually similar to MySQL's statement-based binlog: every command that changes Redis's in-memory data is appended to a log file, so that log file is Redis's persistent data.
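
A minimal redis.conf sketch for enabling AOF; appendfsync everysec is the commonly used compromise between durability and performance (adjust to your own requirements):

    appendonly yes
    appendfsync everysec

With appendfsync everysec the AOF is flushed to disk roughly once per second, so at most about one second of writes can be lost in a crash.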

The main disadvantage of the AOF mode is that the appended log file keeps growing, and when the system restarts and recovers from the AOF, loading the data can be very slow: tens of gigabytes of data may take hours to load. The time is not spent on slow disk reads but on re-executing every logged command in memory. In addition, because every command has to be written to the log, enabling AOF lowers Redis's read and write performance.

One practical mitigation is to spread the data across several Redis instances, each using around 2 GB of memory. This avoids putting all your eggs in one basket, reduces the impact of a cache failure on the system, and speeds up data recovery, although it also adds some complexity to the system design.

Virtual Memory Mode:

The virtual memory mode was Redis's own user-space strategy for swapping data between memory and disk. In practice it worked poorly: the code is complex, restarts are slow, replication is slow, and so on, and it has been abandoned by the author.

Diskstore mode:

Diskstore is the new implementation the author chose after abandoning the virtual memory mode, essentially the traditional B-tree approach. It is still at the experimental stage; whether it becomes production-ready remains to be seen.

Redis persistence disk I/O and the problems it brings

Anyone with experience operating Redis in production will have seen Redis become unstable or even crash while using a lot of physical memory, even though its usage never exceeded the machine's total physical memory. Some people blame the fork system call used by snapshot persistence for doubling memory consumption, but that view is inaccurate: the copy-on-write mechanism behind fork works at the granularity of operating system pages, so only the dirty pages that are actually written get copied, and in general a system does not write to all of its pages within a short period of time. So what does cause Redis to crash?

The answer is that Redis persistence uses buffered I/O, meaning that reads and writes of the persistence files go through the operating system's page cache in physical memory, whereas most database systems use direct I/O to bypass this page cache and maintain their own cache of the data. When a Redis persistence file grows large (especially a snapshot file) and is read or written, the data in the disk file is also loaded into physical memory as an extra layer of operating-system file cache, and that cached data duplicates what Redis is already managing in memory. The kernel does evict page cache when physical memory gets tight, but it may well decide that some piece of page cache is more important and push your process into swap instead, at which point your system starts to become unstable or crashes. Our experience is that once Redis's physical memory usage exceeds 3/5 of total memory, things start to get dangerous.
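
A simple way to keep an eye on this is to compare Redis's own memory accounting with the resident size the operating system reports for the process; a sketch using redis-cli, where the numbers are made up and field names can vary slightly by version:

    $ redis-cli info | grep used_memory
    used_memory:1854932
    used_memory_human:1.77M
    used_memory_rss:2621440

used_memory is what Redis has allocated for its data, while used_memory_rss is the resident set size as seen by the OS; if used_memory_rss approaches 3/5 of the machine's RAM, or diverges badly from used_memory, it is time to shrink the data set, lower maxmemory, or move data to another instance.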

[Figure: Redis memory usage after reading and writing the snapshot file dump.rdb]

Summary:
    1. Choose the right data type for your business scenario, and set the compact-storage parameters appropriately for each case.
    2. If the business scenario does not require data persistence, turning off all persistence modes gives the best performance and the largest usable memory.
    3. If you do need persistence, choose between the snapshot and statement-append (AOF) modes based on whether you can tolerate losing, after a restart, the data written since the last snapshot; do not use the virtual memory or diskstore modes.
    4. Do not let the Redis machine use more than 3/5 of its actual physical memory.

