Summary
Redis is a well-known in-memory key-value database and an excellent data structure server. It supports five data structure types: string, list, hash table, set, and sorted set. Each type supports several encodings suited to different application scenarios. This article introduces the compressed list (ziplist) encoding: it first explains how the encoding works, then describes where Redis uses it, and finally analyzes that use.
The principle and application of the Redis compressed list
A compressed list is a data structure that stores a series of data items, together with their encoding information, in one contiguous area of memory. The area is physically contiguous but logically divided into parts. The aim is to cut unnecessary memory overhead as far as possible while keeping read time complexity within a controllable bound, and so to save memory. That description is rather abstract, so let us first look at how it is implemented. In Redis 3.2, the compressed list is implemented in ziplist.h and ziplist.c.
Compressed list principle
Storing data in memory according to fixed rules can, I think, be described by the word "encoding", so that word appears often below.
Overall encoding
As mentioned above, the compressed list is a contiguous area of memory. This area is encoded roughly as follows:
Redis compressed list memory encoding (in order: zlbytes, zltail, zllen, entries, zlend)
In a normal compressed list, the whole memory block is divided into five parts, described below:
zlbytes: An unsigned integer, fixed at four bytes, holding the total number of bytes the compressed list occupies. It is used when reallocating memory, so the size can be obtained without traversing the whole list.
zltail: An unsigned integer, fixed at four bytes, holding the offset of the list tail, that is, the distance from the start of the compressed list to the start of the last list node.
zllen: The number of nodes in the compressed list, fixed at two bytes. According to the source code, when the list holds more than 2^16-2 nodes this value is no longer valid, and the list must be traversed to count the nodes.
entryX: The list node area, variable in length, made up of list nodes stored one after another.
zlend: One byte with the fixed value 255, marking the end of the list.
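To make the five-part layout concrete, here is a minimal Python sketch that splits a raw buffer into those parts. The function name `parse_header` is hypothetical, and the little-endian byte order of the three header fields is an assumption based on my reading of ziplist.c rather than something stated above.

```python
import struct

# Assumed header layout: zlbytes (4 bytes), zltail (4 bytes), zllen (2 bytes),
# all unsigned and little-endian, followed by the entries and the end marker.
HEADER = struct.Struct("<IIH")
ZLEND = 0xFF  # the fixed value 255 that closes the list


def parse_header(buf: bytes) -> dict:
    """Split a raw compressed list into its fixed-size fields and the node area."""
    zlbytes, zltail, zllen = HEADER.unpack_from(buf, 0)
    if zlbytes != len(buf):
        raise ValueError("zlbytes does not match the buffer size")
    if buf[-1] != ZLEND:
        raise ValueError("missing zlend marker")
    return {
        "zlbytes": zlbytes,
        "zltail": zltail,
        "zllen": zllen,
        "entries": buf[HEADER.size:-1],  # raw node area, still encoded
    }


# An empty list: 10 header bytes plus the 1-byte end marker.
empty = HEADER.pack(11, 10, 0) + bytes([ZLEND])
info = parse_header(empty)
print(info)
```

Because zlbytes and zllen sit in the header, the size and (usually) the node count are read in constant time, which is the point made in the field descriptions above.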
List element encoding
The overall memory layout of the compressed list is described above. The four regions other than the entryX area have fixed lengths, so the encoding of the entryX area is examined next.
Each list node consists of three parts:
Compressed list node encoding
The header of each node contains two parts, one called previous length and the other called encoding. They are followed by the body, called content. The three parts are described separately below:
Previous length
This field stores the length of the previous node, so the compressed list can be traversed from tail to head: the current node's position minus the length of the previous node is the starting position of the previous node. The field is either 1 byte or 5 bytes long. If the previous node is shorter than 254 bytes, one byte is enough to hold its length. If the previous node is 254 bytes or longer, the first byte is set to 254 and the following four bytes hold the length. This keeps the amount of wasted memory low.
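The 1-byte or 5-byte rule can be sketched as a pair of helper functions. The names `encode_prevlen` and `decode_prevlen` are hypothetical, and storing the 4-byte length in little-endian order is an assumption based on my reading of ziplist.c.

```python
import struct


def encode_prevlen(length: int) -> bytes:
    """Encode the previous node's length in 1 byte, or in 5 bytes if it is >= 254."""
    if not 0 <= length <= 0xFFFFFFFF:
        raise ValueError("length out of range")
    if length < 254:
        return bytes([length])
    # Marker byte 254, then the length in four bytes (little-endian assumed).
    return b"\xfe" + struct.pack("<I", length)


def decode_prevlen(buf: bytes, pos: int = 0) -> tuple:
    """Return (previous node length, size of the previous length field)."""
    first = buf[pos]
    if first < 254:
        return first, 1
    if first == 254:
        return struct.unpack_from("<I", buf, pos + 1)[0], 5
    raise ValueError("255 is the end-of-list marker, not a previous length")


print(decode_prevlen(encode_prevlen(10)))   # (10, 1)
print(decode_prevlen(encode_prevlen(300)))  # (300, 5)
```

Most nodes are short, so most of them pay only one byte for the backward link; the 5-byte form is reserved for the rare long predecessor.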
Encoding
The encoding field records the type and the length of the node's content. There are two kinds of encoding, byte array and integer, and the encoding field itself is 1, 2, or 5 bytes long. The Redis author cleverly uses the first two bits of the field to indicate the content type and the length of the encoding field. First, the encodings for byte array content:
Encodings when the content is a byte array
Next, the encodings for integer content:
Encodings when the content is an integer
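The sketch below decodes the encoding field by inspecting its first two bits. The exact bit patterns are not listed in the text here; they follow the comments in ziplist.c as I understand them, so treat them as assumptions. The function names `decode_encoding` and `small_int_value` are hypothetical.

```python
# Integer encodings (first two bits are 11) and the size of their content.
# Values taken from my reading of ziplist.c; treat them as assumptions.
INT_CONTENT_SIZE = {0xC0: 2, 0xD0: 4, 0xE0: 8, 0xF0: 3, 0xFE: 1}


def decode_encoding(buf: bytes, pos: int = 0) -> tuple:
    """Return (content kind, size of the encoding field, content length in bytes)."""
    b = buf[pos]
    top = b >> 6  # the first two bits select the layout
    if top == 0b00:  # 00pppppp: byte array, 6-bit length
        return "bytes", 1, b & 0x3F
    if top == 0b01:  # 01pppppp qqqqqqqq: byte array, 14-bit length
        return "bytes", 2, ((b & 0x3F) << 8) | buf[pos + 1]
    if top == 0b10:  # 10______ followed by a 4-byte length
        return "bytes", 5, int.from_bytes(buf[pos + 1:pos + 5], "big")
    if b in INT_CONTENT_SIZE:  # 11......: integer stored in the content area
        return "int", 1, INT_CONTENT_SIZE[b]
    if 0xF1 <= b <= 0xFD:  # small integer kept inside the encoding byte
        return "int", 1, 0
    raise ValueError("not a valid encoding byte")


def small_int_value(b: int) -> int:
    """Value of a small integer stored in the low four bits of the encoding byte."""
    return (b & 0x0F) - 1


print(decode_encoding(b"\x05"))      # ('bytes', 1, 5)
print(decode_encoding(b"\x41\x00"))  # ('bytes', 2, 256)
print(decode_encoding(b"\xc0"))      # ('int', 1, 2)
print(decode_encoding(b"\xf1"), small_int_value(0xF1))  # ('int', 1, 0) 0
```

The last case is the one where the content area is empty: the integer lives in the encoding byte itself, so the node costs only its two header fields.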
Content
The content area holds the node's data. Its type and length are determined by the encoding field. As the encodings above show, content is currently either an integer or a byte array, and under some conditions its length may be 0.
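Putting the three parts together, the following sketch builds a tiny two-node list by hand and walks it from tail to head using the previous length field. It is deliberately simplified: it handles only short byte arrays and small integers, assumes little-endian header fields, and uses hypothetical function names (`read_node`, `walk_backwards`).

```python
import struct


def read_node(buf: bytes, pos: int) -> tuple:
    """Decode one node at pos; return (previous length, value). Subset of encodings only."""
    if buf[pos] < 254:
        prevlen, pos = buf[pos], pos + 1
    else:
        prevlen, pos = struct.unpack_from("<I", buf, pos + 1)[0], pos + 5
    enc = buf[pos]
    if enc >> 6 == 0b00:  # short byte array: length in the low 6 bits
        length = enc & 0x3F
        return prevlen, bytes(buf[pos + 1:pos + 1 + length])
    if 0xF1 <= enc <= 0xFD:  # small integer: empty content area
        return prevlen, (enc & 0x0F) - 1
    raise NotImplementedError("encoding not covered by this sketch")


def walk_backwards(buf: bytes) -> list:
    """Collect node values from tail to head by following previous length."""
    zlbytes, zltail, zllen = struct.unpack_from("<IIH", buf, 0)
    values = []
    if zllen == 0:
        return values
    pos = zltail
    while True:
        prevlen, value = read_node(buf, pos)
        values.append(value)
        if prevlen == 0:  # the head node has no predecessor
            break
        pos -= prevlen  # current position minus previous node length
    return values


node1 = bytes([0x00, 0x02]) + b"hi"  # prevlen 0, 2-byte array "hi"
node2 = bytes([len(node1), 0xF6])    # prevlen 4, small integer 5
body = node1 + node2
zl = struct.pack("<IIH", 10 + len(body) + 1, 10 + len(node1), 2) + body + b"\xff"
print(walk_backwards(zl))  # [5, b'hi']
```

The whole list here occupies 17 bytes, of which only 2 are payload bytes for "hi"; the integer 5 needs no content bytes at all.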
By now the principle of the compressed list should be clear: it does not compress data with an algorithm. Instead, it encodes data into a contiguous area of memory according to fixed rules, in order to save memory. Next, let us look at where Redis uses the compressed list.
Application of the compressed list in Redis
In Redis, the compressed list encoding is used by several data types, as organized in the following table:
Data structure types and their use of the compressed list in Redis
The table above summarizes how the compressed list encoding is used by different Redis data types. Redis supports five data structure types, three of which use the compressed list under certain conditions; those conditions are analyzed later. It is worth mentioning that the Geo (geographic location) feature Redis now supports also uses the compressed list, which is not discussed here.
Analysis of the Redis compressed list application
The sections above described the principle and application of the Redis compressed list. What follows is a brief analysis that tries to answer a few questions: Why does Redis use a compressed list? What are the benefits? Are there other advantages? Does it offer any lessons for how we use memory ourselves?
For each data structure type, whether list, hash table, or sorted set, Redis relies on a switch and a threshold when deciding whether to use the compressed list as the underlying encoding. The switch determines whether the compressed list encoding is enabled at all. The threshold is, generally speaking, a condition that the number of entries stored in the structure has not reached a certain value, or that the length of each value has not reached a certain length. Every strategy has its own application scenarios. Why is the compressed list a poor choice once a structure holds more than a certain number of entries? Insertions and deletions on a compressed list have an average time complexity of O(n), so the time grows as n increases, unlike a hash table, where the position can be found in O(1). Within a bounded n, however, that cost is tolerable. In exchange, the compressed list uses clever encoding to store content with as little unnecessary memory overhead as possible, and it keeps the data in a contiguous memory area. This matters to Redis itself: because Redis is an in-memory database, minimizing memory overhead is something its designers must consider.
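The threshold check described above can be sketched for the hash type as follows. The function name `choose_hash_encoding` is hypothetical. The default limits mirror the `hash-max-ziplist-entries 512` and `hash-max-ziplist-value 64` settings that I believe shipped in redis.conf for this generation of Redis; treat both numbers as assumptions and check your own configuration.

```python
def choose_hash_encoding(pairs: dict, max_entries: int = 512, max_value: int = 64) -> str:
    """Return 'ziplist' while the hash stays under both thresholds, else 'hashtable'.

    A simplified model: real Redis performs this check as entries are added
    and converts the structure once a limit is exceeded.
    """
    if len(pairs) > max_entries:
        return "hashtable"  # too many entries: O(n) operations get expensive
    for field, value in pairs.items():
        if len(field) > max_value or len(value) > max_value:
            return "hashtable"  # one long field or value is enough to switch
    return "ziplist"


small = {b"name": b"redis", b"kind": b"db"}
print(choose_hash_encoding(small))                 # ziplist
print(choose_hash_encoding({b"k": b"x" * 65}))     # hashtable
```

Both limits bound n, which is what keeps the O(n) cost of the compressed list within the tolerable range the paragraph above describes.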
In addition, after careful thought, I believe the compressed list not only saves memory but also reduces memory fragmentation. I call this behavior "merge storage": many small pieces of data are stored together in one larger memory area. Imagine that the data to be stored consists of very small entries. If memory is allocated separately for each entry, the entries are likely to end up scattered across memory and eventually increase fragmentation, which is a headache.
Conclusion
This article analyzed the use of the Redis compressed list on the basis of its principle and application. The analysis is mixed with personal understanding, so if you have a different or complementary view, you are welcome to leave a message to discuss it.