This article explains the design and implementation of zipmap in Redis. (If you reprint it, please credit breaksoftware's CSDN blog.)
Basic structure
A zipmap is a structure for storing (string, string) pairs. It consists of a head, a series of string pairs (each pair is called an "element" below), and an end marker. The layout is shown in the following figure:
The Redis source code does not declare a struct for this layout. The whole zipmap is one contiguous block of memory in which only the head and the end marker (the constant 0xFF, drawn with a red background in the figure) have a fixed size of 8 bits; everything else has variable length.
Although the head has a fixed length, its content can mean two different things. Normally it holds the number of elements in the zipmap. For example, if the zipmap has only 0x12 elements, the head contains 0x12. If it has 0x1234 elements, the head contains 0xFE. The threshold between the two cases is 0xFE. If the number of elements is less than 0xFE, the head holds that number. If it is greater than or equal to 0xFE, the head holds 0xFE, which means the 8 bits no longer represent the count and the whole structure has to be traversed to count the elements.
The length of an element is also variable. This is easy to understand: an element holds a pair of strings, and string lengths are not fixed. Let's look at the structure of an element.
An element starts with the length of the key, stored in the KeyLen Struct. If the length is less than 0xFE, this struct is only 8 bits long and contains the length itself. If the length is greater than or equal to 0xFE, the struct is 40 bits long: the first 8 bits are 0xFE, meaning that a single byte cannot represent the length, and the following 32 bits hold the length value.
KeyData holds the content of the key string. The content may contain NUL bytes, and no NUL terminator is appended automatically. The same rules apply to ValueData.
The key content is followed by the length of the value, which is encoded the same way as the key length.
Finally, the more curious Free fields. Zipmap lets the user change a value by key, so when a value is shortened some space is left over. The Free field records the length of that leftover space (FreeData). Zipmap reserves only 8 bits for the Free field, yet a value could shrink by more than 0xFF bytes. That is not a problem: once the leftover space reaches a certain threshold, zipmap moves the following content forward to reclaim it, and this threshold is much smaller than 0xFF:
#define ZIPMAP_VALUE_MAX_FREE 4
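To make the layout concrete, here is a small sketch of a zipmap holding "foo" => "bar" and "hello" => "world", written out byte by byte. The names example_zipmap and example_count are made up for this illustration and are not part of Redis. The walker assumes every length is below 0xFE, so each length field takes one byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example buffer: two elements, no free bytes after either value. */
static const unsigned char example_zipmap[] = {
    0x02,                               /* head: two elements */
    0x03, 'f', 'o', 'o',                /* KeyLen, KeyData */
    0x03, 0x00, 'b', 'a', 'r',          /* ValueLen, Free, ValueData */
    0x05, 'h', 'e', 'l', 'l', 'o',      /* KeyLen, KeyData */
    0x05, 0x00, 'w', 'o', 'r', 'l', 'd',/* ValueLen, Free, ValueData */
    0xFF                                /* end marker */
};

/* Walks the buffer and counts elements (single-byte lengths only). */
static unsigned int example_count(const unsigned char *zm) {
    const unsigned char *p = zm + 1;    /* skip the head byte */
    unsigned int n = 0;
    while (*p != 0xFF) {
        assert(*p < 0xFE);              /* one-byte key length assumed */
        p += 1 + p[0];                  /* KeyLen byte + KeyData */
        assert(*p < 0xFE);              /* one-byte value length assumed */
        p += 2 + p[0] + p[1];           /* ValueLen + Free + ValueData + FreeData */
        n++;
    }
    return n;
}
```

The buffer is 24 bytes long: 1 for the head, 9 for the first element, 13 for the second, and 1 for the end marker.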
After understanding the above structure, it becomes easy to read the following code.
Create Zipmap
Redis provides the following methods to create an empty Zipmap structure.
unsigned char *zipmapNew(void) {
    unsigned char *zm = zmalloc(2);
    zm[0] = 0; /* Length */
    zm[1] = ZIPMAP_END;
    return zm;
}
Because there are no elements, the length byte is 0, and it is followed directly by the end marker.
Length information encoding
The key length and value length fields of an element change size according to the length they store. If the length is less than 0xFE, a single 8-bit field holds it. If the length is greater than or equal to 0xFE, the field is 40 bits long: the first 8 bits are 0xFE and the last 32 bits hold the length value.
static unsigned int zipmapEncodeLength(unsigned char *p, unsigned int len) {
    if (p == NULL) {
        return ZIPMAP_LEN_BYTES(len);
    } else {
        if (len < ZIPMAP_BIGLEN) {
            p[0] = len;
            return 1;
        } else {
            p[0] = ZIPMAP_BIGLEN;
            memcpy(p+1, &len, sizeof(len));
            memrev32ifbe(p+1);
            return 1+sizeof(len);
        }
    }
}
If the first parameter is NULL, the function only reports how many bytes are needed to store the length given in the second parameter. Otherwise it writes the encoded length at that address and returns the number of bytes written.
#define ZIPMAP_BIGLEN 254
#define ZIPMAP_END 255
#define ZIPMAP_LEN_BYTES(_l) (((_l) < ZIPMAP_BIGLEN) ? 1 : sizeof(unsigned int)+1)
Length information decoding
Decoding is the reverse of encoding. The function checks whether the first byte of the length field is less than 0xFE. If it is, that byte is the length. Otherwise the function skips that byte and reads the following 32 bits as the length.
static unsigned int zipmapDecodeLength(unsigned char *p) {
    unsigned int len = *p;
    if (len < ZIPMAP_BIGLEN) return len;
    memcpy(&len, p+1, sizeof(unsigned int));
    memrev32ifbe(&len);
    return len;
}
Calculate the combined length of the key length field and the key content.
The function decodes the KeyData length, computes the size of the KeyLen Struct from it, and adds the two.
static unsigned int zipmapRawKeyLength(unsigned char *p) {
    unsigned int l = zipmapDecodeLength(p);
    return zipmapEncodeLength(NULL,l) + l;
}
Calculate the combined length of the value length field, the Free field, FreeData and the value content. The calculation is analogous to the one for the key.
static unsigned int zipmapRawValueLength(unsigned char *p) {
    unsigned int l = zipmapDecodeLength(p);
    unsigned int used;
    used = zipmapEncodeLength(NULL,l);
    used += p[used] + 1 + l;
    return used;
}
After its first assignment, used holds the length of the ValueLen Struct, so p[used] reads the Free field, that is, the length of FreeData. The 1 added to the sum is the size of the Free field itself, sizeof(char).
Calculating element length
Adding the results of the two functions above gives the element length. The only subtlety is that the pointer must be advanced to the start of the value length field before the value-related length is computed.
static unsigned int zipmapRawEntryLength(unsigned char *p) {
    unsigned int l = zipmapRawKeyLength(p);
    return l + zipmapRawValueLength(p+l);
}
Calculate the minimum length needed to store an element from the key and value lengths. The KeyLen Struct and the ValueLen Struct each take at least one byte, and the Free field takes one byte, so at least the following length is needed:
static unsigned long zipmapRequiredLength(unsigned int klen, unsigned int vlen) {
    unsigned int l;
    l = klen+vlen+3;
If the key or value length is greater than or equal to 0xFE, 4 more bytes are needed for each of them to store the real length:
    if (klen >= ZIPMAP_BIGLEN) l += 4;
    if (vlen >= ZIPMAP_BIGLEN) l += 4;
    return l;
}
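A few worked numbers may help here. The sketch below restates the rule with a hypothetical function name, sketch_required_len, and checks it against hand-computed sizes; the thresholds are the ones given in the text.

```c
#include <assert.h>

/* 3 fixed bytes (KeyLen, ValueLen, Free) plus 4 extra bytes for each
 * length that needs the 5-byte form. */
static unsigned long sketch_required_len(unsigned int klen, unsigned int vlen) {
    unsigned long l = (unsigned long)klen + vlen + 3;
    if (klen >= 254) l += 4;
    if (vlen >= 254) l += 4;
    return l;
}

/* Hand-computed examples. */
static void sketch_required_len_examples(void) {
    assert(sketch_required_len(3, 3) == 9);       /* "foo" => "bar" */
    assert(sketch_required_len(3, 253) == 259);   /* still one-byte lengths */
    assert(sketch_required_len(3, 254) == 264);   /* value length needs 5 bytes */
    assert(sketch_required_len(254, 254) == 519); /* both lengths need 5 bytes */
}
```

Note the jump of 5 bytes between a 253-byte and a 254-byte value: one byte of extra data plus four bytes of extra length field.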
Find elements and calculate the total length of Zipmap
Zipmap uses a single function for both tasks. Because both require a traversal, they are simply combined.
static unsigned char *zipmapLookupRaw(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned int *totlen) {
    unsigned char *p = zm+1, *k = NULL;
    unsigned int l, llen;
If key points to a string to compare, the function searches for the start address of the element with that key. If totlen is not NULL, it also computes the total length of the zipmap structure along the way. At the start, p is advanced 8 bits past the start of the zipmap to skip the head and point directly at the first element. The zipmapRewind method does the same thing:
unsigned char *zipmapRewind(unsigned char *zm) {
    return zm+1;
}
Then the key length is read. If it differs from the klen passed in, the keys must differ, so the string comparison can be skipped. This is an optimization.
    while (*p != ZIPMAP_END) {
        unsigned char free;
        /* Match or skip the key */
        l = zipmapDecodeLength(p);
        llen = zipmapEncodeLength(NULL,l);
The comparison checks that both the key length and the key content match. On a match, whether the traversal continues depends on totlen. If totlen is NULL, the total length of the zipmap is not needed, so the start address of the element can be returned immediately. If it is not NULL, the start address of the element is recorded in k; in the rest of the traversal, a non-NULL k shows that the element has already been found, so no further comparisons are made.
        if (key != NULL && k == NULL && l == klen && !memcmp(p+llen,key,l)) {
            /* Only return when the user doesn't care
             * for the total length of the zipmap. */
            if (totlen != NULL) {
                k = p;
            } else {
                return p;
            }
        }
If there is no match, the length of the value-related information is calculated, and p jumps to the start address of the next element.
        p += llen+l;
        /* Skip the value as well */
        l = zipmapDecodeLength(p);
        p += zipmapEncodeLength(NULL,l);
        free = p[0];
        p += l+1+free; /* +1 to skip the free byte */
    }
Finally, if totlen is not NULL, the total length of the zipmap is stored in it. When the loop ends, p points at the end marker, so p-zm already includes the head byte, and the 1 added in the calculation accounts for the end marker itself.
    if (totlen != NULL) *totlen = (unsigned int)(p-zm)+1;
    return k;
}
Zipmap also provides the following method to calculate the total length of the structure. It is just a wrapper around zipmapLookupRaw.
size_t zipmapBlobLen(unsigned char *zm) {
    unsigned int totlen;
    zipmapLookupRaw(zm,NULL,0,&totlen);
    return totlen;
}
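The traversal that produces the total length can be sketched on its own. The version below is an illustration under assumptions, not the Redis code: sketch_len and sketch_blob_len are hypothetical names, and the 32-bit length is read as little-endian bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes a length field; *used receives the field size (1 or 5). */
static uint32_t sketch_len(const unsigned char *p, unsigned int *used) {
    if (p[0] < 254) { *used = 1; return p[0]; }
    *used = 5;
    return (uint32_t)p[1] | ((uint32_t)p[2] << 8) |
           ((uint32_t)p[3] << 16) | ((uint32_t)p[4] << 24);
}

/* Returns the total size in bytes, counting the head and the end marker. */
static size_t sketch_blob_len(const unsigned char *zm) {
    assert(zm != NULL);
    const unsigned char *p = zm + 1;        /* skip the head */
    unsigned int used;
    while (*p != 0xFF) {
        uint32_t klen = sketch_len(p, &used);
        p += used + klen;                   /* key length field + key */
        uint32_t vlen = sketch_len(p, &used);
        p += used;                          /* value length field */
        p += 1 + vlen + p[0];               /* Free byte + value + FreeData */
    }
    return (size_t)(p - zm) + 1;            /* +1 for the end marker */
}
```

For an empty zipmap (a head of 0 followed by 0xFF) this returns 2, which matches the two bytes allocated by zipmapNew.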
Detect the existence of the element
The check wraps zipmapLookupRaw and tests whether it returned the start address of an element.
int zipmapExists(unsigned char *zm, unsigned char *key, unsigned int klen) {
    return zipmapLookupRaw(zm,key,klen,NULL) != NULL;
}
Get Value by Key
First, zipmapLookupRaw is used to check whether the key is in the zipmap. If it is not, the function returns 0. If it is, the output pointer is set to point at the value.
int zipmapGet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char **value, unsigned int *vlen) {
    unsigned char *p;
    if ((p = zipmapLookupRaw(zm,key,klen,NULL)) == NULL) return 0;
    p += zipmapRawKeyLength(p);
    *vlen = zipmapDecodeLength(p);
    *value = p + ZIPMAP_LEN_BYTES(*vlen) + 1;
    return 1;
}
Note that this function also returns the length of the value, so a value can hold data that contains NUL bytes.
Traversing Zipmap
Before traversing, call zipmapRewind to get a pointer to the first element, and then call the following method:
unsigned char *zipmapNext(unsigned char *zm, unsigned char **key, unsigned int *klen, unsigned char **value, unsigned int *vlen) {
    if (zm[0] == ZIPMAP_END) return NULL;
    if (key) {
        *key = zm;
        *klen = zipmapDecodeLength(zm);
        *key += ZIPMAP_LEN_BYTES(*klen);
    }
    zm += zipmapRawKeyLength(zm);
    if (value) {
        *value = zm+1;
        *vlen = zipmapDecodeLength(zm);
        *value += ZIPMAP_LEN_BYTES(*vlen);
    }
    zm += zipmapRawValueLength(zm);
    return zm;
}
The method checks whether the key and value output pointers are NULL before filling them in. The +1 in *value = zm+1 skips the one byte occupied by the Free field.
The function above is used for traversal like this:
unsigned char *i = zipmapRewind(my_zipmap);
while ((i = zipmapNext(i,&key,&klen,&value,&vlen)) != NULL) {
    printf("%u bytes key at %p\n", klen, (void *)key);
    printf("%u bytes value at %p\n", vlen, (void *)value);
}
Get the number of elements
As described in the section on the basic structure, if the number of elements is less than 0xFE, the value stored in the first byte of the structure is the number of elements. If it is greater than or equal to 0xFE, the entire structure is traversed.
unsigned int zipmapLen(unsigned char *zm) {
    unsigned int len = 0;
    if (zm[0] < ZIPMAP_BIGLEN) {
        len = zm[0];
    } else {
        unsigned char *p = zipmapRewind(zm);
        while ((p = zipmapNext(p,NULL,NULL,NULL,NULL)) != NULL) len++;
        /* Re-store length if small enough */
        if (len < ZIPMAP_BIGLEN) zm[0] = len;
    }
    return len;
}
Redistributing Zipmap Space
In addition to reallocating the memory, the resize operation also writes the end marker into the last byte.
static inline unsigned char *zipmapResize(unsigned char *zm, unsigned int len) {
    zm = zrealloc(zm, len);
    zm[len-1] = ZIPMAP_END;
    return zm;
}
Delete an element
Before deleting an element, its start address has to be found.
unsigned char *zipmapDel(unsigned char *zm, unsigned char *key, unsigned int klen, int *deleted) {
    unsigned int zmlen, freelen;
    unsigned char *p = zipmapLookupRaw(zm,key,klen,&zmlen);
    if (p) {
If it is found, calculate the length of this element:
        freelen = zipmapRawEntryLength(p);
Then move everything after the element, except the end marker 0xFF, forward to the start address of the element:
        memmove(p, p+freelen, zmlen-((p-zm)+freelen+1));
Then reallocate the zipmap to release the freed space. The zipmapResize method also takes care of writing the end marker.
        zm = zipmapResize(zm, zmlen-freelen);
Then check whether the element count stored in the head is below 0xFE. If it is, decrease it by 1.
        /* Decrease zipmap length */
        if (zm[0] < ZIPMAP_BIGLEN) zm[0]--;
        if (deleted) *deleted = 1;
    } else {
        if (deleted) *deleted = 0;
    }
    return zm;
}
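The memmove arithmetic is the delicate part of deletion, so here is a small sketch that applies it to a caller-owned buffer. It is a simplification under stated assumptions: sketch_remove_entry is a hypothetical name, the element offset and length are passed in rather than looked up, and nothing is reallocated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Removes the element of entry_len bytes that starts at offset from a
 * zipmap of zmlen bytes, and returns the new total length. */
static size_t sketch_remove_entry(unsigned char *zm, size_t zmlen,
                                  size_t offset, size_t entry_len) {
    assert(offset >= 1 && offset + entry_len + 1 <= zmlen);
    /* Bytes that follow the element, not counting the end marker. */
    size_t tail = zmlen - (offset + entry_len + 1);
    memmove(zm + offset, zm + offset + entry_len, tail);
    zmlen -= entry_len;
    zm[zmlen - 1] = 0xFF;               /* rewrite the end marker */
    if (zm[0] < 254) zm[0]--;           /* head holds a real count */
    return zmlen;
}
```

Removing the 9-byte "foo" => "bar" element at offset 1 from a 24-byte zipmap that also holds "hello" => "world" moves 13 bytes forward and leaves a 15-byte zipmap with a head of 1.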
Adding and Modifying Elements
If the key passed to zipmapSet is already in the zipmap, the value for that key is modified. If it is not in the zipmap, a new element is added.
First, the minimum space needed to store the string pair is calculated from the key and value lengths:
unsigned char *zipmapSet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char *val, unsigned int vlen, int *update) {
    unsigned int zmlen, offset;
    unsigned int freelen, reqlen = zipmapRequiredLength(klen,vlen);
    unsigned int empty, vempty;
    unsigned char *p;
Then check whether the key is in the zipmap, and calculate the total length of the zipmap structure at the same time:
    freelen = reqlen;
    if (update) *update = 0;
    p = zipmapLookupRaw(zm,key,klen,&zmlen);
If the key does not exist, an element has to be added. The zipmap is reallocated with more space, and the pointer for the new element is set to the position of the old end marker, so the element is appended at the end. The element count is also increased.
    if (p == NULL) {
        /* Key not found: enlarge */
        zm = zipmapResize(zm, zmlen+reqlen);
        p = zm+zmlen-1;
        zmlen = zmlen+reqlen;
        /* Increase zipmap length (this is an insert) */
        if (zm[0] < ZIPMAP_BIGLEN) zm[0]++;
    }
If the key exists, the value is updated. In this case the total length of the existing element is calculated. If it is shorter than the minimum length required after the modification, the zipmap is reallocated and the content after the original element is shifted backwards.
    else {
        /* Key found. Is there enough space for the new value? */
        /* Compute the total length: */
        if (update) *update = 1;
        freelen = zipmapRawEntryLength(p);
        if (freelen < reqlen) {
            /* Store the offset of this key within the current zipmap, so
             * it can be resized. Then, move the tail backwards so this
             * pair fits at the current position. */
            offset = p-zm;
            zm = zipmapResize(zm, zmlen-freelen+reqlen);
            p = zm+offset;
            /* The +1 in the number of bytes to be moved is caused by the
             * end-of-zipmap byte. Note: the *original* zmlen is used. */
            memmove(p+reqlen, p+freelen, zmlen-(offset+freelen+1));
            zmlen = zmlen-freelen+reqlen;
            freelen = reqlen;
        }
    }
If the total length of the current element is longer than the minimum length required after the modification, the value string has become shorter. The leftover space is then compared with ZIPMAP_VALUE_MAX_FREE, and if it is greater than or equal to that threshold, the zipmap is shrunk.
    empty = freelen-reqlen;
    if (empty >= ZIPMAP_VALUE_MAX_FREE) {
        /* First, move the tail <empty> bytes to the front, then resize
         * the zipmap to be <empty> bytes smaller. */
        offset = p-zm;
        memmove(p+reqlen, p+freelen, zmlen-(offset+freelen+1));
        zmlen -= empty;
        zm = zipmapResize(zm, zmlen);
        p = zm+offset;
        vempty = 0;
    } else {
        vempty = empty;
    }
If the leftover space is small, no memory is reallocated. This is where Free and FreeData in the structure described earlier come from. It also shows that the bytes in FreeData are undefined: they are simply part of the previous content.
Finally, the key, the value and the Free field are written into their places:
    /* Just write the key + value and we are done. */
    /* Key: */
    p += zipmapEncodeLength(p,klen);
    memcpy(p,key,klen);
    p += klen;
    /* Value: */
    p += zipmapEncodeLength(p,vlen);
    *p++ = vempty;
    memcpy(p,val,vlen);
    return zm;
}
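The decision between keeping the leftover bytes and compacting the zipmap can be isolated into a few lines. This is only a sketch of that one decision: sketch_free_after_update is a hypothetical helper that does not exist in Redis, and it assumes the existing element is at least as long as the new one.

```c
#include <assert.h>

#define SKETCH_MAX_FREE 4   /* same value as ZIPMAP_VALUE_MAX_FREE */

/* Given the current element length and the length the new key/value pair
 * needs, returns the value to store in the Free field. *shrink receives
 * the number of bytes to cut from the zipmap (0 if nothing is cut). */
static unsigned char sketch_free_after_update(unsigned int old_len,
                                              unsigned int new_len,
                                              unsigned int *shrink) {
    assert(old_len >= new_len);
    unsigned int empty = old_len - new_len;
    if (empty >= SKETCH_MAX_FREE) {
        *shrink = empty;        /* too much slack: compact the zipmap */
        return 0;
    }
    *shrink = 0;                /* keep the slack and record it in Free */
    return (unsigned char)empty;
}
```

For example, the element "hello" => "world" is 13 bytes long. Setting the value to "wo" needs 10 bytes, so 3 bytes stay behind as FreeData. Setting it to "w" needs 9 bytes, which leaves 4 bytes and triggers compaction.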