Redis is a key-value database that keeps its entire data set in memory; it does not keep some data on disk and some in memory. It is therefore important to estimate memory usage in advance and to save memory where possible. This article describes how to estimate memory capacity and how to save memory under the jemalloc allocator, based on the two most common data structures, string and zipmap.
Let's talk about jemalloc first. It is said to be FreeBSD's default malloc allocator, to offer arenas and a thread cache much like tcmalloc, and to have solved Firefox's memory problems. Redis introduced it in version 2.4, and antirez mentioned in his blog that it saved 30% of memory usage. By comparison, with glibc's malloc an additional 4 bytes must be attached to every allocated block to record its size. With jemalloc, the je_malloc_usable_size function returns the actual size of the memory a pointer refers to, so every key or value in Redis can be 4 bytes smaller.
The following are jemalloc's size classes. On the left is the range of sizes requested by the user, and on the right is the size actually allocated. This table will be used later.
1-4 size class: 4
5-8 size class: 8
9-16 size class: 16
17-32 size class: 32
33-48 size class: 48
49-64 size class: 64
65-80 size class: 80
81-96 size class: 96
97-112 size class: 112
113-128 size class: 128
129-192 size class: 192
193-256 size class: 256
257-320 size class: 320
321-384 size class: 384
385-448 size class: 448
449-512 size class: 512
513-768 size class: 768
769-1024 size class: 1024
1025-1280 size class: 1280
1281-1536 size class: 1536
1537-1792 size class: 1792
1793-2048 size class: 2048
2049-2304 size class: 2304
2305-2560 size class: 2560
String
The string type seems simple, but it has several optimizations. First, let's look at the data structures created by a simple SET command.
A set hello world command creates four objects: a dictEntry (12 bytes), an SDS that stores the key, a redisObject (12 bytes), and an SDS that stores the value string. Besides the string itself, an SDS object also contains an SDS header (8 bytes) and one extra byte that terminates the string, for a total overhead of 9 bytes.
sds.c
==========
sds sdsnewlen(const void *init, size_t initlen) {
    struct sdshdr *sh;

    sh = zmalloc(sizeof(struct sdshdr)+initlen+1);
    /* ... */
sds.h
========
struct sdshdr {
    int len;
    int free;
    char buf[];
};
According to the jemalloc size class table, the memory requested by this command is 16 (dictEntry) + 16 (redisObject) + 16 ("hello") + 16 ("world"), 64 bytes in total. Note: if the length of the key or value string plus 9 bytes exceeds 16 bytes, the memory actually allocated for it is 32 bytes.
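The arithmetic above can be reproduced with a small Python sketch. The helper names (size_class, sds_alloc, set_cost) are hypothetical, the table is copied from the size classes listed earlier, and the 12-byte dictEntry and redisObject sizes are the ones quoted in this article. It illustrates the estimate only and is not a model of Redis internals.

```python
from bisect import bisect_left

# jemalloc size classes copied from the table in this article (up to 2560 bytes).
SIZE_CLASSES = [4, 8, 16, 32, 48, 64, 80, 96, 112, 128, 192, 256, 320, 384,
                448, 512, 768, 1024, 1280, 1536, 1792, 2048, 2304, 2560]

SDS_OVERHEAD = 9   # 8-byte sdshdr (len + free) plus 1 terminating byte
DICT_ENTRY = 12    # dictEntry size quoted in the article
REDIS_OBJECT = 12  # redisObject size quoted in the article


def size_class(request):
    """Round a request up to the size class that is actually allocated."""
    if not 1 <= request <= SIZE_CLASSES[-1]:
        raise ValueError("request outside the range covered by the table")
    return SIZE_CLASSES[bisect_left(SIZE_CLASSES, request)]


def sds_alloc(text):
    """Bytes allocated for an SDS holding `text`."""
    return size_class(len(text.encode()) + SDS_OVERHEAD)


def set_cost(key, value):
    """Estimated bytes for one SET of a plain (non-numeric) string value."""
    return (size_class(DICT_ENTRY) + size_class(REDIS_OBJECT)
            + sds_alloc(key) + sds_alloc(value))


print(set_cost("hello", "world"))  # 16 + 16 + 16 + 16
print(sds_alloc("12345678"))       # 8 + 9 = 17 bytes requested, next class up
```

A 7-byte string is the longest that still fits the 16-byte class, since 7 + 9 = 16.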
Common string Optimization Methods
Try to make the value a pure number
The string is then converted to the int encoding, which reduces memory usage.
redis.c
==========
void setCommand(redisClient *c) {
    c->argv[2] = tryObjectEncoding(c->argv[2]);
    setGenericCommand(c,0,c->argv[1],c->argv[2],NULL);
}
object.c
========
    o->encoding = REDIS_ENCODING_INT;
    sdsfree(o->ptr);
    o->ptr = (void*) value;
We can see that the SDS is freed and the number is stored directly in the pointer field. Therefore, set hello 1111111 needs only 48 bytes of memory.
Adjust REDIS_SHARED_INTEGERS
If the value is a number smaller than the macro REDIS_SHARED_INTEGERS (10000 by default), the redisObject is saved as well, because a shared object created when the Redis server starts is used instead.
object.c
========
    if (server.maxmemory == 0 && value >= 0 && value < REDIS_SHARED_INTEGERS &&
        pthread_equal(pthread_self(),server.mainthread)) {
        decrRefCount(o);
        incrRefCount(shared.integers[value]);
        return shared.integers[value];
    }
So a set hello 111 requires only 32 bytes, because the redisObject is saved. For applications whose values are small numbers, raising the REDIS_SHARED_INTEGERS macro appropriately can therefore save memory effectively.
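To compare the three cases side by side, here is a hedged Python sketch of the per-key cost. The constants are the rounded allocation sizes used in this article (16 bytes each for the dictEntry, the redisObject, and a short key SDS). The names as_integer, value_cost, and key_cost are hypothetical. The sketch assumes a short key and a 32-bit signed integer range, and it ignores the other conditions checked in the real code (maxmemory being 0 and the main-thread check).

```python
REDIS_SHARED_INTEGERS = 10000  # default quoted in the article

DICT_ENTRY = 16    # 12-byte dictEntry rounded up to its size class
REDIS_OBJECT = 16  # 12-byte redisObject rounded up to its size class
SHORT_SDS = 16     # SDS whose string is at most 7 bytes (7 + 9 <= 16)


def as_integer(value):
    """Return the number if `value` is a canonical integer string, else None."""
    try:
        number = int(value)
    except ValueError:
        return None
    # Assumption: only canonical text ("12", not "012") in 32-bit range counts.
    if str(number) != value or not -2**31 <= number < 2**31:
        return None
    return number


def value_cost(value):
    """Estimated bytes spent on the value side of one key (simplified model)."""
    number = as_integer(value)
    if number is None:
        needed = len(value.encode()) + 9
        if needed > 32:
            raise ValueError("this sketch only covers values up to 23 bytes")
        return REDIS_OBJECT + (SHORT_SDS if needed <= 16 else 32)
    if 0 <= number < REDIS_SHARED_INTEGERS:
        return 0             # shared object: no new redisObject, no SDS
    return REDIS_OBJECT      # number stored in the pointer field, SDS freed


def key_cost(value):
    """dictEntry + key SDS (assumed short) + value-side cost."""
    return DICT_ENTRY + SHORT_SDS + value_cost(value)


for sample in ("world", "1111111", "111"):
    print(sample, key_cost(sample))
```

With the default threshold, the three samples correspond to the 64-, 48-, and 32-byte cases discussed above.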
Besides the key-value pairs themselves, the dict's bucket array also grows gradually and consumes memory. Each bucket element is a pointer (dictEntry **), and the number of buckets is the number of keys rounded up to a power of 2. For 10,000 keys, for example, 16384 buckets are required after rehash.
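The rounding rule can be sketched in Python as follows. The function names are hypothetical, the 4-byte pointer size matches the "number of buckets * 4" term used in this article's formulas, and the minimum table size of 4 is an assumption of this sketch. The extra table that exists temporarily while a rehash is in progress is ignored.

```python
DICT_INITIAL_SIZE = 4  # assumption: smallest bucket array
POINTER_SIZE = 4       # bytes per bucket pointer, as used in the article's formulas


def bucket_count(keys):
    """Smallest power of two that is >= the number of keys."""
    size = DICT_INITIAL_SIZE
    while size < keys:
        size *= 2
    return size


def bucket_bytes(keys):
    """Memory taken by the bucket array for `keys` keys."""
    return bucket_count(keys) * POINTER_SIZE


print(bucket_count(10000), bucket_bytes(10000))
```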
Now let's run a capacity estimation test for the string type. The script is as follows:
#!/bin/bash
redis-cli info | grep used_memory:
for ((start = 10000; start < 30000; start++))
do
    redis-cli set a$start baaaaaaaa$start > /dev/null
done
redis-cli info | grep used_memory:
Based on the summary above, we get the formula for strings.
Memory size of string type = number of key-value pairs * (dictEntry size + redisObject size + size of the SDS holding the key + size of the SDS holding the value) + number of buckets * 4
Below is our estimated value:
>>> 20000*(16 + 16 + 16 + 32) + 32768*4
1731072
Run the test script:
hoterran@~/Projects/redis-2.4.1$ bash redis-mem-test.sh
used_memory:564352
used_memory:2295424
Calculate the difference:
>>> 2295424-564352
1731072
Both are 1731072, which indicates that the estimation is very accurate. ^_^
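The same estimate can be derived from the actual keys and values of the test script rather than from hand-picked constants. The following Python sketch is only an illustration: size_class, bucket_count, and string_memory are hypothetical helpers, the size classes are copied from the table earlier in this article, and the 12-byte struct sizes and 4-byte bucket pointers are the values this article uses.

```python
from bisect import bisect_left

SIZE_CLASSES = [4, 8, 16, 32, 48, 64, 80, 96, 112, 128, 192, 256, 320, 384,
                448, 512, 768, 1024, 1280, 1536, 1792, 2048, 2304, 2560]


def size_class(request):
    """Round a request up to the size class from the article's table."""
    if not 1 <= request <= SIZE_CLASSES[-1]:
        raise ValueError("request outside the range covered by the table")
    return SIZE_CLASSES[bisect_left(SIZE_CLASSES, request)]


def bucket_count(keys):
    """Number of keys rounded up to a power of two (minimum 4 assumed)."""
    size = 4
    while size < keys:
        size *= 2
    return size


def string_memory(pairs):
    """Apply the string formula to a list of (key, value) pairs."""
    total = 0
    for key, value in pairs:
        total += size_class(12)                       # dictEntry
        total += size_class(12)                       # redisObject
        total += size_class(len(key.encode()) + 9)    # SDS holding the key
        total += size_class(len(value.encode()) + 9)  # SDS holding the value
    return total + bucket_count(len(pairs)) * 4


# Same keys and values as the test script: a10000 ... a29999.
pairs = [(f"a{n}", f"baaaaaaaa{n}") for n in range(10000, 30000)]
print(string_memory(pairs))
```

Each key is 6 bytes (16-byte SDS) and each value is 14 bytes (32-byte SDS), which is where the 16 + 16 + 16 + 32 in the estimate comes from.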
Zipmap
The effect of zipmap has been explained in another article: it can greatly reduce memory usage. For an ordinary subkey and value, only three additional bytes (keylen, valuelen, freelen) are required for storage. In addition, each hash key needs only two additional bytes (the zipmap header and tail) to store the number of subkeys and the terminator.
Zipmap memory size = number of hash keys * (dictEntry size + redisObject size + size of the SDS holding the key + total size of the subkeys) + number of buckets * 4
Now let's run the capacity estimation test. There are 100 hash keys, and each hash key contains 300 subkeys. Here the combined length of a subkey and its value is 5 bytes.
#!/bin/bash
redis-cli info | grep used_memory:
for ((start = 100; start < 200; start++))
do
    for ((start2 = 100; start2 < 400; start2++))
    do
        redis-cli hset test$start a$start2 "1" > /dev/null
    done
done
redis-cli info | grep used_memory:
The subkeys of one hash are stored together in a single allocation. Its size is 300*(5 + 3) + 2 = 2402 bytes, and according to the jemalloc size class table above, the memory actually allocated is 2560 bytes. In addition, 100 hash keys need 128 buckets. So the total estimated size is
>>> 100*(16 + 16 + 16 + 2560) + 128*4
261312
Run the above script:
hoterran@~/Projects/redis-2.4.1$ bash redis-mem-test-zipmap.sh
used_memory:555916
used_memory:817228
Calculate the difference:
>>> 817228-555916
261312
The two values are exactly the same, so the estimation is very accurate.
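The zipmap estimate can be rebuilt from the test data in the same way. This Python sketch is a hedged illustration: zipmap_bytes and hash_memory are hypothetical names, the 3 bytes per subkey and 2 bytes per zipmap are the overheads stated in this article, and the size classes, 12-byte struct sizes, and 4-byte bucket pointers are also taken from it.

```python
from bisect import bisect_left

SIZE_CLASSES = [4, 8, 16, 32, 48, 64, 80, 96, 112, 128, 192, 256, 320, 384,
                448, 512, 768, 1024, 1280, 1536, 1792, 2048, 2304, 2560]


def size_class(request):
    """Round a request up to the size class from the article's table."""
    if not 1 <= request <= SIZE_CLASSES[-1]:
        raise ValueError("request outside the range covered by the table")
    return SIZE_CLASSES[bisect_left(SIZE_CLASSES, request)]


def bucket_count(keys):
    """Number of keys rounded up to a power of two (minimum 4 assumed)."""
    size = 4
    while size < keys:
        size *= 2
    return size


def zipmap_bytes(fields):
    """Raw zipmap size: 3 bytes of overhead per subkey, 2 per zipmap."""
    return sum(len(k.encode()) + len(v.encode()) + 3 for k, v in fields) + 2


def hash_memory(hashes):
    """Apply the zipmap formula to {hash key: [(subkey, value), ...]}."""
    total = 0
    for name, fields in hashes.items():
        total += size_class(12)                      # dictEntry
        total += size_class(12)                      # redisObject
        total += size_class(len(name.encode()) + 9)  # SDS holding the hash key
        total += size_class(zipmap_bytes(fields))    # the zipmap itself
    return total + bucket_count(len(hashes)) * 4


# Same data as the test script: test100 ... test199, each with a100 ... a399.
fields = [(f"a{s}", "1") for s in range(100, 400)]
hashes = {f"test{h}": fields for h in range(100, 200)}
print(zipmap_bytes(fields), hash_memory(hashes))
```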
In addition, zipmap has a defect. The zmlen field that records the number of subkeys is only one byte long, so the count cannot be recorded once there are 254 or more subkeys, and the whole zipmap has to be traversed to obtain the number of subkeys. Nowadays hash_max_zipmap_entries is usually set to 1000, so HSET becomes very inefficient once a hash has more than 254 subkeys.
    if (zm[0] < ZIPMAP_BIGLEN) {
        len = zm[0]; /* smaller than 254: return the stored value directly */
    } else {
        unsigned char *p = zipmapRewind(zm); /* traverse the zipmap */
        while((p = zipmapNext(p,NULL,NULL,NULL,NULL)) != NULL) len++;

        /* Re-store length if small enough */
        if (len < ZIPMAP_BIGLEN) zm[0] = len;
    }
Making zmlen 2 bytes long (enough to record 65534 subkeys) would solve this problem. I chatted with antirez today: the change would break RDB compatibility, so the feature has been postponed to version 3.0. This defect may be one of the reasons why Weibo's Redis machines consume so much CPU.
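To get a feel for the cost, the length lookup can be modelled in a few lines of Python. This is a rough sketch under stated assumptions: ZIPMAP_BIGLEN is taken to be 254 as in the comment in the code above, the function names are hypothetical, and the model assumes that every HSET triggers one length lookup on the zipmap, which is an assumption of this example.

```python
ZIPMAP_BIGLEN = 254  # one-byte zmlen cannot hold counts of 254 or more


def stored_zmlen(subkeys):
    """Value kept in the one-byte zmlen field for a given number of subkeys."""
    return min(subkeys, ZIPMAP_BIGLEN)


def entries_scanned_for_len(subkeys):
    """Entries that must be walked to answer one length lookup."""
    if stored_zmlen(subkeys) < ZIPMAP_BIGLEN:
        return 0          # the stored byte is returned directly
    return subkeys        # the whole zipmap is traversed


def total_scanned(final_subkeys):
    """Entries walked while growing a hash one subkey at a time.

    Assumption: each insertion is followed by one length lookup.
    """
    return sum(entries_scanned_for_len(n) for n in range(1, final_subkeys + 1))


for size in (253, 254, 1000):
    print(size, entries_scanned_for_len(size), total_scanned(size))
```

Under this model, a hash that stays below 254 subkeys never pays for a traversal, while one that grows to 1000 subkeys walks several hundred thousand entries in total.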
Article Source: http://blog.nosqlfan.com/html/3430.html