Redis 3.0 in Depth (1)

Source: Internet
Author: User

On April 1, Redis 3.0 stable was officially released. It introduces the long-awaited cluster mode, along with several optimizations. This article compares 3.0 with 2.8.19 at the source level and explains those optimizations in detail. Because my ability and time are limited, I only compare the parts of the source I have read, and cluster-related content is not covered.

1. Embedded String

The embedded string reduces the memory reads caused by cache misses, raises the cache hit rate and, in some scenarios, speeds things up substantially.

1) 2.8
In Redis, all objects, including keys and values, are represented by robj:

    typedef struct redisObject {
        unsigned type:4;
        unsigned encoding:4;
        unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
        int refcount;
        void *ptr;
    } robj;

The type field identifies the object type, and the generic pointer ptr points to the concrete object. For a string, ptr points directly to the memory that holds the string. Because of this pointer indirection, the robj and the string generally do not sit in contiguous memory, so reading the string takes two memory accesses.

2) 3.0
In Redis 3.0, if the string length is at most 39 bytes, an embedded string is used: the robj and the string are allocated in one contiguous block of memory. Thanks to locality, the robj and the string contents are brought into the cache together, so a single memory read is enough.

    // <MM>
    // Allocate one block of memory that holds the robj, the SDS header, the string and the trailing '\0'
    // </MM>
    robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr)+len+1);
    struct sdshdr *sh = (void*)(o+1);

    o->type = REDIS_STRING;
    o->encoding = REDIS_ENCODING_EMBSTR;
    o->ptr = sh+1;
    o->refcount = 1;
    o->lru = LRU_CLOCK();

    sh->len = len;
    sh->free = 0;
    if (ptr) {
        // <MM>
        // Copy the string contents
        // </MM>
        memcpy(sh->buf,ptr,len);
        sh->buf[len] = '\0';
    } else {
        memset(sh->buf,0,len+1);
    }
    return o;

The limit is 39 because Redis uses jemalloc, which serves this allocation from a 64-byte block. The robj (16 bytes), the SDS header (8 bytes) and the terminating '\0' (1 byte) occupy 25 bytes, which leaves 64 - 25 = 39 bytes for the string itself.
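The arithmetic behind the limit can be checked with a short sketch. The struct definitions below are simplified copies for illustration only (the names robj_sketch, sdshdr_sketch and embstr_limit are hypothetical), and the resulting sizes assume a typical 64-bit build; the 64-byte block size is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define LRU_BITS 24       /* same width as REDIS_LRU_BITS in the article */
#define BLOCK_SIZE 64     /* allocation block size assumed in the text */

/* Simplified stand-in for robj: 4 + 4 + 24 bits of flags, a refcount, a pointer. */
typedef struct robj_sketch {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;
    int refcount;
    void *ptr;
} robj_sketch;

/* Simplified stand-in for the SDS header: length, free space, then the bytes. */
struct sdshdr_sketch {
    unsigned int len;
    unsigned int free;
    char buf[];
};

/* Bytes left for string data once both headers and the trailing '\0'
 * have been placed in a single block. */
size_t embstr_limit(void) {
    size_t overhead = sizeof(robj_sketch) + sizeof(struct sdshdr_sketch) + 1;
    assert(overhead < BLOCK_SIZE);   /* guards against size_t underflow on unusual ABIs */
    return BLOCK_SIZE - overhead;
}
```

On such a build the overhead is 16 + 8 + 1 = 25 bytes, so embstr_limit() evaluates to 39.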
All keys in Redis are strings, so this optimization greatly increases the cache hit rate of Redis. In practice, you can keep keys within 39 bytes to make full use of the cache and improve performance.

2. AOF Rewrite

In the final step of a rewrite, the Redis main process has to append the AOF diff accumulated during the rewrite to the AOF file. This is a relatively heavy disk I/O operation that blocks the event loop, increases latency and causes service jitter.

1) 2.8
The rewrite process:
- The main process forks a child process. The child performs the rewrite while the main process keeps serving requests. At the same time, the main process initializes an AOF rewrite buffer that collects the AOF diff produced during the rewrite.
- After the child finishes the rewrite, the main process waits for the child and reaps it. At this point the dataset as it was when the rewrite started has been written to a new AOF file, and the main process has to append the AOF rewrite buffer to the end of that file.
Because the rewrite takes a long time, the accumulated AOF rewrite buffer can be large. When the main process appends it, the disk writes block the event loop; Redis cannot serve requests during that time, which affects the business.
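To make the cost of that final step concrete, here is a minimal sketch of a 2.8-style rewrite buffer. It is an illustration under stated assumptions, not the Redis implementation: the names rewrite_buf, rewrite_buf_append and rewrite_buf_flush are hypothetical, and the buffer is a single growable array for simplicity. Every write command received during the rewrite is appended in memory, and the whole buffer is written to the file in one blocking call at the end.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory buffer that collects the AOF diff during a rewrite. */
typedef struct rewrite_buf {
    char *data;
    size_t len;
    size_t cap;
} rewrite_buf;

/* Append one serialized command; returns 0 on success, -1 on allocation failure. */
int rewrite_buf_append(rewrite_buf *b, const char *cmd, size_t n) {
    assert(b != NULL && cmd != NULL);
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 64;
        while (cap < b->len + n) cap *= 2;
        char *p = realloc(b->data, cap);
        if (p == NULL) return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, cmd, n);
    b->len += n;
    return 0;
}

/* Write the whole buffer in one go and reset it. The caller is blocked for
 * the duration of the write. Returns the number of bytes written. */
size_t rewrite_buf_flush(rewrite_buf *b, FILE *aof) {
    assert(b != NULL && aof != NULL);
    size_t written = 0;
    if (b->len > 0) written = fwrite(b->data, 1, b->len, aof);
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
    return written;
}
```

The time spent in rewrite_buf_flush grows with the amount of diff accumulated, which is the latency spike described above.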
2) 3.0
A pipe is established between the parent and child processes for communication. While the child performs the rewrite, the parent keeps sending the AOF diff to it through the pipe, and the child keeps collecting it in its own AOF rewrite buffer. When the child finishes the rewrite, it notifies the parent to stop sending the AOF diff, and then appends the buffer it has collected to the end of the rewritten AOF file.
After the parent has reaped the child, it appends the remaining rewrite buffer to the AOF file (this remainder is relatively small).
Improvement points:
- Most disk operations are done by the child process; the parent only performs disk writes for a small amount of data
- Sending the AOF rewrite buffer is spread across the processing of individual commands, which reduces latency and avoids large jitter

3. Improvement of LRU approximate algorithm

If maxmemory is configured, then during each command, if memory usage exceeds maxmemory, Redis evicts some keys according to the LRU algorithm to free memory. The LRU algorithm used by Redis is approximate: it does not maintain an LRU list that records the exact access order.
In robj, the lru field holds the time at which the object was last accessed. The global variable redisServer.lruclock holds the current LRU clock and is refreshed continuously in serverCron (which runs server.hz times per second, 10 by default). When an object is created or accessed, its lru field is set from lruclock.

    #define REDIS_LRU_BITS 24
    typedef struct redisObject {
        unsigned type:4;
        unsigned encoding:4;
        unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
        int refcount;
        void *ptr;
    } robj;

The lru field occupies 24 bits, so its maximum value is 2^24-1, in units of seconds. The LRU clock therefore covers about half a year (2^24/86400/365 is about 0.53). If a key is not accessed for half a year, the clock wraps around to 0 and the key's idle time is misjudged, so the key may escape eviction.
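A small sketch makes the range and the wrap-around concrete. It assumes a 24-bit clock that ticks once per second, as described above. The function names lru_range_days and idle_ticks are hypothetical, and the wrap-around handling is one common way to take the difference of two values of a wrapping counter, not a quotation of the Redis source.

```c
#include <assert.h>

#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1u << LRU_BITS) - 1)   /* largest value the lru field can hold */

/* Whole days covered by the clock before it wraps, at one tick per second. */
unsigned lru_range_days(void) {
    return LRU_CLOCK_MAX / 86400u;
}

/* Ticks elapsed between an object's stamp and the current clock.
 * If the clock is numerically behind the stamp, assume it wrapped once. */
unsigned idle_ticks(unsigned clock_now, unsigned obj_lru) {
    assert(clock_now <= LRU_CLOCK_MAX && obj_lru <= LRU_CLOCK_MAX);
    if (clock_now >= obj_lru) return clock_now - obj_lru;
    return (LRU_CLOCK_MAX - obj_lru) + clock_now;
}
```

lru_range_days() gives 194 days, in line with the half-year figure. A key that has been idle for a full cycle cannot be told apart from one that was just touched, because both produce idle_ticks(t, t) == 0.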
2) 2.8
Calculation of lruclock:

    server.lruclock = (server.unixtime/REDIS_LRU_CLOCK_RESOLUTION) &
                      REDIS_LRU_CLOCK_MAX;

REDIS_LRU_CLOCK_RESOLUTION is the resolution of the LRU clock, expressed in seconds.
Eviction is carried out in the freeMemoryIfNeeded function.

    for (k = 0; k < server.maxmemory_samples; k++) {
        sds thiskey;
        long thisval;
        robj *o;

        // <MM>
        // Pick a random key-value pair
        // </MM>
        de = dictGetRandomKey(dict);
        thiskey = dictGetKey(de);
        /* When policy is volatile-lru we need an additional lookup
         * to locate the real key, as dict is set to db->expires. */
        if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
            de = dictFind(db->dict, thiskey);
        o = dictGetVal(de);
        // <MM>
        // Get its idle time from the lru field
        // </MM>
        thisval = estimateObjectIdleTime(o);

        // <MM>
        // Keep the key that has gone unaccessed the longest
        // </MM>
        /* Higher idle time is better candidate for deletion */
        if (bestkey == NULL || thisval > bestval) {
            bestkey = thiskey;
            bestval = thisval;
        }
    }

The eviction logic is simple: randomly sample maxmemory_samples objects and evict the one with the oldest lru value, that is, the longest idle time. maxmemory_samples is configurable and defaults to 3.
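The sampling loop can be reduced to a self-contained sketch. Assumptions: the keyspace is modelled as a plain array of idle times, the random source is a small linear congruential generator so that runs are reproducible, and the names next_rand and pick_victim are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Tiny LCG used only to make the sketch deterministic; not a good general-purpose RNG. */
unsigned next_rand(unsigned *state) {
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Sample `samples` random slots (with replacement) and return the index
 * of the sampled slot with the largest idle time. */
size_t pick_victim(const unsigned long *idle, size_t n, int samples, unsigned *rng) {
    assert(idle != NULL && n > 0 && samples > 0);
    size_t best = 0;
    int have_best = 0;
    for (int k = 0; k < samples; k++) {
        size_t i = next_rand(rng) % n;
        if (!have_best || idle[i] > idle[best]) {
            best = i;
            have_best = 1;
        }
    }
    return best;
}
```

With only a few samples, the key with the longest idle time in the whole keyspace is not guaranteed to be among those inspected, which is where the approximation error comes from.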
3) 3.0
Calculation of lruclock:

    (mstime()/REDIS_LRU_CLOCK_RESOLUTION) & REDIS_LRU_CLOCK_MAX;

REDIS_LRU_CLOCK_RESOLUTION is 1000 and is now expressed in milliseconds: mstime() returns milliseconds, so dividing by 1000 still yields a clock that advances once per second.
To improve the accuracy of the approximate LRU algorithm, a field eviction_pool is added to redisDb. It holds a pool of candidate keys for eviction.

    /* Redis database representation. There are multiple databases identified
     * by integers from 0 (the default database) up to the max configured
     * database. The database number is the 'id' field in the structure. */
    typedef struct redisDb {
        dict *dict;                 /* The keyspace for this DB */
        dict *expires;              /* Timeout of keys with a timeout set */
        dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP) */
        dict *ready_keys;           /* Blocked keys that received a PUSH */
        dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
        struct evictionPoolEntry *eviction_pool;    /* Eviction pool of keys */
        int id;                     /* Database ID */
        long long avg_ttl;          /* Average TTL, just for stats */
    } redisDb;

Each eviction_pool entry has the following structure, holding a key and its idle time:

    #define REDIS_EVICTION_POOL_SIZE 16
    struct evictionPoolEntry {
        unsigned long long idle;    /* Object idle time. */
        sds key;                    /* Key name. */
    };

The eviction_pool is an array of length 16, sorted by idle time in ascending order. The eviction logic is again in the freeMemoryIfNeeded function:

    struct evictionPoolEntry *pool = db->eviction_pool;

    while(bestkey == NULL) {
        // <MM>
        // Fill eviction_pool: 16 randomly selected keys the first time;
        // on later calls only one key needs to be filled in
        // </MM>
        evictionPoolPopulate(dict, db->dict, db->eviction_pool);
        /* Go backward from best to worst element to evict. */
        for (k = REDIS_EVICTION_POOL_SIZE-1; k >= 0; k--) {
            if (pool[k].key == NULL) continue;
            de = dictFind(dict,pool[k].key);

            /* Remove the entry from the pool. */
            sdsfree(pool[k].key);
            /* Shift all elements on its right to left. */
            memmove(pool+k,pool+k+1,
                sizeof(pool[0])*(REDIS_EVICTION_POOL_SIZE-k-1));
            /* Clear the element on the right which is empty
             * since we shifted one position to the left. */
            pool[REDIS_EVICTION_POOL_SIZE-1].key = NULL;
            pool[REDIS_EVICTION_POOL_SIZE-1].idle = 0;

            /* If the key exists, is our pick. Otherwise it is
             * a ghost and we need to try the next element. */
            if (de) {
                bestkey = dictGetKey(de);
                break;
            } else {
                /* Ghost... */
                continue;
            }
        }
    }

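evictionPoolPopulate is called at the top of this loop but is not shown. The following is a minimal sketch of the sorted insertion such a pool needs, under simplifying assumptions: keys are borrowed const char * pointers rather than sds strings, the pool size is reduced to 4 to keep the example short, and the names pool_entry and pool_insert are hypothetical. It is not the Redis implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4   /* REDIS_EVICTION_POOL_SIZE is 16 in the article */

typedef struct pool_entry {
    unsigned long long idle;   /* idle time of the candidate */
    const char *key;           /* NULL marks an empty slot */
} pool_entry;

/* Insert a candidate, keeping entries sorted by ascending idle time with
 * empty slots on the right. Returns 1 if inserted, 0 if the pool is full
 * and the candidate's idle time does not exceed the smallest one stored. */
int pool_insert(pool_entry *pool, const char *key, unsigned long long idle) {
    assert(pool != NULL && key != NULL);
    int k = 0;
    /* Find the first slot that is empty or holds an idle time >= ours. */
    while (k < POOL_SIZE && pool[k].key != NULL && pool[k].idle < idle) k++;

    if (pool[POOL_SIZE - 1].key == NULL) {
        /* Free space on the right: shift [k, end) right by one slot. */
        if (k < POOL_SIZE - 1)
            memmove(pool + k + 1, pool + k, sizeof(pool[0]) * (POOL_SIZE - k - 1));
    } else {
        /* Pool is full: a candidate that beats nothing is discarded. */
        if (k == 0) return 0;
        /* Drop the smallest entry (slot 0) and shift [1, k) left by one slot. */
        k--;
        memmove(pool, pool + 1, sizeof(pool[0]) * k);
    }
    pool[k].key = key;
    pool[k].idle = idle;
    return 1;
}
```

Since the pool is a field of the database structure, candidates found by earlier samples stay available, and the eviction loop simply takes the rightmost non-empty entry.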
When the eviction_pool is filled, 16 keys are selected at random and added to the pool by insertion sort. After filling, the last element of the pool (the one with the largest idle time) is chosen for eviction.
Improvement points:
- The LRU clock is computed from the millisecond timer mstime() (with REDIS_LRU_CLOCK_RESOLUTION set to 1000 it still advances once per second)
- Avoids iterating over objects multiple times on every eviction

4. INCR Command

1) 2.8
To save memory, Redis stores a string that can be represented as an integer as a long (only 8 bytes). Redis also keeps a pool of shared integer constants: integers in [0, 10000) reference the objects in the pool directly.

    oldvalue = value;
    if ((incr < 0 && oldvalue < 0 && incr < (LLONG_MIN-oldvalue)) ||
        (incr > 0 && oldvalue > 0 && incr > (LLONG_MAX-oldvalue))) {
        addReplyError(c,"increment or decrement would overflow");
        return;
    }
    // value holds the original value; add the increment
    value += incr;
    // Create a new string robj from value. If the shared integer pool is hit,
    // no new object is created; one is created only for values outside the pool.
    new = createStringObjectFromLongLong(value);
    // A hash lookup is needed to add the new object or overwrite the original one
    if (o)
        dbOverwrite(c->db,c->argv[1],new);
    else
        dbAdd(c->db,c->argv[1],new);

Every INCR call therefore incurs a hash lookup. In addition, for values of 10000 or more, a new robj has to be created.

2) 3.0
3.0 optimizes the case where the value is 10000 or more and does not hit the shared integer pool, avoiding both the hash lookup and the object creation.

    // Compute the new value
    value += incr;

    if (o && o->refcount == 1 && o->encoding == REDIS_ENCODING_INT &&
        (value < 0 || value >= REDIS_SHARED_INTEGERS) &&
        value >= LONG_MIN && value <= LONG_MAX)
    {
        // If the object's encoding is REDIS_ENCODING_INT, the value is outside
        // the shared integer pool, and the reference count is exactly 1,
        // change the object's value in place
        new = o;
        o->ptr = (void*)((long)value);
    } else {
        // Shared pool hit, or the object is referenced elsewhere: proceed as before
        new = createStringObjectFromLongLong(value);
        if (o) {
            dbOverwrite(c->db,c->argv[1],new);
        } else {
            dbAdd(c->db,c->argv[1],new);
        }
    }

When the value does not hit the shared integer pool and the reference count is 1, the object's value is modified in place: no hash lookup is needed and no new object is created. All other cases follow the original path.
The optimization applies only when the reference count is 1, to avoid inconsistencies with other logic that shares the object (for example, if the object is also used as a key, re-encoding it would make hash lookups fail).
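The decision described above can be modelled in a few lines. This is a sketch with hypothetical names (int_obj, incr_in_place, SHARED_INTEGERS, ENC_INT); it keeps only the fields the check depends on and leaves out the database update on the fallback path.

```c
#include <assert.h>
#include <limits.h>

#define SHARED_INTEGERS 10000   /* the shared pool covers 0 .. 9999 */
#define ENC_RAW 0
#define ENC_INT 1

typedef struct int_obj {
    int refcount;
    int encoding;
    void *ptr;       /* holds the integer itself when encoding == ENC_INT */
} int_obj;

/* Try to store `value` directly in `o`. Returns 1 if the object was updated
 * in place, 0 if the caller must create a new object and overwrite the key. */
int incr_in_place(int_obj *o, long long value) {
    /* Storing a long inside the pointer requires a pointer at least as wide. */
    assert(sizeof(void *) >= sizeof(long));
    if (o && o->refcount == 1 && o->encoding == ENC_INT &&
        (value < 0 || value >= SHARED_INTEGERS) &&
        value >= LONG_MIN && value <= LONG_MAX) {
        o->ptr = (void *)((long)value);
        return 1;
    }
    return 0;
}
```

A counter that has already grown past the shared pool and is referenced only by the database takes the fast path on every subsequent INCR; an object with a second reference, or a result that falls back into 0 .. 9999, takes the original path.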


