Redis 3.0 in-depth explanation (1)


On April 1, Redis 3.0-stable was officially released. It introduces cluster mode and makes several optimizations at the same time. This article compares the 3.0 source with 2.8.19 and explains those optimizations in detail. Because of limited ability and time, I only compare the parts of the source I have read, and I do not cover cluster-related content.

1. Embedded String

This optimization reduces the memory reads caused by cache misses and improves the cache hit rate; in some scenarios it gives a large speedup.

1) 2.8

In Redis 2.8, all objects, including keys and values, are represented by robj:

typedef struct redisObject {
     unsigned type:4;
     unsigned encoding:4;
     unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
     int refcount;
     void *ptr;
} robj;
The type field specifies the object type, and the generic pointer ptr points to the actual data. For a string, ptr points directly to the memory that holds the string. Because of this pointer indirection, the robj and the string are usually not in contiguous memory, so reading the string takes two memory accesses.

2) 3.0

In Redis 3.0, if the string length is at most 39 bytes, an embedded string is used: the robj and the string are allocated in one contiguous block of memory. Because of the principle of locality, a read brings both the robj and the string contents into the cache, so a single memory read is enough.
robj *createEmbeddedStringObject(char *ptr, size_t len) {
    /* Allocate one block that holds the robj, the sds header,
     * the string and the trailing '\0'. */
    robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr)+len+1);
    struct sdshdr *sh = (void*)(o+1);

    o->type = REDIS_STRING;
    o->encoding = REDIS_ENCODING_EMBSTR;
    o->ptr = sh+1;
    o->refcount = 1;
    o->lru = LRU_CLOCK();

    sh->len = len;
    sh->free = 0;
    if (ptr) {
        /* Copy the string content. */
        memcpy(sh->buf,ptr,len);
        sh->buf[len] = '\0';
    } else {
        memset(sh->buf,0,len+1);
    }
    return o;
}
The length limit is 39 because Redis uses jemalloc, which hands out a 64-byte block for this allocation. The robj (16 bytes), the sds header (8 bytes) and the trailing '\0' (1 byte) take up 25 bytes, which leaves 64 - 25 = 39 bytes for the string.
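The arithmetic behind the limit can be checked with a small sketch. The two structs below are stand-ins written for this example (they are not the Redis definitions), and the sizes assume a typical 64-bit platform where a pointer is 8 bytes; embstr_limit is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for robj: 4 + 4 + 24 bits of bit-fields, an int and a pointer.
 * On a typical 64-bit platform this is 16 bytes. */
struct obj_sketch {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
};

/* Stand-in for the sds header: two unsigned ints followed by the
 * string bytes. The flexible array member adds nothing to sizeof. */
struct sds_hdr_sketch {
    unsigned int len;
    unsigned int free;
    char buf[];
};

/* Bytes left for string content when the object, the header and the
 * trailing '\0' must fit in one allocator block of block_size bytes. */
size_t embstr_limit(size_t block_size) {
    return block_size
         - sizeof(struct obj_sketch)
         - sizeof(struct sds_hdr_sketch)
         - 1; /* trailing '\0' */
}
```

Calling embstr_limit(64) under these assumptions gives the 39-byte limit discussed above; on a platform with different struct sizes the result would differ.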
All keys in Redis are strings, so this optimization significantly improves the cache hit rate. In practice, you can keep keys within 39 bytes to make full use of the cache and improve performance.

2. AOF Rewrite

In the final step of a rewrite, the Redis main process has to append the AOF diff accumulated during the rewrite to the AOF file. This is a heavy disk I/O operation that blocks the event loop, increases latency, and causes service jitter.

1) 2.8

Rewrite process:
- The main process forks a child process. The child performs the rewrite while the main process continues to serve requests. At the same time, the main process initializes an AOF rewrite buffer, which collects the AOF diff produced during the rewrite.
- After the child finishes the rewrite, the main process waits for the child and reaps it. At this point, the dataset as it was when the rewrite started has been written to a new AOF file, and the main process has to append the AOF rewrite buffer to the end of that file. Because a rewrite takes a long time, the accumulated buffer can be fairly large. The append done by the main process is a disk write that blocks the event loop; Redis cannot serve requests during that time, which affects the business.
2) 3.0

A pipe is established between the parent and child processes for communication. During the rewrite, the parent sends the AOF diff to the child through the pipe, and the child keeps collecting it into its own AOF rewrite buffer. When the child finishes the rewrite, it notifies the parent to stop sending the AOF diff. The child then appends the buffer it has collected to the end of the rewritten AOF file. After the parent reaps the child, it appends what is left of its own rewrite buffer to the AOF file (this remainder is comparatively small).

Improvement points:
- Most of the disk writes are done by the child process; the parent only has to write a small amount of data to disk
- Sending the AOF rewrite buffer is spread out over the handling of individual commands, which reduces latency and avoids large jitter

3. LRU Approximation Algorithm Improvement

If maxmemory is configured, then while processing each command Redis checks whether memory use exceeds maxmemory and, if so, evicts some keys according to the LRU algorithm to free memory. The LRU algorithm used in Redis is approximate: it does not maintain an LRU list that records the exact access order. Each robj has an lru field that records when the object was last accessed. The global variable redisServer.lruclock holds the current LRU clock, and it is updated periodically in serverCron (which by default runs 10 times per second). When an object is created or accessed, its lru field is set from lruclock.
#define REDIS_LRU_BITS 24
typedef struct redisObject {
     unsigned type:4;
     unsigned encoding:4;
     unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
     int refcount;
     void *ptr;
} robj;
The lru field occupies 24 bits, so its maximum value is 2^24-1, and its unit is seconds. The effective range of the LRU clock is therefore about 0.5 years (2^24/86400/365 ≈ 0.53). When a key has not been accessed for about half a year, the clock wraps around past its lru value; the key then looks recently used and may escape eviction.
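The wraparound can be made concrete with a short sketch. It assumes a 24-bit clock that ticks once per second, as described above; lru_wrap_days and lru_idle_ticks are names made up for this example, and the second function handles a single wrap of the clock and cannot detect more than one.

```c
#include <assert.h>

#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1u << LRU_BITS) - 1) /* largest value the field holds */

/* Whole days before a 24-bit clock counted in seconds wraps around. */
unsigned lru_wrap_days(void) {
    return (LRU_CLOCK_MAX + 1u) / 86400u;
}

/* Idle ticks between an object's stamp and the current clock value.
 * If the clock is behind the stamp, assume it wrapped exactly once. */
unsigned lru_idle_ticks(unsigned now, unsigned stamp) {
    if (now >= stamp) return now - stamp;
    return (LRU_CLOCK_MAX - stamp) + now;
}
```

A key stamped at tick 40 and left untouched for a full clock period plus 60 ticks is seen with the clock at 100 again, so lru_idle_ticks(100, 40) reports only 60 ticks of idle time. That is the missed-eviction case mentioned above.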
1) 2.8

The lruclock is computed as follows:
    server.lruclock = (server.unixtime/REDIS_LRU_CLOCK_RESOLUTION) &
                                                REDIS_LRU_CLOCK_MAX;
REDIS_LRU_CLOCK_RESOLUTION is the precision of the LRU clock and is expressed in seconds. The eviction logic is in the freeMemoryIfNeeded function:
                for (k = 0; k < server.maxmemory_samples; k++) {
                    sds thiskey;
                    long thisval;
                    robj *o;

                    /* Pick a random key-value pair. */
                    de = dictGetRandomKey(dict);
                    thiskey = dictGetKey(de);
                    /* When policy is volatile-lru we need an additional lookup
                     * to locate the real key, as dict is set to db->expires. */
                    if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
                        de = dictFind(db->dict, thiskey);
                    o = dictGetVal(de);
                    /* Get its idle time. */
                    thisval = estimateObjectIdleTime(o);

                    /* Select the key that has gone unaccessed the longest. */
                    /* Higher idle time is better candidate for deletion */
                    if (bestkey == NULL || thisval > bestval) {
                        bestkey = thiskey;
                        bestval = thisval;
                    }
                }
The eviction logic is fairly simple: randomly sample maxmemory_samples objects and evict the one with the smallest lru value, that is, the longest idle time. maxmemory_samples is configurable and defaults to 3.
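The effect of sampling can be illustrated with a stripped-down sketch. It is not the Redis code: keys are represented by plain array indices, the sample is passed in by the caller instead of being drawn at random, and pick_victim is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Among the sampled key indices, return the index of the key with the
 * longest idle time, or -1 if the sample is empty. */
int pick_victim(const long *idle, const int *sample, size_t nsample) {
    int best = -1;
    long bestval = 0;
    for (size_t i = 0; i < nsample; i++) {
        long thisval = idle[sample[i]];
        /* Higher idle time is a better candidate for deletion. */
        if (best == -1 || thisval > bestval) {
            best = sample[i];
            bestval = thisval;
        }
    }
    return best;
}
```

With idle times {5, 90, 12, 300, 7} and a sample of keys 0, 2 and 4, the function picks key 2 (idle 12), even though key 3 (idle 300) is the true least recently used key. With only 3 samples the result is an approximation, which is what the 3.0 change described next tries to improve.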
2) 3.0

The lruclock is computed as follows:
    (mstime()/REDIS_LRU_CLOCK_RESOLUTION) & REDIS_LRU_CLOCK_MAX;
REDIS_LRU_CLOCK_RESOLUTION is 1000 and is expressed in milliseconds: the clock is now derived from the millisecond timestamp returned by mstime(), and one clock tick corresponds to 1000 ms.
To improve the accuracy of the approximate LRU algorithm, a field eviction_pool is added to redisDb. It holds a pool of candidate keys for eviction.
/* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP) */
    dict *ready_keys;           /* Blocked keys that received a PUSH */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    struct evictionPoolEntry *eviction_pool;    /* Eviction pool of keys */
    int id;                     /* Database ID */
    long long avg_ttl;          /* Average TTL, just for stats */
} redisDb;
Each entry of eviction_pool has the following structure, containing a key and its idle time.
#define REDIS_EVICTION_POOL_SIZE 16
struct evictionPoolEntry {
    unsigned long long idle;    /* Object idle time. */
    sds key;                    /* Key name. */
};
The eviction_pool is an array of length 16, sorted by idle time in ascending order. The eviction logic is again in the freeMemoryIfNeeded function:
                struct evictionPoolEntry *pool = db->eviction_pool;

                while(bestkey == NULL) {
                    /* Fill eviction_pool: 16 randomly selected keys are
                     * filled in the first time; each later call fills in
                     * only one key. */
                    evictionPoolPopulate(dict, db->dict, db->eviction_pool);
                    /* Go backward from best to worst element to evict. */
                    for (k = REDIS_EVICTION_POOL_SIZE-1; k >= 0; k--) {
                        if (pool[k].key == NULL) continue;
                        de = dictFind(dict,pool[k].key);

                        /* Remove the entry from the pool. */
                        sdsfree(pool[k].key);
                        /* Shift all elements on its right to left. */
                        memmove(pool+k,pool+k+1,
                            sizeof(pool[0])*(REDIS_EVICTION_POOL_SIZE-k-1));
                        /* Clear the element on the right which is empty
                         * since we shifted one position to the left. */
                        pool[REDIS_EVICTION_POOL_SIZE-1].key = NULL;
                        pool[REDIS_EVICTION_POOL_SIZE-1].idle = 0;

                        /* If the key exists, is our pick. Otherwise it is
                         * a ghost and we need to try the next element. */
                        if (de) {
                            bestkey = dictGetKey(de);
                            break;
                        } else {
                            /* Ghost... */
                            continue;
                        }
                    }
                }
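The loop above relies on the pool being kept sorted by idle time, with empty slots at the right end. A minimal sketch of such a pool follows. It is not the Redis implementation: it uses plain C strings instead of sds, does no memory management, and pool_insert and pool_pop_best are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <string.h>

#define POOL_SIZE 16 /* mirrors REDIS_EVICTION_POOL_SIZE */

struct pool_entry {
    unsigned long long idle; /* object idle time */
    const char *key;         /* NULL marks an empty slot */
};

/* Insert a candidate, keeping entries sorted by ascending idle time and
 * packed to the left. When the pool is full, the entry with the smallest
 * idle time is dropped, unless the new candidate is smaller still. */
void pool_insert(struct pool_entry *pool, const char *key,
                 unsigned long long idle) {
    int k = 0;
    /* Find the first slot that is empty or holds an idle time >= ours. */
    while (k < POOL_SIZE && pool[k].key != NULL && pool[k].idle < idle) k++;
    if (pool[POOL_SIZE-1].key == NULL) {
        /* Free space on the right: shift entries from k one slot right. */
        memmove(pool+k+1, pool+k, sizeof(pool[0])*(POOL_SIZE-k-1));
    } else {
        if (k == 0) return; /* worse than every pooled candidate */
        k--;                /* drop pool[0], shift entries 1..k left */
        memmove(pool, pool+1, sizeof(pool[0])*k);
    }
    pool[k].key = key;
    pool[k].idle = idle;
}

/* Remove and return the best candidate (largest idle time), or NULL. */
const char *pool_pop_best(struct pool_entry *pool) {
    for (int k = POOL_SIZE-1; k >= 0; k--) {
        if (pool[k].key == NULL) continue;
        const char *key = pool[k].key;
        pool[k].key = NULL;
        pool[k].idle = 0;
        return key;
    }
    return NULL;
}
```

Because good candidates stay in the pool between calls, a key with a long idle time that was sampled earlier can still be chosen later, instead of being forgotten as in the 2.8 loop.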
When filling eviction_pool, 16 keys are selected at random and added to the pool in idle-time order, as in insertion sort. After filling, the last element of the pool (the one with the largest idle time) is chosen for eviction.

Improvement points:
- The LRU clock is derived from a millisecond timestamp, which is more accurate
- Candidates are kept in the pool between evictions, so the victim no longer has to be chosen from scratch with several sampling iterations each time

4. INCR command

1) 2.8

To save memory, Redis stores strings that can be represented as integers directly as a long (8 bytes). Redis also has a shared integer pool: integers in [0, 10000) refer directly to an object in that pool.
    oldvalue = value;
    if ((incr < 0 && oldvalue < 0 && incr < (LLONG_MIN-oldvalue)) ||
        (incr > 0 && oldvalue > 0 && incr > (LLONG_MAX-oldvalue))) {
        addReplyError(c,"increment or decrement would overflow");
        return;
    }
    /* value is the original value plus the increment */
    value += incr;
    /* Create a new string robj from value. If the shared integer pool is
     * hit, no new object is created; one is created only for values
     * outside the pool. */
    new = createStringObjectFromLongLong(value);
    /* A hash lookup is needed to add the new object or overwrite the existing one */
    if (o)
        dbOverwrite(c->db,c->argv[1],new);
    else
        dbAdd(c->db,c->argv[1],new);
Every INCR call performs a hash lookup. In addition, for values of 10000 or more a new robj has to be created.

2) 3.0

3.0 optimizes the case where the value is outside the shared integer pool (10000 or more, or negative), avoiding both the hash lookup and the object creation.
    /* Compute the new value */
    value += incr;

    if (o && o->refcount == 1 && o->encoding == REDIS_ENCODING_INT &&
        (value < 0 || value >= REDIS_SHARED_INTEGERS) &&
        value >= LONG_MIN && value <= LONG_MAX)
    {
        /* The object's encoding is REDIS_ENCODING_INT, the value is outside
         * the range of the shared integer pool, and the reference count
         * is 1: change the value of the object in place. */
        new = o;
        o->ptr = (void*)((long)value);
    } else {
        /* The shared pool is hit, or the object is not uniquely referenced:
         * proceed as before. */
        new = createStringObjectFromLongLong(value);
        if (o) {
            dbOverwrite(c->db,c->argv[1],new);
        } else {
            dbAdd(c->db,c->argv[1],new);
        }
    }
When the shared integer pool is not hit and the reference count is 1, the value of the object is modified in place, with no hash lookup and no new object. All other cases follow the original path. The optimization applies only when the reference count is 1 so that an object shared with other logic is never modified, which would cause inconsistencies (for example, if the object is also used as a key, the hash lookup would fail after it is re-encoded).
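The in-place condition can be tried out in isolation with a small sketch. The struct and constants below are simplified stand-ins, not the Redis definitions, and incr_in_place is a hypothetical helper that returns 1 when the in-place path applies and 0 when the caller would have to take the original path.

```c
#include <assert.h>
#include <limits.h>

#define SHARED_INTEGERS 10000 /* stand-in for REDIS_SHARED_INTEGERS */
enum { ENC_RAW = 0, ENC_INT = 1 };

/* Simplified stand-in for an integer-encoded string object:
 * the integer value is stored directly in the pointer field. */
struct int_obj {
    int encoding;
    int refcount;
    void *ptr;
};

/* Store value in place if the object is integer-encoded, uniquely
 * referenced, and the value is outside the shared pool and fits a long. */
int incr_in_place(struct int_obj *o, long long value) {
    if (o && o->refcount == 1 && o->encoding == ENC_INT &&
        (value < 0 || value >= SHARED_INTEGERS) &&
        value >= LONG_MIN && value <= LONG_MAX)
    {
        o->ptr = (void *)((long)value);
        return 1;
    }
    return 0; /* caller creates or looks up an object as before */
}
```

A value such as 10001 on a uniquely referenced object takes the in-place path, while 42 (inside the shared pool) or any value on an object with a reference count of 2 does not, and the stored value is left unchanged.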


