Redis Source Analysis (Intset)

Source: Internet
Author: User
Tags: redis

Source version: 4.0.1
Source location: intset.h (definition of the data structure); intset.c (creation, insertion, deletion, and so on)

1. Introduction to the integer set

Intset is one of Redis's in-memory data structures. Like the SDS, skiplist, dict, and adlist structures covered earlier, it is specific to Redis. It is used to implement the Redis set type when the set is small and every element is an integer. Its features:

The element type can only be an integer.
Elements use one of three encodings: int16_t, int32_t, or int64_t.
Elements are kept in sorted order and cannot be duplicated.
Like SDS, an intset occupies contiguous memory, just like an array.

2. Data structure definition

typedef struct intset {
    uint32_t encoding;  /* Encoding type: int16_t, int32_t, or int64_t */
    uint32_t length;    /* Number of elements; at most 2^32 - 1 */
    int8_t contents[];  /* Flexible array member holding the elements */
} intset;
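As a rough illustration of this layout, the sketch below mirrors the header in a stand-alone type and computes how many bytes a set occupies. The names demo_intset and demo_blob_len are hypothetical and are not part of Redis; the sketch assumes the encoding field holds the element width in bytes (2, 4, or 8), which matches the encoding values printed by the example program later in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the header shown above; not the Redis type itself. */
typedef struct demo_intset {
    uint32_t encoding;   /* element width in bytes: 2, 4, or 8 (assumption) */
    uint32_t length;     /* number of elements */
    int8_t contents[];   /* elements follow the header directly */
} demo_intset;

/* Total bytes occupied by a set holding `length` elements of `encoding` bytes each. */
static size_t demo_blob_len(uint32_t encoding, uint32_t length) {
    assert(encoding == 2 || encoding == 4 || encoding == 8);
    return sizeof(demo_intset) + (size_t)length * encoding;
}
```

With an 8-byte header, one int16_t element gives 10 bytes and three int32_t elements give 20 bytes, which agrees with the bloblen values in the example output.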
3. Create, insert (expand capacity), find (binary search), delete

The following example walks through the various operations on an intset:

(You need to include the intset.h header file in your server.c, and then replace the main function with the following code.)

int main(int argc, char **argv) {
    uint8_t ret;
    uint8_t success;
    int removed;  /* intsetRemove() reports its result through an int pointer */
    int64_t value;
    int16_t int16_a = 2 * 128;
    int16_t int16_b = 2 * 256;
    int32_t int32_c = 2 * 65536;

    printf("----------intset insert----------\n");
    intset *is = intsetNew();
    is = intsetAdd(is, int16_a, &success);
    if (success == 0) {
        printf("add int16_a fail\n");
    } else {
        printf("add int16_a success, ");
    }
    printf("is encoding:%u, length:%u, bloblen:%zu\n", is->encoding, intsetLen(is), intsetBlobLen(is));

    is = intsetAdd(is, int32_c, &success);
    if (success == 0) {
        printf("add int32_c fail\n");
    } else {
        printf("add int32_c success, ");
    }
    printf("is encoding:%u, length:%u, bloblen:%zu\n", is->encoding, intsetLen(is), intsetBlobLen(is));

    is = intsetAdd(is, int16_b, &success);
    if (success == 0) {
        printf("add int16_b fail\n");
    } else {
        printf("add int16_b success, ");
    }
    printf("is encoding:%u, length:%u, bloblen:%zu\n", is->encoding, intsetLen(is), intsetBlobLen(is));

    printf("----------intset find----------\n");
    ret = intsetFind(is, int16_b);
    if (ret == 1) {
        printf("int16_b is found\n");
    }

    printf("----------intset get----------\n");
    ret = intsetGet(is, 0, &value);
    if (ret != 0) {
        printf("int16_a get value is %lld\n", (long long)value);
    }

    printf("----------intset remove----------\n");
    is = intsetRemove(is, int16_b, &removed);
    if (removed == 1) {
        printf("int16_b is success remove\n");
    }
    printf("is encoding:%u, length:%u, bloblen:%zu\n", is->encoding, intsetLen(is), intsetBlobLen(is));

    zfree(is);
    return 0;
}

Output:

----------intset insert----------
add int16_a success, is encoding:2, length:1, bloblen:10
add int32_c success, is encoding:4, length:2, bloblen:16
add int16_b success, is encoding:4, length:3, bloblen:20
----------intset find----------
int16_b is found
----------intset get----------
int16_a get value is 256
----------intset remove----------
int16_b is success remove
is encoding:4, length:2, bloblen:16

3.1 Create

intset *is = intsetNew() creates an empty intset named is. The code is as follows:

/* Create an empty intset. */
intset *intsetNew(void) {
    intset *is = zmalloc(sizeof(intset));  /* Allocate space for the header */
    is->encoding = intrev32ifbe(INTSET_ENC_INT16);  /* A new intset defaults to 2-byte elements */
    is->length = 0;
    return is;
}
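
The header fields are always read and written through intrev32ifbe(). As far as I understand it, this macro leaves the value untouched on little-endian hosts and byte-swaps it on big-endian hosts, so the fields are stored little-endian in memory. The following is a minimal sketch of that idea, not the Redis implementation; all demo_ names are hypothetical, and the runtime endianness probe is a simplification chosen for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 when the host stores the most significant byte first. */
static int demo_host_is_big_endian(void) {
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t demo_bswap32(uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

/* Convert between host order and little-endian storage order.
 * The same call works in both directions because a swap is its own inverse. */
static uint32_t demo_rev32_if_be(uint32_t v) {
    return demo_host_is_big_endian() ? demo_bswap32(v) : v;
}

/* Self-check: converting twice must give back the original value. */
static void demo_check_roundtrip(uint32_t v) {
    assert(demo_rev32_if_be(demo_rev32_if_be(v)) == v);
}
```

This is why the length update is written as intrev32ifbe(intrev32ifbe(is->length)+1): the inner call converts the stored value to host order, and the outer call converts the incremented value back to storage order.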
3.2 Insert

Next we call intsetAdd() three times in a row to insert three values. Its code is as follows:

/* Insert an integer in the intset */
intset *intsetAdd(intset *is, int64_t value, uint8_t *success) {
    uint8_t valenc = _intsetValueEncoding(value);
    uint32_t pos;
    if (success) *success = 1;

    /* Upgrade encoding if necessary. If we need to upgrade, we know that
     * this value should be either appended (if > 0) or prepended (if < 0),
     * because it lies outside the range of existing values. */
    if (valenc > intrev32ifbe(is->encoding)) {
        /* This always succeeds, so we don't need to curry *success. */
        return intsetUpgradeAndAdd(is,value);
    } else {
        /* Abort if the value is already present in the set.
         * This call will populate "pos" with the right position to insert
         * the value when it cannot be found. */
        if (intsetSearch(is,value,&pos)) {
            if (success) *success = 0;
            return is;
        }
        is = intsetResize(is,intrev32ifbe(is->length)+1);
        if (pos < intrev32ifbe(is->length)) intsetMoveTail(is,pos,pos+1);
    }
    _intsetSet(is,pos,value);
    is->length = intrev32ifbe(intrev32ifbe(is->length)+1);
    return is;
}

The whole function works as follows:

uint8_t valenc = _intsetValueEncoding(value) determines the encoding that value requires, based on its magnitude, and stores it in valenc.

if (valenc > intrev32ifbe(is->encoding)): when valenc is larger than is->encoding, the current encoding is too small to hold value, so every element must be widened. intsetUpgradeAndAdd() performs this upgrade and inserts value.

Otherwise (valenc <= is->encoding), intsetSearch(is,value,&pos) looks for value. If it is found, success is set to 0 to indicate that the insertion failed because the element already exists. If it is not found, pos holds the position where the element should be inserted. intsetResize(is,intrev32ifbe(is->length)+1) grows is by one element, and intsetMoveTail(is,pos,pos+1) shifts the later elements back when pos is not at the end. Then _intsetSet(is,pos,value) writes the element into the intset, and is->length = intrev32ifbe(intrev32ifbe(is->length)+1) updates the length.
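
The first step, choosing an encoding from the magnitude of value, can be sketched as below. This is an illustrative stand-alone version rather than the Redis function: the demo_ names and ENC_ constants are hypothetical, and the constants are assumed to equal the element width in bytes, as the encoding values in the example output suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding constants: the element width in bytes. */
enum { ENC_INT16 = 2, ENC_INT32 = 4, ENC_INT64 = 8 };

/* Smallest encoding that can represent v. */
static uint8_t demo_value_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return ENC_INT32;
    return ENC_INT16;
}

/* Nonzero when inserting v would force the whole set to a wider encoding. */
static int demo_needs_upgrade(uint8_t current_encoding, int64_t v) {
    assert(current_encoding == ENC_INT16 || current_encoding == ENC_INT32 ||
           current_encoding == ENC_INT64);
    return demo_value_encoding(v) > current_encoding;
}
```

For the values in the example program, 256 and 512 fit in ENC_INT16, while 131072 needs ENC_INT32 and therefore triggers the upgrade path.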

After the first element, int16_a, is added, is looks as shown in the following illustration:

This corresponds to the output:

add int16_a success, is encoding:2, length:1, bloblen:10
Next, our code adds the second element, int32_c. Its value needs an encoding larger than INTSET_ENC_INT16, so the add operation calls intsetUpgradeAndAdd() to widen the encoding:

/* Upgrades the intset to a larger encoding and inserts the given integer. */
static intset *intsetUpgradeAndAdd(intset *is, int64_t value) {
    uint8_t curenc = intrev32ifbe(is->encoding);
    uint8_t newenc = _intsetValueEncoding(value);
    int length = intrev32ifbe(is->length);
    int prepend = value < 0 ? 1 : 0;

    /* First set new encoding and resize */
    is->encoding = intrev32ifbe(newenc);
    is = intsetResize(is,intrev32ifbe(is->length)+1);

    /* Upgrade back-to-front so we don't overwrite values.
     * Note that the "prepend" variable is used to make sure we have an empty
     * space at either the beginning or the end of the intset. */
    while(length--)
        _intsetSet(is,length+prepend,_intsetGetEncoded(is,length,curenc));

    /* Set the value at the beginning or the end. */
    if (prepend)
        _intsetSet(is,0,value);
    else
        _intsetSet(is,intrev32ifbe(is->length),value);
    is->length = intrev32ifbe(intrev32ifbe(is->length)+1);
    return is;
}

The function saves the current encoding of is in curenc and the encoding required by value in newenc. int prepend = value < 0 ? 1 : 0 determines where the new value goes. Because value does not fit in the current encoding, it is either larger than every existing element or smaller than all of them, so it is inserted either last or first. The function then updates the encoding and reallocates the space, moves every element to its new location, writes value into the first or the last slot according to prepend, and updates is->length.

There is a more vivid diagram below, taken from reference [1]:

/* Upgrade back-to-front so we don't overwrite values.
 * Note that the "prepend" variable is used to make sure we have an empty
 * space at either the beginning or the end of the intset. */
// Take each element out of the underlying array using the old encoding,
// then write it back into the set using the new encoding.
// When this step is done, all existing elements have been converted
// from the old encoding to the new one.
// Because the newly allocated space is at the back of the array,
// the program moves elements starting from the back.
// For example, suppose there are three elements in curenc encoding,
// laid out in the array like this:
// | x | y | z |
// After the array is reallocated it has grown (? marks unused memory):
// | x | y | z | ? |   ?   |   ?   |
// The program then starts at the back of the array and re-inserts the elements:
// | x | y | z | ? |   z   |   ?   |
// | x | y |   y   |   z   |   ?   |
// |   x   |   y   |   z   |   ?   |
// Finally, the program writes the new element into the last slot, marked ?:
// |   x   |   y   |   z   |  new  |
// The above shows the case where the new element is larger than all existing
// elements, that is, prepend == 0.
// When the new element is smaller than all existing elements (prepend == 1),
// the adjustment goes as follows:
// | x | y | z | ? |   ?   |   ?   |
// | x | y | z | ? |   ?   |   z   |
// | x | y | z | ? |   y   |   z   |
// | x | y |   x   |   y   |   z   |
// When the new value is added, the data in the original | x | y | is replaced by it:
// |  new  |   x   |   y   |   z   |
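
The back-to-front widening can be tried out on a plain buffer. The sketch below is a simplified stand-in, not the Redis code: it only handles the int16_t to int32_t case, demo_upgrade_and_add is a hypothetical name, and the caller is assumed to have already grown the buffer to (length + 1) * 4 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Widen `length` int16_t values stored at the start of `buf` into int32_t
 * slots in place, then add `value` at the front or the back.
 * Assumption: `buf` already has room for (length + 1) int32_t values. */
static void demo_upgrade_and_add(void *buf, uint32_t length, int32_t value) {
    /* The upgrade path is only taken for values outside the int16_t range. */
    assert(value < INT16_MIN || value > INT16_MAX);

    uint8_t *bytes = buf;
    uint32_t prepend = value < 0 ? 1 : 0;  /* negative: smaller than everything */
    uint32_t i = length;

    /* Walk from the last element to the first so that a widened write
     * never lands on a narrow element that has not been read yet. */
    while (i--) {
        int16_t narrow;
        int32_t wide;
        memcpy(&narrow, bytes + (size_t)i * sizeof(narrow), sizeof(narrow));
        wide = narrow;
        memcpy(bytes + (size_t)(i + prepend) * sizeof(wide), &wide, sizeof(wide));
    }

    /* The free slot is index 0 when prepending, otherwise the last index. */
    uint32_t slot = prepend ? 0 : length;
    memcpy(bytes + (size_t)slot * sizeof(value), &value, sizeof(value));
}
```

Going front-to-back instead would overwrite the second narrow element while writing the first widened one, which is the problem the source comment warns about.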

After the second element is inserted, is looks as shown in the following illustration:

The output looks like this:

add int32_c success, is encoding:4, length:2, bloblen:16
Next we insert the third element, int16_b. The current encoding is already large enough to hold int16_b, so the code takes the branch that calls the lookup function intsetSearch():

/* Search for the position of "value". Return 1 when the value was found and
 * sets "pos" to the position of the value within the intset. Return 0 when
 * the value is not present in the intset and sets "pos" to the position
 * where "value" can be inserted. */
static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos) {
    int min = 0, max = intrev32ifbe(is->length)-1, mid = -1;
    int64_t cur = -1;

    /* The value can never be found when the set is empty */
    if (intrev32ifbe(is->length) == 0) {
        if (pos) *pos = 0;
        return 0;
    } else {
        /* Check for the case where we know we cannot find the value,
         * but do know the insert position. */
        if (value > _intsetGet(is,intrev32ifbe(is->length)-1)) {
            if (pos) *pos = intrev32ifbe(is->length);
            return 0;
        } else if (value < _intsetGet(is,0)) {
            if (pos) *pos = 0;
            return 0;
        }
    }
    while(max >= min) {
        /* Addition binds tighter than the shift, so this is (min+max)/2 */
        mid = ((unsigned int)min + (unsigned int)max) >> 1;
        cur = _intsetGet(is,mid);
        if (value > cur) {
            min = mid+1;
        } else if (value < cur) {
            max = mid-1;
        } else {
            break;
        }
    }
    if (value == cur) {
        if (pos) *pos = mid;
        return 1;
    } else {
        if (pos) *pos = min;
        return 0;
    }
}

If is->length is 0, pos is set to 0 and the lookup fails. If value is larger than the maximum element or smaller than the minimum element, pos is set to length or 0 respectively, and the lookup fails. Otherwise a binary search looks for the element: on a hit, pos is the index of the element; on a miss, pos is the position where value should be inserted.
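
The same three cases can be exercised on an ordinary sorted array. The sketch below is a simplified illustration with a hypothetical name (demo_search); it works on int64_t values directly instead of decoding them from an intset, and it assumes pos is never NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Look for `value` in a sorted array without duplicates.
 * Returns 1 and stores the index in *pos when found; otherwise returns 0
 * and stores the index at which `value` would have to be inserted. */
static int demo_search(const int64_t *sorted, uint32_t length,
                       int64_t value, uint32_t *pos) {
    assert(pos != NULL);

    /* Empty set: nothing to find, insert at index 0. */
    if (length == 0) { *pos = 0; return 0; }

    /* Shortcuts for values outside the stored range. */
    if (value > sorted[length - 1]) { *pos = length; return 0; }
    if (value < sorted[0]) { *pos = 0; return 0; }

    /* Plain binary search; signed indices keep hi = mid - 1 well defined. */
    int64_t lo = 0, hi = (int64_t)length - 1;
    while (lo <= hi) {
        int64_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] < value) {
            lo = mid + 1;
        } else if (sorted[mid] > value) {
            hi = mid - 1;
        } else {
            *pos = (uint32_t)mid;
            return 1;
        }
    }
    *pos = (uint32_t)lo;  /* first index whose element is larger than value */
    return 0;
}
```

For the array {256, 512, 131072}, searching for 512 reports a hit at index 1, while searching for 300 reports a miss with insert position 1.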

When intsetSearch() returns without finding value, pos indicates where value should be inserted. The elements from pos onward then need to move back by one position, which is done by intsetMoveTail():

static void intsetMoveTail(intset *is, uint32_t from, uint32_t to) {
    void *src, *dst;
    uint32_t bytes = intrev32ifbe(is->length)-from;
    uint32_t encoding = intrev32ifbe(is->encoding);

    if (encoding == INTSET_ENC_INT64) {
        src = (int64_t*)is->contents+from;
        dst = (int64_t*)is->contents+to;
        bytes *= sizeof(int64_t);
    } else if (encoding == INTSET_ENC_INT32) {
        src = (int32_t*)is->contents+from;
        dst = (int32_t*)is->contents+to;
        bytes *= sizeof(int32_t);
    } else {
        src = (int16_t*)is->contents+from;
        dst = (int16_t*)is->contents+to;
        bytes *= sizeof(int16_t);
    }
    memmove(dst,src,bytes);
}

In effect, the whole tail of the array moves back by the size of one element. Note that memmove allows the src and dst memory regions to overlap.

Another vivid illustration, also from reference [1]:

/*
 * Move the array elements in the given index range forward or backward.
 *
 * The "MoveTail" in the function name is actually misleading:
 * this function can move elements forward or backward,
 * not just backward.
 *
 * When a new element is added to the array, elements must move backward.
 * Suppose the array looks like this (? marks a slot with no value set yet):
 * | x | y | z | ? |
 *     |<----->|
 * and the pos of the new element n is 1; the two elements y and z are moved:
 * | x | y | y | z |
 *         |<----->|
 * Then the new element n can be written at pos:
 * | x | n | y | z |
 *
 * When an element is deleted from the array, elements must move forward.
 * Suppose the array looks like this, and b is the element to delete:
 * | a | b | c | d |
 *         |<----->|
 * The program moves every element after b forward by one position,
 * overwriting b's data:
 * | a | c | d | d |
 *     |<----->|
 * Finally, the program trims one element's worth of space from the end of the array:
 * | a | c | d |
 * This completes the delete operation.
 *
 * T = O(N)
 */
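
Both directions from the comment above can be reproduced with memmove on a plain int32_t array. This is a minimal sketch with hypothetical helper names (demo_move_tail, demo_insert_at, demo_delete_at); unlike the real function it does not switch on the encoding, and it assumes the caller has already made room for one more element before inserting.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the elements in [from, length) so that they start at index `to`.
 * memmove is used because the source and destination ranges overlap. */
static void demo_move_tail(int32_t *a, uint32_t length, uint32_t from, uint32_t to) {
    assert(from <= length);
    memmove(a + to, a + from, (size_t)(length - from) * sizeof(a[0]));
}

/* Insert n at index pos. Assumption: `a` has capacity for length + 1 elements. */
static void demo_insert_at(int32_t *a, uint32_t length, uint32_t pos, int32_t n) {
    assert(pos <= length);
    if (pos < length) demo_move_tail(a, length, pos, pos + 1);  /* shift backward */
    a[pos] = n;
}

/* Delete the element at index pos and return the new length. */
static uint32_t demo_delete_at(int32_t *a, uint32_t length, uint32_t pos) {
    assert(pos < length);
    if (pos < length - 1) demo_move_tail(a, length, pos + 1, pos);  /* shift forward */
    return length - 1;
}
```

Inserting 15 at index 1 of {10, 20, 30} yields {10, 15, 20, 30}, and deleting index 1 afterwards restores {10, 20, 30}.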
 

After the third element is inserted, is looks as shown in the following illustration:

3.3 Find

The lookup logic was already covered in the insert operation above; it is simply a binary search.

3.4 Delete

/* Delete integer from intset */
intset *intsetRemove(intset *is, int64_t value, int *success) {
    uint8_t valenc = _intsetValueEncoding(value);
    uint32_t pos;
    if (success) *success = 0;

    if (valenc <= intrev32ifbe(is->encoding) && intsetSearch(is,value,&pos)) {
        uint32_t len = intrev32ifbe(is->length);

        /* We know we can delete */
        if (success) *success = 1;

        /* Overwrite value with tail and update length */
        if (pos < (len-1)) intsetMoveTail(is,pos+1,pos);
        is = intsetResize(is,len-1);
        is->length = intrev32ifbe(len-1);
    }
    return is;
}

The function first gets the encoding of value and sets success to 0. If the encoding of value is larger than the encoding of the set, or intsetSearch() does not find value, the deletion fails and success stays 0. Otherwise intsetSearch() has located the element at pos: success is set to 1, the elements from pos+1 onward are moved forward to pos, overwriting the deleted element, and then the memory is reallocated with one element fewer and the length is decremented.

4. Summary

This post analyzed the intset data structure and its basic operations. The data structure as a whole is relatively simple.
Personally, I find that widening the encoding only when a larger value arrives makes intset very memory friendly. However, it provides no corresponding downgrade operation: the encoding can keep widening but never narrows, which is not ideal.

References:
[1] Redis 3.0 annotated source code, Huang Jianhong
