Redis Source Code Analysis: The intset In-Memory Data Structure

Source: Internet
Uploaded by: User

Tags: redis, source code, storage, binary search

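This time I studied intset. At one point I could hardly keep reading, but I gritted my teeth and got through it. Once understood, there is not much to it; the lesson is to stay calm and not be impatient.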

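In pursuit of efficiency, Redis applies many optimizations to how it stores data. intset is a data structure the author customized to save memory, as is the compressed list that I plan to read next.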

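intset is an ordered set of integers that provides interfaces for adding, removing, and finding values. It stores elements as int16_t, int32_t, or int64_t and converts between these encodings (strictly speaking, it only ever promotes to a wider type).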

First, let's look at its structure definition:

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;

encoding: one of the following encodings

#define INTSET_ENC_INT16 (sizeof(int16_t))
#define INTSET_ENC_INT32 (sizeof(int32_t))
#define INTSET_ENC_INT64 (sizeof(int64_t))

Since these values are only 2, 4, or 8, a uint8_t would actually be enough to store the encoding.
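
The helper that picks an encoding for a value is not listed in this article. A minimal sketch of how such a choice could be made, assuming the encoding is simply the smallest signed width that can hold the value (value_encoding, ENC_* and check_value_encoding are illustrative names, not taken from the listings here):

```c
#include <assert.h>
#include <stdint.h>

/* Mirror the article's defines: each encoding is the element width in bytes. */
#define ENC_INT16 (sizeof(int16_t))
#define ENC_INT32 (sizeof(int32_t))
#define ENC_INT64 (sizeof(int64_t))

/* Hypothetical helper: smallest encoding able to hold v. */
static uint8_t value_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return ENC_INT32;
    return ENC_INT16;
}

/* Boundary checks: one step past a range forces the next wider encoding. */
static void check_value_encoding(void) {
    assert(value_encoding(INT16_MAX) == ENC_INT16);
    assert(value_encoding((int64_t)INT16_MAX + 1) == ENC_INT32);
    assert(value_encoding((int64_t)INT32_MIN - 1) == ENC_INT64);
}
```

Because the three encodings are ordered by width, comparing two encodings with > is enough to tell whether a value needs a wider representation than the set currently uses.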

length: the number of integers currently in the set

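contents[]: where the values are actually stored. The storage unit is a single byte, which makes it easy to address elements of the wider types.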
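
To make the byte-addressed storage concrete, here is a small sketch of reading element pos from such a buffer. It assumes host byte order and ignores the endianness conversion that the intrev32ifbe calls in the listings suggest; get_encoded and check_get_encoded are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read element `pos` from a byte buffer whose elements
 * are `enc` bytes wide (2, 4 or 8). Host byte order is assumed. */
static int64_t get_encoded(const int8_t *contents, uint32_t pos, uint8_t enc) {
    int64_t v64;
    int32_t v32;
    int16_t v16;
    if (enc == sizeof(int64_t)) {
        memcpy(&v64, contents + (size_t)pos * sizeof(v64), sizeof(v64));
        return v64;
    } else if (enc == sizeof(int32_t)) {
        memcpy(&v32, contents + (size_t)pos * sizeof(v32), sizeof(v32));
        return v32;
    }
    memcpy(&v16, contents + (size_t)pos * sizeof(v16), sizeof(v16));
    return v16;
}

/* Store three int32_t values as raw bytes and read them back by index. */
static void check_get_encoded(void) {
    int32_t raw[3] = { -7, 100000, 2000000000 };
    int8_t buf[sizeof(raw)];
    memcpy(buf, raw, sizeof(raw));
    assert(get_encoded(buf, 0, sizeof(int32_t)) == -7);
    assert(get_encoded(buf, 1, sizeof(int32_t)) == 100000);
    assert(get_encoded(buf, 2, sizeof(int32_t)) == 2000000000);
}
```

The byte offset of an element is just pos multiplied by the element width, which is why a plain byte array can serve all three encodings.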

Now let's look at the interfaces it exposes:

intset *intsetNew(void);
intset *intsetAdd(intset *is, int64_t value, uint8_t *success);
intset *intsetRemove(intset *is, int64_t value, int *success);
uint8_t intsetFind(intset *is, int64_t value);
int64_t intsetRandom(intset *is);
uint8_t intsetGet(intset *is, uint32_t pos, int64_t *value);
uint32_t intsetLen(intset *is);
size_t intsetBlobLen(intset *is);

Any data structure has to offer interfaces such as insert, query, and delete, and it should not expose the interfaces it uses internally. Let's analyze a few of the interfaces provided here in detail.

Initialization interface:

/* Create an empty intset. */
intset *intsetNew(void) {
    intset *is = malloc(sizeof(intset));
    is->encoding = intrev32ifbe(INTSET_ENC_INT16);
    is->length = 0;
    return is;
}

Nothing difficult here. Note that the default is the smallest encoding, 2 bytes per element.
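
An empty set therefore occupies only the header. A rough sketch of how the total allocation could grow with the element count, assuming elements are stored back to back right after the header (intset_sketch, blob_len and check_blob_len are illustrative names, and the 8-byte header size holds on typical platforms):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct shown in the article. */
typedef struct intset_sketch {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];   /* flexible array member: not counted by sizeof */
} intset_sketch;

/* Hypothetical helper: bytes needed for `length` elements of width `encoding`. */
static size_t blob_len(uint32_t length, uint32_t encoding) {
    return sizeof(intset_sketch) + (size_t)length * encoding;
}

static void check_blob_len(void) {
    assert(sizeof(intset_sketch) == 8);   /* two uint32_t header fields */
    assert(blob_len(0, 2) == 8);          /* empty set: header only */
    assert(blob_len(3, 2) == 14);         /* three int16_t elements */
    assert(blob_len(3, 8) == 32);         /* same three elements as int64_t */
}
```

The last two checks show where the memory saving comes from: the same three elements cost 6 bytes of payload as int16_t but 24 bytes as int64_t.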

/* Insert an integer in the intset */
intset *intsetAdd(intset *is, int64_t value, uint8_t *success) {
    uint8_t valenc = _intsetValueEncoding(value);
    uint32_t pos;
    if (success) *success = 1;

    /* Upgrade encoding if necessary. If we need to upgrade, we know that
     * this value should be either appended (if > 0) or prepended (if < 0),
     * because it lies outside the range of existing values. */
    if (valenc > intrev32ifbe(is->encoding)) {
        /* This always succeeds, so we don't need to curry *success. */
        return intsetUpgradeAndAdd(is,value);
    } else {
        /* Abort if the value is already present in the set.
         * This call will populate "pos" with the right position to insert
         * the value when it cannot be found. */
        if (intsetSearch(is,value,&pos)) {
            if (success) *success = 0;
            return is;
        }

        is = intsetResize(is,intrev32ifbe(is->length)+1);
        if (pos < intrev32ifbe(is->length)) intsetMoveTail(is,pos,pos+1);
    }

    _intsetSet(is,pos,value);
    is->length = intrev32ifbe(intrev32ifbe(is->length)+1);
    return is;
}

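This interface is the trickier one. Step by step:

1. First check whether the encoding needed by the new value is larger than the current encoding. If it is, upgrade the type and add value.

2. Otherwise (the needed encoding is less than or equal to the current one), search for the value. If it already exists, return with *success set to 0; if it does not, the search fills in pos, the position to insert at.

3. Reallocate the memory to make room for one more element.

4. Move data: unless pos is at the end, every element from pos onward shifts back by one slot, so the cost is O(n), which is a bit high.

5. Write the value at pos and update the element count.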
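
Steps 3 to 5 can be sketched in isolation on a plain int16_t array. This is a simplified illustration rather than the Redis code: it assumes pos has already been determined by a search, fixes the element width at 2 bytes, and uses the hypothetical names insert_at and check_insert_at.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: insert `value` at index `pos` of a sorted int16_t array
 * holding `*len` elements. Returns the (possibly moved) array, or NULL if
 * realloc fails, in which case the original array is left untouched. */
static int16_t *insert_at(int16_t *arr, uint32_t *len, uint32_t pos, int16_t value) {
    assert(pos <= *len);
    /* Step 3: grow the allocation by one element. */
    int16_t *grown = realloc(arr, ((size_t)*len + 1) * sizeof(*arr));
    if (grown == NULL) return NULL;
    /* Step 4: shift the tail [pos, len) one slot to the right. The regions
     * overlap, so memmove is required rather than memcpy. */
    if (pos < *len)
        memmove(grown + pos + 1, grown + pos, (*len - pos) * sizeof(*grown));
    /* Step 5: store the value and bump the count. */
    grown[pos] = value;
    (*len)++;
    return grown;
}

static void check_insert_at(void) {
    uint32_t len = 3;
    int16_t *a = malloc(len * sizeof(*a));
    assert(a != NULL);
    a[0] = 1; a[1] = 5; a[2] = 9;
    a = insert_at(a, &len, 1, 3);          /* array becomes 1 3 5 9 */
    assert(a != NULL && len == 4);
    assert(a[0] == 1 && a[1] == 3 && a[2] == 5 && a[3] == 9);
    free(a);
}
```

When pos equals the old length, nothing is moved and the value is simply appended, which matches the pos < length guard in intsetAdd.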

The interface that upgrades the type and inserts value is as follows:

/* Upgrades the intset to a larger encoding and inserts the given integer. */
static intset *intsetUpgradeAndAdd(intset *is, int64_t value) {
    uint8_t curenc = intrev32ifbe(is->encoding);
    uint8_t newenc = _intsetValueEncoding(value);
    int length = intrev32ifbe(is->length);
    int prepend = value < 0 ? 1 : 0;

    /* First set new encoding and resize */
    is->encoding = intrev32ifbe(newenc);
    is = intsetResize(is,intrev32ifbe(is->length)+1);

    /* Upgrade back-to-front so we don't overwrite values.
     * Note that the "prepend" variable is used to make sure we have an empty
     * space at either the beginning or the end of the intset. */
    while(length--)
        _intsetSet(is,length+prepend,_intsetGetEncoded(is,length,curenc));

    /* Set the value at the beginning or the end. */
    if (prepend)
        _intsetSet(is,0,value);
    else
        _intsetSet(is,intrev32ifbe(is->length),value);
    is->length = intrev32ifbe(intrev32ifbe(is->length)+1);
    return is;
}

As the code shows, the upgrade proceeds as follows:

1. The set is ordered, and a value that needs a wider encoding lies outside the range of every existing element. So first check whether the new value is positive or negative: a positive value is appended at the tail, a negative value is prepended at the head.

2. Grow the memory.

3. Move the data. This ties in with step 1 and is the hardest part to follow: each element is read using the old encoding and written back using the new encoding, working from back to front so that no element is overwritten before it has been read.

4. Insert the value at the head or at the tail.

5. Update the element count.
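
The back-to-front move is easier to see on a concrete case. The sketch below widens a buffer of int16_t values to int32_t in place and adds one out-of-range value. It is an illustration under simplifying assumptions (only the 16-to-32-bit case, host byte order, no header), and upgrade_and_add and check_upgrade_and_add are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: widen a buffer of `len` int16_t values to int32_t in
 * place and add `value`, which is assumed to lie outside the int16_t range.
 * Returns the reallocated buffer (len + 1 int32_t values) or NULL. */
static int8_t *upgrade_and_add(int8_t *buf, uint32_t len, int32_t value) {
    uint32_t prepend = value < 0 ? 1 : 0;
    int8_t *grown = realloc(buf, ((size_t)len + 1) * sizeof(int32_t));
    if (grown == NULL) return NULL;
    /* Walk from the last element to the first: the wide slot i + prepend
     * starts at or beyond byte 2*i, so it never overlaps a narrow element
     * that has not been read yet. */
    for (uint32_t i = len; i-- > 0; ) {
        int16_t old;
        memcpy(&old, grown + (size_t)i * sizeof(old), sizeof(old));
        int32_t wide = old;
        memcpy(grown + (size_t)(i + prepend) * sizeof(wide), &wide, sizeof(wide));
    }
    /* The new value goes into the slot left free at the head or the tail. */
    memcpy(grown + (size_t)(prepend ? 0 : len) * sizeof(value), &value, sizeof(value));
    return grown;
}

static void check_upgrade_and_add(void) {
    int16_t start[3] = { -5, 2, 300 };
    int8_t *buf = malloc(sizeof(start));
    assert(buf != NULL);
    memcpy(buf, start, sizeof(start));
    buf = upgrade_and_add(buf, 3, -100000);   /* negative: goes to the front */
    assert(buf != NULL);
    int32_t out[4];
    memcpy(out, buf, sizeof(out));
    assert(out[0] == -100000 && out[1] == -5 && out[2] == 2 && out[3] == 300);
    free(buf);
}
```

Going front to back instead would clobber data: writing the first widened element over bytes 0 to 3 would destroy the second narrow element, which lives in bytes 2 and 3.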


The interface that moves data is as follows:

static void intsetMoveTail(intset *is, uint32_t from, uint32_t to) {
    void *src, *dst;
    uint32_t bytes = intrev32ifbe(is->length)-from;
    uint32_t encoding = intrev32ifbe(is->encoding);

    if (encoding == INTSET_ENC_INT64) {
        src = (int64_t*)is->contents+from;
        dst = (int64_t*)is->contents+to;
        bytes *= sizeof(int64_t);
    } else if (encoding == INTSET_ENC_INT32) {
        src = (int32_t*)is->contents+from;
        dst = (int32_t*)is->contents+to;
        bytes *= sizeof(int32_t);
    } else {
        src = (int16_t*)is->contents+from;
        dst = (int16_t*)is->contents+to;
        bytes *= sizeof(int16_t);
    }
    memmove(dst,src,bytes);
}

Because the memory is contiguous, it is enough to locate the start of the span to move and call memmove(). Bingo!


Implementation of the search interface:

static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos) {
    int min = 0, max = intrev32ifbe(is->length)-1, mid = -1;
    int64_t cur = -1;

    /* The value can never be found when the set is empty */
    if (intrev32ifbe(is->length) == 0) {
        if (pos) *pos = 0;
        return 0;
    } else {
        /* Check for the case where we know we cannot find the value,
         * but do know the insert position. */
        if (value > _intsetGet(is,intrev32ifbe(is->length)-1)) {
            if (pos) *pos = intrev32ifbe(is->length);
            return 0;
        } else if (value < _intsetGet(is,0)) {
            if (pos) *pos = 0;
            return 0;
        }
    }

    while(max >= min) {
        mid = ((unsigned int)min + (unsigned int)max) >> 1;
        cur = _intsetGet(is,mid);
        if (value > cur) {
            min = mid+1;
        } else if (value < cur) {
            max = mid-1;
        } else {
            break;
        }
    }

    if (value == cur) {
        if (pos) *pos = mid;
        return 1;
    } else {
        if (pos) *pos = min;
        return 0;
    }
}

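It is a binary search again. Impressive! Personally, I think this is where the efficiency of this data structure shows: because the elements are sorted, lookup is fast (O(log n)); because they live in an array, insertion and deletion are contiguous memory copies, which are O(n) but still quick for small sets.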
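
The key property of this search is that a miss still yields the insert position. A compact sketch of the same idea on a plain int64_t array, written with a half-open range instead of the inclusive min/max loop used by intsetSearch (search_pos and check_search_pos are hypothetical names):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: binary search over a sorted array without duplicates.
 * Returns 1 and stores the index in *pos when found; otherwise returns 0 and
 * stores the index at which `value` would have to be inserted. */
static int search_pos(const int64_t *arr, uint32_t len, int64_t value, uint32_t *pos) {
    uint32_t lo = 0, hi = len;                 /* search the range [lo, hi) */
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;     /* avoids overflow of lo + hi */
        if (arr[mid] < value) lo = mid + 1;
        else if (arr[mid] > value) hi = mid;
        else { *pos = mid; return 1; }
    }
    *pos = lo;
    return 0;
}

static void check_search_pos(void) {
    const int64_t a[4] = { -3, 4, 10, 25 };
    uint32_t pos;
    assert(search_pos(a, 4, 10, &pos) == 1 && pos == 2);   /* hit */
    assert(search_pos(a, 4, 5, &pos) == 0 && pos == 2);    /* between 4 and 10 */
    assert(search_pos(a, 4, -10, &pos) == 0 && pos == 0);  /* before the head */
    assert(search_pos(a, 4, 99, &pos) == 0 && pos == 4);   /* past the tail */
}
```

The half-open form handles the empty set and the two out-of-range cases without special branches; intsetSearch checks those cases up front instead, which saves the loop when the value is clearly outside the set.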

This makes me want to read the STL vector implementation when I have time: how is its insert implemented?



