Redis Source Code Analysis (6) --- ziplist, the Compressed List


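The name ziplist looks a lot like adlist, which I analyzed earlier, but the two serve completely different purposes. adlist is an ordinary linked list. ziplist is a compressed list. Why "compressed"? In a linked list each node normally carries prev and next pointers to its neighbors, and those pointers take up a fair amount of memory. ziplist replaces them with lengths: the whole ziplist is one long contiguous byte string, and each entry records its own length and the length of the entry before it, which is enough to step to a neighbor in either direction. The author also made the length fields variable in size. If every entry holds a short string, spending several bytes on each length would waste a lot of space, so a short length takes a single byte and only a long one takes more. That is where the compression comes from.

Where is ziplist used? It sits behind some of the most common commands, such as rpush and lpush, which add data to a list. In the Redis version analyzed here, small lists are stored in a ziplist. The implementation follows below.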
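To make the memory argument concrete, here is a rough sketch in C that compares bookkeeping overhead only. The function names are hypothetical and not part of Redis, and the ziplist figure assumes that every entry is a string of at most 63 bytes, so that both header fields fit in one byte each.

```c
#include <stddef.h>

/* Bookkeeping bytes for n elements of a doubly linked list. Only the prev
 * and next pointers are counted; the value pointer and allocator overhead
 * are ignored, so this is a lower bound. */
size_t demo_linked_overhead(size_t n) {
    return n * 2 * sizeof(void *);
}

/* Bookkeeping bytes for n ziplist entries, ASSUMING every entry is a
 * string of at most 63 bytes: 1 byte for the previous-entry length plus
 * 1 byte for the encoding/length, on top of the 10-byte list header and
 * the 1-byte end marker. */
size_t demo_ziplist_overhead_short(size_t n) {
    return 10 + 1 + n * 2;
}
```

With 8-byte pointers, 100 short elements cost at least 1600 bytes of pointers in the linked list, against 211 bytes of bookkeeping in the ziplist under the assumption above.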

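Before studying ziplist you have to understand its layout. Take the time to think it through, otherwise the design is hard to follow. Below is the layout comment from the source, with my own notes added to help: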

/* The ziplist is a specially encoded dually linked list that is designed
 * to be very memory efficient. It stores both strings and integer values,
 * where integers are encoded as actual integers instead of a series of
 * characters. It allows push and pop operations on either side of the list
 * in O(1) time. However, because every operation requires a reallocation of
 * the memory used by the ziplist, the actual complexity is related to the
 * amount of memory used by the ziplist.
 *
 * Note: a ziplist is an encoded list designed to be very memory efficient.
 * It can hold both strings and integers, and supports push and pop at
 * either end in O(1). Every operation reallocates the ziplist's memory,
 * though, so the real cost depends on how much memory the ziplist uses.
 * ----------------------------------------------------------------------------
 *
 * ZIPLIST OVERALL LAYOUT:
 * The general layout of the ziplist is as follows:
 * <zlbytes><zltail><zllen><entry><entry><zlend>
 *
 * <zlbytes> is an unsigned integer to hold the number of bytes that the
 * ziplist occupies. This value needs to be stored to be able to resize the
 * entire structure without the need to traverse it first.
 * Note: <zlbytes> is the total number of bytes the ziplist occupies, so a
 * resize does not have to walk the list first.
 *
 * <zltail> is the offset to the last entry in the list. This allows a pop
 * operation on the far side of the list without the need for full traversal.
 * Note: <zltail> is the offset of the last entry, which makes popping from
 * the tail fast.
 *
 * <zllen> is the number of entries. When this value is larger than 2**16-2,
 * we need to traverse the entire list to know how many items it holds.
 * Note: <zllen> is the number of entries in the ziplist.
 *
 * <zlend> is a single byte special value, equal to 255, which indicates the
 * end of the list.
 * Note: <zlend> is the end marker, a single byte with value 255 (11111111).
 *
 * ZIPLIST ENTRIES:
 * Every entry in the ziplist is prefixed by a header that contains two pieces
 * of information. First, the length of the previous entry is stored to be
 * able to traverse the list from back to front. Second, the encoding with an
 * optional string length of the entry itself is stored.
 * Note: each entry header holds two things. The first is the length of the
 * previous entry, which lets us walk backwards from any entry. The second is
 * the encoding, together with the string length when the entry is a string.
 *
 * The length of the previous entry is encoded in the following way:
 * If this length is smaller than 254 bytes, it will only consume a single
 * byte that takes the length as value. When the length is greater than or
 * equal to 254, it will consume 5 bytes. The first byte is set to 254 to
 * indicate a larger value is following. The remaining 4 bytes take the
 * length of the previous entry as value.
 * Note: if the previous entry is shorter than 254 bytes, its length takes a
 * single byte (values 0 to 253). If it is 254 bytes or longer, the length
 * takes 5 bytes: the first byte is set to 254 and the remaining 4 bytes
 * hold the actual length.
 *
 * The other header field of the entry itself depends on the contents of the
 * entry. When the entry is a string, the first 2 bits of this header will hold
 * the type of encoding used to store the length of the string, followed by the
 * actual length of the string. When the entry is an integer the first 2 bits
 * are both set to 1. The following 2 bits are used to specify what kind of
 * integer will be stored after this header. An overview of the different
 * types and encodings is as follows:
 * Note: for a string, the first 2 bits are 00, 01 or 10 (three forms). For
 * an integer the first 2 bits are 11, and the next 2 bits select the integer
 * type stored after the header: 00 = int16_t, 01 = int32_t, 10 = int64_t,
 * 11 = 24 bit signed. There are two special cases: 11111110 is an 8 bit
 * signed integer, and 11110001 to 11111101 hold the values 0 to 12 directly
 * in the encoding byte (0000 and 1111 are already taken and cannot be used).
 * Compared with fixed-width fields, this allocates memory more sensibly.
 *
 * Note: the three string encodings come first, then the integer encodings.
 * |00pppppp| - 1 byte
 *      String value with length less than or equal to 63 bytes (6 bits).
 * |01pppppp|qqqqqqqq| - 2 bytes
 *      String value with length less than or equal to 16383 bytes (14 bits).
 * |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
 *      String value with length greater than or equal to 16384 bytes.
 * |11000000| - 1 byte
 *      Integer encoded as int16_t (2 bytes).
 * |11010000| - 1 byte
 *      Integer encoded as int32_t (4 bytes).
 * |11100000| - 1 byte
 *      Integer encoded as int64_t (8 bytes).
 * |11110000| - 1 byte
 *      Integer encoded as 24 bit signed (3 bytes).
 * |11111110| - 1 byte
 *      Integer encoded as 8 bit signed (1 byte).
 * |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer.
 *      Unsigned integer from 0 to 12. The encoded value is actually from
 *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
 *      subtracted from the encoded 4 bit value to obtain the right value.
 * |11111111| - End of ziplist.
 *
 * All the integers are represented in little endian byte order.
 *
 * ----------------------------------------------------------------------------
 */
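The previous-entry-length rule from the comment above can be sketched as a pair of small C functions. This is an illustration written for this article, not the Redis code itself: the `demo_` names are hypothetical, and the 4 length bytes are written one at a time in little endian order so that the sketch does not depend on the host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BIGLEN 254 /* same threshold as ZIP_BIGLEN in the source */

/* Writes the previous-entry length at p and returns the number of bytes
 * used: 1 when len < 254, otherwise 5 (marker byte + 4 length bytes). */
size_t demo_prevlen_encode(unsigned char *p, uint32_t len) {
    assert(p != NULL);
    if (len < DEMO_BIGLEN) {
        p[0] = (unsigned char)len;
        return 1;
    }
    p[0] = DEMO_BIGLEN;
    p[1] = (unsigned char)(len & 0xff);
    p[2] = (unsigned char)((len >> 8) & 0xff);
    p[3] = (unsigned char)((len >> 16) & 0xff);
    p[4] = (unsigned char)((len >> 24) & 0xff);
    return 5;
}

/* Reads the field back into *len and returns the number of bytes it
 * occupies, which is what a backward traversal needs to know. */
size_t demo_prevlen_decode(const unsigned char *p, uint32_t *len) {
    assert(p != NULL && len != NULL);
    if (p[0] < DEMO_BIGLEN) {
        *len = p[0];
        return 1;
    }
    *len = (uint32_t)p[1] | ((uint32_t)p[2] << 8) |
           ((uint32_t)p[3] << 16) | ((uint32_t)p[4] << 24);
    return 5;
}
```

A length of 253 takes one byte, while 254 already needs the 5-byte form. That jump from 1 to 5 bytes is why inserting or deleting an entry can change the header size of the entry that follows it.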

Please read the comment above carefully, more than once, until the author's design makes sense. Here is the actual struct definition:

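/* A decoded view of one entry in the ziplist */
typedef struct zlentry {
    // prevrawlen is the length of the previous entry;
    // prevrawlensize is the number of bytes needed to store that length
    unsigned int prevrawlensize, prevrawlen;
    // len is the length of this entry's data;
    // lensize is the number of bytes needed to store that length
    unsigned int lensize, len;
    // size of the entry header in bytes (prevrawlensize + lensize)
    unsigned int headersize;
    // encoding of the entry
    unsigned char encoding;
    // pointer to the start of the entry (header included), as raw bytes
    unsigned char *p;
} zlentry;

/* Layout of one entry:
 * <pre_node_len> (length of the previous entry)
 * <node_encode>  (encoding of this entry and the length of its data)
 * <node>         (the encoded data of this entry) */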
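How the encoding byte is interpreted can be sketched as follows. The bit patterns are the ones listed in the layout comment; the `demo_` function names are hypothetical helpers written for this article, not Redis functions.

```c
/* Returns 1 if the encoding byte describes a string (top two bits are
 * 00, 01 or 10) and 0 if it describes an integer (top two bits are 11). */
int demo_is_str(unsigned char enc) {
    return (enc & 0xc0) < 0xc0;
}

/* Number of payload bytes that follow an integer encoding byte. */
unsigned int demo_int_size(unsigned char enc) {
    switch (enc) {
    case 0xfe: return 1; /* 8 bit signed */
    case 0xc0: return 2; /* int16_t */
    case 0xf0: return 3; /* 24 bit signed */
    case 0xd0: return 4; /* int32_t */
    case 0xe0: return 8; /* int64_t */
    default:   return 0; /* immediate value: no payload bytes at all */
    }
}

/* For an immediate encoding (11110001 to 11111101) returns the stored
 * value 0..12; returns -1 for any other encoding byte. */
int demo_imm_value(unsigned char enc) {
    if (enc >= 0xf1 && enc <= 0xfd) return (enc & 0x0f) - 1;
    return -1;
}
```

The immediate form is the most compact case: a small number such as 7 needs no payload bytes at all, because the value lives inside the encoding byte.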

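Now for one of the core operations, insertion. It moves pointers back and forth a great deal, and every one of those moves is an adjustment of a memory address: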

/* Insert item at "p". */
static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
    unsigned int prevlensize, prevlen = 0;
    size_t offset;
    int nextdiff = 0;
    unsigned char encoding = 0;
    long long value = 123456789; /* initialized to avoid warning. Using a value
                                    that is easy to see if for some reason
                                    we use it uninitialized. */
    zlentry tail;

    /* Find out prevlen for the entry that is inserted. */
    if (p[0] != ZIP_END) {
        // p points at an existing entry, so its prevlen field already
        // holds the length of the entry that will precede the new one
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
    } else {
        // inserting at the end: the current tail entry (if there is one)
        // becomes the previous entry; checking its first byte is enough
        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
        if (ptail[0] != ZIP_END) {
            prevlen = zipRawEntryLength(ptail);
        }
    }

    /* See if the entry can be encoded */
    if (zipTryEncoding(s,slen,&value,&encoding)) {
        /* 'encoding' is set to the appropriate integer encoding */
        reqlen = zipIntSize(encoding);
    } else {
        /* 'encoding' is untouched, however zipEncodeLength will use the
         * string length to figure out how to encode it. */
        reqlen = slen;
    }
    /* We need space for both the length of the previous entry and
     * the length of the payload. */
    reqlen += zipPrevEncodeLength(NULL,prevlen);
    reqlen += zipEncodeLength(NULL,encoding,slen);

    /* When the insert position is not equal to the tail, we need to
     * make sure that the next entry can hold this entry's length in
     * its prevlen field. */
    nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;

    /* Store offset because a realloc may change the address of zl. */
    // resize the list to make room for the new entry
    offset = p-zl;
    zl = ziplistResize(zl,curlen+reqlen+nextdiff);
    p = zl+offset;

    /* Apply memory move when necessary and update tail offset. */
    if (p[0] != ZIP_END) {
        /* Subtract one because of the ZIP_END bytes */
        // not inserting at the end, so shift the following entries
        memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

        /* Encode this entry's raw length in the next entry. */
        zipPrevEncodeLength(p+reqlen,reqlen);

        /* Update offset for tail */
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);

        /* When the tail contains more than one entry, we need to take
         * "nextdiff" in account as well. Otherwise, a change in the
         * size of prevlen doesn't have an effect on the *tail* offset. */
        tail = zipEntry(p+reqlen);
        if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
        }
    } else {
        // inserting at the end: the new entry becomes the tail
        /* This element will be the new tail. */
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
    }

    /* When nextdiff != 0, the raw length of the next entry has changed, so
     * we need to cascade the update throughout the ziplist */
    if (nextdiff != 0) {
        offset = p-zl;
        zl = __ziplistCascadeUpdate(zl,p+reqlen);
        p = zl+offset;
    }

    /* Write the entry */
    // write the header and payload of the new entry
    p += zipPrevEncodeLength(p,prevlen);
    p += zipEncodeLength(p,encoding,slen);
    if (ZIP_IS_STR(encoding)) {
        memcpy(p,s,slen);
    } else {
        zipSaveInteger(p,value,encoding);
    }
    // increment the entry count by 1
    ZIPLIST_INCR_LENGTH(zl,1);
    return zl;
}
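The memory handling at the heart of the insert (grow the block, recompute the pointer from a saved offset, shift the tail, write into the gap) can be shown in isolation. The following is a simplified sketch on a plain byte buffer with a hypothetical function name. It leaves out everything specific to ziplist, such as the headers, the tail offset and the nextdiff adjustment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Inserts n bytes from src at offset pos of a heap buffer holding *len
 * bytes. Returns the (possibly moved) buffer, or NULL if the allocation
 * fails, in which case the original buffer is left untouched. */
unsigned char *demo_insert_bytes(unsigned char *buf, size_t *len, size_t pos,
                                 const unsigned char *src, size_t n) {
    assert(buf != NULL && len != NULL && src != NULL);
    assert(pos <= *len && n > 0);

    unsigned char *grown = realloc(buf, *len + n);
    if (grown == NULL) return NULL;

    /* Use the offset, not a pointer into the old block: realloc may have
     * moved the data to a new address. */
    unsigned char *p = grown + pos;
    memmove(p + n, p, *len - pos); /* shift the tail right to open a gap */
    memcpy(p, src, n);             /* write the new bytes into the gap */
    *len += n;
    return grown;
}
```

Inserting "cd" at offset 2 of "abef" gives "abcdef". The order matters here: the buffer has to grow before the tail is moved, otherwise the move would write past the end of the allocation.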

Next, the delete operation:

/* Delete "num" entries, starting at "p". Returns pointer to the ziplist. */
/* Deleting slides the pointer p forward; everything after the deleted
 * entries has to be moved down. */
static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
    unsigned int i, totlen, deleted = 0;
    size_t offset;
    int nextdiff = 0;
    zlentry first, tail;

    first = zipEntry(p);
    for (i = 0; p[0] != ZIP_END && i < num; i++) {
        p += zipRawEntryLength(p);
        deleted++;
    }

    totlen = p-first.p;
    if (totlen > 0) {
        if (p[0] != ZIP_END) {
            /* Storing `prevrawlen` in this entry may increase or decrease the
             * number of bytes required compare to the current `prevrawlen`.
             * There always is room to store this, because it was previously
             * stored by an entry that is now being deleted. */
            nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);
            p -= nextdiff;
            zipPrevEncodeLength(p,first.prevrawlen);

            /* Update offset for tail */
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);

            /* When the tail contains more than one entry, we need to take
             * "nextdiff" in account as well. Otherwise, a change in the
             * size of prevlen doesn't have an effect on the *tail* offset. */
            tail = zipEntry(p);
            if (p[tail.headersize+tail.len] != ZIP_END) {
                ZIPLIST_TAIL_OFFSET(zl) =
                   intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
            }

            /* Move tail to the front of the ziplist */
            memmove(first.p,p,
                intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
        } else {
            /* The entire tail was deleted. No need to move memory. */
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe((first.p-zl)-first.prevrawlen);
        }

        /* Resize and update length */
        // shrink the list
        offset = first.p-zl;
        zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
        ZIPLIST_INCR_LENGTH(zl,-deleted);
        p = zl+offset;

        /* When nextdiff != 0, the raw length of the next entry has changed, so
         * we need to cascade the update throughout the ziplist */
        if (nextdiff != 0)
            zl = __ziplistCascadeUpdate(zl,p);
    }
    return zl;
}

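This function deletes up to num entries, starting at the entry that p points to. It is the lowest-level deletion routine. The public deletion functions (ziplistDelete and ziplistDeleteRange) are wrappers around it, and ziplistDeleteRange is the one that first turns an index into a pointer.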
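Deletion is the mirror image of insertion as far as memory is concerned: the tail is moved down first and the block is shrunk afterwards. Here is a simplified sketch on a plain byte buffer, with a hypothetical function name and none of the ziplist header bookkeeping:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Removes n bytes starting at offset pos from a heap buffer holding *len
 * bytes. Returns the (possibly moved) buffer. At least one byte must
 * remain, so the buffer is never shrunk to size zero. */
unsigned char *demo_delete_bytes(unsigned char *buf, size_t *len,
                                 size_t pos, size_t n) {
    assert(buf != NULL && len != NULL);
    assert(pos <= *len && n <= *len - pos && n < *len);

    /* Close the gap while the block still has its full size. */
    memmove(buf + pos, buf + pos + n, *len - pos - n);
    *len -= n;

    /* Shrinking may move the block. If realloc fails, the original block
     * is still valid (just larger than needed), so keep using it. */
    unsigned char *shrunk = realloc(buf, *len);
    return shrunk != NULL ? shrunk : buf;
}
```

Deleting 2 bytes at offset 2 of "abcdef" leaves "abef". Shrinking before the move would discard the bytes at the end that still need to be copied down.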

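So which function do lpush and rpush end up calling when we type them at the Redis command line? For a list stored as a ziplist it is ziplistPush, which is called like this: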

zl = ziplistPush(zl, (unsigned char*)"foo", 3, ZIPLIST_TAIL);
zl = ziplistPush(zl, (unsigned char*)"quux", 4, ZIPLIST_TAIL);
zl = ziplistPush(zl, (unsigned char*)"hello", 5, ZIPLIST_HEAD);

/* Push data onto either end of the list */
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where) {
    unsigned char *p;
    // locate the insert position directly: the first entry or the end marker
    p = (where == ZIPLIST_HEAD) ? ZIPLIST_ENTRY_HEAD(zl) : ZIPLIST_ENTRY_END(zl);
    // then hand over to the insert function
    return __ziplistInsert(zl,p,s,slen);
}

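In the end it still calls the insert function. Before writing this I read several other analyses of ziplist and found some of them rather superficial. Going through the code carefully yourself makes things much clearer, so I recommend reading the source. Everyone focuses on something different. Finally, here are the key macro definitions and the header file: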
/* End marker of a ziplist */
#define ZIP_END 255
/* Threshold for the prevlen field: lengths below 254 fit in one byte,
 * lengths of 254 or more use the 5-byte form */
#define ZIP_BIGLEN 254

/* Different encoding/length possibilities */
#define ZIP_STR_MASK 0xc0
#define ZIP_INT_MASK 0x30
#define ZIP_STR_06B (0 << 6)
#define ZIP_STR_14B (1 << 6)
#define ZIP_STR_32B (2 << 6)
#define ZIP_INT_16B (0xc0 | 0<<4)
#define ZIP_INT_32B (0xc0 | 1<<4)
#define ZIP_INT_64B (0xc0 | 2<<4)
#define ZIP_INT_24B (0xc0 | 3<<4)
#define ZIP_INT_8B 0xfe
/* 4 bit integer immediate encoding */
#define ZIP_INT_IMM_MASK 0x0f   // many later operations AND against this mask
#define ZIP_INT_IMM_MIN 0xf1    /* 11110001 */
#define ZIP_INT_IMM_MAX 0xfd    /* 11111101 */   // cannot go up to 11111111, which is the end marker
#define ZIP_INT_IMM_VAL(v) (v & ZIP_INT_IMM_MASK)

#define INT24_MAX 0x7fffff
#define INT24_MIN (-INT24_MAX - 1)

/* Macro to determine type */
#define ZIP_IS_STR(enc) (((enc) & ZIP_STR_MASK) < ZIP_STR_MASK)

/* Utility macros */
/* Offsets used to jump straight to the header fields and to the entries */
#define ZIPLIST_BYTES(zl)       (*((uint32_t*)(zl)))
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
#define ZIPLIST_LENGTH(zl)      (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))
#define ZIPLIST_HEADER_SIZE     (sizeof(uint32_t)*2+sizeof(uint16_t))
#define ZIPLIST_ENTRY_HEAD(zl)  ((zl)+ZIPLIST_HEADER_SIZE)
#define ZIPLIST_ENTRY_TAIL(zl)  ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))
#define ZIPLIST_ENTRY_END(zl)   ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)
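To see what the header macros address, the sketch below builds the byte image of an empty ziplist by hand and reads the fields back. The `demo_` names are hypothetical. The field values (zlbytes = 11, zltail = 10, zllen = 0) reflect my understanding of what ziplistNew produces, and the fields are read and written byte by byte in little endian order instead of through pointer casts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HEADER_SIZE 10 /* 4 + 4 + 2 bytes, as in ZIPLIST_HEADER_SIZE */
#define DEMO_END 255        /* same value as ZIP_END */

uint32_t demo_read_u32le(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void demo_write_u32le(unsigned char *p, uint32_t v) {
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Fills buf (at least 11 bytes) with the image of an empty ziplist. */
void demo_empty_ziplist(unsigned char *buf) {
    assert(buf != NULL);
    demo_write_u32le(buf, DEMO_HEADER_SIZE + 1);  /* zlbytes: total size */
    demo_write_u32le(buf + 4, DEMO_HEADER_SIZE);  /* zltail: tail offset */
    buf[8] = 0;                                   /* zllen, low byte */
    buf[9] = 0;                                   /* zllen, high byte */
    buf[DEMO_HEADER_SIZE] = DEMO_END;             /* zlend marker */
}

/* Counterparts of ZIPLIST_ENTRY_HEAD and ZIPLIST_ENTRY_END. */
const unsigned char *demo_entry_head(const unsigned char *zl) {
    return zl + DEMO_HEADER_SIZE;
}

const unsigned char *demo_entry_end(const unsigned char *zl) {
    return zl + demo_read_u32le(zl) - 1;
}
```

In an empty list the head position and the end position are the same byte, the end marker. That is why pushing to the head and pushing to the tail of an empty ziplist go through the same insert path.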
The .h file:

/*
 * Copyright (c) 2009-2012, Pieter Noordhuis <pcnoordhuis at gmail dot com>
 * Copyright (c) 2009-2012, Salvatore Sanfilippo <antirez at gmail dot com>
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 *   * Redistributions of source code must retain the above copyright notice,
 *     this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *   * Neither the name of Redis nor the names of its contributors may be used
 *     to endorse or promote products derived from this software without
 *     specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 */

/* Flags selecting the head or the tail of the list */
#define ZIPLIST_HEAD 0
#define ZIPLIST_TAIL 1

unsigned char *ziplistNew(void);    // create a new list
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where);  // push data onto the head or tail of the list
unsigned char *ziplistIndex(unsigned char *zl, int index);   // locate the entry at the given index
unsigned char *ziplistNext(unsigned char *zl, unsigned char *p);   // get the entry after the current position
unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p);   // get the entry before the current position
unsigned int ziplistGet(unsigned char *p, unsigned char **sval, unsigned int *slen, long long *lval);   // read the value of the entry at p (string or integer)
unsigned char *ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen); // insert data into the list at p
unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p); // delete a single entry from the list
unsigned char *ziplistDeleteRange(unsigned char *zl, unsigned int index, unsigned int num);   // delete num entries, starting at the entry at index
unsigned int ziplistCompare(unsigned char *p, unsigned char *s, unsigned int slen);   // compare the entry at p with the given string
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip); // find an entry in the list
unsigned int ziplistLen(unsigned char *zl);   // return the number of entries in the list
size_t ziplistBlobLen(unsigned char *zl);   // return the size of the list in bytes

