Redis implementation: dictionaries, skip lists, and integer sets


Integer set

An integer set (intset) is one of the underlying implementations of the set key. When a set contains only integer values and the number of elements is small, Redis uses an integer set as the underlying implementation of the set key. For example, if we create a set key that contains only five elements, all of them integer values, the underlying implementation of the set key is an integer set:

    127.0.0.1:6379> SADD numbers 1 3 5 7 9
    (integer) 5
    127.0.0.1:6379> SMEMBERS numbers
    1) "1"
    2) "3"
    3) "5"
    4) "7"
    5) "9"
    127.0.0.1:6379> OBJECT ENCODING numbers
    "intset"

    

Implementation of the integer set

An integer set (intset) is an abstract data structure that Redis uses to store a set of integer values. It can hold values of type int16_t, int32_t, or int64_t, and it guarantees that no duplicate elements appear in the set.

intset.h

    typedef struct intset {
        // encoding
        uint32_t encoding;
        // number of elements in the set
        uint32_t length;
        // array that stores the elements
        int8_t contents[];
    } intset;

  

The contents array is the underlying implementation of the integer set: each element of the set is an item in the contents array, the items are sorted in ascending order by value, and the array contains no duplicates. The length property records the number of elements in the integer set, that is, the length of the contents array. Although the intset structure declares contents as an array of type int8_t, the array does not actually hold any int8_t values; its true type depends on the value of the encoding property:

    • If the value of the encoding property is INTSET_ENC_INT16, then contents is an array of type int16_t, and each item in the array is an int16_t integer (minimum value -32,768, maximum value 32,767).
    • If the value of the encoding property is INTSET_ENC_INT32, then contents is an array of type int32_t, and each item in the array is an int32_t integer (minimum value -2,147,483,648, maximum value 2,147,483,647).
    • If the value of the encoding property is INTSET_ENC_INT64, then contents is an array of type int64_t, and each item in the array is an int64_t integer (minimum value -9,223,372,036,854,775,808, maximum value 9,223,372,036,854,775,807).
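
As a rough illustration of how the encoding property drives the interpretation of contents, the C sketch below picks the narrowest encoding for a value and reads an item back from a raw byte array. The helper names value_encoding and get_encoded are hypothetical, the encoding constants are assumed to equal the element width in bytes, and values are kept in host byte order for simplicity; this is not presented as the actual Redis source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: each encoding constant equals the element width in bytes. */
#define INTSET_ENC_INT16 (sizeof(int16_t))
#define INTSET_ENC_INT32 (sizeof(int32_t))
#define INTSET_ENC_INT64 (sizeof(int64_t))

/* Return the narrowest encoding whose range contains v. */
static uint8_t value_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX)
        return INTSET_ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX)
        return INTSET_ENC_INT32;
    return INTSET_ENC_INT16;
}

/* Read the item at index pos, interpreting the raw bytes according to enc. */
static int64_t get_encoded(const int8_t *contents, uint32_t pos, uint8_t enc) {
    int16_t v16;
    int32_t v32;
    int64_t v64;

    if (enc == INTSET_ENC_INT64) {
        memcpy(&v64, contents + (size_t)pos * sizeof(v64), sizeof(v64));
        return v64;
    }
    if (enc == INTSET_ENC_INT32) {
        memcpy(&v32, contents + (size_t)pos * sizeof(v32), sizeof(v32));
        return v32;
    }
    assert(enc == INTSET_ENC_INT16);
    memcpy(&v16, contents + (size_t)pos * sizeof(v16), sizeof(v16));
    return v16;
}
```

For instance, value_encoding(65535) selects INTSET_ENC_INT32, because 65,535 exceeds the int16_t maximum of 32,767.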

Figure 1-1 shows an example of an integer set:

Figure 1-1 An integer set containing five int16_t integer values

    • The value of the encoding property is INTSET_ENC_INT16, which means the underlying implementation of the integer set is an array of type int16_t, and the set holds int16_t integer values
    • The value of the length property is 5, which means the integer set contains five elements
    • The contents array holds the five elements of the set in ascending order
    • Because each element is an int16_t integer value, the contents array occupies sizeof(int16_t) * 5 = 2 * 5 = 10 bytes, that is, 16 * 5 = 80 bits

Figure 1-2 shows another example of an integer set:

Figure 1-2 An integer set containing four int64_t integer values

    • The value of the encoding property is INTSET_ENC_INT64, which means the underlying implementation of the integer set is an array of type int64_t, and the array holds int64_t integer values
    • The value of the length property is 4, which means the integer set contains four elements
    • The contents array holds the four elements of the set in ascending order
    • Because each element is an int64_t integer value, the contents array occupies sizeof(int64_t) * 4 = 8 * 4 = 32 bytes, that is, 64 * 4 = 256 bits

Although the contents array holds four integer values, only -2,675,256,175,807,981,027 really needs to be stored as int64_t; the other three values, 1, 3, and 5, would fit in int16_t. However, according to the upgrade rules of the integer set, when an int64_t value is added to an integer set whose underlying array is of type int16_t, all elements of the set are converted to int64_t. As a result, all four values in the contents array are stored as int64_t, not just -2,675,256,175,807,981,027.

Upgrade

Whenever we add a new element to an integer set and the type of the new element is longer than the type of all existing elements, the integer set must be upgraded before the new element can be added. Upgrading an integer set and adding the new element takes three steps:

    1. Expand the underlying array according to the type of the new element, and allocate space for the new element.
    2. Convert all existing elements of the underlying array to the same type as the new element and place each converted element in its correct position, keeping the underlying array ordered throughout.
    3. Add the new element to the underlying array.
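
The three steps above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the Redis source: the helpers get_at, set_at, and upgrade_and_add are hypothetical names, the encoding is assumed to be the element width in bytes, elements are kept in host byte order, and the caller is assumed to pass a new_enc that is wider than the current encoding and wide enough for value. Because a value that forces an upgrade lies outside the old range, the sketch places it at the head if it is negative and at the tail otherwise.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct intset {
    uint32_t encoding;   /* assumed: element width in bytes (2, 4, or 8) */
    uint32_t length;     /* number of elements */
    int8_t contents[];   /* raw element storage */
} intset;

/* Read the element at pos, treating the array as having width enc. */
static int64_t get_at(const intset *is, uint32_t pos, uint32_t enc) {
    int16_t v16;
    int32_t v32;
    int64_t v64;
    const int8_t *p = is->contents + (size_t)pos * enc;
    if (enc == 8) { memcpy(&v64, p, 8); return v64; }
    if (enc == 4) { memcpy(&v32, p, 4); return v32; }
    memcpy(&v16, p, 2);
    return v16;
}

/* Write v at pos, treating the array as having width enc. */
static void set_at(intset *is, uint32_t pos, int64_t v, uint32_t enc) {
    int16_t v16 = (int16_t)v;
    int32_t v32 = (int32_t)v;
    int8_t *p = is->contents + (size_t)pos * enc;
    if (enc == 8) memcpy(p, &v, 8);
    else if (enc == 4) memcpy(p, &v32, 4);
    else memcpy(p, &v16, 2);
}

/* Upgrade is to new_enc and add value.
 * Returns the (possibly moved) set, or NULL if reallocation fails. */
static intset *upgrade_and_add(intset *is, int64_t value, uint32_t new_enc) {
    uint32_t old_enc = is->encoding;
    uint32_t n = is->length;
    /* A negative out-of-range value is smaller than every existing element,
     * so existing elements shift right by one slot to make room at index 0. */
    uint32_t shift = value < 0 ? 1 : 0;
    assert(new_enc > old_enc);

    /* Step 1: grow the allocation to hold n + 1 elements of the new width. */
    intset *grown = realloc(is, sizeof(intset) + (size_t)(n + 1) * new_enc);
    if (grown == NULL) return NULL;
    is = grown;

    /* Step 2: convert from the last element to the first, so a widened
     * element never overwrites an old element that has not been read yet. */
    for (uint32_t i = n; i-- > 0;)
        set_at(is, i + shift, get_at(is, i, old_enc), new_enc);

    /* Step 3: place the new element at the head or the tail. */
    set_at(is, shift ? 0 : n, value, new_enc);
    is->encoding = new_enc;
    is->length = n + 1;
    return is;
}
```

Converting back to front is the design choice that makes an in-place upgrade safe: the new slot for element i starts at or beyond the end of old element i - 1, so unread data is never clobbered.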

For example, suppose there is an INTSET_ENC_INT16-encoded integer set that contains three int16_t elements, as shown in Figure 1-3.

Figure 1-3 An integer set containing three int16_t elements

Because each element occupies 16 bits, the underlying array of the integer set is 3 * 16 = 48 bits in size. Figure 1-4 shows the positions of the three elements within these 48 bits.

Now suppose we want to add the int32_t value 65,535 to the integer set. Because int32_t is longer than the type of every element currently in the set, the program must upgrade the integer set before 65,535 can be added. The first step is to reallocate space for the underlying array based on the length of the new type and the number of elements in the set (including the new element to be added).

The integer set currently has three elements; with the new element 65,535 it needs space for four. Because each int32_t value occupies 32 bits, the underlying array will be 32 * 4 = 128 bits in size after reallocation, as shown in Figure 1-5. Although the program has reallocated the underlying array, the original three elements 1, 2, and 3 are still of type int16_t and still sit in the first 48 bits of the array. The next step is therefore to convert these three elements to int32_t and place each converted element at its correct position, keeping the underlying array ordered throughout.

Figure 1-5 The array after space reallocation

First, because element 3 ranks third among the four elements 1, 2, 3, and 65,535, it is moved to index 2 of the contents array, that is, bits 64 to 95 of the array, as shown in Figure 1-6.

Figure 1-6 Converting element 3 and saving it in the appropriate position

Next, because element 2 ranks second among the four elements 1, 2, 3, and 65,535, it is moved to index 1 of the contents array, that is, bits 32 to 63 of the array, as shown in Figure 1-7.

Figure 1-7 Converting element 2 and saving it in the appropriate position

After that, because element 1 ranks first among the four elements 1, 2, 3, and 65,535, it is moved to index 0 of the contents array, that is, bits 0 to 31 of the array, as shown in Figure 1-8.

Figure 1-8 Converting element 1 and saving it in the appropriate position

Then, because element 65,535 ranks fourth among the four elements 1, 2, 3, and 65,535, it is added at index 3 of the contents array, that is, bits 96 to 127 of the array, as shown in Figure 1-9.

Figure 1-9 Adding 65,535 to the array

Finally, the program changes the encoding property of the integer set from INTSET_ENC_INT16 to INTSET_ENC_INT32 and the length property from 3 to 4. Figure 1-10 shows the integer set after the add operation is complete.

Figure 1-10 The integer set after the add operation is complete
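
The bit ranges quoted in the walkthrough above follow from simple arithmetic: the element at a given index starts at index times the element width. The short C sketch below reproduces that calculation; the bit_span type and the element_bits helper are hypothetical names introduced only for illustration, and the encoding is assumed to be the element width in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* First and last bit (inclusive) occupied by one array element. */
typedef struct bit_span {
    uint32_t first;
    uint32_t last;
} bit_span;

/* Bits occupied by the element at index when each element is enc bytes wide. */
static bit_span element_bits(uint32_t index, uint32_t enc) {
    uint32_t width;
    bit_span span;

    assert(enc == 2 || enc == 4 || enc == 8);
    width = enc * 8;                      /* element width in bits */
    span.first = index * width;
    span.last = (index + 1) * width - 1;
    return span;
}
```

With enc set to 4 (int32_t), index 2 maps to bits 64 to 95 and index 3 to bits 96 to 127, which matches the positions of elements 3 and 65,535 after the upgrade.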

Because every addition to an integer set can trigger an upgrade, and every upgrade requires converting the type of all elements already in the underlying array, adding a new element to an integer set has O(N) time complexity. Other upgrades, such as from INTSET_ENC_INT16 to INTSET_ENC_INT64 or from INTSET_ENC_INT32 to INTSET_ENC_INT64, proceed in the same way as the upgrade shown above.

Benefits of upgrading

The upgrade strategy of the integer set has two benefits: it makes the integer set more flexible, and it saves as much memory as possible.

Increased flexibility

Because C is a statically typed language, we usually avoid placing values of different types in the same data structure in order to prevent type errors. For example, we typically use an int16_t array only for int16_t values, an int32_t array only for int32_t values, and so on. However, because an integer set adapts to new elements by automatically upgrading its underlying array, we can freely add int16_t, int32_t, or int64_t integers to the set without worrying about type errors.

Saving memory

The simplest way to let one array hold int16_t, int32_t, and int64_t values is to use an int64_t array directly as the underlying implementation of the integer set. However, values that would fit in int16_t or int32_t would then also be stored as int64_t, which wastes memory. The integer set's approach lets the set accept values of all three types while ensuring that an upgrade happens only when it is actually needed, which saves as much memory as possible.
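
To put rough numbers on the saving, the C sketch below compares the size of the contents array under the upgrade-on-demand strategy with a fixed int64_t array. The function names width_for, adaptive_bytes, and fixed_bytes are hypothetical, the input is assumed to contain distinct values, and only the element storage is counted, not the structure header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bytes of the narrowest of int16_t, int32_t, int64_t that holds v. */
static size_t width_for(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return sizeof(int64_t);
    if (v < INT16_MIN || v > INT16_MAX) return sizeof(int32_t);
    return sizeof(int16_t);
}

/* Bytes used by the element array after adding values[0..n-1] in order,
 * widening only when a value does not fit the current element width. */
static size_t adaptive_bytes(const int64_t *values, size_t n) {
    size_t width = sizeof(int16_t);
    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t w = width_for(values[i]);
        if (w > width) width = w;   /* upgrade; the width never shrinks */
    }
    return n * width;
}

/* Bytes used if every element were always stored as int64_t. */
static size_t fixed_bytes(size_t n) {
    return n * sizeof(int64_t);
}
```

For the set {1, 2, 3} the adaptive layout needs 6 bytes against 24 for a fixed int64_t array; after adding 65,535 it needs 16 bytes against 32.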

Downgrade

Integer sets do not support downgrading: once the array has been upgraded, the encoding stays in the upgraded state. For example, for the integer set shown in Figure 1-11, even if we remove 4,294,967,295, the only element that really needs to be stored as int64_t, the encoding of the integer set remains INTSET_ENC_INT64 and the underlying array remains of type int64_t, as shown in Figure 1-12.

Figure 1-11 An integer set whose array is encoded as INTSET_ENC_INT64

Figure 1-12 The integer set after deleting 4,294,967,295
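
The no-downgrade behaviour can be illustrated with a small C sketch. The set64 type below is a hypothetical fixed-capacity stand-in for an INTSET_ENC_INT64 set, not the real intset structure, and the values used in the usage note (1, 2, 3, and 4,294,967,295) are assumed for illustration since the figure contents are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SET64_CAPACITY 8

/* Hypothetical stand-in for an integer set that already uses int64_t storage. */
typedef struct set64 {
    uint32_t encoding;                  /* element width in bytes */
    uint32_t length;                    /* number of elements */
    int64_t contents[SET64_CAPACITY];   /* sorted elements */
} set64;

/* Remove value if present; returns 1 on removal, 0 otherwise.
 * The encoding field is deliberately left untouched. */
static int set64_remove(set64 *s, int64_t value) {
    assert(s->length <= SET64_CAPACITY);
    for (uint32_t i = 0; i < s->length; i++) {
        if (s->contents[i] != value) continue;
        /* Close the gap by shifting the tail one slot to the left. */
        memmove(&s->contents[i], &s->contents[i + 1],
                (size_t)(s->length - i - 1) * sizeof(int64_t));
        s->length--;
        return 1;
    }
    return 0;
}
```

Removing 4,294,967,295 from a set holding 1, 2, 3, and 4,294,967,295 leaves three elements, yet the encoding field still reports 8 bytes per element.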

Integer set API

Table 1-1 lists the integer set API.

Table 1-1 Integer set API
Function | Role | Time complexity
intsetNew(void) | Creates a new integer set | O(1)
intsetAdd(intset *is, int64_t value, uint8_t *success) | Adds the given element to the integer set | O(N)
intsetRemove(intset *is, int64_t value, int *success) | Removes the given element from the integer set | O(N)
intsetFind(intset *is, int64_t value) | Checks whether the given value exists in the set | O(log N), because the underlying array is sorted and the lookup can use binary search
intsetRandom(intset *is) | Returns a random element from the integer set | O(1)
intsetGet(intset *is, uint32_t pos, int64_t *value) | Retrieves the element of the underlying array at the given index | O(1)
intsetLen(intset *is) | Returns the number of elements in the integer set | O(1)
intsetBlobLen(intset *is) | Returns the number of bytes of memory used by the integer set | O(1)
