An integer set (intset)

Last Update:2018-06-01 Source: Internet

Author: User


Integer set

An integer set (intset) stores multiple integer values in sorted order with no duplicates. It automatically chooses the integer type used to store the elements based on their values. For example, if int32_t is wide enough to hold the element with the largest absolute value in the intset, every element in the intset is stored as int32_t.

If the type currently used by the intset cannot hold a new element, the intset must be upgraded first. For example, if the new element needs int64_t and the intset currently uses int32_t, the upgrade first converts every existing element from int32_t to int64_t and then inserts the new element.

As I understand it, int16_t, int32_t, and int64_t typically correspond to short, int, and long long (int8_t, used for the contents array, corresponds to signed char). The fixed-width names guarantee the same sizes on every platform; see the stdint.h header for details.

Data Structure of an Integer Set

typedef struct intset {
    uint32_t encoding;  // size in bytes of the element type in use: 2, 4, or 8
    uint32_t length;    // number of elements
    int8_t contents[];  // array that stores the elements
} intset;

The value of encoding is one of the following three constants:

#define INTSET_ENC_INT16 (sizeof(int16_t))
#define INTSET_ENC_INT32 (sizeof(int32_t))
#define INTSET_ENC_INT64 (sizeof(int64_t))

The contents array holds the actual data. Its elements have two properties: there are no duplicates, and they are stored in ascending order.
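To make the choice of encoding concrete, here is a minimal sketch of how the narrowest sufficient type could be picked for a value. The helper name value_encoding is my own for illustration; it is meant to mirror the idea behind _intsetValueEncoding, not to reproduce the Redis source.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the constants above: the encoding is the element size in bytes. */
#define INTSET_ENC_INT16 (sizeof(int16_t))
#define INTSET_ENC_INT32 (sizeof(int32_t))
#define INTSET_ENC_INT64 (sizeof(int64_t))

/* Hypothetical helper: return the narrowest encoding that can hold v. */
uint8_t value_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX)
        return INTSET_ENC_INT64;   /* does not fit in 32 bits */
    else if (v < INT16_MIN || v > INT16_MAX)
        return INTSET_ENC_INT32;   /* fits in 32 bits but not in 16 */
    else
        return INTSET_ENC_INT16;   /* fits in 16 bits */
}
```

Because the constants are ordered by size, comparing two encodings with > or <= is enough to tell whether a value fits the current intset.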

Overview of the integer set API

Function Name	Function	Complexity
_intsetValueEncoding	Returns the encoding needed to store a given integer	O(1)
_intsetGet	Returns the integer at a given index	O(1)
_intsetSet	Sets the integer at a given index	O(1)
intsetNew	Creates an empty intset	O(1)
intsetResize	Reallocates memory for a given intset	O(N) in the worst case (realloc may copy the data)
intsetSearch	Checks whether a given integer is in the intset	O(log N)
intsetUpgradeAndAdd	Upgrades the intset, then inserts the element	O(N)
intsetAdd	Adds an element	O(N)
intsetMoveTail	Shifts the tail of the element array	O(N)
intsetRemove	Deletes an element	O(N)
intsetRandom	Returns a random element of the intset	O(1)
intsetLen	Returns the number of elements in the intset	O(1)
intsetBlobLen	Returns the total number of bytes the intset occupies	O(1)
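The O(log N) lookup is possible because the elements are kept sorted. The following is a minimal sketch of such a search over a plain int64_t array. The name sorted_search and its signature are assumptions for illustration, not the Redis function; like intsetSearch, it reports the insertion position when the value is missing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the Redis function: binary search over a sorted
 * array of n distinct int64_t values. Returns 1 and stores the index of value
 * in *pos when found; otherwise returns 0 and stores the index at which value
 * would have to be inserted to keep the array sorted. */
int sorted_search(const int64_t *a, size_t n, int64_t value, size_t *pos) {
    size_t lo = 0, hi = n;               /* search the half-open range [lo, hi) */
    assert(pos != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2; /* written this way to avoid overflow */
        if (a[mid] < value)
            lo = mid + 1;
        else if (a[mid] > value)
            hi = mid;
        else {
            *pos = mid;
            return 1;
        }
    }
    *pos = lo;
    return 0;
}
```

Returning the insertion position from a failed search means an add operation does not have to scan the array a second time.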

A brief look at the source code of the important APIs

intsetAdd

// Add an integer
intset *intsetAdd(intset *is, int64_t value, uint8_t *success) {
    uint8_t valenc = _intsetValueEncoding(value);  // encoding required by the value
    uint32_t pos;
    if (success) *success = 1;

    /* Upgrade encoding if necessary. If we need to upgrade, we know that
     * this value should be either appended (if > 0) or prepended (if < 0),
     * because it lies outside the range of existing values. */
    // if an upgrade is required, upgrade and insert the new value
    if (valenc > intrev32ifbe(is->encoding)) {
        /* This always succeeds, so we don't need to curry *success. */
        return intsetUpgradeAndAdd(is, value);
    } else {
        /* Abort if the value is already present in the set.
         * This call will populate "pos" with the right position to insert
         * the value when it cannot be found. */
        // if the value already exists in the set, return without changes
        if (intsetSearch(is, value, &pos)) {
            if (success) *success = 0;
            return is;
        }

        is = intsetResize(is, intrev32ifbe(is->length) + 1);
        // shift every element from pos onward back by one position
        if (pos < intrev32ifbe(is->length)) intsetMoveTail(is, pos, pos + 1);
    }

    _intsetSet(is, pos, value);  // store the new element
    is->length = intrev32ifbe(intrev32ifbe(is->length) + 1);
    return is;
}

When intsetAdd adds a value, it first compares the encoding the value requires with the current encoding of the intset to decide whether the intset must be upgraded. If so, it calls intsetUpgradeAndAdd. Otherwise, if the value is already in the intset, it returns without changing anything. If the value is not present, it resizes the intset, shifts every element from the insertion position onward back by one slot, and then stores the value.
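The non-upgrade path can be tried out in isolation. The sketch below applies the same steps (search, grow, shift, store) to a simplified set that always uses int64_t. The type sorted_set and the function sorted_set_add are hypothetical names of my own, and a linear scan stands in for intsetSearch to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified set: sorted, distinct int64_t values, one fixed width. */
typedef struct {
    size_t   length;
    int64_t *items;
} sorted_set;

/* Insert value, keeping items sorted and distinct.
 * Returns 1 if inserted, 0 if already present, -1 if allocation failed. */
int sorted_set_add(sorted_set *s, int64_t value) {
    size_t pos = 0;
    int64_t *grown;

    assert(s != NULL);

    /* 1. Find the value or its insertion position (linear scan for brevity). */
    while (pos < s->length && s->items[pos] < value) pos++;
    if (pos < s->length && s->items[pos] == value) return 0;

    /* 2. Grow the array by one slot. */
    grown = realloc(s->items, (s->length + 1) * sizeof(int64_t));
    if (grown == NULL) return -1;
    s->items = grown;

    /* 3. Shift the tail one slot to the right; the ranges overlap. */
    if (pos < s->length)
        memmove(&s->items[pos + 1], &s->items[pos],
                (s->length - pos) * sizeof(int64_t));

    /* 4. Store the value and update the length. */
    s->items[pos] = value;
    s->length++;
    return 1;
}
```

Starting from an empty set ({0, NULL}), adding 5, 1, and 3 leaves the items as 1, 3, 5, and adding 3 again returns 0 without changing the set.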

intsetMoveTail

/*
 * Shift part of the set with memmove. Indexes start at 0. When shifting
 * backward, the intset must already have been enlarged with intsetResize.
 *
 * Example:
 *   before  | 1 | 2 | 3 | 4 | 5 | 6 |      from = 1, to = 3, length = 6
 *   src   = | 2 | 3 | 4 | 5 | 6 |
 *   dst   = | 4 | 5 | 6 |
 *   bytes = 5 * sizeof(...)
 *   after   | 1 | 2 | 3 | 2 | 3 | 4 | 5 | 6 |
 *
 * If the result is unclear, it helps to read up on how memmove works. The
 * source and destination ranges overlap, which is why memmove must be used
 * instead of memcpy.
 */
static void intsetMoveTail(intset *is, uint32_t from, uint32_t to) {
    void *src, *dst;
    uint32_t bytes = intrev32ifbe(is->length) - from;
    uint32_t encoding = intrev32ifbe(is->encoding);

    if (encoding == INTSET_ENC_INT64) {
        src = (int64_t *)is->contents + from;
        dst = (int64_t *)is->contents + to;
        bytes *= sizeof(int64_t);
    } else if (encoding == INTSET_ENC_INT32) {
        src = (int32_t *)is->contents + from;
        dst = (int32_t *)is->contents + to;
        bytes *= sizeof(int32_t);
    } else {
        src = (int16_t *)is->contents + from;
        dst = (int16_t *)is->contents + to;
        bytes *= sizeof(int16_t);
    }
    memmove(dst, src, bytes);
}
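The example in the comment above can be reproduced with a plain array. This is a minimal sketch with a hypothetical helper, shift_tail, that assumes a fixed int64_t element type instead of the three encodings.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move the elements a[from .. length-1] so that they
 * start at index `to`. The caller must make sure the buffer has room for
 * to + (length - from) elements. */
void shift_tail(int64_t *a, size_t length, size_t from, size_t to) {
    assert(from <= length);
    /* memmove behaves as if it copied through a temporary buffer, so it is
     * safe when source and destination overlap; memcpy is undefined there. */
    memmove(a + to, a + from, (length - from) * sizeof(int64_t));
}
```

With an eight-slot buffer holding 1 2 3 4 5 6 in its first six slots, shift_tail(a, 6, 1, 3) yields 1 2 3 2 3 4 5 6, matching the comment. The same helper shifts toward the front when to is smaller than from, which is how removal closes a gap.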

intsetUpgradeAndAdd

// Upgrade the encoding and add the value. O(N)
// An upgrade is only needed when the value to insert is either greater than
// the current maximum or smaller than the current minimum of the set, so its
// sign alone decides which end it goes to.
static intset *intsetUpgradeAndAdd(intset *is, int64_t value) {
    uint8_t curenc = intrev32ifbe(is->encoding);   // current encoding
    uint8_t newenc = _intsetValueEncoding(value);  // new encoding
    int length = intrev32ifbe(is->length);
    int prepend = value < 0 ? 1 : 0;  // where the new value goes (1: head, 0: tail)

    /* First set new encoding and resize */
    is->encoding = intrev32ifbe(newenc);
    is = intsetResize(is, intrev32ifbe(is->length) + 1);

    /* Upgrade back-to-front so we don't overwrite values.
     * Note that the "prepend" variable is used to make sure we have an empty
     * space at either the beginning or the end of the intset. */
    // _intsetGetEncoded reads each value using the old encoding and _intsetSet
    // writes it back using the new one. If prepend is 1, the new value goes to
    // the head, so every original value moves back by one slot.
    while (length--)
        _intsetSet(is, length + prepend, _intsetGetEncoded(is, length, curenc));

    /* Set the value at the beginning or the end. */
    if (prepend)
        _intsetSet(is, 0, value);                         // insert at the head
    else
        _intsetSet(is, intrev32ifbe(is->length), value);  // insert at the tail
    is->length = intrev32ifbe(intrev32ifbe(is->length) + 1);
    return is;
}
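Why the loop runs from back to front is easiest to see on a small buffer. The sketch below widens int16_t values to int32_t in place and then adds one out-of-range value. The helper widen_and_add is hypothetical, handles only this one pair of widths, and ignores byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: buf holds n int16_t values and has room for
 * (n + 1) int32_t values. Widen the values in place to int32_t and add
 * `value`, which must lie outside the int16_t range, at the correct end. */
void widen_and_add(void *buf, size_t n, int32_t value) {
    unsigned char *bytes = buf;
    size_t prepend = value < 0 ? 1 : 0;  /* negative goes first, positive last */
    size_t i = n;

    assert(value < INT16_MIN || value > INT16_MAX);

    /* Walk from the last element to the first: the wide slot i + prepend
     * starts at or after the narrow slot i, so no unread value is overwritten. */
    while (i--) {
        int16_t narrow;
        int32_t wide;
        memcpy(&narrow, bytes + i * sizeof(int16_t), sizeof narrow);
        wide = narrow;
        memcpy(bytes + (i + prepend) * sizeof(int32_t), &wide, sizeof wide);
    }

    /* Store the new value in the slot left free at the front or the back. */
    memcpy(bytes + (prepend ? 0 : n) * sizeof(int32_t), &value, sizeof value);
}
```

Walking front to back instead would overwrite narrow values that have not been read yet, because the wide slot 0 already covers the narrow slots 0 and 1.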

intsetRemove

// Delete an integer
intset *intsetRemove(intset *is, int64_t value, int *success) {
    uint8_t valenc = _intsetValueEncoding(value);
    uint32_t pos;
    if (success) *success = 0;

    // the value can only be present if its encoding fits the current one
    if (valenc <= intrev32ifbe(is->encoding) && intsetSearch(is, value, &pos)) {
        uint32_t len = intrev32ifbe(is->length);

        /* We know we can delete */
        if (success) *success = 1;

        /* Overwrite value with tail and update length */
        // If pos is not the last element, delete the integer by moving the
        // tail over it with memmove. If it is the last element, shrinking
        // the intset is enough.
        if (pos < (len - 1)) intsetMoveTail(is, pos + 1, pos);
        is = intsetResize(is, len - 1);  // shrink the allocation
        is->length = intrev32ifbe(len - 1);
    }
    return is;
}
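Removal is the mirror image of insertion: the tail moves one slot toward the front and the length shrinks. Here is a minimal sketch on a plain int64_t array. The name sorted_remove is hypothetical, a linear scan replaces the binary search, and shrinking the allocation is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: remove value from the sorted array a of *length
 * elements. Returns 1 and decrements *length if the value was present,
 * 0 otherwise. Shrinking the allocation is left to the caller. */
int sorted_remove(int64_t *a, size_t *length, int64_t value) {
    size_t pos = 0;
    assert(length != NULL);

    /* Linear scan for brevity. */
    while (pos < *length && a[pos] < value) pos++;
    if (pos == *length || a[pos] != value) return 0;

    /* Overwrite the removed slot by moving the tail one slot to the left.
     * Nothing has to move when the last element is removed. */
    if (pos < *length - 1)
        memmove(&a[pos], &a[pos + 1], (*length - pos - 1) * sizeof(int64_t));
    (*length)--;
    return 1;
}
```

Removing 3 from 1 3 5 7 leaves 1 5 7 with a length of 3, while removing a value that is not in the array returns 0 and changes nothing.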

[Figure: flowchart of adding an element to an intset]

Summary

An intset stores multiple integer values in sorted order with no duplicates. Based on the values of its elements, it automatically selects the width of the integer type used to store them;

When a new element is added, intset checks whether the current encoding can hold it. If not, the intset is upgraded: each element then occupies more bytes, but its value does not change;

An intset can only be upgraded, never downgraded, so memory can be wasted once the large elements that forced the upgrade have been removed;

Because the elements of an intset are sorted, lookup by binary search takes O(log N) time.
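The memory cost of an upgrade can be estimated with a little arithmetic. This is a minimal sketch, assuming the layout shown in the struct earlier (two uint32_t header fields followed by the elements); the helper name blob_len is my own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: total bytes used by an intset-like layout with two
 * uint32_t header fields followed by `length` elements of `encoding` bytes. */
size_t blob_len(uint32_t length, uint32_t encoding) {
    return 2 * sizeof(uint32_t) + (size_t)length * encoding;
}
```

Under these assumptions, 1000 elements stored as int16_t take 2008 bytes. Adding a single value that needs int64_t raises this to 8016 bytes for 1001 elements, and removing that value again still leaves 8008 bytes, because the encoding is never lowered.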

Finally, I would like to thank Huang Jianhong (huangz1990): his book Redis Design and Implementation and his annotated Redis 2.6 source code were a great help to me in studying the Redis 2.8 source code.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
