Explanation of Redis internal data structures: Simple Dynamic String (sds)

Source: Internet
Author: User

Prerequisites

The following sizeof examples show how many bytes various types occupy. They make the address arithmetic on the sds data structure easier to follow.

typedef struct Node {
    int len;
    char str[5];
} Node;

typedef struct Node2 {
    int len;
    char str[];
} Node2;

sizeof(char*) = 4
sizeof(Node*) = 4
sizeof(Node)  = 12
sizeof(Node2) = 4

The sizeof results are explained as follows. The first two operands are pointers, so both are 4 bytes (these figures are from a 32-bit platform; a pointer is 8 bytes on a 64-bit one). The third is 12 because len occupies 4 bytes and char str[5] occupies 5 bytes, which memory alignment pads to 8. The last is 4 because char str[] has no declared length, so no memory is reserved for it inside the struct.

Besides sizeof, you also need to know va_list, va_start, va_end, and va_copy from stdarg.h. Plenty of explanations are available online.
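
The padding rule described above can be checked with a small sketch. The helpers round_up and predicted_node_size are hypothetical names introduced here for illustration, and the sketch assumes a C11 compiler (for _Alignof) and a platform where the struct is aligned like its int member, which is the common case.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node  { int len; char str[5]; } Node;
typedef struct Node2 { int len; char str[];  } Node2;

/* Round raw up to the next multiple of align (align must be > 0). */
static size_t round_up(size_t raw, size_t align) {
    assert(align > 0);
    return (raw + align - 1) / align * align;
}

/* Hypothetical helper: the size predicted for Node from its members.
 * len takes sizeof(int) bytes, str takes 5 bytes, and the total is
 * padded up to the alignment of int. */
static size_t predicted_node_size(void) {
    return round_up(sizeof(int) + 5, _Alignof(int));
}
```

With a 4-byte int, round_up(9, 4) gives 12, which matches sizeof(Node) in the listing, while the flexible array in Node2 adds nothing to sizeof(Node2).
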

Comparison between the simple dynamic string sds and char *

sds is the tool Redis uses to implement string objects, and it completely replaces char *.

char * offers very little and cannot meet Redis's need for efficient string processing. Its main performance bottleneck is that the string length has to be computed with strlen, which runs in O(N) time. Redis asks for string lengths very frequently, so O(N) is unacceptable; the sds implementation returns the length in O(1) time. In addition, appending to a char * string may require memory to be reallocated on every append.

Data structure of the simple dynamic string sds

typedef char *sds;

struct sdshdr {
    int len;     // length in use in buf, that is, the current string length
    int free;    // unused length available in buf, consumed during append
    char buf[];  // the actual string data
};

Adding the len field makes the string length available in O(1) time. Adding the free field means that when a string is appended and free is greater than or equal to the length of the string being appended, the data can be appended directly without reallocating memory. sizeof(struct sdshdr) = 8.
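
A minimal sketch of how the two fields are used, assuming the sdshdr layout shown above. The helpers hdr_new, hdr_len and hdr_append_in_place are hypothetical names for illustration and are not part of the Redis API; plain calloc stands in for the Redis allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical helper: an empty header with cap spare bytes in buf. */
static struct sdshdr *hdr_new(size_t cap) {
    /* calloc zeroes the block, so len == 0 and buf[0] == '\0' already. */
    struct sdshdr *sh = calloc(1, sizeof(struct sdshdr) + cap + 1);
    if (sh == NULL) return NULL;
    sh->free = (int)cap;
    return sh;
}

/* O(1): read the stored length instead of scanning for '\0'. */
static size_t hdr_len(const struct sdshdr *sh) {
    assert(sh != NULL);
    return (size_t)sh->len;
}

/* Append only when the spare space is large enough.
 * Returns 1 on success, 0 when a reallocation would be needed. */
static int hdr_append_in_place(struct sdshdr *sh, const char *t, size_t tlen) {
    if ((size_t)sh->free < tlen) return 0;
    memcpy(sh->buf + sh->len, t, tlen);
    sh->len += (int)tlen;
    sh->free -= (int)tlen;
    sh->buf[sh->len] = '\0';
    return 1;
}
```

Starting from hdr_new(8), appending "redis" succeeds in place and leaves len = 5 and free = 3; a further 4-byte append is refused because only 3 bytes remain.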

Function API in simple dynamic string sds

Function name | Function | Complexity
sdsnewlen | Creates an sds of the specified length, taking a given C string as its initial value. | O(N)
sdsempty | Creates an sds containing only the empty string "". | O(N)
sdsnew | Creates an sds from the given C string. | O(N)
sdsdup | Copies the given sds. | O(N)
sdsfree | Releases the given sds. | O(1)
sdsupdatelen | Updates the free and len values of the sdshdr corresponding to the given sds. | O(1)
sdsclear | Clears the buf of the given sds, resets it to "", and updates the free and len values of the corresponding sdshdr. | O(1)
sdsMakeRoomFor | Extends the buf of the sdshdr corresponding to the given sds. | O(N)
sdsRemoveFreeSpace | Releases the excess space in buf without modifying the sds contents. | O(N)
sdsAllocSize | Calculates the memory occupied by the given sds. | O(1)
sdsIncrLen | Extends or shrinks the right end of the buf of the given sds. | O(1)
sdsgrowzero | Extends the given sds to the specified length and fills the new part with \0. | O(N)
sdscatlen | Appends a C string to the buf of the sdshdr corresponding to the given sds. | O(N)
sdscpylen | Copies a C string into the sds, expanding the sds first if its total length is insufficient. | O(N)
sdscatprintf | Appends formatted output to the given sds. | O(N)
sdstrim | Removes from both ends of the given sds the characters that appear in the given C string. | O(N)
sdsrange | Truncates the given sds to the substring [start, end]. | O(N)
sdscmp | Compares two sds strings. | O(N)
sdssplitlen | Splits the given string s by the given separator sep. | O(N)

Detailed analysis of sds implementation in Redis
static inline size_t sdslen(const sds s) {
    struct sdshdr *sh = (void*)(s-(sizeof(struct sdshdr)));
    return sh->len;
}

static inline size_t sdsavail(const sds s) {
    struct sdshdr *sh = (void*)(s-(sizeof(struct sdshdr)));
    return sh->free;
}

The two functions sdslen and sdsavail return the string length of the given sds and the number of free bytes in it, respectively. Look closely and you will see that the parameter is an sds, that is, a char *, and yet a single line of code recovers the sdshdr structure that the sds belongs to. It looks like magic.

To see why this works, look at the code Redis uses to initialize an sds.
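
The header recovery can be tried in isolation with the sketch below. It assumes the sdshdr layout from the article; hdr_of and toy_make are hypothetical helpers introduced for illustration, and malloc stands in for the Redis allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef char *sds;

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Step back from the string pointer to the header that precedes it. */
static struct sdshdr *hdr_of(const sds s) {
    assert(s != NULL);
    return (void *)(s - sizeof(struct sdshdr));
}

/* Hypothetical helper: build a header by hand and hand out only buf. */
static sds toy_make(const char *text) {
    size_t n = strlen(text);
    struct sdshdr *sh = malloc(sizeof(struct sdshdr) + n + 1);
    if (sh == NULL) return NULL;
    sh->len = (int)n;
    sh->free = 0;
    memcpy(sh->buf, text, n + 1);   /* copies the trailing '\0' too */
    return sh->buf;
}
```

The caller only ever holds the char * returned by toy_make, but hdr_of(s)->len still yields the stored length because buf starts exactly sizeof(struct sdshdr) bytes after the start of the header. The block must be released through the header pointer, for example free(hdr_of(s)).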

/* init: C string, initlen: C string length */
sds sdsnewlen(const void *init, size_t initlen) {
    struct sdshdr *sh;

    if (init) {
        sh = zmalloc(sizeof(struct sdshdr)+initlen+1);
    } else {
        sh = zcalloc(sizeof(struct sdshdr)+initlen+1);
    }
    if (sh == NULL) return NULL;
    sh->len = initlen;
    sh->free = 0;
    if (initlen && init)
        memcpy(sh->buf, init, initlen);
    sh->buf[initlen] = '\0';
    return (char*)sh->buf;
}

/* Create a new sds string starting from a null terminated C string. */
sds sdsnew(const char *init) {
    size_t initlen = (init == NULL) ? 0 : strlen(init);
    return sdsnewlen(init, initlen);
}

The core function is sdsnewlen. The call sh = zmalloc(sizeof(struct sdshdr) + initlen + 1) allocates a single block of memory made up of two parts: the sdshdr header, whose size is sizeof(struct sdshdr), which we know is 8, and initlen + 1 bytes for buf. sdsnewlen returns the first address of buf. That is why sdslen can subtract sizeof(struct sdshdr) from the address of the given sds and arrive at the first address of the corresponding sdshdr, from which sh->len and sh->free can be read. This is a clever use of C pointers. It also hides the sdshdr structure: every external interface looks like an ordinary C string, yet the string length is available in O(1) and append operations no longer need to request memory so often.
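
The allocation layout can be reproduced outside Redis with the following sketch. It mirrors sdsnewlen but uses malloc and calloc as stand-ins for zmalloc and zcalloc; toy_newlen, toy_len and toy_free are hypothetical names, not Redis functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef char *sds;

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Sketch of sdsnewlen: one block holding header, payload and '\0'. */
static sds toy_newlen(const void *init, size_t initlen) {
    size_t total = sizeof(struct sdshdr) + initlen + 1;
    struct sdshdr *sh = init ? malloc(total) : calloc(1, total);
    if (sh == NULL) return NULL;
    sh->len = (int)initlen;
    sh->free = 0;
    if (initlen && init) memcpy(sh->buf, init, initlen);
    sh->buf[initlen] = '\0';
    return sh->buf;
}

/* Length comes from the header, not from scanning the bytes. */
static size_t toy_len(const sds s) {
    assert(s != NULL);
    const struct sdshdr *sh = (const void *)(s - sizeof(struct sdshdr));
    return (size_t)sh->len;
}

/* Free the whole block, starting at the header. */
static void toy_free(sds s) {
    if (s != NULL) free(s - sizeof(struct sdshdr));
}
```

Because the length is stored rather than derived from a terminator, a payload with an embedded '\0' keeps its full length: toy_newlen("a\0b", 3) reports 3 through toy_len, whereas strlen on the same pointer stops at 1.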

Analysis of the sds space extension operation

The functions in the sds module are fairly simple, so I will not go through them one by one. I will focus on how sds expands its space and how that expansion is used during append operations.

/* Enlarge the free space at the end of the sds string so that the caller
 * is sure that after calling this function can overwrite up to addlen
 * bytes after the end of the string, plus one more byte for nul term.
 *
 * Note: this does not change the *length* of the sds string as returned
 * by sdslen(), but only the free buffer space we have. */
/* Extends the buf of the sdshdr */
sds sdsMakeRoomFor(sds s, size_t addlen) {
    struct sdshdr *sh, *newsh;
    size_t free = sdsavail(s);               // free length of the current sds
    size_t len, newlen;

    if (free >= addlen) return s;            // enough room, no need to extend
    len = sdslen(s);                         // length of the current sds string
    sh = (void*) (s-(sizeof(struct sdshdr)));  // first address of the sdshdr
    newlen = (len+addlen);                   // new length of the sds after append
    if (newlen < SDS_MAX_PREALLOC)           // SDS_MAX_PREALLOC is (1024*1024)
        newlen *= 2;
    else
        newlen += SDS_MAX_PREALLOC;
    newsh = zrealloc(sh, sizeof(struct sdshdr)+newlen+1);  // reallocate the memory
    if (newsh == NULL) return NULL;          // failed to allocate memory

    newsh->free = newlen - len;
    return newsh->buf;
}

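
The growth rule in sdsMakeRoomFor can be isolated as a pure function, which makes the numbers easy to check. This is a sketch: grown_len and free_after_grow are hypothetical helper names, and SDS_MAX_PREALLOC is redefined locally with the value given in the comment above.

```c
#include <assert.h>
#include <stddef.h>

#define SDS_MAX_PREALLOC ((size_t)1024 * 1024)

/* New buf capacity chosen when len bytes are in use and addlen more
 * are needed: double below 1 MB, otherwise add a flat 1 MB. */
static size_t grown_len(size_t len, size_t addlen) {
    size_t newlen = len + addlen;
    if (newlen < SDS_MAX_PREALLOC)
        newlen *= 2;
    else
        newlen += SDS_MAX_PREALLOC;
    return newlen;
}

/* Value stored in free right after the reallocation. len is still
 * the old length at that point, so free also covers addlen. */
static size_t free_after_grow(size_t len, size_t addlen) {
    size_t newlen = grown_len(len, addlen);
    assert(newlen >= len);
    return newlen - len;
}
```

For a 5-byte string that needs 6 more bytes, the new capacity is (5 + 6) * 2 = 22 and free becomes 17. Once the required length reaches 1 MB, the capacity grows by exactly 1 MB beyond what is needed instead of doubling.
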
Summary

Compared with the C string char *, Redis's simple dynamic string sds has the following features:

1) The length of the string can be obtained in O(1) time.

2) Efficient string append operations

3) Binary safety

When appending, sds compares the available length of the current string with the length of the string to be appended. If the available length is greater than or equal to the length of the appended string, the data is appended directly, which avoids a memory reallocation. Otherwise, sds is first expanded with sdsMakeRoomFor, which doubles the required length when it is below SDS_MAX_PREALLOC and adds SDS_MAX_PREALLOC otherwise, and then the append is performed. The extra space left after expansion is not released, so that the next append can use it. The cost is some wasted memory, but when append operations are frequent this mechanism completes them efficiently.
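
The append flow described above can be sketched end to end as follows. This is a simplified illustration, not the Redis implementation: toy_new, toy_cat and hdr_of are hypothetical names, realloc stands in for zrealloc, and the growth rule is copied from sdsMakeRoomFor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SDS_MAX_PREALLOC ((size_t)1024 * 1024)

typedef char *sds;

struct sdshdr {
    int len;
    int free;
    char buf[];
};

static struct sdshdr *hdr_of(sds s) {
    return (void *)(s - sizeof(struct sdshdr));
}

/* Hypothetical constructor: exact-size buffer, no spare space. */
static sds toy_new(const char *init) {
    size_t n = strlen(init);
    struct sdshdr *sh = malloc(sizeof(struct sdshdr) + n + 1);
    if (sh == NULL) return NULL;
    sh->len = (int)n;
    sh->free = 0;
    memcpy(sh->buf, init, n + 1);
    return sh->buf;
}

/* Append t, growing the buffer only when free is too small. */
static sds toy_cat(sds s, const char *t) {
    assert(s != NULL && t != NULL);
    struct sdshdr *sh = hdr_of(s);
    size_t addlen = strlen(t);
    size_t len = (size_t)sh->len;

    if ((size_t)sh->free < addlen) {
        size_t newlen = len + addlen;
        newlen = (newlen < SDS_MAX_PREALLOC) ? newlen * 2
                                             : newlen + SDS_MAX_PREALLOC;
        struct sdshdr *newsh = realloc(sh, sizeof(struct sdshdr) + newlen + 1);
        if (newsh == NULL) return NULL;   /* the old block is still valid */
        sh = newsh;
        sh->free = (int)(newlen - len);
    }
    memcpy(sh->buf + len, t, addlen + 1); /* includes the trailing '\0' */
    sh->len = (int)(len + addlen);
    sh->free -= (int)addlen;
    return sh->buf;
}
```

Appending "bar" to toy_new("foo") forces one reallocation to a capacity of 12, leaving len = 6 and free = 6. A following toy_cat(s, "!!") then fits in the spare space, so the pointer does not change and no allocation takes place.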

The other sds functions are relatively simple. If you have any questions, feel free to raise them in the replies.

Note that one of the comments written by the sds author in the 2.8 source code is incorrect; it is not listed here.

Finally, I would like to thank Huang Jianhong (huangz1990): his book Redis Design and Implementation and his annotations on the Redis 2.6 source code helped me in studying the Redis 2.8 source code.
