Memcached source code analysis: slab memory allocator

Last Update:2015-01-16 Source: Internet

Author: User


When reprinting, please credit the source: http://blog.csdn.net/luotuo44/article/details/42737181

Tip: This article refers to several global variables that can be set when memcached starts. For their meanings, see "memcached startup parameters". Throughout, these variables are assumed to have their default values, as described in "How to read memcached source code".

Slab memory pool allocator:

Slab introduction:

Memcached allocates memory with a scheme called slab allocation. For details about slab allocation, refer to links 1 and 2. You can simply regard it as a memory pool that hands out blocks of fixed sizes. Although each block size is fixed, the pool offers many different sizes, which is generally enough to meet demand.

Memcached declares a slabclass_t struct type and defines a global array of slabclass_t named slabclass. Each element of the array can be called a slab allocator. Every block handed out by one slab allocator has the same size, and different slab allocators hand out different sizes. The following is a classic image:

The memory blocks allocated from each slab class (slab allocator) are linked together by pointers so that none of them is lost, as shown in the figure:

The figure above is a logical diagram. Each item is small, from a few bytes to 1 MB. If memcached called malloc for every item, memory would inevitably become badly fragmented. Instead, memcached requests a relatively large piece of memory, divides it into items, and links these items with two pointers (prev and next). The actual physical layout is as follows:

As the figure shows, each slabclass_t has a slab array. All slabs of the same slabclass_t hand out memory of the same size, and different slabclass_t elements hand out different sizes. Because the total memory that a single slab can provide has an upper limit, a slabclass_t that needs to allocate a lot of memory requires multiple slabs.
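The idea of carving one large block into equal pieces and linking them can be sketched in a few lines of C. This is a minimal illustration, not memcached's code: the chunk type and the carve_page and free_count functions are hypothetical names, and the chunk header keeps only the two list pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for memcached's item header: only the two list
 * pointers matter for this sketch. */
typedef struct chunk {
    struct chunk *prev;
    struct chunk *next;
} chunk;

/* Allocate one block holding `count` chunks of `size` bytes each and push
 * every chunk onto the free list at *head. Returns the block, or NULL if
 * malloc fails. */
static void *carve_page(chunk **head, size_t size, size_t count) {
    /* Each chunk must be able to hold the header and stay pointer-aligned. */
    assert(size >= sizeof(chunk) && size % sizeof(void *) == 0);
    char *page = malloc(size * count);
    if (page == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        chunk *c = (chunk *)(page + i * size);
        c->prev = NULL;
        c->next = *head;
        if (c->next)
            c->next->prev = c;
        *head = c;
    }
    return page;
}

/* Count the chunks currently on the free list. */
static size_t free_count(const chunk *head) {
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Because every chunk is pushed at the head, the list ends up in reverse address order: after carving four 64-byte chunks, the head is the chunk at offset 192 and its next pointer leads to the chunk at offset 128.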

Determining the allocation sizes of the slab allocators:

With the figures in mind, let's see how memcached determines the size that each slab allocator hands out. Because memcached relies on global variables here, let's look at those first.

    // slabs.c
    typedef struct {
        unsigned int size;      // size of the items handed out by this slab allocator
        unsigned int perslab;   // number of items each slab can hold

        void *slots;            // points to the free item list
        unsigned int sl_curr;   // number of free items

        // Number of slabs that have memory allocated. list_size is the
        // size of the slab array (slab_list).
        unsigned int slabs;     // number of slabs in use in this slabclass_t

        // The slab array. Each element is one slab, and all of them hand
        // out memory of the same size.
        void **slab_list;
        unsigned int list_size; // size of the slab array, list_size >= slabs

        // Used for reassignment: identifies which memory block of this
        // slabclass_t is to be handed over to another slabclass_t.
        unsigned int killing;
        size_t requested;       // number of bytes this slabclass_t has allocated
    } slabclass_t;

    #define POWER_SMALLEST 1
    #define POWER_LARGEST 200
    #define CHUNK_ALIGN_BYTES 8
    #define MAX_NUMBER_OF_SLAB_CLASSES (POWER_LARGEST + 1)

    // The array has MAX_NUMBER_OF_SLAB_CLASSES elements, but not all of
    // them are actually used. power_largest records how many are in use.
    static slabclass_t slabclass[MAX_NUMBER_OF_SLAB_CLASSES]; // 201 elements
    static int power_largest; // number of slabclass elements in use

The code above defines a global slabclass array, which is the slabclass_t array shown in the preceding figures. Although the array has 201 elements, not all of them are necessarily used; the global variable power_largest records how many are. Next, let's look at the slabs_init function, which initializes the array and is called from main.

    // slabs.c
    static size_t mem_limit = 0;    // maximum memory limit, set by the user
    static size_t mem_malloced = 0;

    // If the program is asked to preallocate memory instead of allocating
    // it only when needed, mem_base points to the preallocated memory,
    // mem_current points to the start of the part that is still unused,
    // and mem_avail is the number of bytes still available.
    static void *mem_base = NULL;
    static void *mem_current = NULL;
    static size_t mem_avail = 0;

    // The factor parameter is the growth factor. Its default value is 1.25.
    void slabs_init(const size_t limit, const double factor, const bool prealloc) {
        int i = POWER_SMALLEST - 1;

        // settings.chunk_size defaults to 48 and can be set with the -n
        // option when memcached starts. size has two parts: the item struct
        // itself and the data that belongs to the item, that is, the data
        // sent by commands such as set and add. The loop below grows size
        // by the growth factor, so the data length that can be stored
        // grows as well.
        unsigned int size = sizeof(item) + settings.chunk_size;

        mem_limit = limit; // user-supplied or default maximum memory limit

        // The user asked for one large block to be preallocated. Later
        // requests for memory are then served from this block.
        if (prealloc) { // default: false
            mem_base = malloc(mem_limit);
            if (mem_base != NULL) {
                mem_current = mem_base;
                mem_avail = mem_limit;
            } else {
                fprintf(stderr, "Warning: Failed to allocate requested memory in"
                        " one large chunk.\nWill allocate in smaller chunks\n");
            }
        }

        // Zero the array. This matters: every member of every element
        // starts out as 0.
        memset(slabclass, 0, sizeof(slabclass));

        // slabclass[0] is not used.
        // settings.item_size_max is the largest item memcached supports.
        // It defaults to 1 MB, which is why a stored value cannot exceed
        // 1 MB by default.
        while (++i < POWER_LARGEST && size <= settings.item_size_max / factor) {
            /* Make sure items are always n-byte aligned */
            if (size % CHUNK_ALIGN_BYTES) // 8-byte alignment
                size += CHUNK_ALIGN_BYTES - (size % CHUNK_ALIGN_BYTES);

            // Item size handed out by this slabclass element.
            slabclass[i].size = size;
            // Maximum number of items one slab of this class can hold
            // (which also bounds the memory one slab provides).
            slabclass[i].perslab = settings.item_size_max / slabclass[i].size;
            size *= factor; // grow
        }

        // The class for the largest item.
        power_largest = i;
        slabclass[power_largest].size = settings.item_size_max;
        slabclass[power_largest].perslab = 1;

        ...

        if (prealloc) { // preallocate memory
            slabs_preallocate(power_largest);
        }
    }

The items in the code above hold the data stored in memcached. The loop determines, for each slabclass_t in the slabclass array, the size of the items its slab allocator hands out and the number of items one slab can hold.

The code above also shows that memcached can store larger values if settings.item_size_max is increased. There is a limit, of course: the maximum is 128 MB. The maximum memory that one slab can provide is bounded by settings.item_size_max as well. Because of this upper limit, each slabclass_t in the slabclass array owns multiple slabs and manages them with an array. The size of that array is not restricted, so there can be many items of any particular size. The total memory that memcached as a whole can allocate is limited too. It can be set with the -m option when memcached starts and defaults to 64 MB. The limit parameter of slabs_init is this total.
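The sizing loop of slabs_init can be tried in isolation. The sketch below copies its arithmetic into a standalone function; size_class, build_classes, ALIGN_BYTES and MAX_CLASSES are hypothetical names chosen for this example, and the first chunk size is passed in rather than computed from sizeof(item).

```c
#include <assert.h>
#include <stddef.h>

#define ALIGN_BYTES 8   /* mirrors CHUNK_ALIGN_BYTES */
#define MAX_CLASSES 201 /* mirrors MAX_NUMBER_OF_SLAB_CLASSES */

typedef struct {
    unsigned int size;    /* chunk size in bytes */
    unsigned int perslab; /* chunks per slab */
} size_class;

/* Fill classes[1..n] and return n, the index of the largest class.
 * classes[0] is left untouched, as in memcached. The array must have
 * MAX_CLASSES elements. */
static int build_classes(size_class *classes, unsigned int first_size,
                         unsigned int item_size_max, double factor) {
    int i = 0;
    unsigned int size = first_size;
    assert(first_size > 0 && factor > 1.0);
    while (++i < MAX_CLASSES - 1 && size <= item_size_max / factor) {
        /* Round up to a multiple of ALIGN_BYTES. */
        if (size % ALIGN_BYTES)
            size += ALIGN_BYTES - (size % ALIGN_BYTES);
        classes[i].size = size;
        classes[i].perslab = item_size_max / size;
        size = (unsigned int)(size * factor);
    }
    /* The last class always holds exactly one item of maximum size. */
    classes[i].size = item_size_max;
    classes[i].perslab = 1;
    return i;
}
```

Assuming a first chunk size of 96 bytes, a 1 MB item_size_max and the default factor of 1.25, the first classes come out as 96, 120, 152 and 192 bytes. The raw products 150 and 190 are rounded up to the next multiple of 8, which is why the sequence is not an exact geometric progression.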

Pre-allocated memory:

Now assume that the user wants memory to be allocated in advance, instead of waiting until a client sends a storage command. The slabs_preallocate function preallocates free items for each slabclass_t element in use in the slabclass array. Because items may be quite small (as the code above shows), memory cannot be requested one item at a time, since that would easily fragment memory. Instead, each request obtains one large piece of memory, which is then divided into items, so many items are obtained at once. This article calls the requested piece of memory a memory page, that is, each request obtains one page. If the global variable settings.slab_reassign is true, the page size is settings.item_size_max. Otherwise, it is slabclass_t.size * slabclass_t.perslab. settings.slab_reassign mainly serves to rebalance memory among the slabclass_t elements. The terms memory page and page size are used consistently from here on.
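The page size rule is small enough to check by hand. The following sketch restates it as a function, with a helper that reports how many bytes at the end of a page no item can use. Both page_len and page_tail are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Page length for one slab class, following the rule described above. */
static size_t page_len(size_t size, size_t perslab, size_t item_size_max,
                       bool slab_reassign) {
    assert(size * perslab <= item_size_max);
    return slab_reassign ? item_size_max : size * perslab;
}

/* Bytes at the end of a page of length `len` that no item can use. */
static size_t page_tail(size_t size, size_t perslab, size_t len) {
    return len - size * perslab;
}
```

For a class with 96-byte items and perslab 10922 under a 1 MB item_size_max, the page is 1048512 bytes without slab_reassign and 1048576 bytes with it. In the second case the last 64 bytes of the page are unused, the price of making every page the same size so that it can move between classes.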

With preallocation requested, let's look at the slabs_preallocate function. Its argument is the number of slabclass array elements in use. The function allocates a slab memory block for each of them and sets up the items.

    // The argument is the number of slabclass array elements in use.
    // Allocates memory for each used element of the slabclass array.
    static void slabs_preallocate(const unsigned int maxslabs) {
        int i;
        unsigned int prealloc = 0;

        // Walk the slabclass array.
        for (i = POWER_SMALLEST; i <= POWER_LARGEST; i++) {
            if (++prealloc > maxslabs) // only the elements in use are visited
                return;
            if (do_slabs_newslab(i) == 0) { // allocate one memory page per slabclass_t
                // If the allocation fails, exit the program. The
                // preallocated memory is the foundation for everything
                // that runs later, so the code that follows could not
                // work without it.
                exit(1);
            }
        }
    }

    // The number of slabs in a slabclass_t grows gradually. This function
    // requests one more slab for a slabclass_t. The id parameter selects
    // the slabclass_t within the slabclass array.
    static int do_slabs_newslab(const unsigned int id) {
        slabclass_t *p = &slabclass[id];
        // settings.slab_reassign defaults to false, which is assumed here.
        int len = settings.slab_reassign ? settings.item_size_max
            : p->size * p->perslab; // the product is <= settings.item_size_max
        char *ptr;

        // The initial value of mem_malloced can be set through an
        // environment variable. It defaults to 0.
        if ((mem_limit && mem_malloced + len > mem_limit && p->slabs > 0) ||
            // Grow slab_list (returns 0 on failure). This normally
            // succeeds unless memory cannot be allocated.
            (grow_slab_list(id) == 0) ||
            // Allocate len bytes, that is, one page.
            ((ptr = memory_allocate((size_t)len)) == 0)) {
            return 0;
        }

        memset(ptr, 0, (size_t)len); // clearing the memory block is required
        // Cut the page into items. The item size is determined by id.
        split_slab_page_into_freelist(ptr, id);

        // Hand the allocated page to slab_list for management.
        p->slab_list[p->slabs++] = ptr;
        mem_malloced += len;
        return 1;
    }

The do_slabs_newslab function above calls three functions. grow_slab_list enlarges the slab array (the slab array shown in the earlier figure). memory_allocate requests len bytes of memory. split_slab_page_into_freelist splits the requested memory into multiple items and links them into a doubly linked list, as in the figure we have already seen. Let's take a look at the code.

The following describes the specific implementation of the three functions.

    // Enlarge the memory that the slab_list member points to, that is,
    // enlarge the slab_list array so that more slabs can be held.
    // Returns 1 unless memory allocation fails, whether or not the array
    // was actually enlarged.
    static int grow_slab_list(const unsigned int id) {
        slabclass_t *p = &slabclass[id];
        if (p->slabs == p->list_size) { // every element of slab_list is in use
            size_t new_size = (p->list_size != 0) ? p->list_size * 2 : 16;
            void *new_list = realloc(p->slab_list, new_size * sizeof(void *));
            if (new_list == 0) return 0;
            p->list_size = new_size;
            p->slab_list = new_list;
        }
        return 1;
    }

    // Request memory. If the program preallocated a memory block, take the
    // memory from that block. Otherwise call malloc.
    static void *memory_allocate(size_t size) {
        void *ret;

        // If the program is asked to preallocate memory instead of
        // allocating it only when needed, mem_base points to the
        // preallocated memory, mem_current points to the start of the
        // part that is still unused, and mem_avail is the number of bytes
        // still available.
        if (mem_base == NULL) { // no preallocated memory
            /* We are not using a preallocated large memory chunk */
            ret = malloc(size);
        } else {
            ret = mem_current;

            // The padding bytes added for alignment are never used, so
            // first check whether size exceeds the available memory and
            // only then compute the alignment.
            if (size > mem_avail) { // not enough memory left
                return NULL;
            }

            // Now apply the alignment. It does not matter if the aligned
            // size exceeds mem_avail, because the padding bytes are never
            // actually used.
            /* mem_current pointer _must_ be aligned!!! */
            if (size % CHUNK_ALIGN_BYTES) {
                // Round size up to a multiple of CHUNK_ALIGN_BYTES (8).
                size += CHUNK_ALIGN_BYTES - (size % CHUNK_ALIGN_BYTES);
            }

            mem_current = ((char *)mem_current) + size;
            if (size < mem_avail) {
                mem_avail -= size;
            } else { // the aligned size is not smaller than mem_avail, which is harmless
                mem_avail = 0;
            }
        }

        return ret;
    }

    // Split the memory page that ptr points to into items.
    static void split_slab_page_into_freelist(char *ptr, const unsigned int id) {
        slabclass_t *p = &slabclass[id];
        int x;
        for (x = 0; x < p->perslab; x++) {
            // Split the memory into perslab items and link each item to
            // its neighbors. do_slabs_free is the function a worker
            // thread calls to return memory to the memory pool, but
            // freshly requested memory can be injected into the pool the
            // same way, as if it were being returned.
            do_slabs_free(ptr, 0, id);
            ptr += p->size; // size is the size of one item
        }
    }

    static void do_slabs_free(void *ptr, const size_t size, unsigned int id) {
        slabclass_t *p;
        item *it;

        assert(((item *)ptr)->slabs_clsid == 0);
        assert(id >= POWER_SMALLEST && id <= power_largest);
        if (id < POWER_SMALLEST || id > power_largest)
            return;

        p = &slabclass[id];

        it = (item *)ptr;
        // Set ITEM_SLABBED in the item's it_flags, marking the item as
        // being in the slab and not handed out.
        it->it_flags |= ITEM_SLABBED;

        // When called from split_slab_page_into_freelist, the next four
        // lines point the prev and next members of the items at each
        // other, linking them together. When called by a worker thread
        // that returns memory to the pool, the same four lines insert the
        // item at the head of the free item list.
        it->prev = 0;
        it->next = p->slots;
        if (it->next) it->next->prev = it;
        p->slots = it; // slots points to the first free item

        p->sl_curr++;            // one more free item
        p->requested -= size;    // fewer bytes allocated by this slabclass_t
        return;
    }
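The preallocated branch of memory_allocate is a bump allocator: it returns the current position and then advances it by the aligned size. Here is a minimal standalone sketch of that technique. The arena struct and the arena_init and arena_alloc functions are hypothetical, and unlike the original, the sketch clamps the final step so the pointer never moves past the end of the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ALIGN_BYTES 8 /* mirrors CHUNK_ALIGN_BYTES */

/* Hypothetical arena type. memcached keeps the same three values in the
 * file-level variables mem_base, mem_current and mem_avail. */
typedef struct {
    char *base;    /* start of the preallocated block */
    char *current; /* first unused byte */
    size_t avail;  /* bytes still unused */
} arena;

/* Preallocate `limit` bytes. Returns 1 on success, 0 if malloc fails. */
static int arena_init(arena *a, size_t limit) {
    a->base = malloc(limit);
    a->current = a->base;
    a->avail = a->base ? limit : 0;
    return a->base != NULL;
}

/* Hand out `size` bytes from the arena, or NULL if they do not fit. */
static void *arena_alloc(arena *a, size_t size) {
    assert(a->base != NULL);
    /* Check the unpadded size first: the padding is never used. */
    if (size > a->avail)
        return NULL;
    void *ret = a->current;
    size_t step = size;
    if (step % ALIGN_BYTES)
        step += ALIGN_BYTES - (step % ALIGN_BYTES);
    if (step > a->avail) /* padding would run past the end: clamp */
        step = a->avail;
    a->current += step;
    a->avail -= step;
    return ret;
}
```

With a 64-byte arena, a 10-byte request consumes 16 bytes because of the alignment padding, so a following 8-byte request starts at offset 16 and leaves 40 bytes available. Nothing is ever returned to the arena, which is acceptable here because slab pages are never released individually in the code shown in this article.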

As the do_slabs_free function shows, it is also called when a worker thread returns memory to the memory pool. Because the items of one slab memory block are returned at different times, the item list becomes quite disordered after memcached has run for a while, and no longer looks like the figure above. It may look like this:

However disordered the physical layout, the list still forms a clean chain like the one in the earlier logical diagram, with the slots variable pointing to the first free item.
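A small experiment shows why the order of the list depends only on the order in which items are returned, not on their addresses. The sketch below uses a hypothetical node type with list_push and list_pop helpers. The push follows the same four steps as do_slabs_free, and the pop follows the steps do_slabs_alloc uses to take the first free item.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node holding only the two link pointers. */
typedef struct node {
    struct node *prev;
    struct node *next;
} node;

/* Insert a node at the head of the free list. */
static void list_push(node **head, node *n) {
    n->prev = NULL;
    n->next = *head;
    if (n->next)
        n->next->prev = n;
    *head = n;
}

/* Remove and return the node at the head, or NULL if the list is empty. */
static node *list_pop(node **head) {
    node *n = *head;
    if (n == NULL)
        return NULL;
    *head = n->next;
    if (n->next)
        n->next->prev = NULL;
    return n;
}
```

If four adjacent chunks are returned in the order 2, 0, 3, 1, they are handed out again in the order 1, 3, 0, 2. The list is last in, first out, so the most recently freed item is always the next one allocated.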

Requesting memory from the memory pool:

The counterpart of do_slabs_free is the do_slabs_alloc function, which is called when a worker thread requests memory from the memory pool. Before calling it, the caller must determine which element of the slabclass array to request from, based on the size of the memory needed. The slabs_clsid function does this.

    unsigned int slabs_clsid(const size_t size) { // returns an index into slabclass
        int res = POWER_SMALLEST; // res starts at 1

        // Returning 0 means the lookup failed, because the first element
        // of the slabclass array is unused.
        if (size == 0)
            return 0;

        // The item sizes of the slabclass elements are in ascending order,
        // so scanning from small to large finds the smallest element that
        // is large enough.
        while (size > slabclass[res].size)
            if (res++ == power_largest)     /* won't fit in the biggest slab */
                return 0;
        return res;
    }
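The lookup can be exercised on a toy table. The sketch below restates the same linear scan as a standalone function that takes the size table as a parameter. find_class is a hypothetical name, and the table passed to it is an assumption of this example rather than memcached's global array.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the smallest class whose chunk size can hold `size`
 * bytes, or 0 if size is 0 or nothing fits. sizes[0] is unused and
 * sizes[1..largest] must be in ascending order. */
static unsigned int find_class(const size_t *sizes, unsigned int largest,
                               size_t size) {
    if (size == 0)
        return 0;
    for (unsigned int id = 1; id <= largest; id++) {
        if (size <= sizes[id])
            return id;
    }
    return 0;
}
```

With classes of 96, 120, 152 and 192 bytes, a 97-byte request lands in class 2 and wastes 23 bytes, while a 193-byte request fits nowhere and yields 0. A smaller growth factor narrows the gaps between classes and reduces this waste, at the cost of more classes.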

In the do_slabs_alloc function, if the corresponding slabclass_t has free items, one is handed out directly. Otherwise, the slabclass_t must first be extended with a new slab to obtain free items, and then one is handed out. The code is shown below:

    // Request an item from slabclass. Before calling this function, the
    // caller has used slabs_clsid to determine which slabclass_t to
    // request from, and the id parameter identifies that slabclass_t.
    // If the slabclass_t has free items, one is taken from the free item
    // list. If it has none, a memory page is requested and an item is
    // taken from the new page. The return value is the item obtained, or
    // NULL if no memory is available.
    static void *do_slabs_alloc(const size_t size, unsigned int id) {
        slabclass_t *p;
        void *ret = NULL;
        item *it = NULL;

        if (id < POWER_SMALLEST || id > power_largest) { // index out of range
            MEMCACHED_SLABS_ALLOCATE_FAILED(size, 0);
            return NULL;
        }

        p = &slabclass[id];
        assert(p->sl_curr == 0 || ((item *)p->slots)->slabs_clsid == 0);

        // If p->sl_curr is 0, this slabclass_t has no free items, and
        // do_slabs_newslab is called to request a memory page.
        if (! (p->sl_curr != 0 || do_slabs_newslab(id) != 0)) {
            // Reached when p->sl_curr is 0 and do_slabs_newslab returns 0.
            /* We don't have more memory available */
            ret = NULL;
        } else if (p->sl_curr != 0) {
            // Unless do_slabs_newslab failed, execution always ends up
            // here, whether or not sl_curr was 0 to begin with.
            // p->slots points to the first free item, which is the one
            // handed out.
            /* return off our freelist */
            it = (item *)p->slots;
            p->slots = it->next; // slots now points to the next free item
            if (it->next) it->next->prev = 0;
            p->sl_curr--;        // one fewer free item
            ret = (void *)it;
        }

        if (ret) {
            p->requested += size; // more bytes allocated by this slabclass_t
        }

        return ret;
    }

As the do_slabs_alloc function shows, new free items are added by calling do_slabs_newslab.
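One consequence of the limit check in do_slabs_newslab is easy to miss: because of the p->slabs > 0 term, a class that owns no page yet is still granted its first page when the memory limit has already been reached. The following accounting-only sketch makes this visible. class_model, pool_model, new_page and alloc_chunk are hypothetical names, and free items are modeled as a counter instead of a linked list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accounting-only model of one slab class. */
typedef struct {
    size_t page_len;      /* bytes per page */
    unsigned perslab;     /* items per page */
    unsigned pages;       /* pages owned so far */
    unsigned free_chunks; /* free items available */
} class_model;

/* Hypothetical model of the global memory accounting. */
typedef struct {
    size_t limit;    /* 0 means unlimited, like mem_limit */
    size_t malloced; /* bytes handed out so far, like mem_malloced */
} pool_model;

/* Mirrors the limit check in do_slabs_newslab: refuse a new page only if
 * a limit is set, the page would exceed it, and the class already owns at
 * least one page. */
static bool new_page(pool_model *pool, class_model *c) {
    if (pool->limit && pool->malloced + c->page_len > pool->limit &&
        c->pages > 0)
        return false;
    c->pages++;
    c->free_chunks += c->perslab;
    pool->malloced += c->page_len;
    return true;
}

/* Mirrors the control flow of do_slabs_alloc: use a free item if there is
 * one, otherwise try to add a page first. Returns false when no memory is
 * available. */
static bool alloc_chunk(pool_model *pool, class_model *c) {
    if (c->free_chunks == 0 && !new_page(pool, c))
        return false;
    c->free_chunks--;
    return true;
}
```

With a 100-byte limit and two classes that use 60-byte pages of two items each, the first class gets one page and its third allocation fails. The second class still gets its first page, which brings the total to 120 bytes, above the configured limit.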

None of the code shown so far takes a lock, which is unusual for a heavy user of locks like memcached. In fact, some of it, such as the do_slabs_alloc function, must only be accessed under a lock. The locks do not appear above because memcached uses wrapper functions (a concept familiar to readers of "UNIX Network Programming"): the functions above are only called from wrappers that take the lock first. The following are two such wrappers.

    static pthread_mutex_t slabs_lock = PTHREAD_MUTEX_INITIALIZER;

    void *slabs_alloc(size_t size, unsigned int id) {
        void *ret;

        pthread_mutex_lock(&slabs_lock);
        ret = do_slabs_alloc(size, id);
        pthread_mutex_unlock(&slabs_lock);
        return ret;
    }

    void slabs_free(void *ptr, size_t size, unsigned int id) {
        pthread_mutex_lock(&slabs_lock);
        do_slabs_free(ptr, size, id);
        pthread_mutex_unlock(&slabs_lock);
    }
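The same convention works for any shared state: an unlocked do_ function that assumes the lock is held, and a public wrapper that takes the lock around it. Below is a minimal sketch using a shared counter. The names counter_lock, do_counter_add and counter_add are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0; /* shared state guarded by counter_lock */

/* Unlocked worker: callers must already hold counter_lock. */
static long do_counter_add(long delta) {
    counter += delta;
    return counter;
}

/* Locked wrapper: the entry point other modules are meant to call. */
long counter_add(long delta) {
    long ret;
    pthread_mutex_lock(&counter_lock);
    ret = do_counter_add(delta);
    pthread_mutex_unlock(&counter_lock);
    return ret;
}
```

Keeping the unlocked version separate lets code that already holds the lock call do_counter_add directly. Calling the wrapper from such code would try to lock the same default mutex a second time, which for a default mutex is undefined behavior and typically a deadlock.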

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
