memcached Source Analysis: the Slab Memory Allocator

Note: this article uses several global variables that can be set when memcached starts. Their exact meanings are covered in the detailed discussion of memcached's startup parameters; as suggested in "How to Read the memcached Source Code", the walkthrough below simply assumes their default values.

Slab Memory Pool Allocator:

Slab Introduction:

Memcached uses a memory allocation scheme called slab allocation; see link 1 and link 2 for background. You can simply think of it as a memory pool. Each memory block handed out by the pool has a fixed size, but memcached maintains many pools with different block sizes, so requests of many different sizes can generally be satisfied.
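
As a toy illustration of the idea (the class sizes below are invented for this sketch, not memcached's real ones), serving a request means finding the smallest fixed size class that fits:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size-class table in the spirit of slab allocation.
 * memcached derives its own table from a growth factor (shown later). */
static const size_t class_sizes[] = {96, 120, 152, 192, 240};
#define NUM_CLASSES (sizeof(class_sizes) / sizeof(class_sizes[0]))

/* Return the index of the smallest class that fits `need` bytes,
 * or -1 if no class is large enough. */
static int pick_class(size_t need) {
    for (size_t i = 0; i < NUM_CLASSES; i++)
        if (need <= class_sizes[i])
            return (int)i;
    return -1;
}
```

A 100-byte request is served from the 120-byte class: some internal waste is accepted in exchange for avoiding per-request malloc fragmentation.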

Memcached declares a slabclass_t struct type and defines a global array of slabclass_t named slabclass. Each element of this array can be called a slab allocator. The amount of memory a given slab allocator hands out per allocation is fixed, and different slab allocators hand out different amounts. Let's borrow a classic diagram to illustrate:

The memory blocks allocated from each slab class (slab allocator) are connected by pointers so they are not lost. As shown below:

That is a logical diagram. Each item is small, from a few bytes up to 1 MB. If every item were obtained with its own malloc call, the result would be heavy memory fragmentation. So memcached's approach is to request one larger block of memory first, divide it into items, and link the items together with two pointers (prev and next). The actual physical layout is therefore as follows:
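
A minimal, self-contained sketch of that trick (all names below are invented, not memcached's): request one block, then thread fixed-size chunks into a doubly linked free list instead of calling malloc once per item:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Each chunk begins with two link pointers, like memcached's items. */
typedef struct node {
    struct node *prev, *next;
} node_t;

/* Split `count` chunks of `chunk_size` bytes out of `base` and return the
 * head of the resulting free list. chunk_size must be >= sizeof(node_t). */
static node_t *carve(char *base, size_t chunk_size, int count) {
    node_t *head = NULL;
    for (int i = 0; i < count; i++) {
        node_t *n = (node_t *)(base + (size_t)i * chunk_size);
        n->prev = NULL;
        n->next = head;            /* head insertion, as memcached does */
        if (head) head->prev = n;
        head = n;
    }
    return head;
}
```

One malloc for the whole block, yet the pool can now hand out and take back items one at a time.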

Each slabclass_t has a slab array. All the slabs belonging to one slabclass_t allocate items of the same size, while different slabclass_t elements allocate items of different sizes. Because each individual slab can supply only a limited amount of memory, a slabclass_t that must hand out a lot of memory needs more than one slab.

Determining the allocation sizes of the slab allocators:

With the diagrams in mind, let's look at how memcached determines the allocation sizes of the slab allocators. Since memcached relies on global variables, start with those.

slabs.c file

typedef struct {
    unsigned int size;      /* size of the items this slab allocator hands out */
    unsigned int perslab;   /* how many items fit in each slab */

    void *slots;            /* points to the free item list */
    unsigned int sl_curr;   /* number of free items */

    unsigned int slabs;     /* number of slabs already backed by memory;
                               list_size is the capacity of the slab_list array */

    void **slab_list;       /* the slab array; every element is a slab that
                               allocates items of the same size */
    unsigned int list_size; /* capacity of the slab array, list_size >= slabs */

    unsigned int killing;   /* for reassign: which slab of this slabclass_t
                               is to be handed over to another slabclass_t */
    size_t requested;       /* number of bytes this slabclass_t has allocated */
} slabclass_t;

#define POWER_SMALLEST 1
#define POWER_LARGEST  200
#define CHUNK_ALIGN_BYTES 8
#define MAX_NUMBER_OF_SLAB_CLASSES (POWER_LARGEST + 1)

/* Although the array has MAX_NUMBER_OF_SLAB_CLASSES elements, not all of
   them are actually used; power_largest records how many are in use. */
static slabclass_t slabclass[MAX_NUMBER_OF_SLAB_CLASSES]; /* 201 elements */
static int power_largest; /* number of slabclass elements already in use */

As you can see, the code above defines a global slabclass array; this is the slabclass_t array from the earlier diagrams. Although the slabclass array has 201 elements, not all of them are necessarily used; the global variable power_largest records how many are. Now look at the slabs_init function, which initializes the array. It is called from the main function.

slabs.c file

static size_t mem_limit = 0;    /* maximum memory limit set by the user */
static size_t mem_malloced = 0;

/* If the program is asked to pre-allocate memory rather than allocate it on
   demand, then mem_base points to the pre-allocated block, mem_current
   points to the start of the memory still available, and mem_avail records
   how much memory is still available. */
static void *mem_base = NULL;
static void *mem_current = NULL;
static size_t mem_avail = 0;

/* The factor parameter is the growth factor; its default value is 1.25 */
void slabs_init(const size_t limit, const double factor, const bool prealloc) {
    int i = POWER_SMALLEST - 1;

    /* settings.chunk_size defaults to 48 and can be set with the -n option
       when starting memcached. size has two components: the item struct
       itself and the data the item carries (the data from a set or add
       command). In the loop below, size grows by the growth factor, so the
       amount of data an item can store grows as well. */
    unsigned int size = sizeof(item) + settings.chunk_size;

    mem_limit = limit; /* memory limit set by the user, or the default */

    /* The user asked for one large block of memory to be pre-allocated;
       later requests will be served from this block. */
    if (prealloc) { /* defaults to false */
        mem_base = malloc(mem_limit);
        if (mem_base != NULL) {
            mem_current = mem_base;
            mem_avail = mem_limit;
        } else {
            fprintf(stderr, "Warning: Failed to allocate requested memory in"
                    " one large chunk.\nWill allocate in smaller chunks\n");
        }
    }

    /* Initialize the array. This matters: every member of every element is
       zeroed. The first element of slabclass is not used. */
    memset(slabclass, 0, sizeof(slabclass));

    /* settings.item_size_max is the largest item memcached supports; it
       defaults to 1 MB (that is, the largest value that can be stored in
       memcached is 1 MB). */
    while (++i < POWER_LARGEST && size <= settings.item_size_max / factor) {
        /* Make sure items are always n-byte aligned */
        if (size % CHUNK_ALIGN_BYTES) /* 8-byte alignment */
            size += CHUNK_ALIGN_BYTES - (size % CHUNK_ALIGN_BYTES);

        /* the item size this slab allocator hands out */
        slabclass[i].size = size;
        /* the maximum number of items this slab allocator can hold per slab
           (which also determines how much memory one slab occupies) */
        slabclass[i].perslab = settings.item_size_max / slabclass[i].size;
        size *= factor; /* grow */
    }

    /* the largest item */
    power_largest = i;
    slabclass[power_largest].size = settings.item_size_max;
    slabclass[power_largest].perslab = 1;

    ...

    if (prealloc) { /* pre-allocate memory */
        slabs_preallocate(power_largest);
    }
}

The item that appears in the code above is the structure used to store the data we put into memcached. The loop in the code determines the item size each slabclass_t in the slabclass array can allocate, that is, the item size each slab allocator hands out, and also how many items each slab can hold.

The code above also shows that memcached can be made to store larger values by increasing settings.item_size_max. There is a limit, of course: at most 128 MB. The maximum amount of memory a single slab occupies is likewise bounded by settings.item_size_max. Because each slab can hold only a limited number of items, each slabclass_t in the slabclass array owns multiple slabs and manages them with an array. That array's size is unbounded, so a great many items of a particular size can exist. The total memory memcached may allocate is still limited; it can be set with the -m option when memcached starts and defaults to 64 MB. The limit parameter of slabs_init is that total.
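
The sizing loop can be replayed in isolation. The sketch below (compute_classes is an invented name) applies the same align-then-grow rule as slabs_init, with base_size standing in for sizeof(item) + settings.chunk_size:

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_ALIGN_BYTES 8

/* Grow `size` by `factor`, rounding each class up to an 8-byte boundary,
 * until item_size_max is reached, exactly as the slabs_init loop does.
 * `classes` receives the computed sizes; the return value is the count. */
static int compute_classes(size_t base_size, double factor,
                           size_t item_size_max, size_t *classes, int max) {
    size_t size = base_size;
    int n = 0;
    while (n < max - 1 && size <= item_size_max / factor) {
        if (size % CHUNK_ALIGN_BYTES) /* keep items 8-byte aligned */
            size += CHUNK_ALIGN_BYTES - (size % CHUNK_ALIGN_BYTES);
        classes[n++] = size;
        size = (size_t)(size * factor); /* grow by the factor */
    }
    classes[n++] = item_size_max; /* the largest class holds one item */
    return n;
}
```

With base_size 96 and factor 1.25 the first classes come out as 96, 120, 152, 192, ...: the 150 produced by growth is rounded up to 152 by the alignment step before the next multiplication.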

Pre-allocating memory:

Now suppose the user wants some memory allocated up front rather than waiting for clients to send storage commands. The slabs_preallocate function pre-allocates some free items for each used slabclass_t element of the slabclass array. Because an item can be quite small (as the code above shows), it would be unwise to request memory one item at a time (that easily causes fragmentation). So a larger block is requested and then divided into items, which is equivalent to requesting many items at once. This article calls such a requested block a memory page; that is, a page is requested. If the global variable settings.slab_reassign is true, the page size is settings.item_size_max; otherwise it equals slabclass_t.size * slabclass_t.perslab. settings.slab_reassign is mainly used to rebalance memory among the slabclass_t elements. From here on, the terms memory page and page size are used consistently.
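
The page-size rule just described is small enough to state as code. This is a sketch (page_len is an invented helper, not memcached's):

```c
#include <assert.h>
#include <stddef.h>

/* When slab_reassign is on, every class uses the same page size
 * (item_size_max), so whole pages can later be moved between classes;
 * otherwise a page is exactly large enough for perslab items. */
static size_t page_len(int slab_reassign, size_t item_size_max,
                       size_t size, unsigned int perslab) {
    return slab_reassign ? item_size_max : size * perslab;
}
```

Note that size * perslab is always <= item_size_max, since perslab was computed as item_size_max / size.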

Assuming the user does want pre-allocation, take a look at the slabs_preallocate function. Its parameter is the number of slabclass array elements in use. Calling slabs_preallocate allocates the slab memory pages and sets up the items.

/* The parameter is the number of slabclass array elements in use.
   Allocates memory for each used element of the slabclass array. */
static void slabs_preallocate(const unsigned int maxslabs) {
    int i;
    unsigned int prealloc = 0;

    /* traverse the slabclass array */
    for (i = POWER_SMALLEST; i <= POWER_LARGEST; i++) {
        if (++prealloc > maxslabs) /* only traverse the elements in use */
            return;
        if (do_slabs_newslab(i) == 0) {
            /* Allocate one memory page for each slabclass_t. If the
               allocation fails the program exits, because this pre-allocated
               memory is the foundation of everything that follows; if it
               fails here the rest of the code cannot run, so just quit. */
            exit(1);
        }
    }
}

/* The number of slabs in a slabclass_t grows gradually. This function adds
   one more slab to a slabclass_t; the id parameter identifies the
   slabclass_t within the slabclass array. */
static int do_slabs_newslab(const unsigned int id) {
    slabclass_t *p = &slabclass[id];

    /* settings.slab_reassign defaults to false; false is assumed here.
       Either way, len <= settings.item_size_max. */
    int len = settings.slab_reassign ? settings.item_size_max
                                     : p->size * p->perslab;
    char *ptr;

    /* mem_malloced can be set via an environment variable; it defaults to 0.
       grow_slab_list returns 0 on failure; it generally succeeds unless
       memory cannot be allocated. */
    if ((mem_limit && mem_malloced + len > mem_limit && p->slabs > 0) ||
        (grow_slab_list(id) == 0) ||
        ((ptr = memory_allocate((size_t)len)) == 0)) {
        /* memory_allocate requests len bytes (that is, one page) */
        return 0;
    }

    memset(ptr, 0, (size_t)len); /* the memory block must be zeroed */

    /* cut the memory into items; the item size is determined by id */
    split_slab_page_into_freelist(ptr, id);

    /* hand the newly allocated page over to slab_list */
    p->slab_list[p->slabs++] = ptr;
    mem_malloced += len;
    return 1;
}

Three functions are called inside do_slabs_newslab. grow_slab_list enlarges the slab array (the slab array shown in the diagram). memory_allocate requests a block of len bytes. split_slab_page_into_freelist divides the requested memory into multiple items and links them together into a doubly linked list, as shown below. We saw this diagram before; having read the code, look at it again.

Let's take a look at the specific implementations of the three functions.

/* Grow the memory that slab_list points to, that is, enlarge the slab_list
   array, so that there can be more slabs. Unless memory allocation fails it
   returns 1, whether or not the array was actually enlarged. */
static int grow_slab_list(const unsigned int id) {
    slabclass_t *p = &slabclass[id];
    if (p->slabs == p->list_size) {
        /* every element of the previously requested slab_list array is used */
        size_t new_size = (p->list_size != 0) ? p->list_size * 2 : 16;
        void *new_list = realloc(p->slab_list, new_size * sizeof(void *));
        if (new_list == 0) return 0;
        p->list_size = new_size;
        p->slab_list = new_list;
    }
    return 1;
}

/* Request memory. If the program pre-allocated a block, take the memory from
   that block; otherwise call malloc. */
static void *memory_allocate(size_t size) {
    void *ret;

    /* If the program pre-allocates memory instead of allocating on demand,
       mem_base points to the pre-allocated block, mem_current points to the
       start of the memory still available, and mem_avail records how much
       is left. */
    if (mem_base == NULL) { /* memory was not pre-allocated */
        /* We are not using a preallocated large memory chunk */
        ret = malloc(size);
    } else {
        ret = mem_current;

        /* The trailing bytes used only for alignment are meaningless (never
           used), so first check whether size exceeds the available memory,
           then handle alignment. */
        if (size > mem_avail) {
            return NULL; /* not enough memory available */
        }

        /* Now consider alignment. If the padded size exceeds mem_avail it
           does not matter, because the padding bytes are never really used. */
        /* mem_current pointer _must_ be aligned!!! */
        if (size % CHUNK_ALIGN_BYTES) {
            /* byte alignment: make size a multiple of CHUNK_ALIGN_BYTES (8) */
            size += CHUNK_ALIGN_BYTES - (size % CHUNK_ALIGN_BYTES);
        }

        mem_current = ((char *)mem_current) + size;
        if (size < mem_avail) {
            mem_avail -= size;
        } else {
            /* here size is larger than mem_avail */
            mem_avail = 0;
        }
    }
    return ret;
}

/* Divide the memory page pointed to by ptr into items */
static void split_slab_page_into_freelist(char *ptr, const unsigned int id) {
    slabclass_t *p = &slabclass[id];
    int x;
    for (x = 0; x < p->perslab; x++) {
        /* Cut the memory ptr points to into items, perslab of them in all,
           and link the items to each other. do_slabs_free is normally called
           when a worker thread returns memory to the pool, but handing this
           newly requested memory to the pool can also be seen as returning
           memory, so do_slabs_free is used to inject it into the pool. */
        do_slabs_free(ptr, 0, id);
        ptr += p->size; /* size is the size of one item */
    }
}

static void do_slabs_free(void *ptr, const size_t size, unsigned int id) {
    slabclass_t *p;
    item *it;

    assert(((item *)ptr)->slabs_clsid == 0);
    assert(id >= POWER_SMALLEST && id <= POWER_LARGEST);
    if (id < POWER_SMALLEST || id > POWER_LARGEST)
        return;

    p = &slabclass[id];

    it = (item *)ptr;
    /* Add the ITEM_SLABBED flag to the item's it_flags, marking the item as
       not allocated out of the slab. */
    it->it_flags |= ITEM_SLABBED;

    /* When called from split_slab_page_into_freelist, the following four
       lines make the items point to each other through prev and next,
       linking them together. When called because a worker thread is
       returning memory to the pool, the same four lines insert the item at
       the head of the free item list. */
    it->prev = 0;
    it->next = p->slots;
    if (it->next) it->next->prev = it;
    p->slots = it; /* slots points to the first free, usable item */

    p->sl_curr++;         /* number of free, usable items */
    p->requested -= size; /* reduce this slabclass_t's allocated byte count */
    return;
}

The comment in do_slabs_free notes that the function is also called when a worker thread returns memory to the pool. Because the items in one slab page are returned at different times, the free list becomes jumbled after memcached has been running for a while; it no longer looks like the earlier diagram. It may instead look like this:

Jumbled as that is, the logical picture in front of it is still a clean list, with the slots variable pointing to the first free item.

Requesting memory from the memory pool:

The counterpart of do_slabs_free is do_slabs_alloc, which is called when a worker thread requests memory from the pool. Before calling it, the caller must determine, from the size of the memory requested, which element of the slabclass array to draw from. The function slabs_clsid does exactly that.

/* returns the index into the slabclass array */
unsigned int slabs_clsid(const size_t size) {
    int res = POWER_SMALLEST; /* res starts at 1 */

    /* Returning 0 signals a failed lookup, because the first element of the
       slabclass array is unused. */
    if (size == 0)
        return 0;

    /* Because the item sizes the slabclass elements can allocate are in
       ascending order, scanning from small to large finds the smallest
       element that still fits. */
    while (size > slabclass[res].size)
        if (res++ == power_largest) /* won't fit in the biggest slab */
            return 0;
    return res;
}
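
Since the real function reads the global slabclass array, here is a standalone sketch of the same lookup with the class table passed in explicitly (clsid_lookup is an invented name):

```c
#include <assert.h>
#include <stddef.h>

/* Index 0 of `sizes` is unused, exactly as in memcached, so 0 doubles as
 * the "no fit" return value; `largest` is the last usable index. */
static unsigned int clsid_lookup(size_t size, const size_t *sizes,
                                 unsigned int largest) {
    unsigned int res = 1;      /* POWER_SMALLEST */
    if (size == 0) return 0;
    while (size > sizes[res])  /* sizes are ascending */
        if (res++ == largest)  /* won't fit in the biggest class */
            return 0;
    return res;
}
```

A request of exactly a class's size lands in that class; one byte more pushes it into the next class up, and anything beyond the largest class fails with 0.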

In do_slabs_alloc, if the corresponding slabclass_t has a free item, it is handed out directly. Otherwise a new slab must be obtained to produce some free items first. The code looks like this:

/* Request an item from slabclass. Before this function is called,
   slabs_clsid has already determined which slabclass_t to draw from; the id
   parameter names that slabclass_t. If it has a free item, one is taken from
   the free item queue; if not, a memory page is requested first and an item
   allocated from it. The return value is the item obtained, or NULL if no
   memory is available. */
static void *do_slabs_alloc(const size_t size, unsigned int id) {
    slabclass_t *p;
    void *ret = NULL;
    item *it = NULL;

    if (id < POWER_SMALLEST || id > POWER_LARGEST) { /* index out of bounds */
        MEMCACHED_SLABS_ALLOCATE_FAILED(size, 0);
        return NULL;
    }

    p = &slabclass[id];
    assert(p->sl_curr == 0 || ((item *)p->slots)->slabs_clsid == 0);

    /* If p->sl_curr equals 0, this slabclass_t has no free items, and
       do_slabs_newslab must be called to request a memory page. */
    if (! (p->sl_curr != 0 || do_slabs_newslab(id) != 0)) {
        /* reached when p->sl_curr is 0 and do_slabs_newslab returned 0 */
        /* We don't have more memory available */
        ret = NULL;
    } else if (p->sl_curr != 0) {
        /* Reached unless the do_slabs_newslab call failed, whether or not
           sl_curr was 0 at first. p->slots points to the first free item,
           which is the one handed out. */
        /* return off our freelist */
        it = (item *)p->slots;
        p->slots = it->next; /* slots now points to the next free item */
        if (it->next) it->next->prev = 0;
        p->sl_curr--; /* one fewer free item */
        ret = (void *)it;
    }

    if (ret) {
        p->requested += size; /* increase this slabclass's allocated bytes */
    }
    return ret;
}

As you can see, do_slabs_alloc also obtains new items by calling do_slabs_newslab.

None of the code so far has shown any locking, which would be strange for software as lock-heavy as memcached. In fact some of these functions, such as do_slabs_alloc, must only be accessed under a lock. The lock does not appear above because memcached uses wrapper functions (a concept familiar to readers of UNIX Network Programming): the functions above are only ever reached after their wrapper has taken the lock. Here are two of those wrapper functions.

static pthread_mutex_t slabs_lock = PTHREAD_MUTEX_INITIALIZER;

void *slabs_alloc(size_t size, unsigned int id) {
    void *ret;
    pthread_mutex_lock(&slabs_lock);
    ret = do_slabs_alloc(size, id);
    pthread_mutex_unlock(&slabs_lock);
    return ret;
}

void slabs_free(void *ptr, size_t size, unsigned int id) {
    pthread_mutex_lock(&slabs_lock);
    do_slabs_free(ptr, size, id);
    pthread_mutex_unlock(&slabs_lock);
}
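
The wrapping pattern itself can be shown with a toy example (the counter below is invented; only the lock / do_xxx / wrapper shape mirrors memcached):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

/* The unlocked worker never takes the lock itself; by convention it must
 * only be called with counter_lock already held. */
static long do_counter_add(long n) {
    counter += n;
    return counter;
}

/* The wrapping function: every public entry point brackets the worker
 * with lock/unlock, so the worker stays lock-free and easy to compose. */
static long counter_add(long n) {
    pthread_mutex_lock(&counter_lock);
    long ret = do_counter_add(n);
    pthread_mutex_unlock(&counter_lock);
    return ret;
}
```

This split also lets one locked wrapper call several do_xxx workers under a single lock acquisition, which is exactly why split_slab_page_into_freelist can call do_slabs_free directly.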
