STL: Simulating the Space Configurator

Source: Internet
Author: User

Contents

    • Problem
    • SGI Version Space Configurator-std::alloc
    • First-level Space Configurator
    • Second-level Space Configurator
    • Refill, Chunkalloc functions
    • Finally, the Simple_alloc interface that the Configurator encapsulates
Problem

In everyday C++ programming we lean on the STL standard library all the time. The containers vector and list are probably the ones we use most, and their convenience needs little introduction (they show up in almost every online-judge exercise). In day-to-day study, however, we tend to know much less about another major STL component: the space configurator (the allocator). It is nevertheless very useful, because it addresses the memory fragmentation that arises as memory is allocated and released.

Consider a region of memory allocated from the heap. When each block is released is up to the user and therefore unpredictable. Suppose 32 bytes in total have just been released: if those bytes are separated by blocks that are still in use, they are only fragments, and a request for 20 contiguous bytes cannot be satisfied from them even though 32 bytes are nominally free.
In a multithreaded environment the impact of fragmentation is even greater: many threads request and release memory frequently, and the blocks involved are very small. Over the course of execution these fragments waste memory, and having the OS manage such frequent operations is bound to hurt efficiency.
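
The scenario can be made concrete with a toy model. The sketch below is only an illustration under assumed names (ToyArena, make_fragmented_arena, total_free and largest_free_run are hypothetical and not part of any STL): a 64-byte arena in which 8-byte blocks are alternately in use and free.

```cpp
#include <cstddef>

// Toy model (hypothetical): one flag per byte, true means "in use".
struct ToyArena {
    bool used[64];
};

// Arena in which 8-byte blocks are alternately in use and free.
inline ToyArena make_fragmented_arena() {
    ToyArena arena;
    for (std::size_t i = 0; i < 64; ++i)
        arena.used[i] = ((i / 8) % 2 == 0);
    return arena;
}

// Total number of free bytes, contiguous or not.
inline std::size_t total_free(const ToyArena& arena) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < 64; ++i)
        if (!arena.used[i]) ++count;
    return count;
}

// Length of the longest run of contiguous free bytes.
inline std::size_t largest_free_run(const ToyArena& arena) {
    std::size_t best = 0, run = 0;
    for (std::size_t i = 0; i < 64; ++i) {
        run = arena.used[i] ? 0 : run + 1;
        if (run > best) best = run;
    }
    return best;
}
```

For this arena total_free reports 32 while largest_free_run reports 8, so a request for 20 contiguous bytes fails even though 32 bytes are free.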

SGI Version Space Configurator-std::alloc

In the STL the configurator always works silently behind every component (more precisely, behind every container). From the point of view of the STL implementation, however, it is the first thing we need to understand: nearly every object we handle through the STL lives in a container, and a container has to obtain space in which to place its data. Not knowing how that works will get in the way of studying the STL in depth later.
In SGI STL, std::alloc is the default space configurator:
vector<int, std::alloc> iv;
Admittedly this is not the standard notation (the standard name is allocator), and alloc takes no parameters. That does not get in our way, because it is the default and we seldom need to name the configurator ourselves. (SGI does not use the more standard allocator for efficiency reasons; see the book "STL Source Code Analysis" for details.) Today we mainly look at how the alloc configurator is implemented, to deepen our understanding of space allocation.

What the configurator does is allocate space before an object is constructed and release that space after the object is destroyed. SGI's design takes the following into account:

    • Obtaining space from the system heap
    • Multithreading
    • What to do when memory runs low
    • The fragmentation that too many small blocks can cause

Based on this, alloc is implemented as a two-level configurator. The first-level configurator uses malloc and free directly. The second-level configurator chooses a strategy according to the request: when the requested block is larger than 128 bytes, it is treated as a large request and handed straight to the first-level configurator; when the requested block is 128 bytes or smaller, it is managed by the second-level configurator itself (which may still fall back on the first-level configurator, for reasons explained below).
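
The dispatch rule can be written down in a few lines. This is a minimal sketch rather than SGI's code: Level, kMaxSmallBytes and choose_level are hypothetical names introduced here, and only the 128-byte threshold comes from the text.

```cpp
#include <cstddef>

// Hypothetical names for illustration; only the 128-byte threshold is from the text.
enum class Level { First, Second };

constexpr std::size_t kMaxSmallBytes = 128;

// Requests larger than 128 bytes go to the first-level configurator
// (malloc/free); everything else is served by the second-level one.
constexpr Level choose_level(std::size_t bytes) {
    return bytes > kMaxSmallBytes ? Level::First : Level::Second;
}
```

With this rule a 128-byte request is still a "small" request, while 129 bytes already goes to malloc.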

First-level Space Configurator

First, the first-level configurator is named __malloc_alloc_template<0> in SGI STL, and it is typedef'd to alloc only when __USE_MALLOC is defined. Note that __USE_MALLOC is usually not defined, so the first-level configurator is not the STL's default configurator.


Simulated implementation of the first-level configurator:

#pragma once
#include <cstdlib>      // malloc, free
#include <iostream>
#include <new>          // bad_alloc
#include <windows.h>    // Sleep (Windows only)
using namespace std;

// First-level space configurator
typedef void (*HANDLE_FUNC)();

template <int inst>   // inst is a reserved parameter for later extension
class __MallocAllocTemplate {
private:
    /* Function pointer member used to call back a user-defined
       memory-release function. By default it is 0 and nothing is called. */
    static HANDLE_FUNC __malloc_alloc_oom_handler;

    static void* Oom_malloc(size_t n) {
        while (1) {
            if (0 == __malloc_alloc_oom_handler) {
                throw bad_alloc();
            } else {
                __malloc_alloc_oom_handler();   // let the user free memory
                Sleep(200);
                void* ret = malloc(n);
                if (ret)
                    return ret;
            }
        }
    }

public:
    static void* Allocate(size_t n) {
        void* result = malloc(n);
        // malloc failed: let Oom_malloc try to obtain the memory
        if (0 == result)
            result = Oom_malloc(n);
        cout << "Allocation succeeded!" << endl;
        return result;
    }

    static void Deallocate(void* p, size_t /* n */) {
        free(p);
    }

    /* Install the handler used by Oom_malloc and return the previous one */
    static HANDLE_FUNC SetMallocHandler(HANDLE_FUNC f) {
        HANDLE_FUNC old = __malloc_alloc_oom_handler;   // was "old = f", which lost the previous handler
        __malloc_alloc_oom_handler = f;
        return old;
    }
};

template <int inst>
HANDLE_FUNC __MallocAllocTemplate<inst>::__malloc_alloc_oom_handler = 0;

// User-defined memory-release function
static void FreeMemory() {
    cout << "Running the user-defined function to start freeing memory..." << endl;
}

void Test_Alloc1();
void Test_Alloc2();


Two points about the first-level configurator implementation deserve attention:

Both Allocate and Deallocate are thin wrappers around malloc and free, and every member function of the class is a static member function.

    • A static member function can be called directly from outside the class without creating an object. Note that the configurator serves the whole process: different containers in one process all obtain memory from the space configurator, so its interface is made globally accessible. C++ values encapsulation, however, so the functions are still wrapped in a class.

It implements a static void* Oom_malloc(size_t) function. This function is called after a malloc call fails, and by default it simply throws a bad_alloc exception. The design, however, leaves room for extension.

    • The first-level configurator class declares a function pointer member, __malloc_alloc_oom_handler. If you have your own routine for releasing memory (FreeMemory in the code above), you can store it in that member, and Oom_malloc will call it back to free some memory so that the retried malloc can succeed.
    • The __malloc_alloc_oom_handler member is set through static HANDLE_FUNC SetMallocHandler(HANDLE_FUNC f).
    • Such a handler is a strategy you design yourself. Providing one makes the space configurator more effective, because it gives malloc every chance to succeed. This is usually something for experts to play with; here we simply initialize the handler to 0 and use the default behavior.
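
To show the handler mechanism in action without actually exhausting memory, the following sketch simulates a failing malloc. Everything in it is an assumption made for illustration (OomHandler, try_malloc, release_reserve, allocate_with_retry and the two demo functions are hypothetical names); it mirrors the retry loop of Oom_malloc but is not the author's class.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-ins for illustration (not SGI code).
typedef void (*OomHandler)();

static OomHandler g_oom_handler = 0;      // no handler installed by default
static void* g_reserve = 0;               // emergency block the handler can give back
static bool g_simulate_failure = false;   // pretend that the heap is exhausted

// Behaves like malloc, except that it fails while g_simulate_failure is set.
void* try_malloc(std::size_t n) {
    return g_simulate_failure ? 0 : std::malloc(n);
}

// Handler: release the reserve so that the next attempt can succeed.
void release_reserve() {
    std::free(g_reserve);
    g_reserve = 0;
    g_simulate_failure = false;
}

// Retry loop in the spirit of Oom_malloc: call the handler, then try again;
// with no handler installed, give up by throwing std::bad_alloc.
void* allocate_with_retry(std::size_t n) {
    void* p = try_malloc(n);
    while (p == 0) {
        if (g_oom_handler == 0) throw std::bad_alloc();
        g_oom_handler();
        p = try_malloc(n);
    }
    return p;
}

// Returns true if the allocation succeeded only after the handler ran.
bool demo_handler_rescues_allocation() {
    g_reserve = std::malloc(1024);
    g_simulate_failure = true;
    g_oom_handler = release_reserve;
    void* p = allocate_with_retry(64);
    bool ok = (p != 0) && (g_reserve == 0);
    std::free(p);
    g_oom_handler = 0;
    return ok;
}

// Returns true if, with no handler, the failure surfaces as std::bad_alloc.
bool demo_no_handler_throws() {
    g_simulate_failure = true;
    g_oom_handler = 0;
    bool threw = false;
    try {
        allocate_with_retry(64);
    } catch (const std::bad_alloc&) {
        threw = true;
    }
    g_simulate_failure = false;
    return threw;
}
```

The standard library offers the same idea for operator new through std::set_new_handler; the sketch only makes the control flow visible.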

That completes the first-level configurator. As the code above shows, a first-level configurator that merely wraps malloc and free is not efficient, so the real design effort in the STL goes into the second-level configurator described next.



Second-level Space Configurator

When the caller needs 128 bytes or less, the second-level configurator allocates the memory. It does more than that: it also reclaims memory when blocks are released. It is the second-level configurator that makes the space configurator allocate memory quickly, and the reason lies in its structure.
It consists of two parts:

    • A memory pool (a chunk of memory)
    • a set of free lists (freelist)

The memory pool is delimited by two pointers, _startfree and _endfree. They act like water marks and together give the size of the pool.
The free list is an array of 16 pointers whose slots are spaced in multiples of 8: they manage small blocks of 8, 16, 24, ..., 120 and 128 bytes respectively. Each slot heads a linked list that chains together free memory blocks of the same size. (It looks a lot like a hash bucket!)

The allocation procedure:

When a container requests a small block (128 bytes or less) from the configurator, the configurator first tries to take a block from the corresponding list. Specifically, it rounds the requested size up to a multiple of 8 to compute the subscript in the pointer array, then removes the first block from that list and returns it. When the user has finished with a block and releases it, the same calculation gives the subscript, and the block is inserted at the head of the corresponding list.
(The actual code computes these subscripts with two more precise and efficient functions, shown later; this is only a simplified description.)
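
The subscript arithmetic is easy to check by hand. The sketch below restates it as two stand-alone functions; the lower-case names round_up and freelist_index and the k-prefixed constants are hypothetical, chosen here only so that the block compiles on its own.

```cpp
#include <cstddef>

// Constants as described in the text: 8-byte spacing, 128-byte maximum, 16 lists.
enum { kAlign = 8, kMaxBytes = 128, kNumFreeLists = kMaxBytes / kAlign };

// Round a request up to the next multiple of 8
// (the bit mask works because 8 is a power of two).
constexpr std::size_t round_up(std::size_t bytes) {
    return (bytes + kAlign - 1) & ~static_cast<std::size_t>(kAlign - 1);
}

// Map a request of 1..128 bytes to its slot 0..15 in the free-list array.
constexpr std::size_t freelist_index(std::size_t bytes) {
    return (bytes + kAlign - 1) / kAlign - 1;
}
```

For example, a 5-byte request is rounded up to 8 and served from slot 0, a 9-byte request is rounded up to 16 and served from slot 1, and a 128-byte request uses the last slot, 15.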


Next, look at the structure of a list node and how nodes are linked.
The second-level configurator contains this structure:

union Obj {
    union Obj* _freelistlink;
    char client_data[1];    /* The client sees this. Used for debugging. */
};
    • Notice that this is a union. While a block is free, its first 4 bytes (the size of an Obj* on a 32-bit platform) are treated as an Obj* that links to the next free block; once the block is handed to the user, it stores the user's data directly. Most blocks are larger than an Obj*, but all we want is to chain the blocks together, and we do not need to look at the rest of their contents, so casting a block's address to Obj* is enough to link blocks of any size.
    • The block sizes under successive subscripts of the free list grow in steps of 8, and the minimum is 8 bytes. The reason is simple: we also have to consider 64-bit machines, where a pointer takes 8 bytes, and every block must be able to hold at least one Obj* so that the small blocks can be chained together.
    • This has a consequence. If we need only 5 bytes, we still receive an 8-byte block, and 3 bytes are wasted. The second-level configurator therefore introduces another problem, internal fragmentation (what the free list solves is the external fragmentation caused by allocating directly from the OS). We cannot use every byte of a linked small block; nothing is perfect.
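
The way a free block doubles as a list node can be sketched as follows. The union has the same shape as Obj above; push_block, pop_block and demo_free_list_round_trip are hypothetical helpers written for this illustration, not functions from the STL source.

```cpp
#include <cstddef>
#include <cstdlib>

// Same shape as the Obj union in the text; the helper names are hypothetical.
union Obj {
    union Obj* _freelistlink;   // valid while the block sits in a free list
    char client_data[1];        // the user's view of the same bytes
};

// Push a released block onto the head of a list: its first bytes become the link.
inline void push_block(Obj*& head, void* block) {
    Obj* node = static_cast<Obj*>(block);
    node->_freelistlink = head;
    head = node;
}

// Pop the first block of a list (returns a null pointer if the list is empty).
inline void* pop_block(Obj*& head) {
    Obj* node = head;
    if (node != 0) head = node->_freelistlink;
    return node;
}

// Carve a 32-byte buffer into four 8-byte blocks, thread them through a
// list, and count how many come back out. Returns -1 on any surprise.
inline int demo_free_list_round_trip() {
    char* buffer = static_cast<char*>(std::malloc(32));
    if (buffer == 0) return -1;
    Obj* head = 0;
    for (std::size_t off = 0; off < 32; off += 8)
        push_block(head, buffer + off);
    // Head insertion means the block pushed last is the first one out.
    bool lifo = (head == reinterpret_cast<Obj*>(buffer + 24));
    int count = 0;
    while (pop_block(head) != 0) ++count;
    std::free(buffer);
    return lifo ? count : -1;
}
```

No separate node objects are allocated: the link lives inside the free block itself, which is why every block must be at least as large as one pointer.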

Everything above rests on one major premise: that the free list has small blocks hanging under it. What happens when no blocks are attached to a slot of the free list? That is the job of the two functions described below, Refill and ChunkAlloc.

Interfaces of the second-level configurator:
#pragma once
#include <cstdio>        // printf, used by Refill and ChunkAlloc
#include "Allocator.h"   // first-level configurator

// Second-level space configurator
template <bool threads, int inst>
class __DefaultAllocTemplate {
public:
    // Subscript of the free list that serves n bytes, e.g. 72 -> index 8
    static size_t FREELIST_INDEX(size_t n) {
        return ((n + __ALIGN - 1) / __ALIGN - 1);
    }
    // Round bytes up to a multiple of __ALIGN
    // (static, because the static ChunkAlloc calls it)
    static size_t ROUND_UP(size_t bytes) {
        return (((bytes) + __ALIGN - 1) & ~(__ALIGN - 1));
    }

    static void* ChunkAlloc(size_t size, size_t& nobjs);   // get a large chunk of memory
    static void* Refill(size_t bytes);                     // fill a free list
    static void* Allocate(size_t n);                       // allocate and return a small block
    static void Deallocate(void* p, size_t n);             // reclaim a block

private:
    enum { __ALIGN = 8 };
    enum { __MAX_BYTES = 128 };
    enum { __NFREELISTS = __MAX_BYTES / __ALIGN };

    union Obj {
        union Obj* _freelistlink;
        char client_data[1];    /* The client sees this. Used for debugging. */
    };

    // Free lists
    static Obj* _freelist[__NFREELISTS];
    // Memory pool
    static char* _startfree;
    static char* _endfree;
    static size_t _heapsize;
};

// Initialization of the static members of __DefaultAllocTemplate
template <bool threads, int inst>
typename __DefaultAllocTemplate<threads, inst>::Obj*
__DefaultAllocTemplate<threads, inst>::_freelist[__NFREELISTS] = {0};

// Memory pool
template <bool threads, int inst>
char* __DefaultAllocTemplate<threads, inst>::_startfree = NULL;

template <bool threads, int inst>
char* __DefaultAllocTemplate<threads, inst>::_endfree = NULL;

template <bool threads, int inst>
size_t __DefaultAllocTemplate<threads, inst>::_heapsize = 0;



Refill, Chunkalloc functions

As mentioned earlier, when the slot of the free list that matches the request has no blocks, we must call Refill to replenish that list. Refill normally asks for enough memory for 20 blocks of the requested size at a time (see the STL source code).
And where does that memory come from? From the memory pool, of course; handing out large chunks to the second-level configurator is exactly what it is for. Memory can be "cut" from the pool quickly by advancing the _startfree pointer, then divided into blocks of the requested size and hung under the free list. In the process, the first block is returned directly to the user, and the rest are hung on the free list, ready for the next allocation.


Based on this idea, Refill can be implemented as follows:

template <bool threads, int inst>
void* __DefaultAllocTemplate<threads, inst>::Refill(size_t bytes) {
    size_t nobjs = 20;   /* by default take 20 objects from the memory pool */
    // Get one large chunk from the memory pool
    char* chunk = (char*)ChunkAlloc(bytes, nobjs);
    if (nobjs == 1)      /* only one block was obtained */
        return chunk;

    size_t index = FREELIST_INDEX(bytes);
    printf("Returning one object and hanging the remaining %zu objects under freelist[%zu]\n",
           nobjs - 1, index);

    // The first block goes to the caller; chain blocks 1 .. nobjs-1
    Obj* cur = (Obj*)(chunk + bytes);
    _freelist[index] = cur;
    for (size_t i = 0; i < nobjs - 2; ++i) {
        Obj* next = (Obj*)((char*)cur + bytes);
        cur->_freelistlink = next;
        cur = next;
    }
    cur->_freelistlink = NULL;
    return chunk;
}

Note: ChunkAlloc is the function that requests memory from the memory pool.

A question to consider

At this point a question arises. If simply advancing _startfree is enough to take a block from the memory pool, why take 20 blocks, return one, and hang the remaining 19 under the corresponding slot of the free list? All that hanging is a lot of trouble! Would it not be simpler to return one block straight from the memory pool each time? External fragmentation would still not be a concern, because every released block is added to the free list we maintain and is handed out again later.

    1. The answer has to do with high concurrency. When blocks are taken from the memory pool, several threads may operate on it at the same time, and advancing _startfree to hand out a block is not an atomic operation, so locking and unlocking are unavoidable. The threads also ask for blocks of different sizes. With many threads, the locking inevitably slows execution down.
    2. Taking 20 blocks at a time alleviates this. When threads ask for blocks of different sizes, they take them from different lists and need not wait for one another (assuming each list is locked separately); contention then arises only when several threads take blocks of the same size from the same list.
    3. Taking memory from the memory pool still requires a lock, but Refill is called comparatively rarely, so fetching 20 blocks at a time improves execution efficiency under high concurrency.



Next is the ChunkAlloc function.
It takes a large chunk of memory from the memory pool and, as far as possible, makes sure that the pool always has "water" in it. Specifically, it follows these guidelines:
    1. The pool has plenty of memory: hand over the full amount "generously".
    2. The pool is a little tight: return as much of what the caller asked for as it can.
    3. The pool is "dirt poor": ask the OS, through malloc, to replenish its "source".
    4. The OS is "dirt poor" as well: the pool has a "brainwave" and turns to the free lists of larger blocks.
    5. Nothing works: as a last-ditch effort, the pool calls the first-level configurator.

The first-level configurator has its own out-of-memory handling, so there is still a chance that it frees up other memory and the request can be met. If so, the "rescue" of the memory pool succeeds; otherwise a bad_alloc exception is thrown to notify the user.


Based on this idea, we can simulate the implementation of the ChunkAlloc function:

// Request a large chunk of memory from the memory pool
template <bool threads, int inst>
void* __DefaultAllocTemplate<threads, inst>::ChunkAlloc(size_t size, size_t& nobjs) {
    size_t totalbytes = nobjs * size;
    size_t leftbytes = _endfree - _startfree;

    // a) The memory pool has enough memory for all nobjs objects
    if (leftbytes >= totalbytes) {
        printf("Memory pool has enough memory for %zu objects\n", nobjs);
        void* ret = _startfree;
        _startfree += totalbytes;
        return ret;
    // b) The memory pool can serve only part of the request
    //    (>= rather than >, so that a pool holding exactly one object is used)
    } else if (leftbytes >= size) {
        nobjs = leftbytes / size;   /* number of objects that can be served */
        totalbytes = size * nobjs;
        printf("Memory pool has memory for only %zu objects\n", nobjs);
        void* ret = _startfree;
        _startfree += totalbytes;
        return ret;
    // c) The memory pool cannot serve even one object
    } else {
        // 1. Dispose of the pool's leftover fragment first: insert it at the
        //    head of the corresponding free list
        if (leftbytes > 0) {
            size_t index = FREELIST_INDEX(leftbytes);
            ((Obj*)_startfree)->_freelistlink = _freelist[index];
            _freelist[index] = (Obj*)_startfree;
        }

        // 2. Call malloc to request a larger chunk for the memory pool
        size_t bytestoget = totalbytes * 2 + ROUND_UP(_heapsize >> 4);
        _startfree = (char*)malloc(bytestoget);
        printf("Memory pool is empty, requesting %zu bytes from the system\n", bytestoget);

        if (_startfree == NULL) {
            // 3. malloc failed and the pool cannot be replenished:
            //    look in the free lists of this size and larger
            size_t index = FREELIST_INDEX(size);
            for (; index < __NFREELISTS; ++index) {
                if (_freelist[index]) {
                    // Take one block off the list and use it as the new pool.
                    // _endfree must be set as well; the original code left it
                    // stale (the "BUG??" the author suspected).
                    Obj* obj = _freelist[index];
                    _freelist[index] = obj->_freelistlink;
                    _startfree = (char*)obj;
                    _endfree = _startfree + (index + 1) * __ALIGN;
                    return ChunkAlloc(size, nobjs);
                }
            }

            _endfree = NULL;   /* in case of exception: guarantees exception safety */
            // Last resort. If memory is really tight, let the first-level
            // configurator see whether its out-of-memory handler can help;
            // if not, it throws an exception to notify the user.
            _startfree = (char*)__MallocAllocTemplate<0>::Allocate(bytestoget);
        }

        _heapsize += bytestoget;
        _endfree = _startfree + bytestoget;
        // Call ourselves recursively in order to fix up nobjs
        return ChunkAlloc(size, nobjs);
    }
}
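
To see how quickly the pool grows, the size computation from step 2 can be isolated. The helper below is a hypothetical, stand-alone restatement of that one expression (round_up8 and bytes_to_get are names introduced here); it adds no allocator logic of its own.

```cpp
#include <cstddef>

// Round up to a multiple of 8, as ROUND_UP does.
constexpr std::size_t round_up8(std::size_t bytes) {
    return (bytes + 7) & ~static_cast<std::size_t>(7);
}

// Bytes requested from malloc when the pool is exhausted: twice the
// current demand plus one sixteenth of everything obtained so far.
constexpr std::size_t bytes_to_get(std::size_t size, std::size_t nobjs,
                                   std::size_t heapsize) {
    return size * nobjs * 2 + round_up8(heapsize >> 4);
}
```

For instance, the very first request for 8-byte blocks, bytes_to_get(8, 20, 0), asks malloc for 320 bytes: 160 go to Refill and 160 stay in the pool. If the pool later runs dry while serving 32-byte blocks, with 320 bytes obtained so far, the request is 1280 + 24 = 1304 bytes.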



One more point deserves attention here: the statement _endfree = NULL.

    • This statement is easy to overlook, yet it is very important because it concerns exception safety. When the memory pool has run out of options, it calls the first-level configurator in the hope that the out-of-memory handler can release some memory so that the allocation inside ChunkAlloc succeeds. Usually that fails too, and the first-level configurator throws an exception. Throwing an exception does not mean the program ends, however. Without this statement, _endfree would still hold its previous value, possibly a large address, while _startfree is NULL after the failed malloc. The difference between the two would suggest that the pool holds a large amount of memory, when in fact that range no longer belongs to the pool at all.

[Figure: overall allocation flow of the configurator]
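
The overall flow, from request to free list to refill, can also be read off a compact single-threaded sketch. It is deliberately simplified and rests on assumptions: MiniPool and all of its members are hypothetical names, and its Refill obtains each batch of 20 blocks directly from malloc instead of cutting it from a shared pool through ChunkAlloc.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// MiniPool and its members are hypothetical names for this sketch.
class MiniPool {
public:
    // Large requests bypass the free lists; small ones pop a block from
    // the matching list, refilling the list first if it is empty.
    static void* Allocate(std::size_t n) {
        if (n == 0) return 0;
        if (n > kMaxBytes) {
            void* p = std::malloc(n);
            if (p == 0) throw std::bad_alloc();
            return p;
        }
        Obj*& head = free_list_[Index(n)];
        if (head == 0) return Refill(RoundUp(n));
        Obj* result = head;
        head = result->link;
        return result;
    }

    // Large blocks go back to free(); small ones are pushed onto the head
    // of the matching list and are never returned to the OS.
    static void Deallocate(void* p, std::size_t n) {
        if (p == 0) return;
        if (n > kMaxBytes) { std::free(p); return; }
        Obj* node = static_cast<Obj*>(p);
        Obj*& head = free_list_[Index(n)];
        node->link = head;
        head = node;
    }

private:
    union Obj { Obj* link; char data[1]; };
    enum { kAlign = 8, kMaxBytes = 128, kLists = 16, kBatch = 20 };

    static std::size_t RoundUp(std::size_t n) {
        return (n + kAlign - 1) & ~static_cast<std::size_t>(kAlign - 1);
    }
    static std::size_t Index(std::size_t n) {
        return (n + kAlign - 1) / kAlign - 1;
    }

    // Simplification: malloc one batch of kBatch blocks. Block 0 is
    // returned to the caller; blocks 1..19 are hung on the free list.
    static void* Refill(std::size_t size) {
        char* chunk = static_cast<char*>(std::malloc(size * kBatch));
        if (chunk == 0) throw std::bad_alloc();
        Obj*& head = free_list_[Index(size)];
        for (std::size_t i = kBatch - 1; i >= 1; --i) {
            Obj* node = reinterpret_cast<Obj*>(chunk + i * size);
            node->link = head;
            head = node;
        }
        return chunk;
    }

    static Obj* free_list_[kLists];
};

MiniPool::Obj* MiniPool::free_list_[MiniPool::kLists] = {0};

// Releases a block and checks that the next request in the same size
// class (17..24 bytes) gets the very same block back.
bool demo_block_is_reused() {
    void* a = MiniPool::Allocate(20);
    MiniPool::Deallocate(a, 20);
    void* b = MiniPool::Allocate(24);
    bool reused = (a == b);
    MiniPool::Deallocate(b, 24);
    return reused;
}
```

The demo shows the property that makes the free list fast: a released block is the first candidate for the next request of the same size class, with no call to malloc in between.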

Finally, the Simple_alloc interface that the Configurator encapsulates

Whether alloc is defined as the first-level or the second-level configurator, SGI wraps it in one more interface, simple_alloc (SimpleAlloc in the simulation below), so that the configurator's interface conforms to the STL specification:

#ifdef __USE_MALLOC
typedef __MallocAllocTemplate<0> alloc;
#else
typedef __DefaultAllocTemplate<false, 0> alloc;
#endif

template <class T, class Alloc>
class SimpleAlloc {
public:
    static T* Allocate(size_t n) {
        return 0 == n ? 0 : (T*)Alloc::Allocate(n * sizeof(T));
    }
    static T* Allocate(void) {
        return (T*)Alloc::Allocate(sizeof(T));
    }
    static void Deallocate(T* p, size_t n) {
        if (0 != n)
            Alloc::Deallocate(p, n * sizeof(T));
    }
    static void Deallocate(T* p) {
        Alloc::Deallocate(p, sizeof(T));
    }
};

These four member functions are simple forwarding calls: each passes the call on to a member function of the configurator. What the interface adds is a change of unit, from bytes to the size of one element. All containers in SGI STL use the simple_alloc interface, for example:

template <class T, class Alloc = alloc>
class vector {
protected:
    // Dedicated space configurator; allocates in units of one element
    typedef simple_alloc<value_type, Alloc> data_allocator;

    void deallocate() {
        if (...)
            data_allocator::deallocate(start, end_of_storage - start);
    }
    ...
};
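
How a container-like client uses such a wrapper can be tried out in isolation. The sketch below is an assumption-laden miniature: MallocAlloc is a hypothetical byte-oriented configurator standing in for alloc, SimpleAlloc is a trimmed copy of the wrapper shown earlier, and demo_sum_of_ten is an invented client.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical byte-oriented configurator standing in for alloc.
struct MallocAlloc {
    static void* Allocate(std::size_t bytes) { return std::malloc(bytes); }
    static void Deallocate(void* p, std::size_t /* bytes */) { std::free(p); }
};

// Element-oriented wrapper in the shape of SimpleAlloc from the text.
template <class T, class Alloc>
struct SimpleAlloc {
    static T* Allocate(std::size_t n) {
        return n == 0 ? 0 : static_cast<T*>(Alloc::Allocate(n * sizeof(T)));
    }
    static void Deallocate(T* p, std::size_t n) {
        if (n != 0) Alloc::Deallocate(p, n * sizeof(T));
    }
};

// A container-like client: it asks for "10 ints", never for a byte count.
inline int demo_sum_of_ten() {
    typedef SimpleAlloc<int, MallocAlloc> IntAlloc;
    int* data = IntAlloc::Allocate(10);
    if (data == 0) return -1;
    int sum = 0;
    for (int i = 0; i < 10; ++i) {
        data[i] = i;
        sum += data[i];
    }
    IntAlloc::Deallocate(data, 10);
    return sum;   // 0 + 1 + ... + 9
}
```

Because the wrapper only multiplies by sizeof(T) and forwards, swapping MallocAlloc for a pooled configurator would not change the client code at all.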




To keep the complexity manageable, everything above covers only the single-threaded case. With concurrency the processing becomes considerably more complex. We can study the source code of the STL space configurator further; it reflects many good ideas.

    • For example, a mechanism similar to a "smart pointer" is used when locking the operations in chunk_alloc. In a multithreaded program, a thread allocating memory in chunk_alloc may terminate abnormally without unlocking, which would leave the other threads blocked, cause a deadlock, and disrupt the execution of the program.
      Here the STL locks through an object of a class that wraps the lock. When the object goes out of scope its destructor runs automatically and performs the unlock, which keeps the code thread-safe. This is a concrete embodiment of RAII (Resource Acquisition Is Initialization).
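
The effect of such a lock-wrapping class can be demonstrated with standard C++11 facilities. This is a sketch under assumed names, not SGI's lock class: ScopedLock, g_pool_mutex, g_locks_held and the two functions are hypothetical, and the counter exists only so that the demo can observe the unlock.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical names for this sketch; ScopedLock plays the role of the
// class that wraps the lock.
static std::mutex g_pool_mutex;
static int g_locks_held = 0;   // bookkeeping so the demo can observe the unlock

class ScopedLock {
public:
    explicit ScopedLock(std::mutex& m) : mutex_(m) {
        mutex_.lock();
        ++g_locks_held;
    }
    ~ScopedLock() {
        --g_locks_held;
        mutex_.unlock();
    }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    std::mutex& mutex_;
};

// Stands in for an allocation that fails while the lock is held.
void locked_operation_that_throws() {
    ScopedLock guard(g_pool_mutex);
    throw std::runtime_error("simulated failure inside the critical section");
}

// Returns true if the lock was released during stack unwinding.
bool demo_lock_released_after_exception() {
    try {
        locked_operation_that_throws();
    } catch (const std::runtime_error&) {
        // the guard's destructor has already run at this point
    }
    bool released = (g_locks_held == 0);
    ScopedLock again(g_pool_mutex);   // would block forever if still locked
    return released && g_locks_held == 1;
}
```

In new code the standard std::lock_guard provides the same guarantee; the hand-written class here only makes the constructor and destructor visible.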

The STL configurator has many other excellent design ideas; this is only the part I have understood so far. To deepen our understanding, we can read the STL source code and learn more.



Full simulated implementation: GITHUB.COM/TP16B/PROJECT/TREE/MASTER/ALLOC/SRC
Reference: "STL Source Code Analysis"
