Allocators in the C++ Standard Library have a complicated, low-level interface [1]. Unlike new and delete, they decouple memory allocation from object construction. Unlike malloc and free, they require you to specify the type and the number of the objects being allocated.
Usually this is not a problem. Allocators have a low-level interface because they are a low-level concept: they are typically hidden inside container classes rather than appearing in ordinary user code. Sometimes, though, you do have to deal with allocators: when you write a new container class yourself. Allocators are harder to use than new and delete, so they are more error-prone. If the code you write has to use allocators, how can you make sure it is correct? Run-time testing cannot prove the absence of errors; it can only prove their presence. Even so, testing is important: the sooner you find a bug, the sooner you can fix it.
All of the standard container classes take an allocator class as a template parameter. The parameter has a default value; std::vector<int>, for example, is shorthand for std::vector<int, std::allocator<int> >. If we write an allocator class that performs extra correctness checks, we can use it in place of std::allocator<int> as the second template parameter of vector. If there are no bugs, the vector will behave just as it did before. (Except, of course, that the extra checks will make it slower.)
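As a rough illustration of swapping in a checking allocator, the sketch below plugs a small counting allocator into std::vector. The names counting_allocator, g_live_blocks, g_total_blocks, and sum_with_counting_allocator are hypothetical, and the sketch assumes a C++11 or later library, where a minimal allocator needs only value_type, allocate(), and deallocate(). It checks only that allocations and deallocations balance, which is far less than the debugging allocator described in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical counters, for illustration only.
static long g_live_blocks  = 0;   // blocks allocated and not yet released
static long g_total_blocks = 0;   // blocks ever allocated

// Minimal C++11-style allocator that counts calls.
template <class T>
struct counting_allocator {
  typedef T value_type;

  counting_allocator() {}
  template <class U>
  counting_allocator(const counting_allocator<U>&) {}

  T* allocate(std::size_t n) {
    ++g_live_blocks;
    ++g_total_blocks;
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t) {
    --g_live_blocks;
    ::operator delete(p);
  }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) {
  return true;
}
template <class T, class U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) {
  return false;
}

// The vector behaves as usual; only the second template argument changed.
inline int sum_with_counting_allocator() {
  std::vector<int, counting_allocator<int> > v;
  for (int i = 1; i <= 100; ++i) v.push_back(i);
  int sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
  return sum;
}
```

After sum_with_counting_allocator() returns, the vector has been destroyed, so a correct container should have brought g_live_blocks back to zero.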
This article presents such a debugging allocator class. It is useful within its limits, and implementing it is also an interesting exercise in using allocators.
Testing
What kinds of errors do we want to check for? The most important are:
- The memory passed to deallocate() really was allocated by this allocator.
- When a block of memory is deallocated, the size passed back is the same as the size requested when it was allocated. (Unlike malloc and free, allocators do not keep track of this information for you.)
- Memory is deallocated as the same type it was allocated as. (Although allocators decouple memory allocation from object construction, memory is still allocated for a specific type.)
- When writing data into a block of memory, we never write outside its bounds.
- We never try to construct two objects at the same location, and we never try to destroy the same object twice.
- When a block of memory is deallocated, all of the objects in it have been destroyed first.
As we will see, our debugging allocator will not meet all of these goals. It can check for some of these errors, but not all of them.
The idea behind the debugging allocator is simple [2]: whenever allocate() is asked for a block of memory, it allocates a little more than was requested and records some extra information in the first few bytes. The user never sees this debug area; the pointer we hand back points to the memory just past it. When the user later passes that pointer back to us, we can decrement it to examine the debug area and verify that the block was used correctly.
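The hidden-header idea can be sketched without any allocator machinery at all. The following is a simplified illustration built on malloc and free for blocks of int; the names kHeaderBytes, tagged_alloc, tagged_count, and tagged_free are hypothetical, and the header here records only the element count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Bytes reserved in front of the user's data (two words, as in the article).
const std::size_t kHeaderBytes = 2 * sizeof(std::size_t);

// Allocate room for n ints plus the hidden header, and record n in it.
inline int* tagged_alloc(std::size_t n) {
  char* raw = static_cast<char*>(std::malloc(kHeaderBytes + n * sizeof(int)));
  if (raw == 0) return 0;
  std::memcpy(raw, &n, sizeof n);
  return reinterpret_cast<int*>(raw + kHeaderBytes);  // user never sees raw
}

// Step back from the user's pointer to read the recorded count.
inline std::size_t tagged_count(const int* p) {
  std::size_t n;
  std::memcpy(&n, reinterpret_cast<const char*>(p) - kHeaderBytes, sizeof n);
  return n;
}

// Release the block; returns false if the caller's count does not match.
inline bool tagged_free(int* p, std::size_t n) {
  const bool ok = (tagged_count(p) == n);
  std::free(reinterpret_cast<char*>(p) - kHeaderBytes);
  return ok;
}
```

A caller that allocates 4 elements and then claims to free 5 gets false back from tagged_free, which is the kind of mismatch the debugging allocator turns into a failed assertion.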
This design has two hidden problems. First, it cannot be implemented portably. We must preserve alignment: if the user's data has to be aligned on some boundary, then the number of bytes we add in front of it must keep it aligned. How do we know how many bytes to add? In principle we cannot; the language provides no mechanism for determining alignment requirements. (Perhaps one will be added in a future standard.) In practice this is not a serious portability problem: aligning everything on a double-word boundary is good enough on most current platforms, and it is easy to change on platforms with stricter requirements.
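Since C++11, which postdates this article, the language does offer alignof and std::max_align_t. A minimal sketch of how the front padding could be computed with them follows, assuming a C++11 compiler; round_up and header_bytes_for are hypothetical helper names and are not part of Listing 1.

```cpp
#include <cassert>
#include <cstddef>

// Round bytes up to the next multiple of align (align must be nonzero).
inline std::size_t round_up(std::size_t bytes, std::size_t align) {
  return (bytes + align - 1) / align * align;
}

// Space for two size_t values, padded so that a T placed right after
// the header is still correctly aligned.
template <class T>
std::size_t header_bytes_for() {
  return round_up(2 * sizeof(std::size_t), alignof(T));
}
```

Using header_bytes_for<std::max_align_t>() would be the conservative choice when the element type is not known in advance.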
The second problem is that with this design some errors are easy to check for and others are not. The user obtains a block of memory from allocate() and passes the same pointer to deallocate(), so this design makes it easy to check that allocate() and deallocate() are used consistently. Unfortunately, it is hard to check that construct() and destroy() are used consistently.
The trouble is that the pointer passed to construct() and destroy() need not be one that was returned by allocate(). If you write a.construct(p, x), then p must point into a block of memory that was allocated through a, but not necessarily to the beginning of that block. You might have allocated room for 1000 elements with p1 = a.allocate(1000). In that case we cannot tell whether the first argument to construct() is p1, p1 + 5, or p1 + 178. We cannot find the debugging information we need, because we cannot find the beginning of the allocated block.
There are two possible solutions to this problem, and neither of them works well. First, and most obviously, we could give up the idea that all of the debugging information has to be at the beginning of the block. That is not feasible, though, because we would have to interleave our information with the user's data. This would break pointer arithmetic: the user could not step from one element to the next without knowing about our debugging information. The other option is to maintain an additional data structure. For example, we could maintain a std::set that records the addresses of live objects. Every time the user creates an object with construct(), we add its address to the set; every time the user destroys an object with destroy(), we remove it.
This technique is simple and elegant, and the reason it does not work is subtle. The problem is that users do not have to create and destroy an object with the same allocator object: they only have to use an equivalent allocator. Consider the following sequence of operations:
- The user is given an allocator, a1, and makes a copy of it, a2.
- The user creates a new object with a1.construct(p, x).
- The user destroys the object with a2.destroy(p).
This sequence is legal, but an allocator that maintains a list of live objects would report it as an error: the new object was added to the list of live objects belonging to a1, and when the object is destroyed through a2 it cannot be found there.
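The false alarm in that sequence can be reproduced with a small model. This is only a sketch: tracking_checker is a hypothetical class that stands in for the bookkeeping part of such an allocator (one std::set of live addresses per copy) and performs no allocation itself.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of an allocator that tracks live objects per copy.
class tracking_checker {
public:
  void on_construct(const void* p) { live_.insert(p); }
  // Returns false when this copy has no record of the object.
  bool on_destroy(const void* p) { return live_.erase(p) == 1; }
private:
  std::set<const void*> live_;
};

// Replays the legal sequence above and reports whether it was flagged.
inline bool destroy_through_copy_is_flagged() {
  int storage = 0;
  tracking_checker a1;
  tracking_checker a2(a1);           // the copy is made before the object exists
  a1.on_construct(&storage);         // recorded only in a1's set
  return !a2.on_destroy(&storage);   // a2 has never heard of the object
}
```

The function returns true: the checker would reject a sequence that the allocator requirements allow.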
Can we get around this problem by sharing the list among all copies of a given allocator? Perhaps, but then we get stuck on a question of definition. Given:
    my_allocator<int> a1;
    my_allocator<double> a2(a1);
should a1 and a2 share the same list of live objects? (If the answer seems to be "no," then what about my_allocator<int>(a2)?) We also get stuck on implementation problems: shared objects require special handling behind the scenes, especially where concurrency is involved.
Because of these problems, I have simply given up on performing substantive checks in construct() and destroy(). This version of the debugging allocator performs only minimal checks on their arguments. It makes no attempt to verify that the argument to destroy() was ever constructed, or that an object is destroyed only once.
The main purpose of this debugging allocator is to make sure that allocate() and deallocate() are used consistently. When we allocate memory, we reserve two words at the beginning of each block, where we record the number of elements in the block and a hash code derived from the type (specifically, from the type's name, typeid(T).name()). We reserve another word at the end of the block, where we store a second copy of the hash code as a sentinel. Then, when the block is deallocated, we check that the stored element count matches the argument and that both hash codes are correct. We use assert(), so that any inconsistency makes the program fail.
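The type-derived hash code can be sketched on its own. The function below uses the same multiply-by-37 scheme that compute_hash() uses in Listing 1; the free function name type_hash is hypothetical. Note that the string returned by typeid(T).name() is implementation-defined, so the hash values differ between compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>

// Hash the implementation-defined type name, one character at a time.
template <class T>
std::size_t type_hash() {
  const char* name = typeid(T).name();
  std::size_t h = 0;
  for (; *name != '\0'; ++name)
    h = h * 37 + static_cast<unsigned char>(*name);
  return h;
}
```

Within one program the hash is stable for a given type, which is all the allocator needs in order to notice that a block allocated as one type is being returned as another.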
This does not give us all of the checks we wanted, but it does let us make sure that memory passed to deallocate() was allocated by this allocator, and that it was allocated and deallocated as the same type and with the same count. It also provides limited protection against overruns: a small overrun in either direction will overwrite one of the hash codes, and the error will be detected when the block is deallocated.
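How a trailing sentinel catches an overrun can be shown with a stripped-down version of the same layout. This sketch uses malloc and plain char data rather than an allocator; guarded_alloc, guarded_free, and kFrontBytes are hypothetical names, and the tag argument stands in for the type hash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Layout: [count][tag][ n bytes of user data ][copy of tag]
const std::size_t kFrontBytes = 2 * sizeof(std::size_t);

inline char* guarded_alloc(std::size_t n, std::size_t tag) {
  char* raw = static_cast<char*>(std::malloc(kFrontBytes + n + sizeof tag));
  if (raw == 0) return 0;
  std::memcpy(raw, &n, sizeof n);
  std::memcpy(raw + sizeof n, &tag, sizeof tag);
  // The sentinel may be misaligned, so it is copied byte by byte.
  std::memcpy(raw + kFrontBytes + n, &tag, sizeof tag);
  return raw + kFrontBytes;
}

// Checks count, tag, and sentinel, then releases the block.
// Returns true only if everything is intact.
inline bool guarded_free(char* p, std::size_t n, std::size_t tag) {
  char* raw = p - kFrontBytes;
  std::size_t stored_n, stored_tag;
  std::memcpy(&stored_n, raw, sizeof stored_n);
  std::memcpy(&stored_tag, raw + sizeof stored_n, sizeof stored_tag);
  // The sentinel is read only when n is known to match the real size.
  const bool ok = stored_n == n && stored_tag == tag &&
                  std::memcmp(raw + kFrontBytes + n, &tag, sizeof tag) == 0;
  std::free(raw);
  return ok;
}
```

Writing a single byte at index n of an n-byte block damages the sentinel, so guarded_free reports the block as corrupted. An overrun that skips past the sentinel entirely would go unnoticed, which is why the protection is only limited.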
An Allocator Adaptor
So far I have not shown any code, because I have not yet described the exact form of the debugging allocator. One simple choice would be to implement the debugging allocator on top of malloc or std::allocator. In that case it would need only one template parameter: the type of object to allocate. This is not as general as we would like, because users could not test with a custom allocator. For greater generality, it is better to write the debugging allocator as an allocator adaptor. (Another motivation, I admit, is pedagogical: it lets us explore the general features of allocator adaptors.)
Writing an allocator adaptor raises two new problems. First, we can assume nothing about the allocator being adapted. We cannot assume that Allocator::pointer is T*, and we cannot simply write values into uninitialized memory (not even values of built-in types such as char and int). We must scrupulously use construct() and destroy(). This is a nuisance, but all it requires is care.
The second problem is a design question: what should the template parameters of our debugging allocator look like? The first idea might be a single template parameter that is itself a template, that is, a template template parameter. That is not general enough, though: a template template parameter only matches templates with a specific number of parameters, and allocators have no such restriction. A template template parameter would do for the default allocator (std::allocator<T>), but not for a custom allocator that takes additional parameters, such as my_allocator<T, flags>.
What about an ordinary template parameter? We might want to write:
    template <class Allocator>
    class debug_allocator;
This is closer. You would just write debug_allocator<std::allocator<int> >. Unfortunately there is still a problem. The value_type of an allocator may be void, which is useful and important in some cases (as described in a previous article). What happens if the user writes this?
    typedef debug_allocator<std::allocator<int> > A;
    typedef A::rebind<void>::other A2;
The problem is that the value_type of A2 is void, and some of an allocator's members make no sense for void. For example, an allocator has a reference typedef, and void& is meaningless; it causes a compile error. The default allocator has a specialization for this case, std::allocator<void>. Clearly, we need a specialization too.
We cannot directly say that we want a special version of debug_allocator<Allocator> whenever the value_type of Allocator is void. But there is a trick. We can give debug_allocator a second template parameter, which defaults to Allocator::value_type, and then write a partial specialization for the case where the second parameter is void. The second template parameter is purely an implementation detail: you never need to write it explicitly, because writing (for example) debug_allocator<std::allocator<int> > supplies it by default.
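The defaulted-parameter trick is easier to see in a stripped-down form. In the sketch below, toy_alloc is a hypothetical stand-in for an underlying allocator, and adaptor plays the role of debug_allocator with everything removed except the typedefs that matter; the is_void_version constant exists only so that the selection can be observed.

```cpp
#include <cassert>

// Hypothetical underlying allocator: only value_type and rebind.
template <class T>
struct toy_alloc {
  typedef T value_type;
  template <class U> struct rebind { typedef toy_alloc<U> other; };
};

// Primary template. The second parameter defaults to the value type,
// so users write only adaptor<toy_alloc<int> >.
template <class Alloc, class T = typename Alloc::value_type>
struct adaptor {
  typedef T  value_type;
  typedef T& reference;            // would be ill-formed if T were void
  enum { is_void_version = 0 };
  template <class U> struct rebind {
    typedef typename Alloc::template rebind<U>::other A2;
    typedef adaptor<A2, typename A2::value_type> other;
  };
};

// Partial specialization, selected whenever the value type is void.
template <class Alloc>
struct adaptor<Alloc, void> {
  typedef void value_type;         // no reference typedef here
  enum { is_void_version = 1 };
  template <class U> struct rebind {
    typedef typename Alloc::template rebind<U>::other A2;
    typedef adaptor<A2, typename A2::value_type> other;
  };
};

typedef adaptor<toy_alloc<int> >            int_adaptor;
typedef int_adaptor::rebind<void>::other    void_adaptor;
typedef void_adaptor::rebind<double>::other double_adaptor;
```

Rebinding to void lands on the specialization and compiles cleanly, and rebinding from void back to double returns to the primary template.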
With that, it is not hard to put all the pieces together: the complete debugging allocator appears in Listing 1. You may find debug_allocator useful when you need to check how your container classes use memory, but more importantly, you can use it as a model. The implementation techniques used in debug_allocator are useful for your own allocator adaptors.
Listing 1: Complete implementation of the debugging allocator
#include <algorithm>   // std::equal
#include <cassert>     // assert
#include <cstddef>     // std::size_t
#include <typeinfo>    // typeid

template <class Allocator, class T = typename Allocator::value_type>
class debug_allocator {
public:                         // Typedefs from underlying allocator.
  typedef typename Allocator::size_type       size_type;
  typedef typename Allocator::difference_type difference_type;
  typedef typename Allocator::pointer         pointer;
  typedef typename Allocator::const_pointer   const_pointer;
  typedef typename Allocator::reference       reference;
  typedef typename Allocator::const_reference const_reference;
  typedef typename Allocator::value_type      value_type;

  template <class U> struct rebind {
    typedef typename Allocator::template rebind<U>::other A2;
    typedef debug_allocator<A2, typename A2::value_type> other;
  };

public:                         // Constructor, destructor, etc.
  // Default constructor.
  debug_allocator()
    : alloc(), hash_code(0)
    { compute_hash(); }

  // Constructor from an underlying allocator.
  template <class Allocator2>
  debug_allocator(const Allocator2& a)
    : alloc(a), hash_code(0)
    { compute_hash(); }

  // Copy constructor.
  debug_allocator(const debug_allocator& a)
    : alloc(a.alloc), hash_code(0)
    { compute_hash(); }

  // Generalized copy constructor.
  template <class A2, class T2>
  debug_allocator(const debug_allocator<A2, T2>& a)
    : alloc(a.alloc), hash_code(0)
    { compute_hash(); }

  // Destructor.
  ~debug_allocator() {}

public:                         // Member functions.
  // The only interesting ones
  // are allocate and deallocate.
  pointer allocate(size_type n, const void* = 0);
  void deallocate(pointer p, size_type n);

  pointer address(reference x) const { return alloc.address(x); }
  const_pointer address(const_reference x) const {
    return alloc.address(x);
  }

  void construct(pointer p, const value_type& x);
  void destroy(pointer p);

  size_type max_size() const { return alloc.max_size(); }

  friend bool operator==(const debug_allocator& x,
                         const debug_allocator& y)
    { return x.alloc == y.alloc; }

  friend bool operator!=(const debug_allocator& x,
                         const debug_allocator& y)
    { return x.alloc != y.alloc; }

private:
  // The generalized copy constructor reads the alloc member of other
  // specializations, so every debug_allocator must be a friend.
  template <class A2, class T2> friend class debug_allocator;

  typedef typename Allocator::template rebind<char>::other
          char_alloc;
  typedef typename Allocator::template rebind<std::size_t>::other
          size_alloc;

  // Calculate the hash code, and store it in this->hash_code.
  // Only used in the constructors.
  void compute_hash();

  const char* hash_code_as_bytes()
    { return reinterpret_cast<const char*>(&hash_code); }

  // Number of bytes required to store n objects of type value_type.
  // Does not include the overhead for debugging.
  size_type data_size(size_type n)
    { return n * sizeof(value_type); }

  // Number of bytes allocated in front of the user-visible memory
  // block. Must be large enough to store two objects of type
  // size_t, and to preserve alignment.
  size_type padding_before(size_type)
    { return 2 * sizeof(std::size_t); }

  // Number of bytes from the beginning of the block we allocate
  // until the beginning of the sentinel.
  size_type sentinel_offset(size_type n)
    { return data_size(n) + padding_before(n); }

  // Number of bytes in the sentinel.
  size_type sentinel_size()
    { return sizeof(std::size_t); }

  // Size of the area we allocate to store n objects,
  // including overhead.
  size_type total_bytes(size_type n)
    { return data_size(n) + padding_before(n) + sentinel_size(); }

  Allocator alloc;
  std::size_t hash_code;
};

// Specialization when the value type is void. We provide typedefs
// (and not even all of those), and we save the underlying allocator
// so we can convert back to some other type.
template <class Allocator>
class debug_allocator<Allocator, void> {
public:
  typedef typename Allocator::size_type       size_type;
  typedef typename Allocator::difference_type difference_type;
  typedef typename Allocator::pointer         pointer;
  typedef typename Allocator::const_pointer   const_pointer;
  typedef typename Allocator::value_type      value_type;

  template <class U> struct rebind {
    typedef typename Allocator::template rebind<U>::other A2;
    typedef debug_allocator<A2, typename A2::value_type> other;
  };

  debug_allocator() : alloc() {}

  template <class A2>
  debug_allocator(const A2& a) : alloc(a) {}

  debug_allocator(const debug_allocator& a) : alloc(a.alloc) {}

  template <class A2, class T2>
  debug_allocator(const debug_allocator<A2, T2>& a)
    : alloc(a.alloc) {}

  ~debug_allocator() {}

private:
  // Lets the generalized copy constructors of other specializations
  // read alloc, and lets this one read theirs.
  template <class A2, class T2> friend class debug_allocator;

  Allocator alloc;
};

// Noninline member functions for debug_allocator. They are not defined
// for the void specialization.
template <class Allocator, class T>
typename debug_allocator<Allocator, T>::pointer
debug_allocator<Allocator, T>::allocate
  (size_type n, const void*) {
  assert(n != 0);

  // Allocate enough space for n objects of type T, plus the debug
  // info at the beginning, plus the sentinel at the end.
  typedef typename char_alloc::pointer char_pointer;
  typedef typename size_alloc::pointer size_pointer;
  char_pointer result = char_alloc(alloc).allocate(total_bytes(n));

  // Store the size.
  size_pointer debug_area = size_pointer(result);
  size_alloc(alloc).construct(debug_area + 0, n);

  // Store a hash code based on the type name.
  size_alloc(alloc).construct(debug_area + 1, hash_code);

  // Store the sentinel, which is just the hash code again.
  // For reasons of alignment, we have to copy it byte by byte.
  char_pointer sentinel_area = result + sentinel_offset(n);
  const char* sentinel = hash_code_as_bytes();
  {
    char_alloc ca(alloc);
    size_type i = 0;
    try {
      for ( ; i < sentinel_size(); ++i)
        ca.construct(sentinel_area + i, sentinel[i]);
    }
    catch (...) {
      // Undo the bytes constructed so far, then propagate.
      for (size_type j = 0; j < i; ++j)
        ca.destroy(sentinel_area + j);
      throw;
    }
  }

  // Return a pointer to the user-visible portion of the memory.
  pointer data_area = pointer(result + padding_before(n));
  return data_area;
}

template <class Allocator, class T>
void debug_allocator<Allocator, T>::deallocate
  (pointer p, size_type n)
{
  assert(n != 0);

  // Get a pointer to the space where we put the debugging information.
  typedef typename char_alloc::pointer char_pointer;
  typedef typename size_alloc::pointer size_pointer;
  char_pointer cp = char_pointer(p);
  size_pointer debug_area = size_pointer(cp - padding_before(n));

  // Get the size request and the hash code, and check for consistency.
  std::size_t stored_n    = debug_area[0];
  std::size_t stored_hash = debug_area[1];
  assert(n == stored_n);
  assert(hash_code == stored_hash);

  // Get the sentinel, and check for consistency.
  char_pointer sentinel_area =
    char_pointer(debug_area) + sentinel_offset(n);
  const char* sentinel = hash_code_as_bytes();
  assert(std::equal(sentinel, sentinel + sentinel_size(),
                    sentinel_area));

  // Destroy our debugging information.
  size_alloc(alloc).destroy(debug_area + 0);
  size_alloc(alloc).destroy(debug_area + 1);
  for (size_type i = 0; i < sentinel_size(); ++i)
    char_alloc(alloc).destroy(sentinel_area + i);

  // Release the storage.
  char_alloc(alloc).deallocate(cp - padding_before(n), total_bytes(n));
}

template <class Allocator, class T>
void debug_allocator<Allocator, T>::construct(pointer p,
                                              const value_type& x)
{
  assert(p);
  alloc.construct(p, x);
}

template <class Allocator, class T>
void debug_allocator<Allocator, T>::destroy(pointer p)
{
  assert(p);
  alloc.destroy(p);
}

template <class Allocator, class T>
void debug_allocator<Allocator, T>::compute_hash() {
  // Hash the implementation-defined name of the value type.
  const char* name = typeid(value_type).name();
  hash_code = 0;
  for ( ; *name != '\0'; ++name)
    hash_code = hash_code * (std::size_t) 37 + (std::size_t) *name;
}
Notes
[1] Matt Austern. "The Standard Librarian: What Are Allocators Good For?" C/C++ Users Journal C++ Experts Forum, December 2000, <www.cuj.com/experts/1812/austern.htm>.
[2] This debugging allocator is based on the one in the SGI library, <www.sgi.com/tech/stl>. The original version was written by Hans-J. Boehm.