C++ Template Classes

Last Update:2018-07-25 Source: Internet

Author: User

Understanding how the compiler processes templates

How to organize a template program

Objective
I am often asked whether templates are easy to use. My answer is: "Templates are easy to use, but not easy to organize and write." Look at the template libraries we meet almost every day, such as STL, ATL, WTL, and Boost. They all share one trait: a simple interface over a complex implementation.

I started using templates 5 years ago, when I first saw the MFC container classes. I did not have to write a template class myself until last year. When I finally needed to write one, the first thing I ran into was that the "traditional" approach (declarations in a *.h file, definitions in a *.cpp file) does not work for templates. So I took some time to understand the problem and how to solve it.

This article is for programmers who are familiar with using templates but do not have much experience writing them. It covers only class templates, not function templates, but the principles described apply to both.

The problem arises
The following example illustrates the problem. Suppose the file Array.h contains a class template Array:
Array.h
template <typename T, int size>
class Array
{
    T data_[size];
    Array(const Array& other);                   // declared private: copying is disabled
    const Array& operator=(const Array& other);
public:
    Array() {}
    T& operator[](int i) { return data_[i]; }
    const T& get_elem(int i) const { return data_[i]; }
    void set_elem(int i, const T& value) { data_[i] = value; }
    operator T*() { return data_; }
};

The template is then used in the main function in main.cpp:
main.cpp
#include "Array.h"

int main(void)
{
    Array<int, 50> intArray;
    intArray.set_elem(0, 2);
    int firstElem = intArray.get_elem(0);
    int* begin = intArray;
}

This compiles and runs normally. The program creates an array of 50 integers, sets the first element to 2, reads the first element back, and finally points a pointer at the start of the array.

But what happens if we write it the traditional way? Let's take a look.

Split Array.h into two files, Array.h and Array.cpp (main.cpp remains unchanged):
Array.h
template <typename T, int size>
class Array
{
    T data_[size];
    Array(const Array& other);
    const Array& operator=(const Array& other);
public:
    Array() {}
    T& operator[](int i);
    const T& get_elem(int i) const;
    void set_elem(int i, const T& value);
    operator T*();
};

Array.cpp
#include "Array.h"

template <typename T, int size> T& Array<T, size>::operator[](int i)
{
    return data_[i];
}

template <typename T, int size> const T& Array<T, size>::get_elem(int i) const
{
    return data_[i];
}

template <typename T, int size> void Array<T, size>::set_elem(int i, const T& value)
{
    data_[i] = value;
}

template <typename T, int size> Array<T, size>::operator T*()
{
    return data_;
}

The build now fails: the linker reports 3 errors. This raises two questions:
Why do the errors occur in the first place?
Why are there only 3 linker errors, when Array.cpp defines 4 member functions?

To answer these questions, we need a deeper understanding of the template instantiation process.

Template instantiation
The most common mistake programmers make when working with class templates is to treat the template as a data type. The term "parameterized types" encourages this misconception. A template is not a data type. As its name says, a template is just a template:

The compiler uses a template to create data types by substituting the template parameters. This process is called template instantiation.
A type created from a class template is called a specialization.
Template instantiation depends on the compiler being able to find the code needed to create the specialization at the place where it is required (called the point of instantiation).
To create a specialization, the compiler must see not only the declaration of the template but also its definition.
Template instantiation is lazy: the definition of a member function is instantiated only when that function is actually used.
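The lazy behavior described in the last point can be observed directly. The following is a minimal sketch, not code from the article: Holder and lazy_instantiation_demo are hypothetical names introduced only for illustration. The body of negated() cannot be compiled for std::string, yet the code is still valid, because that member is never used with std::string and therefore is never instantiated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical class template, used only for illustration.
template <typename T>
class Holder {
public:
    explicit Holder(const T& v) : value_(v) {}
    const T& get() const { return value_; }
    // The body is valid only for types that support unary minus.
    T negated() const { return -value_; }
private:
    T value_;
};

// Holder<std::string>::negated() could not be compiled if it were
// instantiated, but it is never called, so its body is never generated.
inline std::size_t lazy_instantiation_demo() {
    Holder<int> n(7);
    assert(n.negated() == -7);      // Holder<int>::negated() is instantiated here
    Holder<std::string> s("abc");
    return s.get().size();          // only the constructor and get() are needed
}
```

Adding a call to s.negated() would force the compiler to instantiate that member, and the program would then be rejected.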

Looking back at the example above, Array is a template and Array<int, 50> is a template specialization, that is, a type. The process of creating Array<int, 50> from Array is instantiation. The point of instantiation is in main.cpp. In the traditional layout, the compiler sees the declaration of the template in Array.h but not the definitions of its member functions, so it cannot generate code for the members of Array<int, 50>. This is not a compile-time error, however: the compiler assumes that the definitions are provided in another file and leaves the problem to the linker.

Now, what happens when Array.cpp is compiled? The compiler can parse the template definitions and check their syntax, but it cannot generate code for the member functions. To generate code it needs to know the template arguments. In other words, it needs a concrete type, not the template itself.

As a result, the linker cannot find the definitions of the Array<int, 50> members in the object code produced from either main.cpp or Array.cpp, and it reports errors for the undefined members.

That answers the first question. The second question remains: Array.cpp contains 4 member functions, so why does the linker report only 3 errors? The answer is lazy instantiation. operator[] is never used in main.cpp, so the compiler never asks for its definition and no unresolved reference to it is generated.

Solutions
Once the problem is understood, it can be solved in one of three ways:
Let the compiler see the template definition at the point of instantiation.
Explicitly instantiate the required types in a separate file, so that the linker can find them.
Use the export keyword.

The first two methods are usually referred to as the inclusion model, and the third as the separation model.

The first approach means that every translation unit that uses the template includes not only the template declarations but also the template definitions. In the example above, this is either the first version, where all member functions are defined inline in Array.h, or a version in which main.cpp also includes Array.cpp. Either way the compiler sees both the declaration and the definition of the template and can therefore generate the Array<int, 50> instance. The disadvantage is that the translation units become very large, which noticeably slows down compilation and linking.
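A variant of the first approach keeps the out-of-class definitions but moves them into the header, below the class. The following is a sketch under that assumption: the include guard name ARRAY_H and the helper array_demo are not from the article and are added only so that the fragment is self-contained.

```cpp
#include <cassert>

// Array.h (sketch): declarations and out-of-class definitions in one header
#ifndef ARRAY_H
#define ARRAY_H

template <typename T, int size>
class Array {
    T data_[size];
    Array(const Array& other);                   // not defined: copying is disabled
    const Array& operator=(const Array& other);  // not defined
public:
    Array() {}
    T& operator[](int i);
    const T& get_elem(int i) const;
    void set_elem(int i, const T& value);
    operator T*();
};

// The definitions follow the class in the same header, so every
// translation unit that includes Array.h can instantiate them.
template <typename T, int size>
T& Array<T, size>::operator[](int i) { return data_[i]; }

template <typename T, int size>
const T& Array<T, size>::get_elem(int i) const { return data_[i]; }

template <typename T, int size>
void Array<T, size>::set_elem(int i, const T& value) { data_[i] = value; }

template <typename T, int size>
Array<T, size>::operator T*() { return data_; }

#endif // ARRAY_H

// Client code (sketch): the same operations as main.cpp in the article.
inline int array_demo() {
    Array<int, 50> intArray;
    intArray.set_elem(0, 2);
    int* begin = intArray;          // uses operator T*()
    assert(*begin == 2);
    return intArray.get_elem(0);
}
```

With this layout the header stays readable (the class body contains declarations only), while the compiler still sees every definition at the point of instantiation.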

The second approach obtains the type through explicit template instantiation. It is a good idea to place all explicit instantiations in a separate file. In this example, you can create a new file TemplateInstantiations.cpp:
TemplateInstantiations.cpp
#include "Array.cpp"

template class Array<int, 50>; // explicit instantiation

The Array<int, 50> type is then generated not in main.cpp but in TemplateInstantiations.cpp, where the linker can find its definition. With this approach there are no huge header files, which speeds up compilation, and the header itself is cleaner and more readable. However, this approach gives up the benefit of lazy instantiation: it explicitly generates all member functions. You also have to maintain the TemplateInstantiations.cpp file.
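The three-file layout of the second approach can be sketched in a single listing. Everything here is hypothetical and chosen only for illustration: the Accumulator template, its members, and the file names in the comments. In a real project the three sections would be separate files, and client code would include only the header.

```cpp
#include <cassert>

// ---- Accumulator.h (hypothetical): declarations only ----
template <typename T>
class Accumulator {
public:
    Accumulator();
    void add(const T& x);
    T total() const;
private:
    T sum_;
};

// ---- Accumulator.cpp (hypothetical): definitions ----
template <typename T>
Accumulator<T>::Accumulator() : sum_() {}

template <typename T>
void Accumulator<T>::add(const T& x) { sum_ += x; }

template <typename T>
T Accumulator<T>::total() const { return sum_; }

// ---- TemplateInstantiations.cpp (hypothetical) ----
// Explicit instantiation definitions: every member of these two
// specializations is generated here, whether or not it is used.
template class Accumulator<int>;
template class Accumulator<double>;

// ---- client code: would include only Accumulator.h ----
inline int explicit_instantiation_demo() {
    Accumulator<int> a;
    a.add(2);
    a.add(3);
    assert(a.total() == 5);
    return a.total();
}
```

The cost of this design is visible in the instantiation section: a client that needs Accumulator<long> gets a linker error until someone adds a matching line there.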

The third approach is to use the export keyword in the template definition and leave the rest to the compiler. I was very excited when I first read about export in Stroustrup's book. But I soon found that VC 6.0 did not support it, and later that no compiler supported the keyword at all (the first compiler to support it was due to be released by the end of 2002). Since then I have read many articles about export and learned that it solves hardly any of the problems encountered with the inclusion model. For more on the export keyword, I recommend the articles written by Herb Sutter.

Conclusion
To develop a template library, you need to understand that a class template is not an "ordinary type" and calls for a different way of organizing code. The purpose of this article is not to scare off programmers who want to do template programming. On the contrary, it is meant to help them avoid common mistakes when they start.

//////////////////////////////
Http://www.cnblogs.com/xgchang/archive/2004/11/12/63139.aspx
Even when you define non-inline functions, all declarations and definitions of a template are placed in the template's header file. This seems to violate the usual header file rule, "Do not put anything that allocates storage in a header file", which exists to prevent multiple-definition errors at link time. But template definitions are special. Anything preceded by template<...> means that the compiler does not allocate storage for it at that point. It waits until a template instantiation tells it to. Somewhere in the compiler and linker there is a mechanism for removing multiple definitions of the same template instance, so for ease of use, template declarations and definitions are almost always placed together in the header file.

Why the C++ compiler cannot support separate compilation of templates
By Liu Weipeng (pongba)

First, the C++ standard speaks of a translation unit, which is a .cpp file together with all the .h files it includes. The code in the .h files is expanded into the .cpp file that includes them, and the compiler then compiles the .cpp file into an .obj file. The .obj file has a format closely related to PE (Portable Executable, the Windows executable file format) and already contains binary code, but it cannot necessarily be executed, because there is no guarantee that it contains a main function. After the compiler has compiled every .cpp file in a project separately, the linker combines the results into an .exe file.
For example:
//---------------test.h-------------------//
void f(); // declares a function f
//---------------test.cpp--------------//
#include "test.h"
void f()
{
    ... // do something
} // implements the function f declared in test.h
//---------------main.cpp--------------//
#include "test.h"
int main()
{
    f(); // calls f; f has external linkage
}
In this example, test.cpp and main.cpp are compiled into separate .obj files (call them test.obj and main.obj). main.cpp calls the function f, but when the compiler compiles main.cpp, all it knows about f is the declaration void f() in test.h, which main.cpp includes. The compiler therefore treats f as a symbol with external linkage, meaning that its implementation is in another .obj file, in this case test.obj. In other words, main.obj contains not a single line of binary code for f. That code lives in test.obj, which is compiled from test.cpp. The call to f in main.obj produces just one call instruction, like this:
call f [in C++ the name is of course mangled]
At compile time the target of this call instruction is obviously not yet valid, because main.obj contains no implementation of f. So what happens? This is the linker's job: it finds the implementation of f in another .obj file (here test.obj) and replaces the target address of the call f instruction with the actual entry point address of f. Note that the linker really does "link" the .obj files of the project into one .exe file, and its most important task is exactly this: looking up the address of each external symbol in the other .obj files and replacing the original placeholder address with it.
In more detail, the process is as follows:
The call f instruction is actually not quite like that. It is really a call to a so-called stub, that is, a
jmp 0x23423 [the address may be arbitrary; what matters is that the instruction at that address performs the real call to f]. In other words, every call to f in this .obj file jumps to the same address, and the real call f is located there. The advantage is that the linker only has to patch the address in that one call XXX instruction. But how does the linker find the actual address of f (which in this example is in test.obj)? An .obj file has much the same format as an .exe file: it contains a symbol import table and a symbol export table, in which every symbol is associated with its address. The linker therefore only has to look up the address of the symbol f (mangled, of course) in the symbol export table of test.obj, apply an offset (the two .obj files are being merged, so addresses shift, and the linker knows by how much), and write the result into the entry for f in the symbol import table of main.obj.
That is the approximate process. The key points are:
When compiling main.cpp, the compiler does not know the implementation of f, so when it encounters a call to f it simply leaves a note telling the linker to find the implementation of f. This means that main.obj contains not a single line of binary code for f.
When compiling test.cpp, the compiler finds the implementation of f, so the implementation (binary code) of f appears in test.obj.
At link time, the linker finds the address of the implementation (binary code) of f in test.obj, via the symbol export table, and then changes the pending call XXX address in main.obj to the actual address of f.
Done.
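The division of labor described above can be imitated within a single file. The following is only a sketch: the function names answer, call_answer, and linkage_demo are hypothetical, and because everything sits in one translation unit no real linker step is involved. It does show that a call compiles with nothing but a declaration in scope.

```cpp
#include <cassert>

// ---- what main.cpp sees (via a header): a declaration only ----
int answer();   // hypothetical function; no body is visible at this point

// The call below compiles with only the declaration in scope. In a
// multi-file project the compiler would emit an unresolved reference here.
inline int call_answer() {
    return answer();
}

// ---- what test.cpp provides: the definition ----
// In a real project this would live in a separate .cpp file and be
// matched to the call above by the linker.
int answer() {
    return 42;
}

inline void linkage_demo() {
    assert(call_answer() == 42);
}
```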

For templates, however, the code of a function template cannot be compiled directly into binary code. It must first go through an instantiation process. For example:
//----------main.cpp------//
template<class T>
void f(T)
{}
int main()
{
    ... // do something
    f(10); // calls f<int>; the compiler decides here to instantiate f<int>
    ... // do other thing
}
That is, if f is never called in main.cpp, it is never instantiated, so main.obj contains not a single line of binary code for f. If you call it like this:
f(10);   // f<int> is instantiated
f(10.0); // f<double> is instantiated
then main.obj contains binary code for two functions, f<int> and f<double>, and so on for other types.
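That f<int> and f<double> really are two separate functions can be made visible with a small experiment. This is a sketch, not code from the article: the helper call_count and the function instantiate_two_versions are hypothetical, and the body of f is changed to increment a counter. Each specialization of call_count owns its own static variable, so the two instances of f are counted separately.

```cpp
#include <cassert>

// One static counter per specialization of call_count<T>.
template <class T>
int& call_count() {
    static int count = 0;
    return count;
}

// Same shape as f in the article, with a counting body added.
template <class T>
void f(T) {
    ++call_count<T>();
}

inline void instantiate_two_versions() {
    const int ints_before = call_count<int>();
    const int doubles_before = call_count<double>();
    f(10);     // instantiates f<int>
    f(10.0);   // instantiates f<double>
    f(20);     // reuses the existing f<int>
    assert(call_count<int>() == ints_before + 2);
    assert(call_count<double>() == doubles_before + 1);
}
```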
However, instantiation requires the compiler to know the definition of the template, doesn't it?
Consider the following example, which separates a template's declaration from its implementation:
//-------------test.h----------------//
template<class T>
class A
{
public:
    void f(); // just a declaration
};
//---------------test.cpp-------------//
#include "test.h"
template<class T>
void A<T>::f() // template implementation, but note: not an instantiation
{
    ... // do something
}
//---------------main.cpp---------------//
#include "test.h"
int main()
{
    A<int> a;
    a.f(); // The compiler does not know the definition of A<int>::f here, because it is not in test.h.
           // So the compiler has to pin its hopes on the linker, hoping it will find an instance of
           // A<int>::f in another .obj, in this case test.obj. But does test.obj really contain binary
           // code for A<int>::f? No! The C++ standard clearly states that a template that is not
           // used should not be instantiated. Does test.cpp use A<int>::f? No! So test.obj, compiled
           // from test.cpp, contains not a single line of binary code for A<int>::f.
           // The linker is stumped and can only report a link error.
           // However, if you write a function in test.cpp that calls A<int>::f, the compiler will
           // instantiate it, because at that point (in test.cpp) the compiler knows the template
           // definition and is therefore able to instantiate it. The symbol export table of test.obj
           // then contains the address of the symbol A<int>::f, and the linker can complete its task.
}

The key point is that in a separate compilation environment, the compiler compiles each .cpp file without knowing that any other .cpp file exists, and it does not go looking for one (when it meets an unresolved symbol, it pins its hopes on the linker). This model works well without templates, but it breaks down when templates are involved, because a template is instantiated only when needed. When the compiler sees only the declaration of a template, it cannot instantiate it. It can only create a symbol with external linkage and expect the linker to resolve the symbol's address. But if the .cpp file that implements the template does not use that instance of the template, the compiler does not bother to instantiate it either. As a result, none of the project's .obj files contains a single line of binary code for the template instance, and the linker is at its wit's end as well.
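Besides writing a function in test.cpp that calls A<int>::f, the instance can also be requested directly with an explicit instantiation of that one member. The following is a single-file sketch of that idea. As assumptions made only for illustration, f returns an int (so the result can be checked), its body is a placeholder, and call_a_int_f is a hypothetical stand-in for main.cpp.

```cpp
#include <cassert>

// ---- test.h (sketch) ----
template <class T>
class A {
public:
    int f();   // declaration only; returns int here so the result can be checked
};

// ---- test.cpp (sketch) ----
template <class T>
int A<T>::f() {
    return static_cast<int>(sizeof(T));   // placeholder body, an assumption
}

// Explicit instantiation of a single member function: the object file for
// test.cpp now contains code for A<int>::f even though test.cpp never calls it.
template int A<int>::f();

// ---- main.cpp (sketch): would include only test.h ----
inline int call_a_int_f() {
    A<int> a;
    const int result = a.f();
    assert(result == static_cast<int>(sizeof(int)));
    return result;
}
```

Only A<int>::f is generated this way. A call to A<double>::f from another file would still fail at link time until a matching instantiation is added.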

/////////////////////////////////
Http://dev.csdn.net/develop/article/19/19587.shtm
How to organize C++ template code: the inclusion model. From sam1111's blog
Keywords: Template, Inclusion Model
Source: C++ Templates: The Complete Guide

Note: This article is translated from Chapter 6 of the book "C++ Templates: The Complete Guide". I recently noticed frequent posts about the inclusion model for templates on the C++ forum, and remembered being confused by similar problems when I was new to templates, so I translated this text in the hope of helping beginners.

Template code can be organized in several different ways. This article describes the most popular one: the inclusion model.

Link error

Most C + + programmers organize their non-template code as follows:

• Classes and other types are placed entirely in header files, which have a .hpp (or .H, .h, .hh, .hxx) extension.

• For global variables and (non-inline) functions, only the declarations are placed in header files. The definitions go into dot-C files, which have a .cpp (or .C, .c, .cc, .cxx) extension.

This organization works well: it makes the needed type definitions readily available when programming, and it avoids duplicate-definition errors for variables or functions from the linker.

Because of these conventions, novice template programmers often make the same mistake, which the following small program illustrates. As with "ordinary code", we declare the template in a header file:

basics/myfirst.hpp

#ifndef MYFIRST_HPP
#define MYFIRST_HPP

// declaration of template
template <typename T>
void print_typeof(T const&);

#endif // MYFIRST_HPP

print_typeof() is the declaration of a simple auxiliary function that prints some type information. The definition of the function is placed in a dot-C file:

basics/myfirst.cpp

#include <iostream>
#include <typeinfo>
#include "myfirst.hpp"

// implementation/definition of template
template <typename T>
void print_typeof(T const& x)
{
    std::cout << typeid(x).name() << std::endl;
}

This example uses the typeid operator to print a string that describes the type information for the parameter passed in.

Finally, we use the template in another dot-C file, into which the template declaration is #included:

basics/myfirstmain.cpp

#include "myfirst.hpp"

// use of the template
int main()
{
    double ice = 3.0;
    print_typeof(ice); // call function template for type double
}

Most C++ compilers will probably accept this program without any problems, but the linker will probably report an error, indicating that the definition of the function print_typeof() is missing.

The reason for this error is that the definition of the function template print_typeof() has not been instantiated. To instantiate a template, the compiler must know which definition should be instantiated and for which template arguments. Unfortunately, in the previous example these two pieces of information are in files that are compiled separately. So when the compiler sees the call to print_typeof() but has no definition from which to instantiate the function for double, it simply assumes that such a definition is provided elsewhere and creates a reference to it (which the linker is expected to resolve). On the other hand, when the compiler processes myfirst.cpp, nothing in that file indicates that it must instantiate the template definition it contains for specific arguments.

Templates in header files

The common solution to this problem is the same approach we use for macros and inline functions: we include the definitions of a template in the header file that declares the template. For our example, we can do this by adding #include "myfirst.cpp" at the end of myfirst.hpp, or by including myfirst.cpp in every dot-C file that uses our template. A third way, of course, is to do away with myfirst.cpp entirely and rewrite myfirst.hpp so that it contains all template declarations and definitions:

basics/myfirst2.hpp

#ifndef MYFIRST_HPP
#define MYFIRST_HPP

#include <iostream>
#include <typeinfo>

// declaration of template
template <typename T>
void print_typeof(T const&);

// implementation/definition of template
template <typename T>
void print_typeof(T const& x)
{
    std::cout << typeid(x).name() << std::endl;
}

#endif // MYFIRST_HPP

This way of organizing template code is called the inclusion model. With this in place, you should find that our program now compiles, links, and executes correctly.

We can make a few observations at this point. The most notable is that this approach considerably increases the cost of including myfirst.hpp. In this example, the cost is not the result of the size of the template definition itself, but of the fact that we must also include the headers used by the definition of our template, in this case <iostream> and <typeinfo>. You may find that this amounts to thousands of lines of code, because headers such as <iostream> contain template definitions similar to ours.

This is a real problem in practice because it considerably increases the time the compiler needs to compile real programs. We will therefore examine some possible ways to approach this problem in later chapters. In any case, real-world programs can quickly end up taking hours to compile and link (we have encountered a program that took days to build from its source code).

Despite this build-time issue, we strongly recommend that you organize your template code according to the inclusion model whenever possible.

Another observation is that non-inline function templates differ from inline functions and macros in an important way: they are not expanded at the call site. Instead, a new copy of the function is created when the function template is instantiated. Because this is an automatic process, a compiler could end up creating two identical copies in two different files, and the linker could then report an error. In theory, this should not be our concern: it is a problem for the compiler implementers to solve. In practice, things work well most of the time, and we do not need to deal with this issue at all. For large projects that create their own libraries, however, the problem occasionally shows up.

Finally, we should point out that what applies to the ordinary function template in our example also applies to member functions and static data members of class templates, as well as to member function templates.
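The three additional cases named in the last paragraph can be sketched as follows. Tracker and all of its members are hypothetical names chosen only for illustration. Under the inclusion model, all of these definitions would sit in the header next to the class.

```cpp
#include <cassert>

// Hypothetical class template covering the three cases named above.
template <typename T>
class Tracker {
public:
    Tracker();
    T doubled(T x) const;            // ordinary member function
    template <typename U>
    U convert(T x) const;            // member function template
    static int instances;            // static data member
};

// Definition of the static data member of the class template.
template <typename T>
int Tracker<T>::instances = 0;

template <typename T>
Tracker<T>::Tracker() { ++instances; }

template <typename T>
T Tracker<T>::doubled(T x) const { return x * 2; }

// A member template needs two template parameter lists:
// first the class's, then the member's.
template <typename T>
template <typename U>
U Tracker<T>::convert(T x) const { return static_cast<U>(x); }

inline void tracker_demo() {
    const int before = Tracker<int>::instances;
    Tracker<int> t;
    assert(Tracker<int>::instances == before + 1);
    assert(t.doubled(21) == 42);
    assert(t.convert<double>(3) == 3.0);
}
```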
