Testing the principle and speed of template metaprogramming in C++

Source: Internet
Author: User

For the past two days I have been fascinated by template metaprogramming, and I think it is a really good thing! Out of curiosity I studied it carefully and read a few articles about "programs that write programs". It sounds amazing.

 

The basic principle is to have the compiler calculate, during compilation, some of the values the program needs. The program then does not have to calculate these values while it runs, which improves its run-time performance. Of course, this can make compilation much slower, so the technique is not used everywhere. In some cases, however, we are willing to trade compilation speed for run-time speed.

 

First, write an example to see how it works:

#include <iostream>
#include <cstdlib>
using namespace std;

template <int X, int Y>
struct add
{
    enum { result = (X > Y) ? X + 20 : Y + 100 };
};

int main(void)
{
    int a = add<2, 3>::result;

    system("pause");
    return 0;
}

In the code above, the struct template takes two integer template parameters and uses them to define an enumeration value. In theory, this enumeration value is calculated during compilation and ends up as a plain constant in the generated program. Since 2 > 3 is false, result is Y + 100, so the final value of a should be 103. Run:

 

Output result: 103

 

The result is correct, but it does not show whether the calculation was done during compilation. What should we do? Set a breakpoint on the line int a = add<2, 3>::result; or press F10 to break at the beginning of the main function, then look at the disassembly:

00411b00 push ebp
00411b01 mov ebp, esp
00411b03 sub esp, 0cch
00411b09 push ebx
00411b0a push esi
00411b0b push edi
00411b0c lea edi, [ebp-0CCh]
00411b12 mov ecx, 33h
00411b17 mov eax, 0cccccccch
00411b1c rep stos dword ptr [edi]

00411b1e mov dword ptr [a], 67h

00411b25 push offset string "pause" (0000c8h)
00411b2a call @ILT+565(_system) (41123ah)
00411b2f add esp, 4

00411b32 xor eax, eax

 

The above is the disassembly of the main function. The instruction mov dword ptr [a], 67h is the one that assigns the enumeration value result to a. It moves the immediate value 67h = 103 directly! This shows that the enumeration value was computed during compilation.
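A debugger is not the only way to check this. As a minimal sketch, assuming a C++11 compiler: static_assert and array bounds only accept constant expressions, so the lines below compile only if the value is known during compilation. The name add_demo is hypothetical, chosen just to keep the sketch separate from the program above.

```cpp
// Same selection rule as the first example, under a hypothetical name.
template <int X, int Y>
struct add_demo
{
    enum { result = (X > Y) ? X + 20 : Y + 100 };
};

// static_assert requires a constant expression, so these lines compile
// only if the compiler can evaluate result without running the program.
static_assert(add_demo<2, 3>::result == 103, "2 > 3 is false, so Y + 100");
static_assert(add_demo<5, 3>::result == 25, "5 > 3 is true, so X + 20");

// An array bound must also be a compile-time constant.
int buffer[add_demo<2, 3>::result];
```

If either assertion were false, or if result were not a constant, the build would fail instead of the program misbehaving at run time.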

 

I also read an article about using template metaprogramming to unroll loops, so I wrote a test that sums all the elements of an integer array:

 

#include <iostream>
#include <cstdlib>

using namespace std;

// Primary template: first element plus the sum of the remaining count - 1 elements.
template <int count, class Ty>
class add
{
public:
    static int result(Ty* elem)
    {
        // elem + 1 (rather than elem++, which would pass the unincremented
        // pointer) hands the next element to the next level.
        return elem[0] + add<count - 1, Ty>::result(elem + 1);
    }
};

// Terminating specialization: only one element is left.
template <class Ty>
class add<1, Ty>
{
public:
    static int result(Ty* elem)
    {
        return elem[0];
    }
};

int main(void)
{
    __int64 begtime = 0;
    __int64 endtime = 0;
    __int32 sum = 0;

    int a[3] = {4, 5, 6};

    // MSVC x86 inline assembly: rdtsc returns the low 32 bits in eax
    // and the high 32 bits in edx.
    __asm
    {
        rdtsc
        mov dword ptr [begtime], eax
        lea eax, dword ptr [begtime]
        mov dword ptr [eax + 4], edx
    }

    add<3, int>::result(a);

    __asm
    {
        rdtsc
        mov dword ptr [endtime], eax
        lea eax, dword ptr [endtime]
        mov dword ptr [eax + 4], edx
    }

    cout << endtime - begtime << endl;

    begtime = 0;
    endtime = 0;

    __asm
    {
        rdtsc
        mov dword ptr [begtime], eax
        lea eax, dword ptr [begtime]
        mov dword ptr [eax + 4], edx
    }

    for (int i = 0; i < 3; ++i)
    {
        sum += a[i];
    }

    __asm
    {
        rdtsc
        mov dword ptr [endtime], eax
        lea eax, dword ptr [endtime]
        mov dword ptr [eax + 4], edx
    }

    cout << endtime - begtime << endl;

    system("pause");
    return 0;
}

 

In the program above, the partial specialization add<1, Ty> is the terminating template of the metaprogram. That is how I understand it, anyway; the name doesn't matter. It means that the recursion stops when the template parameter reaches 1. Of course, there can be two or more terminating conditions. The process is as follows:

Call order: add<3, int>::result(int*) => add<2, int>::result(int*) => add<1, int>::result(int*). In other words, the loop is converted into recursion...
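The termination idea can be sketched on its own, here with two terminating specializations instead of one. This is only an illustration: the names sum_n and sum_array are hypothetical, and the sketch advances the pointer with elem + 1 so that each level sees the next element.

```cpp
#include <cstddef>

// Primary template: add the first element, then recurse on the rest.
template <int Count, class Ty>
struct sum_n
{
    static Ty result(const Ty* elem)
    {
        return elem[0] + sum_n<Count - 1, Ty>::result(elem + 1);
    }
};

// First terminating specialization: one element left.
template <class Ty>
struct sum_n<1, Ty>
{
    static Ty result(const Ty* elem) { return elem[0]; }
};

// Second terminating specialization: an empty range sums to zero.
template <class Ty>
struct sum_n<0, Ty>
{
    static Ty result(const Ty*) { return Ty(); }
};

// Hypothetical helper: deduces the element count from the array type.
template <class Ty, std::size_t N>
Ty sum_array(const Ty (&arr)[N])
{
    return sum_n<static_cast<int>(N), Ty>::result(arr);
}
```

For an array int a[3] = {4, 5, 6}, sum_array(a) instantiates sum_n<3, int>, sum_n<2, int> and sum_n<1, int> in turn and returns 15.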

 

Does the compiler make these recursive calls during compilation to calculate the result, so that at run time we simply get the value? Let's look at the disassembly at the call to add<3, int>::result:

 

0041b291 lea eax, [a]
0041b294 push eax
0041b295 call add<3, int>::result (419a82h)
0041b29a add esp, 4

 

Press F11 to step into the call instruction and keep following until we get here:

 

0041bfce mov eax, dword ptr [elem]
0041bfd1 mov dword ptr [ebp-0C4h], eax
0041bfd7 mov ecx, dword ptr [elem]
0041bfda add ecx, 4
0041bfdd mov dword ptr [elem], ecx
0041bfe0 mov edx, dword ptr [ebp-0C4h]
0041bfe6 push edx
0041bfe7 call add<2, int>::result (4193cfh)
0041bfec add esp, 4
0041bfef mov ecx, dword ptr [elem]
0041bff2 add eax, dword ptr [ecx]

 

Step into the call instruction again and follow it to here:

 

0041c6ae mov eax, dword ptr [elem]
0041c6b1 mov dword ptr [ebp-0C4h], eax
0041c6b7 mov ecx, dword ptr [elem]
0041c6ba add ecx, 4
0041c6bd mov dword ptr [elem], ecx
0041c6c0 mov edx, dword ptr [ebp-0C4h]
0041c6c6 push edx
0041c6c7 call add<1, int>::result (41a063h)
0041c6cc add esp, 4
0041c6cf mov ecx, dword ptr [elem]
0041c6d2 add eax, dword ptr [ecx]

 

Step into the call instruction once more and follow it to here:

 

0041cae0 push ebp
0041cae1 mov ebp, esp
0041cae3 sub esp, 0c0h
0041cae9 push ebx
0041caea push esi
0041caeb push edi
0041caec lea edi, [ebp-0C0h]
0041caf2 mov ecx, 30h
0041caf7 mov eax, 0cccccccch
0041cafc rep stos dword ptr [edi]

0041cafe mov eax, dword ptr [elem]
0041cb01 mov eax, dword ptr [eax]

 

Now we can see the whole process. What is determined during compilation is the template argument in each call target (the 3, 2 and 1 in the call instructions above): once the templates are instantiated, each call goes directly to the function generated for that argument. However, the call chain itself is not executed during compilation. The articles I read earlier say that it would be, and I do not know why it is not. In the tests above, every call can be stopped at in the debugger at run time, and the calls are made recursively. One possible factor: the 0cccccccch stack fill in the listings is typical of an unoptimized debug build, in which the compiler does not inline these calls.
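As far as the template mechanism is concerned, the contents of an array passed by pointer are ordinary run-time data, so only the count can take part in compile-time evaluation. As a sketch, assuming a C++11 compiler with variadic templates, passing the values themselves as template arguments turns the whole sum into a compile-time constant, like the enumeration in the first example. The name static_sum is hypothetical.

```cpp
// Primary template, declared only; the specializations below define it.
template <int... Values>
struct static_sum;

// Terminating specialization: no values left.
template <>
struct static_sum<>
{
    enum { result = 0 };
};

// Recursive case: peel off the first value and sum the rest.
template <int First, int... Rest>
struct static_sum<First, Rest...>
{
    enum { result = First + static_sum<Rest...>::result };
};

// Compiles only if the sum is available as a constant expression.
static_assert(static_sum<4, 5, 6>::result == 15, "sum is a compile-time constant");
```

The trade-off is that the values must be known when the program is compiled; data that arrives at run time still needs a loop or a function call.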

__asm
{
    rdtsc
    mov dword ptr [begtime], eax
    lea eax, dword ptr [begtime]
    mov dword ptr [eax + 4], edx
}

 

As mentioned in the previous blog post, this code uses rdtsc to read the CPU's time-stamp counter. The value is a count of clock cycles since the CPU was reset, not a time in nanoseconds.

Here we compare the speed of the loop unrolled by the templates with the speed of the ordinary loop.

The program output result is:

3090 cycle

130 cycle

 

Dividing these cycle counts by the CPU clock frequency gives the running time, but the difference is obvious even without that calculation.
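For completeness, the conversion can be sketched as follows. The function name cycles_to_ns is hypothetical, and the 3 GHz clock used in the note after the code is only an assumption for illustration, since the CPU frequency of the test machine is not stated.

```cpp
// Converts a cycle count to nanoseconds for a given clock frequency in Hz.
// The frequency is supplied by the caller; nothing here measures it.
double cycles_to_ns(unsigned long long cycles, double clock_hz)
{
    return static_cast<double>(cycles) / clock_hz * 1e9;
}
```

With an assumed clock of 3.0e9 Hz, 3090 cycles come to about 1030 ns and 130 cycles to about 43 ns.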

 

It can be seen that, in this test, unrolling the loop only turned it into recursion. I think the real effect is that the integer template argument for each level of recursion is determined during compilation. However, the function calls themselves have considerable overhead here, and deep recursion risks a stack overflow, so this approach needs more careful thought. Still, the benefit of template metaprogramming does show when values such as the enumeration in the first example are calculated during compilation.
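A minimal sketch of that kind of enumeration calculation, using the classic factorial example, is given here. It assumes a C++11 compiler for static_assert, and the name factorial is chosen only for illustration.

```cpp
// Recursive case: N! = N * (N - 1)!, evaluated entirely by the compiler.
template <int N>
struct factorial
{
    enum { value = N * factorial<N - 1>::value };
};

// Terminating specialization: 0! = 1.
template <>
struct factorial<0>
{
    enum { value = 1 };
};

// The value is usable wherever a constant expression is required.
static_assert(factorial<5>::value == 120, "5! = 120");
```

No function is called at run time here: factorial<5>::value is just the constant 120 in the generated program, in the same way that result became 67h in the first disassembly.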

 

Please point out anything I got wrong. I'm still a beginner.

 

 
