Analysis and consideration of C++ virtual functions

Last Update:2013-11-27 Source: Internet

Author: User


Introduction:
The following summarizes books I have read and my own reasoning and experiments while analyzing C++ virtual functions. The analysis is not especially deep, but I believe it helps in understanding how virtual functions work. So far I have only written up some details of single inheritance; I will keep exploring the rest later, and I hope I can do so patiently.
This content is aimed at readers who already know something about C++ virtual functions and are comfortable with basic pointer operations, because many forced pointer casts are used to verify my reasoning. The tests below need to be run as real code; just reading the conclusions is not very interesting. I created a local Visual Studio project, started from the simplest test, and added code bit by bit to go a little deeper. The code is messy because it was written only for my own testing, so please bear with it.

The code is as follows:
Shape.h
log.h
typedef.h
main.cpp

Shape.h:
#ifndef _shape1_head_h
#define _shape1_head_h

#include "typedef.h"
#include "log.h"

class CShape
{
public:
    CShape() {
        TRACE_FUCTION_AND_LINE("");
        m_color = 15;
    }
    ~CShape() {
        TRACE_FUCTION_AND_LINE("");
    }
    void SetColor(int colore) {
        TRACE_FUCTION_AND_LINE("");
        m_color = colore;
    }
protected:
private:
    int m_color;
};

class CRect : public CShape
{
public:
    CRect() { TRACE_FUCTION_AND_LINE(""); m_width = 0; m_height = 255; }
    ~CRect() { TRACE_FUCTION_AND_LINE(""); }
    void PrintMemory() {
        TRACE_FUCTION_AND_LINE("this: %p", this);
        int* p = (int*)this;
        TRACE_FUCTION_AND_LINE("4: %d", *p);
        TRACE_FUCTION_AND_LINE("4: %d", *(p + 1));
        TRACE_FUCTION_AND_LINE("4: %d", *(p + 2));
    }
protected:
private:
    int m_width;
    int m_height;
};
#endif

log.h:
#include <stdio.h>
#define TRACE_FUCTION_AND_LINE(fmt, ...) printf("[%30s:%4d]" fmt "\n", __FUNCTION__, __LINE__, ##__VA_ARGS__)

typedef.h:
It only contains some cross-platform macro definitions, so it is not listed here; it plays no real part in the questions below.

main.cpp:
#include <iostream>
using namespace std;
#include "Shape.h"

int main()
{
    CRect rect1;

    // sizeof yields size_t, so cast it to match the %d format
    TRACE_FUCTION_AND_LINE("sizeof(CShape): %d", (int)sizeof(CShape));
    TRACE_FUCTION_AND_LINE("sizeof(CRect): %d, %p", (int)sizeof(CRect), &rect1);
    rect1.PrintMemory();
    rect1.SetColor(10);
    rect1.PrintMemory();
    return 0;
}

Question 1:
How is the memory of a derived class and its base class laid out?
From the PrintMemory output for rect1 in main.cpp, we can see that the derived class object occupies 12 bytes: the base class's m_color plus its own two int members.
The base class occupies 4 bytes, and the SetColor function itself takes up no memory in the object.

In fact, the memory occupied by an object consists of three parts: non-static data members (its own and those of its parent classes), the vptr (described later), and byte-alignment padding.
So do not casually claim that C++ objects take more memory than C structs. The only real extra is the vptr; byte alignment also happens in plain C structs. Alignment is transparent to the code above it, so you do not need to worry about it too much.

How does a derived class call a non-virtual public function of the base class, such as SetColor in this example?
1. The compiler effectively compiles SetColor into SetColor(int colore, CShape* pShape). For rect1, the call is then an ordinary C-style function call with no extra overhead.
rect1.SetColor(color) is expanded to SetColor(color, &rect1), so the address of rect1 is passed as pShape, and the assignment is performed as pShape->m_color = colore.
Inside SetColor, the compiled code only sees an address, &rect1; it does not know the actual type of the object. Whatever is passed in is treated as a CShape. This raises a question: how does SetColor know the offset of m_color inside rect1? It does not know that the object is a CRect. When SetColor runs, it simply takes the pShape address, adds the offset that m_color has in CShape (0 in this example), and writes the value there. So for the subclass CRect the value is also written at offset 0. In other words, a subclass object contains a block of memory that is the parent class object, and here that block must sit at the start of the subclass object. Otherwise, writing at offset 0 of rect1 would overwrite some other data member instead of m_color.

2. How can this idea be verified?
1) From the memory printed in the example above, we can see that m_color in rect1 is indeed at the start of the object.
2) We can pass a fake object to SetColor and check whether it really writes to the first four bytes of that fake object.
The test code is as follows:
struct FakeCRect {
    int fake1;
    int fake2;
    int fake3;
    int fake4;
} fakerect = {1, 1, 1, 1};
TRACE_FUCTION_AND_LINE("fakerect: %d, %d, %d, %d", fakerect.fake1, fakerect.fake2, fakerect.fake3, fakerect.fake4);
CRect* pfakerect = (CRect*)&fakerect; // deliberately treat unrelated memory as a CRect (undefined behavior, for testing only)
pfakerect->SetColor(20);
TRACE_FUCTION_AND_LINE("fakerect: %d, %d, %d, %d", fakerect.fake1, fakerect.fake2, fakerect.fake3, fakerect.fake4);

The output is as follows:
[main: 23] fakerect: 1, 1, 1, 1
[main: 26] fakerect: 20, 1, 1, 1
We can see that it is indeed the first int that changed. Even though a fake CRect was passed in, SetColor still operates on its first int variable. You could even pass in a char array to test it.

3. An extension: an exception to the above.
1) The example above does not consider virtual functions. Now we add a virtual function, display, to both the parent class and the subclass; the subclass inherits the parent's virtual function and overrides it.
virtual void display() {
    TRACE_FUCTION_AND_LINE(""); // print the current function name and line number to show which class's display was called
}
Now the sizes of the parent class and the subclass have changed: each has grown by four bytes (this is a 32-bit build, so a pointer is four bytes), to 8 and 16 respectively. From the subclass's PrintMemory output we can clearly see that the added memory occupies the first four bytes of the object, and the rest is unchanged. The virtual function mechanism is what makes polymorphism possible: a base class pointer can point to different objects, as in pShape->display();
When pShape points to different objects, different methods are called. You can create a CCircle class that inherits from CShape to observe the result.
CShape* pShape = new CCircle();
pShape->display(); // calls CCircle's display
pShape = new CRect;
pShape->display(); // calls CRect's display
If you are not yet clear about how virtual functions are used, or you do not intend to explore how they are implemented, I suggest skipping the rest.
As long as a class has a virtual function (inherited or its own), it has a vptr, and the vptr is a pointer to a vtbl (virtual function table). You can think of the vtbl as an array of pointers whose elements are function pointers to the class's virtual functions. It usually also holds a pointer to a typeinfo structure, used to identify the type at runtime.
Put simply, CRect has one virtual function, display, so its vtbl has two elements: the typeinfo pointer and the pointer to the display method.

2) Let's take a look at typeinfo:
CShape* pShape = &rect1;
if (typeid(rect1) == typeid(CRect) && typeid(*pShape) == typeid(CRect)) {
    TRACE_FUCTION_AND_LINE("rect1 is type CRect");
    TRACE_FUCTION_AND_LINE("rect1 name: (%s) raw_name: (%s)", typeid(rect1).name(), typeid(rect1).raw_name()); // raw_name() is specific to Microsoft's compiler
}
The code above shows that the type of an object can be determined at runtime. Even after the address of rect1 has been converted to a parent class pointer, the actual type of the object can still be identified.
typeinfo is a class. The compiler allocates it statically at compile time. Every object of a class with virtual functions has a pointer that leads to it, and for objects of the same class this pointer should be the same. Next we test this conjecture.
Some books say the typeinfo pointer sits in item 0 of the vtbl, but I could not find it there when compiling and debugging with VC++; item 0 held the address of the virtual function display.
So I switched to a Linux-like environment and tested with MinGW, where I found the typeinfo address in item -1. I still could not find it on Windows. My guess is that it is also related to item -1 there, but that this slot holds something else: apart from items -1 and 0, the neighbouring slots before and after are both 0. The -1 slot is probably wrapped by Microsoft in some structure rather than pointing directly at typeinfo.
The following is the test code:
const type_info* ptypeidinfo = &(typeid(rect1));
TRACE_FUCTION_AND_LINE("ptypeidinfo: %p", ptypeidinfo);
int* p = (int*)&rect1; // assumes a 32-bit build, where a pointer fits in an int
int* pp = (int*)(p[0]); // vptr
type_info* prectinfo = (type_info*)(*(pp - 1)); // item -1; pp[0] is the virtual function address
TRACE_FUCTION_AND_LINE("prectinfo: %p", prectinfo);
VS2008 output:
[main: 36] ptypeidinfo: 003A9004
[main: 40] prectinfo: 003A7748
MinGW output:
[main: 36] ptypeidinfo: 004085a4
[main: 40] prectinfo: 004085a4
In VS2008 I also looked at pp - 2 and pp + 1, and both are 0, so I guess that typeinfo must be related to pp - 1, but wrapped in some other structure.

3) Back to the problem (the example above did not consider virtual functions):
When both the parent class and the subclass have a virtual function, the vptr comes first and the parent class's data members inside every subclass object are shifted down by four bytes. So the second int of fakerect is the one that changes, which is easy to understand: SetColor takes a CShape, so it writes to the offset of m_color within CShape (4 here). There is another case: the parent class has no virtual function but the subclass does. Then the parent class occupies 4 bytes, and the subclass still carries a 4-byte vptr.
Test: comment out the display virtual function in the parent class. The output is as follows:
fakerect: 1, 1, 1, 1
fakerect: 1, 20, 1, 1
In this case the parent class sub-object is not at the start of the subclass object. The subclass object starts with the vptr, and SetColor computes its offset from its own parameter type, CShape. As far as CShape is concerned there is only the four-byte m_color, so the assignment writes to the first four bytes that pShape points to. But the address of the fakerect object we passed in is the address of fake1, while the value that changed is fake2. What is going on?
We passed the address of fake1, SetColor writes to the address it receives, and yet fake2 is what changed. Some internal conversion must be responsible. My guess is that in the call (CRect* pfakerect = (CRect*)&fakerect; pfakerect->SetColor(20);), when pfakerect is converted to CShape*, the address is automatically offset by four bytes. The following test checks this:

In the CShape class's SetColor method, print the this pointer:
void CShape::SetColor(int colore)
{
    // TRACE_FUCTION_AND_LINE("");
    TRACE_FUCTION_AND_LINE("pShape: %p", this);
    m_color = colore;
}
In main.cpp, print the pfakerect address being passed:
CRect* pfakerect = (CRect*)&fakerect;
TRACE_FUCTION_AND_LINE("fakerect: %p", pfakerect);
pfakerect->SetColor(20);
Output:
[main: 27] fakerect: 002DF958
[CShape::SetColor: 12] pShape: 002DF95C
We can clearly see that the address passed in is 002DF958, while the address SetColor actually works on is 002DF95C. This must be the automatic offset applied when pfakerect is converted to pShape.

To see this change more directly, add the test code to main.cpp:
main.cpp:
CRect* pfakerect = (CRect*)&fakerect;
TRACE_FUCTION_AND_LINE("fakerect: %p", pfakerect);
pfakerect->SetColor(20);
CShape* pfakeshape = pfakerect;
TRACE_FUCTION_AND_LINE("pfakeshape: %p", pfakeshape);
The output is as follows:
[main: 27] fakerect: 003EF77C
[CShape::SetColor: 12] pShape: 003EF780
[main: 30] pfakeshape: 003EF780
The fakerect address is 003EF77C, and it is offset by four bytes when the conversion happens for the SetColor call. When the conversion is written out directly as CShape* pfakeshape = pfakerect, the offset is the same four bytes.
Conclusion:
SetColor is an ordinary (non-virtual) method that lives in the CShape class. It only sees the members of CShape, and when it assigns to a member it uses the layout of its own class. When a CRect* is assigned to a CShape*, however, the compiler looks at both class definitions, finds that CShape has no virtual functions while CRect does, and therefore automatically offsets the pointer by four bytes during the conversion. I believe this adjustment is generated by the compiler at compile time. Since I cannot yet read the object code the compiler generates, I did not trace it further. If you can read the generated code, you can follow it yourself: in the code that converts pfakerect to pfakeshape, the 4-byte offset should already be applied.

A brief summary:
OK. What has been covered so far (the write-up is messy and my writing is not great, since I wrote and tested while thinking; I hope you can follow it, and please leave a comment if you have questions) is mostly about assigning to m_color, an ordinary member variable, and about how SetColor, a non-virtual function, runs.
1. First came the memory layout of a parent class and its subclasses: ordinary member variables, the vptr, and alignment padding. (To verify alignment, remove the virtual functions, add a double member to CRect, and watch how the size of CRect changes.) You may ask whether static data members are inside the object. The answer is clearly no: a static member belongs to the class, not to the object, so no copy is stored per object. Those three parts make up the entire memory footprint of an object.
2. Next came how SetColor, an ordinary member function, runs. The object code generated for SetColor works just like a C function, and the call rect1.SetColor(10); is in effect SetColor(10, &rect1) in C style. I verified this by adding an ordinary C function SetColor and comparing the running time, and the difference was small. The fakerect experiment also confirms how SetColor works: it can be regarded as a very ordinary C assignment. Whatever type you pass in, all it does is add the fixed offset of m_color to the address it receives and write the value there.
3. Then came the extension, which briefly described the effect of virtual functions on object size. I wandered off a little and talked about the typeinfo structure: typeinfo occupies one item of the vtbl that the vptr points to. Its address can be read from the vtbl with forced casts and compared with the address obtained through typeid, and the difference between VS2008 and MinGW was discussed. Each class has a single typeinfo in memory; every object of a class with virtual functions has a pointer that leads to it (this is what enables runtime type identification), and typeid can retrieve it (presumably the compiler allocates the typeinfo memory and returns its address for the typeid expression).
4. Then came the case where the parent class has no virtual function and the subclass does, and how SetColor works internally in that case. It was verified again that SetColor determines the offset of m_color purely from the layout of its own class, CShape. Because the parent class has no virtual function and the subclass does, the caller is responsible for applying the 4-byte offset when a CRect* is converted to a CShape*. This means SetColor only has to consider the layout of CShape, so the offset of m_color can be fixed at compile time and a single piece of object code generated for it; the caller guarantees that the pointer it passes is valid. I also said that this is the compiler's work, but that is only my personal guess; in theory it should be done this way for efficiency.
5. A simple test showed that CPU time is basically the same with and without the offset. So the offset is most likely computed at compile time; if it were worked out at runtime, the cost should be noticeably higher. Confirming this still requires inspecting the generated object code.

--------------------------- 2013/5/12 12:22:33 ---------------------------------------------------------
1. I am fascinated by virtual functions, so I could not help looking at the virtual function table again:
1) Since pp[0] above holds a virtual function from the virtual function table, can we call it to verify this? To be honest it is a bit hacky, but wasn't fetching typeinfo just as hacky? Good things are meant to be tinkered with.
Add the following code to main.cpp:
const type_info* ptypeidinfo = &(typeid(rect1));
TRACE_FUCTION_AND_LINE("ptypeidinfo: %p", ptypeidinfo);
int* p = (int*)&rect1;
int* pp = (int*)(p[0]); // vptr
type_info* prectinfo = (type_info*)(*(pp - 1)); // item -1; pp[0] is the virtual function address
TRACE_FUCTION_AND_LINE("prectinfo: %p", prectinfo);

typedef void pDisplayFunc(CRect* rect); // assumed C-style prototype of display
TRACE_FUCTION_AND_LINE("test begin:");
pDisplayFunc* pfunc = (pDisplayFunc*)(*pp); // item 0 of the vtbl
pfunc(&rect1);
TRACE_FUCTION_AND_LINE("test end..");
pp is the value of the vptr and points to the vtbl, so item 0 of the table can be read from it: *pp is the value of item 0. It is a function pointer to the display function, so from the discussion above its C function prototype should be void display(CRect*);. We therefore cast it to pDisplayFunc* and call it through pfunc(&rect1), and find that CRect's display method is indeed called.
The output is as follows, showing that CRect's display method is called:
[main: 47] test begin:
[CRect::display: 44]
[main: 50] test end..
2) Passing parameters.
Add another virtual function to the CRect class in Shape.h:
virtual void display1(int i) {
    TRACE_FUCTION_AND_LINE("i = %d", i);
}
main.cpp:
typedef void pDisplayFunc1(int i, CRect* rect);
TRACE_FUCTION_AND_LINE("test begin:");
pDisplayFunc1* pfunc1 = (pDisplayFunc1*)(*(pp + 1)); // item 1 of the vtbl
(*pfunc1)(8888, &rect1);
TRACE_FUCTION_AND_LINE("test end..");
The virtual function table above was not described in detail. The added virtual function sits in item 1 of the table and can be obtained through *(pp + 1); the argument passed in the call is 8888.
The output is as follows:
[main: 53] test begin:
[CRect::display1: 47] i = 8888
However, VS2008 reported a memory access error afterwards. I do not know why, but the function was clearly still called, which confirms the expectation. With MinGW in a Linux-like environment the result prints normally. The two environments presumably differ in how strictly they check.

3) Try it as a C function again.
The function address should be fixed at compile time, but I do not know how to print it out.
Let's look again: since pfunc1 holds the address of display1, I should not need to pass &rect1 at all, so I pass NULL to check this conjecture (display1 never touches any member through this, so a null pointer does no harm here). After (*pfunc1)(8888, &rect1); add the call (*pfunc1)(8899, NULL);. As hoped, 8899 is printed normally.
Printed result (comment out the previous call first, because it causes the memory access error):
[main: 53] test begin:
[CRect::display1: 47] i = 8899
4) Keep experimenting: modify the vtbl and see what happens.
Swap the contents of items 0 and 1 of the vtbl; what happens when display is called? I could not wait to find out.
Unfortunately, and frustratingly, the memory is write-protected. But that is not going to stop us: if the vtbl is protected, I will build a rect and a vtbl myself. The basic idea is to create a char array rectbuf holding a copy of rect1's memory, create an array rectvtbl holding a copy of the vtbl, and turn the first four bytes of rectbuf into a vptr pointing at item 0 of rectvtbl. Because this vtbl is in our own memory it is writable, and we can then swap two function pointers in rectvtbl.
During testing, swapping display and display1 always produced a memory access error in VS2008. My guess is that this is because one of the two functions takes a parameter and the other does not, so I added a virtual function display2 with the same signature as display1.
Shape.h:
virtual void display2(int i) {
    TRACE_FUCTION_AND_LINE("i = %d", i);
}
main.cpp:
char rectbuf[sizeof(rect1)];
memcpy(rectbuf, &rect1, sizeof(rect1)); // our own buffer holding a copy of rect1's memory (memcpy needs <string.h>)
char rectvtbl[sizeof(int*) * 4];
memcpy(rectvtbl, pp - 1, sizeof(int*) * 4); // our own copy of the vtbl, starting at item -1
int* prectbuf = (int*)rectbuf;
int* pvtbl = (int*)rectvtbl + 1; // point pvtbl at item 0 of the copied table; the first four bytes are the typeinfo pointer
prectbuf[0] = (int)(pvtbl); // redirect the vptr of the copy to our table (32-bit only)

CRect* fakecharRect = (CRect*)rectbuf;
// fakecharRect->display();
TRACE_FUCTION_AND_LINE("before deal: fakecharRect.display.....");
fakecharRect->display1(100);
fakecharRect->display2(100); // prints normally: display1 is called first, then display2

int temp = *(pvtbl + 1);
*(pvtbl + 1) = *(pvtbl + 2);
*(pvtbl + 2) = temp; // swap items 1 and 2 of the table, that is, display1 and display2

TRACE_FUCTION_AND_LINE("after deal: fakecharRect.display.....");
// fakecharRect->display();
fakecharRect->display1(100);
fakecharRect->display2(100); // surprise: display2 is called first, then display1 (because the table entries were swapped)
The output is as follows:
[main: 61] before deal: fakecharRect.display.....
[CRect::display1: 54] i = 100
[CRect::display2: 57] i = 100
[main: 69] after deal: fakecharRect.display.....
[CRect::display2: 57] i = 100
[CRect::display1: 54] i = 100
[CRect::~CRect: 34]
From the function names printed in the log, you can see that before the swap display1 is called first and then display2, and that the order is reversed after the swap.
The test above supports several conclusions. Do not treat objects as something mysterious: an object is just memory. You can create one with CRect rect1, or you can build one out of a char array; it is up to you. And what is a function? A function is just an address, fixed at compile time. Because we know something about the virtual function mechanism, namely which function each entry of the virtual function table stands for, we can swap the entries so that a call to display1 actually runs display2. If you like, you can write a function of your own and store its address in a table entry; then a call to the class's display function will actually run an external C function, which is sure to surprise the people around you. Of course the point is not to show off. Understanding this helps us control the behavior of our programs. When a program behaves strangely, think carefully about its internal execution logic and things will become clear (believe me, what you run into will rarely go this deep).

To summarize:
OK. From this look at virtual functions we know that a function can be called by fetching its entry from the virtual function table, just like a real C function, and that the entries can be changed to control which function gets called: they can be swapped, or assigned new values. After this, virtual functions should no longer feel mysterious. The table is really just an array of function pointers holding the virtual functions of the class, and those stored pointers are what make dynamic calls possible at runtime. For non-virtual functions the compiler decides which function to call and writes the call directly into the object code. I mentioned earlier that I did not know how to get the address of a member function. A method suddenly occurred to me, and I tested it.
The code is as follows:
char buf[100];
sprintf(buf, "%d", &CRect::PrintMemory); // the compiler refuses to cast a member function pointer to int directly, so I took a detour through sprintf
*(pvtbl) = atoi(buf);
TRACE_FUCTION_AND_LINE("after set CRect::PrintMemory function......");
fakecharRect->display(); // surprise: this calls the PrintMemory method
This confirms that PrintMemory is called. I also compared the vptr of different CRect objects, and they are indeed the same: one class, one table.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
