Item M24: Understanding the costs of virtual functions, multiple inheritance, virtual inheritance, and RTTI
C++ compilers must implement every feature of the language. The details of those implementations are, of course, up to each compiler, and different compilers implement language features in different ways. For the most part, you need not concern yourself with such matters. However, the implementation of some features can have a noticeable impact on the size of objects and the speed at which member functions execute, so for those features it is important to have a basic understanding of what compilers are likely to be doing under the hood. The foremost example of such a feature is virtual functions.
When a virtual function is called, the code executed must correspond to the dynamic type of the object on which the function is invoked; the type of the pointer or reference to the object is immaterial. How can compilers provide this behavior efficiently?
Most implementations use virtual tables and virtual table pointers, commonly referred to as vtbls and vptrs, respectively.
A vtbl is usually an array of pointers to functions. (Some compilers use a form of linked list instead of an array, but the fundamental strategy is the same.)
Each class in a program that declares or inherits virtual functions has its own vtbl, and the entries in a class's vtbl are pointers to the implementations of the virtual functions for that class. For example, given a class definition like this:
class C1 {
public:
  C1();
  virtual ~C1();
  virtual void f1();
  virtual int f2(char c) const;
  virtual void f3(const string& s);
  void f4() const;
  ...
};
C1's virtual table array holds one entry for each of its virtual functions: C1::~C1, C1::f1, C1::f2, and C1::f3.
Note that the non-virtual function f4 is not in the table, nor is C1's constructor.
Non-virtual functions (including constructors, which are by definition non-virtual) are implemented just like ordinary C functions, so there are no special performance considerations surrounding their use.
If a class C2 inherits from C1, redefines some of the virtual functions it inherits, and adds some new ones of its own,
class C2: public C1 {
public:
  C2();                          // non-virtual function
  virtual ~C2();                 // redefined function
  virtual void f1();             // redefined function
  virtual void f5(char *str);    // new virtual function
  ...
};
its virtual table entries point to the functions that are appropriate for objects of its type.
These entries include pointers to the C1 virtual functions that C2 does not redefine (but not C1's virtual destructor, since C2 supplies its own): C2::~C2, C2::f1, C1::f2, C1::f3, and C2::f5.
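The effect of these per-class tables can be observed from ordinary code. The following is a minimal sketch rather than the item's original classes: it uses simplified stand-ins for C1 and C2 whose functions return their own names (an assumption made so the result can be inspected), and the helpers callF1, callF2, and callF4 are hypothetical.

```cpp
#include <string>

// Simplified stand-ins for C1 and C2: each function reports its own name.
struct C1 {
    virtual ~C1() = default;
    virtual std::string f1() { return "C1::f1"; }
    virtual std::string f2() const { return "C1::f2"; }
    std::string f4() const { return "C1::f4"; }      // non-virtual: no vtbl entry
};

struct C2 : C1 {
    std::string f1() override { return "C2::f1"; }   // replaces C1's entry for f1
    virtual std::string f5() { return "C2::f5"; }    // new entry, present only for C2
    std::string f4() const { return "C2::f4"; }      // hides C1::f4, does not override it
};

// Hypothetical helpers: every call goes through a reference to the base class.
std::string callF1(C1& obj)       { return obj.f1(); }  // dynamic type decides
std::string callF2(const C1& obj) { return obj.f2(); }  // inherited entry is used
std::string callF4(const C1& obj) { return obj.f4(); }  // static type decides
```

For a C2 object, callF1 reaches C2::f1 and callF2 reaches the inherited C1::f2, while callF4 reaches C1::f4 because a non-virtual call is bound at compile time.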
This discussion brings out the first cost of virtual functions: you have to set aside space for a virtual table for each class that contains virtual functions.
The size of a class's vtbl is proportional to the number of virtual functions declared for that class (including those it inherits from its base classes). There should be only one virtual table per class, so the total space required for virtual tables is not usually significant, but if you have a large number of classes or a large number of virtual functions in each class, you may find that the vtbls consume a sizable amount of address space.
Because you need only one copy of a class's vtbl in your programs, compilers must address a tricky problem: where to put it.
Most programs and libraries are created by linking together many object files, but each object file is generated independently of the others. Which object file should contain the vtbl for any given class? You might think to put it in the object file containing main, but libraries have no main, and at any rate the source file containing main may make no mention of many of the classes requiring vtbls. How could compilers then know which vtbls they were supposed to create?
A different strategy must be adopted, and compiler vendors tend to fall into two camps.
For vendors who provide an integrated environment containing both compiler and linker, a brute-force strategy is to generate a copy of the vtbl in each object file that might need it. The linker then strips out duplicate copies, leaving only a single instance of each vtbl in the final executable or library.
A more common design is to employ a heuristic to determine which object file should contain the vtbl for a class.
The usual heuristic is this: a class's vtbl is generated in the object file containing the definition (i.e., the body) of the first non-inline, non-pure virtual function in that class. Thus the vtbl for class C1 above would be placed in the object file containing the definition of C1::~C1 (provided that function is not inline), and the vtbl for class C2 would be placed in the object file containing the definition of C2::~C2 (again, provided that function is not inline).
(Note: a member function defined inside the class definition is implicitly inline, whether or not it is virtual; a call to a virtual function that is bound dynamically cannot be expanded inline.)
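As a rough illustration of the heuristic, the sketch below shows how a class might be split across a header and a source file so that exactly one object file defines its first non-inline virtual function. The class Widget, the helper widgetId, and the file names in the comments are hypothetical, and whether a given compiler actually uses this function to place the vtbl is implementation-specific.

```cpp
// ---- widget.h (hypothetical header) ----
class Widget {
public:
    virtual ~Widget();                       // declared here, defined out of line
    virtual int id() const { return 42; }    // defined in the class: implicitly inline
};

// ---- widget.cpp (hypothetical source file) ----
// The destructor is the first non-inline, non-pure virtual function, so a
// compiler following the heuristic would emit Widget's vtbl in the object
// file built from this source file, and nowhere else.
Widget::~Widget() {}

// Hypothetical helper showing that the class is used like any other.
int widgetId(const Widget& w) { return w.id(); }
```

If the destructor were also defined inside the class, no non-inline virtual function would remain, and the heuristic would have nothing to key on.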
In practice, this heuristic works well. But if you get carried away with declaring virtual functions inline (see Effective C++ Item 33), it fails: if all the virtual functions in a class are declared inline, the heuristic has nothing to go on, and most heuristic-based compilers then generate a copy of the class's vtbl in every object file that uses it. In large systems, this can lead to programs containing hundreds or thousands of copies of a class's vtbl! Most compilers following this heuristic give you some way to control vtbl generation manually, but a better solution is to avoid declaring virtual functions inline. As we'll see below, there are good reasons why present compilers typically ignore the inline directive for virtual functions anyway.
(C++ Primer, Fifth Edition, makes the same point: the inline specifier is only a request to the compiler, and the compiler may choose to ignore it.)
Virtual tables are only half of the implementation machinery for virtual functions; by themselves they would be useless. They become useful only when there is some way of indicating which vtbl corresponds to each object, and it is the job of the virtual table pointer to establish that correspondence.
Each object whose class declares virtual functions carries with it a hidden data member that points to the virtual table for that class. This hidden data member, the vptr, is added by compilers at a location in the object known only to the compilers. Conceptually, we can think of the layout of such an object as the object's data members followed by its vptr.
That picture puts the vptr at the end of the object, but don't let it fool you: different compilers put it in different places.
In the presence of inheritance, an object's vptr is often surrounded by data members. Multiple inheritance complicates this picture, but we'll deal with that a bit later.
For now, simply note the second cost of virtual functions: you have to pay for an extra pointer inside each object that is of a class containing virtual functions.
If your objects are small, this can be a significant cost. If your objects contain, on average, four bytes of member data, for example, the addition of a vptr can double their size (assuming a four-byte vptr on a 32-bit system). On systems with limited memory, this means the number of objects you can create is reduced. Even on systems with unconstrained memory, you may find that the performance of your software decreases, because larger objects mean fewer fit on each cache or virtual memory page, and that means your paging activity will probably increase.
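The per-object cost is easy to measure on whatever compiler is at hand. The sketch below is only an illustration under stated assumptions: the types Plain and WithVirtual and the function virtualOverhead are hypothetical, and the language does not guarantee any particular size, only that mainstream implementations add a vptr to polymorphic objects.

```cpp
#include <cstddef>

struct Plain {
    int value;
};

struct WithVirtual {
    int value;
    virtual ~WithVirtual() = default;   // makes the class polymorphic
};

// Extra bytes per object attributable to the virtual function machinery
// (the vptr plus any padding it forces) on the compiler being used.
constexpr std::size_t virtualOverhead() {
    return sizeof(WithVirtual) - sizeof(Plain);
}
```

On a typical 64-bit implementation the overhead exceeds the size of the original int member, which is the "small object" situation described above.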
Suppose we have a program containing several objects of types C1 and C2. Given the relationships among objects, vptrs, and vtbls just described, we can picture them this way: each object's vptr points to the one vtbl for its class, so every C1 object refers to C1's vtbl and every C2 object refers to C2's vtbl. Now consider this program fragment:
void makeACall(C1 *pC1)
{
  pC1->f1();
}
This is a call to the virtual function f1 through the pointer pC1. By looking only at this code, there is no way to know which f1 function (C1::f1 or C2::f1) should be invoked, because pC1 might point to a C1 object or to a C2 object. Nevertheless, compilers must generate code for the call to f1 inside makeACall, and they must ensure that the correct function is called, no matter what pC1 points to. They do this by generating code to do the following:
- Follow the object's vptr to its vtbl. This is a simple operation, because the compilers know where to look inside the object for the vptr (after all, they put it there). As a result, this costs only an offset adjustment (to get to the vptr) and a pointer indirection (to get to the vtbl).
- Find the pointer in the vtbl that corresponds to the function being called (f1 in this example). This, too, is simple, because compilers assign each virtual function a unique index within the table. The cost of this step is just an offset into the vtbl array.
- Invoke the function pointed to by the pointer located in the second step.
If we imagine that each object has a hidden member called vptr and that the vtbl index of function f1 is i, the code generated for the statement
  pC1->f1();
is
  (*pC1->vptr[i])(pC1);   // call the function pointed to by the i-th entry
                          // in the vtbl pointed to by pC1->vptr; pC1 is
                          // passed to the function as the "this" pointer
This is almost as efficient as a non-virtual function call: on most machines it executes only a few more instructions.
The cost of calling a virtual function is thus basically the same as that of calling a function through a function pointer. Virtual functions per se are not usually a performance bottleneck.
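The three steps and the generated call expression above can be imitated by hand with plain function pointers. This is a sketch of the idea only, not what any particular compiler emits: the Object type, the VirtualFn alias, the free functions standing in for member function bodies, and the index constant f1Index are all hypothetical.

```cpp
#include <string>

struct Object;                                  // stand-in for a C1 or C2 object
using VirtualFn = std::string (*)(Object*);     // each slot receives the "this" pointer

struct Object {
    const VirtualFn* vptr;   // the hidden pointer a compiler would add
    int data;                // ordinary member data
};

// Free functions playing the role of the member function bodies.
std::string C1_f1(Object*) { return "C1::f1"; }
std::string C1_f2(Object*) { return "C1::f2"; }
std::string C2_f1(Object*) { return "C2::f1"; }

// One table per "class"; C2 reuses C1's f2 because it does not redefine it.
const VirtualFn C1_vtbl[] = { C1_f1, C1_f2 };
const VirtualFn C2_vtbl[] = { C2_f1, C1_f2 };

constexpr int f1Index = 0;   // the index i assigned to f1

std::string makeACall(Object* pC1) {
    // Follow the vptr to the vtbl, index it, and call through the pointer.
    return (*pC1->vptr[f1Index])(pC1);
}
```

An Object whose vptr is set to C2_vtbl makes makeACall reach C2_f1, with no test of the object's type anywhere in the calling code.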
The real runtime cost of virtual functions has to do with their interaction with inlining. For all practical purposes, virtual functions aren't inlined. That's because "inline" means "during compilation, replace the call site with the body of the called function," but "virtual" means "wait until runtime to see which function is called." If your compilers don't know which function will be called at a particular call site, you can understand why they won't inline that function call. This is the third cost of virtual functions: you effectively give up inlining.
(Virtual functions can be inlined when invoked through objects, but most virtual function calls are made through pointers or references to objects, and such calls are not inlined. Because such calls are the norm, virtual functions are effectively not inlined.)
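The distinction between the two kinds of call site can be made concrete. In this minimal sketch the classes Shape and Triangle and the functions viaObject and viaPointer are hypothetical; whether a compiler actually inlines the first call is up to its optimizer, so the comments describe what is possible rather than what is guaranteed.

```cpp
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Call through an object: the dynamic type is known to be Triangle at
// compile time, so the compiler is free to bind the call statically and
// may expand it inline.
int viaObject() {
    Triangle t;
    return t.sides();
}

// Call through a pointer: the pointee's dynamic type is not known here,
// so in general the call has to go through the vtbl at runtime.
int viaPointer(const Shape* s) {
    return s->sides();
}
```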
Everything we've seen so far applies to both single and multiple inheritance, but when multiple inheritance enters the picture, things get more complex (see Effective C++ Item 43). There is no point in dwelling on the details here, but with multiple inheritance, the offset calculations needed to find vptrs within objects become more complicated; there are multiple vptrs within a single object (one per base class); and special vtbls must be generated for base classes in addition to the stand-alone vtbls we have discussed. As a result, both the per-class and the per-object space overhead for virtual functions increases, and the runtime invocation cost grows slightly, too.
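The extra offset calculations show up as soon as a pointer to a derived object is converted to a pointer to its second base class. The sketch below uses hypothetical types Base1, Base2, and Derived; the observation that the Base2 part sits at a non-zero offset holds on mainstream implementations but is an assumption about object layout, not something the language promises.

```cpp
struct Base1 { virtual ~Base1() = default; int a = 1; };
struct Base2 { virtual ~Base2() = default; int b = 2; };
struct Derived : Base1, Base2 { int c = 3; };

// True when the Base2 part does not start at the beginning of the Derived
// object, i.e. converting Derived* to Base2* had to adjust the pointer.
bool base2IsOffset(Derived& d) {
    const void* whole = &d;
    const void* base2Part = static_cast<Base2*>(&d);
    return whole != base2Part;
}

// Converting back from the adjusted Base2* recovers the original object.
bool roundTrips(Derived& d) {
    Base2* asBase2 = &d;
    return dynamic_cast<Derived*>(asBase2) == &d;
}
```

A virtual call made through the Base2 pointer needs the same kind of adjustment in reverse before the overriding function runs, which is part of the slightly higher invocation cost mentioned above.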
Multiple inheritance often leads to the need for virtual base classes.
Without virtual base classes, if a derived class has more than one inheritance path to a base class, the data members of that base class are replicated within each derived class object, one copy for each path between the derived class and the base class. Such replication is almost never what programmers want, and making base classes virtual eliminates it. Virtual base classes may incur a cost of their own, however, because implementations of virtual base classes often use pointers to virtual base class parts as the means for avoiding the replication, and one or more of those pointers may be stored inside your objects.
For example, consider what I generally call "the dreaded multiple inheritance diamond," in which D inherits from both B and C, and A is a virtual base class because B and C inherit from it virtually. With some compilers (especially older ones), the layout for an object of type D places the B, C, and D data members first, with the B and C parts each holding a pointer to the shared A part, and the A data members last.
It seems a little strange to place the base class data members at the end of the object, but that's often how it's done. Of course, implementations are free to organize memory any way they like,
so you should never rely on this layout for anything more than a conceptual overview of how virtual base classes may lead to the addition of hidden pointers to your objects. Some implementations add fewer pointers, and some find ways to add none at all (such implementations make the vptr and vtbl serve double duty).
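The difference between replicated and shared base class parts can be checked directly. The following sketch reuses the names A, B, C, and D from the diamond; the non-virtual counterparts PlainB, PlainC, and PlainD and the two test functions are hypothetical additions for comparison.

```cpp
struct A { int value = 0; };

// Virtual inheritance: B and C share a single A part inside D.
struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {};

// Ordinary inheritance: PlainD contains two separate A parts.
struct PlainB : A {};
struct PlainC : A {};
struct PlainD : PlainB, PlainC {};

bool virtualBaseIsShared() {
    D d;
    static_cast<B&>(d).value = 42;             // write through the B path
    return static_cast<C&>(d).value == 42;     // visible through the C path
}

bool plainBaseIsDuplicated() {
    PlainD d;
    static_cast<PlainB&>(d).value = 42;          // changes only PlainB's copy of A
    return static_cast<PlainC&>(d).value == 0;   // PlainC's copy is untouched
}
```

Comparing sizeof(D) with sizeof(PlainD) on a given compiler gives a feel for what the bookkeeping behind the shared A part costs there; the numbers differ between implementations.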
If we combine this layout with the earlier one showing how virtual table pointers are added to objects, we find that if the base class A in the hierarchy above has any virtual functions, the memory layout for an object of type D also contains vptrs: one in the B part, one in the C part, and one in the A part.
In the figure, the parts of the object added by compilers are shaded. The picture may be misleading, because the ratio of shaded to unshaded areas is determined by the amount of data in your classes. For small classes, the relative overhead is large. For classes with more data, the relative overhead is less significant, though it is typically still noticeable.
An oddity in this layout is that there are only three vptrs even though four classes are involved (one might expect a vptr for each).
Implementations are free to generate four vptrs if they like, but three suffice (it turns out that B and D can share a vptr), and most implementations take advantage of this opportunity to reduce the compiler-generated overhead.
We've now seen how virtual functions make objects larger and preclude inlining, and we've examined how multiple inheritance and virtual base classes can also increase the size of objects.
Let us therefore turn to our final topic, the cost of runtime type identification (RTTI).
RTTI lets us discover information about objects and classes at runtime, so there has to be a place to store the information we're allowed to query. That information is stored in an object of type type_info, and you can access the type_info object for a class by using the typeid operator.
There need be only a single copy of the RTTI information for each class, but there must be a way to get to that information for any object. Actually, that's not quite true.
The language specification states that we're guaranteed accurate information on an object's dynamic type only if that type has at least one virtual function. This makes RTTI data sound a lot like a virtual function table: we need only one copy of the information per class, and we need a way to get to the appropriate information from any object containing a virtual function.
This parallel between RTTI and virtual function tables is no accident: RTTI was designed to be implementable in terms of a class's vtbl.
For example, index 0 of a vtbl array might contain a pointer to the type_info object for the class corresponding to that vtbl. The vtbl for class C1 above would then hold a pointer to C1's type_info object in entry 0, followed by the entries for C1::~C1, C1::f1, C1::f2, and C1::f3.
With this implementation, the space cost of RTTI is one additional entry in each class's vtbl plus the storage for the type_info object for each class.
Just as the memory for virtual tables is unlikely to be noticeable for most applications, you're unlikely to run into problems due to the size of type_info objects.
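The guarantee that dynamic type information is available only for types with at least one virtual function can be seen with the typeid operator. In this sketch the class names and the two helper functions are hypothetical; typeid and type_info comparison come from the standard header <typeinfo>.

```cpp
#include <typeinfo>

struct Base { virtual ~Base() = default; };   // polymorphic: has a virtual function
struct Derived : Base {};

struct PlainBase {};                          // no virtual functions
struct PlainDerived : PlainBase {};

// For a polymorphic type, typeid on a reference reports the dynamic type.
bool isReallyDerived(const Base& b) {
    return typeid(b) == typeid(Derived);
}

// For a non-polymorphic type, typeid reports the static type of the
// expression, whatever the object actually is.
bool looksLikePlainBase(const PlainBase& p) {
    return typeid(p) == typeid(PlainBase);
}
```

Passing a PlainDerived object to looksLikePlainBase still reports PlainBase, since without a vptr there is nothing in the object from which its dynamic type could be recovered.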
The following table summarizes the primary costs of virtual functions, multiple inheritance, virtual base classes, and RTTI:
Feature              | Increases Size of Objects | Increases Per-Class Data | Reduces Inlining
Virtual Functions    | Yes                       | Yes                      | Yes
Multiple Inheritance | Yes                       | Yes                      | No
Virtual Base Classes | Often                     | Sometimes                | No
RTTI                 | No                        | Yes                      | No
Some people look at this table and are aghast. "I'm sticking with C!" they declare. Fair enough. But remember that each of these features offers functionality you'd otherwise have to code by hand. In most cases, your manual approximation would probably be less efficient and less robust than the compiler-generated code. Using nested switch statements or cascading if-then-elses to emulate virtual function calls, for example, yields more code than virtual function calls do, and the code runs more slowly, too. Furthermore, you must manually track object types yourself, which means your objects carry around type tags of their own; you thus often fail to gain even the benefit of smaller objects.
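To make the comparison concrete, here is a small sketch of the hand-coded alternative next to the virtual function version. All names (Kind, TaggedShape, taggedArea, Shape, Square, RightTriangle, virtualArea) are hypothetical, and the example makes no claim about which version is faster on a given compiler; it only shows that the tag occupies space in every object and that the switch has to be maintained by hand.

```cpp
// Hand-rolled dispatch: every object carries an explicit type tag.
enum class Kind { Square, RightTriangle };

struct TaggedShape {
    Kind kind;      // the tag takes up space, much as a vptr would
    double size;    // side length, or length of the two equal legs
};

double taggedArea(const TaggedShape& s) {
    switch (s.kind) {   // must be extended by hand for every new kind
        case Kind::Square:        return s.size * s.size;
        case Kind::RightTriangle: return 0.5 * s.size * s.size;
    }
    return 0.0;         // unreachable for valid tags
}

// The same behavior using virtual functions.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    double side;
};

struct RightTriangle : Shape {
    explicit RightTriangle(double l) : leg(l) {}
    double area() const override { return 0.5 * leg * leg; }
    double leg;
};

double virtualArea(const Shape& s) { return s.area(); }
```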
It is important to understand the costs of virtual functions, multiple inheritance, virtual base classes, and RTTI, but it is equally important to understand that if you need the functionality these features offer, you will pay for it, one way or another. Sometimes you do have legitimate reasons for bypassing the compiler-generated services. For example, hidden vptrs and pointers to virtual base classes can make it difficult to store C++ objects in databases or to move them across process boundaries, so you may wish to emulate these features in a way that makes it easier to accomplish these other tasks. From the point of view of efficiency, however, you are unlikely to do better than the compiler-generated implementations by coding these features yourself.