Memory layout in C++ inheritance

Source: Internet
Author: User

Today I came across a very good article online about the memory layout of C++ classes under inheritance. I learned a lot from reading it, so I am reposting it on my own blog for later review.

-- On the VC++ object model
(US) Jan Gray
Translated by Chenghua

Translator's Preface

A C++ programmer who wants to improve technically should learn more about the semantic details of the language. Programmers who use VC++ should also know how VC++ interprets C++. Inside the C++ Object Model is a good book, but it is long and has relatively little to do with VC++ specifically. In terms of both length and content, the translator therefore thinks this article is a good starting point for understanding the C++ object model in depth.
I thought this article was very good when I first read it. Rereading it, I understood more, which gave me the idea of translating and sharing it. The article is not long, but my time was limited and I dozed off several times while translating, so with all the procrastination it took a month.
Because my ability is limited, and because I was often drowsy while translating, there are probably many mistakes. Criticism and corrections are welcome.

The original source of this article is MSDN. If you have MSDN installed, you can search for "C++ Under the Hood". Otherwise, you can find it on the website.

1 Preface

Knowing how your programming language is implemented is particularly valuable for C++ programmers. First, it removes the mystery from the language we use, so that we are no longer baffled by what the compiler does. More importantly, it gives us more confidence when we debug and when we use the advanced features of the language. This knowledge also helps when we need to make code more efficient.

This article focuses on answering the following questions:
* How classes are laid out.
* How member variables are accessed.
* How member functions are accessed.
* What the so-called "adjuster thunk" is.
* What the following mechanisms cost:
  * Single inheritance, multiple inheritance, virtual inheritance
  * Virtual function calls
  * Casts to a base class or to a virtual base class
  * Exception handling
First, we examine the layout of C-compatible structures (struct), single inheritance, multiple inheritance, and virtual inheritance.
Next, we discuss access to member variables and member functions, including virtual functions.
Then we examine how constructors, destructors, and the special assignment operator member functions work, and how arrays are dynamically constructed and destroyed.
Finally, we briefly introduce the support for exception handling.

For each language feature, we briefly describe the motivation behind it, its semantics (this article is not an introduction to C++, of course, and readers should keep that in mind), and how it is implemented in Microsoft's VC++. Note the distinction between the abstract semantics of the C++ language and a specific implementation of them. C++ vendors other than Microsoft may provide completely different implementations, and we occasionally compare the VC++ implementation with others.

2 Class layout

This section discusses the different memory layouts that result from different kinds of inheritance.

2.1 C structures (struct)

Because C++ is based on C, it is "mostly" compatible with C. In particular, the C++ specification applies the same simple memory layout rules to structures as C does: member variables are laid out in the order in which they are declared, and each is aligned in memory according to implementation-defined alignment rules. All C++ vendors guarantee that their C and C++ compilers lay out valid C structures identically. Here, A is a simple C structure whose member layout and alignment are clear at a glance.

struct A { char c; int i; };

Translator's note: In the figure in the original article, A occupies 8 bytes in memory. Following the declaration order, the first 4 bytes hold a char (which actually occupies 1 byte, with 3 bytes of padding), and the next 4 bytes hold an int. A pointer to A points to the byte where the char begins.
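The layout described in the note can be inspected with offsetof. The following is a minimal sketch, not part of the original article; the helper names padding_after_c and demo are made up for illustration, and the exact sizes printed depend on the compiler and platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Same declaration as A in the text.
struct A {
    char c;
    int i;
};

// Padding bytes the implementation inserted between c and i.
std::size_t padding_after_c() {
    return offsetof(A, i) - sizeof(char);
}

// Prints the layout and checks only properties that hold on any conforming compiler.
void demo() {
    std::printf("sizeof(A)=%zu offsetof(c)=%zu offsetof(i)=%zu padding=%zu\n",
                sizeof(A), offsetof(A, c), offsetof(A, i), padding_after_c());
    assert(offsetof(A, c) == 0);                         // first member starts the struct
    assert(offsetof(A, i) % alignof(int) == 0);          // i is aligned for int
    assert(sizeof(A) >= offsetof(A, i) + sizeof(int));   // i fits inside A
}
```

Calling demo() from main prints the numbers for the current platform. With a 4-byte, 4-byte-aligned int, they should match the 8-byte picture described in the note.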

2.2 C structures with C++ features

Of course, C++ is not just a more complicated C. C++ is fundamentally an object-oriented language, with inheritance, encapsulation, and polymorphism. The original C structure has been transformed into the cornerstone of the object-oriented world: the class. Besides member variables, a C++ class can encapsulate member functions and other things. Interestingly, however, apart from the hidden member variables introduced to implement virtual functions and virtual inheritance, the size of a class instance depends entirely on the member variables of the class and its base classes. Member functions essentially do not affect the size of a class instance.

The B shown here is a C structure with some C++ features: the public/protected/private keywords that control member visibility, member functions, static members, and nested type declarations. Although it looks busy, only the member variables actually occupy space in a class instance. Note that the C++ standards committee places no restriction on the relative order of the sections separated by public/protected/private, so the memory layouts produced by different compilers may differ. (In VC++, member variables are always laid out in the order in which they are declared.)

struct B {
public:
    int bm1;
protected:
    int bm2;
private:
    int bm3;
    static int bsm;
    void bf();
    static void bsf();
    typedef void* bpv;
    struct N { };
};

Translator's note: Why does static int bsm in B not occupy space in the instance? Because it is a static member, its data is stored in the program's data section, not in the class instance.
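One way to check that only the non-static member variables contribute to the instance size is to compare B with a struct that holds nothing but three ints. This is a sketch under assumptions: Plain, same_size_as_plain and demo are hypothetical names, empty function bodies were added so the example links, and the equality is what mainstream compilers produce rather than something the standard guarantees.

```cpp
#include <cassert>
#include <cstdio>

// Three ints and nothing else, for comparison.
struct Plain { int m1; int m2; int m3; };

// B from the text, with empty function bodies added so the example links.
struct B {
public:
    int bm1;
protected:
    int bm2;
private:
    int bm3;
    static int bsm;
    void bf() {}
    static void bsf() {}
    typedef void* bpv;
    struct N { };
};
int B::bsm = 0;  // the static member lives outside every B instance

// True when member functions, statics and nested types added nothing to the instance.
bool same_size_as_plain() { return sizeof(B) == sizeof(Plain); }

void demo() {
    std::printf("sizeof(Plain)=%zu sizeof(B)=%zu\n", sizeof(Plain), sizeof(B));
    assert(same_size_as_plain());
}
```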

2.3 Single inheritance

C++ provides inheritance in order to factor out what different types have in common. For example, scientists classify living things into species, genus, class, and so on. With such a hierarchy, a property can be attached to the most appropriate level of the classification, such as "mammals bear live young". Because these properties are inherited by the subclasses, we can conclude that whales and humans bear live young simply by knowing that whales and humans are mammals. Exceptions, such as the platypus (an egg-laying mammal), require us to override the default property or behavior.
The inheritance syntax in C++ is simple: add ": base" after the name of the derived class. D below inherits from base class C.

struct C { int c1; void cf(); };

struct D : C { int d1; void df(); };

Since a derived class retains all the properties and behavior of its base class, every instance of the derived class naturally contains a complete copy of the base class instance data. Nothing requires that the data of base class C be placed at the front of D, but laying it out this way guarantees that the address of the C object embedded in D is the first byte of the D object. With this arrangement, given a pointer to derived class D, no offset needs to be computed to obtain a pointer to base class C. Almost all well-known C++ vendors use this arrangement (base class members first). In a single-inheritance hierarchy, each new derived class simply appends its own member variables after those of its base class. In the figure in the original article, the C object pointer and the D object pointer point to the same address.
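The claim that the C subobject and the D object share an address can be checked directly. This is a minimal sketch; base_shares_address and demo are hypothetical helper names, and the result reflects what common compilers do for this hierarchy rather than a guarantee of the language.

```cpp
#include <cassert>

struct C { int c1; void cf() {} };
struct D : C { int d1; void df() {} };

// True when converting D* to C* leaves the address unchanged.
bool base_shares_address(D& d) {
    C* pc = &d;  // implicit derived-to-base conversion
    return static_cast<void*>(pc) == static_cast<void*>(&d);
}

void demo() {
    D d{};
    assert(base_shares_address(d));
}
```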

2.4 Multiple inheritance

In most cases, single inheritance is sufficient. However, C++ also provides multiple inheritance for our convenience.

For example, suppose we have an organizational model with a Manager class (which delegates tasks) and a Worker class (which does the work). How should we represent, in the class hierarchy, a first-line manager (MiddleManager), who both takes on tasks from a superior and delegates tasks to subordinates? Single inheritance is inadequate here. We could have Manager inherit from Worker and MiddleManager inherit from Manager, but that hierarchy wrongly makes Manager inherit the properties and behavior of Worker; the reverse arrangement has the same problem. MiddleManager could also inherit from only one class (Manager or Worker), or from neither, and redeclare one or both interfaces, but such an implementation has too many drawbacks: polymorphism is lost, the existing interfaces cannot be reused, and, worst of all, every interface change has to be maintained in several places. The most reasonable design seems to be for the first-line manager to inherit properties and behavior from both places: Manager and Worker.

C++ allows multiple inheritance to solve exactly this kind of problem:

struct Manager ... { ... };
struct Worker ... { ... };
struct MiddleManager : Manager, Worker { ... };

What class layout does this kind of inheritance produce? Let's continue with our "alphabet" classes to illustrate:

struct E { int e1; void ef(); };

struct F : C, E { int f1; void ff(); };
Structure F multiply inherits from C and E. As with single inheritance, an instance of F contains a copy of all the data of each base class. Unlike single inheritance, the two embedded base class objects cannot both have the same address as the derived class object:
F f;
(void*)&f == (void*)(C*)&f;
(void*)&f < (void*)(E*)&f;
Translator's note: The first comparison shows that the C object pointer is the same as the F object pointer; the second shows that the E object pointer differs from the F object pointer.

Looking at the class layout, you can see that the E object embedded in F does not have the same address as F. As the later discussion of casts and member functions points out, this offset causes a small amount of overhead.
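The two comparisons above can be turned into a small measurement. The sketch below is illustrative only: base_offset and demo are hypothetical names, and the offsets are those chosen by the compiler in use (mainstream compilers place C at offset 0 and E after it).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct C { int c1; };
struct E { int e1; };
struct F : C, E { int f1; };

// Byte distance from the start of f to its Base subobject.
template <class Base>
std::ptrdiff_t base_offset(F& f) {
    Base* pb = &f;  // implicit derived-to-base conversion, adjusted by the compiler
    return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&f);
}

void demo() {
    F f{};
    std::printf("C at +%td, E at +%td\n", base_offset<C>(f), base_offset<E>(f));
    assert(base_offset<C>(f) == 0);  // leftmost base shares F's address
    assert(base_offset<E>(f) > 0);   // second base sits at a nonzero offset
}
```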

A compiler implementation is free to choose how it lays out the embedded base classes and the derived class. VC++ lays out the base class instance data in the order in which the base classes are declared, and places the derived class data last. The derived class data itself is, of course, also laid out in declaration order. (This rule is not absolute: as we will see, the layout differs when some base classes have virtual functions and others do not.)

2.5 Virtual inheritance

Let's return to the first-line manager example and consider this scenario: what happens if both Manager and Worker inherit from an Employee class?
struct Employee { ... };
struct Manager : Employee { ... };
struct Worker : Employee { ... };
struct MiddleManager : Manager, Worker { ... };
If both Manager and Worker inherit from Employee, each of them naturally gets its own copy of the Employee data. Without special treatment, an instance of MiddleManager contains two Employee instances, one from each base class. If Employee has few member variables, the problem is not serious; if it has many, the extra copy imposes a significant cost on every instance. Worse, the two Employee instances can be modified independently, which leaves the data inconsistent. We therefore need a special declaration from Manager and Worker stating that they are willing to share a single instance of their Employee base class.
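The duplication and the resulting inconsistency can be seen in a few lines. This is a sketch for illustration: the member id and the helper functions are hypothetical, since the article elides the class bodies.

```cpp
#include <cassert>

struct Employee { int id = 0; };  // `id` is a hypothetical member, not from the article
struct Manager : Employee {};
struct Worker : Employee {};
struct MiddleManager : Manager, Worker {};

// Each path through the hierarchy reaches a different Employee subobject.
Employee* employee_via_manager(MiddleManager& m) { return static_cast<Manager*>(&m); }
Employee* employee_via_worker(MiddleManager& m)  { return static_cast<Worker*>(&m); }

void demo() {
    MiddleManager m;
    Employee* a = employee_via_manager(m);
    Employee* b = employee_via_worker(m);
    assert(a != b);      // two distinct Employee copies
    a->id = 1;           // updating one copy...
    assert(b->id == 0);  // ...leaves the other one stale
}
```

Note that a direct conversion from MiddleManager* to Employee* would not compile here, because the compiler cannot tell which of the two copies is meant.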

Unfortunately, in C++ this "sharing inheritance" is called "virtual inheritance", which makes the idea sound abstract. The syntax is simple: add the virtual keyword when you name the base class.
struct Employee { ... };
struct Manager : virtual Employee { ... };
struct Worker : virtual Employee { ... };
struct MiddleManager : Manager, Worker { ... };
Virtual inheritance has greater implementation and invocation overhead than single or multiple inheritance. Recall that under single and multiple inheritance, an embedded base class instance either shares the address of the derived class instance (single inheritance, and the leftmost base class under multiple inheritance) or sits at a fixed offset from it (the non-leftmost base classes under multiple inheritance). Under virtual inheritance, however, the offset between a derived class and its virtual base class is in general not fixed, because if the derived class is derived from further, the new derived class may place the shared virtual base class data at a different offset from the one used by the class above it. Consider the following example:

struct G : virtual C { int g1; void gf(); };
Translator's note: GdGvbptrG ("in G, the displacement from G's virtual base pointer to G") is the offset between the G object's address and G's virtual base table pointer; it is 0, because the virtual base table pointer is the first item in G's memory layout. GdGvbptrC ("in G, the displacement from G's virtual base pointer to C") is the offset between the C object's address and G's virtual base table pointer, which is 8 in the figure in the original article.

struct H : virtual C { int h1; void hf(); };

struct I : G, H { int i1; void _if(); };

Leave aside the vbptr member variables for the moment. From the figures in the original article you can see that in a G object, the embedded C object's data immediately follows G's data, and in an H object, the embedded C object's data immediately follows H's data. In an I object, however, the layout is different. In VC++'s layout, the offset between the G object and the C object in a standalone G instance differs from the offset between the G object and the C object in an I instance. When a virtual base class member is accessed through a pointer, the pointer may be a base class pointer that actually refers to an instance of a derived class, so the compiler cannot compute the offset from the declared pointer type; it must find an indirect way to locate the virtual base class from the derived class pointer.
In VC++, every instance of a class that inherits from a virtual base class gets a hidden "virtual base table pointer" (vbptr) member variable, which is used to locate the virtual base class indirectly. The vbptr points to an offset table shared by all instances of the class; the entries of the table record, for that class, the offsets between the virtual base table pointer and its virtual base classes.
Some other implementations embed pointer member variables in the derived class instead. Each of these pointers points to one virtual base class of the derived class, one pointer per virtual base class. The advantage of this approach is that less code is needed to obtain the address of a virtual base class. In practice, however, an optimizing compiler can usually avoid recomputing the virtual base class address anyway. This approach also has significant drawbacks: a class that derives from several virtual base classes needs more memory per instance, and reaching a virtual base class of a virtual base class takes several pointer dereferences, which is less efficient.

In VC++, G has a hidden vbptr member that points to a virtual base table whose second entry is GdGvbptrC: in G, the offset between the address of the virtual base class object C and G's vbptr. (The prefix before the "d" is omitted when the offset is the same for all derived classes.) For example, on a 32-bit platform GdGvbptrC is 8 bytes. Similarly, the G subobject inside an I instance also has a vbptr, but it points to a virtual base table specific to "G within I", in which the corresponding entry, IdGvbptrC, is 20.

Looking at G, H, and I above, we can draw the following conclusions about how VC++ lays out memory under virtual inheritance:
1 First, lay out the instances of the non-virtual base classes;
2 If the class has virtual base classes, add a hidden vbptr, unless a vbptr has already been inherited from a non-virtual base class;
3 Next, lay out the new data members of the derived class;
4 Finally, at the end of the instance, lay out one instance of each virtual base class.

This layout lets the position of a virtual base class "float" as further classes are derived, while the non-virtual base classes stay together, each at a fixed offset.
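Both effects, the single shared C instance and the "floating" position of the virtual base, can be observed from ordinary code. The sketch below is illustrative: g_to_c, shares_one_c and demo are hypothetical names, the shared-instance check is guaranteed by the language, and the differing offsets are what MSVC and Itanium-ABI compilers (GCC, Clang) produce for this hierarchy rather than a requirement of the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct C { int c1; };
struct G : virtual C { int g1; };
struct H : virtual C { int h1; };
struct I : G, H { int i1; };

// Byte distance from a G (sub)object to its virtual base C, found at run time.
std::ptrdiff_t g_to_c(G& g) {
    C* pc = &g;  // the compiler consults its hidden table to locate C
    return reinterpret_cast<char*>(pc) - reinterpret_cast<char*>(&g);
}

// True when the G path and the H path reach the same C subobject.
bool shares_one_c(I& i) {
    return static_cast<C*>(static_cast<G*>(&i)) == static_cast<C*>(static_cast<H*>(&i));
}

void demo() {
    G g;
    I i;
    std::printf("G to C: %td in a standalone G, %td inside an I\n", g_to_c(g), g_to_c(i));
    assert(shares_one_c(i));  // guaranteed by virtual inheritance
}
```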

3 Member variables

Having described class layout, we now consider the cost of accessing member variables under the different kinds of inheritance.

No inheritance: Without inheritance, accessing a member variable works exactly as in C: add a fixed offset to the pointer to the object.
C* pc;
pc->c1; // *(pc + dCc1);
pc is a pointer to C.
A. To access C's member variable c1, simply add the fixed offset dCc1 (in C, the offset between the address of the C object and its member c1) to pc, and then read the contents at the resulting address.

Single inheritance: Because the offset between a derived class instance and its embedded base class instance is the constant 0, the computation simplifies to using the offset between the base class pointer and the base class member directly.
D* pd;
pd->c1; // *(pd + dDC + dCc1); // *(pd + dDc1);
pd->d1; // *(pd + dDd1);
Translator's note: D inherits from C, and pd is a pointer to D.
A. When accessing the base class member c1, the full computation would be "pd + dDC + dCc1": first add the offset between the D object and the C object, then add the offset between the C object's address and its member c1. Since dDC is always 0, the offset between the C object's address and c1 can be used directly.
B. When accessing the derived class member d1, the offset is used directly.

Multiple inheritance: Although the offset between a derived class and a base class may be nonzero, it is always a constant. Because it is a constant, the computation of a member variable's offset can be folded into a single constant. So even under multiple inheritance, accessing member variables is still cheap.
F* pf;
pf->c1; // *(pf + dFC + dCc1); // *(pf + dFc1);
pf->e1; // *(pf + dFE + dEe1); // *(pf + dFe1);
pf->f1; // *(pf + dFf1);
F inherits from C and E, and pf is a pointer to an F object.
A. When accessing member c1 of class C, the offset between the F object and the embedded C object is 0, so the offset between F and c1 can be computed directly;
B. When accessing member e1 of class E, the offset between the F object and the embedded E object is a constant, so the offset between F and e1 can be folded into one constant;
C. When accessing F's own member f1, the offset is used directly.
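The "pointer plus constant offset" form in the comments can be reproduced by hand. This is a rough sketch, not how one should write production code: offset_in, read_int_at and demo are hypothetical helpers, and the offset is measured at run time here only because the example cannot see the compiler's internal constant dFe1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct C { int c1; };
struct E { int e1; };
struct F : C, E { int f1; };

// Measures the byte offset of `member` from the start of `f`.
std::ptrdiff_t offset_in(const F& f, const int& member) {
    return reinterpret_cast<const char*>(&member) - reinterpret_cast<const char*>(&f);
}

// Reads an int at a byte offset from pf: the "*(pf + dFe1)" form from the text.
int read_int_at(const F* pf, std::ptrdiff_t offset) {
    int value;
    std::memcpy(&value, reinterpret_cast<const char*>(pf) + offset, sizeof value);
    return value;
}

void demo() {
    F first{};
    const std::ptrdiff_t dFe1 = offset_in(first, first.e1);  // measured once
    F second{};
    second.c1 = 1; second.e1 = 2; second.f1 = 3;
    assert(read_int_at(&second, dFe1) == second.e1);  // the same constant works for every F
}
```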

Virtual inheritance: When a class has virtual base classes, accessing a member of a non-virtual base class is still a matter of adding a fixed offset. Accessing a member variable of a virtual base class, however, costs more, because the member's address must be obtained through the following steps:
1. Load the virtual base table pointer (vbptr);
2. Load the appropriate entry from the virtual base table;
3. Add the offset stored in that entry to the address of the vbptr.

This is not always necessary, however. As the access to member c1 of an I object below shows, when the access goes directly through an object instance rather than through a pointer, the layout of the derived class is known statically, so the offset can be computed at compile time and no lookup in the virtual base table is needed.
I* pi;
pi->c1; // *(pi + dIGvbptr + (*(pi + dIGvbptr))[1] + dCc1);
pi->g1; // *(pi + dIG + dGg1); // *(pi + dIg1);
pi->h1; // *(pi + dIH + dHh1); // *(pi + dIh1);
pi->i1; // *(pi + dIi1);
I i;
i.c1; // *(&i + IdIC + dCc1); // *(&i + IdIc1);
I inherits from G and H, the virtual base class of G and H is C, and pi is a pointer to an I object.
A. When accessing member c1 of the virtual base class C: dIGvbptr is the offset, in I, between the I object's address and G's vbptr; *(pi + dIGvbptr) is the start address of the virtual base table; (*(pi + dIGvbptr))[1] is the second entry of the table (in an I object, the offset between the G subobject's vbptr and the virtual base class); and dCc1 is the offset between the C object's address and its member c1;
B. When accessing member g1 of the non-virtual base class G, the offset is used directly;
C. When accessing member h1 of the non-virtual base class H, the offset is used directly;
D. When accessing I's own member i1, the offset is used directly;
E. When an object instance is declared and the virtual base class member c1 is accessed with the "." operator, the offset can be computed directly, because the layout of the object is fully known at compile time.
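The three-step lookup can be modelled with ordinary structs and explicit tables. The sketch below is a hand-written model of the mechanism, not the code VC++ generates: GModel, IModel, the k...table arrays and find_virtual_base are all hypothetical names, and real compilers keep the vbptr and its table hidden.

```cpp
#include <cassert>
#include <cstddef>

struct CData { int c1; };

// Model of a standalone G: [vbptr][g1][C]
struct GModel {
    const std::ptrdiff_t* vbptr;
    int g1;
    CData c;
};

// Model of I: [G part: vbptr, g1][H part: vbptr, h1][i1][C]
struct IModel {
    const std::ptrdiff_t* g_vbptr;
    int g1;
    const std::ptrdiff_t* h_vbptr;
    int h1;
    int i1;
    CData c;
};

// Entry [0]: distance from the vbptr to its own class. Entry [1]: distance to the virtual base.
const std::ptrdiff_t kG_table[2] = {
    0, static_cast<std::ptrdiff_t>(offsetof(GModel, c) - offsetof(GModel, vbptr))};
const std::ptrdiff_t kGinI_table[2] = {
    0, static_cast<std::ptrdiff_t>(offsetof(IModel, c) - offsetof(IModel, g_vbptr))};
const std::ptrdiff_t kHinI_table[2] = {
    0, static_cast<std::ptrdiff_t>(offsetof(IModel, c) - offsetof(IModel, h_vbptr))};

// Works without knowing whether the vbptr belongs to a standalone G or to a part of I.
CData* find_virtual_base(const std::ptrdiff_t** vbptr_field) {
    const std::ptrdiff_t* table = *vbptr_field;             // 1. load the vbptr
    std::ptrdiff_t displacement = table[1];                 // 2. load the table entry
    char* from = reinterpret_cast<char*>(vbptr_field);
    return reinterpret_cast<CData*>(from + displacement);   // 3. add it to the vbptr's address
}

void demo() {
    GModel g = {kG_table, 1, {10}};
    IModel i = {kGinI_table, 1, kHinI_table, 2, 3, {20}};
    assert(find_virtual_base(&g.vbptr) == &g.c);
    assert(find_virtual_base(&i.g_vbptr) == &i.c);
    assert(find_virtual_base(&i.h_vbptr) == &i.c);
    assert(kG_table[1] != kGinI_table[1]);  // same class G, different displacement inside I
}
```

With 4-byte pointers and ints, the two G-related displacements in this model come out as 8 and 20, which matches the GdGvbptrC and IdGvbptrC values quoted for a 32-bit platform.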

What happens when you access a member variable of a virtual base class that is several levels up the inheritance hierarchy, for example a member of a virtual base class of a virtual base class? Some implementations store a pointer to the direct virtual base class, find its virtual base class from there, and continue up the hierarchy one level at a time. VC++ optimizes this process: it adds extra entries to the virtual base table that hold the offsets from the derived class to each of its virtual base classes at every level.

4 Casts

Except where virtual base classes are involved, casting a pointer to another pointer type is not expensive. If the two pointer types are related as base class and derived class, the compiler simply adds or subtracts the offset between the two (which is often 0).
F* pf;
(C*)pf; // (C*)(pf ? pf + dFC : 0); // (C*)pf;
(E*)pf; // (E*)(pf ? pf + dFE : 0);

C and E are base classes of F. Converting pf, a pointer to F, to C* or E* only requires adding the corresponding offset to pf. Converting to C* requires no computation, because the offset between F and C is 0. Converting to E* requires adding the nonzero constant dFE to the pointer. The C++ specification requires that a null pointer remain null after a cast, so VC++ checks whether the pointer is null before performing the addition. This check is only needed when a pointer is explicitly or implicitly converted to a related pointer type. It is not needed when a base class member function is invoked on a derived class object and the derived class pointer is converted behind the scenes into the const "this" pointer of the base class, because in that case the pointer cannot be null.
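The null check described above can be written out by hand and compared with the conversion the compiler performs. This is a minimal sketch: to_E, to_E_by_hand and demo are hypothetical names, and dFE is measured at run time because the compiler's internal constant is not visible to the program.

```cpp
#include <cassert>
#include <cstddef>

struct C { int c1; };
struct E { int e1; };
struct F : C, E { int f1; };

// The conversion as the compiler performs it.
E* to_E(F* pf) { return pf; }

// Hand-written equivalent of "(E*)(pf ? pf + dFE : 0)" from the text.
E* to_E_by_hand(F* pf, std::ptrdiff_t dFE) {
    return pf ? reinterpret_cast<E*>(reinterpret_cast<char*>(pf) + dFE) : nullptr;
}

void demo() {
    F f{};
    const std::ptrdiff_t dFE =
        reinterpret_cast<char*>(to_E(&f)) - reinterpret_cast<char*>(&f);
    assert(to_E_by_hand(&f, dFE) == to_E(&f));      // same adjustment
    assert(to_E(nullptr) == nullptr);               // null stays null
    assert(to_E_by_hand(nullptr, dFE) == nullptr);  // which is why the check is needed
}
```

Without the check, a null F* would turn into a non-null E* equal to dFE, which is exactly what the language forbids.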

As you might guess, casts are more expensive when virtual base classes are involved. Specifically, the cost is comparable to that of accessing a virtual base class member variable.
I* pi;
(G*)pi; // (G*)pi;
(H*)pi; // (H*)(pi ? pi + dIH : 0);
(C*)pi; // (C*)(pi ? (pi + dIGvbptr + (*(pi + dIGvbptr))[1]) : 0);
pi is a pointer to an I object; G and H are base classes of I; C is the virtual base class of G and H.
A. Casting pi to G* requires no computation, because a G* and an I* have the same address;
B. Casting pi to H* only requires adding a constant offset;
C. Casting pi to C* costs the same as accessing a virtual base class member variable: first obtain G's virtual base table pointer, then read the offset from G to the virtual base class C from the second entry of the table, and finally compute the C* from pi, the offset of the vbptr, and that table entry.

In general, when you access virtual base class members from a derived class repeatedly, it is better to convert the derived class pointer to a virtual base class pointer once and then use that pointer for the accesses. This avoids paying the cost of computing the virtual base class address every time. See the following example.

/* before: */ ... pi->c1 ... pi->c1 ...
/* faster: */ C* pc = pi; ... pc->c1 ... pc->c1 ...
Translator's note: The first version always uses the derived class pointer pi, so every access to c1 pays the cost of computing the virtual base class address. The second version first converts pi to the virtual base class pointer pc, so the subsequent accesses avoid that cost.
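The before/faster pattern looks like this in a complete function. It is a sketch for illustration with hypothetical function names; both versions compute the same result, and a modern optimizer may well hoist the lookup out of the loop on its own, so any speed difference should be measured rather than assumed.

```cpp
#include <cassert>

struct C { int c1 = 0; };
struct G : virtual C { int g1 = 0; };
struct H : virtual C { int h1 = 0; };
struct I : G, H { int i1 = 0; };

// "Before": every pi->c1 conceptually locates the virtual base again.
int bump_through_derived(I* pi, int times) {
    for (int k = 0; k < times; ++k) ++pi->c1;
    return pi->c1;
}

// "Faster": locate the virtual base once, then use plain fixed-offset accesses.
int bump_through_cached_base(I* pi, int times) {
    C* pc = pi;  // the one conversion that consults the virtual base table
    for (int k = 0; k < times; ++k) ++pc->c1;
    return pc->c1;
}

void demo() {
    I a;
    I b;
    assert(bump_through_derived(&a, 5) == 5);
    assert(bump_through_cached_base(&b, 5) == 5);
}
```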

5 Member functions

A C++ member function is just another kind of member within the scope of a class. Every non-static member function of a class X receives a special hidden parameter, the this pointer, of type X* const. The pointer is initialized behind the scenes to point to the object on which the member function is invoked. Likewise, inside a member function, member variables are accessed behind the scenes as offsets from the this pointer.

struct P {
    int p1;
    void pf(); // new
    virtual void pvf(); // new
};

P has a non-virtual member function pf() and a virtual member function pvf(). Clearly, the virtual member function makes each instance occupy more memory, because virtual member functions require a virtual function table pointer; this is discussed later. Note in particular that declaring a non-virtual member function adds no memory overhead to the instance. Now consider the definition of P::pf().
void P::pf() { // void P::pf([P *const this])
    ++p1; // ++(this->p1);
}

Here P::pf() receives a hidden this parameter, which the compiler supplies automatically in every member function call. Note also that accessing a member variable can be more expensive than it looks, because the access goes through the this pointer, and in some inheritance hierarchies the this pointer has to be adjusted first. On the other hand, the compiler usually keeps the this pointer in a register, so accessing a member variable is generally no more expensive than accessing a local variable.
Translator's note: To access a local variable, the program takes the stack pointer from the SP register and adds the offset between the local variable and the top of the stack. When there is no virtual base class and the compiler keeps the this pointer in a register, accessing a member variable costs about the same as accessing a local variable.
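The hidden parameter can be made visible by writing the same function as a free function. This is a sketch: P_pf and its parameter name self are hypothetical (this is a keyword and cannot be used as a parameter name), and the virtual function is left out so that the size check is meaningful.

```cpp
#include <cassert>

struct P {
    int p1;
    void pf() { ++p1; }  // the compiler passes `this` implicitly
};

// Hypothetical free-function spelling of P::pf with the hidden parameter written out.
void P_pf(P* const self) { ++(self->p1); }

void demo() {
    P a{0};
    P b{0};
    a.pf();    // member call: the object's address becomes `this`
    P_pf(&b);  // explicit equivalent
    assert(a.p1 == 1 && b.p1 == 1);
    assert(sizeof(P) == sizeof(int));  // the non-virtual member function adds no instance data
}
```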

5.1 Overriding member functions

Like member variables, member functions are inherited. Unlike member variables, a base class member function can be overridden, or replaced, by redefining it in the derived class. Whether the override is static (resolved at compile time from the static type of the pointer or object) or dynamic (resolved at run time from the object the pointer refers to) depends on whether the member function is declared virtual.

Q inherits the member variables and member functions of P. Q declares pf(), overriding P::pf(). Q also declares pvf(), overriding the virtual function P::pvf(). Q additionally declares a new non-virtual member function qf() and a new virtual member function qvf().

struct Q : P {
    int q1;
    void pf(); // overrides P::pf
    void qf(); // new
    void pvf(); // overrides P::pvf
    virtual void qvf(); // new
};
For non-virtual member functions, the function to call is determined statically at compile time, from the type of the pointer expression to the left of the "->" operator. In particular, even though ppq points to an instance of Q, ppq->pf() still calls P::pf(), because ppq is declared as P*. (Note that the type of the pointer to the left of "->" also determines the type of the hidden this parameter.)
P p; P* pp = &p; Q q; P* ppq = &q; Q* pq = &q;
pp->pf(); // pp->P::pf(); // P::pf(pp);
ppq->pf(); // ppq->P::pf(); // P::pf(ppq);
pq->pf(); // pq->Q::pf(); // Q::pf((P*)pq); (error!)
pq->qf(); // pq->Q::qf(); // Q::qf(pq);
Translator's note: Where the line is marked "error", P* should apparently be Q*. Because pf is not a virtual function and pq has type Q*, the call resolves to Q's pf, which requires a this pointer of type Q* const.

For virtual function calls, the member function to call is determined at run time. Regardless of the type of the pointer expression to the left of "->", the virtual function invoked is determined by the type of the instance the pointer actually points to. For example, although ppq has type P*, when it points to an instance of Q, the function called is Q::pvf().
pp->pvf(); // pp->P::pvf(); // P::pvf(pp);
ppq->pvf(); // ppq->Q::pvf(); // Q::pvf((Q*)ppq);
pq->pvf(); // pq->Q::pvf(); // Q::pvf((P*)pq); (error!)
Translator's note: Where the line is marked "error", P* should apparently be Q*. Because pvf is a virtual function, pq has type Q*, and it points to an instance of Q, there is no reason for the cast to be P*.
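The difference between static and dynamic resolution is easy to observe if each function reports its own name. The following is a sketch with the bodies filled in for illustration; returning std::string and adding a virtual destructor are choices made for the example, not part of the article's declarations.

```cpp
#include <cassert>
#include <string>

struct P {
    int p1 = 0;
    std::string pf() { return "P::pf"; }            // non-virtual
    virtual std::string pvf() { return "P::pvf"; }  // virtual
    virtual ~P() = default;
};

struct Q : P {
    int q1 = 0;
    std::string pf() { return "Q::pf"; }              // hides P::pf
    std::string pvf() override { return "Q::pvf"; }   // overrides P::pvf
};

void demo() {
    Q q;
    P* ppq = &q;
    Q* pq = &q;
    assert(ppq->pf() == "P::pf");       // chosen from the pointer type P*
    assert(pq->pf() == "Q::pf");        // chosen from the pointer type Q*
    assert(ppq->pvf() == "Q::pvf");     // chosen from the object, which is a Q
    assert(ppq->P::pvf() == "P::pvf");  // explicit qualification turns dispatch off
}
```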

This mechanism is implemented with a hidden vfptr member variable. A vfptr is added to the class (if it does not already have one) and points to the class's virtual function table (vftable). Each virtual function of the class occupies one entry in the table, and each entry holds the address of the implementation of that virtual function that applies to the class. Calling a virtual function therefore proceeds as follows: load the instance's vfptr, use it to load the appropriate entry of the virtual function table, and call the function indirectly through the address stored in that entry. In other words, on top of the argument passing, call, and return instructions of an ordinary function call, a virtual function call incurs the extra cost of these loads.
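The call sequence can be imitated with an explicit table of function pointers. The sketch below is a hand-written model, not compiler output: PModel, VFunc, the two tables and call_pvf are hypothetical names, and the bodies of the two pvf variants are invented so that the calls can be told apart.

```cpp
#include <cassert>

struct PModel;
using VFunc = int (*)(PModel*);  // every slot receives the object as its explicit "this"

struct PModel {
    const VFunc* vfptr;  // hidden in a real class
    int p1;
};

// Invented bodies standing in for P::pvf and an overriding Q::pvf.
int P_pvf(PModel* self) { return self->p1; }
int Q_pvf(PModel* self) { return self->p1 * 2; }

// One table per class, shared by all of its instances.
const VFunc kPVftable[] = { P_pvf };
const VFunc kQVftable[] = { Q_pvf };

// The three steps from the text.
int call_pvf(PModel* obj) {
    const VFunc* table = obj->vfptr;  // 1. load the vfptr
    VFunc fn = table[0];              // 2. load the entry for pvf
    return fn(obj);                   // 3. call indirectly, passing the object
}

void demo() {
    PModel p = {kPVftable, 21};
    PModel q = {kQVftable, 21};  // stands in for an object whose dynamic type is Q
    assert(call_pvf(&p) == 21);
    assert(call_pvf(&q) == 42);
}
```

The caller's code is identical for both objects. Only the table reached through vfptr differs, which is what makes the dispatch dynamic.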

Looking back at the layouts of P and Q, you can see that the VC++ compiler places the hidden vfptr at the beginning of P and Q instances. This helps make virtual function calls as fast as possible. In fact, the VC++ implementation ensures that the vfptr is always the first item in any class that has virtual functions. When laying out an instance, this may require inserting a new vfptr in front of the base classes, or, under multiple inheritance, moving a base class that has a vfptr, even though it is declared further to the right, in front of a base class to its left that has none (see below).
class CA { int a; };
class CB { int b; };
class CL : public CB, public CA { int c; };
For class CL, the memory layout is:
int b;
int a;
int c;
However, if CA is changed as follows:
class CA { int a; virtual void seta(int _a) { a = _a; } };
then for CL with the same inheritance order, the memory layout becomes:
int a;
int b;
int c;
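The reordering can be observed by measuring where each base lands inside CL. This is a sketch under assumptions: struct is used instead of class so the bases are accessible, base_offset and demo are hypothetical names, and the exact offsets are compiler-specific. VC++ behaves as the text describes, and Itanium-ABI compilers such as GCC and Clang are also expected to place the base with virtual functions first, but neither is required by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct CA { int a; virtual void seta(int _a) { a = _a; } };  // has a vfptr
struct CB { int b; };                                        // has none
struct CL : CB, CA { int c; };                               // CB is declared first

// Byte distance from the start of obj to its Base subobject.
template <class Base>
std::ptrdiff_t base_offset(CL& obj) {
    Base* pb = &obj;
    return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&obj);
}

void demo() {
    CL obj;
    std::printf("CA at +%td, CB at +%td\n", base_offset<CA>(obj), base_offset<CB>(obj));
    assert(base_offset<CA>(obj) != base_offset<CB>(obj));  // two distinct subobjects
}
```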

Many C++ implementations share or reuse the vfptr inherited from a base class. For example, Q does not get an additional vfptr pointing to a separate virtual function table just for its new virtual function qvf(). Instead, the qvf entry is simply appended to the end of P's virtual function table layout. This keeps single inheritance cheap: once an instance has a vfptr, it does not need another. New derived classes can introduce more virtual functions, whose entries are simply appended to the end of the existing per-class virtual function table.

5.2 Virtual functions under multiple inheritance

If a class inherits from multiple base classes that have virtual functions, an instance may contain multiple vfptrs. Consider the following R and S classes:

struct R {
    int r1;
    virtual void pvf(); // new
    virtual void rvf(); // new
};

struct S : P, R {
    int s1;
    void pvf(); // overrides P::pvf and R::pvf
    void rvf(); // overrides R::rvf
    void svf(); // new
};

Here R is another class that contains virtual functions. Since S multiply inherits from P and R, an instance of S embeds instances of P and R as well as S's own data member S::s1. Note that under multiple inheritance, the right-hand base class R is embedded at an address different from that of P and S. S::pvf() overrides both P::pvf() and R::pvf(), and S::rvf() overrides R::rvf().
S s; S* ps = &s;
((P*)ps)->pvf(); // (*((P*)ps)->P::vfptr[0])((S*)(P*)ps)
((R*)ps)->pvf(); // (*((R*)ps)->R::vfptr[0])((S*)(R*)ps)
ps->pvf();       // one of the above; calls S::pvf()
Translator's note:
 To call ((P*)ps)->pvf(), the first item is fetched from P's virtual function table, and then ps is converted to an S* and passed in as the this pointer;
 To call ((R*)ps)->pvf(), the first item is fetched from R's virtual function table, and then ps is converted to an S* and passed in as the this pointer.

Because S::pvf() overrides both P::pvf() and R::pvf(), the corresponding items in S's virtual function tables must be replaced as well. However, we soon notice that pvf() can be called not only through a P* but also through an R*. Here the problem arises: R is at a different address from P and S. The expressions (R*)ps and (P*)ps point to different locations in the class layout. Because the function S::pvf expects an S* as its hidden this pointer parameter, the R* must be converted to an S* before the virtual function runs. Therefore, in S's copy of R's virtual function table, the pvf slot holds the address of an "adjustment block" (a thunk), which performs the calculation needed to convert the R* into the required S*.
Translator's note: this is what "thunk1: this -= SDPR; goto S::pvf" does. It first uses the offset between P and R within S to adjust this to a P*, which is also an S*, and then jumps to the corresponding virtual function.
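The effect of this conversion is visible from ordinary C++ code, whatever mechanism the compiler uses underneath. In the sketch below the virtual functions return their this pointer so the caller can inspect it; that return type, the default member initializers, and the helpers subobjects_differ, call_pvf_via_P and call_pvf_via_R are assumptions made for the illustration and are not part of the original classes.

```cpp
struct P {
    int p1 = 0;
    virtual const void* pvf() { return this; }
};

struct R {
    int r1 = 0;
    virtual const void* pvf() { return this; }
    virtual const void* rvf() { return this; }
};

struct S : P, R {
    int s1 = 0;
    const void* pvf() override { return this; }   // overrides P::pvf and R::pvf
    const void* rvf() override { return this; }   // overrides R::rvf
    virtual const void* svf() { return this; }    // new
};

// The P and R subobjects of an S live at different addresses.
bool subobjects_differ(S& s) {
    return static_cast<const void*>(static_cast<P*>(&s)) !=
           static_cast<const void*>(static_cast<R*>(&s));
}

// Both calls reach S::pvf, and in both cases "this" is the S object.
const void* call_pvf_via_P(S& s) { P* p = &s; return p->pvf(); }
const void* call_pvf_via_R(S& s) { R* r = &s; return r->pvf(); }
```

Although the call through R* starts from a different address, S::pvf still sees the address of the whole S object, which is exactly the adjustment the thunk is responsible for.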

In the Microsoft VC++ implementation, for multiple inheritance with virtual functions, an adjustment block is needed only when a virtual function of the derived class overrides virtual functions of more than one base class.

5.3 Address points and "logical this adjustment"

Consider the next virtual function, S::rvf(), which overrides R::rvf(). We know that S::rvf() must have a hidden this parameter of type S*. However, rvf() can also be called through an R*, so R's rvf virtual function slot may be used in the following way:
((R*)ps)->rvf(); // (*((R*)ps)->R::vfptr[1])((R*)ps)
So most implementations use another adjustment block to convert the R* passed to rvf into an S*. Some implementations also add an extra virtual function entry at the end of S's virtual function table, so that ps->rvf() can be called directly without first converting to an R*. MSC++ does neither. Instead, it intentionally compiles S::rvf() to expect a this pointer that addresses the R instance embedded in the S rather than the S instance itself (we call this "giving the address point of the class that first introduced the virtual function"). All of this happens transparently behind the scenes: every member variable access and every use of the this pointer inside the member function has the "logical this adjustment" applied to it.

Of course, the this adjustment must be compensated for in the debugger.
ps->rvf(); // ((R*)ps)->rvf(); // S::rvf((R*)ps)
When the rvf virtual function is called, the R* is passed directly as the this pointer.

Therefore, when overriding a virtual function of a base class that is not the leftmost one, MSC++ generally does not create an adjustment block or add extra virtual function entries.

5.4 Adjustment Block

As already described, an adjustment block is sometimes needed to adjust the value of the this pointer (which usually sits just below the return address on the stack, or in a register): it adds or subtracts a constant offset to the this pointer and then calls the virtual function. Some implementations, especially those based on Cfront, do not use the adjustment block mechanism. Instead, they add extra offset data to each virtual function table entry. Whenever a virtual function is called, the offset (usually 0) is added to the address of the object, and the result is passed in as the this pointer.

ps->rvf();
// struct { void (*pfn)(void*); size_t disp; };
// (*ps->vfptr[i].pfn)(ps + ps->vfptr[i].disp);
The first comment shows that each virtual function table item is a struct that also contains an offset. The second shows that when the i-th virtual function is called, the this pointer is adjusted by the offset stored in item i of the virtual function table.

The disadvantage of this method is that the virtual function table becomes larger and every virtual function call becomes more complicated.
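The table-with-displacement scheme can be emulated by hand. This is a rough sketch, not Cfront's actual output: Entry, PPart, RPart, SObj, S_pvf, disp_R_to_S, the two tables and the two cfront_call_ helpers are hypothetical names. The displacement is applied in bytes and is signed, which is an assumption of this sketch.

```cpp
#include <cstddef>

// Each table entry carries a function pointer and a displacement in bytes.
struct Entry {
    int (*pfn)(void* self);                  // function to call
    std::ptrdiff_t disp;                     // added to the object address
};

// A hand-laid-out "S": the P part, then the R part, then S's own member.
struct PPart { const Entry* vfptr; int p1; };
struct RPart { const Entry* vfptr; int r1; };
struct SObj  { PPart p; RPart r; int s1; };

// Plays the role of S::pvf; expects the address of the whole SObj.
int S_pvf(void* self) { return static_cast<SObj*>(self)->s1; }

// Displacement from the R part back to the start of the SObj (negative).
constexpr std::ptrdiff_t disp_R_to_S =
    -static_cast<std::ptrdiff_t>(offsetof(SObj, r));

const Entry cfront_S_as_P[] = { { S_pvf, 0 } };
const Entry cfront_S_as_R[] = { { S_pvf, disp_R_to_S } };

// Every call adds the entry's displacement, even when it is 0.
int cfront_call_via_P(PPart* p) {
    const Entry& e = p->vfptr[0];
    return e.pfn(reinterpret_cast<char*>(p) + e.disp);
}

int cfront_call_via_R(RPart* r) {
    const Entry& e = r->vfptr[0];
    return e.pfn(reinterpret_cast<char*>(r) + e.disp);
}
```

Note that the call through the P part pays for the addition too, even though its displacement is 0; this is the per-call cost mentioned above.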

Modern PC-based implementations generally use the "adjust-and-jump" technique:
S::pvf-adjust: // MSC++
    this -= SDPR;
    goto S::pvf()
Of course, the following code sequence is better still (however, no current implementation adopts it):
S::pvf-adjust:
    this -= SDPR; // fall into S::pvf()
S::pvf() { ... }
Translator's note: IBM's C++ compiler uses this method.
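The adjust-and-jump idea can also be imitated in portable code. In this sketch the "jump" is an ordinary function call, because standard C++ cannot express a jump into another function; PSub, RSub, SWhole, S_pvf_impl, S_pvf_adjust, the two tables and dispatch_slot0 are hypothetical names, and offsetof(SWhole, r) stands in for the constant written as SDPR above.

```cpp
#include <cstddef>

using ThunkFn = int (*)(void* self);

// A hand-laid-out "S": the P part, then the R part, then S's own member.
struct PSub   { const ThunkFn* vfptr; int p1; };
struct RSub   { const ThunkFn* vfptr; int r1; };
struct SWhole { PSub p; RSub r; int s1; };

// Plays the role of S::pvf; expects the address of the whole SWhole.
int S_pvf_impl(void* self) { return static_cast<SWhole*>(self)->s1; }

// Adjustment block: subtract the P-to-R displacement, then go to S_pvf_impl.
int S_pvf_adjust(void* r_self) {
    char* s_self = static_cast<char*>(r_self) - offsetof(SWhole, r);
    return S_pvf_impl(s_self);               // a real thunk would jump, not call
}

// Only the table reached through the R part points at the adjustment block.
const ThunkFn thunk_S_as_P[] = { S_pvf_impl };
const ThunkFn thunk_S_as_R[] = { S_pvf_adjust };

// The call site is the same plain indirect call in both cases.
int dispatch_slot0(const ThunkFn* vfptr, void* self) { return vfptr[0](self); }
```

Compared with storing a displacement in every table entry, only calls that actually need an adjustment pay for one, and the call site stays a simple indirect call.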

5.5 Virtual functions under virtual inheritance

T virtually inherits from P, overrides a virtual member function of P, and declares a new virtual function. If the new item were simply appended to the end of the base class's virtual function table, every access to the new virtual function would have to go through the virtual base class. In VC++, to avoid this costly conversion to the virtual base when fetching a virtual function table item, the new virtual functions of T are reached through a new virtual function table, which in turn requires a new virtual function table pointer. This pointer is placed at the top of the T instance.

struct T : virtual P {
    int t1;
    void pvf();         // overrides P::pvf
    virtual void tvf(); // new
};
void T::pvf() {
    ++p1; // ((P*)this)->p1++; // vbtable lookup!
    ++t1; // this->t1++;
}
As shown above, even inside a virtual function, access to a member variable of the virtual base class is still computed by fetching the virtual base's offset from the virtual base table (vbtable). This is necessary because the virtual function may be inherited by a further derived class, in whose layout the virtual base class sits at a different position. Here is such a class:

struct U : T { int u1; };

Here U adds a member variable, which changes the offset of P. Because in the VC++ implementation T::pvf() expects a pointer to the P embedded in a T, an adjustment block must be provided that adjusts the this pointer to the address just after T::t1 (which is where P sits within a T).
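The position of the virtual base can be measured at run time, and the inherited T::pvf() can be checked to work on both T and U objects. This is a sketch under assumptions: P is given a single member p1 and an empty pvf() body, the default member initializers are added for the demonstration, and offset_of_P and pvf_updates_both are hypothetical helpers. Whether the two offsets actually differ depends on the compiler's layout rules and padding, so the sketch reports them without assuming they are unequal.

```cpp
#include <cstddef>

struct P {
    int p1 = 0;
    virtual void pvf() {}
};

struct T : virtual P {
    int t1 = 0;
    void pvf() override { ++p1; ++t1; }      // p1 is reached through the virtual base
    virtual void tvf() {}                    // new
};

struct U : T { int u1 = 0; };

// Byte offset of the virtual base P inside a complete T or U object.
template <class Derived>
std::ptrdiff_t offset_of_P(Derived& d) {
    P* p = &d;                               // conversion to a virtual base, resolved at run time
    return reinterpret_cast<char*>(p) - reinterpret_cast<char*>(&d);
}

// Calls pvf() through a P* and checks that both counters went from 0 to 1.
template <class Derived>
bool pvf_updates_both(Derived& d) {
    P* p = &d;
    p->pvf();
    return d.p1 == 1 && d.t1 == 1;
}
```

Comparing offset_of_P on a T object and on a U object shows whether a given compiler moves P; in either case the same T::pvf() body has to find both p1 and t1 correctly.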
