C++ Object Memory Model

Source: Internet
Author: User

In C, data and the functions that process it are defined separately, with each function implementing its own algorithm. C++ provides classes, which encapsulate data together with the algorithms that operate on it. Compared with C, this encapsulation can carry some cost, and that cost is largely determined by the memory model that C++ objects use to support the corresponding language features.
C++ has two kinds of data members, static and non-static, and three kinds of member functions: static, non-static, and virtual. The memory layout of class objects evolved from the two basic models described first.

1. Simple object model

Each object is a sequence of slots, and each slot holds a pointer to one member (all data members and all member functions). The members are arranged in declaration order.

      / +-------+   -----> [non-static member]
      | +-------+   -----> [static member]
slots=| +-------+   -----> [static member function]
      | +-------+   -----> [non-static member function]
      \ +-------+   -----> [virtual member function]

Because every slot holds a pointer, this simple model avoids the problem of members of different types needing different amounts of storage, and the size of an object is easy to compute: the pointer size multiplied by the number of members. The idea of identifying a member by its slot survives in the implementation of pointers to members.
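The link to pointers to members can be illustrated with a short sketch. The type `Point` and the helper functions are hypothetical names chosen for this example; the point is only that a pointer to member designates "which member" independently of any particular object, much as a slot index would.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;
    int sum() const { return x + y; }
};

// Reads whichever int member `slot` designates, much as a slot index would.
int read_member(const Point& p, int Point::*slot) {
    return p.*slot;
}

// Calls whichever const member function `fn` designates.
int call_member(const Point& p, int (Point::*fn)() const) {
    return (p.*fn)();
}

int demo_member_pointers() {
    Point p = {3, 4};
    assert(read_member(p, &Point::x) == 3);
    assert(read_member(p, &Point::y) == 4);
    return call_member(p, &Point::sum);  // 3 + 4
}
```

Calling `demo_member_pointers()` returns 7. The same `read_member` function works for either data member because the member is selected by a value rather than by name.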

Table-driven object model

All data members are placed in one table, which holds the data itself. All member functions are placed in a second table, in which each entry points to the corresponding member function. The object itself contains only two pointers, one to each table.
The member function table of this model is an effective way to implement the C++ virtual function mechanism, and current C++ implementations are based on it.
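The member function table can be sketched by hand. The names below (`Shape`, `ShapeTable`, `rect_area`, and so on) are hypothetical and only illustrate the idea: each object stores a pointer to a table of function pointers, and a call is dispatched through that table. A real compiler generates an equivalent structure for virtual functions automatically.

```cpp
#include <cassert>

struct Shape;                              // forward declaration

struct ShapeTable {                        // the "member function table"
    int (*area)(const Shape* self);
};

struct Shape {
    const ShapeTable* table;               // pointer to the function table
    int w;
    int h;
};

int rect_area(const Shape* s)     { return s->w * s->h; }
int triangle_area(const Shape* s) { return s->w * s->h / 2; }

const ShapeTable kRectTable     = { rect_area };
const ShapeTable kTriangleTable = { triangle_area };

// Dispatches through the table, as a virtual call would.
int area(const Shape& s) { return s.table->area(&s); }

int demo_table_dispatch() {
    Shape r = { &kRectTable, 4, 6 };
    Shape t = { &kTriangleTable, 4, 6 };
    return area(r) * 100 + area(t);        // 24 * 100 + 12
}
```

Both objects have the same data layout; only the table pointer differs, and that alone selects the function that runs.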

2. C++ object memory layout

The C++ object layout evolved from the two basic models above. Its basic rules are:
- An object contains all of its non-static data members. Static data members and all member functions are stored outside the object.
- Static data members are placed in the global data area.
- Non-virtual member functions (static and non-static) are stored like ordinary non-member functions:
  - Static member functions are called directly.
  - Non-static member functions are called with the object's this pointer.
- Virtual member functions are implemented with a virtual function table (vtbl) and a virtual function table pointer (vptr). The vptr is generally placed at the start of the object.
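The first rules in this list can be checked with a small sketch. `Plain` and `WithExtras` are hypothetical types: they have the same non-static data members, and the second one adds a static data member and non-virtual member functions, which should add nothing to the size of each object.

```cpp
#include <cassert>
#include <cstddef>

struct Plain {
    int a;
    int b;
};

struct WithExtras {
    int a;
    int b;
    static int counter;                  // stored once, outside every object
    static int next() { return ++counter; }
    int sum() const { return a + b; }    // ordinary function with a hidden `this`
};
int WithExtras::counter = 0;

static_assert(sizeof(WithExtras) == sizeof(Plain),
              "static members and non-virtual functions add no per-object storage");

bool statics_live_outside_objects() {
    WithExtras x = {1, 2};
    WithExtras y = {3, 4};
    int before = WithExtras::counter;
    WithExtras::next();
    // Both objects observe the same single counter.
    return &x.counter == &y.counter
        && y.counter == before + 1
        && x.sum() + y.sum() == 10;
}
```

The `static_assert` holds on mainstream compilers because only `a` and `b` occupy space inside each object.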

The compiler adds a vptr only when the class has a virtual function, and it inserts the code that sets the vptr into the constructors, the destructor, and the copy assignment operator. In this model, the first entry of a class's virtual function table holds the type_info for that class, which is what supports C++ RTTI; the exact position of that entry depends on the compiler's ABI.
The main advantage of this layout is efficient use of space and fast access. The main drawback is that when the non-static data members of a class change, all code that uses objects of that class must be recompiled even if that code itself has not changed, because the object stores its data members directly.
The size of a C++ object is determined mainly by the following factors:
1. The total size of the non-static data members defined in the class
2. Padding inserted by the compiler to meet alignment requirements
3. The overhead of supporting virtual features (vptr and vbptr)
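A minimal sketch of these three factors, using two hypothetical types. The exact numbers depend on the platform; the comments assume a typical ABI with a 4-byte `int`.

```cpp
#include <cstddef>

struct Members {            // factor 1: the members themselves
    int  i;
    char c;
};

struct Polymorphic {        // factor 3: the same members plus a vptr
    int  i;
    char c;
    virtual ~Polymorphic() {}
};

// Factor 2: bytes the compiler added beyond the raw member sizes.
std::size_t padding_bytes() {
    return sizeof(Members) - (sizeof(int) + sizeof(char));
}

// Extra bytes that result from making the class polymorphic.
std::size_t virtual_overhead() {
    return sizeof(Polymorphic) - sizeof(Members);
}
```

On a typical platform `padding_bytes()` is 3, because the struct is rounded up to a multiple of the alignment of `int`, and `virtual_overhead()` is at least the size of one pointer.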

In addition, C++ supports polymorphism only through references and pointers; accessing an object directly by value does not give polymorphic behavior. Polymorphism is exercised in three ways:
1. Implicit conversion of a derived class pointer to a base class pointer
2. Calling a virtual function through a base class pointer
3. The dynamic_cast and typeid operators
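The three mechanisms, and the contrast with access by value, can be sketched as follows. `Base`, `Derived`, and the helper functions are hypothetical names used only for this illustration.

```cpp
#include <string>
#include <typeinfo>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

std::string via_reference(const Base& b) { return b.name(); }  // dynamic dispatch
std::string via_value(Base b)            { return b.name(); }  // sliced copy

bool demo_polymorphism() {
    Derived d;
    Base* p = &d;                                          // 1. implicit conversion
    bool virtual_call = p->name() == "Derived";            // 2. virtual call
    bool cast_ok = dynamic_cast<Derived*>(p) != nullptr;   // 3. dynamic_cast
    bool type_ok = typeid(*p) == typeid(Derived);          //    and typeid
    bool sliced  = via_value(d) == "Base";                 // by value: no polymorphism
    return virtual_call && cast_ok && type_ok && sliced
        && via_reference(d) == "Derived";
}
```

Passing `d` by value copies only its `Base` part, so the call inside `via_value` always resolves to `Base::name`.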
The following sections discuss the memory layout of C++ class objects in different usage scenarios.

Classes with no polymorphism

A class without polymorphic behavior may be a standalone class, or it may use private inheritance and similar techniques to express "has-a" or "is implemented in terms of" relationships. In this case the data members and member functions are no different from a C struct and its functions: data members are accessed the same way, member functions are called the same way, and the efficiency is the same. In this sense C++ adds no overhead for implementing such a class.

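    class A {
        int x;  // member name not recoverable from the source; x is a placeholder
    };
    class B : private A {
    public:
        void mf1();
        void mf2();
    private:
        int a;
        int b;
        // ...other data members
    };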

An object of B is laid out as the A subobject first, followed by B's own data members a and b.

This layout is no different from a C struct, except that it supports inheritance. Data members within the same access level are arranged in the order in which they are declared. If the base class A is removed, nothing changes apart from the disappearance of the A part; that is, the memory layout of a standalone class is exactly the same as that of a C struct.
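These claims can be checked with a short sketch. All type names here are hypothetical, and the inheritance case uses public rather than private inheritance so that the base subobject's address can be inspected from outside the class. The checks on `Child` reflect what mainstream compilers do rather than a guarantee of the language standard.

```cpp
#include <cstddef>
#include <type_traits>

struct CStyle { int a; int b; };            // what a C programmer would write

class Cpp {                                 // same data plus non-virtual functions
public:
    void set(int x, int y) { a = x; b = y; }
    int sum() const { return a + b; }
private:
    int a;
    int b;
};

static_assert(std::is_standard_layout<Cpp>::value, "Cpp has a C-compatible layout");
static_assert(sizeof(Cpp) == sizeof(CStyle), "member functions add no storage");

struct Parent { int x; };
struct Child : Parent { int a; int b; };    // non-polymorphic inheritance

// True when the Parent subobject sits at the very start of the Child object
// and the object is exactly as large as its three ints.
bool base_comes_first() {
    Child c;
    const void* whole = &c;
    const void* base = static_cast<Parent*>(&c);
    return whole == base && sizeof(Child) == 3 * sizeof(int);
}

int use_plain_class() {
    Cpp obj;
    obj.set(2, 5);
    return obj.sum();   // 2 + 5
}
```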

Single inheritance with polymorphism

Single inheritance with polymorphism requires the base class to define virtual functions, which introduces the virtual function table and the virtual function table pointer. The principle is as follows. The base class's virtual function table holds the type_info entry first, followed by the addresses of all virtual functions. The virtual function table of a subclass starts as a copy of the base class table. Its type_info entry is replaced with the subclass's type_info. Then, for each virtual function, if the subclass overrides it, the corresponding entry is replaced with the address of the overriding function; otherwise the entry is left unchanged. Finally, if the subclass declares new virtual functions, new entries holding their addresses are appended to the end of the table.

    struct B {
        int v1;
        virtual ~B() { cout << "B destructor\n"; }
        virtual void F() { cout << "F in B\n"; }
    };
    struct D : public B {
        int v2;
        virtual void FD() { cout << "FD in D\n"; }
        void F() { cout << "F in D\n"; }
    };

    // test code
    typedef void (*Fun)();  // function pointer type used to invoke the member functions F and FD
    D d;
    int* vptr = (int*)(*(int*)(&d));  // read the vptr stored at the start of d (32-bit only)
    ((Fun)*(vptr + 2))();             // table entry 2: F
    ((Fun)*(vptr + 3))();             // table entry 3: FD

Object d is laid out as the vptr, followed by B::v1 and then D::v2; the vptr points to D's virtual function table.

The test code first defines a function pointer type for use in the casts. Its signature matches the member functions F and FD, so after the cast the corresponding function can be called.
The address of the object is also the address of the vptr that the compiler added. Because the runtime environment is 32-bit, a pointer is the same size as an int, so converting to int* gives access to the table entries. Indexes 2 and 3 of the virtual function table hold the addresses of F and FD. (With g++, which follows the Itanium C++ ABI, the vptr points at the first function slot, slots 0 and 1 hold the two entries generated for the virtual destructor, and the type_info pointer sits just before slot 0.) Calling the converted pointers gives the following output; the final line comes from the destruction of d:

F in D
FD in D
B destructor
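The position of the vptr can also be observed without calling through raw table entries, by measuring where the data members sit inside the object. The sketch below reuses the shape of B and D with functions that return values instead of printing; `offset_in` and the three wrapper functions are hypothetical helpers. The offsets described are what mainstream ABIs produce, not something the language standard guarantees.

```cpp
#include <cstddef>

struct B {
    int v1;
    virtual ~B() {}
    virtual int F() { return 1; }
};

struct D : B {
    int v2;
    int F() override { return 2; }
    virtual int FD() { return 3; }
};

// Byte offset of a member from the start of the object that contains it.
std::ptrdiff_t offset_in(const void* object, const void* member) {
    return static_cast<const char*>(member) - static_cast<const char*>(object);
}

std::ptrdiff_t offset_of_v1() { D d; return offset_in(&d, &d.v1); }
std::ptrdiff_t offset_of_v2() { D d; return offset_in(&d, &d.v2); }

// The overriding function is selected through the base pointer.
int call_through_base() { D d; B* p = &d; return p->F(); }
```

When the vptr occupies the start of the object, `offset_of_v1()` equals the size of a pointer, and `v2` follows `v1`.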

Multiple inheritance with polymorphism

Multiple inheritance uses the same basic implementation as single inheritance: each parent class has its own virtual function table, and each parent subobject carries a vptr at its start. In the subclass object, the parent subobjects are arranged in the order in which the parents appear in the inheritance list, so the subclass object contains the vptr of every parent class. The subclass's own data members follow immediately after.
Multiple inheritance can produce a diamond-shaped hierarchy, in which several parents share a common ancestor. The inheriting class then contains multiple subobjects of that ancestor, which makes references to the ancestor's members ambiguous. Virtual inheritance is provided to solve this.

Non-diamond Multiple inheritance

This case is similar to polymorphic single inheritance except that there is more than one base class. Access works the same way; the differences are in how the tables are built. After the parent classes' virtual function tables and the subobjects' vptrs are copied, every entry for a virtual function that the subclass overrides, in every parent's table, is replaced with the address of the subclass's function. This also means that the virtual destructor entries of every base class table are replaced with the address of the subclass's destructor. The other important difference is that the addresses of virtual functions newly declared in the subclass are appended only to the end of the first parent's virtual function table; the tables of the second and later parents receive no new entries.

    struct B1 {
        int v1;
        virtual ~B1() { cout << "B1 destructor\n"; }
        virtual void F() { cout << "F in B1\n"; }
    };
    struct B2 {
        int v2;
        virtual ~B2() { cout << "B2 destructor\n"; }
        virtual void F() { cout << "F in B2\n"; }
    };
    struct D2 : public B1, public B2 {
        int vd;
        virtual void FD() { cout << "FD in D2\n"; }
        void F() { cout << "F in D2\n"; }
    };

    // test code
    typedef void (*Fun)();
    D2 d;
    int* vptr = (int*)(*(int*)(&d));
    ((Fun)*(vptr + 2))();  // call virtual function F through the vptr of the B1 subobject
    ((Fun)*(vptr + 3))();  // call virtual function FD through the vptr of the B1 subobject
    vptr = (int*)*((int*)((char*)(&d) + sizeof(B1)));
    ((Fun)*(vptr + 2))();  // call virtual function F through the vptr of the B2 subobject

The results of the above test execution are as follows:

F in D2
FD in D2
F in D2
B2 destructor
B1 destructor

Object d is laid out as the B1 subobject (vptr, v1), then the B2 subobject (vptr, v2), then D2's own member vd.

Because B1 is declared first, its subobject comes first in the object. F and FD are called through the vptr of the B1 subobject: the F entry has been overwritten with the address of the F defined in D2, and FD, being a newly declared virtual function, has been appended to the end of that table.
Because the B1 subobject comes first, finding the B2 vptr requires an offset of the size of B1. The code converts the address of d to char*, adds sizeof(B1) bytes to reach the start of the B2 subobject, which is where the B2 vptr is stored, and then converts back to int* as before. Calling the third entry of this table executes the F defined in D2, so the F entry in B2's table has been replaced as well. Calling the fourth entry causes a segmentation fault, which shows that the new virtual function FD defined in D2 was not appended to the virtual function table of the B2 subobject.
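The offset that the test code adds by hand is the same adjustment the compiler performs when a derived pointer is converted to a pointer to the second base. A minimal sketch, with the classes rewritten to return values and with hypothetical helper functions:

```cpp
#include <cstddef>

struct B1 { int v1; virtual ~B1() {} virtual int F() { return 1; } };
struct B2 { int v2; virtual ~B2() {} virtual int F() { return 2; } };

struct D2 : B1, B2 {
    int vd;
    int F() override { return 3; }   // overrides both B1::F and B2::F
};

// Distance in bytes between two addresses inside the same object.
std::ptrdiff_t byte_offset(const void* from, const void* to) {
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}

std::ptrdiff_t b1_offset() { D2 d; return byte_offset(&d, static_cast<B1*>(&d)); }
std::ptrdiff_t b2_offset() { D2 d; return byte_offset(&d, static_cast<B2*>(&d)); }

// A call through the second base still reaches the override in D2.
int f_via_b2() { D2 d; B2* p = &d; return p->F(); }
```

On mainstream compilers `b1_offset()` is 0 and `b2_offset()` is typically `sizeof(B1)`, which is why the test code above adds `sizeof(B1)` to the object address to find the second vptr.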

Diamond Virtual Multiple Inheritance

If a diamond hierarchy does not use virtual inheritance, it follows the multiple inheritance layout above: each parent subobject contains its own copy of the common ancestor subobject, so referring to a member of the common ancestor through the subclass is ambiguous:

    d2.memParent;      // the memParent in B1 or the one in B2?
    // a superficial fix
    d2.B1::memParent;  // accesses the member inside the B1 subobject
    d2.B2::memParent;  // accesses the member inside the B2 subobject

Qualifying the name removes the ambiguity for the moment, but it does not eliminate the underlying problem, and it makes the program harder to understand. It also wastes space, because the members of the common ancestor are stored twice. Virtual inheritance was introduced to solve this.
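The duplication can be made visible with a small sketch. `Parent` is a hypothetical common ancestor holding the `memParent` member used in the snippet above; the intermediate classes are left empty to keep the example short.

```cpp
struct Parent { int memParent; };
struct B1 : Parent {};
struct B2 : Parent {};
struct D2 : B1, B2 {};      // non-virtual diamond: two Parent subobjects

static_assert(sizeof(D2) >= 2 * sizeof(int), "both copies occupy space");

bool has_two_parent_copies() {
    D2 d2;
    // d2.memParent = 0;    // would not compile: the name is ambiguous
    d2.B1::memParent = 1;
    d2.B2::memParent = 2;
    // The two qualified names refer to different storage.
    return &d2.B1::memParent != &d2.B2::memParent
        && d2.B1::memParent == 1
        && d2.B2::memParent == 2;
}
```

Writing through one qualified name leaves the other copy untouched, which is exactly the logical problem described above.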
First, look at how a single level of virtual inheritance is laid out.

Virtual Single Inheritance

    struct B {
        int a;
        virtual ~B() { cout << "B destructor\n"; }
        virtual void F() { cout << "F in B\n"; }
        virtual void NF() { cout << "NF in B\n"; }
    };
    struct B1 : public virtual B {
        int v1;
        virtual ~B1() { cout << "B1 destructor\n"; }
        virtual void F() { cout << "F in B1\n"; }
        virtual void FB1() { cout << "FB1 in B1\n"; }
    };

    // Test (Fun is the function pointer typedef from the earlier test code)
    B1 b1;
    b1.v1 = 10; b1.a = 1;
    cout << "Size of B1:" << sizeof(B1) << endl;
    int* pvptr = (int*)(&b1);
    cout << "B1::v1 = " << *(pvptr + 1) << ", B::a = " << *(pvptr + 3) << endl;
    int* vp = (int*)(*pvptr);    // B1's vptr, at the start of the object
    ((Fun)*(vp + 2))();
    ((Fun)*(vp + 3))();
    vp = (int*)(*(pvptr + 2));   // B's vptr, at the start of the B subobject
    ((Fun)*(vp + 2))();
    ((Fun)*(vp + 3))();

The results of the operation are as follows:

Size of B1:16
B1::v1 = 10, B::a = 1
F in B1     // called through B1's vptr
FB1 in B1   // called through B1's vptr
F in B1     // called through B's vptr
NF in B     // called through B's vptr
B1 destructor
B destructor

From these results it can be inferred that, under virtual inheritance, if the subclass declares new virtual functions, it gets a new vptr and virtual function table of its own, placed at the start of the object. The virtual parent subobject, including the parent's vptr and its virtual function table, is placed at the very end. When the parent's virtual function table is copied for the subclass object, entries for virtual functions that the subclass overrides are still replaced.
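The placement of the virtual base at the end of the object can be observed by comparing it with ordinary inheritance. In this sketch `Direct` is a hypothetical class added for comparison, and the helper functions are likewise illustrative. The offsets reflect what mainstream compilers produce, not a guarantee of the standard.

```cpp
#include <cstddef>

struct B {
    int a;
    virtual ~B() {}
    virtual int F() { return 1; }
};

struct Direct : B {             // ordinary inheritance, for comparison
    int v1;
    int F() override { return 2; }
};

struct B1 : virtual B {         // virtual inheritance
    int v1;
    int F() override { return 2; }
    virtual int FB1() { return 3; }
};

// Offset of the B subobject from the start of the complete object.
std::ptrdiff_t base_offset_direct() {
    Direct d;
    return reinterpret_cast<char*>(static_cast<B*>(&d)) - reinterpret_cast<char*>(&d);
}

std::ptrdiff_t base_offset_virtual() {
    B1 b1;
    return reinterpret_cast<char*>(static_cast<B*>(&b1)) - reinterpret_cast<char*>(&b1);
}

// The override is still reached through a pointer to the virtual base.
int f_via_virtual_base() { B1 b1; B* p = &b1; return p->F(); }
```

With ordinary inheritance the base subobject starts at offset 0; with virtual inheritance it moves behind B1's own vptr and data, and the object grows by the extra vptr.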

Diamond Virtual Inheritance

Building on the result above, in diamond virtual inheritance the virtually inherited ancestor is placed at the end of the subclass object, and the remaining parents are arranged in declaration order as in ordinary multiple inheritance. If the subclass declares new virtual functions, they are appended to the end of the first parent's virtual function table.

    struct B {
        int a;
        virtual ~B() { cout << "B destructor\n"; }
        virtual void F() { cout << "F in B\n"; }
        virtual void NF() { cout << "NF in B\n"; }
    };
    struct B1 : public virtual B {
        int v1;
        virtual ~B1() { cout << "B1 destructor\n"; }
        virtual void F() { cout << "F in B1\n"; }
        virtual void FB1() { cout << "FB1 in B1\n"; }
    };
    struct B2 : public virtual B {
        int v2;
        virtual ~B2() { cout << "B2 destructor\n"; }
        virtual void F() { cout << "F in B2\n"; }
    };
    struct D2 : public B1, public B2 {
        virtual void FD2() { cout << "FD2 in D2\n"; }
        virtual void F() { cout << "F in D2\n"; }
    };

    // Test (Fun is the function pointer typedef from the earlier test code)
    D2 d2;
    d2.v1 = 10; d2.v2 = 20; d2.a = 1;
    cout << "Size of D2:" << sizeof(D2) << endl;
    int* pvptr = (int*)(&d2);
    cout << "B1::v1 = " << *(pvptr + 1) << endl;
    cout << "B2::v2 = " << *(pvptr + 3) << endl;
    cout << "B::a = " << *(pvptr + 5) << endl;
    ((Fun)*((int*)(*pvptr) + 2))();   // entries of the table reached through B1's vptr
    ((Fun)*((int*)(*pvptr) + 3))();
    ((Fun)*((int*)(*pvptr) + 4))();
    int* b2 = pvptr + 2;              // B2's vptr
    ((Fun)*((int*)(*b2) + 2))();
    b2 = pvptr + 4;                   // B's vptr
    ((Fun)*((int*)(*b2) + 2))();
    ((Fun)*((int*)(*b2) + 3))();

The output is:

Size of D2:24
B1::v1 = 10
B2::v2 = 20
B::a = 1
F in D2     // called through B1's vptr; F in B1 is replaced
FB1 in B1   // called through B1's vptr; FB1 in B1 is not replaced
FD2 in D2   // called through B1's vptr; FD2 in D2 is appended to the first base class's virtual function table
F in D2     // called through B2's vptr; F in B2 is replaced
F in D2     // called through B's vptr; F in B is replaced
NF in B     // called through B's vptr; not overridden in D2, so NF in B is not replaced
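That both parents now share a single ancestor subobject can be confirmed directly. The sketch below mirrors the hierarchy with value-returning functions; `shares_one_base` is a hypothetical helper.

```cpp
struct B  { int a;  virtual ~B() {} virtual int F() { return 0; } };
struct B1 : virtual B { int v1; int F() override { return 1; } };
struct B2 : virtual B { int v2; int F() override { return 2; } };

// D2 must override F: otherwise B1::F and B2::F would both be
// candidates for the shared B subobject's final overrider.
struct D2 : B1, B2 { int F() override { return 3; } };

bool shares_one_base() {
    D2 d2;
    d2.a = 1;                               // unambiguous: only one B::a exists
    B* via_b1 = static_cast<B1*>(&d2);      // reach B through the B1 path
    B* via_b2 = static_cast<B2*>(&d2);      // reach B through the B2 path
    return via_b1 == via_b2                 // same subobject either way
        && &d2.B1::a == &d2.B2::a
        && via_b1->F() == 3;                // D2's override is called
}
```

Both conversion paths arrive at the same address, so the ambiguity and the duplicated storage of the non-virtual diamond are gone.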

Analyzing these cases and verifying them with examples gives a clearer understanding of the C++ memory layout, and in particular of where C++ introduces overhead compared with an equivalent C implementation. This can serve as a reference when writing C++ code.

Note: All of the code above was run on Ubuntu 12.04 (32-bit) with g++ 4.6. Different compilers may differ in implementation, and on a 64-bit system the pointer conversions must be adjusted, because a pointer no longer has the same size as an int.
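One possible adjustment for 64-bit systems is sketched below. It reads the vptr into a `void**` instead of an `int*`, so each table entry is pointer-sized. This is not portable C++: it assumes a compiler that follows the Itanium C++ ABI (such as g++ or clang on Linux), where the vptr is stored at the start of the object, slots 0 and 1 belong to the virtual destructor, and `this` is passed like an ordinary first pointer argument. The helpers `read_vptr` and `call_slot` are hypothetical, and the functions return values instead of printing.

```cpp
#include <cstring>

struct B {
    int v1;
    virtual ~B() {}
    virtual int F() { return 10; }
};

struct D : B {
    int v2;
    virtual int FD() { return 30; }
    int F() override { return 20; }
};

// ASSUMPTION: `this` is passed as an ordinary first pointer argument.
typedef int (*Fun)(void* self);

// Copies out the vptr stored at the start of a polymorphic object.
void** read_vptr(void* object) {
    void** table;
    std::memcpy(&table, object, sizeof table);
    return table;
}

// Calls the function stored in the given table slot on `object`.
int call_slot(void* object, int slot) {
    Fun f = reinterpret_cast<Fun>(read_vptr(object)[slot]);
    return f(object);
}

int demo_vptr_walk() {
    D d;
    // ASSUMPTION: slot 2 is F and slot 3 is FD, as in the 32-bit example.
    return call_slot(&d, 2) * 100 + call_slot(&d, 3);
}
```

Under those assumptions `demo_vptr_walk()` returns 2030: slot 2 reaches D's override of F and slot 3 reaches FD. Passing the object explicitly also keeps `this` valid inside the called function, which the original test code does not do.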
