C++ object model


Today I came across a very good article online about the memory layout of C++ classes under inheritance. I learned a great deal from it, so I am reposting it on my own blog for later review.

-- On the VC++ object model
Jan Gray (US)
Translated by Chenghua

Translator's Preface

A C++ programmer who wants to improve technically should learn more about the semantic details of the language. Programmers who use VC++ should also know how VC++ interprets C++. Inside the C++ Object Model is a good book, but it is long and has relatively little to do with VC++ specifically. In terms of both length and content, the translator therefore considers this article a good starting point for understanding the C++ object model in depth.
I thought this article was very good when I first read it. Rereading it, I understood more, so I decided to translate it and share it. The article is not long, but my time was limited and I dozed off several times while translating, so the work dragged on for a month.
Because my ability is limited, and because I often dozed off while translating, there are probably many mistakes. Criticism and corrections are welcome.

The original source of this article is MSDN. If you have MSDN installed, you can search for "C++: Under the Hood". Otherwise, the article is available online: http://msdn.microsoft.com/archive/default.asp?url=/archive/en-us/dnarvc/html/jangrayhood.asp

1 Preface

Knowing how your programming language is implemented is particularly valuable for C++ programmers. First, it removes the mystery from the language we use, so that what the compiler does no longer seems baffling. More importantly, it gives us more confidence when we debug and when we use advanced language features. This knowledge also helps when we need to make code more efficient.

This article focuses on answering the following questions:
1. How classes are laid out.
2. How member variables are accessed.
3. How member functions are accessed.
4. What the so-called "adjuster thunk" is all about.
5. What the following mechanisms cost:
* Single inheritance, multiple inheritance, and virtual inheritance
* Virtual function calls
* Casts to a base class or to a virtual base class
* Exception handling
First, we examine the layout of C-compatible structures (struct), single inheritance, multiple inheritance, and virtual inheritance.
Next, we discuss access to member variables and member functions, including virtual functions.
Then we examine how constructors, destructors, and the special assignment operator member functions work, and how arrays are dynamically constructed and destroyed.
Finally, we briefly introduce the support for exception handling.

For each language feature, we briefly describe the motivation behind it, its semantics (this article is not an introduction to C++, so we assume you already know them), and how it is implemented in Microsoft's VC++. Keep in mind the distinction between the abstract semantics of the C++ language and a specific implementation. C++ vendors other than Microsoft may provide completely different implementations, and we occasionally compare the VC++ implementation with others.

2 Class Layout

This section discusses the different memory layouts that are caused by different inheritance patterns.

2.1 C structures (struct)

Because C++ is based on C, it is "basically" compatible with C. In particular, the C++ specification applies the same simple memory layout rule to structures that C does: member variables are laid out in the order in which they are declared and are aligned on memory addresses according to the implementation-defined alignment rules. All C/C++ vendors guarantee that their C and C++ compilers produce exactly the same layout for valid C structures. Here, A is a simple C structure whose member layout and alignment are clear at a glance.

    struct A {
        char c;
        int i;
    };

Translator's note: As the original figure shows, A occupies 8 bytes in memory. In declaration order, the first 4 bytes hold a character (which actually occupies 1 byte, followed by 3 bytes of padding) and the next 4 bytes hold an integer. A pointer to A points to the byte where the character begins.
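The layout described in the note can be checked directly with sizeof and offsetof. The following is a minimal sketch, not part of the original article; the constant names (offset_of_c, offset_of_i, size_of_A, padding_after_c) are hypothetical, and the concrete numbers depend on the platform's int size and alignment.

```cpp
#include <cassert>
#include <cstddef>

// Same struct as in the article.
struct A {
    char c;
    int i;
};

// Offsets and size are known at compile time.
constexpr std::size_t offset_of_c = offsetof(A, c);
constexpr std::size_t offset_of_i = offsetof(A, i);
constexpr std::size_t size_of_A = sizeof(A);

// Padding bytes inserted between c and i so that i is properly aligned.
constexpr std::size_t padding_after_c =
    offset_of_i - (offset_of_c + sizeof(char));
```

On a typical platform with a 4-byte, 4-byte-aligned int, these constants come out as 0, 4, 8, and 3, which agrees with the 8-byte layout in the note.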

2.2 C structures with C++ features

Of course, C++ is not just a more complicated C. It is an object-oriented language that provides inheritance, encapsulation, and polymorphism. The plain C structure has been transformed into the cornerstone of the object-oriented world: the class. In addition to member variables, C++ classes can encapsulate member functions and other things. Interestingly, though, apart from the hidden member variables introduced to implement virtual functions and virtual inheritance, the size of a C++ class instance is determined entirely by the member variables of the class and its base classes. Member functions essentially do not affect the size of a class instance.

The B shown here is a C structure with some C++ features: the public/protected/private keywords that control member visibility, member functions, static members, and nested type declarations. It looks like a lot, but in fact only the non-static member variables occupy space in a class instance. Note that the C++ standards committee does not restrict the order in which implementations lay out the groups of members separated by public/protected/private, so the memory layouts produced by different compilers may differ. (In VC++, member variables are always laid out in the order in which they are declared.)

    struct B {
    public:
        int bm1;
    protected:
        int bm2;
    private:
        int bm3;
        static int bsm;
        void bf();
        static void bsf();
        typedef void* bpv;
        struct N { };
    };

Translator's note: Why does static int bsm in B not occupy memory space in the instance? Because it is a static member, its data is stored in the program's data section, not in class instances.
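The claim that only non-static member variables contribute to instance size can be illustrated by comparing B with a struct that holds nothing but three ints. This is a sketch under the assumption of a mainstream compiler; ThreeInts and the two constants are hypothetical names introduced for the comparison.

```cpp
#include <cassert>
#include <cstddef>

struct B {
public:
    int bm1;
protected:
    int bm2;
private:
    int bm3;
    static int bsm;     // static: stored once, outside every instance
    void bf();          // member functions take no space in an instance
    static void bsf();
    typedef void* bpv;  // type declarations take no space either
    struct N {};
};

int B::bsm = 0;  // the single definition of the static member

// A struct containing only three ints, for comparison.
struct ThreeInts {
    int a, b, c;
};

constexpr std::size_t size_of_B = sizeof(B);
constexpr std::size_t size_of_three_ints = sizeof(ThreeInts);
```

If the two sizes are equal, the static member, the member functions, the typedef, and the nested type add nothing to each B instance.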

2.3 Single Inheritance

C++ provides inheritance in order to factor out what different types have in common. For example, scientists classify living things into species, genera, classes, and so on. With such a hierarchy, a property can be attached to the most appropriate level of the classification, such as "mammals bear live young". Because these properties are inherited by subclasses, we can conclude that whales and humans bear live young simply by knowing that they are mammals. Exceptions, such as the platypus (an egg-laying mammal), require us to override the default property or behavior.
The inheritance syntax in C++ is simple: add ": base" after the name of the derived class. Below, D inherits from base class C.

    struct C {
        int c1;
        void cf();
    };

    struct D : C {
        int d1;
        void df();
    };

Because a derived class retains all the properties and behavior of its base class, every instance of a derived class naturally contains a complete copy of the base class instance data. Nothing requires the data of base class C to come before the data of D inside D, but arranging it this way guarantees that the address of the C object inside D is exactly the address of the first byte of the D object. With this arrangement, given a pointer to the derived class D, no offset has to be computed to obtain a pointer to the base class C. Almost all well-known C++ vendors use this arrangement (base class members first). In a single-inheritance hierarchy, each new derived class simply appends its own member variables to those of its base class. As the original figure shows, a pointer to the C object and a pointer to the D object point to the same address.
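The "base class first" arrangement can be observed by measuring where the C subobject and the member d1 sit inside a D object. The sketch below assumes a mainstream compiler that uses this arrangement; the language standard itself does not promise it for this class. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct C {
    int c1;
    void cf() {}
};

struct D : C {
    int d1;
    void df() {}
};

// Byte distance from the start of a D object to its embedded C subobject.
std::ptrdiff_t offset_of_C_in_D(D& d) {
    C* as_base = &d;  // implicit derived-to-base conversion
    return reinterpret_cast<char*>(as_base) - reinterpret_cast<char*>(&d);
}

// Byte distance from the start of a D object to its own member d1.
std::ptrdiff_t offset_of_d1_in_D(D& d) {
    return reinterpret_cast<char*>(&d.d1) - reinterpret_cast<char*>(&d);
}
```

With the arrangement described above, the first function returns 0 and the second returns sizeof(C), meaning D's own member is appended right after the base class data.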

2.4 Multiple Inheritance

In most cases, single inheritance is sufficient. However, C++ also provides multiple inheritance for our convenience.

For example, suppose we have an organizational model with a Manager class (which assigns tasks) and a Worker class (which does the work). How should the class hierarchy express a first-line manager, who both accepts tasks from a higher-level manager, as a worker does, and assigns tasks to subordinate workers, as a manager does? Single inheritance is inadequate here. We could have the Manager class inherit from the Worker class and the first-line manager class inherit from the Manager class, but that hierarchy wrongly makes the Manager class inherit the attributes and behavior of the Worker class. The reverse arrangement has the same problem. The first-line manager class could also inherit from only one class (Manager or Worker), or from neither, and redeclare one or both interfaces, but such implementations have too many drawbacks: polymorphism is lost, the existing interfaces cannot be reused, and, worst of all, every interface change must be maintained in more than one place. The most reasonable solution seems to be for the first-line manager to inherit attributes and behavior from both places: managers and workers.

C++ allows multiple inheritance to solve exactly this kind of problem:

    struct Manager ... { ... };
    struct Worker ... { ... };
    struct MiddleManager : Manager, Worker { ... };

What class layout does this kind of inheritance produce? Let's continue with our alphabet-named classes to illustrate:


    struct E {
        int e1;
        void ef();
    };

    struct F : C, E {
        int f1;
        void ff();
    };
Structure F is derived from C and E through multiple inheritance. As with single inheritance, an F instance contains a copy of all the data of each base class. Unlike single inheritance, under multiple inheritance the embedded base class objects cannot all have the same address as the derived class object:
    F f;
    // (void*)&f == (void*)(C*)&f;
    // (void*)&f <  (void*)(E*)&f;
Translator's note: The first comparison shows that the address of the C object is the same as the address of the F object; the second shows that the address of the E object differs from the address of the F object.

Looking at the class layout, you can see that the E object is embedded in F and that its address is not the same as the address of F. As the later discussion of casts and member functions points out, this offset causes a small amount of overhead.
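The pointer adjustment under multiple inheritance can be made visible by measuring the distance from an F object to each embedded base. This is a sketch that assumes a compiler which places base classes in declaration order, as VC++ is described as doing; bytes, offset_of_C_in_F, offset_of_E_in_F, and back_to_F are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

struct C { int c1; };
struct E { int e1; };
struct F : C, E { int f1; };

// View any object pointer as a byte address.
inline const char* bytes(const void* p) { return static_cast<const char*>(p); }

// Distance from the F object to its embedded C (the first base).
std::ptrdiff_t offset_of_C_in_F(F& f) {
    return bytes(static_cast<C*>(&f)) - bytes(&f);
}

// Distance from the F object to its embedded E (the second base).
std::ptrdiff_t offset_of_E_in_F(F& f) {
    return bytes(static_cast<E*>(&f)) - bytes(&f);
}

// Converting back from E* to F* subtracts the same offset again.
F* back_to_F(E* pe) { return static_cast<F*>(pe); }
```

With bases laid out in declaration order, the C offset is 0 and the E offset equals sizeof(C). Every conversion between F* and E* therefore adds or subtracts a small constant, which is the overhead mentioned above.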

A compiler implementation is free to choose how to lay out the embedded base classes and the derived class. VC++ lays out the base class instance data in the order in which the base classes are declared and places the derived class data last. The derived class data itself is, of course, also laid out in declaration order. (This rule is not absolute: as we will see, the layout is different when some base classes have virtual functions and others do not.)

2.5 Virtual Inheritance

Let's return to the first-line manager example and consider this scenario: what happens if both the Manager class and the Worker class inherit from an Employee class?
    struct Employee { ... };
    struct Manager : Employee { ... };
    struct Worker : Employee { ... };
    struct MiddleManager : Manager, Worker { ... };
If both the Manager class and the Worker class inherit from the Employee class, each of them naturally gets a copy of the Employee data. Without special treatment, an instance of the first-line manager class contains two Employee instances, one from each of its two base classes. If Employee has few member variables, the problem is not serious. If it has many, the extra copy imposes significant overhead on every instance. Worse, the two Employee instances can be modified independently, leaving the data inconsistent. We therefore need a special declaration by which Manager and Worker state that they are willing to share a single copy of the Employee base class instance data.
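The duplicated base can be demonstrated with a small hierarchy. The member variables (id, reports, tasks, level) and the two helper functions are hypothetical, added only so that the structs elided in the article have something to store.

```cpp
#include <cassert>
#include <cstddef>

struct Employee { int id; };
struct Manager : Employee { int reports; };
struct Worker : Employee { int tasks; };
struct MiddleManager : Manager, Worker { int level; };

// Reach the Employee subobject that belongs to the Manager base.
Employee* employee_via_manager(MiddleManager& m) {
    return static_cast<Manager*>(&m);
}

// Reach the Employee subobject that belongs to the Worker base.
Employee* employee_via_worker(MiddleManager& m) {
    return static_cast<Worker*>(&m);
}
```

The two functions return different addresses for the same MiddleManager, and writing id through one path leaves the other copy untouched. That is the inconsistency the paragraph above warns about.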

Unfortunately, C++ calls this "shared inheritance" "virtual inheritance", which makes the concept sound rather abstract. The syntax for virtual inheritance is simple: add the virtual keyword when specifying the base class.
    struct Employee { ... };
    struct Manager : virtual Employee { ... };
    struct Worker : virtual Employee { ... };
    struct MiddleManager : Manager, Worker { ... };
Virtual inheritance has greater implementation and invocation overhead than single or multiple inheritance. Recall that with single and multiple inheritance, the address of an embedded base class instance is either the same as the address of the derived class instance (single inheritance, and the leftmost base class under multiple inheritance) or at a fixed offset from it (the non-leftmost base classes under multiple inheritance). With virtual inheritance, however, the offset between the address of a derived class and the address of its virtual base class is in general not fixed, because if the derived class is itself derived from further, the more derived class may place the shared virtual base class instance data at a different offset than the previous derived class did. Consider the following example:


    struct G : virtual C {
        int g1;
        void gf();
    };
Translator's note: GdGvbptrG (in G, the displacement from G's virtual base pointer to G) means that, in G, the offset between G's virtual base table pointer and the G object is 0, because the virtual base table pointer comes first in the memory layout of a G object. GdGvbptrC (in G, the displacement from G's virtual base pointer to C) means that, in G, the offset between G's virtual base table pointer and the C object is 8.


    struct H : virtual C {
        int h1;
        void hf();
    };

    struct I : G, H {
        int i1;
        void _if();
    };
Set aside the vbptr member variable for the moment. The original figures show that in a G object, the data of the embedded C object immediately follows the data of G, and in an H object, the data of the embedded C object immediately follows the data of H. In an I object, however, the layout is different. In the VC++ memory layout, the offset between the G object and the C object in a stand-alone G instance differs from the offset between the G object and the C object inside an I instance. When a virtual base class member variable is accessed through a pointer, the pointer may be a base class pointer that actually points to an instance of a derived class, so the compiler cannot compute the offset from the declared pointer type alone. It must find the location of the virtual base class from the derived class pointer in some other, indirect way.
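The "floating" offset can be measured with real virtual inheritance. The sketch below is not from the original article; distance_G_to_C is a hypothetical helper, and the exact distances depend on the compiler and platform. The point is only that the two distances differ.

```cpp
#include <cassert>
#include <cstddef>

struct C { int c1; };
struct G : virtual C { int g1; };
struct H : virtual C { int h1; };
struct I : G, H { int i1; };

// Byte distance from a G subobject to the C subobject it shares.
// The G* to C* conversion is where the compiler performs its
// indirect, run-time lookup of the virtual base.
std::ptrdiff_t distance_G_to_C(G& g) {
    C* base = &g;
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&g);
}
```

Calling the function on a stand-alone G and on the G part of an I gives two different results, even though the static type of the argument is G& in both cases. That is why the offset cannot be a compile-time constant.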
In VC++, a hidden "virtual base table pointer" (vbptr) member variable is added to every instance of a class that inherits from a virtual base class, so that the location of the virtual base class can be computed indirectly. This variable points to an offset table shared by all instances of the class; the entries of the table record the offsets between the class's virtual base table pointer and its virtual base classes.
Other implementations embed pointer member variables in the derived class instead. Each of these pointers points to one of the virtual base classes of the derived class, one pointer per virtual base class. The advantage of this approach is that less code is needed to obtain the address of a virtual base class. However, an optimizing compiler can usually avoid computing the virtual base class address repeatedly anyway. The approach also has significant drawbacks: a class that derives from multiple virtual base classes consumes more memory per instance, and reaching a virtual base class of a virtual base class takes more than one pointer indirection, which is less efficient.

In VC++, G has a hidden "virtual base table pointer" member that points to a virtual base table whose second entry is GdGvbptrC: in G, the offset between G's virtual base table pointer and the virtual base class object C. (The prefix before the "d" is omitted when the offset is the same for all derived classes.) For example, on a 32-bit platform GdGvbptrC is 8 bytes. Similarly, the G object embedded in an I instance also has a virtual base table pointer, but it points to a virtual base table that applies to "G in I", whose entry IdGvbptrC has the value 20.
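To make the table lookup concrete, here is a hand-written model of the scheme using plain structs. It is an illustrative sketch, not VC++'s actual generated code: every name in it (CData, GPart, HPart, GAlone, IModel, the tables, construct, virtual_base) is hypothetical, and a real compiler creates the hidden pointer and the tables itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct CData { int c1; };  // stands in for the virtual base C

// G's own part: the hidden vbptr followed by G's member variable.
struct GPart {
    const std::ptrdiff_t* vbptr;  // points to an offset table for this layout
    int g1;
};

// H's own part, built the same way.
struct HPart {
    const std::ptrdiff_t* vbptr;
    int h1;
};

// A stand-alone G: G's part, then the virtual base data at the end.
struct GAlone {
    GPart g;
    CData c;
};

// An I: G's part, H's part, I's member, then the single shared C at the end.
struct IModel {
    GPart g;
    HPart h;
    int i1;
    CData c;
};

// Entry 0: offset from the vbptr to its own class part (0 here).
// Entry 1: offset from the vbptr to the shared C data.
const std::ptrdiff_t table_G_alone[2] = {
    0, static_cast<std::ptrdiff_t>(offsetof(GAlone, c) - offsetof(GAlone, g))};
const std::ptrdiff_t table_G_in_I[2] = {
    0, static_cast<std::ptrdiff_t>(offsetof(IModel, c) - offsetof(IModel, g))};
const std::ptrdiff_t table_H_in_I[2] = {
    0, static_cast<std::ptrdiff_t>(offsetof(IModel, c) - offsetof(IModel, h))};

// Play the role of the constructors, which install the right tables.
void construct(GAlone& x) { x.g.vbptr = table_G_alone; }
void construct(IModel& x) {
    x.g.vbptr = table_G_in_I;
    x.h.vbptr = table_H_in_I;
}

// The indirect lookup: read the vbptr, fetch entry 1, add it to the
// address of the vbptr itself.
template <typename Part>
CData* virtual_base(Part* part) {
    std::uintptr_t vbptr_address = reinterpret_cast<std::uintptr_t>(&part->vbptr);
    std::uintptr_t offset = static_cast<std::uintptr_t>(part->vbptr[1]);
    return reinterpret_cast<CData*>(vbptr_address + offset);
}
```

The same virtual_base code finds the shared C data whether the GPart sits in a stand-alone G or inside an I, because the per-layout table supplies the right offset. On a platform with 4-byte pointers and 4-byte ints, the two G tables in this model would hold 8 and 20, the same values quoted above. On a typical 64-bit platform they are larger.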

Looking at G, H, and I above, we can draw the following conclusions about memory layout under virtual inheritance in VC++:
1. First, the instances of the non-virtual base classes are laid out.
2. If the class has virtual base classes, a hidden vbptr is added, unless one has already been inherited from a non-virtual base class.
3. Next, the new data members of the derived class are laid out.
4. Finally, at the end of the instance, one instance of each virtual base class is laid out.

This layout lets the position of a virtual base class "float" as classes are derived further, while the non-virtual base classes stay grouped together, each at a fixed offset.

3 Member Variables

Having described class layout, we now consider the cost of accessing member variables under the different kinds of inheritance.

No inheritance: When there is no inheritance, accessing a member variable works exactly as in C: start from the pointer to the object and add a fixed offset.
    C* pc;
    pc->c1; // *(pc + dCc1);
pc is a pointer to C.
A. To access member variable c1 of C, simply add the fixed offset dCc1 (in C, the offset between the address of the C object and its member variable c1) to pc, and then dereference the resulting pointer.
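The pseudo-expression *(pc + dCc1) can be spelled out in byte arithmetic. In this sketch, dCc1 follows the article's notation and read_c1 is a hypothetical helper that performs by hand what the compiler generates for pc->c1.

```cpp
#include <cassert>
#include <cstddef>

struct C {
    int c1;
    void cf() {}
};

// dCc1 in the article's notation: the offset of member c1 inside C.
constexpr std::size_t dCc1 = offsetof(C, c1);

// Spell out pc->c1 as "object address plus fixed offset, then dereference".
int read_c1(C* pc) {
    char* base = reinterpret_cast<char*>(pc);
    return *reinterpret_cast<int*>(base + dCc1);
}
```

Because dCc1 is a compile-time constant, the access costs one addition folded into the load instruction, the same as a struct member access in C.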

Single inheritance: Because the offset between a derived class instance and its base class instance is the constant 0, the computation can be simplified by directly using the offset between the base class pointer and the base class member.
    D* pd;
    pd->c1; // *(pd + dDC + dCc1); // *(pd + dDc1);
    pd->d1; // *(pd + dDd1);
Translator's note: D inherits from C, and pd is a pointer to D.
A. When accessing the base class member c1, the computation would in principle be pd + dDC + dCc1: first add the offset between the D object and the C object, then add the offset between the C object and its member variable c1. However, since dDC is always 0, the offset between the C object address and c1 can be used directly.
B. When accessing the derived class member d1, the offset is computed directly.

Multiple inheritance: Although the offset between a derived class and a base class may not be 0, it is always a constant. Because it is a constant, the offset computation for a member variable access can still be simplified at compile time. So even under multiple inheritance, accessing member variables remains cheap.
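The same idea can be sketched for a non-leftmost base, reusing the article's F : C, E example. The names dFE and dEe1 follow the article's offset notation; read_e1 is a hypothetical helper. The sketch measures dFE at run time only to show that it is the same for every F object. A compiler knows it at compile time.

```cpp
#include <cassert>
#include <cstddef>

struct C { int c1; };
struct E { int e1; };
struct F : C, E { int f1; };

// dFE: byte offset from an F object to its embedded E subobject.
std::ptrdiff_t dFE(F& f) {
    return reinterpret_cast<char*>(static_cast<E*>(&f)) -
           reinterpret_cast<char*>(&f);
}

// dEe1: offset of member e1 inside E.
constexpr std::size_t dEe1 = offsetof(E, e1);

// pf->e1 written out as *(pf + dFE + dEe1) in byte arithmetic.
int read_e1(F* pf) {
    std::ptrdiff_t total = dFE(*pf) + static_cast<std::ptrdiff_t>(dEe1);
    return *reinterpret_cast<int*>(reinterpret_cast<char*>(pf) + total);
}
```

Since dFE and dEe1 are both constants, their sum is a single constant offset, so reading e1 through an F* costs no more than reading a member of a plain struct.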