Calling ordinary member functions
Starting from this part, we will not only print memory information but also step through the assembly code generated by the compiler, to understand how the compiler implements these language features. Assembly language itself is beyond the scope of this article, so I only explain the assembly code that is relevant to the discussion. You do not need a complete knowledge of assembly to follow this article, but you do need to understand the basic concepts.
Next we will look at calls to virtual member functions. For comparison, let's first look at calls to ordinary member functions.
Run the following code:
C010 obj;
PRINT_OBJ_ADR(obj)
obj.foo();
C012::sfoo();
C010 * pt = &obj;
pt->foo();
The result is as follows:
obj's address is : 0012f843
This is the memory address of the object obj.
First, let's look at the ordinary member function call through the object, obj.foo();. The corresponding assembly code is:
00422e09 lea ecx, [ebp+fffff967h]
00422e0f call 0041e289
The first line stores the object address in the ECX register. After executing this instruction, the value in ECX is 0x0012f843, which is the address printed above. If the function took parameters, we would see some push instructions before it. In the second line, the call target is a direct address, which means static binding: the address of the function was already determined by the compiler at compile time.
Stepping into the call, we find that the target is a jump instruction. Following the jump, we reach the real function code, shown below (note: I added a line number at the start of each line for convenience of discussion):
01 00425fe0 push ebp
02 00425fe1 mov ebp, esp
03 00425fe3 sub esp, 0cch
04 00425fe9 push ebx
05 00425fea push esi
06 00425feb push edi
07 00425fec push ecx
08 00425fed lea edi, [ebp+ffffff34h]
09 00425ff3 mov ecx, 33h
10 00425ff8 mov eax, 0cccccccch
11 00425ffd rep stos dword ptr [edi]
12 00425fff pop ecx
13 00426000 mov dword ptr [ebp-8], ecx
14 00426003 mov eax, dword ptr [ebp-8]
15 00426006 mov byte ptr [eax], 2
16 00426009 pop edi
17 0042600a pop esi
18 0042600b pop ebx
19 0042600c mov esp, ebp
20 0042600e pop ebp
21 0042600f ret
Line 07 pushes the ECX register onto the stack. The next four lines (08 to 11) initialize the part of the stack frame that holds local variables, and they overwrite ECX in the process. Line 12 pops the saved value back into ECX; this value is the memory address of the object that was stored there before the call. Line 13 saves it, the this pointer, as a local variable. From this we learn that VC7.1 does not pass the this pointer by pushing it on the stack as it does for ordinary parameters, but passes it in the ECX register. Lines 14 and 15 use the this pointer to assign a value to the member variable of the object.
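The idea that a member function simply receives the object address as a hidden argument can be sketched in ordinary C++. The struct Plain and the free function foo_explicit below are hypothetical stand-ins, not the article's C010; the sketch only shows that both call styles have the same effect, and says nothing about which register a given compiler uses.

```cpp
#include <cassert>

// Hypothetical stand-in for a class with one char member and no virtuals.
struct Plain {
    char c_;
    Plain() : c_(0x01) {}
    void foo() { c_ = 0x02; }              // `this` is supplied implicitly
};

// The same operation as a free function that takes the object address
// explicitly. This is roughly what the generated code does; VC7.1 happens
// to pass the address in ECX instead of pushing it on the stack.
void foo_explicit(Plain* self) { self->c_ = 0x02; }

// Checks that both call styles leave the object in the same state.
void check_same_effect() {
    Plain a, b;
    a.foo();               // compiler passes &a behind the scenes
    foo_explicit(&b);      // we pass &b ourselves
    assert(a.c_ == 0x02 && b.c_ == 0x02);
}
```

Calling check_same_effect() from any test driver should pass silently, since the two functions perform the same store through the object address.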
Let's look at the assembly code for calling static member functions:
00422e14 call 0041dd84
The call is very direct because no this pointer has to be set up. Stepping into the assembly code of the function confirms that it does not handle a this pointer either. That code is not listed here.
Now let's look at calling the ordinary member function through a pointer, pt->foo();. The resulting assembly code is as follows:
00422e25 mov ecx, dword ptr [ebp+fffff958h]
00422e2b call 0041e289
It is similar to the code for calling an ordinary member function through an object. The only difference is how the object address gets into ECX: here it is loaded from the pointer variable pt (mov) instead of being computed from the object's position on the stack (lea).
Calling virtual functions
Now let's look at calls to virtual member functions. Class C041 contains one virtual member function. Its definition is as follows:
struct C041
{
C041() : c_(0x01) {}
virtual void foo() { c_ = 0x02; }
char c_;
};
Run the following code:
C041 obj;
PRINT_DETAIL(C041, obj)
PRINT_VTABLE_ITEM(obj, 0, 0)
obj.foo();
C041 * pt = &obj;
pt->foo();
The result is as follows:
The detail of C041 is 14 b3 45 00 01
obj : objadr:0012f824 vpadr:0012f824 vtadr:0045b314 vtival(0):0041df1e
We printed the memory layout of the C041 object and its virtual table information.
Let's take a look at the assembly code for obj.foo();:
004230df lea ecx, [ebp+fffff948h]
004230e5 call 0041df1e
It is the same as the assembly code generated for calling an ordinary member function, as described in part five. This shows that when a function is called through an object, the call is statically bound even if the function is virtual: the address of the function is determined at compile time, and no polymorphism occurs.
Let's take a look at the assembly code of the function.
01 004263f0 push ebp
02 004263f1 mov ebp, esp
03 004263f3 sub esp, 0cch
04 004263f9 push ebx
05 004263fa push esi
06 004263fb push edi
07 004263fc push ecx
08 004263fd lea edi, [ebp+ffffff34h]
09 00426403 mov ecx, 33h
10 00426408 mov eax, 0cccccccch
11 0042640d rep stos dword ptr [edi]
12 0042640f pop ecx
13 00426410 mov dword ptr [ebp-8], ecx
14 00426413 mov eax, dword ptr [ebp-8]
15 00426416 mov byte ptr [eax+4], 2
16 0042641a pop edi
17 0042641b pop esi
18 0042641c pop ebx
19 0042641d mov esp, ebp
20 0042641f pop ebp
21 00426420 ret
Lines 14 and 15 are worth noting. Line 14 moves the value of the this pointer into the EAX register, and line 15 assigns a value to the first member variable of the class. Here the address of the variable is obtained with [eax+4]; that is, the 4-byte virtual table pointer at the beginning of the object layout is skipped.
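The [eax+4] offset can be observed from C++ itself by measuring how far the member sits from the start of the object. The sketch below reuses the C041 definition from above; the helper offset_of_c is a hypothetical name introduced here. It assumes the compiler stores the virtual table pointer at the start of the object, which is what VC7.1 and other common compilers do, but the C++ standard does not require it.

```cpp
#include <cassert>
#include <cstddef>

struct C041 {
    C041() : c_(0x01) {}
    virtual void foo() { c_ = 0x02; }
    char c_;
};

// Byte distance from the start of the object to its c_ member.
std::ptrdiff_t offset_of_c(const C041& obj) {
    const char* base = reinterpret_cast<const char*>(&obj);
    const char* member = &obj.c_;
    return member - base;
}

// Under the assumption stated above, c_ comes right after one pointer.
void check_offset() {
    C041 obj;
    assert(offset_of_c(obj) == static_cast<std::ptrdiff_t>(sizeof(void*)));
}
```

On a 32-bit build the measured offset would be 4, matching the listing; on a 64-bit build it would be 8, because the virtual table pointer is twice as wide.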
Next let's look at calling the virtual function through a pointer, pt->foo();. The resulting assembly code is as follows:
01 004230f6 mov eax, dword ptr [ebp+fffff900h]
02 004230fc mov edx, dword ptr [eax]
03 004230fe mov esi, esp
04 00423100 mov ecx, dword ptr [ebp+fffff900h]
05 00423106 call dword ptr [edx]
Line 01 moves the address held in pt into the EAX register, so EAX holds the memory address of the object, which is also the address of the object's virtual table pointer. Line 02 loads the value stored at the address in EAX (note: not the value of EAX itself) into the EDX register; this value is the address of the virtual table. After executing these two instructions, the values in EAX and EDX are exactly the vpadr and vtadr values printed earlier for obj, namely 0x0012f824 and 0x0045b314. Line 04 again uses the ECX register to pass the object address, that is, the value of the this pointer. In the call instruction on line 05, the destination is no longer a direct function address, as it was when calling through an object. Instead, the value in EDX is used as a pointer for an indirect call. We already know that EDX holds the address of the virtual table, and that the virtual table is an array of pointers, so line 05 calls through the first entry of the virtual table, which is the address of C041::foo(). If the index of the virtual table entry for the called function were not 0, we would see EDX plus an offset of the index times 4. The value of dword ptr [edx] is 0x0041df1e, the same as the vtival(0) value we printed. As mentioned above, this address is not the real function address. It is a jump instruction, from which execution proceeds to the real function code listed above.
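The indirect call can be imitated by hand with an explicit table of function pointers. This is only a sketch of the mechanism; the names Obj, Slot, Obj_vtable, Obj_init and call_virtual are all hypothetical, and a real compiler builds the table and the hidden pointer for you.

```cpp
#include <cassert>

struct Obj;                                   // forward declaration
typedef void (*Slot)(Obj* self);              // one virtual table entry

struct Obj {
    const Slot* vptr;                         // hidden in real C++ objects
    char c_;
};

void Obj_foo(Obj* self) { self->c_ = 0x02; }  // plays the role of C041::foo

const Slot Obj_vtable[] = { &Obj_foo };       // entry 0 of the table

void Obj_init(Obj* self) {                    // what a constructor would do
    self->vptr = Obj_vtable;
    self->c_ = 0x01;
}

// Mirrors lines 01 to 05 of the listing: take the object address, load the
// table address from the object, then call through entry `index`, passing
// the object address as the `this` argument.
void call_virtual(Obj* pt, int index) {
    const Slot* table = pt->vptr;             // mov edx, dword ptr [eax]
    table[index](pt);                         // call dword ptr [edx+index*4] on 32-bit
}

void check_manual_dispatch() {
    Obj o;
    Obj_init(&o);
    call_virtual(&o, 0);
    assert(o.c_ == 0x02);
}
```

The function that runs is chosen by whatever table the object's vptr refers to, not by the static type of pt, which is the property the compiler-generated code relies on.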
What we have just seen is the process of dynamic binding. Because we call the virtual member function through a pointer, dynamic binding is generated even when the pointer type is the same as the object type. To preserve the semantics of polymorphism, the compiler does not, as it does for static binding, fix a definite address at compile time when it generates the call instruction. Instead, it goes through the virtual table pointer stored in the object to find the virtual table for the object's type, and then takes the function address stored in the corresponding entry of that table. In this way, the function actually called does not depend on the pointer type, only on the object itself: the virtual table pointer is stored in the object, and the virtual table depends only on the object's type. This is how polymorphism arises.
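A small sketch makes the consequence concrete: the same call site, compiled once against the base type, reaches different functions depending on the object it is given. The classes Base and Derived and the function call_through_base are hypothetical names used only for this illustration.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int id() { return 1; }
};

struct Derived : Base {
    virtual int id() { return 2; }   // overrides Base::id
};

// One call site whose static pointer type is always Base*.
int call_through_base(Base* p) { return p->id(); }

void check_dynamic_binding() {
    Base b;
    Derived d;
    assert(call_through_base(&b) == 1);   // Base's table is consulted
    assert(call_through_base(&d) == 2);   // Derived's table is consulted
}
```

By contrast, copying a Derived into a Base object and calling id() on the copy would return 1, because the copy is a genuine Base object with Base's virtual table pointer.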
Recall the C071 class discussed earlier (in part two). When a subclass overrides a virtual function inherited from its parent class, the content of the subclass's virtual table changes and differs from the content of the parent class's virtual table (see the subclass and parent-class virtual table information printed in part two). You can trace the call yourself for the case where an overridden virtual function is called through a parent-class pointer that points to a subclass object. If you are interested, debug it yourself; it is not listed here.
In addition, we discussed the dynamic conversion of pointer types in part four. Here we use the classes C041, C042, and C051 to see how it works. For the definitions of these classes, see part three. Class C051 inherits from both C041 and C042, and both of those classes have virtual functions. Run the following code:
C051 obj;
C041 * pt1 = dynamic_cast<C041 *>(&obj);
C042 * pt2 = dynamic_cast<C042 *>(&obj);
pt1->foo();
pt2->foo2();
The assembly code corresponding to the first dynamic cast is:
00404b59 lea eax, [ebp+fffff8ech]
00404b5f mov dword ptr [ebp+fffff8e0h], eax
Because the pointer position does not need to be adjusted, the object address is taken and assigned directly to the pointer.
The second dynamic cast involves adjusting the pointer position. Let's take a look at its assembly code:
01 00404b65 lea eax, [ebp+fffff8ech]
02 00404b6b test eax, eax
03 00404b6d je 00404b7d
04 00404b6f lea ecx, [ebp+fffff8f1h]
05 00404b75 mov dword ptr [ebp+fffff04ch], ecx
06 00404b7b jmp 00404b87
07 00404b7d mov dword ptr [ebp+fffff04ch], 0
08 00404b87 mov edx, dword ptr [ebp+fffff04ch]
09 00404b8d mov dword ptr [ebp+fffff8d4h], edx
This code is much more complex. The &obj operation yields a pointer, and the first three lines check whether that pointer is null. Oddly, line 04 does not adjust the pointer starting from the address in EAX (the starting address of the object), but loads the address [ebp+fffff8f1h] directly into the ECX register. In line 01, [ebp+fffff8ech] is the object address; the number added to EBP is really a negative number in two's complement form, that is, the offset of the object from EBP. The difference between the two numbers is 5 bytes. So line 04 directly takes the address that the pointer has after adjustment, which points to the part of the object that belongs to C042. The remaining code uses a temporary variable and the EDX register to store the adjusted pointer value into pt2.
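Both effects described above, the shifted address and the null check, can be observed from C++. The classes A, B and D below are hypothetical stand-ins for C041, C042 and C051, whose definitions are given in part three and not repeated here. The sketch assumes the compiler places the first base at the start of the object, as VC7.1 and other common compilers do; the exact offset of the second base depends on pointer size and padding.

```cpp
#include <cassert>
#include <cstddef>

struct A { virtual ~A() {} virtual void foo() {}  char a_; };
struct B { virtual ~B() {} virtual void foo2() {} char b_; };
struct D : A, B {};

// Conversion to the second base; the compiler inserts the adjustment
// and the null check.
B* to_B(D* d) { return d; }

// Byte offset that the conversion D* -> A* adds to the object address.
std::ptrdiff_t offset_to_A(D* d) {
    A* pa = d;
    return reinterpret_cast<char*>(pa) - reinterpret_cast<char*>(d);
}

// Byte offset that the conversion D* -> B* adds to the object address.
std::ptrdiff_t offset_to_B(D* d) {
    return reinterpret_cast<char*>(to_B(d)) - reinterpret_cast<char*>(d);
}

void check_adjustment() {
    D obj;
    assert(offset_to_A(&obj) == 0);             // first base: no adjustment
    assert(offset_to_B(&obj) > 0);              // second base: pointer is shifted
    assert(to_B(static_cast<D*>(0)) == 0);      // null stays null
}
```

The last check is why the generated code tests the pointer first: adding the offset to a null pointer would produce a non-null garbage address, so the compiler must skip the adjustment in that case.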
Since &obj can never be null here, this code could in fact be optimized into two lines:
lea eax, [ebp+fffff8f1h]
mov dword ptr [ebp+fffff8d4h], eax
In part three we mentioned that class C051 has two virtual tables, and that its objects have two virtual table pointers. The reason the two are not merged is to support the conversion of pointer types. Combined with the earlier discussion of polymorphism, this is now easier to understand. When pt2->foo2(); is called, the object type is still C051, but after the cast pt2 points to the start of the part of the object that belongs to C042, that is, to the second virtual table pointer. In this way, the function call needs no extra processing. Let's take a look at the assembly code generated for pt1->foo(); and pt2->foo2();.
01 00404b93 mov eax, dword ptr [ebp+fffff8e0h]
02 00404b99 mov edx, dword ptr [eax]
03 00404b9b mov esi, esp
04 00404b9d mov ecx, dword ptr [ebp+fffff8e0h]
05 00404ba3 call dword ptr [edx]
06 00404ba5 cmp esi, esp
07 00404ba7 call 0041ddde
08 00404bac mov eax, dword ptr [ebp+fffff8d4h]
09 00404bb2 mov edx, dword ptr [eax]
10 00404bb4 mov esi, esp
11 00404bb6 mov ecx, dword ptr [ebp+fffff8d4h]
12 00404bbc call dword ptr [edx]
13 00404bbe cmp esi, esp
14 00404bc0 call 0041ddde
The first seven lines are pt1->foo(); and the last seven lines are pt2->foo2();. The only difference is that the two pointers hold different addresses; the call mechanism is the same.
(To be continued)