Exploring the Principles of Coredump on Linux x86, Section 6.4: Virtual Function Tables and Virtual Functions
In the previous section, we explored how class member variables are laid out. Now let's look at how the virtual function table is placed relative to the member variables, and how the virtual functions are ordered relative to each other.
Let's take a look at an example:
    #include <stdio.h>
    class xuzhina_dump_c06_s3
    {
        private:
            int m_a;
        public:
            xuzhina_dump_c06_s3() { m_a = 0; }
            virtual void inc() { m_a++; }
            virtual void dec() { m_a--; }
            virtual void print()
            {
                printf( "%d\n", m_a );
            }
    };

    int main()
    {
        xuzhina_dump_c06_s3* test = new xuzhina_dump_c06_s3;
        if ( test != NULL )
        {
            test->inc();
            test->inc();
            test->print();
        }
        return 0;
    }
Assembly code:
    (gdb) disassemble main
    Dump of assembler code for function main:
       0x08048560 <+0>:     push   %ebp
       0x08048561 <+1>:     mov    %esp,%ebp
       0x08048563 <+3>:     push   %ebx
       0x08048564 <+4>:     and    $0xfffffff0,%esp
       0x08048567 <+7>:     sub    $0x20,%esp
       0x0804856a <+10>:    movl   $0x8,(%esp)
       0x08048571 <+17>:    call   0x8048450 <_Znwj@plt>
       0x08048576 <+22>:    mov    %eax,%ebx
       0x08048578 <+24>:    mov    %ebx,(%esp)
       0x0804857b <+27>:    call   0x80485cc <_ZN19xuzhina_dump_c06_s3C2Ev>
       0x08048580 <+32>:    mov    %ebx,0x1c(%esp)
       0x08048584 <+36>:    cmpl   $0x0,0x1c(%esp)
       0x08048589 <+41>:    je     0x80485c1 <main+97>
       0x0804858b <+43>:    mov    0x1c(%esp),%eax
       0x0804858f <+47>:    mov    (%eax),%eax
       0x08048591 <+49>:    mov    (%eax),%eax
       0x08048593 <+51>:    mov    0x1c(%esp),%edx
       0x08048597 <+55>:    mov    %edx,(%esp)
       0x0804859a <+58>:    call   *%eax
       0x0804859c <+60>:    mov    0x1c(%esp),%eax
       0x080485a0 <+64>:    mov    (%eax),%eax
       0x080485a2 <+66>:    mov    (%eax),%eax
       0x080485a4 <+68>:    mov    0x1c(%esp),%edx
       0x080485a8 <+72>:    mov    %edx,(%esp)
       0x080485ab <+75>:    call   *%eax
       0x080485ad <+77>:    mov    0x1c(%esp),%eax
       0x080485b1 <+81>:    mov    (%eax),%eax
       0x080485b3 <+83>:    add    $0x8,%eax
       0x080485b6 <+86>:    mov    (%eax),%eax
       0x080485b8 <+88>:    mov    0x1c(%esp),%edx
       0x080485bc <+92>:    mov    %edx,(%esp)
       0x080485bf <+95>:    call   *%eax
       0x080485c1 <+97>:    mov    $0x0,%eax
       0x080485c6 <+102>:   mov    -0x4(%ebp),%ebx
       0x080485c9 <+105>:   leave
       0x080485ca <+106>:   ret
    End of assembler dump.
From the source code, we expect that once the constructor has run, the m_a member of test is 0. From the disassembly, we can see that the this pointer is held in the ebx register both before and after the constructor call.
Set breakpoints at 0x08048578 and 0x08048580, and check whether the memory that the this pointer refers to looks the way we expect.
    (gdb) tbreak *0x08048578
    Temporary breakpoint 1 at 0x8048578
    (gdb) tbreak *0x08048580
    Temporary breakpoint 2 at 0x8048580
    (gdb) r
    Starting program: /home/buckxu/work/6/3/xuzhina_dump_c6_s3
    Temporary breakpoint 1, 0x08048578 in main ()
    (gdb) x /4x $ebx
    0x804a008:      0x00000000      0x00000000      0x00000000      0x00020ff1
    (gdb) c
    Continuing.
    Temporary breakpoint 2, 0x08048580 in main ()
    (gdb) x /4x $ebx
    0x804a008:      0x080486d0      0x00000000      0x00000000      0x00020ff1
This is strange. According to the previous section, m_a should be stored at address 0x804a008 and initialized to 0. What has the constructor of xuzhina_dump_c06_s3 actually done? What is 0x080486d0?
Take a look at the constructor of class xuzhina_dump_c06_s3:
    (gdb) disassemble _ZN19xuzhina_dump_c06_s3C2Ev
    Dump of assembler code for function _ZN19xuzhina_dump_c06_s3C2Ev:
       0x080485cc <+0>:     push   %ebp
       0x080485cd <+1>:     mov    %esp,%ebp
       0x080485cf <+3>:     mov    0x8(%ebp),%eax
       0x080485d2 <+6>:     movl   $0x80486d0,(%eax)
       0x080485d8 <+12>:    mov    0x8(%ebp),%eax
       0x080485db <+15>:    movl   $0x0,0x4(%eax)
       0x080485e2 <+22>:    pop    %ebp
       0x080485e3 <+23>:    ret
    End of assembler dump.
From the disassembly of the constructor, we can see that it stores the value 0x80486d0 at offset 0 of the object, but it is not yet clear what that value is. Meanwhile, the instructions
    0x080485d8 <+12>:    mov    0x8(%ebp),%eax
    0x080485db <+15>:    movl   $0x0,0x4(%eax)
correspond exactly to the constructor in the source:
    xuzhina_dump_c06_s3() { m_a = 0; }
That is to say, m_a, the first member variable of class xuzhina_dump_c06_s3, is placed at offset 4 from the this pointer. So what is 0x80486d0, which occupies the position where we expected m_a to be?
Let's look at the disassembly of the main function again:
    (gdb) disassemble main
    Dump of assembler code for function main:
       0x08048560 <+0>:     push   %ebp
       0x08048561 <+1>:     mov    %esp,%ebp
       0x08048563 <+3>:     push   %ebx
       0x08048564 <+4>:     and    $0xfffffff0,%esp
       0x08048567 <+7>:     sub    $0x20,%esp
       0x0804856a <+10>:    movl   $0x8,(%esp)
       0x08048571 <+17>:    call   0x8048450 <_Znwj@plt>
       0x08048576 <+22>:    mov    %eax,%ebx
       0x08048578 <+24>:    mov    %ebx,(%esp)
       0x0804857b <+27>:    call   0x80485cc <_ZN19xuzhina_dump_c06_s3C2Ev>
    => 0x08048580 <+32>:    mov    %ebx,0x1c(%esp)
       0x08048584 <+36>:    cmpl   $0x0,0x1c(%esp)
       0x08048589 <+41>:    je     0x80485c1 <main+97>
       0x0804858b <+43>:    mov    0x1c(%esp),%eax
       0x0804858f <+47>:    mov    (%eax),%eax
       0x08048591 <+49>:    mov    (%eax),%eax
       0x08048593 <+51>:    mov    0x1c(%esp),%edx
       0x08048597 <+55>:    mov    %edx,(%esp)
       0x0804859a <+58>:    call   *%eax
       0x0804859c <+60>:    mov    0x1c(%esp),%eax
       0x080485a0 <+64>:    mov    (%eax),%eax
       0x080485a2 <+66>:    mov    (%eax),%eax
       0x080485a4 <+68>:    mov    0x1c(%esp),%edx
       0x080485a8 <+72>:    mov    %edx,(%esp)
       0x080485ab <+75>:    call   *%eax
       0x080485ad <+77>:    mov    0x1c(%esp),%eax
       0x080485b1 <+81>:    mov    (%eax),%eax
       0x080485b3 <+83>:    add    $0x8,%eax
       0x080485b6 <+86>:    mov    (%eax),%eax
       0x080485b8 <+88>:    mov    0x1c(%esp),%edx
       0x080485bc <+92>:    mov    %edx,(%esp)
       0x080485bf <+95>:    call   *%eax
       0x080485c1 <+97>:    mov    $0x0,%eax
       0x080485c6 <+102>:   mov    -0x4(%ebp),%ebx
       0x080485c9 <+105>:   leave
       0x080485ca <+106>:   ret
    End of assembler dump.
From
    0x0804857b <+27>:    call   0x80485cc <_ZN19xuzhina_dump_c06_s3C2Ev>
    0x08048580 <+32>:    mov    %ebx,0x1c(%esp)
we can see that esp + 0x1c is used to store the this pointer.
Now look at these instructions:
    0x0804858b <+43>:    mov    0x1c(%esp),%eax
    0x0804858f <+47>:    mov    (%eax),%eax
    0x08048591 <+49>:    mov    (%eax),%eax
    0x08048593 <+51>:    mov    0x1c(%esp),%edx
    0x08048597 <+55>:    mov    %edx,(%esp)
    0x0804859a <+58>:    call   *%eax
    0x0804859c <+60>:    mov    0x1c(%esp),%eax
    0x080485a0 <+64>:    mov    (%eax),%eax
    0x080485a2 <+66>:    mov    (%eax),%eax
    0x080485a4 <+68>:    mov    0x1c(%esp),%edx
    0x080485a8 <+72>:    mov    %edx,(%esp)
    0x080485ab <+75>:    call   *%eax
    0x080485ad <+77>:    mov    0x1c(%esp),%eax
    0x080485b1 <+81>:    mov    (%eax),%eax
    0x080485b3 <+83>:    add    $0x8,%eax
    0x080485b6 <+86>:    mov    (%eax),%eax
    0x080485b8 <+88>:    mov    0x1c(%esp),%edx
    0x080485bc <+92>:    mov    %edx,(%esp)
    0x080485bf <+95>:    call   *%eax
Because this code is straight-line (there is no branch between the three indirect calls), the three groups of instructions, each ending in call *%eax, correspond exactly to
    test->inc();
    test->inc();
    test->print();
Analyze the third group:
    0x080485ad <+77>:    mov    0x1c(%esp),%eax
    0x080485b1 <+81>:    mov    (%eax),%eax
    0x080485b3 <+83>:    add    $0x8,%eax
    0x080485b6 <+86>:    mov    (%eax),%eax
    0x080485b8 <+88>:    mov    0x1c(%esp),%edx
    0x080485bc <+92>:    mov    %edx,(%esp)
    0x080485bf <+95>:    call   *%eax
At the call instruction, eax holds the address of the virtual function print. That address is ultimately derived from the value stored at esp + 0x1c, that is, from the this pointer. From
    0x080485ad <+77>:    mov    0x1c(%esp),%eax
    0x080485b1 <+81>:    mov    (%eax),%eax
we can see that the lookup starts from the first member of the object that this points to. In other words, that member is a virtual function table pointer. The following add $0x8,%eax then steps 8 bytes into the table (the third 4-byte entry) before the function address is loaded. According to the analysis above, the value of this virtual function table pointer is 0x80486d0. Let's verify that it really points to a virtual function table:
    (gdb) x /4x 0x80486d0
    0x80486d0 <_ZTV19xuzhina_dump_c06_s3+8>:        0x080485e4      0x080485f8      0x0804860c      0x75783931
    (gdb) shell c++filt _ZTV19xuzhina_dump_c06_s3
    vtable for xuzhina_dump_c06_s3
    (gdb) info symbol 0x080485e4
    xuzhina_dump_c06_s3::inc() in section .text of /home/buckxu/work/6/3/xuzhina_dump_c6_s3
    (gdb) info symbol 0x080485f8
    xuzhina_dump_c06_s3::dec() in section .text of /home/buckxu/work/6/3/xuzhina_dump_c6_s3
    (gdb) info symbol 0x0804860c
    xuzhina_dump_c06_s3::print() in section .text of /home/buckxu/work/6/3/xuzhina_dump_c6_s3
It can be seen that 0x80486d0 points into the virtual function table (8 bytes past the symbol _ZTV19xuzhina_dump_c06_s3), and that the entries there appear in exactly the order in which the virtual functions were declared: inc, dec, print.
According to the above analysis, the memory layout of the object pointed to by test is as follows: