Reference: http://blog.csdn.net/wanwenweifly4/article/details/6739687
The parts in red were added by me; everything else is from the original author.
I read the article above, "Understanding the C++ reference implementation mechanism from the underlying compilation layer", and thought it was good, so I verified the programs in it on my own machine.
g++ version used: g++ (GCC) 4.5.1 20100924
To see how the compiled code relates to the source code, I used the following method:
Use g++ to generate an object file with debugging information: g++ -g -c ref.cc
Then run objdump on the object file ref.o to show the source interleaved with the disassembly: objdump -S ref.o
What is a reference type? How is it related to pointers? Does it occupy memory? Let's analyze these questions. First, look at the code:
#include <stdio.h>
#include <iostream>
using namespace std;
void main()
{
    int x = 1;
    int &b = x;
}
My version:
int main()
{
    int x = 1;
    int &b = x;
    return 0;
}
Viewing the code as assembly:
9:    int x = 1;
00401048   mov   dword ptr [ebp-4],1
10:   int &b = x;
0040104f   lea   eax,[ebp-4]
00401052   mov   dword ptr [ebp-8],eax
My version (objdump output):
00000000 <main>:
int main()
{
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 10                sub    $0x10,%esp
    int x = 1;
   6:   c7 45 f8 01 00 00 00    movl   $0x1,-0x8(%ebp)
    int &b = x;
   d:   8d 45 f8                lea    -0x8(%ebp),%eax
  10:   89 45 fc                mov    %eax,-0x4(%ebp)
    return 0;
  13:   b8 00 00 00 00          mov    $0x0,%eax
}
  18:   c9                      leave
  19:   c3                      ret
From the original author's listing we can see that x lives at ebp-4 and b lives at ebp-8. This compiler allocates local variables on the stack from high addresses to low, so b's address is lower than x's.
lea eax,[ebp-4] puts the address of x (ebp-4) into the eax register.
mov dword ptr [ebp-8],eax stores the value of eax at b's address (ebp-8).
Together, these two instructions store the address of x in the variable b. Isn't that exactly what happens when the address of a variable is stored in a pointer variable?
Therefore, at the assembly level, the reference is indeed implemented as a pointer.
Next we verify this with a program. We know that at the source level, whenever we use a reference variable directly we are always operating on the referenced variable; in other words, the compiler implicitly puts a * in front of the reference for us. So to read the value actually stored in the reference variable itself, we need a trick: we bypass this compiler behavior by relying on how the variables are laid out on the stack.
#include <stdio.h>
#include <iostream>
using namespace std;
void main()
{
    int x = 1;
    int y = 2;
    int &b = x;
    printf("&x = %x, &y = %x, &b = %x, b = %x\n", &x, &y, &y - 1, *(&y - 1));
}
Output: &x = 12ff7c, &y = 12ff78, &b = 12ff74, b = 12ff7c
My version:
#include <cstdio>
int main()
{
    int x = 1;
    int y = 2;
    int &b = x;
    printf("&x = %x, &y = %x, &b = %x, b = %x\n", &x, &y, &y - 1, *(&y - 1));
    return 0;
}
My output: &x = bfe1b308, &y = bfe1b304, &b = bfe1b300, b = 8048460
My results differ from the original author's; see the explanation further below.
#include <stdio.h>
void main()
{
    int x = 1;
    int &b = x;
    printf("&x = %x, &b = %x\n", &x, &b);
}
Output: &x = 12ff7c, &b = 12ff7c
My version:
#include <cstdio>
int main()
{
    int x = 1;
    int &b = x;
    printf("&x = %x, &b = %x\n", &x, &b);
    return 0;
}
My output: &x = bfe74aa8, &b = bfe74aa8
The address of b cannot be obtained with &b, because the compiler interprets &b as &(*b) = &x, so &b yields &x. This also confirms that every operation on b is equivalent to the same operation on x.
But we can get the address of b indirectly as &y - 1, and from it the value of b as *(&y - 1). The result shows that the value of b is the address of x. So at the implementation level, the reference variable really does store the address of the referenced object. This is simply transparent to programmers working in the high-level language, where the compiler hides the pointer behind the reference.
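The program's variables are laid out on the stack as follows (addresses taken from the original author's output above):
12ff7c   x   1
12ff78   y   2
12ff74   b   12ff7c (the address of x)
The reference variable also occupies memory: 4 bytes here, the size of a pointer on this 32-bit build.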
Although a reference is a pointer at the implementation level, from the perspective of the high-level language we cannot say that a reference is a pointer. They are two completely different concepts. Some people say that a reference is a restricted pointer. I do not agree with this, because at the language level pointers and references are unrelated: a reference is simply an alias for another variable, and any operation on the reference is equivalent to the same operation on the referenced variable. At the language level we should not think about the underlying implementation, because it is transparent to us. So if an interviewer asks this question, first describe references at the language level and then analyze the underlying implementation, rather than flatly claiming that a reference is just a pointer and there is no difference.
For the following program:
#include <cstdio>
int main()
{
    int x = 1;
    int y = 2;
    int &b = x;
    printf("&x = %x, &y = %x, &b = %x, b = %x\n", &x, &y, &y - 1, *(&y - 1));
    return 0;
}
My result is &x = bfe1b308, &y = bfe1b304, &b = bfe1b300, b = 8048460
This differs from the original author's result, so I disassembled the program and got the following:
00000000 <main>:
#include <cstdio>
int main()
{
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 e4 f0                and    $0xfffffff0,%esp
   6:   83 ec 30                sub    $0x30,%esp
    int x = 1;
   9:   c7 44 24 28 01 00 00    movl   $0x1,0x28(%esp)    # store 1 in x (x is at 0x28 on the stack)
  10:   00
    int y = 2;
  11:   c7 44 24 24 02 00 00    movl   $0x2,0x24(%esp)    # store 2 in y (y is at 0x24 on the stack)
  18:   00
    int &b = x;
  19:   8d 44 24 28             lea    0x28(%esp),%eax    # load the address of x (0x28) into %eax
  1d:   89 44 24 2c             mov    %eax,0x2c(%esp)    # store %eax at 0x2c on the stack (this is the key step)
    printf("&x = %x, &y = %x, &b = %x, b = %x\n", &x, &y, &y - 1, *(&y - 1));
  21:   8d 44 24 24             lea    0x24(%esp),%eax    # load the address of stack slot 0x24 into %eax
  25:   83 e8 04                sub    $0x4,%eax          # subtract 4 from %eax
  28:   8b 10                   mov    (%eax),%edx        # load the value at the address in %eax into %edx
  2a:   8d 44 24 24             lea    0x24(%esp),%eax    # load the address of stack slot 0x24 into %eax
  2e:   83 e8 04                sub    $0x4,%eax          # subtract 4 from %eax
  31:   89 54 24 10             mov    %edx,0x10(%esp)    # store %edx at 0x10 on the stack
  35:   89 44 24 0c             mov    %eax,0xc(%esp)     # store %eax at 0xc on the stack
  39:   8d 44 24 24             lea    0x24(%esp),%eax    # load the address of stack slot 0x24 into %eax
  3d:   89 44 24 08             mov    %eax,0x8(%esp)     # store %eax at 0x8 on the stack
  41:   8d 44 24 28             lea    0x28(%esp),%eax    # load the address of stack slot 0x28 into %eax
  45:   89 44 24 04             mov    %eax,0x4(%esp)     # store %eax at 0x4 on the stack
  49:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  50:   e8 fc ff ff ff          call   51 <main+0x51>
    return 0;
  55:   b8 00 00 00 00          mov    $0x0,%eax
}
  5a:   c9                      leave
  5b:   c3                      ret
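Every instruction above is annotated.
On my machine, the generated code stores the address of x in the stack slot immediately above x (0x2c(%esp), with x at 0x28(%esp)), not in the slot below y. So &y - 1 points at 0x20(%esp), which does not hold b, and that is why my program printed 8048460 instead of the address of x.
The code generated for the printf call also shows that the arguments are computed and stored in reverse order (*(&y - 1), &y - 1, &y, &x). This matches the 32-bit x86 calling convention used here, in which arguments are placed on the stack from right to left. The C and C++ standards themselves do not specify the order in which function arguments are evaluated.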
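To verify that b sits in the slot just above x, change &y - 1 and *(&y - 1) in the program above to &x + 1 and *(&x + 1).
Result:
&x = bf9a74c8, &y = bf9a74c4, &b = bf9a74cc, b = bf9a74c8
This agrees with the original author's conclusion: the reference variable b holds the address of x.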