The difference between an array name and the address of an array name
What does the following code print?
- #include <stdio.h>
- int a[2] = {1, 2};
- int main(void) {
-     printf("a = %p\n", (void *)a);             // I
-     printf("&a = %p\n", (void *)&a);           // II
-     printf("a + 1 = %p\n", (void *)(a + 1));   // III
-     printf("&a + 1 = %p\n", (void *)(&a + 1)); // IV
-     return 0;
- }
Output on my machine (Linux):
a = 0x804a014
&a = 0x804a014
a + 1 = 0x804a018
&a + 1 = 0x804a01c
Yes, the addresses printed by I and II are the same, and the address printed by IV is 4 bytes greater than the one printed by III. Here is my explanation of this behavior. If anything in it is wrong, please point it out:
First, we cite the theory on p. 141 of "Pointers on C":
In C, in almost all expressions that use an array name, the value of the array name is a pointer constant, which is the address of the first element of the array. Its type depends on the type of the array elements: if they are of type int, then the type of the array name is "constant pointer to int".
With this in mind, it should be clear why I and III give the results they do.
II and IV are special cases. Page 142 of "Pointers on C" says that there are two situations in which the array name is not represented by a pointer constant: when the array name is the operand of the sizeof operator or of the unary & operator. sizeof returns the size of the entire array, not the size of a pointer to the array. Taking the address of an array name yields a pointer to the array, not a pointer to a pointer constant.
So the pointer produced by &a is a pointer to an array, which has a different type from the pointer that a itself converts to (a pointer to a[0]).
Next, we use the symbol table and the assembly code to see how the compiler distinguishes &a from a and translates each of them into assembly:
The symbol table obtained with nm a.out is as follows:
- ... (a number of symbols unrelated to this topic omitted)
- 0804a01c A _edata
- 0804a024 A _end
- 080484ec T _fini
- 08048508 R _fp_hw
- 080482bc T _init
- 08048330 T _start
- 0804a014 D a // the variable a is stored at virtual address 0x0804a014
- 0804a01c b completed.7021
- 0804a00c W data_start
- 0804a020 b dtor_idx.7023
- 080483c0 T frame_dummy
- 080483e4 T main // address of the main function
-          U printf@@GLIBC_2.0
Run gcc -S xx.c to get the assembly code:
-     .file "name_of_array.c"
- .globl a
-     .data
-     .align 4
-     .type a, @object
-     .size a, 8 // from here we know that sizeof(a) equals 8
- a:
-     .long 1 // each int is emitted with the .long directive, which reserves 4 bytes here (an assembler directive, not the C long type)
-     .long 2
-     .section .rodata
- .LC0:
-     .string "a = %p\n"
- .LC1:
-     .string "&a = %p\n"
- .LC2:
-     .string "a + 1 = %p\n"
- .LC3:
-     .string "&a + 1 = %p\n"
-     .text
- .globl main
-     .type main, @function
- main:
-     pushl %ebp
-     movl %esp, %ebp
-     andl $-16, %esp
-     subl $16, %esp
-     movl $.LC0, %eax // assembly for I
-     movl $a, 4(%esp)
-     movl %eax, (%esp)
-     call printf
-     movl $.LC1, %eax // assembly for II
-     movl $a, 4(%esp)
-     movl %eax, (%esp)
-     call printf
-     movl $.LC2, %eax // assembly for III
-     movl $a+4, 4(%esp)
-     movl %eax, (%esp)
-     call printf
-     movl $a+8, %edx // assembly for IV
-     movl $.LC3, %eax
-     movl %edx, 4(%esp)
-     movl %eax, (%esp)
-     call printf
-     movl $0, %eax
-     leave
-     ret
-     .size main, .-main
-     .ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
-     .section .note.GNU-stack,"",@progbits
I: the corresponding assembly is movl $a, 4(%esp)
The $ prefix marks an immediate operand, so $a is the address of the symbol a. From the symbol table we know that a is at 0x0804a014, so this line prints 0x0804a014. But the C code we wrote is printf("a = %p\n", a). If a were an ordinary int variable rather than an array name, the corresponding assembly would be movl a, 4(%esp), which loads the value stored at a. So why does the compiled code take the address of a? Because an array name used in an expression is converted to a pointer to its first element, the compiler emits the address $a instead of a load from a.
Conclusion: when the user does not write & explicitly, the compiler still emits the address $a for the array name, and the type of the resulting pointer is determined by the element type of the array.
II: skipped (the assembly is the same movl $a, 4(%esp) as for I).
III: movl $a+4, 4(%esp)
The array name a yields the address $a. Because the element type is int, the pointer moves 4 bytes per step, so the C expression a + 1 is translated to $a+4 in assembly.
IV: movl $a+8, %edx
The corresponding C code is printf("&a + 1 = %p\n", &a + 1). According to the rule quoted earlier, when a is the operand of the & operator, the compiler treats the address of a from the symbol table as a pointer to the array. Since sizeof(a) is 8, &a + 1 is translated to $a+8.
Conclusion: when the user writes & explicitly, the address of a is given the type "pointer to the array".
Summary: the compiler decides the type of the pointer by whether the user writes &, and then translates the expression into the corresponding assembly. In other words, for an array the & operator does not produce a separate address-taking operation. It only tells the compiler which pointer type the address of a should be treated as.
Page 201 of "Expert C Programming" notes that an array name in an expression (as opposed to a declaration) is treated by the compiler as a pointer to the first element of the array.
- #include "stdafx.h"
- #include <iostream>
- using namespace std;
-
- int _tmain(int argc, _TCHAR* argv[])
- {
-     // The int casts below assume a 32-bit build, where a pointer fits in an int.
-     int *p = NULL;
-     char a[10];
-     cout << (int(p + 1) - int(p)) << endl;
-
-     cout << (int(&a + 1) - int(&a)) << endl; // equivalent to the following
-     char (*b)[10]; // b is a pointer to an array
-     b = &a;
-     cout << (int(b + 1) - int(b)) << endl;
-
-     cout << (int(a + 1) - int(a)) << endl; // equivalent to the following
-     cout << (int(&a[0] + 1) - int(&a[0])) << endl;
-
-     return 0;
- }
Output:
4
10
10
1
1