Example: consider the following program.
#include <stdio.h>

int main(void)
{
    int x[3][4] = {1, 3, 5, 7, 9, 11, 2, 4, 6, 8, 10, 12};
    int (*p)[4] = x, k = 1, m, n = 0;

    for (m = 0; m < 2; m++)
        n += *(*(p + m) + k);
    printf("%d\n", n);
}
Work out the value the program prints. (I dislike this kind of pen-and-paper exercise, and I believe a programming language is learned through practice. Still, the problem is a classic, so I use it as an example to explain what exactly an array pointer is.)
When I first learned C, I was plagued by two terms: pointer array and array pointer. Part of the blame lies with Chinese, where swapping the two words changes the meaning. Starting from English makes things clearer. A pointer array is an "array of pointers": the focus is the array, and the kind of array is one that stores pointers. An array pointer is a "pointer to an array": the focus is the pointer, and what it points to is an array. Exactly what kind of array it points to is specified by the dimension in square brackets and by the element type written before the declarator.
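To make the distinction concrete, here is a minimal sketch in C. The function names are hypothetical and exist only for illustration; each one reports the size of one of the objects involved, so the difference between the two declarations shows up as a number.

```c
#include <stddef.h>

/* Array of pointers: four separate pointers to int. */
size_t size_of_pointer_array(void)
{
    int *arr_of_ptrs[4];
    return sizeof arr_of_ptrs;      /* 4 * sizeof(int *) */
}

/* Pointer to an array: a single pointer whose target type is int[4]. */
size_t size_of_array_pointer(void)
{
    int (*ptr_to_arr)[4];
    return sizeof ptr_to_arr;       /* size of one pointer */
}

/* The object such a pointer points to is a whole int[4]. */
size_t size_of_pointee(void)
{
    int (*ptr_to_arr)[4];
    return sizeof *ptr_to_arr;      /* 4 * sizeof(int); sizeof does not dereference */
}
```

Assuming a 4-byte int and 4-byte pointers, the three functions would return 16, 4, and 16 respectively.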
Now back to the example. x is defined as a two-dimensional array, and p is an array pointer to an int array of length 4, initialized to point to the first row of x (each row of x is an int array of length 4). In the for loop, p+m is dereferenced to obtain row m, k (which is 1) is added, the result is dereferenced again, and the value is added to n. The loop runs twice, taking the element at index 1 from the first and second rows (first subscript 0 and 1), that is, x[0][1] and x[1][1], so the final output is 3+11=14.
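The loop can also be wrapped in a small function to check the result by machine. This is only a sketch: sum_column is a hypothetical name, and the row count is passed in as a parameter instead of being fixed at 2 as in the original program.

```c
/* Sum the element at index k of each of the first `rows` rows. */
int sum_column(int (*p)[4], int rows, int k)
{
    int n = 0;
    for (int m = 0; m < rows; m++)
        n += *(*(p + m) + k);   /* same element as p[m][k] */
    return n;
}
```

Called as sum_column(x, 2, 1) on the array from the example, it returns 14, matching the hand calculation.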
Analyzing this on paper alone is clearly not enough. GCC generates the following code for the program above:
 1  0x401340  push   %ebp
 2  0x401341  mov    %esp,%ebp
 3  0x401343  and    $0xfffffff0,%esp
 4  0x401346  sub    $0x50,%esp
 5  0x401349  call   0x4019d0 <__main>
 6  0x40134e  movl   $0x1,0x10(%esp)
 7  0x401356  movl   $0x3,0x14(%esp)
 8  0x40135e  movl   $0x5,0x18(%esp)
 9  0x401366  movl   $0x7,0x1c(%esp)
10  0x40136e  movl   $0x9,0x20(%esp)
11  0x401376  movl   $0xb,0x24(%esp)
12  0x40137e  movl   $0x2,0x28(%esp)
13  0x401386  movl   $0x4,0x2c(%esp)
14  0x40138e  movl   $0x6,0x30(%esp)
15  0x401396  movl   $0x8,0x34(%esp)
16  0x40139e  movl   $0xa,0x38(%esp)
17  0x4013a6  movl   $0xc,0x3c(%esp)
18  0x4013ae  lea    0x10(%esp),%eax
19  0x4013b2  mov    %eax,0x44(%esp)
20  0x4013b6  movl   $0x1,0x40(%esp)
21  0x4013be  movl   $0x0,0x48(%esp)
22  0x4013c6  movl   $0x0,0x4c(%esp)
23  0x4013ce  jmp    0x4013f9 <main+185>
24  0x4013d0  mov    0x4c(%esp),%eax
25  0x4013d4  lea    0x0(,%eax,4),%edx
26  0x4013db  mov    0x40(%esp),%eax
27  0x4013df  add    %edx,%eax
28  0x4013e1  lea    0x0(,%eax,4),%edx
29  0x4013e8  mov    0x44(%esp),%eax
30  0x4013ec  add    %edx,%eax
31  0x4013ee  mov    (%eax),%eax
32  0x4013f0  add    %eax,0x48(%esp)
33  0x4013f4  addl   $0x1,0x4c(%esp)
34  0x4013f9  cmpl   $0x1,0x4c(%esp)
35  0x4013fe  jle    0x4013d0 <main+144>
36  0x401400  mov    0x48(%esp),%eax
37  0x401404  mov    %eax,0x4(%esp)
38  0x401408  movl   $0x403024,(%esp)
39  0x40140f  call   0x401c40 <printf>
40  0x401414  leave
41  0x401415  ret
Line 4 allocates 0x50 bytes on the stack for the local (auto) variables. Lines 6 to 17 initialize the two-dimensional array x, with x[0][0] at address %esp+0x10. Lines 18 to 22 initialize p, k, n, and m, in that order. (As can be seen, p is initialized by using the lea instruction to take the address of the first element, and p occupies only 4 bytes. In terms of data size, an array pointer is simply a pointer.)
To study how the compiler handles the array pointer, use the jle instruction to locate the loop at lines 24 to 35. In the original C code the body of the for loop is a single compound-assignment statement whose last operation is the summation, which corresponds to the add instruction at line 32 (the addl at line 33 clearly increments the counter, because line 34 compares it using cmpl). In the add at line 32, %esp+0x48 is the variable n. Line 31 uses the value in %eax as an address and loads the value stored there into %eax, which corresponds to the outermost * in the C statement. The value of %eax after the add at line 30 is therefore the value of the expression *(p+m)+k.
The point is to understand how the compiler evaluates this expression. Line 24 loads %esp+0x4c (the value of m). Line 25 uses lea to compute m*4 and places it in %edx. Line 26 loads %esp+0x40 (the value of k) into %eax. Line 27 adds %edx to %eax, giving the element offset 4m+k. Line 28 multiplies that offset by 4 (the size of an int) to get the byte offset. Lines 29 and 30 load p from %esp+0x44 and add the byte offset to it, giving the value of the expression *(p+m)+k. The factor of 4 in the lea at line 25 therefore corresponds exactly to the length 4 in the definition of the array pointer. If (*p)[4] in the original program is changed to (*p)[3], the compiler produces the following code (loop only):
0x4013d0  mov    0x4c(%esp),%edx
0x4013d4  mov    %edx,%eax
0x4013d6  add    %eax,%eax
0x4013d8  add    %eax,%edx
0x4013da  mov    0x40(%esp),%eax
0x4013de  add    %edx,%eax
0x4013e0  lea    0x0(,%eax,4),%edx
0x4013e7  mov    0x44(%esp),%eax
0x4013eb  add    %edx,%eax
0x4013ed  mov    (%eax),%eax
0x4013ef  add    %eax,0x48(%esp)
0x4013f3  addl   $0x1,0x4c(%esp)
0x4013f8  cmpl   $0x1,0x4c(%esp)
0x4013fd  jle    0x4013d0 <main+144>
Here the compiler uses two add instructions to multiply m by the array length 3 (first m+m, then 2m+m), in place of the lea instruction that multiplied m by the array length 4. (Compilers often pick cheaper instructions, such as shifts and additions instead of a constant multiplication, which makes the correspondence between the assembly code and the C code less obvious.) The rest of the code is the same as before.
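The address arithmetic of the two loop bodies can be written back out in C, one step per instruction. This is a sketch under assumptions: load_row4 and load_row3 are hypothetical names, and sizeof(int) stands in for the constant 4 that appears as the lea scale factor in the listing.

```c
#include <stddef.h>

/* Mirrors the loop body generated for int (*p)[4]. */
int load_row4(int (*p)[4], int m, int k)
{
    size_t idx = (size_t)m * 4;      /* lea 0x0(,%eax,4),%edx : 4*m        */
    idx += (size_t)k;                /* add %edx,%eax         : 4*m + k    */
    size_t bytes = idx * sizeof(int);/* lea 0x0(,%eax,4),%edx : byte offset */
    const char *addr = (const char *)p + bytes;  /* add to the value of p  */
    return *(const int *)addr;       /* mov (%eax),%eax : outermost *      */
}

/* Mirrors the loop body generated for int (*p)[3]: 3*m from two adds. */
int load_row3(int (*p)[3], int m, int k)
{
    size_t idx = (size_t)m + (size_t)m;  /* add %eax,%eax : 2*m     */
    idx += (size_t)m;                    /* add %eax,%edx : 3*m     */
    idx += (size_t)k;                    /* add %edx,%eax : 3*m + k */
    const char *addr = (const char *)p + idx * sizeof(int);
    return *(const int *)addr;
}
```

With the array x from the example, load_row4(x, 1, 1) returns the same value as x[1][1]. The only difference between the two helpers is how the row length enters the index, which is exactly the difference between the two listings.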
As you can see, an array pointer points to an array, and incrementing it makes it point to the next array of that type. Because a two-dimensional array is stored in memory in row-major order as a linear one-dimensional array, the compiler evaluates an access through an array pointer as follows: it takes the length of the pointed-to array (fixed when the array pointer is defined), computes the offset from that length, and adds the offset to the base address held in the pointer, that is, the address of the first element of the array it was initialized with. Consequently, the length in the array pointer's type does not strictly have to match the row length of the two-dimensional array it is attached to (although such a mismatch makes the pointer types incompatible, so the compiler will issue a diagnostic). For ease of use, the length of the array pointed to should match the row length of the two-dimensional array you actually want to operate on.
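The effect of the pointed-to length on pointer arithmetic can be observed directly by measuring how far one increment moves the pointer. The helper names below are hypothetical, and the sketch uses correctly typed arrays for each pointer.

```c
#include <stddef.h>

/* Bytes between p and p + 1 when p points to int[4]. */
size_t stride_of_4(int (*p)[4])
{
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Bytes between q and q + 1 when q points to int[3]. */
size_t stride_of_3(int (*q)[3])
{
    return (size_t)((char *)(q + 1) - (char *)q);
}
```

Assuming a 4-byte int, the first function returns 16 and the second 12: the step size comes from the pointer's type alone, not from the array it happens to point into.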
In general, when accessing the element at row i, column j of a two-dimensional array d (defined as ElementType d[R][C]), the address is computed as
&d[i][j] = xd + L(C*i + j), where xd is the base address of the two-dimensional array, L is the size of the array's element type, and C is the row length given in the definition.
Addressing through an array pointer is essentially the same. In the initial example, the formula applies with xd = p, i = m, j = k, L = 4 (the size of an int), and C = 4.
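The formula can be checked against the compiler's own subscripting for every element. The following is a sketch only: ROWS, COLS, element_address, and formula_matches are hypothetical names, and the dimensions are those of the example array.

```c
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };

/* Computes xd + L*(C*i + j) by hand. */
int *element_address(int d[ROWS][COLS], int i, int j)
{
    char *xd = (char *)d;           /* xd: base address of the array   */
    size_t L = sizeof(int);         /* L: size of the element type     */
    return (int *)(xd + L * (size_t)(COLS * i + j));
}

/* Returns 1 if the formula agrees with &d[i][j] for every element. */
int formula_matches(int d[ROWS][COLS])
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            if (element_address(d, i, j) != &d[i][j])
                return 0;
    return 1;
}
```

For the array from the example, formula_matches returns 1, and element_address(d, 1, 1) points at the value 11 that the second loop iteration reads.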
Reference: Computer Systems: A Programmer's Perspective, Second Edition (Chinese edition), p. 158, Section 3.8, "Array Allocation and Access".
Array pointers and the addressing of two-dimensional arrays