Reverse Engineering Notes, Lecture 13: How Arrays Are Represented in Assembly, and How to Recover Them
Before looking at arrays in assembly, recall two properties of an array:
1. Its data is contiguous in memory
2. All elements have the same data type
For example:
int ary[3] = {0,1,2};
In the array defined above, the data is contiguous, and every element has the same size because every element is of type int.
To recognize an array in disassembly, look for:
1. Contiguous addresses
2. Scaled-index addressing (lea reg32, [xxx + 4 * xxxx])
One. The representation of one-dimensional arrays in assembly
First, the array addressing formula, which makes everything below easier to explain.
Formula: array base address + sizeof(type) * n
Pseudocode:
int ary[3] = {0};
ary[n] = 1;
sizeof(type): the size of the array's element type. The array above holds int elements, so we take the size of one element: sizeof(ary[0]), which is 4 here.
n: the subscript. For example, suppose we want the item at subscript 1 (the second element, since subscripts start from zero).
Substituting into the formula: ary + sizeof(ary[0]) * 1
= ary + 4 * 1
= ary + 4, which is the address of the second element.
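As a minimal sketch of the formula, the following C fragment computes an element's address by hand and compares it with the compiler's own &ary[n]. The function names element_address and formula_matches_1d are made up for this illustration and do not come from the original example.

```c
#include <stddef.h>

/* Hypothetical helper: address of element n, computed with the formula
   "base address + sizeof(type) * n" instead of writing base[n]. */
int *element_address(int *base, size_t n)
{
    /* Cast to char* so the addition is done in bytes, as in the assembly. */
    return (int *)((char *)base + sizeof(base[0]) * n);
}

/* Returns 1 if the hand-computed address equals &ary[n] for every n. */
int formula_matches_1d(void)
{
    int ary[3] = {0, 1, 2};
    for (size_t n = 0; n < 3; n++) {
        if (element_address(ary, n) != &ary[n])
            return 0;
    }
    return 1;
}
```

With the array above, element_address(ary, 1) should be the same address as &ary[1].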
See the example.
High-level code:
#include <stdio.h>

int main(int argc, char* argv[])
{
    int ary[3] = {0, 1, 2};
    int i = 0;
    scanf("%d", &i);
    ary[i] = 3; // this statement generates the array addressing formula
    return ary[i];
}
Assembly code under Debug (the important part):
In the listing, the red area and the add esp, 8 that follows it belong to the scanf call, and the code above that initializes the array. The important code is in the pink box:
1. The local variable i is loaded into ecx
2. 3 is written to [ebp + ecx * 4 + var_c], where ebp + var_c is the base address of the array, 4 is sizeof(type), and ecx is n.
Now substitute into the array addressing formula:
ary + sizeof(type) * n
= [ebp + var_c + 4 * ecx]
Only the notation changes to scaled-index addressing; the formula is the same, with sizeof() evaluated to a constant.
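The operand [ebp + var_c + 4 * ecx] follows the general x86 pattern base + index * scale + displacement. The sketch below models that computation with plain integers. The function name effective_address and the sample register value in the usage note are assumptions for illustration, and var_c is assumed to be the negative frame offset -0Ch, following IDA's usual naming of locals.

```c
#include <stdint.h>

/* Models a 32-bit x86 effective address:
   base + index * scale + displacement.
   On real hardware the scale can only be 1, 2, 4 or 8. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    /* Unsigned arithmetic wraps modulo 2^32, like the CPU's address math. */
    return base + index * scale + (uint32_t)disp;
}
```

For a hypothetical ebp of 0012FF80h, effective_address(0x0012FF80, 2, 4, -0x0C) gives 0012FF7Ch, which is the array base (ebp - 0Ch) plus sizeof(int) * 2.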
If you prefer a form that mirrors the order used in the assembly, the formula can be rewritten as:
ary + (n * sizeof(type))
This is the order that appears in the assembly; the two forms are equivalent.
Assembly code under Release
The Release assembly may not look the same as the Debug assembly, but in essence it still follows the array addressing formula:
ary + sizeof(type) * n
ary + (n * sizeof(type))
A natural question here: why is esp + var_c called the base address of the array when the operand also contains + 18h?
Because under VC6.0 the locals here are addressed relative to esp, and the 18h is just an adjustment. IDA displays the operand this way to tell us that var_c is being used, but since the access is esp-relative, an adjustment is needed to reach var_c.
In later compiler versions, ebp is used for addressing directly. This is not important; it is enough to be aware of it.
Two. The representation of two-dimensional arrays in assembly
The array addressing formula is the same in principle, with two differences:
1. sizeof(type) changes: type becomes the array's own lower dimension
2. An offset must be computed not only for the high dimension but also for the low dimension
The array addressing formula becomes:
int ary[m][c];
array base address + sizeof(type[c]) * i + sizeof(type) * j
Here i and j are the subscripts. For example, in ary[3][4] = 1, i is 3 and j is 4. Do not confuse them with m and c, which are the sizes in the array definition.
sizeof(type[c]) is the size of the lower dimension (one row) of the two-dimensional array.
Given the array:
int ary[2][3] = {{1,2,3},{4,5,6}};
suppose we want the location of the element 4.
Printing ary[1][0] outputs 4,
so we can work out its position by hand.
This gives:
ary + sizeof(type[c]) * i + sizeof(type) * j
Simplified:
ary + c * sizeof(type) * i + sizeof(type) * j (Debug builds stop at this step)
Simplified further:
ary + sizeof(type) * (i * c + j) (Release builds are optimized to this step, because the common factor sizeof(type) can be factored out)
Substituting i = 1, j = 0, c = 3 and sizeof(type) = 4:
ary + 4 * (1 * 3 + 0)
= ary + 12
So the element 4 is located at the array base address + 12.
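To check the two forms of the two-dimensional formula against each other, here is a minimal C sketch for int ary[2][3]. The names ROWS, COLS, offset_debug, offset_release and forms_agree_2d are invented for this illustration; the two offset functions only mirror the formulas in the text and are not taken from any compiler output.

```c
#include <stddef.h>

enum { ROWS = 2, COLS = 3 };  /* int ary[2][3], i.e. m = 2 and c = 3 */

/* Debug-style form: sizeof(type[c]) * i + sizeof(type) * j, in bytes. */
size_t offset_debug(size_t i, size_t j)
{
    return sizeof(int[COLS]) * i + sizeof(int) * j;
}

/* Release-style form: sizeof(type) * (i * c + j), in bytes. */
size_t offset_release(size_t i, size_t j)
{
    return sizeof(int) * (i * COLS + j);
}

/* Returns 1 if both forms agree with each other and with &ary[i][j]. */
int forms_agree_2d(void)
{
    int ary[ROWS][COLS] = {{1, 2, 3}, {4, 5, 6}};
    for (size_t i = 0; i < ROWS; i++) {
        for (size_t j = 0; j < COLS; j++) {
            const char *by_formula = (const char *)ary + offset_release(i, j);
            if (offset_debug(i, j) != offset_release(i, j))
                return 0;
            if (by_formula != (const char *)&ary[i][j])
                return 0;
        }
    }
    return 1;
}
```

For the element 4 at ary[1][0], both functions return 3 * sizeof(int), which is 12 bytes when int is 4 bytes wide.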
The + 12 is a byte offset. In a high-level language, pointer arithmetic is scaled by the element size, so the byte offset has to be divided by sizeof(int), which is 4.
So 12 / 4 = 3: if an int pointer holds the array base address, adding 3 to it reaches the element 4. This is also the formula for accessing the elements of a two-dimensional array as a one-dimensional array.
Code: (listing not preserved)
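A minimal sketch of this one-dimensional view, assuming the compiler lays the two-dimensional array out linearly (which common compilers do, although the C standard is strict about indexing past the end of the first row through such a pointer). The function name read_flat is hypothetical.

```c
#include <stddef.h>

/* Reads element [i][j] of an int array with `cols` columns through a flat
   int pointer to its first element, using the element index i * cols + j. */
int read_flat(const int *first, size_t cols, size_t i, size_t j)
{
    return first[i * cols + j];
}
```

For int ary[2][3] = {{1,2,3},{4,5,6}}, the call read_flat(&ary[0][0], 3, 1, 0) uses element index 3, the same + 3 derived above, and reads the element 4.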
To summarize:
The first thing to know is the array addressing formula. A multi-dimensional array is stored as a one-dimensional array, so an offset is required for the high dimension and for the low dimension as well, and type is taken to be the array's own lower dimension.
The formula array base address + sizeof(type[c]) * i + sizeof(type) * j is important and must be understood.
The following example shows how the array addressing formula differs between the Debug and Release assembly.
High-level code:
#include <stdio.h>

int main(int argc, char* argv[])
{
    int ary[2][3] = {{1, 2, 3}, {4, 5, 6}};
    int i = 0;
    int j = 0;
    scanf("%d%d", &i, &j);
    ary[i][j] = 9; // this statement generates the array addressing formula
    return 0;
}
Assembly code under Debug
Deriving it with our array addressing formula:
1. edx receives the value of i
2. edx is multiplied by the row size (sizeof(int[3]) = 12 = 0Ch), which corresponds to sizeof(type[c]) * i in the formula
3. The lea computes array base address + sizeof(type[c]) * i
4. ecx receives the value of j
5. eax + 4 * ecx adds sizeof(type) * j to the address computed so far
Once the array addressing formula is familiar, the assembly code is easy to read.
So the array formula under Debug is:
array base address + sizeof(type[c]) * i + sizeof(type) * j
Assembly code under Release
As stated above, Release builds optimize the original formula into the form
array base address + sizeof(type) * (c * i + j)
Looking at the assembly:
1. eax receives the value of i
2. edx receives the base address of the array
3. ecx = array base address + i * 2
4. add eax, ecx writes the sum back into eax, so eax = array base address + i * 2 + i, which simplifies to array base address + i * 3
5. The array addressing formula [esp + var_18 + 4 * eax] is then used to form the final address
Because the value of j is 0, the result is not the array base address + (i * 3 + 0) * 4 that we expected under Release; the + 0 has been optimized away.
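The i * 2 + i pattern is how an optimizer can multiply by 3 without a multiply instruction. The sketch below shows only the arithmetic, not the exact instruction sequence from the listing; the function names are hypothetical and int is assumed to be 4 bytes.

```c
#include <stdint.h>

/* i * 3 computed as i + i * 2, the shape a lea/add sequence produces. */
uint32_t times3(uint32_t i)
{
    return i + i * 2;
}

/* Element index of ary[i][j] in int ary[2][3]: c * i + j with c = 3. */
uint32_t element_index_2d(uint32_t i, uint32_t j)
{
    return times3(i) + j;
}

/* Byte offset, matching the scaled operand [... + 4 * eax]. */
uint32_t byte_offset_2d(uint32_t i, uint32_t j)
{
    return 4 * element_index_2d(i, j);
}
```

For i = 1 and j = 0, byte_offset_2d returns 12, the same offset worked out by hand for the element 4.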
Three. The representation of three-dimensional arrays in assembly
In fact, the two-dimensional case already shows how to handle higher-dimensional arrays; the same pattern applies.
Given a three-dimensional array
int ary[m][c][h];
and the subscript operation:
ary[i][j][k] = 1;
the array addressing formula is:
ary + sizeof(type[c][h]) * i + sizeof(type[h]) * j + sizeof(type) * k (the original form, as seen under Debug)
Under Release the formula is optimized. Expanding the sizes:
ary + sizeof(type) * c * h * i + sizeof(type) * h * j + sizeof(type) * k
Factoring out the common factor sizeof(type):
ary + sizeof(type) * (c * h * i + h * j + k)
The first two terms share a factor of h, so it can be factored out as well:
ary + sizeof(type) * (h * (c * i + j) + k)
This is the final formula.
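The Debug form and the fully factored form can be compared in a short C sketch for int ary[2][3][4]. The names DIM_M, DIM_C, DIM_H, offset3_debug, offset3_release and forms_agree_3d are made up for this illustration.

```c
#include <stddef.h>

enum { DIM_M = 2, DIM_C = 3, DIM_H = 4 };  /* int ary[2][3][4] */

/* Debug form: one term per dimension, in bytes. */
size_t offset3_debug(size_t i, size_t j, size_t k)
{
    return sizeof(int[DIM_C][DIM_H]) * i
         + sizeof(int[DIM_H]) * j
         + sizeof(int) * k;
}

/* Factored form: sizeof(type) * (h * (c * i + j) + k), in bytes. */
size_t offset3_release(size_t i, size_t j, size_t k)
{
    return sizeof(int) * (DIM_H * (DIM_C * i + j) + k);
}

/* Returns 1 if both forms agree with each other and with &ary[i][j][k]. */
int forms_agree_3d(void)
{
    static int ary[DIM_M][DIM_C][DIM_H];
    for (size_t i = 0; i < DIM_M; i++)
        for (size_t j = 0; j < DIM_C; j++)
            for (size_t k = 0; k < DIM_H; k++) {
                const char *p = (const char *)ary + offset3_debug(i, j, k);
                if (offset3_debug(i, j, k) != offset3_release(i, j, k))
                    return 0;
                if (p != (const char *)&ary[i][j][k])
                    return 0;
            }
    return 1;
}
```

For the subscripts (1, 2, 3) the element index is 4 * (3 * 1 + 2) + 3 = 23, so both functions return 23 * sizeof(int).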
High-level code:
#include <stdio.h>

int main(int argc, char* argv[])
{
    int ary[2][3][4] = {0};
    int i = 0;
    int j = 0;
    int k = 0;
    scanf("%d%d%d", &i, &j, &k);
    ary[i][j][k] = 9; // this statement generates the array addressing formula
    return 0;
}
Disassembly under Debug:
First, the formula:
ary + sizeof(type[c][h]) * i + sizeof(type[h]) * j + sizeof(type) * k
Substituting the formula into the assembly:
1. eax = the value of i
2. eax * 30h, equivalent to sizeof(type[c][h]) * i (sizeof(int[3][4]) = 48 = 30h)
3. array base address + eax gives the position of ary[i], which is assigned to ecx
4. The value of j is loaded
5. It is shifted left by 4, which multiplies it by 2^4 = 16. This is equivalent to sizeof(type[h]) * j (sizeof(int[4]) = 16)
6. ary[i] + sizeof(type[h]) * j gives the position of ary[i][j]
7. The value of k is loaded
8. The array addressing formula ary[i][j] + 4 * k gives the final address
So the formula applies directly under Debug.
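The eight steps can be mirrored line by line in C. This is only a sketch of the arithmetic for int ary[2][3][4] with a 4-byte int: it returns the byte offset from the array base rather than a real address, the register names are ordinary local variables, and using edx for j and k is an assumption, since the text does not name those registers.

```c
#include <stdint.h>

/* Byte offset of ary[i][j][k] in int ary[2][3][4], following the
   numbered Debug steps. Adding the array base gives the final address. */
uint32_t debug_offset_3d(uint32_t i, uint32_t j, uint32_t k)
{
    uint32_t eax = i;        /* 1. eax = i                               */
    eax = eax * 0x30;        /* 2. i * 30h = sizeof(int[3][4]) * i       */
    uint32_t ecx = eax;      /* 3. offset of ary[i] (base left out)      */
    uint32_t edx = j;        /* 4. load j                                */
    edx = edx << 4;          /* 5. j * 16 = sizeof(int[4]) * j           */
    ecx = ecx + edx;         /* 6. offset of ary[i][j]                   */
    edx = k;                 /* 7. load k                                */
    return ecx + 4 * edx;    /* 8. offset of ary[i][j] + 4 * k           */
}
```

For the subscripts (1, 2, 3) this gives 48 + 32 + 12 = 92 bytes, which is 4 * 23 and agrees with the factored formula 4 * (4 * (3 * 1 + 2) + 3).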
Assembly code under Release
As mentioned above, the assembly is optimized under Release, which means our formula is optimized to:
ary + sizeof(type) * (h * (c * i + j) + k)
You can verify this by substituting the formula into the Release assembly yourself.
For four-dimensional and higher-dimensional arrays, the array formula is the same as above; just pay attention to two points:
1. In sizeof(type), type is the array's own lower dimensions
2. Each additional dimension adds one more term to the formula
For example:
int ary[a][b][c][d];
with the subscripts
i j k l
the array formula is:
ary + sizeof(type[b][c][d]) * i + sizeof(type[c][d]) * j + sizeof(type[d]) * k + sizeof(type) * l
You can optimize it yourself.
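The same factoring extends to any number of dimensions: the element index is built up one subscript at a time, multiplying by the next dimension size before adding the next subscript. A minimal sketch under that reading follows; the function name nd_offset and its parameter layout are hypothetical.

```c
#include <stddef.h>

/* Byte offset of an element in an N-dimensional array.
   dims[] holds the declared sizes (a, b, c, d, ...), idx[] the
   subscripts (i, j, k, l, ...), elem_size is sizeof(type).
   Note that dims[0] never affects the result, just as `a` does not
   appear in the four-dimensional formula. */
size_t nd_offset(const size_t *dims, const size_t *idx,
                 size_t ndims, size_t elem_size)
{
    size_t linear = 0;
    for (size_t d = 0; d < ndims; d++)
        linear = linear * dims[d] + idx[d];  /* ((i*b + j)*c + k)*d + l */
    return linear * elem_size;
}
```

For int ary[2][3][4][5] and the subscripts (1, 2, 3, 4), the element index is ((1 * 3 + 2) * 4 + 3) * 5 + 4 = 119, so the byte offset is 119 * sizeof(int).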
Summary:
Get familiar with the simplest (one-dimensional) array addressing formula first, because the higher-dimensional formulas are derived from it; only type changes.
With the array addressing formula, you can use a pointer to read or write any element of a high-dimensional array, because a high-dimensional array is also stored linearly in memory, that is, laid out as a one-dimensional array.