Learning Guide for assembly language (IV.)

Source: Internet
Author: User

Assembly analysis of high-level language programs

In high-level languages such as C and Pascal, we no longer operate directly on hardware resources. We work on the solution to the problem instead, mainly through data abstraction and program structure. For example, we access data by variable name and no longer care where the data sits in memory. The use of hardware resources is left entirely to the compiler. Some basic rules still hold, however, and most compilers follow common conventions, which makes disassembled code easier to read. The main correspondences between high-level language constructs and assembly code are described below.

1. Ordinary variables. Declared variables are normally stored in memory. The compiler associates each variable name with a memory address. (Note that this address is only a provisional one, calculated at compile time. It goes through a series of adjustments, such as relocation, when the program is linked into an executable file and loaded into memory for execution, and only then becomes the actual run-time address. This does not affect the logic of the program, so you need not care about these details. It is enough to know that every function name and variable name corresponds to a memory address.) In assembly code, therefore, a variable name appears as an effective address, that is, an operand enclosed in square brackets. For example, suppose a C file declares:

int my_age;

A specific memory location is set aside for this integer variable. The statement my_age = 32; may appear in disassembled code as:

mov word ptr [007e85da], 20 ; 20 hex = 32 decimal

The word ptr operand size assumes a 16-bit int; a compiler with a 32-bit int would use dword ptr. So the effective address in square brackets corresponds to the variable name. Another example:

char my_name[11] = "lianzi2000";

This declaration also fixes an address that corresponds to my_name. If the address is 007E85DC, then in memory [007e85dc] = 'l', [007e85dd] = 'i', and so on. Any access to my_name is a data access at this address.
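The correspondence between a name and an address can be observed from C itself. The following is a minimal sketch, not compiler output. It reuses my_age and my_name from the text; the helper names store_through_address and byte_at are made up for illustration.

```c
#include <stddef.h>

int my_age;
char my_name[11] = "lianzi2000";

/* Write a value through the address of my_age rather than its name.
   This mirrors "mov [address], value" in the disassembly. */
void store_through_address(int value)
{
    int *address = &my_age;   /* the address the name stands for */
    *address = value;
}

/* Read the byte stored `offset` bytes past the start of my_name. */
char byte_at(size_t offset)
{
    const char *base = my_name;   /* starting address of the array */
    return *(base + offset);
}
```

Calling store_through_address(32) has the same effect as my_age = 32, and byte_at(0) and byte_at(1) return the first two characters of the string.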

A pointer variable also corresponds to an address of its own, because it is itself a variable. For example:

char *your_name;

Here too the variable your_name is assigned a memory address, say 007e85f0. The statement your_name = my_name; is likely to appear as:

mov dword ptr [007e85f0], 007e85dc ; the content of your_name is the address of my_name
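The two facts above (a pointer has its own address, and its content is another address) can be checked with a short C sketch. The function names are hypothetical, and the actual addresses will differ from the ones used in the text.

```c
#include <stdbool.h>

char my_name[11] = "lianzi2000";
char *your_name;

/* Perform your_name = my_name and report whether the pointer now
   holds the starting address of the array. */
bool pointer_holds_array_address(void)
{
    your_name = my_name;
    return your_name == &my_name[0];
}

/* The pointer is itself a variable stored at its own address,
   which is distinct from the address of the array it points to. */
bool pointer_has_own_address(void)
{
    return (void *)&your_name != (void *)my_name;
}
```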

2. Register variables

C and C++ allow variables to be declared as register variables. register int i; declares i as an integer register variable. The keyword is only a hint, and the compiler is free to ignore it. Compilers for 32-bit x86 typically place register variables in ESI and EDI. Registers are part of the CPU's internal structure and are much faster to access than memory, so keeping frequently used variables in registers can speed up program execution.
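As a small illustration of the declaration form, the sketch below marks a loop counter and an accumulator as register variables. The function sum_to is a made-up example; whether the variables really end up in ESI, EDI, or any register at all depends on the compiler and its optimization settings.

```c
/* Sum the integers 1..n with loop variables marked `register`.
   The keyword is only a request; the compiler decides where i and
   total actually live. */
long sum_to(int n)
{
    register int i;
    register long total = 0;

    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

One visible consequence in C is that the address of a register variable cannot be taken with the & operator, since it may have no memory address.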

3. Arrays

Regardless of the number of dimensions, all elements of an array are stored contiguously, so in memory an array is always one-dimensional. For example, int i_array[2][3]; fixes an address in memory, and the 12 bytes starting at that address hold the elements of the array (six elements of 2 bytes each, assuming a 16-bit int as in these examples). The variable name i_array thus corresponds to the starting address of the array, that is, it points to the first element. The storage order is i_array[0][0], [0][1], [0][2], [1][0], [1][1], [1][2]; in other words, the rightmost subscript changes fastest. To access an element, the program converts the multidimensional index into a one-dimensional one. For i_array[1][1], for example, the one-dimensional index is 1*3+1=4. This conversion may be done at compile time or at run time. Either way, once the base address of i_array is loaded into a general-purpose register, accessing an array element comes down to calculating an effective address:

; i_array[1][1]=0x16

lea ebx, xxxxxxxx ; load the address of i_array into ebx
mov edx, 04 ; one-dimensional index of i_array[1][1], determined at compile time
mov word ptr [ebx+edx*2], 16 ; the scaling factor 2 is the element size in bytes

Of course, the implementation may differ depending on the compiler and the context of the program, but the basic form is as shown. You can also see the role of the scaling factor here (recall that the scaling factor may be 1, 2, 4 or 8). Because simple variables on current systems always occupy 1, 2, 4 or 8 bytes, the scaling factor makes table lookups in memory very convenient.
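The index conversion and the base-plus-scaled-index access can be written out in C as a rough sketch. The helper names flat_index and store_flat are hypothetical, and sizeof(int) plays the role of the scaling factor, so it is 4 rather than 2 on a compiler with a 32-bit int.

```c
#include <stddef.h>
#include <string.h>

enum { ROWS = 2, COLS = 3 };

int i_array[ROWS][COLS];

/* Convert a two-dimensional index into the one-dimensional index
   used in memory (row-major order). */
size_t flat_index(size_t row, size_t col)
{
    return row * COLS + col;
}

/* Store a value using only the base address, the flat index, and
   the element size, the way [ebx+edx*scale] does in the listing. */
void store_flat(size_t row, size_t col, int value)
{
    unsigned char *base = (unsigned char *)i_array;   /* like ebx */
    size_t index = flat_index(row, col);              /* like edx */
    memcpy(base + index * sizeof(int), &value, sizeof value);
}
```

After store_flat(1, 1, 0x16), reading i_array[1][1] through the ordinary two-dimensional syntax gives 0x16, which shows that both forms address the same bytes.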

4. Structures and objects

The members of a structure or object are stored in memory in declaration order, but the compiler may insert padding so that members align on word or doubleword boundaries. The size of an object should therefore be obtained with the sizeof operator rather than by adding up the sizes of its members. When we declare a struct variable or instantiate an object, its name also corresponds to a memory address. For example:

struct TAG_INFO_STRUCT
{
    int age;
    int sex;
    float height;
    float weight;
} marry;

The variable marry corresponds to a memory address. Starting at this address, there are enough bytes (sizeof(marry)) to hold all the members. Each member corresponds to an offset relative to this address. Assuming the members are stored contiguously with no padding, and that int occupies 2 bytes and float 4 bytes, the offset of age is 0, sex is 2, height is 4, and weight is 8.

; marry.sex=0;

lea ebx, xxxxxxxx ; memory address of marry
mov word ptr [ebx+2], 0 ; sex is at offset 2
......
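Member offsets can be queried in C with the standard offsetof macro instead of being worked out by hand. The sketch below is illustrative only: sum_of_member_sizes and clear_sex are hypothetical helpers, and the offsets on a given compiler may differ from 0, 2, 4, 8 (with a 32-bit int they are usually 0, 4, 8, 12).

```c
#include <stddef.h>
#include <string.h>

struct TAG_INFO_STRUCT {
    int age;
    int sex;
    float height;
    float weight;
};

/* Sum of the member sizes; sizeof may be larger if padding is added. */
size_t sum_of_member_sizes(void)
{
    return sizeof(int) + sizeof(int) + sizeof(float) + sizeof(float);
}

/* Set the sex member using only the struct's base address and the
   member's offset, as "mov [ebx+offset], 0" does. */
void clear_sex(struct TAG_INFO_STRUCT *info)
{
    unsigned char *base = (unsigned char *)info;
    int zero = 0;
    memcpy(base + offsetof(struct TAG_INFO_STRUCT, sex), &zero, sizeof zero);
}
```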

Objects are basically the same. Note that the implementation of a member function resides in the code segment, not in the object itself. An object holds at most a pointer that leads to the function, as is typical for virtual functions.

5. Function calls

When a function is defined, a memory address is likewise fixed for the function name. For example:

long comb(int m, int n)
{
    long temp;
    .....

    return temp;
}

In this way, the function comb corresponds to a memory address. A call to it appears as:

call xxxxxxxx ; address of comb

This function takes two integer parameters, which are passed on the stack:

; lresult=comb(2,3);

push 3
push 2
call xxxxxxxx
mov dword ptr [yyyyyyyy], eax ; yyyyyyyy is the address of the long variable lresult

Note two points here. First, in C the arguments are pushed in the reverse of their declared order, that is, the last argument is pushed first, so push 3 executes first. Second, in the 32-bit system discussed here, a push with no explicit operand size pushes a 32-bit doubleword. The two push instructions therefore push two doublewords, or 8 bytes of data, in total. Then the call instruction executes. It pushes the return address, that is, the 32-bit address of the next instruction (mov dword ptr ...), and then jumps to xxxxxxxx.
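That a function name stands for an address can also be seen in C, where the name can be stored in a pointer and called indirectly. This is only a sketch: the body of comb is not shown in the text, so the hypothetical function add_pair is used as a stand-in, and call_through_address is likewise made up.

```c
/* Hypothetical stand-in for comb: the original body is not shown,
   so this simply adds its two arguments. */
long add_pair(int m, int n)
{
    return (long)m + n;
}

/* A function name evaluates to the function's address, so it can be
   stored in a pointer and called indirectly, much like
   "call xxxxxxxx" with the address held in a variable. */
long call_through_address(long (*target)(int, int), int m, int n)
{
    return target(m, n);
}
```

A call such as call_through_address(add_pair, 2, 3) passes the address of add_pair and gives the same result as calling add_pair(2, 3) directly.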

On entry to the comb subroutine (xxxxxxxx), the stack looks like this:

03000000 (recall the little-endian format)
02000000
YYYYYYYY <-- ESP points here; YYYYYYYY stands for the return address (not the address of lresult)
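The byte order in the dump, 03 00 00 00 for the value 3, follows from little-endian storage, where the least significant byte sits at the lowest address. A minimal sketch of that layout, using a hypothetical helper store_le32 that writes the bytes explicitly so the result does not depend on the host machine:

```c
#include <stdint.h>

/* Store a 32-bit value in little-endian order: least significant
   byte at the lowest address. */
void store_le32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value & 0xFF);
    out[1] = (uint8_t)((value >> 8) & 0xFF);
    out[2] = (uint8_t)((value >> 16) & 0xFF);
    out[3] = (uint8_t)((value >> 24) & 0xFF);
}
```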

As mentioned earlier, the standard entry code of a subroutine is:

push ebp ; save the original ebp
mov ebp, esp ; set up the frame pointer
sub esp, xxx ; reserve space for temporary variables
.....

After push ebp executes, the stack looks like this:

03000000
02000000
YYYYYYYY
old EBP <-- ESP points to the saved EBP
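The sequence of pushes can be replayed with a toy model of the stack. This is an illustrative sketch under simplifying assumptions: the stack is an array of 32-bit slots, esp and ebp are slot indices rather than byte addresses, and every name (toy_stack, toy_push, toy_call_comb, toy_read_from_ebp) as well as the return address value passed in is hypothetical.

```c
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* A toy model of the stack: an array of 32-bit slots plus esp and
   ebp held as slot indices. The stack grows toward lower indices. */
struct toy_stack {
    uint32_t slot[STACK_WORDS];
    int esp;
    int ebp;
};

void toy_init(struct toy_stack *s)
{
    s->esp = STACK_WORDS;   /* empty stack */
    s->ebp = STACK_WORDS;   /* caller's frame pointer */
}

void toy_push(struct toy_stack *s, uint32_t value)
{
    s->esp -= 1;
    s->slot[s->esp] = value;
}

/* Replay the sequence from the text: push 3, push 2, call (which
   pushes the return address), push ebp, mov ebp, esp. */
void toy_call_comb(struct toy_stack *s, uint32_t return_address)
{
    toy_push(s, 3);
    toy_push(s, 2);
    toy_push(s, return_address);
    toy_push(s, (uint32_t)s->ebp);   /* push ebp */
    s->ebp = s->esp;                 /* mov ebp, esp */
}

/* Read the slot `words` above the frame pointer; in byte terms
   this is [ebp + 4*words]. */
uint32_t toy_read_from_ebp(const struct toy_stack *s, int words)
{
    return s->slot[s->ebp + words];
}
```

In this model, slot 0 above the frame pointer holds the saved EBP, slot 1 the return address, and slots 2 and 3 the arguments 2 and 3. In byte terms these are [ebp], [ebp+4], [ebp+8] and [ebp+12].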
