C language variables from the assembly point of view

Source: Internet
Author: User
Tags: types of functions

1. Basic research

The program is compiled and linked, and then loaded into Debug.

We look at the contents of the main function at offset address 1FA:

Executing up to 1FD, we find that the offset address of n is 01A6; its segment address is held in the DS register, which contains 07C4.

Next, look at the function f2:

The values of the parameters a and b are passed on the stack, and their segment address is held in the SS register:

The value of the local variable c is kept in the SI register here. c happens to be of type int, so are all local variables defined in a sub-function kept in registers? Let's add an assignment statement and see what happens:

As can be seen, the local variable d is placed on the stack while c is kept in the SI register; only when the function returns c is its value copied to AX. What if the return value is not of type int? We have studied this before: 1 byte of data is returned in AL; for 4 bytes of data, the high 16 bits are returned in DX and the low 16 bits in AX.
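
The 4-byte case can be sketched in C. This is only an illustration: RegPair, return_long and read_long are hypothetical names that do not appear in the original program; they model how a 32-bit value would be split between DX (high word) and AX (low word) and put back together by the caller.

```c
#include <stdint.h>

/* Hypothetical stand-in for the DX:AX register pair. */
typedef struct {
    uint16_t dx;   /* high 16 bits of the return value */
    uint16_t ax;   /* low 16 bits of the return value */
} RegPair;

/* Split a 4-byte value the way a long return value is described above. */
RegPair return_long(uint32_t value)
{
    RegPair r;
    r.dx = (uint16_t)(value >> 16);
    r.ax = (uint16_t)(value & 0xFFFFu);
    return r;
}

/* The caller rebuilds the 4-byte value from the two registers. */
uint32_t read_long(RegPair r)
{
    return ((uint32_t)r.dx << 16) | r.ax;
}
```

For example, the value 0x12345678 would travel as DX = 0x1234 and AX = 0x5678.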

To summarize: the segment address of the global variable n is in the DS register; the segment address of the parameters a and b and of the local variable d is in the SS register; the value of the local variable c is kept in the SI register rather than in memory, so it has no segment address. In other words, the global variable n is stored in the data segment set up when the program starts, the local variable d and the parameters a and b are stored in the stack segment, and the return value of the function is placed in AX, or in DX and AX, depending on its size.

The storage of a global variable is allocated when the program starts and released after the whole program has finished; this allocation and release should be done by the code in c0s.obj.

When is the storage for local variables allocated? Comparing the f2 that has the extra local variable d with the previous f2, we find a new instruction "sub sp,2", followed by the assignment to d, "mov word ptr [bp-2],4". This shows that "sub sp,2" is the instruction that allocates stack space for d, so local variables are allocated when the sub-function starts executing. Are they allocated at function entry, or at the point in the function where each variable is defined? Because the C standard used by TC2.0 requires all variable definitions to appear at the beginning of the function, the two are the same here. The "mov sp,bp" instruction at the end of the function restores SP and thereby releases the space of d, so the storage of local variables is released when the function ends.

As the program shows, the storage for function parameters is allocated when main calls the function, that is, when the parameter values are pushed onto the stack; after the function returns, the parameters are removed from the stack with "pop cx".
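
The allocation and release steps described above can be traced with a toy model of the stack. This is a sketch under stated assumptions: Machine, STACK_WORDS, machine_init and call_f2 are hypothetical names, each array slot stands for one 2-byte word, and the body of f2 (c = a + b, d = 4) is assumed because the original source is not shown. Saving and restoring SI is left out for brevity.

```c
#include <stdint.h>

#define STACK_WORDS 16             /* hypothetical size of the toy stack */

typedef struct {
    uint16_t mem[STACK_WORDS];     /* the stack segment, one slot per word */
    int sp;                        /* word index of the top of the stack */
    int bp;                        /* word index used as the frame pointer */
    uint16_t si;                   /* register holding the local variable c */
    uint16_t ax;                   /* register holding the return value */
} Machine;

void machine_init(Machine *m)
{
    int i;
    for (i = 0; i < STACK_WORDS; i++) m->mem[i] = 0;
    m->sp = STACK_WORDS;           /* empty stack: sp points past the end */
    m->bp = STACK_WORDS;
    m->si = 0;
    m->ax = 0;
}

static void push(Machine *m, uint16_t v) { m->mem[--m->sp] = v; }
static uint16_t pop(Machine *m) { return m->mem[m->sp++]; }

/* Assumed body: int f2(int a, int b) { int c, d; d = 4; c = a + b; return c; } */
static void f2(Machine *m)
{
    push(m, (uint16_t)m->bp);              /* push bp */
    m->bp = m->sp;                         /* mov bp,sp */
    m->sp -= 1;                            /* sub sp,2: allocate d */
    m->mem[m->bp - 1] = 4;                 /* mov word ptr [bp-2],4 */
    m->si = (uint16_t)(m->mem[m->bp + 2]   /* a is at [bp+4] */
                     + m->mem[m->bp + 3]); /* b is at [bp+6] */
    m->ax = m->si;                         /* return c in ax */
    m->sp = m->bp;                         /* mov sp,bp: release d */
    m->bp = pop(m);                        /* pop bp */
    (void)pop(m);                          /* ret: pop the return address */
}

uint16_t call_f2(Machine *m, uint16_t a, uint16_t b)
{
    push(m, b);                            /* allocate parameter b */
    push(m, a);                            /* allocate parameter a */
    push(m, 0);                            /* call: placeholder return address */
    f2(m);
    (void)pop(m);                          /* pop cx: release a */
    (void)pop(m);                          /* pop cx: release b */
    return m->ax;
}
```

After call_f2 returns, sp is back at its starting value, which mirrors the observation that both the local variable and the parameters have been released.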

main calls the function f3 with the statement "call 076A:0239", that is, a direct call using the function's segment address plus its offset address. Let's look at the contents of f3:

We find that f3 returns with retf, which pops both IP and CS off the stack. So a far function is called with its segment address plus offset address, and it returns with retf, which pops both the offset address and the segment address.
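
To make "segment address plus offset address" concrete, here is a small sketch of how a real-mode 8086 combines the two into a physical address. The function name physical_address is hypothetical, and the 20-bit wrap-around of the real hardware is not modelled.

```c
#include <stdint.h>

/* Real-mode address formation: physical = segment * 16 + offset. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

With the far call target above, physical_address(0x076A, 0x0239) gives 0x78D9. Because the target may lie in a different segment from the caller, both CS and IP have to be saved by the call and restored by retf.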

Take another look at program 2:

Observe the contents of the function f:

We find that n is kept in the SI register, while a occupies the two bytes at address ds:0194. When is their storage allocated? We know that the storage of the local variable n is allocated when the function starts. The storage of a, however, is a fixed memory location that is not in the stack segment; the space of n is released when the function ends, but the space of a is not. According to information found online, static local variables are allocated in the same way as global variables and have the same lifetime; the only difference is that a static local variable can be used only inside the function that defines it.

Looking at main, there is no statement that releases the static local variable; its storage is likewise allocated and released by the code in c0s.obj.

The program's output shows the same thing:

No matter how many times f executes, the value of n printed each time is 1, because n is a local variable that is released when f ends. a is a static local variable, which behaves like a global variable, so its value keeps accumulating.
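
The behaviour described above can be reproduced with a short sketch. The original program 2 is not shown, so this version is an assumption: it returns the two values in a hypothetical Counts struct instead of printing them, and it assumes n starts at 0 and both variables are incremented once per call.

```c
/* Hypothetical helper type so the values can be inspected by the caller. */
typedef struct {
    int n;
    int a;
} Counts;

Counts f(void)
{
    int n = 0;          /* automatic: created anew on every call */
    static int a = 0;   /* static: one fixed location, initialized once */
    Counts out;

    n++;
    a++;
    out.n = n;
    out.a = a;
    return out;
}
```

Calling f three times yields n = 1 every time, while a takes the values 1, 2 and 3.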

Take another look at program 3:

The contents of the main function are:

a, b, c, a1 and a2 are all global variables; they differ only in type. Is their storage adjacent? Look at the contents of the data segment at offset address 194:

You can see that the data segment stores five values of 1, and their storage is adjacent.

An integer occupies 2 bytes, a character 1 byte, and a long integer 4 bytes.

For the add-1 operation, the integer uses inc word ptr, which operates on 1 word of data; the character uses inc byte ptr, which operates on 1 byte of data; the long integer first adds 1 to the low 16 bits and then uses the add-with-carry instruction adc to add the carry into the high 16 bits.
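
The two-step increment of a long integer can be sketched in C with two 16-bit halves. The function name inc_long is hypothetical, and the instructions in the comments are the usual add/adc pairing rather than a copy of the original listing.

```c
#include <stdint.h>

/* Increment a 32-bit value that is stored as two 16-bit words. */
void inc_long(uint16_t *low, uint16_t *high)
{
    uint16_t old_low = *low;
    uint16_t carry;

    *low = (uint16_t)(old_low + 1u);       /* add word ptr [low],1 */
    carry = (*low < old_low) ? 1u : 0u;    /* carry flag set on wrap-around */
    *high = (uint16_t)(*high + carry);     /* adc word ptr [high],0 */
}
```

When the low word is FFFF, the increment wraps it to 0000 and the carry raises the high word by 1, which is why a single inc is not enough for a 4-byte value on a 16-bit CPU.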

See Program 4 below:

Observe the contents of the main function:

Note how each data item of the variables a and b is assigned: every item of a has a fixed memory address, while the items of b are stored on the stack, because a is a global variable and b is a local variable. Within both a and b, the data items occupy adjacent storage.

After the assignments, the program contains a long run of instructions that implement the call to printf.

See program 5 below:

The contents of the main function are:

The program contains the lea instruction, whose function is to load an offset address. There are many calls in the program; by experiment we found that calling f is "call 0256" and calling func is "call 0266". How does main pass the struct data a to the function f? Let's see where the struct data used in f comes from:

The data is read from the stack: a.a is at bp+4, a.b at bp+6, and a.c at bp+8.

So passing the value in main should consist of pushing the data items onto the stack.

But in main the data items are not pushed between "call 0266" and "call 0256"; only two functions are called there, "call 076A:1085" and "call 076A:10A1". These two functions must handle the struct data and the stack, but I find them difficult to read and understand. So instead of dwelling on them, let's look at where the value returned by func() goes. Here is the function func:

We find that func, after assigning the data items, also calls the function at 076A:1085. Compare this with main: main pushes the DS and AX registers, whereas func pushes the SS and BX registers, that is, the segment address of the data items and the offset address of the first item, and then calls 076A:1085 to process them. But how does this function work? I cannot draw a conclusion yet. I found the following passage online:

When a C function returns a struct and the struct is large, a temporary variable of the struct type is created in the calling function and its address is passed to the called function; when the called function returns, it modifies this temporary variable through that address. The temporary is then copied to the user-defined variable in the calling function, which is exactly how so-called pass-by-value works in C.
If the struct is small, the temporary used for the return value can be held in registers, and the register values are copied to the user-defined variable on return.

My understanding of this passage is that the function 076A:1085 creates a temporary variable and copies the data of the local struct object a into it. After func ends, the variable a in func is released from the stack; main then calls 076A:10A1, which pushes the value of this temporary onto the stack for f to use.
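
One way to picture this is to write the hidden steps out by hand. The sketch below is an interpretation, not the compiler's actual output: return_area, func_lowered and caller are hypothetical names, the member types are assumed to be int, and the body of f is assumed because the text does not show what f computes.

```c
struct n {
    int a;
    int b;
    int c;
};

static struct n return_area;      /* stands in for the temporary variable */

/* Hand-written version of: struct n func(void) { ...; return a; } */
struct n *func_lowered(void)
{
    struct n a;                   /* local variable on the stack */
    a.a = 1;
    a.b = 2;
    a.c = 3;
    return_area = a;              /* block copy: local a -> temporary */
    return &return_area;          /* the caller receives its address */
}

/* Assumed body, for illustration only. */
int f(struct n a)
{
    return a.a + a.b + a.c;
}

int caller(void)
{
    struct n a = *func_lowered(); /* block copy: temporary -> caller's a */
    return f(a);                  /* a is copied once more, for f to use */
}
```

The local a inside func_lowered no longer exists once the function has returned, which is why the data has to be copied to a location that outlives the call.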

Then look at the contents of 076A:10A1:

Looking at the function, we find that it is a move routine that copies the data items from their original location to the place on the stack designated for the function call. The function at 076A:1085 works similarly. So struct data is passed to and from functions by calling a move routine, which uses the movsw or movsb instruction to copy the data items onto the stack for the call. For lack of time this could not be studied carefully; it will be refined later.
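
A move routine of this kind can be approximated in C as follows. This is a sketch, not a disassembly of 076A:10A1: the name block_move is hypothetical, and the split into word copies plus one trailing byte copy is an assumption about how movsw and movsb would be combined.

```c
#include <stddef.h>

/* Copy cx bytes from src to dst, a word at a time where possible. */
void block_move(void *dst, const void *src, size_t cx)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = cx / 2;

    while (words--) {        /* movsw: copy one 2-byte word */
        d[0] = s[0];
        d[1] = s[1];
        d += 2;
        s += 2;
    }
    if (cx & 1) {            /* movsb: copy the odd trailing byte */
        *d = *s;
    }
}
```

With cx = 6, as in the listing for program 5, the loop runs three times and copies the three 2-byte data items of the struct in one go.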

2. Extended research

Questions:

(1) In function f2 of program 1, the instruction "jmp 0236" at 076A:0234 jumps to the very next instruction. Isn't that meaningless? What role does it play?

A: The jmp here jumps to the closing instructions that release the local variables, so my guesses are: 1. To avoid errors in the program, the compiler uses jmp to make sure execution reaches the exact closing instructions. 2. The compiler reserves a hook here so that other code can be inserted.

That is the case when the return statement is at the very end of the function. When the function contains a selection statement or several return statements, each return presumably needs such a jmp to reach the closing instructions.
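
The idea of every return jumping to one shared exit can be written out in C with goto. This is an illustration only: sign is a hypothetical function, not one of the programs studied here, and the instructions in the comments describe the general pattern rather than a specific listing.

```c
/* Each "return" is written as: set the result, then jump to the exit. */
int sign(int x)
{
    int result;

    if (x > 0) {
        result = 1;
        goto done;          /* return 1;  -> jmp to the closing instructions */
    }
    if (x < 0) {
        result = -1;
        goto done;          /* return -1; -> jmp to the closing instructions */
    }
    result = 0;
    goto done;              /* last return: jmp to the very next instruction */
done:
    return result;          /* closing instructions: mov sp,bp / pop bp / ret */
}
```

The final goto is redundant in the same way as "jmp 0236", but generating it uniformly for every return keeps the compiler simple.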

(2) Is it the case that the first local variable defined in a function is kept in the SI register and the others are placed in the stack segment?

A: No. By experiment, a local variable is kept in the SI register only if it is the one to be returned; otherwise it is stored on the stack.

(3) Is the difference between a static local variable and a global variable that the latter can be accessed in all functions of the program, while the former can be accessed only in the function that defines it?

A: The most obvious difference is the difference in scope.

(4) Load the 5 programs of chapter 3 and look at the instructions at offset address 1FA. Why do some programs have the two instructions "push bp" and "mov bp,sp" while others do not?

A: All 5 of my programs have these two protective instructions; if they are missing, it may be a compiler issue.

Compiled with TC2.0 they are present; compiled with TCC, they may be missing.

(5) In program 1, is the global variable n defined by the statement "unsigned int n" or by the statement "n=0" in main?

A: By the former. Variables defined outside functions, whether static or not, are initialized to 0 by the system by default if they have no initializer. If you print n before the n=0 statement, its value can already be printed.
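
A minimal sketch of this point: the definition alone creates the variable and gives it the value 0, so it can be read before any assignment. The function name read_then_clear is hypothetical.

```c
unsigned int n;                  /* defined here; no initializer, so it starts at 0 */

/* Read n before the "n = 0" assignment, then perform the assignment. */
unsigned int read_then_clear(void)
{
    unsigned int before = n;     /* already a valid value: 0 */
    n = 0;                       /* the assignment in main changes nothing here */
    return before;
}
```

The value read before the assignment is 0, which shows that "n=0" assigns to an existing variable rather than defining it.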

(6) Why are the data items of a struct not stored using the push and pop instructions?

A: The question is really asking how passing and returning struct data is implemented. We already know that it is done by a move routine, which copies the struct's values as a whole from the data segment to the stack segment. So why not use push and pop? I think the reasons are: 1. C treats a struct as a data type just like int, char and the others, so it is handled the same way as other types, namely as a whole; using push and pop would process the data items separately, which goes against the purpose of building a struct type. 2. To process the items individually, one would need to know which data items there are and how many, which requires bookkeeping that is not easy to implement (I have not found a way to do it). 3. We only need to transfer the value and do not need to process the data along the way, so the simplest, fastest and cheapest method should be chosen, and a block move is clearly the best.

(7) In program 4, after the declaration of the local variable struct stu b, if a char variable is defined next, 6 bytes are occupied (the bytes of the char plus the bytes of the data items of struct stu b); if an int variable is defined next, 8 bytes are occupied, which includes 1 byte of padding. Why?

A:

e is of type int and eee is of type char; the first 5 variables occupy 7 bytes, and with eee the total is 8 bytes.

Local variables behave the same way: with an int e and a char eee the total is also 8 bytes.

This happens whenever a separate int variable is added after the struct data. Memory alignment can therefore also take effect outside a struct.
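
The byte counts in this question are consistent with rounding the total up to a whole number of 2-byte words. The sketch below rests on that assumption, which the text does not confirm; frame_bytes is a hypothetical name, and the struct size of 5 bytes is inferred from the figure of 6 bytes for the struct plus one char.

```c
/* Total space for the struct plus the extra variable, rounded up to
   a multiple of 2 bytes (assumed word alignment). */
unsigned frame_bytes(unsigned struct_bytes, unsigned extra_bytes)
{
    unsigned total = struct_bytes + extra_bytes;
    return (total + 1u) & ~1u;
}
```

With a 5-byte struct, adding a 1-byte char gives 6 bytes with no padding, while adding a 2-byte int gives 7 bytes, which is rounded up to 8.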

(8) Study again how storage is allocated for variables of different types.

A: A char variable occupies 1 byte, an int variable 2 bytes, and a long variable 4 bytes. The int type occupies 2 bytes in TC and 4 bytes in VC, because TC targets the 16-bit DOS environment while VC targets a 32-bit operating system.
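
On a modern compiler the TC2.0 sizes can be imitated with fixed-width types. The typedef names tc_int, tc_uint and tc_long and the function tc_uint_add are hypothetical; they are stand-ins for this illustration and are not part of TC or VC.

```c
#include <stdint.h>

typedef int16_t  tc_int;    /* int as in TC2.0: 2 bytes */
typedef uint16_t tc_uint;   /* unsigned int as in TC2.0: 2 bytes */
typedef int32_t  tc_long;   /* long as in TC2.0: 4 bytes */

/* Addition that wraps at 16 bits, as unsigned int arithmetic does in TC2.0. */
tc_uint tc_uint_add(tc_uint x, tc_uint y)
{
    return (tc_uint)(x + y);
}
```

One visible consequence of the 2-byte int is that 65535 + 1 wraps around to 0, whereas a 4-byte unsigned int simply holds 65536.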

(9) Study program 5 again and find the assembly code corresponding to each C statement.

main():

    struct n a;
    int b;                       sub sp,6
    a=func();                    push ss
                                 push bx
                                 call 0266
                                 push ds
                                 push ax
                                 mov cx,6
                                 call 076A:1085
    b=f(a);                      lea bx,[bp-6]
                                 mov dx,ss
                                 mov ax,bx
                                 mov cx,6
                                 call 076A:10A1
                                 call 0256
    printf("%d", b);             mov si,ax
                                 push si
                                 mov ax,194
                                 push ax
                                 call 093A
    printf("%d", f(func()));     call 0266
                                 mov dx,ds
                                 mov cx,6
                                 call 076A:10A1
                                 call 0256
                                 add sp,6
                                 push ax
                                 mov ax,198
                                 push ax
                                 call 093A

func():

    struct n a;                  sub sp,6
    a.a=1;                       mov word ptr [bp-6],1
    a.b=2;                       mov word ptr [bp-4],2
    a.c=3;                       mov word ptr [bp-2],3
    return a;                    mov bx,426
                                 push ds
                                 push bx
                                 lea bx,[bp-6]
                                 push ss
                                 push bx
                                 mov cx,6
                                 call 076A:1085

(10) What is the general significance of the difference between how global variables and local variables are stored?

A: Here we take local variables to mean automatic (dynamic) local variables. The storage of global variables is fixed, while local variables are allocated dynamically, and this way of storing them determines their characteristics: 1. Scope: global variables can be used everywhere in the program, while local variables can be used only in the function that defines them. 2. Lifetime: a global variable lives as long as the whole program, while a local variable lives as long as its function and is released when the function ends. This approach reduces the program's memory cost, avoids variable-definition errors, keeps functions independent, and makes the program modular and easy to write and debug.

Global variables are placed in the data segment and local variables in the stack segment. For example, if a program has 100 functions with 5 local variables each and all of them were placed in the data segment, the memory overhead would be too large and they would be hard to manage and access. Local variables are therefore stored in the stack segment, which is a core mechanism of high-level languages.
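
The independence that stack storage gives to each call shows up clearly with recursion. The two functions below are hypothetical examples, not taken from the programs studied here: they differ only in whether the variable mine lives on the stack or in one fixed location.

```c
/* Each call gets its own copy of mine on the stack. */
int sum_auto(int n)
{
    int mine = n;
    int rest;

    if (n == 0) return 0;
    rest = sum_auto(n - 1);
    return rest + mine;          /* mine still holds this call's n */
}

/* All calls share one fixed slot, as a data-segment variable would. */
int sum_static(int n)
{
    static int mine;
    int rest;

    mine = n;
    if (n == 0) return 0;
    rest = sum_static(n - 1);    /* the inner calls overwrite mine */
    return rest + mine;          /* mine is now 0, left by the innermost call */
}
```

sum_auto(3) adds 3 + 2 + 1 and returns 6, while sum_static(3) returns 0 because every level ends up reading the value written by the innermost call.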

3. Research summary

This chapter studied how the variables of various kinds used in functions are stored; it is one of the more important chapters.
