6. Assembly language basics: summary and comparison of the call stack and the various calling conventions

Source: Internet
Author: User

Call Stack

The concept of a stack is explained in detail in data structures courses.

The key points are:

1. Last in, first out.

2. Data is always stored to and retrieved from the top of the stack.

 

On x86 processors, items are placed on the stack with the push instruction. Pushing an item onto the top of the stack decreases the stack-top pointer by four bytes. The stack-top pointer is held in the ESP register, whose name is an abbreviation of "stack pointer".

 

Push

When a value is pushed onto the stack, the following events occur in sequence:

1. The stack pointer ESP is decreased by 4 bytes.

2. The data to be pushed is copied to the address that ESP points to.

We can see that the stack grows toward lower addresses: the more content on the stack, the smaller the stack-top pointer.

 

Pop

When a value is popped from the stack, the following events occur in sequence:

1. The data that the stack-top pointer (ESP) points to is read.

2. Four bytes are added to the stack-top pointer.

In this way, ESP always points to the most recently pushed item that is still on the stack.
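The push and pop steps above can be sketched in C with a small simulated stack. This is only an illustrative model, not real CPU state: the names SimStack, sim_init, sim_push, sim_pop and the 64-byte size are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

/* Hypothetical model of a downward-growing x86 stack (not real CPU state). */
typedef struct {
    uint8_t  mem[STACK_BYTES]; /* simulated stack memory */
    uint32_t esp;              /* byte offset of the current stack top */
} SimStack;

/* Empty stack: ESP starts just past the highest address. */
void sim_init(SimStack *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_BYTES;
}

/* push: first decrease ESP by 4, then copy the value to the address in ESP. */
void sim_push(SimStack *s, uint32_t value) {
    assert(s->esp >= 4u);                /* the model has no room left otherwise */
    s->esp -= 4u;
    memcpy(&s->mem[s->esp], &value, 4u);
}

/* pop: first read the value at ESP, then increase ESP by 4. */
uint32_t sim_pop(SimStack *s) {
    uint32_t value;
    assert(s->esp + 4u <= STACK_BYTES);  /* popping an empty stack is an error */
    memcpy(&value, &s->mem[s->esp], 4u);
    s->esp += 4u;
    return value;
}
```

Pushing three values with sim_push lowers esp by 12 bytes in total, and three calls to sim_pop return the values in reverse order, which is the last in, first out behaviour described earlier.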

 

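As an example, suppose we push the data items A, B, and C onto the stack in that order. C then sits at the top of the stack, at the lowest address, and A at the highest.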

 

Thread Stack

In Win32, each time a thread is created, 1 MB (1024 × 1024 bytes) of virtual memory is reserved as stack space for that thread. The ESP register initially points to the top of the reserved memory (the end with the highest address), which initializes the stack. To check the current value of the stack pointer, use the 'r' command in the debugger. The value of the ESP register is the stack pointer.

 

Nested function call

A program is rarely one very long function. Usually different functions are responsible for different tasks and are called by the main function as needed. The first five examples in this assembly language basics series show this well. So how does a program call a function and then return to continue executing? This article gives a detailed analysis with code annotations.

 

Calling conventions

If all the code you write uses the same language and the same compiler, you rarely need to think about calling conventions. If you use several languages and different compilers, their differing standards become a problem. To let programmers handle mixed-language programming, experts established rules that specify how to call functions and pass parameters. These rules are called calling conventions.

For example, a programmer writes a function named SQRT in Pascal. This function takes a floating-point parameter and returns the square root of that parameter.

Function SQRT (numargument: Real): real;

C programmers need to know how Pascal programmers expect the numargument parameter to be passed to the SQRT function. To handle such mixed-language programming, Pascal defines the Pascal calling convention. On Win32 its place is taken by the stdcall convention we often see; the original Pascal convention is obsolete and no longer used.

 

The prolog and epilog code is added automatically by the compiler to save registers, set up the stack frame, and restore the stack state at the end of the call. These two pieces of code depend on the CPU architecture and the compiler. Understanding them is very important for understanding how to debug. The following example shows the prolog and epilog generated automatically for a stdcall function.

Typical prolog of the called function

push ebp         ; save EBP (frame pointer)

mov ebp, esp     ; set the frame base to the current stack top

sub esp, 8       ; move the stack top eight bytes toward lower addresses to reserve space for local variables

push ebx         ; save the values in the registers

push ecx

push edx

Typical epilog of the called function

pop edx          ; restore the register values

pop ecx

pop ebx

mov esp, ebp     ; restore the stack-top pointer to the frame base, which releases the space occupied by local variables

pop ebp          ; restore EBP (frame pointer)

ret 0x8          ; return and add 8 bytes to the stack-top pointer to release the function parameters on the stack (see the description of the RET instruction below)
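To check that the prolog and epilog really balance each other, the listing above can be traced with a small C model. This is a sketch under stated assumptions: the Cpu struct, the helper names (cpu_init, push32, pop32, prolog, epilog_ret8) and the 128-byte memory are invented for the example, and only the registers that the listing touches are modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_BYTES 128u

/* Hypothetical 32-bit machine model: only the state the prolog/epilog touches. */
typedef struct {
    uint8_t  mem[MEM_BYTES];
    uint32_t esp, ebp, ebx, ecx, edx;
} Cpu;

void cpu_init(Cpu *c) {
    memset(c, 0, sizeof *c);
    c->esp = MEM_BYTES;   /* empty stack: ESP just past the highest address */
    c->ebp = MEM_BYTES;
}

void push32(Cpu *c, uint32_t v) {
    assert(c->esp >= 4u);
    c->esp -= 4u;
    memcpy(&c->mem[c->esp], &v, 4u);
}

uint32_t pop32(Cpu *c) {
    uint32_t v;
    assert(c->esp + 4u <= MEM_BYTES);
    memcpy(&v, &c->mem[c->esp], 4u);
    c->esp += 4u;
    return v;
}

/* Prolog: save EBP, set up the frame, reserve 8 bytes of locals, save registers. */
void prolog(Cpu *c) {
    push32(c, c->ebp);    /* push ebp     */
    c->ebp = c->esp;      /* mov ebp, esp */
    assert(c->esp >= 8u);
    c->esp -= 8u;         /* sub esp, 8   */
    push32(c, c->ebx);
    push32(c, c->ecx);
    push32(c, c->edx);
}

/* Epilog followed by "ret 8". Returns the return address that RET pops. */
uint32_t epilog_ret8(Cpu *c) {
    uint32_t ret_addr;
    c->edx = pop32(c);
    c->ecx = pop32(c);
    c->ebx = pop32(c);
    c->esp = c->ebp;      /* mov esp, ebp: releases the locals */
    c->ebp = pop32(c);    /* pop ebp */
    ret_addr = pop32(c);  /* ret: pop the return address ... */
    assert(c->esp + 8u <= MEM_BYTES);
    c->esp += 8u;         /* ... then discard 8 bytes of parameters */
    return ret_addr;
}
```

If a caller pushes two 4-byte parameters and a return address, then runs prolog and epilog_ret8, ESP and EBP end up exactly where they were before the parameters were pushed, and EBX, ECX, and EDX hold their original values again.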

 

Some functions need to preserve certain register values across a call to another function so that the values can still be used after the call returns. The stack is a good place to save registers temporarily, so the code performs some additional stack operations to save the register values that will be needed later.

 

Next, let's take a closer look at the description of the RET instruction, so that each part of it is understood.

RET: the return instruction.

1. Near return (within a segment). The word at the top of the stack is popped into IP, and SP is increased by 2. If there is an immediate operand, it is then added to SP (discarding the parameters that were pushed before the call was executed).

2. Far return (between segments). The word at the top of the stack is popped into IP (SP increases by 2), then the next word at the top of the stack is popped into CS (SP increases by 2 again). If there is an immediate operand, it is then added to SP.

This description uses the 16-bit registers. In 32-bit code the same steps use EIP and ESP, and the return address is 4 bytes.

 

The first article of this series showed that Win32 uses flat addressing, which does not rely on segment registers. For completeness, the outdated segment registers are listed below.

1. Code segment register CS: stores the base of the segment that holds the currently running program code. Instructions are fetched from the memory segment specified by this register, and the corresponding offset is provided by IP.

2. Data segment register DS: specifies the lowest address of the data segment used by the current program, that is, the base of the data segment.

3. Stack segment register SS: specifies the base address of the current stack segment.

4. Extra segment register ES: specifies the base address of the extra data segment used by the current program, which is the segment that holds the destination string in string instructions.

 

The stdcall calling convention

This calling convention has the following features:

* Parameters are pushed from right to left.

* The called function is responsible for clearing the stack.

* The function name starts with an underscore.

* A @ symbol is added after the function name, followed by the size of the parameters in bytes.

* No case conversion is applied to the function name.

* Functions with a variable number of parameters cannot be handled.

 

Caller's responsibilities

Push the rightmost parameter

Push the next parameter

Push the leftmost parameter

call functionx

Responsibilities of the called function (functionx)

push ebp

mov ebp, esp

sub esp, local_size

...

mov esp, ebp

pop ebp

ret <number of parameter bytes>

 

The cdecl calling convention

This calling convention has the following features:

* Parameters are pushed from right to left.

* The caller is responsible for clearing the stack.

* The function name starts with an underscore.

* No case conversion is applied to the function name.

Therefore, the function named functioncall is recorded as _functioncall in the symbol table.

* A variable number of parameters can be handled.

* It is the default calling convention for C and C++ programs.

 

Is the following statement correct?

Because the parameters are pushed from right to left, the first parameter ends up closest to the top of the stack. Therefore, when a variable number of parameters is used, the position of the first parameter on the stack is always known. As long as the number of remaining parameters can be determined from the first and any later explicit parameters, a variable parameter list can be used.

The analysis is as follows:

stdcall also pushes parameters from right to left, so why can't stdcall handle variable parameters? The most important difference between stdcall and cdecl is who cleans up the stack parameters. Under stdcall the called function cleans up, while under cdecl the caller does. A calling convention is an agreement between the caller and the called function on how responsibility for the call stack is divided.

MSDN explains in one sentence why cdecl can handle variable-parameter function calls: because the stack is cleaned up by the caller, it can do vararg functions. So cdecl supports variable parameters not because of the order in which parameters are pushed.

Let's first settle one question: where does the difficulty of variable parameters lie? Is it in obtaining the type and number of parameters?

The caller must know how many parameters are to be passed. More precisely, the caller knows the size of the parameters it passes in, and it pushes them onto the stack in turn. How does the called function know the size and number of parameters when it takes over control? As mentioned above, the first parameter can always be found and its type is known. In a function like printf, the first parameter contains indicators for the following parameters, such as %d and %f. From these indicators, the function can obtain the type and number of the following parameters. Once the type and number of the parameters are known, the size of all parameters is known as well. How the type and number are determined depends on the function implementation. Under both stdcall and cdecl, the size and number of parameters can always be obtained this way. This has nothing to do with who cleans up the stack. Note that the compiled function contains no code of its own to calculate the size of the parameters passed to it.

That still doesn't answer the question: where does the difficulty of variable parameters lie?

Suppose that, under stdcall, a called function with variable parameters successfully obtains all of its parameters and completes its work. The total size of the stack space it received has not been calculated, and there is no place where it is recorded, while the byte count in its RET instruction is a constant fixed at compile time. So the function does not know how much stack to clean up, and the caller is not responsible for clearing the stack either. That is the trouble.

Under cdecl the caller is responsible for clearing the stack. The caller knows how many parameters it has passed and can clean up the stack without difficulty, so the program can continue to run.
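The two ways a called function can learn about its variable parameters can be sketched in C. The function names sum_ints and count_format_args are hypothetical, and the format scanner is deliberately simplified: it only recognizes %d and %f and is not how printf itself is implemented.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* The first (fixed) parameter tells the called function how many ints follow. */
int sum_ints(int count, ...) {
    va_list args;
    int total = 0;
    int i;
    assert(count >= 0);
    va_start(args, count);
    for (i = 0; i < count; ++i) {
        total += va_arg(args, int);  /* read the next parameter as an int */
    }
    va_end(args);
    return total;
}

/* printf-style: count how many parameters a format string announces.
   Simplified sketch: only "%d" and "%f" are counted; "%%" is a literal '%'. */
int count_format_args(const char *fmt) {
    int n = 0;
    const char *p;
    assert(fmt != NULL);
    for (p = fmt; *p != '\0'; ++p) {
        if (*p != '%') {
            continue;
        }
        ++p;                 /* look at the character after '%' */
        if (*p == '\0') {
            break;           /* stray '%' at the end of the string */
        }
        if (*p == 'd' || *p == 'f') {
            ++n;
        }
    }
    return n;
}
```

In both cases the called function learns the number of parameters only at run time. That is why it cannot clean them up with a fixed RET operand, while the caller, which pushed them, always knows their total size.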

This section references: http://blog.csdn.net/ZhouHM/archive/2004/04/07/14721.aspx

http://hi.baidu.com/dtzw/blog/item/cc17ba119eb39374cb80c4eb.html

 

Caller's responsibilities

Push the rightmost parameter

Push the next parameter

Push the leftmost parameter

call functionx

add esp, <number of parameter bytes>

 

Responsibilities of the called function (functionx)

push ebp

mov ebp, esp

sub esp, local_size

...

mov esp, ebp

pop ebp

ret
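In C source code the convention is chosen with a keyword on the function declaration. The sketch below assumes a 32-bit Microsoft compiler, where __stdcall and __cdecl are available; on other compilers or targets the hypothetical macros CONV_STDCALL and CONV_CDECL expand to nothing, so the code still builds but the convention is whatever the platform uses. The function names are invented for the example.

```c
#include <assert.h>

/* Assumption: __stdcall and __cdecl are only meaningful on 32-bit x86 MSVC. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define CONV_STDCALL __stdcall
#  define CONV_CDECL   __cdecl
#else
#  define CONV_STDCALL
#  define CONV_CDECL
#endif

/* stdcall: the called function removes its 8 parameter bytes with "ret 8". */
int CONV_STDCALL add_stdcall(int a, int b) {
    return a + b;
}

/* cdecl: the function ends with a plain "ret"; the caller then does "add esp, 8". */
int CONV_CDECL add_cdecl(int a, int b) {
    return a + b;
}

/* The C call sites look identical; only the generated cleanup code differs. */
int call_both(int a, int b) {
    int r1 = add_stdcall(a, b);
    int r2 = add_cdecl(a, b);
    assert(r1 == r2);
    return r1;
}
```

Following the naming rules listed for each convention, a 32-bit Microsoft build would be expected to record these functions as _add_stdcall@8 and _add_cdecl.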

 

The fastcall calling convention

With this calling convention, parameters are placed in registers whenever possible. Because not every parameter needs a stack operation, it is faster than stdcall and cdecl, hence the name fastcall.

This calling convention has the following features:

* Parameters are passed from right to left.

* The called function clears the parameters from the stack.

* A @ prefix is added before the function name, followed by another @ and the number of parameter bytes, in the format @name@number.

* No case conversion is applied to the function name.

* Functions with a variable number of parameters cannot be handled.

 

Caller's responsibilities

Push the rightmost parameters onto the stack

Place the second parameter in EDX

Place the first parameter in ECX

call functionx

Responsibilities of the called function (functionx)

push ebp

mov ebp, esp

sub esp, local_size

... (function body)

mov esp, ebp

pop ebp

ret <number of bytes of pushed parameters>; note that if there are only two parameters, the RET instruction carries no immediate operand, because no parameter was pushed onto the stack.
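The split between registers and stack described above can be written down as a small C helper. This is a sketch that assumes every parameter is a 4-byte integer or pointer; the FastcallPlan type and the plan_fastcall function are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Where the parameters of a fastcall function end up (hypothetical model). */
typedef struct {
    int in_ecx;       /* 1 if the first parameter travels in ECX */
    int in_edx;       /* 1 if the second parameter travels in EDX */
    int stack_bytes;  /* bytes pushed on the stack = immediate operand of RET */
} FastcallPlan;

/* Assumes every parameter is a 4-byte integer or pointer. */
FastcallPlan plan_fastcall(int param_count) {
    FastcallPlan plan = {0, 0, 0};
    assert(param_count >= 0);
    if (param_count >= 1) {
        plan.in_ecx = 1;
    }
    if (param_count >= 2) {
        plan.in_edx = 1;
    }
    if (param_count > 2) {
        plan.stack_bytes = (param_count - 2) * 4;  /* only the rest is pushed */
    }
    return plan;
}
```

For two parameters stack_bytes is 0, which matches the note that RET then carries no immediate operand. For five parameters the called function would return with ret 12.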

 

The thiscall calling convention

This is the default calling convention for C++ member functions with a fixed number of parameters. It cannot be specified explicitly, because thiscall is not a keyword like stdcall or cdecl. Under the thiscall convention, the this pointer is passed to the called function in the ECX register.

 

Naked functions

For a function declared with the naked attribute, no prolog or epilog code is inserted. This lets you use the inline assembler to write your own prolog and epilog instruction sequences. Naked functions are an advanced feature. They let you declare a function that is called from a context other than C or C++, where there is no agreement on where the parameters are placed or on which registers are preserved. In short, everything is left to the programmer.

They are suitable for writing C programs that interface with existing systems written in assembly language.

What is an inline assembler? From Wikipedia:

In computer programming, the inline assembler is a feature of some compilers that allows very low level code written in assembly to be embedded in a high level language like C or Ada.

 

Comparison of the calling conventions

| Item | __stdcall (Win32) | __cdecl | __fastcall | thiscall (native C++) | COM | __declspec(naked) |
|---|---|---|---|---|---|---|
| Parameter push order | Right to left | Right to left | Right to left; arg1 in ECX, arg2 in EDX | Right to left; this pointer in ECX | Right to left; this pointer pushed last | Programmer-defined |
| Parameter location | Stack | Stack | Stack + registers | Stack, register ECX | | Programmer-defined |
| Who clears the parameters from the stack | Called function | Caller | Called function | Called function | Called function | Programmer-defined |
| Variable parameters supported | No | Yes | No | No | No | Programmer-defined |
| Function name format | _name@number | _name | @name@number | | | Programmer-defined |
| Function name example | _NtWaitForSingleObject@12 | _printf | @AfpFsdDispCloseVol@4 | | | Programmer-defined |
| Case conversion | No | No | No | No | | Programmer-defined |
| C++ name decoration | Function name followed by "@@YG", which starts the parameter table | Parameter table starts with "@@YA" | Parameter table starts with "@@YI" | | | Programmer-defined |

(__declspec is a Microsoft keyword, which may not exist on other systems.)
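The C function name formats in the comparison above (_name, _name@number, @name@number) can be reproduced with a few lines of C. This is only an illustration of the naming rules, not a tool that any compiler provides: the NameConvention enum and the decorate_name function are hypothetical, and C++ decoration is not covered.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>  /* strcmp, used when checking the results */

typedef enum { NAME_CDECL, NAME_STDCALL, NAME_FASTCALL } NameConvention;

/* Writes the decorated C name into out.
   Returns 0 on success, -1 if the convention is unknown or out is too small. */
int decorate_name(NameConvention conv, const char *name, unsigned param_bytes,
                  char *out, size_t out_size) {
    int n;
    assert(name != NULL && out != NULL);
    switch (conv) {
    case NAME_CDECL:    /* _name */
        n = snprintf(out, out_size, "_%s", name);
        break;
    case NAME_STDCALL:  /* _name@number */
        n = snprintf(out, out_size, "_%s@%u", name, param_bytes);
        break;
    case NAME_FASTCALL: /* @name@number */
        n = snprintf(out, out_size, "@%s@%u", name, param_bytes);
        break;
    default:
        return -1;
    }
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For instance, decorate_name(NAME_STDCALL, "NtWaitForSingleObject", 12, buf, sizeof buf) fills buf with _NtWaitForSingleObject@12, which can be confirmed with strcmp.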

 

Exercises:

1. Q: What do the prolog and epilog code do?

A: The prolog and epilog code is generated automatically by the compiler. It saves registers, sets up the stack frame, and restores the stack after the function call.

2. Q: What types of calling convention are there?

A: stdcall, cdecl, fastcall, thiscall, and naked functions.

3. Q: Which calling conventions rely on the called function to clean up the stack parameters?

A: stdcall, fastcall, thiscall.

4. Q: Which calling conventions are generally used for applications written in C++?

A: cdecl and thiscall.

5. Q: Which register is generally used to pass the this pointer on Intel processors?

A: ECX.
