ASM 32/64

Source: Internet
Author: User

I originally wrote the program in NASM to run on 32-bit Windows and Linux hosts. Later the requirements grew and it also had to run on 64-bit Windows and Linux. Windows has the WOW (Windows on Windows) mechanism, so 32-bit programs run on 64-bit machines without any porting at all. Linux has no "LOL" mechanism (Linux on Linux, not "laugh out loud", hehe), but you can install the ia32-libs libraries (IA presumably stands for Intel Architecture) to get the same effect. Still, producing ELF64 and Win64 object files interested me, so I decided to port the program!

First, get to know the CPU and its registers. Basically every 32-bit register has been widened: EAX becomes RAX, EBX becomes RBX, and so on. The registers are wider and naturally nicer to use: one instruction handles 8 bytes, so a single step can do what used to take several. There are also new registers, R8 through R15. With that many registers you need far fewer memory locations for intermediate values, which is more efficient. The registers you can rely on being preserved across calls are R12-R15. Previously only ESI, EDI, and EBX were generally preserved that way; now there are R12-R15 plus RBX, five in total!

Why not RSI and RDI? On a 64-bit Linux system these two registers are used for parameter passing, so they are generally not preserved. Yet RSI and RDI still matter: LODSB, STOSB, and similar instructions still take the source and destination addresses from RSI and RDI. I think this was a bad choice. Why not pass parameters in the new registers instead of insisting on my beloved RSI and RDI? But I don't design CPUs, so complaining gets me nowhere! Complaints aside, to make porting easier it is best to avoid LODSB and similar instructions and to access memory directly with base-plus-offset addressing.

Next comes the function call. The Unix 64-bit ABI (System V AMD64) uses RDI, RSI, RDX, RCX, R8, and R9 to pass the first six integer or pointer parameters. With fewer than six parameters, use as many registers as you need in that order. With more than six, the first six go into the registers in that order and the rest are pushed onto the stack from last to first. Then set RAX to 0 (for a variadic function such as printf, AL tells the callee how many vector registers hold arguments) and invoke the function with the call instruction. If there were more than six parameters, you must fix up the stack after the function returns: for every parameter you pushed, move the stack pointer back by 8 bytes to rebalance the stack. Note that the Windows ABI rules are not the same!
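
The register order described above can be sketched in C as a small lookup. This is only an illustration: the helper names are hypothetical, it covers integer and pointer arguments only, and it ignores any padding a caller adds to keep RSP 16-byte aligned at the call.

```c
#include <stddef.h>

/* Integer/pointer argument registers of the System V AMD64 ABI, in order. */
static const char *const SYSV_INT_ARG_REGS[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};
enum { SYSV_INT_ARG_REG_COUNT = 6 };

/* Hypothetical helper: where does integer argument `index` (0-based) go?
 * Returns the register name, or "stack" for the 7th and later arguments. */
const char *sysv_int_arg_location(size_t index)
{
    return index < SYSV_INT_ARG_REG_COUNT ? SYSV_INT_ARG_REGS[index] : "stack";
}

/* Hypothetical helper: bytes the caller adds back to RSP after the call
 * when every argument is an integer or pointer (one 8-byte slot each).
 * Assumption: no extra alignment padding was pushed. */
size_t sysv_stack_cleanup_bytes(size_t int_arg_count)
{
    if (int_arg_count <= SYSV_INT_ARG_REG_COUNT)
        return 0;
    return (int_arg_count - SYSV_INT_ARG_REG_COUNT) * 8;
}
```

For a call with eight integer arguments, the last two go on the stack, so the caller would follow the call with something like add rsp, 16.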

Also, in 64-bit mode the CPU cannot push a 32-bit register directly onto the stack, so sorry, your push eax is not available; use push rax and pop rax instead. Adjusting the stack pointer (ESP/RSP) directly, however, works the same way in both 32-bit and 64-bit code without problems. To place several values on the stack in a row (as for a function call), it is often more efficient to subtract from ESP/RSP once and then store each parameter with base-plus-offset addressing than to push the parameters one at a time! That is how GCC makes API calls, so hand-written assembly is not necessarily better than what GCC produces; if you are not careful, a C program compiled by GCC will be more efficient than one written in assembly. I generally write my real projects in C, but NASM lets me understand things more deeply. That leaves me speechless!!

Functions you implement yourself can still use the earlier C-style stack frame, as follows:

function:
    %define param1   rbp+16
    %define param2   rbp+24
    %define param3   rbp+32
    enter 16,0
    %define local1   rbp-8
    %define local2   rbp-16
    ...
    leave
    ret

Finally, the problem that bothered me most during porting was the return value of C functions. When a C function returns int on a 64-bit CPU, the value is not in the full RAX; only the low 32 bits, EAX, are meaningful. Most functions cause no trouble, but the problem shows up with return -1: EAX is -1, but RAX is not -1. Its high 32 bits are all 0 and its low 32 bits are all 1, because writing EAX clears the upper half of RAX.
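
The -1 symptom can be modeled in plain C. The sketch below is an illustration with hypothetical function names: the first function mimics what RAX holds after the callee writes a 32-bit value to EAX, and the second mimics the sign extension the assembly caller would perform (in NASM, movsxd rax, eax or cdqe) before treating the result as a 64-bit number.

```c
#include <stdint.h>

/* Model of RAX after the callee executes `mov eax, value`:
 * a 32-bit register write zero-extends into the full 64-bit register. */
uint64_t rax_after_writing_eax(int32_t value)
{
    return (uint64_t)(uint32_t)value;
}

/* Model of the fix on the caller's side: sign-extend the low 32 bits
 * of RAX to 64 bits before comparing against -1. */
int64_t sign_extend_eax(uint64_t rax)
{
    uint32_t low = (uint32_t)rax;
    if (low & UINT32_C(0x80000000))
        return (int64_t)low - INT64_C(0x100000000);
    return (int64_t)low;
}
```

With these, rax_after_writing_eax(-1) yields 0x00000000FFFFFFFF, which is not -1 as a 64-bit value, while sign_extend_eax of that result is -1 again. Comparing EAX instead of RAX in the assembly code avoids the problem in the same way.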

I'm short on time now; next time I'll write an article that discusses this in detail.

Before finishing, here is part of a reference document on interfacing with C.

==========================================

Interfacing HLL code with ASM

C Calling Convention–standard stack frame

Arguments passed to a C function are pushed onto the stack, right to left, before the function is called. The first thing the called function does is push the (E)BP register and then copy (E)SP into it. This creates a data structure called the standard C stack frame.

    32-bit code         16-bit code, TINY, SMALL,     16-bit code, MEDIUM, LARGE,
                        or COMPACT memory models      or HUGE memory models

Create standard stack frame, allocate 16 bytes for local variables, save registers:

    push ebp            push bp                       push bp
    mov ebp,esp         mov bp,sp                     mov bp,sp
    sub esp,16          sub sp,16                     sub sp,16
    push edi            push di                       push di
    push esi            push si                       push si
    ...                 ...                           ...

Restore registers, destroy stack frame, and return:

    ...                 ...                           ...
    pop esi             pop si                        pop si
    pop edi             pop di                        pop di
    mov esp,ebp         mov sp,bp                     mov sp,bp
    pop ebp             pop bp                        pop bp
    ret                 ret                           retf

Size of 'slots' in stack frame, i.e. stack width:

    32 bits             16 bits                       16 bits

Location of stack frame 'slots':

    [ebp + 8]           [bp + 4]                      [bp + 6]
    [ebp + 12]          [bp + 6]                      [bp + 8]
    [ebp + 16] ...      [bp + 8] ...                  [bp + 10] ...

If the argument passed to a function is wider than the stack, it will occupy more than one 'slot' in the stack frame. A 64-bit value passed to a function (long long or double) would occupy 2 stack slots in 32-bit code or 4 stack slots in 16-bit code.

Function arguments are accessed with positive offsets from the BP or EBP registers. Local variables are accessed with negative offsets. The previous value of BP or EBP is stored at [bp + 0] or [ebp + 0]. The return address (IP or EIP) is stored at [bp + 2] or [ebp + 4].
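
The slot and offset arithmetic above can be written out as a short C sketch. The function names are hypothetical, and the first-argument offsets (8 for 32-bit code, 4 for 16-bit near calls, 6 for 16-bit far calls) are taken from the stack frame layouts listed earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Number of stack slots an argument of `arg_bytes` occupies when each
 * slot is `slot_bytes` wide (4 in 32-bit code, 2 in 16-bit code). */
size_t stack_slots(size_t arg_bytes, size_t slot_bytes)
{
    assert(slot_bytes > 0);
    return (arg_bytes + slot_bytes - 1) / slot_bytes;
}

/* Positive offset from (E)BP of argument `index` (0-based), given the
 * sizes in bytes of all arguments in declaration order.
 * `first_arg_offset` skips the saved (E)BP and the return address. */
size_t arg_offset(const size_t *arg_bytes, size_t index,
                  size_t slot_bytes, size_t first_arg_offset)
{
    size_t offset = first_arg_offset;
    for (size_t i = 0; i < index; i++)
        offset += stack_slots(arg_bytes[i], slot_bytes) * slot_bytes;
    return offset;
}
```

For a 32-bit function f(int a, long long b, int c), the argument sizes are 4, 8, and 4 bytes, so a is at [ebp + 8], b at [ebp + 12], and c at [ebp + 20], because b takes two slots.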

C Calling Convention–return values

A C function usually stores its return value in one or more registers.

                            32-bit code       16-bit code, all memory models
    8-bit return value      AL                AL
    16-bit return value     AX                AX
    32-bit return value     EAX               DX:AX
    64-bit return value     EDX:EAX           hidden pointer (*)
    128-bit return value    hidden pointer    hidden pointer

(*) Space for the return value is allocated on the stack of the calling function, and a 'hidden' pointer to this space is passed to the called function.

C Calling Convention–saving registers

GCC expects functions to preserve the callee-save registers:

EBX, EDI, ESI, EBP, DS, ES, SS

You need not save these registers:

EAX, ECX, EDX, FS, GS, EFLAGS, floating point registers

In some OSes, FS or GS is used as a pointer to thread local storage (TLS), and must be saved if you modify it.

C Calling Convention–leading underscores

Some C compilers (those for DOS and Windows, and those with COFF output) prepend an underscore to the names of C functions and global variables. If a C global variable, e.g. conv_mem_size, is accessed by ASM code, it should be declared with a leading underscore in the ASM code:

    EXTERN _conv_mem_size   ; NASM syntax
    mov [_conv_mem_size],ax

Linux ELF does not use underscores. Watcom C uses trailing underscores for function names, and leading underscores for global variables.

If your GCC supports it, leading underscores can be turned off with the compiler option -fno-leading-underscore
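
The naming rules above can be summarized as a small C sketch. It encodes only the three cases named in this section; the enum and function names are hypothetical and do not correspond to any compiler API.

```c
#include <stdio.h>

/* The three naming styles described in the text (names are hypothetical). */
enum symbol_style {
    STYLE_LEADING_UNDERSCORE,   /* DOS, Windows, COFF output */
    STYLE_ELF_PLAIN,            /* Linux ELF: name unchanged */
    STYLE_WATCOM                /* trailing _ for functions, leading _ for globals */
};

/* Write the object-file symbol for C identifier `name` into `out`.
 * Returns the length of the decorated name, as snprintf does. */
int decorate_symbol(char *out, size_t out_size, const char *name,
                    enum symbol_style style, int is_function)
{
    switch (style) {
    case STYLE_LEADING_UNDERSCORE:
        return snprintf(out, out_size, "_%s", name);
    case STYLE_WATCOM:
        return is_function ? snprintf(out, out_size, "%s_", name)
                           : snprintf(out, out_size, "_%s", name);
    case STYLE_ELF_PLAIN:
    default:
        return snprintf(out, out_size, "%s", name);
    }
}
```

Under these rules the global conv_mem_size becomes _conv_mem_size for a COFF target and stays conv_mem_size for Linux ELF, which is the name the EXTERN line in the ASM code has to match.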

Pascal calling conventions

Function arguments are pushed onto the stack from left to right before the function is called. C-style variable-length argument lists are not possible in Pascal. (Look in file stdarg.h and think about it.)

In C, the calling function must "clean up the stack" (remove function arguments from the stack after the called function returns). In Pascal, the called function must do this, before returning.

Pascal identifiers are case-insensitive. MyKewlProc() would be stored in the object code file as MYKEWLPROC

Other calling conventions

The __stdcall calling convention, used by Windows, is a hybrid of the C and Pascal calling conventions. Like C, function arguments are pushed right-to-left. Like Pascal, the called function must clean up the stack. Exception: the caller must clean up the stack for functions that accept a variable number of arguments, e.g. printf(const char *format, ...);
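
The differences between the C, Pascal, and __stdcall conventions come down to push order and who removes the arguments. A minimal C sketch of those two rules follows; the enum and function names are hypothetical, and the cleanup value corresponds to the N in the callee's ret N instruction.

```c
#include <stddef.h>

enum convention { CONV_C, CONV_PASCAL, CONV_STDCALL };

/* 1 if arguments are pushed right to left (last argument first),
 * 0 if they are pushed left to right. */
int pushes_right_to_left(enum convention conv)
{
    return conv != CONV_PASCAL;
}

/* Bytes the called function removes from the stack itself.
 * 0 means the caller cleans up after the call returns. */
size_t callee_cleanup_bytes(enum convention conv, size_t arg_bytes,
                            int is_variadic)
{
    switch (conv) {
    case CONV_PASCAL:
        return arg_bytes;                   /* no variadic functions in Pascal */
    case CONV_STDCALL:
        return is_variadic ? 0 : arg_bytes; /* variadic: caller cleans up */
    case CONV_C:
    default:
        return 0;
    }
}
```

A 32-bit __stdcall function taking three 4-byte arguments would therefore end with ret 12, while the same function under the C convention ends with a plain ret and the caller adds 12 to ESP.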

Watcom C uses a register-based calling convention. See sections 7.4, 7.5, 10.4, and 10.5 in Cuserguide.pdf in the Watcom documentation. Individual functions can be declared to use the normal, stack-based calling convention.

GCC can be made to use a register calling convention by compiling with gcc -mregparm=NNN ...
See the GCC documentation for details.
