Recently I have been studying 64-bit assembly on and off, and here I summarize what I have learned. The aim is twofold: to review this phase of my own learning, and to help newcomers to 64-bit assembly. I am new to this subject myself, so this article surely contains mistakes; I would be grateful for any corrections.
The title of this article covers four things:
(1) Windows: this article is about assembly programming on Windows. The debugging environment is Windows Vista 64-bit edition, and everything called is a Windows API.
(2) x64: this article discusses x64 assembly. Here x64 means AMD64 and Intel EM64T, not IA-64. You can look up the differences between the three yourself.
(3) Assembly: as the name suggests, the language discussed here is assembly. 64-bit programming in high-level languages is out of scope.
(4) Getting Started: the article is not comprehensive. First, much of the material is only touched on, and deeper study is left for later. Second, it also serves as my own introduction to x64 assembly.
All code in this article was debugged on Windows Vista x64 with an Intel Core 2 Duo.
1. Establish a Development Environment
1.1 Compiler Selection
Different x64 assemblers come with different development environments. The most common one is Microsoft MASM. For x64 the assembler has been renamed ml64.exe and ships with Visual Studio 2005. So if you are a loyal Microsoft fan, you can simply install VS2005; to use it, open the corresponding 64-bit command-line window (Figure 1) and run ml64. The second recommended assembler is GoAsm, which comes as three tools: the GoAsm assembler, the GoLink linker, and the GoRC resource compiler, along with an include directory. Its biggest advantage is its small size: you do not need to install several gigabytes of VS just to learn 64-bit assembly. For that reason, the code in this article is assembled with GoAsm.
The third option is YASM. I am not familiar with it, so I will not cover it here; try it yourself if you are interested.
The syntax differs from one assembler to another.
1.2 IDE Selection
I searched the Internet and found no IDE that supports 64-bit assembly, not even an editor. The simplest approach is to adapt the MASM syntax file of EditPlus yourself, which is what I did, at least to get syntax highlighting. Of course, if you cannot be bothered, just use Notepad.
Without an IDE you have to type many parameters and options for every build, so put them in a batch file.
1.3 Hardware and Operating System
The hardware requirement is a 64-bit CPU, and the operating system must also be 64-bit. If a 32-bit operating system is installed on a 64-bit CPU, the program will assemble but cannot run.
2. Register Changes
Assembly deals directly with registers, so the hardware has a great influence on the language. First, let's look at what x64 adds to the hardware compared with 32-bit x86, and what has changed (Figure 2).
x64 has eight more general-purpose registers: R8, R9, R10, R11, R12, R13, R14, and R15. They are all 64-bit, of course. In addition, eight more 128-bit XMM registers are added (XMM8 to XMM15), although they are usually not needed.
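As a minimal sketch of how these registers look in code, the listing below uses ml64 syntax and touches a few of the new registers at each width; the narrower names (r8d, r8w, r8b) are explained in the next paragraph. The file name regs.asm and the chosen constant are assumptions for illustration only, and the remark about 32-bit writes clearing the upper half is an architectural detail not covered in this article.

```asm
; regs.asm (hypothetical file name), ml64 syntax
.code
Main proc
    mov  rax, 1122334455667788h   ; load a full 64-bit value
    mov  r8, rax                  ; r8 is one of the eight new 64-bit registers
    mov  r9d, r8d                 ; low 32 bits of r8; a 32-bit write clears bits 32-63 of r9
    mov  r10w, r8w                ; low 16 bits of r8; the rest of r10 is left unchanged
    mov  r11b, r8b                ; low 8 bits of r8; the rest of r11 is left unchanged
    xor  eax, eax                 ; return 0
    ret
Main endp
end
```

Since the program calls no API, it should build without any import library, for example with: ml64 regs.asm /link /subsystem:console /entry:Main. Stepping through it in a debugger is the easiest way to watch the register values change.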
The registers that already existed on 32-bit x86 are extended to 64 bits, and the first letter of the name changes from E to R. We can still use the narrower registers in 64-bit programs: RAX (64 bits), EAX (low 32 bits), AX (low 16 bits), AL (low 8 bits), and AH (bits 8 to 15). The corresponding names for the new registers are R8, R8D, R8W, and R8B. However, avoid AH and its siblings (BH, CH, DH) in your programs, because on AMD64 they conflict with some instructions: they cannot be encoded in any instruction that carries a REX prefix, such as one that also uses R8 to R15.
3. The First x64 Assembly Program
In this section we write our first x64 assembly program. Before that, let's talk about the change in calling convention.
3.1 API Calling Method
Putting the calling convention first reflects its importance. In 32-bit assembly we use stdcall when calling an API. It has two features: first, all parameters are pushed onto the stack and passed through it; second, the called API is responsible for restoring the stack pointer (ESP). We do not need an add esp, 10h after calling MessageBox, because MessageBox has already restored it.
In x64 assembly, both aspects have changed. First, the first four parameters are passed in four registers: RCX, RDX, R8, and R9; any further parameters are passed on the stack. Second, the caller is responsible for allocating and releasing the stack space.
The following code shows a simple MessageBox call. Note the operations on RSP:
; sample code 1.asm
; Syntax: GoAsm
DATA SECTION
text     db 'Hello x64!', 0
caption  db 'my first x64 application', 0

CODE SECTION
START:
sub rsp, 28h        ; reserve stack space for the call
xor r9d, r9d        ; uType = 0 (MB_OK)
lea r8, caption     ; lpCaption
lea rdx, text       ; lpText
xor rcx, rcx        ; hWnd = NULL
call MessageBoxA
add rsp, 28h        ; release the stack space
ret
This code is assembled with GoAsm. The instructions themselves are much the same in GoAsm and ml64; the main difference lies in the directives. For example, .code in MASM becomes CODE SECTION. We will come back to the differences later; first, let's build it. Building with GoAsm takes two steps:
(1) Assemble: goasm /x64 1.asm
(2) Link: golink 1.obj user32.dll
If everything goes well, the command line should show the output in Figure 3.
For comparison, here is the same program in ml64 syntax:
; sample code 2.asm
; Syntax: ml64
extrn MessageBoxA: proc
.data
text     db 'Hello x64!', 0
caption  db 'my first x64 application', 0
.code
Main proc
sub rsp, 28h        ; reserve stack space for the call
xor r9d, r9d        ; uType = 0 (MB_OK)
lea r8, caption     ; lpCaption
lea rdx, text       ; lpText
xor rcx, rcx        ; hWnd = NULL
call MessageBoxA
add rsp, 28h        ; release the stack space
ret
Main endp
end
Build it with: ml64 2.asm /link /subsystem:windows /entry:Main user32.lib
If everything goes well, the same message box should appear. Interestingly, on a 64-bit system we still call the user32 API; perhaps Microsoft simply could not be bothered to rename it.
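3.2 The 64-bit Stack
Note the sub rsp, 28h and add rsp, 28h. Where does the value 28h come from?
First, in x64 the stack is extended to 64 bits, so every slot is 8 bytes. Second, when MessageBoxA is called, the caller must reserve an 8-byte slot for each of the four register parameters, plus one more 8-byte slot: the return address pushed on entry to our code leaves RSP 8 bytes off a 16-byte boundary, and RSP must be 16-byte aligned at the call. So 8 bytes * 5 = 40 = 28h.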
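To illustrate the rule from section 3.1 that parameters beyond the fourth go on the stack, here is a sketch in ml64 syntax that calls MessageBoxExA, the five-parameter variant of MessageBoxA in user32. The file name and the data labels msgText and msgTitle are assumptions made for this example. The fifth parameter is stored at [rsp+20h], directly above the four 8-byte slots that belong to the register parameters, so 28h is again enough here: 20h for the four slots plus 8h for the fifth parameter, which also leaves RSP 16-byte aligned.

```asm
; five_args.asm (hypothetical file name), ml64 syntax
extrn MessageBoxExA: proc

.data
msgText   db 'Hello x64!', 0
msgTitle  db 'Five parameters', 0

.code
Main proc
    sub  rsp, 28h                  ; 20h for four register-parameter slots + 8h for the 5th parameter
    mov  qword ptr [rsp+20h], 0    ; 5th parameter: wLanguageId = 0
    xor  r9d, r9d                  ; 4th parameter: uType = 0 (MB_OK)
    lea  r8, msgTitle              ; 3rd parameter: lpCaption
    lea  rdx, msgText              ; 2nd parameter: lpText
    xor  ecx, ecx                  ; 1st parameter: hWnd = NULL
    call MessageBoxExA
    add  rsp, 28h                  ; the caller releases the space it reserved
    ret
Main endp
end
```

It should build the same way as the ml64 sample: ml64 five_args.asm /link /subsystem:windows /entry:Main user32.lib. With a sixth parameter the next slot would be [rsp+28h], and the reserved size would have to grow in steps that keep RSP 16-byte aligned.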
Note that AMD64 does not support push with a 32-bit register; the best approach is to use 64-bit registers for push and pop. What about EM64T? According to Intel's development manual, each instruction form falls into one of three cases: valid only in 32-bit mode, valid only in 64-bit mode, or valid in both. The following is an excerpt from the manual (N.E. means not encodable):
Opcode   Instruction   64-Bit Mode   Compat/Leg Mode   Description
FF /6    PUSH r/m16    Valid         Valid             Push r/m16.
FF /6    PUSH r/m32    N.E.          Valid             Push r/m32.
FF /6    PUSH r/m64    Valid         N.E.              Push r/m64. Default operand size 64-bits.
There is no way around this, so take care: in a 64-bit program, use 64-bit registers with push and pop.
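A short sketch of the point above, in ml64 syntax (the file name is an assumption): registers are saved and restored with 64-bit push and pop, while 32-bit operations on the same registers remain available for everything else. The commented-out line shows the form that the manual marks as N.E. in 64-bit mode; an assembler targeting 64-bit mode is expected to reject it.

```asm
; push64.asm (hypothetical file name), ml64 syntax
.code
Main proc
    push rbx            ; 64-bit push: valid in 64-bit mode
    push r12            ; the new registers are pushed as 64-bit values too
    ; push ebx          ; 32-bit push: not encodable in 64-bit mode
    mov  ebx, 1         ; ordinary 32-bit operations are still fine
    mov  r12d, ebx
    pop  r12            ; restore in reverse order
    pop  rbx
    xor  eax, eax       ; return 0
    ret
Main endp
end
```

Each push or pop moves RSP by 8 bytes, which is worth keeping in mind when counting stack offsets.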
4. Some References
With the first Hello World written, this article stops here. There is more I would like to write, but I am not yet familiar enough with it, so it will have to wait for next time. Some materials need to be listed in this first article, because they are the best teaching materials for learning x64 assembly; much of the code and many of the points in this article come from them.
(1) Moving to Windows Vista x64, from: http://www.ntcore.com/Files/vista_x64.htm
(2) The GoAsm help documentation, currently the best 64-bit assembly tutorial, from: www.jorgon.freeserve.co.uk
(3) Everything you need to know before starting 64-bit Windows system programming, from: http://www.microsoft.com/china/MSDN/library/Windev/64bit/issuesx64.mspx
(4) Two articles from codegurus:
Assembler & Win64,
http://www.codegurus.be/codegurus/Programming/assembler&win64_en.htm
About RIP relative addressing,
http://www.codegurus.be/codegurus/Programming/riprelativeaddressing_en.htm
(5) The AMD development manuals
(6) The Intel development manuals; note that the current edition is the "Intel 64 and IA-32 Architectures Software Developer's Manual"
64-bit technology is not yet mature, and there is no debugger for it yet, but we are always curious and enthusiastic about new things. That is reason enough to start learning 64-bit assembly now! OK, let's go on.
1. More on the Calling Convention
I covered some of the API calling method in part (1), but two more points are worth discussing. The first is the layout of the stack when an API is called, that is, the stack frame; the second is using a 64-bit C/C++ program to study the calling convention.
Let's start with the stack frame. Figure 1 shows a typical stack frame.
In a 32-bit program using stdcall, the stack frame has four jobs:
(1) pass the input parameters of the call;
(2) on return to the caller, let the callee balance the stack;
(3) provide space for local variables;
(4) ensure that the values of the four registers EBX, ESI, EDI, and EBP remain unchanged (these registers are called non-volatile).
In a 64-bit environment, the job of balancing the stack is gone, because the caller is now responsible for it, so the callee's stack frame has only three jobs left:
(1) store the parameters passed in registers, and hold any parameters beyond the fourth (which are already on the stack);
(2) provide space for local variables;
(3) ensure that the non-volatile register values remain unchanged, including RBP, RBX, RDI, RSI, R12 to R15, and XMM6 to XMM15.
Therefore, the following code is often found at the beginning of a function:
mov [rsp+8h], rcx
mov [rsp+10h], rdx
mov [rsp+18h], r8
mov [rsp+20h], r9
push rbp
mov rbp, rsp
And the following at the return:
lea rsp, [rbp]
pop rbp
ret
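Putting the prologue and epilogue above together, the sketch below (ml64 syntax) shows a complete function that saves its four register parameters in the slots reserved by the caller, sets up RBP, and then reads the parameters back through RBP. The function name SumFour, the file name, and the argument values are assumptions for illustration; this is one possible layout, not the only valid one.

```asm
; frame.asm (hypothetical file name), ml64 syntax
.code

; SumFour(a, b, c, d): returns a + b + c + d in RAX (hypothetical helper)
SumFour proc
    mov  [rsp+8h],  rcx     ; on entry [rsp] holds the return address,
    mov  [rsp+10h], rdx     ; so the four slots reserved by the caller
    mov  [rsp+18h], r8      ; start at [rsp+8h]
    mov  [rsp+20h], r9
    push rbp                ; save the caller's frame pointer
    mov  rbp, rsp           ; after the push, each slot is 8 bytes further away
    mov  rax, [rbp+10h]     ; a
    add  rax, [rbp+18h]     ; + b
    add  rax, [rbp+20h]     ; + c
    add  rax, [rbp+28h]     ; + d
    lea  rsp, [rbp]         ; drop any local space (none in this sketch)
    pop  rbp
    ret
SumFour endp

Main proc
    sub  rsp, 28h           ; reserve the four slots for SumFour and keep RSP aligned
    mov  ecx, 1
    mov  edx, 2
    mov  r8d, 3
    mov  r9d, 4
    call SumFour            ; RAX = 10 on return
    add  rsp, 28h
    ret
Main endp
end
```

It can be built like the register-only examples, for instance: ml64 frame.asm /link /subsystem:console /entry:Main. A real function that needs local variables would add a sub rsp, N after mov rbp, rsp; the lea rsp, [rbp] in the epilogue then releases that space.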