This article is a casual collection of notes. We take assembly before the computer principles course, and the textbook our school uses is bad beyond imagination: it is not written for beginners, it reads like a copy-and-paste job, and it has no beginning, no end and no explanations. So I had to sort out Intel assembly for myself. This post is simply a list of notes and will not be of much use to anyone else.
First, a reference: configuring MASM programs to run in Visual Studio
Basic concepts
Loading and executing a program:
The OS finds the file and reads its basic information.
The OS loads the file into memory.
The OS executes a branch instruction so that the CPU starts at the program's first machine instruction, and creates a process.
The OS monitors the process and responds to its resource requests.
After the process ends, the OS deletes its handle and releases its resources.
IA-32 architecture
Operating modes:
Protected mode: all instructions and features are available. Each program gets its own memory segments, and the processor blocks access to memory outside the allocated segments.
Real-address mode: programs can access system memory and hardware directly.
System management mode: gives the operating system a mechanism for functions such as power management and system security.
Basic execution environment
Address space: 4 GB in protected mode, 1 MB in real-address mode
Basic registers:
8 general-purpose registers: EAX, EBX, ECX, EDX, EBP (base pointer), ESP (stack pointer), ESI (source index), EDI (destination index). The first four can be addressed down to 8 bits (EAX, AX, AH, AL); the last four only down to 16 bits (ESI, SI).
6 segment registers: CS (code), SS (stack), DS (data), ES, FS, GS
EFLAGS
Instruction pointer EIP (the program counter)
Detailed description:
EAX is used automatically by the multiplication and division instructions.
Some instructions use ECX as the loop counter.
ESP addresses data on the stack and is generally not used for arithmetic or data transfer.
ESI and EDI are used by the high-speed memory transfer instructions (extended source index and extended destination index).
EBP is used by high-level languages to reference function parameters and local variables on the stack, and should not be used for ordinary arithmetic or data transfer.
In real-address mode the segment registers hold segment base addresses (as 16-bit segment values). In protected mode they hold pointers into a descriptor table.
EFLAGS is a set of flags. A flag is set when it is 1 and clear (reset) when it is 0. The flags include:
CF (carry): set when the result of an unsigned operation does not fit in the destination
OF (overflow): set when a signed operation overflows
SF (sign): set when the result is negative
ZF (zero): set when the result of an arithmetic or logical operation is 0
AC (auxiliary carry): set when an arithmetic operation on an 8-bit operand carries from bit 3 into bit 4
PF (parity): set when the least significant byte of the result contains an even number of 1 bits
Memory management
Real-address mode
The address space is 0 to FFFFFh. The segment and the offset are both 16 bits, and segment x 16 + offset = absolute address, for example 08F1h:0100h = 09010h.
Protected mode
The address space is 4 GB, 0 to FFFFFFFFh. MASM uses the flat model. The segment registers point to descriptors in a descriptor table, and all segments are mapped onto the full 32-bit physical address space.
A program uses at least 2 segments: a code segment and a data segment. Each segment descriptor is a 64-bit value stored in the global descriptor table. The segment limit represents the amount of physical memory in the system. In the multi-segment model each process has its own table of segment descriptors, and each segment has its own address space bounded by its segment limit.
In addition, there is paging.
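The real-address-mode rule above (segment x 16 + offset) can be checked with a few lines of Python. This is only a sketch: the function name `linear_address` is made up for illustration, and the wrap to 20 bits assumes the behaviour of the original 8086, which has only 20 address lines.

```python
def linear_address(segment: int, offset: int) -> int:
    """Translate a real-address-mode segment:offset pair into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both be 16-bit values")
    # Shifting left by 4 bits is the same as multiplying by 16.
    # The mask models the 8086's 20 address lines (an assumption of this sketch).
    return ((segment << 4) + offset) & 0xFFFFF


# The 08F1h:0100h example from the text
print(f"{linear_address(0x08F1, 0x0100):05X}h")
```

Note that many different segment:offset pairs name the same byte, for example 08F1h:0100h and 0901h:0000h.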
Basic elements of assembly language
Constants
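Integer constants are decimal by default. A radix suffix can be added: 10h (hexadecimal), 10d (decimal), 10o (octal), 10b (binary). A hexadecimal constant that begins with a letter needs a leading 0, for example 0A3h.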
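To make the suffix rule concrete, here is a small Python sketch that converts such literals to integers. `parse_masm_int` is a hypothetical helper written for this note, not part of MASM, and it only handles the suffixes h, d, o, q and b (q being the alternative octal suffix).

```python
RADIX_BY_SUFFIX = {"h": 16, "d": 10, "o": 8, "q": 8, "b": 2}


def parse_masm_int(literal: str) -> int:
    """Convert a MASM-style integer literal such as '10h' or '10b' to an int."""
    text = literal.strip().lower()
    # A literal must start with a digit; this is why 0A3h needs its leading 0.
    if not text or not text[0].isdigit():
        raise ValueError(f"not an integer literal: {literal!r}")
    suffix = text[-1]
    if suffix in RADIX_BY_SUFFIX:
        digits, radix = text[:-1], RADIX_BY_SUFFIX[suffix]
    else:
        digits, radix = text, 10  # no suffix: decimal by default
    return int(digits, radix)


for sample in ("10h", "10d", "10o", "10b", "10", "0A3h"):
    print(sample, "=", parse_masm_int(sample))
```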
Integer expressions can use the following operators:
( ), +, -, *, /, MOD
Character and string constants
'A', "a", 'Goodnight', "Goodnight" (either single or double quotes can be used)
Reserved words
Instruction mnemonics, such as MOV ...
Directives
Attributes, such as BYTE ...
Operators
Predefined symbols, such as @data, which return an integer constant at assembly time
Identifiers
Names chosen by the programmer for variables, constants, procedures ...
Directives
Commands embedded in the source code and acted on by the assembler, such as DWORD, .data, .code and .stack 100h (which sets the size of the runtime stack)
The structure of an assembly program, explained with the example from the book:
    TITLE MASM Template (main.asm)   ; TITLE is a directive that turns the whole line into a comment
    ; Description:
    ;
    ; Revision date:
    INCLUDE Irvine32.inc
    .data                            ; data segment
    myMessage BYTE "MASM program example", 0dh, 0ah, 0
    .code                            ; code segment
    main PROC
        mov eax, 10000h
        add eax, 40000h
        sub eax, 20000h
        call DumpRegs
        exit          ; exit indirectly calls an MS-Windows function that terminates the program; it is defined in Irvine32.inc
    main ENDP         ; remember to close main with ENDP
    END main          ; END marks the last line of the program and the assembler ignores everything after it; the name after END is the program's entry point
Another, more detailed way of writing this program:
    TITLE MASM Template (main.asm)
    ; Description:
    ;
    ; Revision date:
    .386                    ; the minimum CPU requirement is the 80386
    .model flat, stdcall    ; flat tells the assembler to generate protected-mode code; stdcall allows MS-Windows functions to be called
    .stack 4096
    ExitProcess PROTO, dwExitCode:DWORD   ; PROTO declares the prototype of a procedure used by this program; ExitProcess is the MS-Windows function that ends the process
    DumpRegs PROTO          ; DumpRegs is the Irvine32 procedure that dumps the registers
    .code
    main PROC
        mov eax, 10000h
        add eax, 40000h
        sub eax, 20000h
        call DumpRegs
        INVOKE ExitProcess, 0   ; call the procedure; the argument 0 is the exit code
    main ENDP
    END main
Program skeleton
    TITLE Program name (file name)
    ; Desc:
    ; Author:
    ; Create date:
    ;
    ; Modified:
    ; Mod date:
    ; Author:
    INCLUDE Irvine32.inc
    .data
        ; variables
    .code
    main PROC
        ; executable code
        exit
    main ENDP
        ; other procedures
    END main
The assembly process
The assembler generates an object file and a listing file from the source file.
The linker reads the object file, copies the procedures it needs from the link library, and combines them into an executable file. It can also generate a map file.
The listing file contains the program source code, line numbers, offset addresses, the translated machine code and a symbol table, in a form that is easy to read.
The map file is a text file that contains segment information about the linked program.
Defining Data
MASM intrinsic data types:
BYTE: 8-bit unsigned integer
SBYTE: 8-bit signed integer
WORD: 16-bit unsigned integer (also a near pointer in real-address mode)
SWORD: 16-bit signed integer
DWORD: 32-bit unsigned integer (also a near pointer in protected mode)
SDWORD: 32-bit signed integer
FWORD: 48-bit integer (far pointer in protected mode)
QWORD: 64-bit integer
TBYTE: 80-bit integer
REAL4: 32-bit IEEE real
REAL8: 64-bit IEEE real
REAL10: 80-bit IEEE real
The older directives DB (8 bits), DW (16 bits), DD (32-bit integer or real), DQ (64-bit integer or real) and DT (10 bytes) are also available.
Examples of data definitions:
    var1 BYTE 'A'
    var2 BYTE ?                  ; uninitialized
    vard WORD 65535
    vard1 SWORD -32768
    varf REAL4 -2.1
    varf1 REAL10 4.6E-400
    varl1 BYTE 10, 20, 30, 40    ; initializes multiple values, each one at the next offset
    ; defining strings
    str1 BYTE "Go", 0
    str3 BYTE 'no', 0
    ; defining a string across multiple lines
    str4 BYTE "HEY, This is a "
         BYTE "multi-line string", 0dh, 0ah   ; 0dh, 0ah is \r\n
         BYTE "Result", 0
    ; initializing multiple elements
    var3 BYTE 20 DUP(0)          ; defines 20 bytes, all initialized to 0
    var4 BYTE 4 DUP("STACK")     ; "STACKSTACKSTACKSTACK"
    ; Intel uses little-endian byte order; 12345678h is stored as
    ; 0000: 78
    ; 0001: 56
    ; 0002: 34
    ; 0003: 12
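Note: although the book says a string can be defined across multiple lines, LENGTHOF then seems to count only the first line. As far as I can tell, LENGTHOF only counts the initializers on the label's own line, unless that line ends with a comma and the list continues on the next line without repeating the directive. This needs further checking.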
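The little-endian layout of 12345678h listed in the data definitions can be reproduced in Python with the standard struct module. This is only an illustration of the byte order; the helper name `dword_bytes` is made up for this note.

```python
import struct


def dword_bytes(value: int) -> bytes:
    """Return the 4 bytes of a 32-bit value in the order an Intel CPU stores them."""
    return struct.pack("<I", value)  # "<" = little-endian, "I" = unsigned 32-bit


stored = dword_bytes(0x12345678)
for offset, byte in enumerate(stored):
    print(f"{offset:04X}: {byte:02X}")  # lowest address holds the lowest byte

# Reading the bytes back in the same order recovers the original value.
assert struct.unpack("<I", stored)[0] == 0x12345678
```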
Symbolic Constants
Symbolic constants do not occupy any storage and cannot be changed at run time.
Three directives define them: =, EQU and TEXTEQU
    ; name = expression
    ; the name is replaced directly by its value; a symbol defined with = can be redefined
    COUNT = 100
    mov ax, COUNT              ; assembles as mov ax, 100
    ; it also combines nicely with DUP to define arrays
    array1 DWORD COUNT DUP(0)
    ; calculating the size of an array
    list DWORD 10, 30, 33
    listSize = ($ - list) / 4  ; $ is the offset of the current statement; divide by 4 because a DWORD is 4 bytes
    ;----------------------------------
    ; name EQU expression      ; expression is an integer expression
    ; name EQU <text>          ; text is a literal
    ; a symbol defined with EQU cannot be redefined
    ;-----------------------------------
Basics of assembly statements
Basic operations
Basic arithmetic instructions:
INC / DEC: add 1 to or subtract 1 from the operand
ADD: add the source to the destination; the source is unchanged
SUB: subtract the source from the destination; the source is unchanged
NEG: negate the operand (invert every bit and add 1, that is, take the two's complement)
Effect on the flags:
CF indicates whether an unsigned integer operation overflowed
OF indicates whether a signed integer operation overflowed
ZF indicates whether the result is 0
SF indicates whether the result is negative
PF indicates whether the number of 1 bits in the lowest byte of the destination operand is even
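The flag rules above can be tried out with a small Python model of an 8-bit ADD. This is a sketch for experimenting, not a description of how the CPU is built; the function name `add8_flags` is invented for this note.

```python
def add8_flags(a: int, b: int):
    """Model an 8-bit ADD and return (result, flags) as the CPU would report them."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),  # unsigned result does not fit in 8 bits
        # Signed overflow: both inputs have the same sign, the result has the other.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "ZF": int(result == 0),
        "SF": result >> 7,        # copy of the sign bit
        "PF": int(bin(result).count("1") % 2 == 0),  # even number of 1 bits
    }
    return result, flags


print(add8_flags(0xFF, 0x01))  # unsigned overflow: 255 + 1
print(add8_flags(0x7F, 0x01))  # signed overflow: 127 + 1
```

The two sample calls show that CF and OF are independent: 255 + 1 carries without signed overflow, while 127 + 1 overflows as a signed addition without a carry.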
The MOV family
The MOV family are the data transfer instructions: MOV, MOVZX and MOVSX.
MOV has some rules to follow:
Both operands must be the same size.
Both operands cannot be memory operands at the same time.
The destination operand cannot be CS, EIP or IP.
An immediate value cannot be moved directly into a segment register (protected-mode programs should not modify the segment registers at all).
To transfer data between operands of different sizes, use MOVZX or MOVSX. MOVZX zero-extends: it suits unsigned integers and fills the upper bits with 0. For signed numbers use MOVSX, which fills the upper bits with copies of the sign bit.
XCHG is a similar instruction. It exchanges its two operands, but both cannot be memory operands at the same time.
Other data operators:
OFFSET returns the offset of a variable from the start of its segment.
PTR overrides the default size of a variable.
TYPE returns the size, in bytes, of each element of an array.
LENGTHOF returns the number of elements in an array.
SIZEOF returns the number of bytes used by the array's initializers, that is, LENGTHOF * TYPE.
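The difference between zero extension and sign extension is easy to see with a concrete value. The following Python sketch models both; `movzx` and `movsx` here are ordinary functions named after the instructions, and the explicit bit-width parameters are an assumption of the model.

```python
def movzx(value: int, src_bits: int, dst_bits: int) -> int:
    """Zero-extend: keep the low src_bits and fill the upper bits with 0."""
    if dst_bits <= src_bits:
        raise ValueError("destination must be wider than source")
    return value & ((1 << src_bits) - 1)


def movsx(value: int, src_bits: int, dst_bits: int) -> int:
    """Sign-extend: copy the source's sign bit into all the upper bits."""
    if dst_bits <= src_bits:
        raise ValueError("destination must be wider than source")
    value &= (1 << src_bits) - 1
    if value >> (src_bits - 1):       # sign bit set: the value is negative
        value -= 1 << src_bits        # reinterpret as a negative number
    return value & ((1 << dst_bits) - 1)


# 8Fh is 143 as an unsigned byte but -113 as a signed byte.
print(f"movzx: {movzx(0x8F, 8, 32):08X}h")
print(f"movsx: {movsx(0x8F, 8, 32):08X}h")
```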
Addressing
    ; indirect addressing
    .data
    var BYTE 10h
    .code
    mov esi, OFFSET var    ; ESI holds the offset of var
    mov al, [esi]          ; same as mov al, var
    ; indirect addressing is also convenient for traversing arrays
    .data
    arr BYTE 10, 30, 50, 80
    .code
    mov esi, OFFSET arr
    mov al, [esi]          ; AL = 10
    inc esi
    mov ah, [esi]          ; AH = 30
    ; an indexed operand adds a constant and a register to form the effective address;
    ; any 32-bit general-purpose register can be used as the index register
    ; format: constant[reg] or [constant + reg]
    array[esi]             ; equivalent to [array + esi]
    ; for elements larger than one byte, scale the index
    array2[esi*4]          ; with ESI = 3, the fourth DWORD
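The addressing forms above boil down to computing an address and then reading memory at it. The Python sketch below models memory as a bytearray; the offsets `ARR` and `ARRAY2` and the contents of array2 are made up for illustration.

```python
import struct

memory = bytearray(32)
ARR = 0      # hypothetical offset of arr
ARRAY2 = 8   # hypothetical offset of array2
memory[ARR:ARR + 4] = bytes([10, 30, 50, 80])
memory[ARRAY2:ARRAY2 + 16] = struct.pack("<4I", 1000, 2000, 3000, 4000)


def read_byte(address: int) -> int:
    return memory[address]


def read_dword(address: int) -> int:
    return struct.unpack_from("<I", memory, address)[0]


esi = ARR                 # mov esi, OFFSET arr
al = read_byte(esi)       # mov al, [esi]
esi += 1                  # inc esi
ah = read_byte(esi)       # mov ah, [esi]

esi = 3
fourth = read_dword(ARRAY2 + esi * 4)   # mov eax, array2[esi*4]
print(al, ah, fourth)
```

Without the scale factor, array2[esi] with ESI = 3 would read 4 bytes starting in the middle of the first element, which is why the index is multiplied by the element size.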
Loops and conditions
JMP: unconditional jump
LOOP uses ECX as its counter and decrements it by 1 on each iteration. Specifically, it subtracts 1 from ECX and then checks whether ECX is 0: if not, it jumps to the destination address, otherwise it does not jump. The jump range of LOOP is -128 to +127 bytes (about 42 instructions).
Example: summing an array with LOOP
    TITLE Sum (sum.asm)
    INCLUDE Irvine32.inc
    .data
    intarr WORD 100h, 200h, 300h, 400h
    .code
    main PROC
        mov edi, OFFSET intarr     ; address of intarr
        mov ecx, LENGTHOF intarr   ; loop counter
        mov eax, 0                 ; clear the accumulator
    L1:
        add ax, [edi]              ; sum += current element (16 bits, because intarr is a WORD array)
        add edi, TYPE intarr       ; advance to the next element
        loop L1
        exit
    main ENDP
    END main
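The decrement-then-test behaviour of LOOP can be modelled in Python to check what the summing program should produce. This is a sketch only: `loop_sum` is a made-up name, a list index stands in for EDI, and the accumulator is assumed to be the 16-bit AX.

```python
def loop_sum(words):
    """Sum 16-bit words the way a LOOP-driven assembly loop would."""
    if not words:
        # With ECX = 0 a real LOOP would wrap around and run 2**32 times,
        # so this model refuses an empty array instead.
        raise ValueError("LOOP with ECX = 0 does not mean zero iterations")
    ecx = len(words)           # mov ecx, LENGTHOF intarr
    index = 0                  # stands in for EDI
    ax = 0                     # accumulator
    while True:
        ax = (ax + words[index]) & 0xFFFF   # add ax, [edi]
        index += 1                          # add edi, TYPE intarr
        ecx = (ecx - 1) & 0xFFFFFFFF        # LOOP first decrements ECX ...
        if ecx == 0:                        # ... and falls through once it is 0
            break
    return ax


print(f"{loop_sum([0x100, 0x200, 0x300, 0x400]):04X}h")
```

Because the body runs before the test, the loop always executes at least once, which is the reason for the guard against an empty array.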
Procedures and conditional processing
Prerequisite knowledge
Runtime stack
The runtime stack is a memory array managed directly by the CPU through two registers, SS and ESP. In protected mode SS holds a segment selector, and ESP holds a 32-bit offset that points to a specific location in the stack, namely the top of the stack. ESP rarely needs to be manipulated by hand. The runtime stack grows downward: each time a value is pushed, the stack pointer ESP decreases (generally by 4).
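The downward growth of the stack can be illustrated with a small Python model. `RuntimeStack` is a hypothetical class written for this note; it assumes a 4096-byte stack and 32-bit pushes only.

```python
class RuntimeStack:
    """Toy model of a protected-mode runtime stack with 32-bit pushes."""

    def __init__(self, size: int = 4096):
        self.memory = bytearray(size)
        self.size = size
        self.esp = size  # empty stack: ESP sits just above the highest address

    def push(self, value: int) -> None:
        if self.esp < 4:
            raise OverflowError("stack overflow")
        self.esp -= 4    # ESP moves down first ...
        self.memory[self.esp:self.esp + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

    def pop(self) -> int:
        if self.esp >= self.size:
            raise IndexError("stack is empty")
        value = int.from_bytes(self.memory[self.esp:self.esp + 4], "little")
        self.esp += 4    # ... and moves back up after the value is read
        return value


stack = RuntimeStack()
stack.push(0xA5)
stack.push(2)
print(stack.esp)                 # two pushes lower ESP by 8
print(stack.pop(), stack.pop())  # values come back in reverse order
```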
The instructions that operate on the runtime stack are PUSH and POP, PUSHFD and POPFD, PUSHAD and POPAD, and PUSHA and POPA.
PUSH pushes a 16-bit or 32-bit operand; in protected mode it is always 32 bits.
PUSHFD pushes the 32-bit EFLAGS register.
PUSHAD pushes the registers in this order: EAX, ECX, EDX, EBX, ESP (its value before the PUSHAD), EBP, ESI, EDI.
PUSHA similarly pushes AX, CX, DX, BX, SP, BP, SI, DI.
Defining and using procedures
A procedure in assembly is the equivalent of a function in a high-level language.
    procName PROC
    procName ENDP
    ; a procedure other than main should return with RET
    ; main ends the process by calling ExitProcess
    ; the return value and the parameters of a procedure are usually passed in registers
    ; C and C++ typically return an 8-bit value in AL, a 16-bit value in AX and a 32-bit value in EAX
Therefore a caller cannot store a value in a register, call a procedure, and then rely on that register still holding the value, unless the procedure saves and restores it.
In addition, the USES operator lists the registers a procedure modifies, and the assembler then automatically generates the PUSH and POP instructions for those registers. The operator is simply written after the PROC directive.
Instructions related to conditional processing
AND dest, src: bitwise AND. Always clears the overflow and carry flags; the operands must be the same size.
OR: bitwise OR. Always clears the overflow and carry flags.
XOR: bitwise exclusive OR.
NOT reg/mem: inverts every bit.
TEST: performs an implied AND between each pair of bits of the operands and sets the flags. Unlike AND, it does not modify the destination operand.
BT, BTC, BTR, BTS: ... not needed for the time being.
CMP: performs an implied subtraction of the source from the destination without modifying either operand.
    ; TEST is used to test bits
    test al, 00001001b     ; tests bits 0 and 3; ZF = 1 only if both bits are 0
    ; CMP checks whether the source and the destination are equal, with an implied destination - source
    ; unsigned cases:
    mov ax, 5
    cmp ax, 10             ; ZF = 0, CF = 1
    cmp ax, 5              ; ZF = 1, CF = 0
    cmp ax, 3              ; ZF = 0, CF = 0
    ; signed cases:
    mov ax, -5
    cmp ax, -3             ; SF != OF
    cmp ax, -8             ; SF = OF
    cmp ax, -5             ; ZF = 1
Conditional jump instructions
Conditional jumps come in pairs, such as JZ and JNZ. MASM requires the destination of the jump to be inside the current procedure.
JZ: jump if ZF = 1
JC: jump if CF = 1
JO: jump if OF = 1
JS: jump if SF = 1
JP: jump if PF = 1
JE: jump if equal (after CMP)
JNE: jump if not equal (after CMP)
JCXZ: jump if CX = 0
------------ unsigned comparisons
JA: jump if left > right
JAE: jump if left >= right
JB: jump if left < right
JBE: ...
JNBE = JA
------------ signed comparisons
JL: jump if less (after CMP)
JG: jump if greater (after CMP)
JGE: ...
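The link between CMP and the conditional jumps is that CMP only sets flags, and each jump only looks at flags. The Python sketch below models a 16-bit CMP and the flag conditions of the most common jumps. The names `cmp16` and `JUMPS` are invented for this note; the flag conditions themselves are the standard ones from the Intel instruction reference.

```python
def cmp16(dest: int, src: int) -> dict:
    """Model a 16-bit CMP: compute dest - src and return the resulting flags."""
    dest &= 0xFFFF
    src &= 0xFFFF
    result = (dest - src) & 0xFFFF
    return {
        "ZF": int(result == 0),
        "CF": int(dest < src),   # a borrow was needed (unsigned dest < src)
        "SF": result >> 15,
        # Signed overflow: operands differ in sign and the result's sign differs from dest.
        "OF": int(((dest ^ src) & (dest ^ result) & 0x8000) != 0),
    }


# Each jump is just a predicate over the flags.
JUMPS = {
    "je":  lambda f: f["ZF"] == 1,
    "jne": lambda f: f["ZF"] == 0,
    "ja":  lambda f: f["CF"] == 0 and f["ZF"] == 0,   # unsigned >
    "jae": lambda f: f["CF"] == 0,                    # unsigned >=
    "jb":  lambda f: f["CF"] == 1,                    # unsigned <
    "jbe": lambda f: f["CF"] == 1 or f["ZF"] == 1,    # unsigned <=
    "jg":  lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],  # signed >
    "jge": lambda f: f["SF"] == f["OF"],                   # signed >=
    "jl":  lambda f: f["SF"] != f["OF"],                   # signed <
    "jle": lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],   # signed <=
}


def taken(jump: str, dest: int, src: int) -> bool:
    """Would this jump be taken right after `cmp dest, src`?"""
    return JUMPS[jump](cmp16(dest, src))


print(cmp16(5, 10))         # the unsigned example: ZF = 0, CF = 1
print(taken("jl", -5, -3))  # -5 < -3 as signed numbers
print(taken("ja", -5, 3))   # as unsigned, FFFBh is far above 3
```

The last two lines show why the signed and unsigned jumps must not be mixed: the same bit pattern FFFBh is below -3 when read as signed and above 3 when read as unsigned.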
------------------------------------------ to be continued -------------------------------------------------
P.S. Here is a throwaway program that counts how many times a substring occurs in a given string. It is not well written; corrections are welcome.
    ; -- Short assembly program for counting the occurrences of a specific substring --
    ; Author: Wu Yanxiang
    ; ID: 0715232024
    ; Env: VC++ + MASM 8.0
    ; Test case: pass
    TITLE SubStringCnt (main.asm)
    INCLUDE Irvine32.inc
    .data
    inputStr  DB "SOME DAY HAVE SHINING SUN DOESNT MEANS IT A SUNNY DAY UNLESS THE SUN MAKE ME SO CONFORTABLE", 0DH, 0AH, 0
    subStrc   DB "SUN", 0
    OUTPUT    DB "SUN: ", 0
    subStrCnt DD 0             ; occurrences of subStrc
    fori      DD 0             ; for int i = 0 ...
    lenSub    DD 0             ; length of the substring
    lenTotal  DD 0             ; length of the entire string
    .code
    main PROC
        call hasSubstr
        mov edx, OFFSET OUTPUT
        call WriteString
        call WriteInt          ; invoke the system API through procedures defined in Irvine32
        exit
    main ENDP
    ;-----------------------------------------------------
    hasSubstr PROC USES ebx ecx edx esi edi
    ;
    ; Counts the occurrences of subStrc in inputStr
    ; Return: EAX
    ;-----------------------------------------------------
        mov lenTotal, LENGTHOF inputStr
        mov lenSub, LENGTHOF subStrc
        dec lenTotal           ; do not count the terminating 0
        dec lenSub
        mov ecx, lenTotal
        sub ecx, lenSub
        inc ecx                ; number of possible start positions
        mov esi, 0
    L1:
        mov dl, 1              ; indicates whether a substring was found
        push ecx
        mov ecx, lenSub
        mov edi, 0
        mov esi, fori
    L2:
        mov bl, inputStr[esi]
        mov bh, subStrc[edi]
        inc edi
        inc esi
        cmp bl, bh
        jne notEqual
        loop L2
        jmp L1P2
    notEqual:
        mov dl, 0
        jmp L1P2
    ;------- L2 end ---------------
    L1P2:
        pop ecx
        cmp dl, 0
        jne conHas
        inc fori
        loop L1
        jmp L1Exit
    conHas:
        inc subStrCnt
        mov ebx, lenSub
        add fori, ebx          ; skip past the match
        sub ecx, ebx           ; that many fewer start positions remain
        jg L1                  ; continue only while start positions remain
    L1Exit:
        mov eax, subStrCnt
        ret
    hasSubstr ENDP
    END main
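For checking the assembly program's output, the same scan can be written as a short Python reference. This is a sketch that assumes the program's strategy as I read it: compare at each start position and skip past a match, so matches never overlap. `count_substring` is a made-up name, and the sample sentence is assumed to be entirely upper case.

```python
def count_substring(text: str, pattern: str) -> int:
    """Count non-overlapping occurrences of pattern, scanning left to right."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    count = 0
    i = 0  # plays the role of fori in the assembly version
    while i + len(pattern) <= len(text):
        if text[i:i + len(pattern)] == pattern:
            count += 1
            i += len(pattern)   # skip past the match
        else:
            i += 1
    return count


SAMPLE = ("SOME DAY HAVE SHINING SUN DOESNT MEANS IT A SUNNY DAY "
          "UNLESS THE SUN MAKE ME SO CONFORTABLE\r\n")
print("SUN:", count_substring(SAMPLE, "SUN"))
```

Python's built-in str.count uses the same non-overlapping rule, so it can serve as a second check.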