Assembly Language Learning Guide (III)

Last Update:2017-02-27 Source: Internet

Author: User


Assembly language is a language too. Just as a high-level language needs a compiler, assembly needs an "assembler" to turn assembly source files into machine-executable code. Advanced assemblers such as MASM and TASM provide many features similar to those of high-level languages, such as structures and abstraction, to help us write assembly programs. In such an environment, a large part of an assembly program consists of pseudo-instructions aimed at the assembler, and the result already looks much like a high-level language. Assembly environments are now so advanced that writing an entire Windows application in assembly is feasible, but that is not where assembly language is strong. Its strength lies in programs that must be efficient and need precise control of the machine's hardware. Besides, I suspect most people here are learning assembly in order to read disassembled code while cracking; how many really want to program in assembly language? (sweat...)

Okay, down to business. Most assembly language books are oriented toward assembly programming; my posts are oriented toward the machine and toward disassembly, and I hope they play a complementary role. With the foundation of the previous two installments, you should be able to understand most of the instructions in an assembly language book. Here I cover a few of the more common and more complicated ones. I am talking about the hardware instructions of the machine itself, not about any particular assembler.

Unconditional jump instruction JMP:

There are three types of jump: short, near, and far. Short means the destination lies within about 128 bytes before or after the current address (the 8-bit displacement covers -128 to +127 bytes, counted from the next instruction). Near means the destination is in the same segment as the current address, so CS stays unchanged and only EIP changes. Far means jumping to another code segment, so both CS and EIP change. Short and near jumps are encoded differently, but assembly source rarely specifies the type explicitly: just write JMP followed by the target address, and almost any assembler will pick the appropriate encoding based on the distance to the target. Far jumps are rarely seen in 32-bit systems. As explained earlier, there is plenty of linear address space, so a program rarely needs two code segments, and the system modules it uses are mapped into the same address space.
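To make the short/near distinction concrete, here is a minimal Python sketch of how a disassembler could compute the target of a relative jump. It assumes the usual 32-bit encodings (opcode EB with an 8-bit displacement for a short jump, opcode E9 with a 32-bit displacement for a near jump); the function names and the sample address 00401000 are made up for illustration.

```python
def sign_extend(value, bits):
    """Interpret an unsigned `bits`-wide integer as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def relative_jump_target(address, code):
    """Return the target of a short (EB rel8) or near (E9 rel32) JMP at `address`."""
    opcode = code[0]
    if opcode == 0xEB:                      # short jump: 2-byte instruction
        length = 2
        disp = sign_extend(code[1], 8)
    elif opcode == 0xE9:                    # near jump: 5-byte instruction
        length = 5
        disp = sign_extend(int.from_bytes(code[1:5], "little"), 32)
    else:
        raise ValueError("not a relative JMP")
    # The displacement is counted from the address of the NEXT instruction.
    return (address + length + disp) & 0xFFFFFFFF


# EB FE has displacement -2, so it jumps back to itself.
print("%08X" % relative_jump_target(0x00401000, bytes([0xEB, 0xFE])))
# E9 00 01 00 00 has displacement +0x100.
print("%08X" % relative_jump_target(0x00401000, bytes.fromhex("E900010000")))
```

A disassembler normally shows the computed target rather than the raw displacement, which is why the listing reads like direct addressing.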

The operand of JMP is naturally the destination address, which can be given by direct or indirect addressing. Indirect addressing is further divided into register indirect and memory indirect. Examples (32-bit system):

JMP 8E347D60               ; direct addressing, jump within the segment
JMP EBX                    ; register indirect: can only jump within the segment
JMP DWORD PTR [EBX]        ; memory indirect, jump within the segment
JMP DWORD PTR [00903DEC]   ; ditto
JMP FWORD PTR [00903DF0]   ; memory indirect, jump between segments

Explanation:
In a 32-bit system, a full destination address consists of a 16-bit segment selector and a 32-bit offset. Because registers are 32 bits wide, register indirect addressing can supply only a 32-bit offset, so it can only perform a near jump within the segment. With memory indirect addressing, the instruction is followed by an effective address in square brackets, and the destination of the jump is stored at that address. For example, suppose memory starting at [00903DEC] holds the following bytes: 7C 82 59 00 A7 01 85 65 9F 01

Memory bytes are stored contiguously, so how does the processor know how many to take as the destination address? DWORD PTR says the effective address points to a doubleword, so the instruction takes 0059827C (bytes are stored low byte first) and jumps within the segment. FWORD PTR, by contrast, says the effective address points to a 48-bit full address, so the instruction takes 19F:658501A7 and makes a far jump.
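The difference between the two reads can be sketched in Python with the standard struct module. The ten bytes below are an assumption: they are simply the little-endian layout that yields the two values quoted above, placed at 00903DEC. The helper names are hypothetical.

```python
import struct

BASE = 0x00903DEC
# Assumed memory contents: a dword at 00903DEC, then a 48-bit pointer at 00903DF0.
memory = bytes.fromhex("7C825900" "A70185659F01")


def read_dword(addr):
    """DWORD PTR: take 4 bytes as a 32-bit offset."""
    (offset,) = struct.unpack_from("<I", memory, addr - BASE)
    return offset


def read_fword(addr):
    """FWORD PTR: take 6 bytes, the 32-bit offset first and the 16-bit selector after."""
    offset, selector = struct.unpack_from("<IH", memory, addr - BASE)
    return selector, offset


print("%08X" % read_dword(0x00903DEC))        # target of JMP DWORD PTR [00903DEC]
selector, offset = read_fword(0x00903DF0)
print("%X:%08X" % (selector, offset))         # target of JMP FWORD PTR [00903DF0]
```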

Note: In protected mode, if a jump between segments involves a change of privilege level, a series of complex protection checks takes place, which we will not go into here. You can study them on your own later, once your skills have advanced.

Conditional jump instructions Jxx: these can only jump within a segment, and only direct addressing is supported.

=========================================
CALL instruction:

CALL uses the same addressing modes as JMP, but so that the subroutine can return, the instruction pushes the address of the next instruction onto the stack before jumping. If the call is within the segment (the destination is a 32-bit offset), only the offset is pushed. If the call is between segments (the destination is a 48-bit full address), the full address of the next instruction is pushed. As with JMP, if the transfer between segments involves a change of privilege level, a series of complex protection checks takes place.

The corresponding RETN/RETF instructions return from a subroutine. They take the return address from the stack (where the CALL instruction pushed it) and jump to that address to continue execution. RETN pops a 32-bit offset and returns within the segment; RETF pops a 48-bit full address and returns between segments. RETN and RETF can also take an immediate operand. After the return address has been popped, that number of bytes is added to the stack pointer ESP, which discards the arguments that were passed to the subroutine on the stack. The operand is therefore the size of the arguments in bytes: the number of arguments counted in words, times 2. The details are left to the next installment.

Although CALL and RET are designed to work together, nothing forces them to be paired. If you push a number onto the stack directly and then execute RET, the processor treats that number as a return address and jumps there to continue execution. This kind of irregular control transfer can be used as an anti-tracing technique.
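The stack behaviour of a near CALL, of RETN with an immediate operand, and of the push-then-RET trick can be modelled with a few lines of Python. This is only a toy model under stated assumptions: flat memory, no protection checks, a 5-byte CALL encoding, and made-up addresses. The class and method names are hypothetical, and the PUSH instructions do not advance EIP in the model.

```python
class Cpu:
    """Toy model of EIP, ESP and a stack of 32-bit values."""

    def __init__(self, eip, esp):
        self.eip = eip
        self.esp = esp
        self.stack = {}            # address -> dword stored there

    def push(self, value):
        self.esp -= 4
        self.stack[self.esp] = value

    def pop(self):
        value = self.stack[self.esp]
        self.esp += 4
        return value

    def call_near(self, target, length=5):
        # Push the address of the instruction after the CALL, then jump.
        self.push(self.eip + length)
        self.eip = target

    def retn(self, imm=0):
        # Pop the return address into EIP, then drop imm bytes of arguments.
        self.eip = self.pop()
        self.esp += imm


cpu = Cpu(eip=0x00401000, esp=0x0012FF00)
cpu.push(2)                        # two dword arguments for the subroutine
cpu.push(1)
cpu.call_near(0x00402000)          # return address 00401005 goes on the stack
cpu.retn(8)                        # RETN 8: return and discard both arguments
after_call = (cpu.eip, cpu.esp)

cpu.push(0x00403000)               # no CALL involved: push an address by hand...
cpu.retn()                         # ...and RETN still jumps to it
print("%08X %08X" % after_call, "%08X" % cpu.eip)
```

In the model, RETN neither knows nor cares whether a CALL put the value on the stack, which is exactly why the trick works.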

==========================================

Interrupt instruction INT n

In protected mode, this instruction is always intercepted by the operating system. You will hardly ever see it in an ordinary PE program, whereas in the DOS era interrupts were an important way to call the operating system and the BIOS. Programs today call Windows functions by friendly, readable names, for example CALL USER32!GetWindowTextA. From the program's point of view, the INT instruction first pushes the current flags register onto the stack, then pushes the full address of the next instruction, and then looks up the interrupt descriptor table using the operand n and tries to transfer to the corresponding interrupt service routine. Interrupt service routines are generally core operating system code, so privilege-level changes, protection checks, stack switches and so on are inevitably involved; for the details, see more advanced tutorials.

The corresponding interrupt return instruction IRET performs the reverse operation. It takes the return address from the stack and loads it into CS:EIP, then pops the flags register from the stack. Note that the flags value on the stack may have been changed by the interrupt service routine, usually the carry flag C, to indicate whether the function completed successfully. Likewise, IRET does not have to be paired with an INT instruction: you can push flags and an address onto the stack yourself and then execute IRET to transfer control. In fact, multitasking operating systems often use this trick to switch tasks.
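The push and pop order described for INT and IRET can be sketched as a small Python model. It deliberately ignores privilege-level changes, stack switches and protection checks, treats the interrupt descriptor table as a plain dictionary, and uses made-up selectors and addresses; the names are hypothetical.

```python
CARRY = 0x0001                     # carry flag is bit 0 of the flags register


class Cpu:
    """Toy model of CS:EIP, the flags register and a stack (no protection checks)."""

    def __init__(self, cs, eip, eflags):
        self.cs, self.eip, self.eflags = cs, eip, eflags
        self.stack = []

    def int_n(self, n, idt, length=2):
        # Flags first, then the full address (CS, EIP) of the next instruction.
        self.stack.append(self.eflags)
        self.stack.append(self.cs)
        self.stack.append(self.eip + length)
        self.cs, self.eip = idt[n]          # entry n selects the service routine

    def iret(self):
        # Reverse order: return address into CS:EIP, then the flags.
        self.eip = self.stack.pop()
        self.cs = self.stack.pop()
        self.eflags = self.stack.pop()


idt = {0x21: (0x0008, 0x80001000)}          # hypothetical handler address
cpu = Cpu(cs=0x001B, eip=0x00401000, eflags=0x00000202)
cpu.int_n(0x21, idt)
in_handler = (cpu.cs, cpu.eip)

cpu.stack[-3] |= CARRY                      # handler sets C in the SAVED flags
cpu.iret()
print("%04X:%08X flags=%08X" % (cpu.cs, cpu.eip, cpu.eflags))
```

Because the handler edits the saved copy of the flags, the caller sees the carry flag set only after IRET restores them.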

Interrupts in the broad sense are a big topic; if you are interested, look at books on system design.

============================================
Load full pointer instructions LDS, LES, LFS, LGS, LSS

These instructions take two operands. The first is a general-purpose register, and the second is an effective address. The instruction fetches a 48-bit full pointer from that address, loads the selector into the corresponding segment register, and loads the 32-bit offset into the specified general-purpose register. Note that in memory the pointer is always laid out with the 32-bit offset first and the 16-bit selector after it. Once the pointer is loaded, you can access the data it points to with a form such as DS:[ESI].
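As a rough illustration of the memory layout and of which register receives which part, the sketch below stores a full pointer and then loads it the way LDS ESI, [address] would. The registers are just a Python dictionary, and the selector 0023, the offset 00403000 and the function names are invented for the example.

```python
import struct


def store_full_pointer(selector, offset):
    """Lay out a 48-bit pointer as it sits in memory: offset first, selector after."""
    return struct.pack("<IH", offset, selector)


def load_full_pointer(regs, seg_reg, gp_reg, memory, addr):
    """Model of LDS/LES/LFS/LGS/LSS: selector -> segment register, offset -> register."""
    offset, selector = struct.unpack_from("<IH", memory, addr)
    regs[seg_reg] = selector
    regs[gp_reg] = offset


memory = bytearray(16)
memory[4:10] = store_full_pointer(0x0023, 0x00403000)

regs = {}
load_full_pointer(regs, "DS", "ESI", memory, 4)     # models LDS ESI, [4]
print(memory[4:10].hex(" "), "->", "%04X:%08X" % (regs["DS"], regs["ESI"]))
```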

============================================
String instructions

These include CMPS, SCAS, LODS, STOS, MOVS, INS and OUTS. What they have in common is that they take no explicit operands: the hardware dictates that DS:[ESI] points to the source string, ES:[EDI] points to the destination string, and AL/AX/EAX is used for staging. This is a hardware rule, so be sure to set up the corresponding pointers before using these instructions.
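The implicit operands can be illustrated with a toy Python model of the byte-sized forms MOVSB, LODSB and STOSB. It assumes flat memory (segments ignored) and a clear direction flag, so each step advances the pointers by one byte; the pointer update is not described in the text above and is included here as an extra detail. The register dictionary and function names are hypothetical.

```python
def movsb(mem, regs):
    """MOVSB: copy the byte at [ESI] to [EDI], then advance both pointers."""
    mem[regs["EDI"]] = mem[regs["ESI"]]
    regs["ESI"] += 1
    regs["EDI"] += 1


def lodsb(mem, regs):
    """LODSB: load the byte at [ESI] into AL, then advance ESI."""
    regs["AL"] = mem[regs["ESI"]]
    regs["ESI"] += 1


def stosb(mem, regs):
    """STOSB: store AL at [EDI], then advance EDI."""
    mem[regs["EDI"]] = regs["AL"]
    regs["EDI"] += 1


mem = bytearray(b"HELLO" + bytes(6))
regs = {"ESI": 0, "EDI": 5, "AL": 0}     # the pointers must be set up first

for _ in range(5):                        # five MOVSB steps copy the string
    movsb(mem, regs)

regs["ESI"] = 0
lodsb(mem, regs)                          # AL = 'H'
stosb(mem, regs)                          # written at EDI = 10
print(bytes(mem))
```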

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
