Disclaimer: This course is based on the textbook "Assembly Language (2nd Edition)", Mechanical Industry Press. This experiment is taken from Chapter 2 of the textbook, "Entering the Computer."
1. The experimental environment
The dosemu emulator can be installed to simulate a DOS environment. It provides DEBUG, LINK, and other assembly language development programs.
2. Enter DOS and Debug
Double-click the Dosemu icon on the desktop to go directly to DOS. Then do the following:
C:\>D:          -- press Enter to switch to drive D
D:\>CD DOS      -- enter the DOS subdirectory
D:\dos>DIR      -- list the files in the directory
D:\dos>DEBUG    -- start DEBUG
Chapter 2: Entering the Computer
The word length of a microcomputer is determined by the register width of its microprocessor.
Take the Intel 80x86 series of microprocessors as an example:
A microcomputer whose CPU is an 8086/8088 or 80286 has a word length of 16 bits, so its registers are 16 bits wide;
A microcomputer with a 32-bit word length uses an 80386/80486 or Pentium-series CPU, and its registers are 32 bits wide.
In assembly language, a numeric value is followed by the letter B, H, or D to mark it as binary, hexadecimal, or decimal (the D may be omitted for decimal numbers).
Computers also use units such as the byte, word, and double word to represent data.
Byte: an 8-bit binary number, such as 00000101B, also written 05H, or 10000101B, also written 85H.
Word: a 16-bit binary number, equal to 2 bytes, such as 1100010111010110B, also written C5D6H.
Double word: a 32-bit binary number, also known as a double-precision number, equal to 4 bytes, such as 23456789H.
2.1 8086 Register Group
The 8086 registers are all 16 bits wide and fall into 4 groups according to their use: data registers, address registers, segment registers, and control registers.
The data registers are AX, BX, CX, and DX. Each of them can be split into two 8-bit registers: AH, AL, BH, BL, CH, CL, DH, DL. H denotes the high-byte (upper 8 bits) register and L the low-byte (lower 8 bits) register. For example, if the AX register holds the word 1234H, written (AX) = 1234H, then the high byte 12H is in AH and the low byte 34H is in AL.
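The high-byte/low-byte split can be sketched in a few lines of Python. Python is used here only for illustration, and the function name split_word is hypothetical; it is not part of Debug or the textbook.

```python
def split_word(value):
    """Split a 16-bit register value into (high byte, low byte)."""
    value &= 0xFFFF              # keep only 16 bits, like a 16-bit register
    high = (value >> 8) & 0xFF   # upper 8 bits, the part held in AH
    low = value & 0xFF           # lower 8 bits, the part held in AL
    return high, low


ah, al = split_word(0x1234)
print(f"AH={ah:02X}H AL={al:02X}H")
```

Running it on 1234H gives 12H for the high byte and 34H for the low byte, matching the AX example.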
The address registers comprise the pointer and index registers SP, BP, SI, and DI, four 16-bit registers. As the name implies, they can hold the offset address of a memory operand. They can also be used as general-purpose registers.
The 8086 CPU has 4 16-bit segment registers: CS (code segment register), DS (data segment register), ES (extra segment register), and SS (stack segment register).
The control registers are IP and FLAGS (also known as the PSW, program status word), two 16-bit registers that control program execution.
The IP (instruction pointer) register holds an offset address within the code segment: the offset of the unit containing the next instruction to be executed.
Each bit used in the FLAGS register represents one CPU flag and indicates a particular state of CPU execution. The lowest bit is D0 and the highest is D15. The 8086 FLAGS register has 9 flags: 6 condition code flags and 3 control flags.
(1) Condition code flags (D0~D7 and D11)
CF, carry flag: CF=1 when the highest bit of the result produces a carry out (or a borrow in subtraction), otherwise CF=0
PF, parity flag: PF=1 when the number of 1 bits in the low 8 bits of the result is even, otherwise PF=0
AF, auxiliary carry flag: AF=1 when bit 3 of the result carries into bit 4 (a half-byte carry), otherwise AF=0
ZF, zero flag: ZF=1 when the result is 0; ZF=0 when the result is not 0
SF, sign flag: SF=1 when the highest bit (sign bit) of the result is 1, i.e. the result is negative, otherwise SF=0
OF, overflow flag: OF=1 if the result overflows (exceeds the range the signed number can represent), otherwise OF=0
(2) Control flags (D8~D10)
TF, trap flag: used for debugging. TF=1 selects single-step execution (the CPU traps after each instruction); TF=0 runs the program normally
IF, interrupt flag: IF=1 allows the CPU to respond to maskable interrupts; IF=0 makes it ignore them
DF, direction flag: during string instructions, DF=0 makes the address registers pointing at the memory cells increase automatically, and DF=1 makes them decrease automatically.
Example: when two binary numbers are added, the relevant flag bits change automatically.
Based on the result, the CPU automatically sets the flags to CF=0, SF=1, ZF=0, OF=0, PF=0: no carry, negative result, nonzero result, no overflow, and an odd number of 1 bits. Overflow can also be judged in a simple way. The operands are two's complement numbers, and this example adds a negative and a positive number, which can never overflow. If two positive numbers are added and the result is negative, or two negative numbers are added and the result is positive, overflow has occurred, meaning the result cannot be represented in 8-bit two's complement.
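The flag rules can be checked with a small Python model of 8-bit addition. This is a sketch only: the function name add8_flags is hypothetical, and the operands 10000000B and 00000011B are an assumption chosen because they reproduce the flag values quoted above; the operands of the original example are not shown in the text.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) following the 8086 rules."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                      # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),   # even number of 1 bits
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),    # carry out of bit 3
        "ZF": int(result == 0),                       # result is zero
        "SF": result >> 7,                            # copy of the sign bit
        # overflow: both operands have the same sign and the result's sign differs
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


# Assumed operands: a negative number plus a positive number.
result, flags = add8_flags(0b10000000, 0b00000011)
print(f"{result:08b}B", flags)
```

Adding two positive numbers such as 7FH and 01H with the same function sets OF=1, which matches the simple overflow rule described above.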
In a computer, a signed number involves two concepts: its true value and its machine number. The true value is the actual value written with a "+" or "-" sign. The machine number is the form the computer actually stores, in which the "+" or "-" sign is encoded as a bit (0 or 1).
Machine numbers have three encodings: the original code (sign-magnitude), the inverse code (ones' complement), and the complement (two's complement). In assembly language, numbers are expressed in two's complement, so it is necessary to master two's complement representation. The three encodings are defined as follows, using 8-bit numbers:
Original code: the highest bit is the sign bit, 0 for positive numbers and 1 for negative numbers, and the remaining 7 bits hold the magnitude.
Inverse code: for a positive number it is the same as the original code. For a negative number, the sign bit is 1 and the magnitude bits of the original code are inverted.
Complement: for a positive number it is the same as the original code. For a negative number, the sign bit is 1 and the magnitude bits of the original code are inverted and then 1 is added.
Example: the decimal numbers +5 and -5 in original code, inverse code, and complement:
[+5] original = [+5] inverse = [+5] complement = 00000101B
[-5] original = 10000101B
[-5] inverse = 11111010B
[-5] complement = 11111011B
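As a minimal sketch, the three encodings can be computed in Python. The function name encodings and its bits parameter are hypothetical, introduced only for this illustration.

```python
def encodings(n, bits=8):
    """Return (original code, inverse code, complement) of n as bit strings."""
    limit = 1 << (bits - 1)
    if not -limit < n < limit:
        raise ValueError("value does not fit in all three encodings")
    mask = (1 << bits) - 1
    if n >= 0:
        original = inverse = complement = n   # identical for positive numbers
    else:
        original = limit | -n                 # sign bit 1 followed by the magnitude
        inverse = ~(-n) & mask                # invert every bit of the magnitude
        complement = (inverse + 1) & mask     # inverse code plus 1
    fmt = f"0{bits}b"
    return format(original, fmt), format(inverse, fmt), format(complement, fmt)


for value in (5, -5):
    print(value, encodings(value))
```

For +5 all three strings are 00000101, and for -5 they are 10000101, 11111010, and 11111011, the same values as in the example.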
Before programming in assembly language, you should first understand memory addresses and storage units. A storage unit can be identified by a physical address or by a logical address.
The physical address is the real address of a storage unit, and each storage unit has a unique physical address. The Intel 8086 CPU has 20 address lines, so it can address up to 2^20 = 1M bytes (1 MB). Addresses start from 0, so in hexadecimal the physical address range of the 20-bit address space is 00000H~FFFFFH.
The logical address is the address used in programming, and it has two parts: a segment address and an offset address. In 8086 assembly language, the memory address space is divided into logical segments, each consisting of a number of storage units, up to 65,536 bytes (64 KB) per segment. The segment address indicates which segment, and the offset address indicates which unit within that segment. Both are 16-bit binary numbers. A logical address is written as:
segment address:offset address
For example, in the figure, memory is divided into several segments: segment 0, segment 1, and so on, and each segment has unit 0, unit 1, unit 2, and so on. Segments can differ in length: segment 0 runs from unit 0 to unit 0FH, 16 byte units in total, while segment 1 runs from unit 0 to unit 0139H, 314 byte units in total. The segment address gives the segment number and the offset address gives the unit number within the segment. For example, 0000:0002H denotes unit 2 of segment 0, 0001:0002H denotes unit 2 of segment 1, and so on. Informally, the offset address is the number of units by which a unit is displaced from the start of its segment.
The logical address used in programming is converted to the actual physical address when the CPU executes the program. The conversion is done automatically by the address adder in the CPU: the 16-bit segment address is shifted left by 4 bits, which is equivalent to multiplying it by 16 (10H), and the offset address is then added. The conversion formula is:
Physical address = segment address × 10H + offset address
Example: if the logical address of a unit is 0001:0002H, its physical address = 0001H × 10H + 0002H = 00012H
If the logical address of another unit is 3020:055AH, its physical address = 3020H × 10H + 055AH = 3075AH
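The conversion formula can be tried out with a short Python sketch. The function name physical_address is hypothetical, and the final mask models the 20 address lines of the 8086.

```python
def physical_address(segment, offset):
    """Convert a segment:offset logical address to a 20-bit physical address."""
    segment &= 0xFFFF                # both parts are 16-bit values
    offset &= 0xFFFF
    # shifting left by 4 bits is the same as multiplying by 10H
    return ((segment << 4) + offset) & 0xFFFFF


for seg, off in ((0x0001, 0x0002), (0x3020, 0x055A)):
    print(f"{seg:04X}:{off:04X} -> {physical_address(seg, off):05X}H")
```

The two calls reproduce the worked examples, giving 00012H and 3075AH.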
The types of logical segment in memory are as follows:
Code segment--stores instructions; its segment address is held in the segment register CS
Data segment--stores data; its segment address is held in the segment register DS
Extra segment--provides auxiliary data storage; its segment address is held in the segment register ES
Stack segment--an important data structure that can hold data, addresses, and system parameters; its segment address is held in the segment register SS
The data in a storage unit is called the content of the storage unit. One physical storage unit holds exactly one byte (8 binary bits) of data.
Representing the address and content of a storage unit: enclosing an address in parentheses denotes the content of that unit. For example:
(3075AH) = 12H //the content of unit 3075AH is 12H; this is a byte unit
(37692H) = 5678H //units 37692H and 37693H together hold 5678H; this is a word unit
When a word is stored, the high byte goes in the higher-addressed unit and the low byte in the lower-addressed unit: 56H is stored in unit 37693H and 78H in unit 37692H.
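This byte ordering can be illustrated with a toy memory model in Python. The dictionary-based memory and the helper names store_word and load_word are assumptions made for this sketch.

```python
def store_word(memory, address, value):
    """Store a 16-bit word: low byte at the low address, high byte at address + 1."""
    memory[address] = value & 0xFF
    memory[address + 1] = (value >> 8) & 0xFF


def load_word(memory, address):
    """Read the two byte units back and combine them into one word."""
    return memory[address] | (memory[address + 1] << 8)


memory = {}                          # sparse model: physical address -> byte
store_word(memory, 0x37692, 0x5678)
print(f"(37692H)={memory[0x37692]:02X}H (37693H)={memory[0x37693]:02X}H")
print(f"word={load_word(memory, 0x37692):04X}H")
```

After the store, unit 37692H holds 78H and unit 37693H holds 56H, and reading the word back yields 5678H.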
We now know the concepts of the CPU and storage units, so how can we see what is happening inside an actual machine? Can we see the contents of specific registers, flags, and storage units? Can we modify and control them?
All of these questions can be answered with the help of the debugging tool Debug. Hands-on experiments strengthen the understanding of the theory, and once you have mastered Debug you can look inside the machine and observe it directly.
2.3 Debug Tool Debug
The debugging tool Debug is available in both DOS and Windows operating systems. It is a debugger designed for assembly language and gives programmers effective debugging methods such as single-stepping and breakpoints. It can be used to observe and modify CPU registers and memory units, to trace the execution of a program, and to find program errors.
In the Experimental Building environment, DOSEMU is used to simulate DOS, and Debug can be launched directly from the DOS environment.
1. Main commands for Debug
Debug has more than 20 commands. We mainly learn the most commonly used ones:
R--view and modify registers
D--view memory units
E--modify memory units
U--disassemble, turning machine instructions into assembly instructions
T/P--single-step execution
G--run the program continuously
A--enter assembly instructions
Before using Debug you must enter the DOS environment. For how to enter DOS in the Experimental Building virtual environment, see the operation diagrams in the Experimental Building.
After entering the DOS environment:
The simple DOS commands used in this book:
cd\ --return to the root directory (C>); use this first
dir --display the file list
md hb --create the subdirectory HB
cd hb --enter the subdirectory HB
copy D:\dos\masm.exe c:\hb --copy masm.exe from the DOS directory on drive D to the HB directory on drive C
copy D:\dos\link.exe c:\hb --copy link.exe from the DOS directory on drive D to the HB directory on drive C
cd.. --go back up one directory level
type --display the contents of a text file (for example, type c:\hb\abc.asm)
Both DOS and Debug commands are case-insensitive.
3. Enter Debug
To observe what is inside the computer, you can start Debug directly. If you want to debug and observe an executable file, add the file name with its .EXE extension after DEBUG. We only want to observe for now, so type DEBUG by itself.
The Debug prompt is a short dash (-), after which commands are entered.
(1) R command--view and modify registers
The R command has two uses:
View registers: typing R by itself displays all CPU registers and flags;
Modify a register: type R followed by the register name. Debug first shows the current content of the register; type the new value after the colon, then use the R command again to see the modified content. Figure 1 shows the value of the AX register being changed to 1234H.
Figure 1: Viewing and modifying registers with the R command
Because Debug has just been entered from the operating system, the R command in the figure displays the register values left by the system. As you can see, AX, BX, CX, and DX are all 0. To change the value of the AX register to 1234H, execute R AX and enter 1234 after the colon. Note that all data in Debug is in hexadecimal.
Next, the four segment registers DS, ES, SS, and CS all have the value 0AFAH, which indicates that they currently point to the same logical segment (the segment register values may differ between system environments; in the Dosemu virtual machine the value is 07BEH). The operating system allocates the segment addresses based on available memory, so the values may differ from machine to machine or from run to run. The IP (instruction pointer) register holds 0100H, which means the next instruction to be executed is in unit 0100H of the code segment. The logical address of that instruction is formed by CS:IP, i.e. 0AFA:0100H.
Now look at the line below the registers. It shows the disassembly of one instruction in the code segment. Disassembly means displaying binary machine instructions as assembly instructions. The line has three parts: the leftmost, 0AFA:0100, is the logical address of the unit holding the instruction; the middle, 1E, is the machine code of the instruction; and the third column is the assembly instruction PUSH DS, which pushes DS onto the stack. (In the DOSEMU virtual machine this is a test instruction.) With Debug we can see what machine code an assembly instruction translates into and, conversely, which assembly instruction a given machine code represents.
The right side of the figure shows the state of each flag bit in the CPU flag register; the current states can be looked up in Table 2-1.
(2) D command--view memory units
Memory is organized into paragraphs of 16 bytes each, and a logical segment must start at the first address of a paragraph. Use the D command to view the addresses and contents of storage units.
The format of the D command is:
D segment-address:start-offset [end-offset]
D ds:0 //view the data segment, starting from unit 0
D es:0 //view the extra segment, starting from unit 0
D ds:100 //view the data segment, starting from unit 100H
D 0200:5 15 //view unit 5 through unit 15H of segment 0200H (this command cannot be executed on the virtual machine)
The execution of the D command is shown in Figure 2
Figure 2: Viewing storage units with the D command
The left column is the logical address and the middle part is the content of the storage units. Each row shows 16 byte units, with a dash in the middle separating the first 8 units from the last 8. Only the offset address of the first unit in each row is given; the offsets of the other 15 units are not marked. It follows that the units in the first row of the figure have offsets 0000H to 000FH, those in the second row 0010H~001FH, and so on. The right part shows the contents of the memory units as ASCII characters; a byte that has no displayable character is shown as a dot.
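The row layout described here can be imitated with a short Python sketch. It only approximates the appearance of the D command output; the function name dump_line and the exact spacing are assumptions, not the real Debug format.

```python
def dump_line(segment, offset, data):
    """Format up to 16 bytes as one row: address, hex bytes, ASCII column."""
    cells = [f"{b:02X}" for b in data]
    left = " ".join(cells[:8])       # first 8 units
    right = " ".join(cells[8:])      # last 8 units
    hex_part = f"{left}-{right}" if right else left
    # bytes without a displayable ASCII character are shown as a dot
    ascii_part = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)
    return f"{segment:04X}:{offset:04X}  {hex_part}   {ascii_part}"


print(dump_line(0x0000, 0x0000, bytes(range(0x41, 0x51))))
```

Only the offset of the first unit appears in the row, and the dash marks the boundary between the eighth and ninth units.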
In Figure 2:
The first D command shows the contents of storage units in the data segment. The segment address of the data segment is held in DS, and its value is 0B05H. Unit 0 contains CDH, unit 1 contains 20H, ..., and unit 15 contains 03H. Unit 0010H (unit 16) at the start of the second row contains 69H, which is the ASCII code of the lowercase letter i, so i is shown in the right-hand area: the value 69H in that unit can be read as an ASCII code.
The second D command displays the contents of segment 0200H, starting with unit 0.
The third D command displays segment 0200H from unit 5 through unit 15H.
If only an offset address is written after D, the display starts at that offset in the current data segment, for example:
D 10 //display starting from unit 10H of the data segment
D 100 //display starting from unit 100H of the data segment
Note: typing D repeatedly displays the contents of the following units in sequence.
(3) E command--modify memory units
Use the E command to overwrite the contents of one or more storage units. The format is: E start-address value value ...
For example, to change the contents of the three units ds:3~ds:5 in the data segment to 14, 15, 16, the command is
E ds:3 14 15 16
as shown in Figure 3
Figure 3: Modifying storage units with the E command
Displaying the segment with the D ds:0 command shows that the values of these three units have changed from 9F 9A to 14 15 16.
If only an offset address follows E, the unit at that offset in the current data segment is modified. The E command can also modify storage units in other segments:
E 10 //modify the content of unit 10H in the current data segment
E es:100 //modify the content of unit 100H in the extra segment
D es:100 //check whether the content of unit 100H has been modified
(4) U command--disassembly
A programmer's assembly language source program is assembled into binary machine instruction code, and the U command converts binary machine instructions back into the mnemonic form of assembly instructions, hence the name "disassembly". With the U command we can compare machine instructions with assembly instructions and see how machine instructions are stored, as shown in Figure 4
Figure 4: Displaying a program segment with the U command (machine instructions shown side by side with assembly instructions)
On the left is the logical address of the storage unit in the code segment; the segment address CS is 0AFEH and the offsets start at 0100H. Immediately to the right of the offset is the machine instruction code, and the rightmost part is the corresponding assembly instruction. For example, in the first line the machine instruction is 7419H and the corresponding assembly instruction is JZ 011B, a conditional jump: when the result is 0, execution continues with the instruction in unit 011BH. The instruction in unit 0AFE:011BH is MOV BX,0034, a data transfer instruction. (The Dosemu virtual machine shows a different set of instructions.)
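The relationship between the machine code 7419H and the target 011B can be worked out by hand: the second byte of a short conditional jump is a signed 8-bit displacement measured from the offset of the following instruction. A minimal Python sketch, with the hypothetical function name short_jump_target:

```python
def short_jump_target(instruction_offset, machine_code):
    """Return the target offset of a 2-byte short jump such as JZ (opcode 74H)."""
    displacement = machine_code[1]
    if displacement >= 0x80:         # the displacement is a signed 8-bit value
        displacement -= 0x100
    next_ip = instruction_offset + len(machine_code)   # IP after fetching the jump
    return (next_ip + displacement) & 0xFFFF


target = short_jump_target(0x0100, bytes([0x74, 0x19]))
print(f"JZ {target:04X}")
```

With the jump located at offset 0100H, the next instruction is at 0102H, and 0102H + 19H gives 011BH, the address shown by the U command.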
Note: typing U repeatedly displays the following program sections in sequence.
If U is followed by an offset address, disassembly begins at that address, for example:
U 0 //start disassembly from unit 0 of the code segment
U 100 //start disassembly from unit 100H of the code segment
Note that the program code shown in Figure 4 is not a user-written program, because no user program name (.EXE file) was given when the DEBUG command was entered.
This code resides in the system's code segment; it may be a system program, or it may be invalid code.
(5) A command--enter assembly instructions
In Debug, the A command is used to enter assembly instructions. The system automatically translates each typed instruction into machine code and stores it in successive memory units starting from the specified address. Since numbers in Debug are hexadecimal by default, decimal numbers must first be converted to hexadecimal.
For example, the assembly instructions for calculating z=35+27 are:
MOV AX,23H
ADD AX,1BH
MOV [0000],AX
The result of the addition is z=62=3EH. The variable z is represented by the storage unit [0000]. These three instructions can be entered directly with the A command in Debug. Note that Debug itself expects hexadecimal numbers typed without the H suffix.
After you enter the A command, the system automatically displays the logical address 0AEE:0100 (CS:offset). Type an assembly instruction and press Enter to move on to the next one; pressing Enter on an empty line ends the input. The procedure is shown in Figure 5:
Figure 5: Entering assembly instructions with the A command
You can also give an address after the A command, such as A cs:0000, which stores the entered instructions starting from unit 0 of the code segment.
(6) T/P command--single-step execution
After entering the instructions, execute them. The T command executes instructions one at a time. The P command works like the T command, but P should be used on an interrupt instruction INT n or a call instruction CALL so that the program executes properly. Both INT n and CALL transfer control to a subroutine; T steps into the subroutine and may not find its way back, whereas P executes the whole subroutine in one step and returns with the result. P should also be used on a LOOP instruction, so that the loop finishes quickly.
Before executing, use the R command to check that the instruction pointer register IP is 0100; if not, change it to 0100 with the R IP command. This means execution will start from unit CS:0100. The T command displays the current register state each time it executes, so we can follow the execution of every instruction. The execution of the z=35+27 instructions is shown in Figure 6
Figure 6: Stepping through three instructions with the T command
Examine the results: after the first T command, the AX register becomes 0023; after the second, AX becomes 003E, showing that the ADD instruction has been executed; after the third, no register value changes (apart from IP), because the third instruction does not modify a register. The third instruction, MOV [0000],AX, stores the result in word unit 0 of the data segment, and the D ds:0 command shows that this unit now holds 003EH (two byte units form one word unit).
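The effect of the three steps can be mirrored in a toy Python model. This is only a sketch of the register and memory changes, not an emulator; the function name run_sum and the 16-byte bytearray standing in for the start of the data segment are assumptions.

```python
def run_sum(a, b):
    """Mimic the three steps: load AX, add to AX, store AX in word unit 0 of DS."""
    data_segment = bytearray(16)     # stand-in for the units at DS:0000~DS:000F
    ax = a & 0xFFFF                  # step 1: load the first operand into AX
    ax = (ax + b) & 0xFFFF           # step 2: add the second operand to AX
    data_segment[0] = ax & 0xFF      # step 3: low byte goes to unit 0
    data_segment[1] = ax >> 8        #         high byte goes to unit 1
    return ax, data_segment


ax, mem = run_sum(0x23, 0x1B)
print(f"AX={ax:04X} unit0={mem[0]:02X} unit1={mem[1]:02X}")
```

The model ends with AX equal to 003E and the bytes 3E 00 in units 0 and 1, which is what the D ds:0 command displays for the word 003EH.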
The T command can also execute several instructions in a row. In the example above, the 3 instructions can be executed with the following T command:
-T 3
The T command can also specify the start address and the number of instructions to execute. In the example above, to execute 3 instructions starting from 0100H, use the following T command:
-T =0100 3
(7) G command--run the program continuously
The continuous execution command G is covered in later chapters.
(8) Q command--exit Debug
Type Q and press Enter to exit Debug and return to DOS.
Hint: for more Debug commands and their usage, see Appendix C of this book.
2.4 Experimental Tasks (practice)
Practice the common DOS commands and master the main Debug commands to lay the groundwork for the programming that follows.
1. DOS command usage
-Enter the DOS simulation environment from the Experimental Building Linux environment in two ways
-View the files in the root directory with the DIR command
-Use the CD command to enter a directory on drive D and view the files in the subdirectory
2. Debug command usage
-Enter Debug and use the D command to view the contents of units 0100H~0200H in the data segment
-Use the U command to view the program starting at 0100H in the code segment
-Use the R command to view the IP register and change its value to 0
-Use the E command to change the contents of units 5 and 6 of the data segment to 12 and 34
-Use the A command to implement z=56+41, execute it with the T command, and view the result with the D command
-Use the U command to view the assembly instructions just entered with the A command. What are their corresponding machine instructions?
Part 1: Introduction to DOS and Debug