Program Loading and Execution (III) -- "x86 Assembly Language: From Real Mode to Protected Mode" Reading Notes 23



This post picks up where the previous one left off.
The walkthrough of the load_relocate_program procedure is not finished yet: we still have to create the stack segment descriptor and relocate the symbol table.

Allocating stack space and creating a stack segment descriptor
462         ; create the stack segment descriptor for the program
463         mov ecx,[edi+0x0c]                 ; stack size as a multiple of 4KB
464         mov ebx,0x000fffff
465         sub ebx,ecx                        ; get the segment limit
466         mov eax,4096
467         mul dword [edi+0x0c]
468         mov ecx,eax                        ; prepare to allocate memory for the stack
469         call sys_routine_seg_sel:allocate_memory
470         add eax,ecx                        ; get the high-end physical address of the stack
471         mov ecx,0x00c09600                 ; stack segment descriptor, 4KB granularity
472         call sys_routine_seg_sel:make_seg_descriptor
473         call sys_routine_seg_sel:set_up_gdt_descriptor
474         mov [edi+0x08],cx

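Before walking through the code, recall the user program's header. The fields used in this post are:

offset 0x04: selector of the header segment (installed earlier by the loader)
offset 0x08: selector of the stack segment (written at line 474)
offset 0x0c: stack size requested by the program, in 4KB units
offset 0x24: number of entries in the user symbol table
offset 0x28: start of the user symbol table

As a reminder, DS:EDI still points to the start of the user program.
Line 463: fetch the stack size requested by the user program (in 4KB units); this is n in the formula below.
Lines 464~465: compute the segment limit for the descriptor. The formula is:

    segment limit = 0xFFFFF - n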

If it is not clear why this is the formula, see my blog post "How to construct a stack descriptor":
http://blog.csdn.net/longintchar/article/details/50967180
Lines 466~469: call the procedure allocate_memory to request the stack space (4096 × n bytes).
Line 470: prepare the parameter EAX. The base address in the descriptor equals the low-end physical address of the stack space plus the stack size. If this is unclear, see the blog post mentioned above.
Lines 472~473: create and install the stack segment descriptor.
Line 474: write the selector back into the header, at offset 0x08.
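The arithmetic of lines 463~470 can be checked with a small Python model. This is only a sketch: the function names are hypothetical, and `alloc_start` stands in for the address that allocate_memory hands back. It assumes the usual rule for an expand-down, 4KB-granular, 32-bit stack segment, namely that the valid offsets run from the effective limit plus 1 up to 0xFFFFFFFF.

```python
# Illustrative model of lines 463~470; all names here are hypothetical.
LIMIT_MAX = 0x000FFFFF   # largest value of the 20-bit limit field
PAGE = 4096              # 4KB granularity (G=1)


def stack_descriptor_params(alloc_start: int, n: int):
    """Return (base, limit) for a stack of n * 4KB.

    alloc_start stands in for the start address returned by allocate_memory.
    """
    limit = LIMIT_MAX - n          # lines 464~465
    size = PAGE * n                # lines 466~467
    base = alloc_start + size      # line 470: high end of the stack
    return base, limit


def usable_range(base: int, limit: int):
    """Linear addresses covered by an expand-down, 4KB-granular, 32-bit segment."""
    effective_limit = limit * PAGE + 0xFFF
    lowest_offset = effective_limit + 1          # first valid offset
    low = (base + lowest_offset) & 0xFFFFFFFF    # addresses wrap at 4GB
    high = (base + 0xFFFFFFFF) & 0xFFFFFFFF
    return low, high


base, limit = stack_descriptor_params(0x00100000, 1)
low, high = usable_range(base, limit)
print(hex(base), hex(limit), hex(low), hex(high))
```

In this model the usable range comes out as exactly the block that was allocated, which is why the base is set to the high end of the stack rather than the low end.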

Relocation of symbol table

To use the routines provided by the kernel, the user program needs to create a symbol table. When the user program is loaded, the kernel fills in the entry address of each routine according to that table. This process is the relocation of symbol addresses, and its key step is comparing and matching strings.
To match against the user program's symbol table, the kernel must also keep a symbol table of its own that lists all the routines it provides.

329 ;===============================================================================
330 SECTION core_data vstart=0                  ; data segment of the system core
331 ;-------------------------------------------------------------------------------
332          pgdt             dw  0             ; used to set up and modify the GDT
333                           dd  0
334
335          ram_alloc        dd  0x00100000    ; start address for the next memory allocation
336
337          ; symbol address lookup table
338          salt:
339          salt_1           db  '@PrintString'
340                      times 256-($-salt_1) db 0
341                           dd  put_string
342                           dw  sys_routine_seg_sel
343
344          salt_2           db  '@ReadDiskData'
345                      times 256-($-salt_2) db 0
346                           dd  read_hard_disk_0
347                           dw  sys_routine_seg_sel
348
349          salt_3           db  '@PrintDwordAsHexString'
350                      times 256-($-salt_3) db 0
351                           dd  put_hex_dword
352                           dw  sys_routine_seg_sel
353
354          salt_4           db  '@TerminateProgram'
355                      times 256-($-salt_4) db 0
356                           dd  return_point
357                           dw  core_code_seg_sel
358
359          salt_item_len   equ $-salt_4
360          salt_items      equ ($-salt)/salt_item_len

Lines 339~360 of the code above are the kernel's symbol table.
Now let's look at the user symbol table defined in the user program (file c13.asm).

    ;-------------------------------------------------------------------------------
    ; symbol address lookup table
    salt_items       dd (header_end-salt)/256 ;#0x24

    salt:                                     ;#0x28
    PrintString      db  '@PrintString'
                times 256-($-PrintString) db 0

    TerminateProgram db  '@TerminateProgram'
                times 256-($-TerminateProgram) db 0

    ReadDiskData     db  '@ReadDiskData'
                times 256-($-ReadDiskData) db 0

Each entry in the kernel symbol table consists of two parts:
1. A 256-byte symbol name, with the unused part filled with zeros;
2. The entry point of the routine (a 4-byte offset address plus a 2-byte segment selector).

Each entry in the user symbol table has only one part:
a 256-byte symbol name, with the unused part filled with zeros.

Once the kernel has relocated the user symbol table, the contents of that table have changed: the first 6 bytes of each entry are overwritten with the entry point of the corresponding routine.
The original post illustrates this with a picture:

[Figure: the user symbol table before and after relocation]
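To make the two layouts concrete, here is a small Python sketch that builds both tables as byte strings. It is only an illustration: the helper names are hypothetical, and the offset and selector values are made up, since the real ones are resolved by the assembler.

```python
import struct

NAME_LEN = 256  # every symbol name is padded to 256 bytes


def pad_name(name: str) -> bytes:
    """Mimic `db '<name>'` followed by `times 256-($-label) db 0`."""
    raw = name.encode("ascii")
    return raw + bytes(NAME_LEN - len(raw))


def kernel_entry(name: str, offset: int, selector: int) -> bytes:
    """256-byte name, then a 4-byte offset (dd) and a 2-byte selector (dw)."""
    return pad_name(name) + struct.pack("<IH", offset, selector)


# Offsets and selectors below are invented for the example.
kernel_salt = b"".join([
    kernel_entry("@PrintString", 0x1000, 0x28),
    kernel_entry("@ReadDiskData", 0x2000, 0x28),
    kernel_entry("@PrintDwordAsHexString", 0x3000, 0x28),
    kernel_entry("@TerminateProgram", 0x4000, 0x38),
])
salt_item_len = NAME_LEN + 6                      # what `equ $-salt_4` evaluates to
salt_items = len(kernel_salt) // salt_item_len    # ($-salt)/salt_item_len

user_salt = b"".join(
    pad_name(n) for n in ("@PrintString", "@TerminateProgram", "@ReadDiskData")
)
user_items = len(user_salt) // NAME_LEN           # (header_end-salt)/256

print(salt_item_len, salt_items, user_items)
```

The user entries carry no address fields at all, which is why relocation has to write the 6 address bytes over the beginning of the name itself.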

The CMPS instruction

Before getting to the code, we need to learn the string comparison instruction cmps. It comes in 3 forms, for byte, word, and doubleword comparisons.

    cmpsb   ; byte comparison
    cmpsw   ; word comparison
    cmpsd   ; doubleword comparison

In 16-bit mode, the source string is addressed by DS:SI and the destination string by ES:DI.
In 32-bit mode, the source string is addressed by DS:ESI and the destination string by ES:EDI.
Inside the processor, cmps subtracts the two operands and sets the flags according to the result. That is not all: (E)SI and (E)DI are then adjusted according to DF. With DF=0 both are increased by the operand size (1, 2 or 4 bytes); with DF=1 both are decreased. The Intel Architecture Software Developer's Manual, Volume 2: Instruction Set Reference, describes the operation in pseudo-code.

The REP/REPE/REPZ/REPNE/REPNZ prefixes

A bare cmps instruction compares only once. To compare repeatedly, add a repeat prefix; the number of repetitions is controlled by CX (16-bit mode) or ECX (32-bit mode). Besides rep there are repe (repz), which repeats while equal, and repne (repnz), which repeats while not equal. When these prefixes are combined with cmps, the processor proceeds as follows:

1. If (E)CX is 0, stop.
2. Execute cmps once, which sets ZF and adjusts (E)SI and (E)DI.
3. Decrement (E)CX by 1; this does not affect the flags.
4. If (E)CX is 0, stop.
5. For repe (repz), stop if ZF=0; for repne (repnz), stop if ZF=1. Otherwise go back to step 2.

Thus repe (repz) is used to search for the first unequal byte, word, or doubleword, and repne (repnz) to search for the first equal one.
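The behavior of `repe cmpsd` with DF=0 can be modeled in a few lines of Python. This is a sketch of the semantics described above, not processor-exact code: `repe_cmpsd` is a hypothetical helper, the two strings live in ordinary byte buffers, and ZF is represented as a boolean.

```python
def repe_cmpsd(src: bytes, esi: int, dst: bytes, edi: int, ecx: int):
    """Model `repe cmpsd` with DF=0 and return (esi, edi, ecx, zf).

    If ecx is 0 on entry, no comparison runs and the flags are left alone;
    this sketch reports that case as zf=None.
    """
    zf = None
    while ecx != 0:
        zf = src[esi:esi + 4] == dst[edi:edi + 4]  # cmpsd: compare one doubleword
        esi += 4                                   # DF=0: both pointers move forward
        edi += 4
        ecx -= 1                                   # the count drops even on a mismatch
        if not zf:                                 # repe: stop at the first difference
            break
    return esi, edi, ecx, zf


def padded(name: bytes) -> bytes:
    return name.ljust(256, b"\0")


same = repe_cmpsd(padded(b"@PrintString"), 0, padded(b"@PrintString"), 0, 64)
diff = repe_cmpsd(padded(b"@PrintString"), 0, padded(b"@ReadDiskData"), 0, 64)
print(same, diff)
```

Note that the pointers have already moved past the doubleword that differed when the loop stops, and that after a full 64-doubleword match both pointers have advanced by 256.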

With that groundwork in place, we can turn to the code.

476         ; relocate the SALT
477         mov eax,[edi+0x04]
478         mov es,eax                         ; es -> user program header
479         mov eax,core_data_seg_sel
480         mov ds,eax
481
482         cld
483
484         mov ecx,[es:0x24]                  ; number of SALT entries in the user program
485         mov edi,0x28                       ; the user program's SALT is at offset 0x28 in the header

Lines 477~478: load the selector of the previously installed header segment into ES. (Note that DS still points to the 0~4GB memory segment and EDI holds the physical address at which the program was loaded, so [edi+0x04] addresses the header segment selector.)
Lines 479~480: make DS point to the core data segment.
Line 482: clear the direction flag (DF=0), so the comparison runs forward, toward higher addresses.
Line 484: load the number of entries in the user symbol table into ECX.
Line 485: make ES:EDI point to the first symbol.

To illustrate the idea behind the code, the original post borrows a figure from the book:

[Figure: matching the user symbol table against the kernel symbol table]

The idea is a two-level loop, with an outer loop and an inner loop. The outer loop takes symbol 1, symbol 2, ... from the user symbol table in turn. The inner loop walks through the entries of the kernel symbol table and compares each one with the symbol the outer loop has taken out. If they match, the offset address and segment selector are copied, and then we break out of the inner loop.
Note that last step. The book's code has a small bug: after a match it does not break out of the inner loop, but goes on to compare against the next entry of the kernel symbol table. This issue is analyzed in detail later.
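The two-level loop can be sketched in Python as follows. This is a simplified model rather than the book's code: the function and variable names are hypothetical, the entry addresses are made up, and whole names are compared at once instead of 4 bytes at a time. The `stop_after_match` flag switches between breaking out of the inner loop after a match and carrying on with the remaining kernel entries.

```python
import struct

NAME_LEN = 256
ITEM_LEN = NAME_LEN + 6   # kernel entry: name + 4-byte offset + 2-byte selector


def pad(name: str) -> bytes:
    return name.encode("ascii").ljust(NAME_LEN, b"\0")


# Made-up entry addresses, for illustration only.
kernel_salt = b"".join(
    pad(name) + struct.pack("<IH", offset, selector)
    for name, offset, selector in [
        ("@PrintString", 0x1000, 0x28),
        ("@ReadDiskData", 0x2000, 0x28),
        ("@PrintDwordAsHexString", 0x3000, 0x28),
        ("@TerminateProgram", 0x4000, 0x38),
    ]
)


def make_user_salt() -> bytearray:
    names = ("@PrintString", "@TerminateProgram", "@ReadDiskData")
    return bytearray(b"".join(pad(n) for n in names))


def relocate(user_salt: bytearray, stop_after_match: bool) -> int:
    """Patch user_salt in place; return how many name comparisons were made."""
    comparisons = 0
    for u in range(0, len(user_salt), NAME_LEN):          # outer loop (.b2)
        for k in range(0, len(kernel_salt), ITEM_LEN):    # inner loop (.b3)
            comparisons += 1
            if user_salt[u:u + NAME_LEN] == kernel_salt[k:k + NAME_LEN]:
                # copy the offset and selector over the start of the name
                user_salt[u:u + 6] = kernel_salt[k + NAME_LEN:k + ITEM_LEN]
                if stop_after_match:
                    break                                  # leave the inner loop
    return comparisons


with_break, without_break = make_user_salt(), make_user_salt()
print(relocate(with_break, True), relocate(without_break, False))
```

With these sample tables both variants end up with the same patched table, but the variant without the break keeps comparing an entry whose first 6 bytes have already been overwritten. The sketch only counts comparisons; it does not try to reproduce the analysis of the bug promised for the follow-up post.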

Outer Loop Code

Take a look at the outer loop first:

486  .b2:
487         push ecx
488         push edi

512  .b5:   pop edi                    ; the .b5 label is my own addition; more on it later
513         add edi,256                ; point to the next entry in the user symbol table
514         pop ecx
515         loop .b2

Lines 487~488: the inner loop also uses ECX and EDI, so both are pushed onto the stack before entering it.
Line 513: add 256 to EDI so that it points to the next entry in the user symbol table (U-SALT).

The entry that ES:EDI points to in the outer loop is compared, in the inner loop, against the entries of the kernel symbol table (all of them in the worst case).

Inner Loop Code
490         mov ecx,salt_items         ; total number of kernel symbols
491         mov esi,salt               ; point to the first kernel symbol
492  .b3:
493         push edi
494         push esi
495         push ecx
            ; ... the actual comparison code goes here ...
506         pop ecx
507         pop esi
508         add esi,salt_item_len      ; point to the next entry in the kernel symbol table
509         pop edi
510         loop .b3

Lines 490~491: every time the outer loop enters the inner loop, the inner loop's count is initialized (to the total number of kernel symbols) and ESI is reset to the start of the kernel symbol table (C-SALT). This is the initialization of the inner loop; think of the initializer of a C for statement:

    for(ecx = salt_items,esi = salt;  ...;  ...)

Lines 493~495: the comparison itself changes ESI, EDI and ECX, so these registers are pushed onto the stack before it starts.
Lines 506~509: restore the registers pushed above, and advance ESI so that it points to the next entry in the kernel symbol table.

The core code of the comparison

Let's look at the core code of the comparison:

497         mov ecx,64                 ; number of comparisons per entry
498         repe cmpsd                 ; compare 4 bytes at a time
499         jnz .b4                    ; ZF=0 means no match, so jump
500         mov eax,[esi]              ; on a match, ESI points exactly at the address data that follows
501         mov [es:edi-256],eax       ; overwrite the string with the offset address
502         mov ax,[esi+4]
503         mov [es:edi-252],ax        ; and with the segment selector
504  .b4:
505

Each time execution reaches this point, DS:ESI and ES:EDI point to an entry in the kernel symbol table and an entry in the user symbol table, respectively.
Line 497: a symbol occupies 256 bytes and we use the cmpsd instruction, so at most 256/4 = 64 comparisons are needed; ECX is therefore set to 64.
Line 498: keep comparing while the doublewords are equal. The stop condition is (ECX==0) || (ZF==0): the comparison stops when ECX reaches 0 or a difference is found.
Line 499: if a difference is found, ZF=0. If the strings are equal, the comparison repeats 64 times and ends with ZF=1. So ZF=0 means no match, and ZF=1 means a match.
If there is no match, jump to the label .b4, which in effect means jumping to line 506 in the inner loop.

Line 506: restore ECX, which says how many inner iterations remain (how many kernel symbols are still to be compared with the current user symbol).
Line 509: restore EDI so that it points to the start of the current user symbol again.

Lines 500~501: on a match, ESI points just past the end of the matched kernel symbol name (256 bytes), where the 4-byte offset address and the 2-byte segment selector follow. EDI has likewise advanced by 256, so [es:edi-256] is the start of the user symbol; the offset address is written there.
Lines 502~503: the segment selector is written right after the offset address, at [es:edi-252]. Together the two form the entry point of the routine, which the user program can then use for a far call or a far jump.
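The pointer arithmetic of lines 500~503 can be traced with a short Python sketch. It is only an illustration: `backfill` is a hypothetical helper, the kernel entry uses a made-up offset (0x4000) and selector (0x38), and `esi` and `edi` are plain indices into byte buffers in the state they would have after a fully matching `repe cmpsd`.

```python
import struct

NAME_LEN = 256


def backfill(user: bytearray, kernel: bytes, user_start: int, kernel_start: int):
    """Replay lines 500~503 for one matched pair of entries."""
    # State right after a full match: both pointers have advanced by 256.
    esi = kernel_start + NAME_LEN
    edi = user_start + NAME_LEN
    (eax,) = struct.unpack_from("<I", kernel, esi)     # line 500: mov eax,[esi]
    struct.pack_into("<I", user, edi - 256, eax)       # line 501: mov [es:edi-256],eax
    (ax,) = struct.unpack_from("<H", kernel, esi + 4)  # line 502: mov ax,[esi+4]
    struct.pack_into("<H", user, edi - 252, ax)        # line 503: mov [es:edi-252],ax


# One kernel entry with made-up address data, and a two-entry user table.
kernel = b"@TerminateProgram".ljust(NAME_LEN, b"\0") + struct.pack("<IH", 0x4000, 0x38)
user = bytearray(
    b"@PrintString".ljust(NAME_LEN, b"\0")
    + b"@TerminateProgram".ljust(NAME_LEN, b"\0")
)

backfill(user, kernel, user_start=256, kernel_start=0)
offset, selector = struct.unpack_from("<IH", user, 256)
print(hex(offset), hex(selector))
```

The 6 bytes at the start of the user entry now have the offset-then-selector layout of a far pointer, while the rest of the old name is left behind unchanged after it.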

Is that all there is to this code? No. As mentioned earlier, there is a small problem here. What should happen once lines 500~503 have run? The match succeeded and the entry has been filled in, so EDI should move on to the next user symbol and ESI should go back to the start of the kernel symbol table. In other words, we should leave the inner loop and start the next round of the outer loop (jump to line 512, the equivalent of break in C). That raises another issue: the stack must be balanced before jumping to line 512. Three registers were pushed at lines 493~495 before the comparison, so three values must come off the stack after it.
So the following code should be inserted at line 505:

        pop ecx
        pop esi
        pop edi
        jmp .b5        ; jump to line 512

In fact, the values that these few lines leave in ECX, ESI and EDI do not matter:
line 514 gives ECX its proper value;
lines 512~513 give EDI its proper value;
line 491 gives ESI its proper value.
So the patch above can be simplified to:

        add esp,12     ; balance the stack
        jmp .b5        ; jump to line 512

That's a lot more concise.
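Why the stack has to be balanced before the jump can be seen in a toy Python model where the stack is a list. This is a sketch with hypothetical names and made-up register values: `patch` selects between the three pops, `add esp,12`, and a naive jump to .b5 with no balancing at all.

```python
def run_match_path(patch: str):
    """Model the stack on the match path; patch is 'pops', 'add_esp' or 'none'."""
    stack = []
    ecx_outer, edi_outer = 3, 0x28       # outer loop state, made-up values
    stack.append(ecx_outer)              # line 487: push ecx
    stack.append(edi_outer)              # line 488: push edi

    edi, esi, ecx_inner = 0x28, 0, 4     # inner loop state, made-up values
    stack += [edi, esi, ecx_inner]       # lines 493~495: push edi / esi / ecx

    # ... a match was found and lines 500~503 ran ...
    if patch == "pops":
        stack.pop()                      # pop ecx
        stack.pop()                      # pop esi
        stack.pop()                      # pop edi
    elif patch == "add_esp":
        del stack[-3:]                   # add esp,12 discards three doublewords
    # patch == "none": jump to .b5 without touching the stack

    edi = stack.pop() + 256              # lines 512~513: pop edi / add edi,256
    ecx = stack.pop()                    # line 514: pop ecx
    return edi, ecx, len(stack)


for patch in ("pops", "add_esp", "none"):
    print(patch, run_match_path(patch))
```

In this model both patches hand the outer loop the same EDI and ECX and leave the stack empty, whereas the unbalanced jump feeds the inner loop's saved values to the outer loop and leaves three stale doublewords behind.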

Some readers may not be convinced: surely the book's source code would not have a problem, so the mistake must be mine? No matter. In a later post I will show that this really is a bug. "Real knowledge comes from practice."

That's all for this post. Next time we will talk about the execution of the user program.

"End"
