I have been reading about 32-bit assembly programming on Windows lately. It touches on low-level details of the computer that I did not know before. Since the material is fairly involved, I am recording some notes here for future reference.
Memory addressing is the first thing you run into when writing assembly under DOS. Memory management in DOS is quite different from that in Windows. To put it bluntly, the biggest difference between Win32 programming and DOS programming is how memory is used.
DOS runs in real mode. Real mode behaves like an 8086 machine: the addressable range is only 1 MB, that is, 00000h-FFFFFh, and no address can fall outside this range. Within this small 1 MB space, hardware-mapped addresses occupy the high end (A0000h-FFFFFh, 384 KB). At the low end sit the interrupt vector table (00000h-003FFh, 1 KB) and the BIOS data area that follows it (00400h-004FFh). What remains below 640 KB is shared by the operating system and applications (this is the famous 640 KB memory limit).
Addressing in DOS is computed as segment register × 10h + offset (for example, DS × 10h + offset for data). The result is the physical memory address. For example, the logical address XXXX:YYYY refers to the physical address XXXX × 10h + YYYY.
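The formula above can be sketched in C as follows. This is only an illustration: the function name real_mode_physical is made up for this note, and the 20-bit mask assumes an 8086-style machine with 20 address lines, where an address past FFFFFh wraps around to 0.

```c
#include <stdint.h>

/* Real-mode translation: physical = segment * 10h + offset.
   Assumption: the result is masked to 20 bits, as on an 8086
   with 20 address lines (addresses past FFFFFh wrap around). */
uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

For instance, 1234:5678 gives 12340h + 5678h = 179B8h.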
Windows memory management is very different from that of DOS. Windows can run in real mode, protected mode, and virtual 8086 mode. In real mode, addressing is no different from DOS: a 1 MB space. Protected mode is fundamentally different, because the 32 address lines can reach up to 4 GB of memory. (A virtual 8086 task still forms its addresses the 8086 way, but it runs under the control of the protected-mode memory manager.) The old 16-bit segment plus 16-bit offset cannot cover such a large space. On the 386, the general-purpose registers were widened to 32 bits, so a register value can directly represent any address in memory, without relying on a segment register to extend the range.
So what is the role of the segment registers in 386 protected mode? A segment register no longer holds a base address; it holds a segment selector. Protected mode has to consider whether a region can be written, what privilege level is needed to access it, whether execution is allowed, and other protection issues, so each region of the address space must carry some security attributes. The segment descriptor exists for this purpose. A segment descriptor is 64 bits long, but a 386 segment register is only 16 bits, which is not enough room. The solution is to store the segment descriptors in a table and use the segment register as an index into it. This is why, on the 386, the value in a segment register is called a segment selector.
The 386 introduces two new registers to manage segment descriptors: the 48-bit global descriptor table register GDTR and the 16-bit local descriptor table register LDTR. In Windows, a memory location is written as XXXX:YYYYYYYY. The following describes how such an address is resolved:
First, XXXX does not represent the base address of the segment. Instead, check the TI bit of XXXX (bit 2, counting from bit 0). If TI is 0, the descriptor is in the GDT: take the GDT base address from GDTR, take the index from the high 13 bits of the selector XXXX, and locate the descriptor at GDT base + index × 8 (each descriptor is 8 bytes). The segment descriptor contains the segment's base address, its limit, and its privilege level, which gives the starting address of the segment. Adding the offset YYYYYYYY to that base yields the linear address ZZZZZZZZ. If TI is 1, the segment descriptor is in the LDT. In that case, first take the GDT base address from GDTR, then use the selector held in LDTR (not the high 13 bits of XXXX) to index the GDT. Note that this lookup yields the location of the LDT, not the segment descriptor itself. The segment descriptor is then fetched from the LDT using the high 13 bits of XXXX. As before, the segment base from the descriptor plus the offset YYYYYYYY gives the linear address ZZZZZZZZ. The procedure can be written as a simple piece of pseudocode:
if (TI bit of XXXX == 0) // bit 2 is clear: the segment descriptor is in the GDT
{
A1 = (high 32 bits of GDTR); // A1 is the base address of the GDT
segment descriptor = A1 + (high 13 bits of XXXX) * 8; // each descriptor is 8 bytes
linear address = base address in the segment descriptor + YYYYYYYY;
}
else // the TI bit is 1: the segment descriptor is in the LDT
{
A1 = (high 32 bits of GDTR);
A2 = base address in the descriptor at A1 + (high 13 bits of LDTR) * 8; // A2 is the start of the LDT. Note: LDTR is 16 bits.
segment descriptor = A2 + (high 13 bits of XXXX) * 8; // offset from the start of the LDT gives the segment descriptor
linear address = base address in the segment descriptor + YYYYYYYY;
}
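A runnable version of the same idea might look like the C sketch below. It is a simplified model, not the real hardware layout: the Descriptor struct keeps only a base and a limit instead of the packed 8-byte format, the LDT is assumed to have been located through LDTR already and is passed in directly, and all type and function names are hypothetical. The selector layout it uses is the standard one: bits 15..3 are the index, bit 2 is TI, and bits 1..0 are the requested privilege level.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified descriptor (assumption): a real 80386 descriptor packs the
   base, limit and attributes into 8 bytes; only two fields are modelled. */
typedef struct {
    uint32_t base;   /* segment base address */
    uint32_t limit;  /* highest valid offset within the segment */
} Descriptor;

typedef struct {
    const Descriptor *entries;
    size_t count;
} DescriptorTable;

/* Selector fields: bits 15..3 index, bit 2 TI, bits 1..0 RPL. */
unsigned selector_index(uint16_t sel) { return (unsigned)sel >> 3; }
unsigned selector_ti(uint16_t sel)    { return ((unsigned)sel >> 2) & 1u; }
unsigned selector_rpl(uint16_t sel)   { return (unsigned)sel & 3u; }

/* Returns 0 and stores the linear address on success, or -1 when a check
   fails (a point where real hardware would raise a fault instead). */
int logical_to_linear(const DescriptorTable *gdt, const DescriptorTable *ldt,
                      uint16_t sel, uint32_t offset, uint32_t *linear)
{
    /* TI = 0 selects the GDT, TI = 1 selects the LDT. */
    const DescriptorTable *table = selector_ti(sel) ? ldt : gdt;
    unsigned index = selector_index(sel);

    if (table == NULL || index >= table->count)
        return -1;                      /* index outside the table */

    const Descriptor *d = &table->entries[index];
    if (offset > d->limit)
        return -1;                      /* offset beyond the segment limit */

    *linear = d->base + offset;         /* linear = segment base + offset */
    return 0;
}
```

With a GDT whose entry 1 has base 00100000h, the selector 0008h (index 1, TI 0) and offset 1234h resolve to the linear address 00101234h.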
Once the linear address is obtained, is it the physical memory address? That depends on whether paging is enabled. If paging is disabled, the linear address is the physical address. If paging is enabled, it is not, and a further translation step is needed to obtain the actual physical address.
Enabling paging on the 80386 effectively solves the problem of memory fragmentation. Previously, a program had to occupy a contiguous range of physical memory. After programs have been started and terminated for a while, the total amount of free memory may be large, yet no single contiguous block is big enough to load a program. This is memory fragmentation. The 386 treats each 4 KB of memory as one page. Through the page directory and page tables, each physical page can be mapped to a different linear address. In this way, scattered physical pages can be mapped together and appear as one contiguous range of linear addresses. On the 386, apart from the third layer, which uses physical addresses, everything else refers to memory by linear address.
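To make the page directory and page table lookup more concrete, here is a small C sketch of how a 32-bit linear address is split under 4 KB paging on the 386: the top 10 bits index the page directory, the next 10 bits index a page table, and the low 12 bits are the offset inside the page. The function names are hypothetical, and the sketch only shows the arithmetic, not a walk through real page tables.

```c
#include <stdint.h>

/* Fields of a 32-bit linear address under 4 KB paging. */
unsigned page_dir_index(uint32_t linear)   { return linear >> 22; }            /* bits 31..22 */
unsigned page_table_index(uint32_t linear) { return (linear >> 12) & 0x3FFu; } /* bits 21..12 */
unsigned page_offset(uint32_t linear)      { return linear & 0xFFFu; }         /* bits 11..0  */

/* Combine the 4 KB-aligned frame address found in a page-table entry
   with the offset taken from the linear address. */
uint32_t physical_address(uint32_t frame_base, uint32_t linear)
{
    return (frame_base & 0xFFFFF000u) | page_offset(linear);
}
```

For the linear address 00403025h, the directory index is 1, the table index is 3, and the offset is 025h. If the page table maps that page to the frame at 0009A000h, the physical address is 0009A025h.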
Whether paging is enabled on the 386 is controlled by bit 31 (the PG bit, the highest of the 32 bits) of the CR0 register. If PG is 0, paging is disabled; if PG is 1, it is enabled. Paging is only available in protected mode.
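As a small illustration, the check can be written in C against a CR0 value that has already been read. The macro and function names are made up for this note, and reading CR0 itself requires privileged code, so the value is simply passed in. PE (bit 0) is the protected-mode enable bit; the sketch treats PG without PE as "not enabled", since paging is only available in protected mode.

```c
#include <stdint.h>

#define CR0_PE 0x00000001u  /* bit 0: protected mode enable */
#define CR0_PG 0x80000000u  /* bit 31: paging enable */

/* Returns 1 when both PE and PG are set in the given CR0 value, else 0. */
int paging_enabled(uint32_t cr0)
{
    return (cr0 & CR0_PG) != 0 && (cr0 & CR0_PE) != 0;
}
```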
In terms of physical memory, Windows is like DOS: everything shares the same memory, including the code and data used by the operating system. At the level of application code, however, Windows is a time-sliced multitasking operating system. CPU time is divided into time slices that are allocated to different applications, and during a program's time slice, anything unrelated to that program's execution (such as the code and data of other programs) is not mapped into its linear address space. Each program is therefore independent: from within its own linear address space, an application cannot access the linear address space used by other programs.
The book introduces some important concepts in Win32 programming:
1. Each application has its own 4 GB address space. Part of it holds the code of the operating system, system DLLs, and user DLLs, which provide the various functions the program can call. Apart from some other reserved regions, the rest is available for the application's code, data, and allocatable address space.
2. The linear address spaces of different applications are isolated. Although the applications reside in physical memory at the same time, during one program's time slice the code and data of other applications are not mapped to any addressable linear address, so they cannot be accessed. A program can use its own 4 GB address space, which is completely private to it.
3. DLLs do not have a private address space of their own. They are always mapped into the address space of an application and run as part of that application. The reason is simple: if a DLL did not live in the same address space as the application, how could the application call it?
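The isolation described in points 2 and 3 can be imitated with a toy model in C. Everything here is an assumption made for illustration: each address space is reduced to four 4 KB pages, a single flat array stands in for the page directory and page tables, and the names AddressSpace, space_translate and UNMAPPED are hypothetical. The last page plays the role of a DLL that is mapped into both applications.

```c
#include <stdint.h>

#define TOY_PAGES 4            /* toy address space: four 4 KB pages */
#define UNMAPPED  0xFFFFFFFFu  /* marker for "no physical page here" */

typedef struct {
    /* Physical frame address for each linear page, or UNMAPPED. */
    uint32_t frame[TOY_PAGES];
} AddressSpace;

/* Translate a linear address within one application's own mapping. */
uint32_t space_translate(const AddressSpace *as, uint32_t linear)
{
    uint32_t page = linear >> 12;
    if (page >= TOY_PAGES || as->frame[page] == UNMAPPED)
        return UNMAPPED;       /* nothing is mapped at this address */
    return as->frame[page] + (linear & 0xFFFu);
}
```

If application A maps its page 0 to the frame at 10000h and application B maps its page 0 to 20000h, the same linear address 0123h lands at 10123h for A and 20123h for B, so neither can reach the other's data. A page that both map to the same frame, such as the DLL page in this model, translates to the same physical address in both.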