Contents
- 1. Absolute loading mode
- 3. Dynamic address relocation (dynamic run-time loading)
- 2. Load-time dynamic linking
- 3. Run-time dynamic linking
- 4.1. Construct a dynamic link library
- 4.2. DLL loading method
1. Address-related concepts
1. Physical address
Physical memory is the memory installed in the memory slots of the motherboard; its size is the capacity of those memory modules.
Memory is composed of storage units, each identified by a unique number called its memory address (or physical address). We can regard memory as a byte array running from byte 0 to the maximum memory capacity, in which each storage unit corresponds to one memory address.
2. Virtual memory
Virtual memory address: an address in the space that each process can address directly, without interference from other processes. Every instruction and data unit has a definite address in this virtual space.
Virtual memory: the virtual space made up of the virtual addresses of a process's object code, data, and so on.
Virtual memory takes no account of the size of physical memory or of where information is actually stored. It only fixes the relative positions of related pieces of information within the process. Each process has its own virtual memory, and the size of the virtual memory is determined by the address structure and addressing mode of the processor.
With direct addressing, if the effective address length of the CPU is 16 bits, the addressing range is 0 to 64 KB.
For example, a 32-bit machine can directly address 4 GB, so each application has a 4 GB address space available. The machine's physical memory, however, is rarely large enough to give every program 4 GB.
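The arithmetic behind these two figures can be checked with a small sketch. The function name addressable_bytes is made up for illustration, and the sketch assumes byte addressing and an address width below 64 bits.

```c
#include <stdint.h>

/* Number of distinct byte addresses reachable with an address that is
   `bits` wide (assumes bits < 64 and one address per byte). */
uint64_t addressable_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}
```

Under these assumptions, a 16-bit address gives 65,536 bytes (64 KB) and a 32-bit address gives 4,294,967,296 bytes (4 GB).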
The difference between virtual memory and physical memory: physical memory is hardware, whereas virtual memory is a memory-management technique implemented in software, in which the system sets aside hard disk space as needed to act as memory. When large programs are running and physical memory runs short, the hard disk is used as a swap area with which data is exchanged. Physical memory is, however, more than 30 times faster than disk-backed virtual memory.
3. Logical address
After the source program is compiled or assembled, object code is produced. Each object module is addressed starting from a base address of 0, and units that were accessed by symbolic name are replaced by concrete unit numbers. The address space occupied by the resulting object program is called the logical address space of the job.
The address of each instruction in the logical space, and the address of each operand an instruction accesses, are collectively called logical addresses. These are the addresses an application uses; the physical address in memory is obtained only after calculation or translation by the addressing mechanism.
Put simply, a logical address is the address used in your source program: when the source code is compiled, the compiler converts labels and variables into addresses, or into offsets relative to the current segment.
A logical address is the segment-relative offset generated by the program. For example, in C pointer programming you can read the value of a pointer variable (obtained with the & operator). That value is a logical address: it is relative to the address of your current process's data segment and has nothing to do with an absolute physical address. Only in Intel real mode does a logical address correspond directly to a physical address (real mode has no paging, and the only translation is the fixed segment × 16 + offset calculation). In Intel protected mode, the logical address is an offset within the limit of the code segment the program is executing (assuming that the code segment and the data segment are identical). Application programmers only deal with logical addresses; the segmentation and paging mechanisms are completely transparent to them and concern only system programmers. Although application programmers can manipulate memory directly, they can do so only within the memory segments that the operating system has allocated to them.
However, some materials treat the logical address and the virtual address as the same thing, and there is no clear line between the two.
In the Linux kernel, virtual addresses are the addresses from 3 GB to 4 GB, which are mapped to physical addresses through the page table. Logical addresses are the virtual addresses from 3 GB to 3 GB + main_memory_size; their mapping to physical addresses is linear, although the page table can also be used. Logical addresses are therefore a subset of virtual addresses.
Logical address: a segment identifier plus an offset within that segment, written as [segment identifier : offset within segment].
Figure 4.1 job namespace, logical address space, and loaded physical space
4. Linear address (called a virtual address in Linux)
This address is important and not easy to understand. Under the segmentation mechanism, the CPU addresses memory with a two-dimensional address of the form segment address : offset address. Memory cannot be accessed with a two-dimensional address, so it must be converted to a one-dimensional one. In real mode this is segment address × 16 + offset address, and the result is a linear address (which is also the physical address when paging is not enabled). Anyone who has studied computers knows this calculation, but to understand what it really means you must know what an address space is.
A linear address is the intermediate layer in the translation from a logical address to a physical address. Program code generates a logical address, that is, an offset within a segment; adding the base address of the corresponding segment yields a linear address. If paging is enabled, the linear address is translated again to produce a physical address. If paging is not enabled, the linear address is the physical address. The Intel 80386 has a 4 GB linear address space (a 32-bit address bus, so 2 to the power of 32 addresses).
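As a worked example of the segment address × 16 + offset address calculation, here is a minimal sketch. The function name real_mode_linear is made up for illustration, and the sketch ignores the 20-bit wrap-around of the original 8086.

```c
#include <stdint.h>

/* Real-mode x86: linear address = segment * 16 + offset.
   Shifting left by 4 bits is the same as multiplying by 16. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For instance, segment 0x1234 with offset 0x0010 gives 0x12340 + 0x10 = 0x12350.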
Like a logical address, a linear address is not a real physical address. If the logical address is the address before the hardware's segment translation, the linear address is the address before the hardware's page translation.
The CPU takes two steps to convert an address in a virtual memory space to a physical address. First, given a logical address (in fact an offset within a segment), the CPU uses its segmentation unit to convert the logical address into a linear address. It then uses its paging unit to convert the linear address into the final physical address.
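The two steps can be sketched with a toy model. Everything here is an assumption made for illustration: the 4 KB page size, the eight-entry single-level page_table, its contents, and the function names. Real x86 paging uses multi-level tables and protection checks that are left out.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size */
#define NUM_PAGES 8u      /* toy linear address space of 8 pages */

/* Hypothetical page table: page_table[p] is the physical frame that
   holds linear page p. */
static const uint32_t page_table[NUM_PAGES] = {5, 2, 7, 0, 1, 3, 6, 4};

/* Step 1 (segmentation unit): offset within a segment -> linear address. */
uint32_t to_linear(uint32_t segment_base, uint32_t offset)
{
    return segment_base + offset;
}

/* Step 2 (paging unit): linear address -> physical address. */
uint32_t to_physical(uint32_t linear)
{
    uint32_t page = (linear / PAGE_SIZE) % NUM_PAGES; /* wrap to stay inside the toy table */
    uint32_t offset_in_page = linear % PAGE_SIZE;
    return page_table[page] * PAGE_SIZE + offset_in_page;
}

/* Both steps in sequence, as the CPU applies them. */
uint32_t translate(uint32_t segment_base, uint32_t offset)
{
    return to_physical(to_linear(segment_base, offset));
}
```

With this table, a segment base of 0x1000 and an offset of 0x234 give the linear address 0x1234. That address lies in page 1, which maps to frame 2, so the physical address is 0x2234.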
How the program runs
In a multiprogramming environment, a program can run only after a process has been created for it, and the first step in creating a process is to load the program and its data into memory. Turning a user source program into a program that can execute in memory takes the following steps:
First, the compiler compiles the user source code into object code that the CPU can execute, producing several object modules (that is, several program segments).
Second, the linker links the set of object modules produced by compilation, together with the library functions they need, into a complete load module.
Finally, the loader loads the load module into memory. Figure 4-2 shows these three steps.
Figure 4-2 Processing of a user program
2. Program loading (address transformation)
For simplicity, we first describe loading a single object module with no linking, so that the object module is itself the load module. A module can be loaded into memory in three ways: absolute loading, relocatable loading, and dynamic run-time loading. Each is described below.
1. Absolute loading mode
If it is known at compile time where the program will reside in memory, the compiler generates object code with absolute addresses, that is, actual physical addresses assigned according to the program's location in physical memory. For example, if we know in advance that the user program (process) will reside starting at location R, the object module (that is, the load module) generated by the compiler extends upward from R. The absolute loader loads the program and data into memory at the addresses given in the module. Once the module is in memory, the logical addresses in the program are identical to the actual memory addresses, so the addresses of the program and data need no modification. The absolute addresses used in the program can be assigned during compilation or assembly, or given directly by the programmer.
Advantage: the CPU executes the object code quickly.
Disadvantages: 1) because of memory size limits, the number of processes that can be loaded into memory for concurrent execution is greatly reduced;
2) the compiler must know which parts of memory are currently free and their addresses, and must place the different program segments of a process contiguously, which makes compilation very complicated. Because the program
Therefore, it is usually better to use symbolic addresses in the program and convert them into absolute addresses during compilation or assembly.
How is the virtual address space converted into the unique one-dimensional physical linear space of memory? Two problems are involved:
The first is the division of the virtual space.
The second is loading the linked and divided contents of the virtual space into memory and mapping virtual-space addresses to memory addresses, that is, address mapping.
Address mapping establishes the correspondence between virtual addresses and memory addresses.
2. Static address relocation (relocatable loading mode)
Absolute loading can only place the object module at a location in memory fixed in advance. In a multiprogramming environment, the compiler cannot predict where in memory the compiled object module will be placed. Therefore, absolute loading is suitable only for a single-program environment. In a multiprogramming environment, the object module usually starts at address 0, and the other addresses in the program are calculated relative to that starting address. Relocatable loading should then be used to load the module at a suitable location in memory according to the current state of memory.
Static address relocation is performed when the object code is loaded into memory. Before the program starts running, all addresses in its instructions and data have been relocated, that is, the mapping from virtual addresses to memory addresses is complete. Address translation is normally done once at load time and does not change afterwards.
Note that after the relocating loader has loaded the load module into memory, all logical addresses in the module differ from the actual physical addresses in memory. Figure 4-3 shows this situation.
Figure 4-3 Memory loading of a job
For example, suppose the user program has an instruction LOAD 1, 2500 at unit 1000, which loads the integer 365 stored in unit 2500 into register 1. If the user program is loaded into memory units 10000 to 15000 without address translation, then when the instruction in unit 11000 executes it still fetches data from unit 2500 into register 1, which produces a data error. As Figure 4-3 shows, the correct approach is to change the address 2500 in the load instruction to 12500, that is, to add the relative address 2500 in the instruction to the program's starting address in memory, 10000, giving the correct physical address 12500. Instruction addresses must be modified as well as data addresses: the relative address 1000 of the instruction is added to the starting address 10000 to obtain the absolute address 11000.
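The example above can be sketched as a relocating loader that adds the load address to every address field once, at load time. The struct instr format and the relocate function are hypothetical. A real loader would use relocation records in the object file to find which fields hold addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction format: an opcode plus one address field. */
struct instr {
    int      opcode;
    uint32_t address;   /* relative (logical) address before loading */
};

/* Static relocation: at load time, add the load address to every
   address field. Nothing is adjusted again after this point. */
void relocate(struct instr *program, size_t count, uint32_t load_base)
{
    for (size_t i = 0; i < count; i++) {
        program[i].address += load_base;
    }
}
```

Relocating an instruction whose address field is 2500 with a load base of 10000 leaves 12500 in the field, matching the example.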
Advantage: no hardware support is required.
Disadvantages: 1) once relocated, the program cannot be moved in memory;
2) the program's storage space must be contiguous; the program cannot be placed in several non-contiguous regions.
3. Dynamic address relocation (dynamic run-time loading)
Relocatable loading can place a module at any permitted location in memory, so it can be used in a multiprogramming environment. However, it does not allow a program to be moved in memory while it is running, because moving the program changes its physical location, and the (absolute) addresses of its program and data would have to be modified before it could run again. In practice, a program's location in memory may change frequently while it is running. In that case dynamic run-time loading should be used.
Dynamic address relocation: address translation is not done before the program executes but is postponed until the program actually runs. That is, before each access to a memory unit, the program or data address being accessed is converted to a memory address. Dynamic relocation lets the load module be loaded into memory without any modification. So that address translation does not slow instruction execution, this method requires hardware support in the form of a relocation register.
Advantages: 1) the object module needs no modification when it is loaded into memory, and it still executes correctly if it is moved after loading, which is very helpful for memory compaction and for dealing with fragmentation;
2) when a program consists of several relatively independent object modules, each module can be loaded into its own storage area, and the areas need not be contiguous, as long as each module has its own relocation register.
Disadvantage: hardware support is required.
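The role of the relocation register can be illustrated with a small software simulation. The memory array, the relocation_register variable, and both functions are assumptions made for this sketch. In real hardware the addition happens inside the CPU on every access, and bounds checking, omitted here, would normally accompany it.

```c
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64u

static int      memory[MEM_SIZE];      /* toy physical memory, one int per unit */
static uint32_t relocation_register;   /* start address of the program in memory */

/* Every access adds the relocation register to the logical address. */
int load_word(uint32_t logical)
{
    return memory[relocation_register + logical];
}

/* Move a program of `length` units to `new_base`. The program's own
   addresses are untouched; only the register changes. */
void move_program(uint32_t length, uint32_t new_base)
{
    memmove(&memory[new_base], &memory[relocation_register],
            length * sizeof memory[0]);
    relocation_register = new_base;
}
```

After move_program, a call such as load_word(5) still returns the same value as before the move, which is why this scheme suits memory compaction.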
3. Program linking
Compiling the source program yields a set of object modules, which the linker then links into a load module. Depending on when linking takes place, there are three kinds of linking:
(1) Static linking. Before the program runs, the object modules and the library functions they need are linked into a complete load module. This method of linking in advance is called static linking.
(2) Load-time dynamic linking. The set of object modules produced by compiling the user source program is linked while it is being loaded into memory, that is, loading and linking proceed together.
(3) Run-time dynamic linking. Some object modules are linked only when they are needed during program execution.
1. Static linking
An example illustrates the problems that static linking must solve. Figure 4-4(a) shows three object modules A, B, and C produced by compilation, with lengths L, M, and N respectively. Module A contains the statement CALL B, which calls module B, and module B contains the statement CALL C, which calls module C. B and C are both external call symbols. When these object modules are assembled into a load module, two problems must be solved:
(1) Modifying relative addresses. All object modules generated by the compiler use relative addresses with a starting address of 0, and the addresses in each module are calculated relative to that starting address. After the modules are linked into a load module, the starting addresses of modules B and C are no longer 0 but L and L + M respectively. The relative addresses in modules B and C must therefore be modified: L is added to every relative address in the original B, and L + M is added to every relative address in the original C.
(2) Converting external call symbols. The external call symbols used in each module are also converted into relative addresses. For example, the starting address of B becomes L and the starting address of C becomes L + M, as shown in Figure 4-4(b). A complete load module formed by linking in advance in this way is also called an executable file. It is usually not split up again and can be loaded directly into memory when it is to run. This linking method is called static linking.
Figure 4-4 Program linking
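The address adjustments in the static linking example can be sketched as follows. The struct layout type and the link_layout and relink functions are made up for illustration. The sketch assumes that modules A, B, and C are placed back to back starting at address 0, as in Figure 4-4(b).

```c
#include <stdint.h>

/* Starting addresses of modules A, B, and C inside the load module. */
struct layout {
    uint32_t base_a;
    uint32_t base_b;
    uint32_t base_c;
};

/* Given the lengths L (of A) and M (of B): A starts at 0, B at L,
   and C at L + M. */
struct layout link_layout(uint32_t len_a, uint32_t len_b)
{
    struct layout out = {0, len_a, len_a + len_b};
    return out;
}

/* A relative address inside a module becomes the module's starting
   address plus the relative address. */
uint32_t relink(uint32_t module_base, uint32_t relative)
{
    return module_base + relative;
}
```

With L = 100 and M = 200, for example, the external symbol B resolves to address 100 and C to address 300, and relative address 20 inside C becomes 320 in the load module.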
2. Load-time dynamic linking
The object modules produced by compiling the user's source program are linked while they are being loaded into memory. That is, if loading an object module raises a call to an external module, the loader finds the corresponding external object module, loads it into memory, and modifies the relative addresses in that module in the way shown in Figure 4-4. Load-time dynamic linking has the following advantages:
(1) Easy to modify and update. With static linking, modifying or updating one object module requires reopening the whole load module, which is inefficient and sometimes impossible. With dynamic linking, each object module is stored separately, so modifying or updating any one of them is easy.
(2) Easy to share object modules. With static linking, every application module must contain its own copy of an object module, so the module cannot be shared. With load-time dynamic linking, the OS can easily link one object module into several application modules, so that multiple applications share it.
3. Run-time dynamic linking
In many cases, the modules that actually run may differ from one execution of an application to the next. Because it is impossible to know in advance which modules will run, load-time linking has to load every module that might run into memory and link them all together at load time. This is clearly inefficient, because some object modules are never executed at all. A typical example is an error-handling module, which goes unused if the program meets no errors during the whole run. The run-time dynamic linking that has become popular in recent years improves on load-time linking by delaying the linking of some modules until the program is executing. When execution finds that a called module has not yet been loaded into memory, the OS immediately locates the module, loads it into memory, and links it to the calling module. Any object module that is not used during execution is never brought into memory or linked into the load module. This both speeds up program loading and saves a large amount of memory.
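The idea of linking a module on its first call can be simulated in a few lines. The struct module record, the error_handler function, and the load_module stand-in are all hypothetical. A real OS would locate the module on disk, map it into memory, and fix up its addresses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical module record: the entry point is unknown until the
   module has been loaded. */
struct module {
    const char *name;
    bool        loaded;
    int       (*entry)(int);
};

/* Example module body: an error handler that is rarely needed. */
static int error_handler(int code)
{
    return -code;
}

/* Stand-in for the OS locating a module and loading it into memory. */
static void load_module(struct module *m)
{
    m->entry  = error_handler;
    m->loaded = true;
}

/* Call through the module record, loading and linking on first use only. */
int call_module(struct module *m, int arg)
{
    if (!m->loaded) {
        load_module(m);
    }
    return m->entry(arg);
}
```

A module that is never passed to call_module is never loaded, which is the saving described above.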
4. Windows NT Dynamic Link Library
4.1. Construct a dynamic link library
A DLL is a module that contains functions and data. The module that calls it can be an EXE or another DLL. It is loaded by the calling module at run time, and when loaded it is mapped into the address space of the calling process. Visual C++ (VC) has a project type for creating a DLL.
• Library file (.C): provides the source code for a group of function definitions.
• Module definition file (.DEF): defines link options, which can also be specified in the source code, for example the import and export of DLL functions (dllimport and dllexport).
• The compiler uses the .C file to generate the object module (.OBJ).
• The library manager uses the .DEF file to generate the DLL's import library (.LIB) and export file (.EXP).
• The linker uses the .OBJ and .EXP files to generate the dynamic link library (.DLL).
4.2. DLL Loading Method
1) Load-time linking:
- The program calls DLL functions explicitly in its code. In the executable file, such a function is called an import function.
- The .LIB file must be used at link time. An IMAGE_IMPORT_DESCRIPTOR structure is created in the executable file for each imported DLL.
At load time, the system rewrites the function pointers in the Import Address Table according to the address at which the DLL is mapped in the process. The hint is the index of the DLL function within the DLL file; after the DLL file has been modified, it may no longer point to the original function. When loading, the system finds each required DLL, maps it into the process address space, obtains the entry address of each function in the DLL, and resolves the process's references to these functions.
Load-time dynamic linking process:
(Note: the Import Address Table is filled in at load time according to where the DLL module is loaded.)
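The rewriting of the Import Address Table can be pictured with a deliberately simplified model. The struct import_entry type and the bind_imports function are hypothetical and are not the real PE structures such as IMAGE_IMPORT_DESCRIPTOR. The sketch assumes each imported function is identified by its offset (RVA) from the start of the DLL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified import entry (not the real PE layout). */
struct import_entry {
    uint32_t  rva;      /* offset of the function from the start of the DLL */
    uintptr_t address;  /* slot the loader fills in; calls go through it */
};

/* At load time, once the DLL's mapping address is known, each slot is
   set to the DLL base plus the function's offset. */
void bind_imports(struct import_entry *iat, size_t count, uintptr_t dll_base)
{
    for (size_t i = 0; i < count; i++) {
        iat[i].address = dll_base + iat[i].rva;
    }
}
```

Because only these slots depend on where the DLL is mapped, the calling code itself does not need to be patched when the DLL loads at a different address.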
DLL function call process:
2) Run-time linking: the program uses the API functions LoadLibrary (given the DLL name, it returns a handle to the DLL after loading and linking it), GetProcAddress (given the symbol name of a function, it returns the function's entry pointer), and FreeLibrary to use DLL functions. An import library is not needed in this case.
- LoadLibrary or LoadLibraryEx maps the executable module into the address space of the calling process and returns the module handle;
- GetProcAddress obtains and returns a pointer to a specific function in the DLL;
- FreeLibrary decrements the reference count of the DLL module by 1; when the count reaches 0, the DLL module is unmapped from the process address space.
Example of run-time dynamic linking:
    #include <windows.h>

    /* Function pointer type; the parameter list matches the eight
       arguments passed in the call below. */
    typedef DWORD (WINAPI *InstallStatusMIF_t)(char *, char *, char *, char *,
                                               char *, char *, char *, BOOL);

    int main(void)
    {
        HINSTANCE hInstLibrary = LoadLibrary("ismif32.dll");  // map the DLL; returns the module handle
        if (hInstLibrary) {
            InstallStatusMIF_t InstallStatusMIF = (InstallStatusMIF_t)
                GetProcAddress(hInstLibrary, "InstallStatusMIF");  // obtain the function pointer
            if (InstallStatusMIF) {
                if (InstallStatusMIF("office97", "Microsoft", "Office 97", "999.999",
                                     "ENU", "1234", "Completed successfully", TRUE) != 0) {
                    // the function in the DLL module has been called
                }
            }
            FreeLibrary(hInstLibrary);  // remove the mapping
        }
        return 0;
    }