Essentially, virtual memory allows a program to run even when not all of its code and data are loaded into memory. While the program runs, when it executes code or accesses data that has not yet been loaded, the virtual memory manager dynamically loads that code or data from the hard disk into memory. To make room, the virtual memory manager generally moves some code or data that is currently in memory out to the hard disk, freeing space for what is about to be loaded.
Because data transfer between memory and the hard disk is very slow compared with code execution, the virtual memory manager must take efficiency into account as well as correctness. For example, it needs to optimize its replacement algorithm so that code about to be executed or data about to be accessed is not swapped out of memory while code or data that has not been accessed for a long time stays resident. It also needs to keep a reasonable amount of each process's code and data resident in memory and to adjust that amount dynamically according to the behavior of the process. All of this minimizes the disk I/O incurred while the program runs and so improves its performance.
The first part of this chapter focuses on the virtual memory management mechanism of Windows; the later part briefly introduces that of Linux.
4.1 Windows Memory Management
Seen from the application's perspective, the Windows virtual memory management system can be summarized in one sentence: the Win32 virtual memory manager provides each Win32 process with a private, page-based, 4 GB (32-bit) linear virtual address space. This statement can be broken down as follows:
(1) "Process private" means that each process can access only its own address space and cannot access the address space of any other process, so you need not worry about other processes seeing your address space (the exception is parent and child processes; for example, a debugger uses the parent-child relationship to access the address space of the process being debugged, which is not covered here). Note that a DLL used by a running process does not have a virtual address space of its own. It uses the virtual address space of the process that loads it: the DLL's global data and any memory allocated through the DLL's functions all come from the virtual address space of the calling process.
(2) "Page-based" means that the virtual address space is divided into units called "pages". The page size is determined by the underlying processor; on x86 it is 4 KB. The page is the smallest unit handled by the Win32 virtual memory manager, and physical memory is divided into pages in the same way. Allocating and releasing virtual address space, and transferring or swapping data between memory and disk, are all done in units of at least one page.
(3) "4 GB" means that addresses in the process range from 0x00000000 to 0xFFFFFFFF. Win32 assigns the lower 2 GB to the process and reserves the upper 2 GB for the system.
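The figures in the three points above can be checked with a few lines of arithmetic. The sketch below assumes the x86 values given in the text (4 KB pages, 32-bit addresses, a 2 GB/2 GB split); the constant and function names are illustrative only and do not come from any Windows API.

```python
PAGE_SIZE = 4 * 1024            # x86 page size from the text: 4 KB
ADDRESS_BITS = 32               # 32-bit linear addresses
SPACE_SIZE = 1 << ADDRESS_BITS  # 4 GB
USER_LIMIT = SPACE_SIZE // 2    # lower 2 GB for the process, upper 2 GB for the system


def page_number(address):
    """Index of the virtual page that contains the address."""
    return address // PAGE_SIZE


def is_user_address(address):
    """True if the address lies in the lower 2 GB used by the process."""
    return 0 <= address < USER_LIMIT


total_pages = SPACE_SIZE // PAGE_SIZE
print(total_pages)                   # 1048576 pages of 4 KB in 4 GB
print(hex(USER_LIMIT - 1))           # 0x7fffffff, the highest address in the lower 2 GB
print(is_user_address(0x80000000))   # False: first address of the system half
```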
In Win32, the hard disk files that help implement virtual memory are called "paging files", and there can be up to 16 of them. Paging files store the data that the virtual memory manager has swapped out of memory. When a process accesses that data again, the virtual memory manager first brings it back from the paging file into memory so that the process can access it correctly. You can configure the paging files yourself. For the sake of space efficiency and performance, program code (in EXE and DLL files) is treated differently: because it is never modified, its pages are not written to the paging file when they are swapped out but are simply discarded. When they are needed again, the virtual memory manager locates them in the EXE or DLL file that contains them and loads them back into memory. Read-only data in EXE and DLL files is handled the same way, and no space is set aside for it in the paging file.
When a process executes code or accesses data that is not in memory, the situation is called a "page fault". Page faults have many causes. The most common is the one already mentioned: the code or data has been swapped out of memory by the virtual memory manager, which must bring it back into memory before the code can execute or the data can be accessed. This operation is transparent to developers and greatly lightens their burden. However, a page fault involves disk I/O, and a large number of page faults will greatly reduce the overall performance of a program. You therefore need to understand the main causes of page faults and how to avoid them.
4.1.1 Using Virtual Memory
Memory allocation in Win32 is divided into two steps: "reserve" and "commit". Accordingly, a page in a process's virtual address space is in one of three states: free, reserved, or committed.
(1) Free means that the page has not been allocated and can be used to satisfy a new memory allocation request.
(2) Reserved means that a region (an integer multiple of the page size) has been set aside in the virtual address space. Once the region is set aside, its pages can no longer satisfy new allocation requests; they are kept for later use by the code that reserved them. No physical storage is allocated at reservation time. The system only adds a data structure that describes how the process's virtual address space is used, the VAD (Virtual Address Descriptor), to record that the region has been reserved. Reserving is relatively fast because no physical storage is actually allocated. For the same reason, reserved space cannot be accessed directly: accessing a reserved page causes a "memory access violation" (which makes the entire process exit immediately, not just the thread that caused the violation).
(3) Committed. To obtain real physical storage, you must commit the reserved memory. Committing sets aside space in the paging file and updates the corresponding entry in the VAD. Note that committing does not immediately allocate physical memory; it only sets aside space in the paging file on disk, which serves as backing store when the page is later swapped out. The first time code accesses data in the committed memory, the system finds that no physical memory is behind it and raises a page fault. Only while handling this page fault does the virtual memory manager actually allocate physical memory. Committing can also be done at the same time as reserving. Because a commit sets aside disk space in the paging file, it takes longer than a reserve.
This is also an instance of the demand-paging policy in Win32 virtual memory management: real physical memory is not assigned to a virtual address until that address is actually accessed. The policy is motivated first by performance, since it splits the work into stages instead of doing it all up front. Second, it is motivated by space efficiency: until an access actually occurs, Win32 assumes that the process will not touch most of its data, so there is no need to set aside storage for that data or bring it into physical memory, which improves the utilization of storage space (both disk and physical memory).
Imagine a program that requests a large amount of memory but does not need all of it immediately. Since the request expresses only a "potential" need, allocating all of it from physical storage at once would waste both execution time and storage space. Because the need is only potential, a large part of the allocated memory may well never be used, and allocating all the physical storage at request time would greatly reduce space utilization.
On the other hand, suppose there were no reserve/commit mechanism and memory were simply allocated on demand for each request. Consider code that requests memory frequently at different times. Between its requests other code is likely to request memory too, so the blocks this code obtains would not be contiguous in the virtual address space. Accesses to them (a traversal, for example) could not take full advantage of spatial locality, which would increase the number of page faults and reduce program performance.
In Win32, both reserving and committing are done with the VirtualAlloc function: pass MEM_RESERVE to reserve and MEM_COMMIT to commit. The VirtualFree function releases virtual memory. Mirroring VirtualAlloc, and depending on the parameters passed, it can either release only the physical storage behind a virtual address region while keeping the region reserved (MEM_DECOMMIT), or release the storage together with the region (MEM_RELEASE), in which case the region returns to the free state.
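The reserve, commit, and release transitions can be modeled as a small state machine. The sketch below is a toy model in Python, not the Win32 API: the class ToyAddressSpace and its method names are invented for illustration, and it tracks only the state of each page, not real memory.

```python
FREE, RESERVED, COMMITTED = "free", "reserved", "committed"


class ToyAddressSpace:
    """Tracks only the state of each page; no real memory is involved."""

    def __init__(self, page_count):
        self.state = [FREE] * page_count

    def reserve(self, first, count):
        pages = range(first, first + count)
        if any(self.state[p] != FREE for p in pages):
            raise ValueError("only free pages can be reserved")
        for p in pages:
            self.state[p] = RESERVED

    def commit(self, first, count):
        pages = range(first, first + count)
        if any(self.state[p] == FREE for p in pages):
            raise ValueError("pages must be reserved before they are committed")
        for p in pages:
            self.state[p] = COMMITTED

    def decommit(self, first, count):
        # Give back the storage but keep the address range reserved.
        for p in range(first, first + count):
            if self.state[p] == COMMITTED:
                self.state[p] = RESERVED

    def release(self, first, count):
        # Return the whole range to the free state.
        for p in range(first, first + count):
            self.state[p] = FREE

    def access(self, page):
        if self.state[page] != COMMITTED:
            raise MemoryError("access violation: page is " + self.state[page])
        return "ok"


space = ToyAddressSpace(page_count=16)
space.reserve(4, 8)   # pages 4..11 become reserved
space.commit(4, 2)    # only pages 4 and 5 get backing storage
print(space.state[3], space.state[4], space.state[6])  # free committed reserved
```

In this model, touching page 6 raises an error even though it lies inside the reserved region, which mirrors the rule that reserved but uncommitted pages cannot be accessed.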
Both the thread stack and the process heap are implemented with the two-step mechanism of reserving and committing. The thread stack is used below as an example of how the Win32 system applies this mechanism.
When a thread stack is created, it is only a reserved region of virtual addresses. The default size is 1 MB (it can be changed in the call to CreateThread or through a linker option at link time). Initially, only the first two pages are committed. When the thread stack needs more pages, for example because of nested function calls, the virtual memory manager dynamically commits further pages of the reserved region to meet the need, until the 1 MB limit is reached. When the maximum size of the reserved region (1 MB by default) is reached, the virtual memory manager does not enlarge the region but throws a stack overflow exception. At the moment the stack overflow exception is thrown, one page of space is still available to the stack, so the program can continue to run normally. If the program keeps using stack space until this last page is also used up and still needs more, the limit is exceeded and the process exits.
Therefore, to keep the whole program from exiting because of a thread stack overflow, you should limit stack usage as much as possible: reduce the nesting depth of function calls, reduce the use of recursion, and avoid defining large local variables in functions (large objects can be allocated from the heap instead, because the heap expands dynamically, whereas the memory region available to a thread stack is fixed when the thread is created and cannot be extended during the thread's lifetime).
In addition, to prevent the entire process from exiting because of a thread stack overflow, you can add exception handling to a thread function that may overflow its stack, catch the overflow exception thrown when the last page is committed, and handle it appropriately.
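The stack behavior described above can be sketched as a small simulation. This is a toy model written in Python under the assumptions stated in the text (a 1 MB reservation, 4 KB pages, two pages committed initially); the class and exception names are hypothetical and the model does not touch a real thread stack.

```python
PAGE_SIZE = 4096
RESERVED_PAGES = (1024 * 1024) // PAGE_SIZE   # default 1 MB reservation -> 256 pages


class StackOverflow(Exception):
    """Raised once, when the last page of the reservation is committed."""


class ProcessExit(Exception):
    """Raised when the stack needs more than the whole reservation."""


class ToyThreadStack:
    def __init__(self):
        self.committed = 2            # only the first two pages start out committed
        self.overflow_raised = False

    def use(self, pages_needed):
        """Make sure at least pages_needed pages of stack are committed."""
        if pages_needed > RESERVED_PAGES:
            raise ProcessExit("stack grew past its reserved region")
        if pages_needed > self.committed:
            self.committed = pages_needed   # commit further pages on demand
        if self.committed == RESERVED_PAGES and not self.overflow_raised:
            self.overflow_raised = True
            raise StackOverflow("last page of the stack committed")


stack = ToyThreadStack()
stack.use(10)
print(stack.committed)            # 10
try:
    stack.use(RESERVED_PAGES)     # committing the last page raises the exception once
except StackOverflow as exc:
    print("handled:", exc)        # the thread can keep running on the last page
```

A further call such as stack.use(RESERVED_PAGES + 1) raises ProcessExit in this model, corresponding to the point at which the real process would exit.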
4.1.2 Process for Accessing Virtual Memory
After a virtual memory region has been reserved and committed, the program can access data in it. This section describes what happens when the program accesses a piece of memory.
As shown in Figure 4-1, when the data is already in physical memory, the virtual memory manager only needs to map the virtual address of the data to a physical address in order to reach the real data in physical memory. This step involves no disk I/O and is relatively fast.
A page fault occurs in two cases: when you access data in committed memory for the first time, so that no physical memory has yet been allocated to it, or when the data was accessed before but has since been swapped out by the virtual memory manager. The virtual memory manager handles the page fault. It first checks whether the data has backing space in the paging file (code pages and read-only data pages of EXE and DLL files are handled similarly, except that their backing is not the paging file but the EXE or DLL file that contains them). In either case, the accessed data has backing space on disk. The virtual memory manager then needs to find a suitable page in physical memory and load the backing data from disk into it.
Figure 4-1 Process for accessing virtual memory
The virtual memory manager first checks whether physical memory currently has a free page. It maintains a data structure called the "page-frame database", which is global to the operating system. The database is initialized when Windows starts and tracks the state of every page in physical memory. It links all free pages into a list, so when a free page is needed the manager can simply consult the free-page list. If a free page exists, it is used directly; otherwise a page is selected according to the page replacement algorithm. This page will hold the content that is about to be read in from disk. Note that when the virtual memory manager brings in a page, it does not bring in only that page: to exploit locality, it loads the page containing the required data together with several nearby pages. For simplicity and clarity, the description here assumes that only the target page is brought in. Keep this behavior in mind when tuning paging behavior under Win32, though, because it can be used to improve program efficiency. After a memory page has been selected, its state is checked. If the page has not been modified since it was last brought into memory, it can be used directly (code pages and read-only pages can likewise be used directly). If the page has been modified (it is "dirty"), its content must first be written to its backing page in the paging file, after which the page is marked as free.
Now there is a free page to hold the data being accessed. At this point the virtual memory manager checks whether the data belongs to newly allocated memory that is being accessed for the first time. If so, it fills the free page with zeros (there is no need to read the backing page from disk, because its content is meaningless). If not, it reads the page's backing page from the paging file into the free page. The state of the page is then changed from free to active.
At this point the data is in a physical memory page and can be accessed by mapping the virtual address to the physical address.
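The fault-handling steps above can be sketched as a short simulation. The model below is written in Python and is only an illustration: the class ToyPager is hypothetical, the free-page list is a plain Python list rather than the page-frame database, and the victim is chosen first-in first-out, which is an assumption because the text does not name the replacement algorithm.

```python
from collections import OrderedDict

PAGE_SIZE = 4096


class ToyPager:
    def __init__(self, frame_count):
        self.free_frames = list(range(frame_count))  # stand-in for the free-page list
        self.resident = OrderedDict()   # virtual page -> [frame, dirty flag], oldest first
        self.backing = {}               # virtual page -> bytes saved in the "paging file"
        self.frames = {}                # frame -> bytearray holding the page content
        self.disk_writes = 0
        self.disk_reads = 0

    def _get_frame(self):
        if self.free_frames:
            return self.free_frames.pop()
        # No free frame: evict the oldest resident page (FIFO is an assumption).
        victim, (frame, dirty) = self.resident.popitem(last=False)
        if dirty:
            self.backing[victim] = bytes(self.frames[frame])  # write back first
            self.disk_writes += 1
        return frame

    def touch(self, page, write=False):
        """Access a committed page, handling a page fault if it is not resident."""
        if page not in self.resident:
            frame = self._get_frame()
            if page in self.backing:
                self.frames[frame] = bytearray(self.backing[page])  # read from disk
                self.disk_reads += 1
            else:
                self.frames[frame] = bytearray(PAGE_SIZE)  # first access: zero-filled
            self.resident[page] = [frame, False]
        if write:
            self.resident[page][1] = True                    # mark the page dirty
            self.frames[self.resident[page][0]][0] = 0xFF    # modify the page
        return self.frames[self.resident[page][0]]


pager = ToyPager(frame_count=2)
pager.touch(0, write=True)   # fault, zero fill, then modified
pager.touch(1)               # fault, zero fill
pager.touch(2)               # no free frame: dirty page 0 is written back first
print(pager.disk_writes, pager.disk_reads)   # 1 0
print(pager.touch(0)[0])     # 255: page 0 comes back from the backing store
```

Evicting the clean page 1 costs no disk write in this model, which is the reason unmodified pages can be reused directly.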
The above describes a successful access, but access does not always succeed. Suppose you define an array that sits at the very end of a page, and the next page happens to be free or reserved (not committed, that is, with no real physical storage). When the program accidentally accesses the array out of bounds, a page fault is raised first. While handling the page fault, the virtual memory manager finds that the page has no backing in the paging file. This is the so-called "access violation". An access violation means that the virtual memory page containing the accessed address has not been committed, so no physical storage corresponds to it. An access violation causes the entire process to exit (crash).
As you can see, the consequences of an out-of-bounds pointer access depend on the actual conditions at run time. When an array is not at the boundary of its page and the out-of-bounds address is still in the same page, the access merely touches other data in the page by mistake and does not crash the process. (The mistake may be a read or a write: a mistaken read affects only the code currently executing, whereas a mistaken write can affect code that runs elsewhere.) Even if the array really is at the boundary of its page and the out-of-bounds pointer falls into the adjacent page, the result is still only a mistaken access, not a crash, if the adjacent page happens to be committed. This also means that the same out-of-bounds pointer error in the same application sometimes crashes at run time and sometimes does not.
Microsoft provides a tool for detecting out-of-bounds pointer accesses called PageHeap. It works by forcing every allocation to be placed at a page boundary and forcing the adjacent page to be free (that is, the adjacent page is never handed out to the program). An out-of-bounds access then immediately causes an access violation and a crash. In this way out-of-bounds pointer errors are exposed during development instead of staying hidden in the release version until an end user runs into them.
4.1.3 Mapping Between Virtual Addresses and Physical Addresses
As mentioned above, once the accessed data is known to be in physical memory, the virtual address must still be converted to a physical address before the data can actually be accessed. This is "address mapping". This section describes how the virtual memory manager maps virtual addresses to physical addresses in Win32.
Win32 uses a two-level table structure to implement address mapping. Because the 4 GB virtual address space is private to each process, each process maintains its own set of tables. The first-level table is called the "page directory"; it is simply one memory page (4 KB = 4,096 bytes) divided into 1,024 entries of 4 bytes each, and each entry is called a "page directory entry" (PDE). The second-level tables are called "page tables", and there are 1,024 of them. Each PDE in the page directory corresponds to one page table at this level, and each page table also occupies one memory page. Like the page directory, the 4 KB (4,096 bytes) of a page table are divided into 1,024 entries of 4 bytes each, and each entry is called a "page table entry" (PTE). Each PTE points to a page frame in physical memory, as shown in Figure 4-2.
Figure 4-2 page table
Win32 provides a 4 GB (32-bit) virtual address space, so each virtual address is a 32-bit integer. It consists of three parts, as shown in Figure 4-3.
Figure 4-3 Virtual Address Space
The first of the three parts, the top 10 bits, is the page directory index, which selects one of the 1,024 entries in the page directory. The value of that entry identifies one of the second-level page tables. The second part of the virtual address, the middle 10 bits, is the page table index, which selects one of the 1,024 entries in the page table just found. The value of that entry identifies a page in physical memory, and this page contains the data referred to by the virtual address. Finally, the third part of the virtual address, the last 12 bits, locates the specific byte within the physical page. Twelve bits are enough to address any byte in the page.
For example, suppose a program accesses a pointer (in Win32 a "pointer" is a virtual address) whose value is 0x2a8e317f. Figure 4-4 shows how this virtual address is mapped to a physical address.
Figure 4-4 Mapping between virtual addresses and physical addresses
Written in binary, 0x2a8e317f is 0010101010, 0011100011, 000101111111, where for convenience the 32 bits are split into groups of 10, 10, and 12 bits. The first 10 bits, 0010101010, locate the page directory entry in the page directory. Because each page directory entry is 4 bytes, these 10 bits are shifted left by 2 bits before being used, giving 001010101000 (0x2a8). Using this value as the offset into the page directory finds the corresponding page directory entry, which points to a page table. In the same way, the second 10 bits, 0011100011, locate the page table entry in that page table. The page table entry points to a page in physical memory, and the final 12 bits, 000101111111, locate the data within that page (these 12 bits need no left shift, because every byte within a physical page must be addressable, whereas in the page directory and page table only the first byte of every 4 needs to be located). The result is the data that the pointer points to.
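The 10/10/12 split of the example address can be reproduced with shifts and masks. The following Python sketch only illustrates the arithmetic; the function name split_virtual_address is made up for this example.

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into its 10/10/12-bit fields."""
    pde_index = (va >> 22) & 0x3FF   # top 10 bits: page directory index
    pte_index = (va >> 12) & 0x3FF   # middle 10 bits: page table index
    offset = va & 0xFFF              # low 12 bits: byte offset within the page
    return pde_index, pte_index, offset


pde_index, pte_index, offset = split_virtual_address(0x2A8E317F)
print(format(pde_index, "010b"), format(pte_index, "010b"), format(offset, "012b"))
# 0010101010 0011100011 000101111111
print(hex(pde_index * 4))   # 0x2a8: byte offset of the 4-byte entry in the page directory
```

Multiplying the index by 4 is the same as the 2-bit left shift described in the text.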
The description so far assumes that the data is already in physical memory. In fact, determining whether the accessed data is in memory is also done during address mapping: Win32 always proceeds with the mapping as if the data were in physical memory. One bit of each page table entry indicates whether the page containing the data is in physical memory, and this bit is checked when the page table entry is obtained. If the page is present, the process is as described in this section. If it is not, a page fault is raised. The page table entry then indicates whether the data is in a paging file. If it is not, the result is an access violation. If it is, the page table entry identifies which paging file holds the data page and where the page starts in that file; the data page is then read from disk into physical memory using this information, and address mapping continues.
As already noted, each process has its own page directory and page tables so that its virtual address space is private. The page directory entries (PDEs) and page table entries (PTEs) have different values in different processes, so the same pointer (virtual address) is mapped to different physical addresses in different processes. This also means that passing pointers between processes is meaningless.
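A two-level lookup with per-process tables can be sketched as follows. This is a toy model in Python: the page directory is a dictionary of dictionaries, a missing entry stands in for a cleared present bit, and the frame numbers are invented for the example.

```python
PAGE_SIZE = 4096


class PageFault(Exception):
    pass


def translate(page_directory, va):
    """Walk a toy two-level table: dict of dicts mapping indexes to frame numbers."""
    pde_index, pte_index, offset = (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF
    page_table = page_directory.get(pde_index)
    if page_table is None or pte_index not in page_table:
        raise PageFault(hex(va))   # no valid entry: the fault handler takes over
    return page_table[pte_index] * PAGE_SIZE + offset


# Two processes, each with its own page directory; the frame numbers are made up.
process_a = {170: {227: 0x00123}}
process_b = {170: {227: 0x00456}}

pointer = 0x2A8E317F
print(hex(translate(process_a, pointer)))   # 0x12317f
print(hex(translate(process_b, pointer)))   # 0x45617f
```

The same pointer value resolves to different physical addresses in the two toy processes, which is why a pointer has no meaning outside the process that owns it.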
4.1.4 Recording the Usage State of the Virtual Address Space
When VirtualAlloc is called to allocate virtual memory, how does the virtual memory manager know which blocks are free and can satisfy the request? In other words, how does Win32 maintain and record the usage state of each process's 4 GB virtual address space, such as the state, size, and start address of each region?
After reading the previous section, you might think that the usage state of the virtual address space could be gathered by walking the page directory and the entries in the page tables. The first problem is efficiency, because such a search would be needed on every allocation request. But efficiency is not the only problem: the approach simply does not work. For reserved pages, the virtual memory manager allocates no physical storage and therefore fills in no page table entries, so walking the page tables cannot tell whether a piece of virtual memory is free or reserved. Moreover, even for committed pages the page tables do not hold complete information. As described in section 4.1.1, the main policy Win32 uses in virtual memory management is demand paging: until the program actually accesses a block of memory, the Win32 virtual memory manager assumes that the block will not be accessed and does no further work for it. That includes not allocating physical memory for it and not even allocating page tables; the storage for the page tables that map the process's virtual addresses to physical addresses is itself allocated on demand.
The Win32 virtual memory manager uses a different data structure to record and maintain the usage and state of each process's 4 GB virtual address space: the virtual address descriptor (VAD). Each process has its own set of VADs, organized as a self-balancing binary tree to make searching efficient. Only reserved or committed memory blocks have a VAD; free blocks do not (so a virtual address block that is not in the VAD tree is free). The organization of the VADs is shown in Figure 4-5.
Figure 4-5 VAD Organizational Structure
(1) When a program requests new memory, the virtual memory manager only needs to search the VAD tree for two adjacent VADs. If the gap between the upper bound of the lower VAD and the lower bound of the higher VAD is large enough for the requested block, the virtual memory between the two can be used.
(2) When committed memory is accessed for the first time, the virtual memory manager follows the process described in the previous section: it assumes that the data page is already in physical memory and converts the virtual address to a physical address. When it reaches the corresponding page directory entry and finds that the entry does not point to a valid page table, it searches the process's VAD tree for the VAD that contains the address. Using the information in the VAD, such as the size and range of the memory block and its starting position in the paging file, it generates the needed page table entries on demand, and address mapping then resumes from the point where the page fault occurred. In other words, when a virtual memory page is committed, only a backing page is set aside in the paging file. No page table containing an entry for the page is created, no page table entry is filled in, and no physical memory page is allocated. Only on the first access to the committed page is the information about its whole region fetched "on demand" from the VAD, the page table created, and the corresponding page table entry filled in.
(3) When reserved memory is accessed, the virtual memory manager maps the virtual address to a physical address as described in the previous section. It reaches the corresponding page directory entry and finds that the entry does not point to a valid page table. It then searches the process's VAD tree and finds the VAD that contains the address. At this point it discovers that the memory block is reserved but not committed, that is, it has no real physical storage, so it throws an access violation and the process exits.
(4) When free memory is accessed, the virtual memory manager maps the virtual address to a physical address as described in the previous section. It reaches the corresponding page directory entry and finds that the entry does not point to a valid page table. It then searches the process's VAD tree and finds that no VAD contains the virtual address. This shows that the virtual page containing the address is free, so it throws an access violation and the process exits.
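The four cases above can be sketched with a simplified stand-in for the VAD tree. The Python model below keeps the regions in a sorted list searched with bisect instead of a self-balancing binary tree, which is a simplification; the class ToyVadSet and its methods are hypothetical, and regions are assumed not to overlap.

```python
import bisect


class ToyVadSet:
    """Sorted list of (start, end, state) regions; end is exclusive."""

    def __init__(self, space_size):
        self.space_size = space_size
        self.regions = []

    def add(self, start, size, state):
        bisect.insort(self.regions, (start, start + size, state))

    def find_gap(self, size):
        """Return the start of the first gap between neighbours that fits size."""
        previous_end = 0
        for start, end, _ in self.regions:
            if start - previous_end >= size:
                return previous_end
            previous_end = end
        if self.space_size - previous_end >= size:
            return previous_end
        return None

    def classify(self, address):
        """State of the region containing address, or 'free' if no VAD covers it."""
        i = bisect.bisect_right(self.regions, (address, float("inf"), "")) - 1
        if i >= 0 and self.regions[i][0] <= address < self.regions[i][1]:
            return self.regions[i][2]
        return "free"


vads = ToyVadSet(space_size=0x80000000)
vads.add(0x00000000, 0x10000, "reserved")
vads.add(0x00020000, 0x10000, "committed")
print(hex(vads.find_gap(0x10000)))   # 0x10000: the gap between the two regions
print(vads.classify(0x00020123), vads.classify(0x00000010), vads.classify(0x00015000))
# committed reserved free
```

In this model only an address classified as committed can be accessed; reserved and free addresses correspond to cases (3) and (4), which end in an access violation.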
4.1.5 Process Working Set
Disk I/O caused by frequent paging greatly reduces the efficiency of a program. For each process, therefore, the virtual memory manager keeps a certain number of memory pages resident in physical memory, tracks performance indicators as the process runs, and adjusts that number dynamically. In Win32, the memory pages of a process that are resident in physical memory are called the "working set" of the process. You can view a process's working set in Task Manager, where the "Memory Usage" column shows the working set size. In Figure 4-6, the number in the green box is the working set size of the Word editor I was using while writing this book, namely 38740 KB.
The working set changes dynamically. When a process starts, only a few code pages and data pages are brought into memory. When the process executes code or accesses data that has not yet been brought in, those code or data pages are loaded into physical memory and the working set grows. But the working set cannot grow without limit. For each process the system defines a default minimum working set (depending on the system's physical memory size, this value may be 20 to 50 MB) and a default maximum working set (depending on the system's physical memory size, this value may be 45 to 345 MB). When the working set has reached the maximum and the process needs to bring another new page into physical memory, the virtual memory manager first swaps some pages of the current working set out of memory and then brings the new page in.
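The effect of a working set limit on the number of page faults can be illustrated with a small simulation. The Python sketch below is a toy model: the class ToyWorkingSet is hypothetical, and it trims the least recently used page when the limit is reached, which is an assumption because the text does not say which pages Windows chooses.

```python
from collections import OrderedDict


class ToyWorkingSet:
    def __init__(self, max_pages):
        self.max_pages = max_pages
        self.pages = OrderedDict()   # resident pages, least recently used first
        self.page_faults = 0

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)      # already resident: just refresh its age
            return
        self.page_faults += 1
        if len(self.pages) >= self.max_pages:
            self.pages.popitem(last=False)    # working set is full: trim one page first
        self.pages[page] = True


def count_faults(max_pages, references):
    working_set = ToyWorkingSet(max_pages)
    for page in references:
        working_set.access(page)
    return working_set.page_faults


references = [0, 1, 2, 0, 1, 3, 0, 1, 2, 3]
print(count_faults(2, references), count_faults(4, references))   # 10 4
```

With a limit of two pages every access in this reference string faults, while a limit of four pages leaves only the four unavoidable first-touch faults.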
Figure 4-6 Working Set