Windows Memory Management

Source: Internet
Author: User

Main content of this article:
1. Basic Concepts: physical memory, virtual memory, physical address, virtual address, logical address, page directory, and page table
2. Windows Memory Management
3. CPU Memory Management
4. CPU page-Based Memory Management
 
I. Basic Concepts
1. Two memory concepts
Physical memory: the memory modules installed on the motherboard. The amount is fixed: the capacity of the installed modules is the size of physical memory (except on systems with integrated graphics, which reserve part of it). If many programs are running, or a program is itself large, physical memory usage grows and physical memory can even be exhausted.
Virtual memory: in short, virtual memory sets aside a page file on the hard disk and uses it as memory. When a running program does not need some of its resources, or when several programs are open but only one is being used, the system does not have to keep all of those resources in physical memory. It places the resources that are not currently in use in virtual memory and brings them back when they are needed.
2. Three address concepts
Physical address: used for addressing at the level of the memory chips. It corresponds to the address bus that connects the processor to memory.
-- Of the three concepts this is the easiest to grasp, but one point is worth mentioning. A physical address can be pictured as the installed memory itself: treat the memory as one large array numbered byte by byte from 0 up to the highest byte, and call the index into that array the physical address. In fact this is only an abstraction that the hardware presents to software; the memory is not really addressed this way. It is therefore more accurate to say that a physical address "corresponds to the address bus". Still, if the details of how memory chips are addressed are set aside, it is acceptable to treat physical addresses as mapping one-to-one onto physical memory. A slightly inaccurate picture may even make the abstraction easier to grasp.
Logical address: the segment-relative offset address generated by a program. For example, in C you can read the value of a pointer variable, or take an address with the & operator; that value is a logical address. It is relative to the base of the current process's data segment and is not an absolute physical address. A logical address equals the physical address only in Intel real mode (real mode has no protected-mode segmentation or paging, so the CPU performs no descriptor-based or page-based translation). In Intel protected mode, the logical address is the offset within the limit of the segment in which the program executes (assuming the code segment and data segment are identical). Application programmers only deal with logical addresses; the segmentation and paging mechanisms are completely transparent to them and concern only system programmers. An application programmer can operate on memory directly, but only on the memory that the operating system has allocated to the program.
Intel kept the segmented memory management of its early processors for compatibility. A logical address is the address used in a machine-language instruction to specify an operand or an instruction. For example, if the linker assigns the address 0x08111111 to a variable A, that address is a logical address.
-- Strictly speaking, this departs from Intel's segment-based definition of a logical address: "a logical address consists of a segment identifier plus an offset that gives the relative address within that segment, written as [segment identifier: offset within segment]". The 0x08111111 in the example should therefore be written as [segment identifier of A: 0x08111111] to be complete.
Linear address (or virtual address)
Like a logical address, it is not a real (physical) address. If the logical address is the address before the hardware's segment translation, the linear address is the address before the hardware's page translation.
-------------------------------------------------------------
Each process has a 4 GB virtual address space.
The 4 GB is divided into three parts:
(1) a part mapped to physical memory
(2) a part mapped to the swap file on the hard disk
(3) a part that is not mapped to anything
Each program uses 4 GB of virtual addresses, while physical memory is accessed with physical addresses. A physical address is the address placed on the address bus, in units of bytes (8 bits).
-------------------------------------------------------------
The CPU takes two steps to convert an address in a virtual memory space into a physical address. It starts from a logical address (which is really an offset within a segment; this point must be understood). The CPU first uses its segmentation unit to convert the logical address into a linear address, and then uses its paging unit to convert the linear address into the final physical address.
Two conversions are cumbersome and not really necessary, because a linear address could be given to the process directly. The redundancy exists only because Intel wanted full backward compatibility.
3. Concepts of page tables and page Directories
Once paging is in use, the 4 GB address space is divided into fixed-size pages. Each page can be mapped to physical memory, mapped to the swap file on the hard disk, or not mapped to anything. For an ordinary program, only a small part of the 4 GB address space is mapped to physical memory, and most of it is not mapped at all. Physical memory is also divided into pages so that the address space can be mapped onto it. On 32-bit Win2k the page size is 4 KB. The information the CPU uses to convert a virtual address into a physical address is stored in structures called the page directory and the page tables.
Physical memory paging: a physical page is 4 KB in size, and physical page 0 starts at physical address 0x00000000. Because the page size is 4 KB, that is, 0x1000 bytes, page 1 starts at physical address 0x00001000 and page 2 starts at physical address 0x00002000. Because the page size is 4 KB, only the upper 20 bits of a 32-bit address are needed to address a physical page.
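The relationship between a physical address and its page number can be sketched in C. This is only an illustration under the 4 KB page size stated above; the names PAGE_SHIFT, page_number, and page_start are made up for the example and do not come from Windows.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                   /* 4 KB = 2^12 bytes (illustrative name) */
#define PAGE_SIZE  (1u << PAGE_SHIFT)    /* 0x1000 bytes */

/* Physical page number: the upper 20 bits of a 32-bit physical address. */
static uint32_t page_number(uint32_t phys_addr)
{
    return phys_addr >> PAGE_SHIFT;
}

/* Starting physical address of a page, given its 20-bit page number. */
static uint32_t page_start(uint32_t page_no)
{
    assert(page_no < (1u << 20));        /* only 2^20 pages fit in 4 GB */
    return page_no << PAGE_SHIFT;
}
```

With these helpers, page_start(1) is 0x00001000 and page_start(2) is 0x00002000, matching the addresses in the text.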
Page directory: a page directory is 4 KB in size and occupies one physical page. Each page directory entry is 4 bytes (32 bits), so a page directory contains 1024 entries. In each entry, the upper 20 bits hold the physical address of a page table (a page table also occupies one physical page) and the lower 12 bits are flags.
Page table: a page table is 4 KB in size and occupies one physical page. Each page table entry is 4 bytes (32 bits), so a page table contains 1024 entries. In each entry, the upper 20 bits hold the physical address of a physical page and the lower 12 bits are flags.
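The 20-bit/12-bit split of an entry can be expressed with two masks. The sketch below is illustrative only: the macro and function names are assumptions made for this example, and the meaning of the individual flag bits is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_ADDR_MASK  0xFFFFF000u  /* upper 20 bits: physical page address */
#define ENTRY_FLAGS_MASK 0x00000FFFu  /* lower 12 bits: flag bits */

/* Physical address of the page table (for a PDE) or of the page (for a PTE). */
static uint32_t entry_page_address(uint32_t entry)
{
    return entry & ENTRY_ADDR_MASK;
}

/* The 12 flag bits stored next to the address. */
static uint32_t entry_flags(uint32_t entry)
{
    return entry & ENTRY_FLAGS_MASK;
}

/* Build an entry from a 4 KB-aligned physical address and a set of flags. */
static uint32_t make_entry(uint32_t page_address, uint32_t flags)
{
    assert((page_address & ENTRY_FLAGS_MASK) == 0);  /* must be page aligned */
    assert((flags & ENTRY_ADDR_MASK) == 0);          /* flags fit in 12 bits */
    return page_address | flags;
}
```

Because every page table and every physical page starts on a 4 KB boundary, the low 12 bits of its address are always zero, which is why they are free to carry flags.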
On x86 systems, the physical address of the page directory is held in the CPU's CR3 register.
4. Converting a virtual address to a physical address
A virtual address is 4 bytes (32 bits) in size and contains the information needed to find the physical address.
The virtual address is divided into three parts:
(1) Bits 31-22 (10 bits) are the index into the page directory.
(2) Bits 21-12 (10 bits) are the index into the page table.
(3) Bits 11-0 (12 bits) are the offset within the page.
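The three-part split can be written as shifts and masks. This is a minimal sketch; the struct va_fields and the function name are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of a 32-bit virtual address (illustrative struct). */
struct va_fields {
    uint32_t dir_index;    /* bits 31-22: index into the page directory */
    uint32_t table_index;  /* bits 21-12: index into the page table */
    uint32_t offset;       /* bits 11-0: offset within the page */
};

static struct va_fields split_virtual_address(uint32_t va)
{
    struct va_fields f;
    f.dir_index   = (va >> 22) & 0x3FFu;
    f.table_index = (va >> 12) & 0x3FFu;
    f.offset      = va & 0xFFFu;
    /* Sanity check: each field stays within its table or page. */
    assert(f.dir_index < 1024 && f.table_index < 1024 && f.offset < 4096);
    return f;
}
```

For the address 0x2A8E317F, for instance, this gives a directory index of 170, a table index of 227, and an offset of 0x17F.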

Conversion process:
First, use CR3 to find the physical page that holds the page directory -> use bits 31-22 of the virtual address as an index to locate the page directory entry on that page -> the entry gives the physical address of the page table for this virtual address -> use bits 21-12 as an index into the page table to find the physical address of the physical page -> add bits 11-0 of the virtual address, as the offset, to the address of the physical page to obtain the physical address corresponding to the virtual address.
In more detail: a virtual address is 4 bytes (32 bits) in size and contains the information needed to find the physical address. It has three parts: bits 31-22 (the upper 10 bits) are the index into the page directory, bits 21-12 (the middle 10 bits) are the index into the page table, and bits 11-0 (the lower 12 bits) are the offset within the page. To convert a virtual address to a physical address, the CPU first finds the physical page holding the page directory from the value in CR3. Using bits 31-22 of the virtual address as an index, it finds the corresponding page directory entry (PDE), which contains the physical address of the page table for this virtual address. With the page table's physical address, it uses bits 21-12 of the virtual address as an index to find the corresponding page table entry (PTE), which contains the physical address of the physical page for this virtual address. Finally, it adds the lower 12 bits of the virtual address, the offset within the page, to the physical address of the physical page to obtain the physical address corresponding to the virtual address.
-------------------------------------------------------------
A page directory has 1024 entries, and the upper 10 bits of the virtual address can index exactly 1024 entries (2 to the power of 10 is 1024). A page table also has 1024 entries, and the middle 10 bits of the virtual address index exactly 1024 entries. The lower 12 bits of the virtual address (2 to the power of 12 is 4096) serve as the offset within the page and can index all 4 KB, that is, every byte of a physical page.
-------------------------------------------------------------
The computation that converts a virtual address to a physical address is as follows. The processor finds the physical page holding the current page directory through CR3. It takes the upper 10 bits of the virtual address and shifts this 10-bit value left by 2 bits (each page directory entry is 4 bytes long, and shifting left by 2 bits is the same as multiplying by 4) to obtain the entry's offset within that page. It reads the 4-byte PDE at that address, which gives the physical page holding the page table for this virtual address. It then takes bits 21-12 of the virtual address and again shifts the 10-bit value left by 2 bits (each page table entry is 4 bytes long, so this too is a multiplication by 4) to obtain the offset within the page table, reads the 4-byte PTE at that address to find the address of the physical page, and finally adds the 12-bit page offset to obtain the physical address.
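The computation can be tried out against a small simulated memory. The sketch below is an assumption-laden model, not how any real MMU is programmed: sim_ram stands in for physical memory, the function names are invented for the example, and flag bits (including the present bit) are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_PAGES 16u                    /* size of the simulated RAM, in pages */
#define SIM_WORDS (SIM_PAGES * 1024u)    /* 1024 32-bit words per 4 KB page */

static uint32_t sim_ram[SIM_WORDS];      /* simulated physical memory */

/* Read the 4-byte entry stored at a physical address in the simulated RAM. */
static uint32_t read_phys32(uint32_t phys_addr)
{
    assert(phys_addr % 4 == 0 && phys_addr / 4 < SIM_WORDS);
    return sim_ram[phys_addr / 4];
}

/* Write a 4-byte entry (used only to set up example tables). */
static void write_phys32(uint32_t phys_addr, uint32_t value)
{
    assert(phys_addr % 4 == 0 && phys_addr / 4 < SIM_WORDS);
    sim_ram[phys_addr / 4] = value;
}

/* Walk the two levels; cr3 is the physical address of the page directory. */
static uint32_t translate(uint32_t cr3, uint32_t va)
{
    /* index << 2 is index * 4, because every entry is 4 bytes long */
    uint32_t pde_addr = (cr3 & 0xFFFFF000u) + (((va >> 22) & 0x3FFu) << 2);
    uint32_t pde      = read_phys32(pde_addr);
    uint32_t pte_addr = (pde & 0xFFFFF000u) + (((va >> 12) & 0x3FFu) << 2);
    uint32_t pte      = read_phys32(pte_addr);
    return (pte & 0xFFFFF000u) + (va & 0xFFFu);
}
```

To use it, place a page directory and a page table in sim_ram with write_phys32 and pass the directory's address as cr3; translate then performs the same two reads and one addition described above.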
-------------------------------------------------------------
A 32-bit pointer can address the range 0x00000000-0xFFFFFFFF, which is 4 GB. In other words, a 32-bit pointer can address every byte of the entire 4 GB address space. One page table entry maps 4 KB of address space to physical memory. A page table has 1024 entries, so it maps 1024 * 4 KB = 4 MB of address space. Each page directory entry corresponds to one page table, and a page directory has 1024 entries, so it corresponds to 1024 page tables. Each page table maps 4 MB of address space, so the 1024 page tables together map 1024 * 4 MB = 4 GB. A process has one page directory. Page by page, therefore, the page directory and page tables can map every page of the 4 GB address space to physical memory.
-------------------------------------------------------------
Each process has its own 4 GB address space, from 0x00000000 to 0xFFFFFFFF. This is implemented by giving each process its own page directory and set of page tables. Because each process has its own page directory and page tables, the physical memory mapped into each process's address space is different. The values that two processes see at the same virtual address (if both have physical memory mapped there) are generally different, because the address usually corresponds to different physical pages.

The lower 2 GB of the 4 GB address space, 0x00000000-0x7FFFFFFF, is the user address space. The upper 2 GB, 0x80000000-0xFFFFFFFF, is the system address space. A program must run with ring 0 privilege to access the system address space.
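The split can be pictured as a simple range check. This is a deliberately simplified sketch with made-up names: real hardware enforces the rule page by page through flag bits in the page table entries, not by comparing addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USER_SPACE_END     0x7FFFFFFFu  /* last address of the user half */
#define SYSTEM_SPACE_START 0x80000000u  /* first address of the system half */

static bool is_user_address(uint32_t va)
{
    return va <= USER_SPACE_END;
}

static bool is_system_address(uint32_t va)
{
    return va >= SYSTEM_SPACE_START;
}

/* Ring 0 may touch either half; the other rings only the user half. */
static bool access_allowed(uint32_t va, unsigned ring)
{
    assert(ring <= 3);                   /* x86 has rings 0 through 3 */
    return ring == 0 || is_user_address(va);
}
```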
 
II. Windows Memory Principles
 
The main content is as follows:
1. Overview
2. Virtual Memory
3. Physical memory
4. Mapping

1. Overview:
On Windows we generally program with linear addresses, which are what we call virtual addresses. A linear address, however, is something the operating system presents to us; it is not where our data actually lives. In other words, if we write the string "UESTC" at virtual address 0x80000000, the string is not actually stored at physical address 0x80000000. The real physical address is located through an array N elements long (it does not matter if this sentence is unclear now; physical addresses are explained later). Why do Windows, and Linux too, address memory this way? The reason is simple. You have probably heard of protected mode. As the name implies, this mode includes measures that protect the system, and linear addressing is one of those measures.
Suppose linear addresses were not used and physical addresses could be accessed directly. When we wrote something to memory, the operating system would not be able to check whether that memory is writable; in other words, it could not control access to pages. That is a frightening situation. It is like Win9x, where an application could write to kernel addresses without the system stopping it.
The operating system's need for security is what drove the adoption of virtual addresses. Inside the CPU there is a component called the MMU (Memory Management Unit), which is responsible for converting linear addresses to physical addresses. Every memory read or write goes through it: once the CPU (the ALU, in this description) has the virtual address, it hands the address to the MMU, which converts it to a physical address, and the data is then read into a register. That is the whole process.

2. Virtual Memory
When programming we work with virtual addresses. Each process has 4 GB of virtual memory. (A small addition: of the 4 GB, the upper 2 GB belongs to the kernel and is shared by all processes, while the lower 2 GB is private to each process, so each process has a different lower 2 GB.) Note, however, that a virtual address is only a promise made by the operating system. Like an idea that has not yet been put into practice, it consumes no real resources. For example, when we write "UESTC" at 0x80000000, the operating system maps that virtual address to a physical address, and writing to the virtual address writes to that physical address. But if we only reserve 1 KB of virtual memory and never read or write it, the system allocates no physical memory. Physical memory is allocated only when the virtual memory is actually used.
Now for some details. Virtual memory management is implemented with a set of data structures, described below.
(If you do not want to read all of them, read only the important parts.)
There is a data structure in EPROCESS as follows:
typedef struct _MADDRESS_SPACE
{
    PMEMORY_AREA MemoryAreaRoot; // Points to the root of a binary search tree, which speeds up lookups of memory areas.
    ...
    ...
    ...
} MADDRESS_SPACE, *PMADDRESS_SPACE;

The node structure of the binary search tree is as follows:
typedef struct _MEMORY_AREA
{
    PVOID StartingAddress; // Starting address of the virtual memory area
    PVOID EndingAddress; // Ending address of the virtual memory area
    struct _MEMORY_AREA *Parent; // Parent of this node
    struct _MEMORY_AREA *LeftChild; // Left child of this node
    struct _MEMORY_AREA *RightChild; // Right child of this node
    ...
    ...
    ...
} MEMORY_AREA, *PMEMORY_AREA;
Each node records one allocated range of virtual memory. To allocate virtual memory, a node is created here; to free it, the corresponding node is deleted. That is the simple version; in practice more work is involved, such as keeping the tree balanced.
When we allocate virtual memory, the system searches this tree, traversing it with some algorithm to find a gap (an unallocated range) that satisfies the request. It then creates a node, links it into the tree, and returns the starting address.
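The gap search can be illustrated with a simplified model. To keep the sketch short it uses a sorted array in place of the binary tree (an in-order traversal of the tree would visit the areas in the same order), and struct area and find_gap are hypothetical names, not the kernel's own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One allocated range [start, end); a stand-in for a MEMORY_AREA node. */
struct area {
    uint32_t start;
    uint32_t end;
};

/*
 * Return the lowest start address of a free gap of at least `size` bytes
 * inside [lo, hi), or 0 if there is none. `areas` must be sorted by start
 * address, non-overlapping, and entirely inside [lo, hi).
 */
static uint32_t find_gap(const struct area *areas, size_t count,
                         uint32_t lo, uint32_t hi, uint32_t size)
{
    uint32_t cursor = lo;                 /* end of the previous area */
    assert(lo != 0 && lo < hi && size > 0);
    for (size_t i = 0; i < count; i++) {
        if (areas[i].start - cursor >= size)
            return cursor;                /* the gap before this area fits */
        cursor = areas[i].end;
    }
    return (hi - cursor >= size) ? cursor : 0;
}
```

This is a first-fit search: it returns the first gap that is large enough, which is only one of several possible policies.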


3. Physical memory
Now for physical memory. Physical memory is managed with an array. You have probably heard of the paging mechanism, so let us talk about paging. On Windows the page size is 4 KB. With 4 GB of physical memory, Windows divides the 4 GB into 4 GB / 4 KB = 1 M pages, and each 4 KB page of physical memory is managed by a data structure named PHYSICAL_PAGE. The structure itself is not reproduced here; only the idea is described.
Assume that for 4 GB of memory the operating system creates an array of PHYSICAL_PAGE. How many elements does the array have? It has 1 M elements, which exactly cover 4 GB. This is the basis of the paging mechanism: put plainly, memory is managed in 4 KB units. This also explains how a physical address is located. A physical memory location can be identified directly by an array index; what is called the physical address here is really the page number of the physical page. The exact address is determined by combining the lower 12 bits of the virtual address with the physical page number.
Let's talk about the management of physical memory. There are three queues in the kernel. The elements in these queues are the PHYSICAL_PAGE structure mentioned above.
They are:
A. allocated memory queue: stores memory in use
B. To-be-cleared memory queue: stores memory that has been released but not yet cleared (zeroed)
C. Idle queue: stores available idle memory

The system management process is as follows:
1) At regular intervals, the system takes elements from queue B, clears them, and puts them in the idle queue (queue C).
2) When physical memory is released, the system takes the memory being released out of queue A and puts it into queue B.
3) When memory is requested, the system takes the memory to be allocated from queue C and puts it into queue A.
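The three transitions can be modelled with three linked lists. This is a rough sketch, not the kernel's implementation: struct phys_page is a simplified stand-in for PHYSICAL_PAGE, all names are invented for the example, and a simple stack is used where a real system would keep queue order.

```c
#include <assert.h>
#include <stddef.h>

enum page_state { PAGE_IDLE, PAGE_IN_USE, PAGE_TO_CLEAR };

struct phys_page {                       /* simplified PHYSICAL_PAGE stand-in */
    enum page_state state;
    struct phys_page *next;
};

struct page_list { struct phys_page *head; size_t count; };

/* Queue A (in use), queue B (to be cleared), queue C (idle). */
struct page_manager { struct page_list in_use, to_clear, idle; };

static void list_push(struct page_list *l, struct phys_page *p)
{
    p->next = l->head;
    l->head = p;
    l->count++;
}

static struct phys_page *list_pop(struct page_list *l)
{
    struct phys_page *p = l->head;
    if (p) { l->head = p->next; p->next = NULL; l->count--; }
    return p;
}

static void list_remove(struct page_list *l, struct phys_page *p)
{
    for (struct phys_page **pp = &l->head; *pp; pp = &(*pp)->next) {
        if (*pp == p) { *pp = p->next; p->next = NULL; l->count--; return; }
    }
}

/* 3) Allocation: take a page from queue C and put it on queue A. */
static struct phys_page *alloc_page(struct page_manager *m)
{
    struct phys_page *p = list_pop(&m->idle);
    if (p) { p->state = PAGE_IN_USE; list_push(&m->in_use, p); }
    return p;
}

/* 2) Release: move the page from queue A to queue B. */
static void release_page(struct page_manager *m, struct phys_page *p)
{
    assert(p->state == PAGE_IN_USE);
    list_remove(&m->in_use, p);
    p->state = PAGE_TO_CLEAR;
    list_push(&m->to_clear, p);
}

/* 1) Periodic cleanup: clear one page from queue B and move it to queue C. */
static int clean_one_page(struct page_manager *m)
{
    struct phys_page *p = list_pop(&m->to_clear);
    if (!p) return 0;
    /* a real system would zero the page contents here */
    p->state = PAGE_IDLE;
    list_push(&m->idle, p);
    return 1;
}
```

Keeping released pages on a separate queue lets the expensive clearing step happen later, in the background, instead of at release time.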

4. Mapping
Mapping starts from the 32-bit virtual address. The CPU has a CR3 register, which holds the page directory address of the current process.

We can divide the conversion process into several steps.
1. The base address of the current process's page directory is found from the value in CR3 (note: the value in CR3 is a physical address). The upper 10 bits of the virtual address are then used as an offset to obtain the specified PDE (Page Directory Entry). The PDE is 4 bytes long; its upper 20 bits are the base address of the page table, and the remaining bits are used for access control and other purposes. The system only needs to test the corresponding bits to enforce memory access control.
2. Using the base address provided by the PDE and the middle 10 bits (bits 21-12) of the virtual address as an offset, the address of the PTE (Page Table Entry) in the page table is obtained. The upper 20 bits of the PTE are the base address of the physical page (in fact, the index into the PHYSICAL_PAGE array), and the remaining bits are again used for access control.
3. The unique physical memory location is determined by adding the lower 12 bits of the virtual address to the base address given by the upper 20 bits of the PTE.


III. CPU Memory Management: How Logical Addresses Are Converted to Linear Addresses
A logical address consists of two parts: a segment identifier and an offset within the segment. The segment identifier is a 16-bit field called the segment selector. Its upper 13 bits are an index number, and its lower 3 bits contain some hardware details: one table indicator bit (TI) and two privilege-level bits.

The last two bits are involved in permission checks, which this article does not cover.

The index number can be understood directly as an array index, so it must correspond to an array. What does it index? The "segment descriptor". A segment descriptor describes one segment (think of the word "segment" as taking a knife and cutting virtual memory into several pieces). Many segment descriptors together form an array called the "segment descriptor table", so the upper 13 bits of the segment selector can be used to look up a specific segment descriptor directly in that table. The descriptor describes a segment. The picture of a segment given above is not accurate, so let us look at what a descriptor actually contains, that is, how it describes a segment, to understand what a segment really is. Each segment descriptor is 8 bytes long.

The fields are complicated. A data structure could be defined for them, but only one field matters here: the Base field, which holds the linear address at which the segment begins.

Intel's design places global segment descriptors in the "global descriptor table (GDT)" and local ones, such as those belonging to an individual process, in a "local descriptor table (LDT)". When is the GDT used and when the LDT? That is decided by the TI field of the segment selector: TI = 0 means the GDT, and TI = 1 means the LDT.

The address and size of the GDT in memory are stored in the CPU's GDTR register, and those of the LDT in the LDTR register.

There are many concepts here, and they read like tongue twisters. Step by step, the conversion looks like this:

Given a complete logical address [segment selector: offset within segment]:
1. Check whether TI in the segment selector is 0 or 1 to learn whether the segment to be converted is described in the GDT or in the LDT. Then obtain that table's address and size from the corresponding register. We now have an array.
2. Take the upper 13 bits of the segment selector as an index and look up the corresponding segment descriptor in this array. Its Base field gives the base address.
3. Base + offset is the linear address.
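The three steps translate almost directly into code. The following is a simplified sketch: the descriptor keeps only the Base field, limit and permission checks are omitted, and all struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Only the field that matters here: the segment's base linear address. */
struct segment_descriptor {
    uint32_t base;
};

/* A descriptor table as located through GDTR or LDTR: address and size. */
struct descriptor_table {
    const struct segment_descriptor *entries;
    uint32_t count;
};

/* Upper 13 bits of the selector: index into the descriptor table. */
static uint32_t selector_index(uint16_t selector)
{
    return (uint32_t)selector >> 3;
}

/* Bit 2 of the selector: 0 selects the GDT, 1 selects the LDT. */
static uint32_t selector_ti(uint16_t selector)
{
    return ((uint32_t)selector >> 2) & 1u;
}

static uint32_t logical_to_linear(const struct descriptor_table *gdt,
                                  const struct descriptor_table *ldt,
                                  uint16_t selector, uint32_t offset)
{
    /* Step 1: pick the table according to TI. */
    const struct descriptor_table *table = selector_ti(selector) ? ldt : gdt;
    /* Step 2: use the index to find the descriptor. */
    uint32_t index = selector_index(selector);
    assert(index < table->count);   /* real hardware raises a fault instead */
    /* Step 3: linear address = Base + offset. */
    return table->entries[index].base + offset;
}
```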

This is quite simple. On the software side, the principle is to prepare the information the hardware needs so that the hardware can perform the conversion. Now let us see how Linux does it.

Segment Management in Linux
Intel requires two conversions. That preserves compatibility, but it is redundant. There is no way around it: if the hardware requires it, the software has to go along, even if only as a formality.
On the other hand, some other hardware platforms have no second conversion stage at all, and Linux has to provide a higher-level abstraction that offers a uniform interface. Linux segment management is therefore really just a way of "fooling" the hardware.

According to Intel's intent, the GDT is for global use and each process uses its own LDT. Linux, however, uses the same segments for all processes to address instructions and data: the user data segment and user code segment, and correspondingly the kernel data segment and kernel code segment. Going through the motions like this is nothing unusual; it is much like writing a year-end summary.
CODE:
#define GDT_ENTRY_DEFAULT_USER_CS 14
#define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS * 8 + 3)

#define GDT_ENTRY_DEFAULT_USER_DS 15
#define __USER_DS (GDT_ENTRY_DEFAULT_USER_DS * 8 + 3)

#define GDT_ENTRY_KERNEL_BASE 12

#define GDT_ENTRY_KERNEL_CS (GDT_ENTRY_KERNEL_BASE + 0)
#define __KERNEL_CS (GDT_ENTRY_KERNEL_CS * 8)

#define GDT_ENTRY_KERNEL_DS (GDT_ENTRY_KERNEL_BASE + 1)
#define __KERNEL_DS (GDT_ENTRY_KERNEL_DS * 8)
Replace the macro with a numeric value:
CODE:
#define __USER_CS 115 [0000000001110 0 11]
#define __USER_DS 123 [0000000001111 0 11]
#define __KERNEL_CS 96 [0000000001100 0 00]
#define __KERNEL_DS 104 [0000000001101 0 00]
The square brackets show the 16-bit binary representation of the four segment selectors (index, TI, privilege bits). Their index numbers and TI values can be read off directly:
CODE:
__USER_CS index = 14 TI = 0
__USER_DS index = 15 TI = 0
__KERNEL_CS index = 12 TI = 0
__KERNEL_DS index = 13 TI = 0
TI is 0 in every case, so the GDT is used. Look at entries 12-15 of the initialized GDT (arch/i386/head.S):
CODE:
.quad 0x00cf9a000000ffff /* 0x60 kernel 4GB code at 0x00000000 */
.quad 0x00cf92000000ffff /* 0x68 kernel 4GB data at 0x00000000 */
.quad 0x00cffa000000ffff /* 0x73 user 4GB code at 0x00000000 */
.quad 0x00cff2000000ffff /* 0x7b user 4GB data at 0x00000000 */

Expanding them according to the segment descriptor layout shows that the base address fields (bits 16-39 and bits 56-63) are all 0, that is, the base address of all four segments is 0.
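The expansion can be checked mechanically. The sketch below extracts the base and limit fields from an 8-byte descriptor according to the x86 descriptor layout (base in bits 16-39 and 56-63, limit in bits 0-15 and 48-51); the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Base address: descriptor bits 16-39 (low 24 bits) and 56-63 (high 8 bits). */
static uint32_t descriptor_base(uint64_t d)
{
    uint32_t low24 = (uint32_t)((d >> 16) & 0xFFFFFFu);
    uint32_t high8 = (uint32_t)((d >> 56) & 0xFFu);
    return (high8 << 24) | low24;
}

/* Limit: descriptor bits 0-15 (low 16 bits) and 48-51 (high 4 bits). */
static uint32_t descriptor_limit(uint64_t d)
{
    uint32_t low16 = (uint32_t)(d & 0xFFFFu);
    uint32_t high4 = (uint32_t)((d >> 48) & 0xFu);
    return (high4 << 16) | low16;
}
```

Applied to 0x00cf9a000000ffff, descriptor_base returns 0 and descriptor_limit returns 0xFFFFF; the other three entries give the same base and limit.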
So, given an offset within a segment, the conversion formula above yields 0 + offset as the linear address, and an important conclusion follows: "In Linux, the logical address and the linear address are always identical (identical, not merely similar as some people say); that is, the offset field of a logical address always has the same value as the linear address."
Many details have been ignored here, such as segment permission checks.
In Linux, most processes do not use an LDT, unless Wine is used to run Windows programs.

IV. CPU Page-Based Memory Management

The CPU's paging unit is responsible for translating a linear address into a physical address. For reasons of management and efficiency, linear addresses are divided into fixed-length groups called pages. For example, a 32-bit machine has at most 4 GB of linear addresses, and 4 KB can be used as the page size. The whole linear address space is then divided into a large array total_page[2^20], with 2 to the power of 20 pages in all. This large array is called the page directory. Each entry in it is an address: the address of the corresponding page.
The other kind of "page" is called a physical page, or page frame. The paging unit divides all physical memory into fixed-length units for management, and a physical page generally has the same length as a memory page, so the two correspond one to one.
Note that the total_page array has 2^20 members, each of which is an address (on a 32-bit machine an address is 4 bytes), so representing such an array takes 4 MB of memory. To save space, a two-level scheme is introduced to organize the paging unit. It works as follows:

1. Within the paging unit, the page directory is unique, and its address is held in the CPU's CR3 register. This is the starting point of address translation.
2. Every active process has its own virtual memory (and therefore its own page directory), which corresponds to its own page directory address. To run a process, its page directory address must be loaded into CR3, and the previous process's address must be saved.
3. Each 32-bit linear address is divided into three parts: page directory index (10 bits), page table index (10 bits), and offset (12 bits).
The conversion follows these steps:
1. Fetch the process's page directory address from CR3 (the operating system is responsible for loading this address into the register when it schedules the process).
2. Use the upper 10 bits of the linear address as an index to find the corresponding entry in the page directory array. Because the two-level scheme is used, an entry in the page directory is no longer the address of a page but the address of a page table (an additional array), and the page address is stored in the page table.
3. Use the middle 10 bits of the linear address as an index into the page table (also an array) to find the starting address of the page.
4. Add the starting address of the page and the lower 12 bits of the linear address to get the physical address we want.

This conversion process is quite simple, and it is done entirely by hardware. There is one extra step, but it is worthwhile because it saves a great deal of memory. Let us verify two things:
1. Can the two-level scheme still represent 4 GB of addresses?
Page directory: 2^10 entries, so there are that many page tables.
Each page table covers 2^10 pages.
Each page addresses 2^12 bytes.
The total is still 2^32 bytes = 4 GB.

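2. Does the two-level scheme really save space?
One page directory plus one page table occupy (2^10 * 4 + 2^10 * 4) bytes = 8 KB. But is memory really saved? With every page table present the total actually grows: (4 + 2^10 * 4 + 2^10 * 2^10 * 4) bytes = 4100 KB + 4 bytes, counting the 4-byte register, the page directory, and all 1024 page tables. The saving comes from the fact that a page table only has to exist for a 4 MB region that actually contains mapped pages. Since a typical process maps only a small part of its 4 GB address space, it needs the page directory and a few page tables rather than all 1024.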
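The arithmetic is easy to reproduce. The sketch below compares the cost of a single flat array with the cost of the two-level scheme when only some page tables exist; it assumes that a page table is allocated only for a 4 MB region that contains at least one mapped page, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_SIZE        4u     /* bytes per directory or table entry */
#define ENTRIES_PER_TABLE 1024u

/* Bytes needed by one flat array with an entry for every 4 KB page. */
static uint32_t flat_table_bytes(void)
{
    return ENTRIES_PER_TABLE * ENTRIES_PER_TABLE * ENTRY_SIZE;    /* 4 MB */
}

/* Bytes needed by the two-level scheme with `tables_in_use` page tables. */
static uint32_t two_level_bytes(uint32_t tables_in_use)
{
    assert(tables_in_use <= ENTRIES_PER_TABLE);
    uint32_t directory = ENTRIES_PER_TABLE * ENTRY_SIZE;          /* 4 KB */
    uint32_t one_table = ENTRIES_PER_TABLE * ENTRY_SIZE;          /* 4 KB */
    return directory + tables_in_use * one_table;
}
```

A process whose mappings fall into three 4 MB regions needs two_level_bytes(3) = 16 KB of tables, compared with the 4 MB of the flat array; only with all 1024 page tables present does the two-level scheme cost slightly more.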

It is worth mentioning that although entries in the page directory and the page tables are both 4 bytes (32 bits), only their upper 20 bits are used for the address; the lower 12 bits of the address are treated as 0. For a page table entry this is easy to understand: the entry refers to a page, pages are 4 KB in size and aligned on 4 KB boundaries, so the lower 12 bits of a page address are always 0, which also makes the computation simple. Why are the lower 12 bits masked in a page directory entry as well? A page table has 1024 entries of 4 bytes, so it also occupies exactly one 4 KB page and its address has the same alignment. This lets the page directory and the page tables use the same data structure, which is convenient.

This article only introduces the general principle of the conversion. Extended paging, page protection, and PAE-mode paging would take too long to cover here; refer to specialized books for those topics.
 
 
 
Win32 uses a two-level table structure to implement address mapping. Because each process has a private 4 GB virtual memory space, each process has its own table hierarchy for address mapping.
The first level is called the page directory, and it is itself one memory page. A Win32 memory page is 4 KB in size. This page is divided into 1024 entries of 4 bytes each, and each entry is called a "page directory entry" (PDE).
The second level consists of page tables, of which there are 1024. A page table is structured like the page directory: each page table is also one 4 KB memory page, divided into 1024 entries of 4 bytes each, and each entry is called a page table entry (PTE). There are therefore 1024 x 1024 page table entries. Each page table entry corresponds to one "memory page" in physical memory, for a total of 1024 x 1024 physical memory pages of 4 KB each. In this way the full 4 GB of virtual memory can be indexed.
As shown in the figure (note that the size of the page directory entries and page table entries in the figure should be 4 bytes, not 4 KB):

Win32 provides a 4 GB virtual address space, so each virtual address is a 32-bit integer. This is what we usually call a pointer, so a pointer is 4 bytes in size. It consists of three parts, as shown:

The first part, the highest 10 bits, is the page directory index, used to locate a page directory entry; 10 bits address exactly 1024 entries. The page directory entry in turn gives the page table it corresponds to. The second part, the next 10 bits, is the index into that page table, used to locate one of its 1024 page table entries, and the page table entry gives the physical memory page. The third part, the lowest 12 bits, locates the byte within the physical memory page; a page is 4 KB in size, so 12 bits are enough.
Example:
Assume that a thread is accessing the data pointed to by a pointer (a Win32 pointer holds a virtual address) whose value is 0x2A8E317F. The process is as follows:

0x2A8E317F is written in binary as 0010101010_0011100011_000101111111; for convenience, it is divided into three parts.
First, locate the page directory entry using 0010101010. Because each page directory entry is 4 bytes, shift 0010101010 left by two bits to get 001010101000 (0x2A8), which is the byte offset of the entry within the page directory. That entry then identifies a page table in the next level.
Then, using 0011100011, find the page table entry in the page table located in the previous step. The addressing method is the same as above. The page table entry gives the corresponding physical memory page.
Finally, using 000101111111, find the offset within the page.
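
The three-way split in this example can be checked with a short sketch. It assumes only the 10/10/12 bit layout described above; the function name `split_virtual_address` is illustrative.

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    pde_index = (va >> 22) & 0x3FF   # highest 10 bits
    pte_index = (va >> 12) & 0x3FF   # next 10 bits
    offset = va & 0xFFF              # lowest 12 bits
    return pde_index, pte_index, offset

pde_index, pte_index, offset = split_virtual_address(0x2A8E317F)
print(f"{pde_index:010b} {pte_index:010b} {offset:012b}")
# 0010101010 0011100011 000101111111

# Each entry is 4 bytes, so the byte offset inside the page directory is index * 4.
print(hex(pde_index << 2))  # 0x2a8
```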
The above assumes that the data is already in physical memory. In fact, determining whether the accessed data is in memory is also part of the address mapping process. The Win32 system always starts address mapping as if the data were in physical memory. A flag in the page table entry indicates whether the page containing the data is in physical memory. If it is, address mapping proceeds directly. Otherwise, a page fault is raised. At this point the page table entry also indicates whether the page is in a page file (external storage). If it is not, the access is a violation and the program exits. If it is, the page table entry identifies which page file holds the page, the page is brought into physical memory, and address mapping continues.
Each process has a private 4 GB virtual address space, which means each process has its own page directory and page table structure. For different processes, the same pointer value (virtual address) maps to different physical addresses, so passing pointers between processes is meaningless.
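
A toy simulation can make both points concrete: the present check during translation, and the fact that the same pointer maps to different physical addresses in different processes. This is a sketch, not how Windows implements it. It assumes the x86 convention that bit 0 of an entry is the present flag, models tables as Python dictionaries, and uses made-up table addresses; handling the fault by reading from a page file is left out.

```python
PRESENT = 0x1  # bit 0 of an x86 entry: the page is in physical memory

class PageFault(Exception):
    """Raised when an entry on the translation path is not present."""

def translate(page_directory, page_tables, va):
    """Two-level translation over dictionary-based toy tables."""
    pde = page_directory.get((va >> 22) & 0x3FF, 0)
    if not pde & PRESENT:
        raise PageFault(hex(va))
    table = page_tables[pde & 0xFFFFF000]
    pte = table.get((va >> 12) & 0x3FF, 0)
    if not pte & PRESENT:
        raise PageFault(hex(va))
    return (pte & 0xFFFFF000) | (va & 0xFFF)

# Two toy processes, each with its own directory and tables (addresses are hypothetical).
proc_a = ({0xAA: 0x00010000 | PRESENT}, {0x00010000: {0xE3: 0x00400000 | PRESENT}})
proc_b = ({0xAA: 0x00020000 | PRESENT}, {0x00020000: {0xE3: 0x00900000 | PRESENT}})

va = 0x2A8E317F
print(hex(translate(*proc_a, va)))  # 0x40017f
print(hex(translate(*proc_b, va)))  # 0x90017f
```

In a real system the fault handler would decide between loading the page from the page file and reporting an access violation; here the exception simply marks where that decision would be made.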


Page-Based Memory Management in Linux
In principle, Linux only needs to allocate the required data structures for each process and place them in memory. Then, when a process is scheduled, it switches the CR3 register, and the rest is left to the hardware (in practice it is much more complicated, but I only analyze the most basic flow here).

I described the i386 two-level page management architecture above. However, some CPUs have a three-level or even four-level architecture. Linux abstracts over these at a higher level and presents a unified interface for every CPU: it provides a four-level page management architecture that is compatible with CPUs using two-level, three-level, and four-level schemes. The four levels are:

Page global directory, PGD (corresponds to the page directory above)
Page upper directory, PUD (new)
Page middle directory, PMD (new)
Page table, PT (corresponds to the page table above).

Based on the hardware translation principle, the whole translation is just a series of successive array lookups, as shown:

So how does 32-bit hardware with a two-level architecture work with a four-level translation? Let's look at how the linear address is divided in this case.
From the hardware point of view, the 32-bit address is divided into three parts. In other words, whatever the software does, once things reach the hardware, it only recognizes these three parts.
From the software point of view, two more parts are introduced in the middle, giving five parts. For two-level hardware this is easy to arrange: when dividing the address, set the bit lengths of the page upper directory and the page middle directory to 0.
In this way, the operating system sees five parts while the hardware still divides the address into its fixed three parts, and nothing goes wrong. Software and hardware coexist in harmony.

Likewise, with 64-bit addresses and a CPU with a four-level translation architecture, the two middle parts are simply not set to 0, and software and hardware are in harmony again. Abstraction is powerful!

For example, suppose a logical address has already been converted to the linear address 0x08147258, which in binary is:
0000100000 0101000111 001001011000
The kernel divides the address as follows:
PGD = 0000100000
PUD = 0
PMD = 0
PT = 0101000111
Offset = 001001011000
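
The five-part division with two zero-width fields can be sketched as a generic splitter. This is an illustration, not kernel code: `split_linear` and the width list are names invented here, and the widths 10/0/0/10/12 are the ones given in the text for two-level 32-bit hardware.

```python
def split_linear(addr, widths):
    """Split addr into fields, given bit widths from most to least significant."""
    fields = []
    shift = sum(widths)
    for width in widths:
        shift -= width
        fields.append((addr >> shift) & ((1 << width) - 1))
    return fields

# PGD = 10 bits, PUD = 0 bits, PMD = 0 bits, PT = 10 bits, offset = 12 bits.
pgd, pud, pmd, pt, offset = split_linear(0x08147258, [10, 0, 0, 10, 12])
print(f"PGD={pgd:010b} PUD={pud} PMD={pmd} PT={pt:010b} Offset={offset:012b}")
# PGD=0000100000 PUD=0 PMD=0 PT=0101000111 Offset=001001011000
```

A zero-width field has a mask of 0, so its index is always 0: it selects the only element of a one-element array. Passing non-zero widths for the two middle fields would model hardware that really has four levels.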

Now we can understand how Linux accommodates the hardware. Because the hardware does not see the PUD and PMD at all, the PGD entry must in effect lead directly to the address of the PT, rather than requiring real lookups in PUD and PMD arrays (although both have length 0 in the linear address, 2^0 = 1, so each is an array with a single element). How does the kernel arrange this?
From the software point of view, each such array has only one 32-bit entry, which can hold an address pointer of the same length as a PGD entry. So the mapping from PGD to PUD, and from PUD to PMD, simply passes the original value through unchanged. In this way the address logically points to a PUD and then to a PMD, but physically points straight to the corresponding PT, because the hardware does not know that the PUD or PMD exist.

Then the address is handed to the hardware. The hardware divides it as follows:
Page directory = 0000100000
PT = 0101000111
Offset = 001001011000
First, index the page directory array with 0000100000 (32), read the entry, and take its high 20 bits to get the address of the page table (page tables are allocated dynamically by the kernel). Then index the page table with 0101000111 in the same way to get the page base address, and add the offset to obtain the final physical address.
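
To see that the software's four-level view and the hardware's two-level view land on the same physical address, here is a small simulation. It is a sketch under stated assumptions: tables are modelled as dictionaries, the table and page addresses are made up, and the "folding" of PUD and PMD is represented simply as passing the PGD entry through unchanged, as the text describes.

```python
def hardware_walk(page_directory, memory, addr):
    """Two-level walk, as the 32-bit hardware performs it."""
    pde = page_directory[addr >> 22]
    page_table = memory[pde & 0xFFFFF000]
    pte = page_table[(addr >> 12) & 0x3FF]
    return (pte & 0xFFFFF000) + (addr & 0xFFF)

def software_walk(pgd, memory, addr):
    """Four-level walk, as the kernel sees it, with PUD and PMD folded."""
    pgd_entry = pgd[addr >> 22]
    pud_entry = pgd_entry   # PUD has 0 index bits: one element, value passed through
    pmd_entry = pud_entry   # PMD likewise
    page_table = memory[pmd_entry & 0xFFFFF000]
    pte = page_table[(addr >> 12) & 0x3FF]
    return (pte & 0xFFFFF000) + (addr & 0xFFF)

# Hypothetical tables: directory entry 32 points at a page table at 0x00300000,
# whose entry 0x147 points at the physical page 0x00ABC000.
page_directory = {32: 0x00300000}
memory = {0x00300000: {0x147: 0x00ABC000}}

addr = 0x08147258
print(hex(hardware_walk(page_directory, memory, addr)))  # 0xabc258
print(hex(software_walk(page_directory, memory, addr)))  # 0xabc258
```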
 
 
V. Storage Methods
Protected mode is the foundation of modern operating systems, and understanding it is the first mountain we need to climb. Protected mode is defined in contrast to real mode; they are two working modes of the processor. Long ago, DOS ran in real mode, while today's Windows runs in protected mode. The two modes are quite different.
Real mode dates from the 8086/8088, so it resembles the simple mode in which a single-chip microcomputer runs. After the computer starts, it first enters real mode. The 8086/8088 has only 20 address lines, so its addressing range is 2^20 bytes, that is, 1 MB. Memory is accessed through the familiar seg:offset logical address. The CPU converts such a logical address into a 20-bit physical address by shifting seg left by four bits and adding offset. For example, for the address 1000h:5678h, the physical address is 10000h + 5678h = 15678h. Real mode is retained in later CPUs, but its limitations are obvious: a seg:offset logical address can reach only slightly more than 1 MB of memory, so memory beyond that is hard to access even on a CPU with 32 address lines. As computers developed, real mode became increasingly unable to meet the needs of resource management (memory, CPU time, and so on), which led to a new management mode: protected mode.
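
The real-mode calculation is small enough to write out. This is a minimal sketch; the function name is illustrative, and the final mask models the 20 address lines of the 8086/8088 (on later CPUs the sum can exceed 20 bits, which is the "slightly more than 1 MB" mentioned above).

```python
def real_mode_physical(seg, offset):
    """8086 real mode: physical = (seg << 4) + offset, limited to 20 address lines."""
    return ((seg << 4) + offset) & 0xFFFFF

print(hex(real_mode_physical(0x1000, 0x5678)))  # 0x15678
```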
The 80386 and later processors are far more capable than their predecessors, but only in protected mode can they use those capabilities fully. In protected mode, all 32 address lines are active and 4 GB of physical address space can be addressed. An extended segmentation mechanism and an optional paging mechanism provide hardware support for memory sharing and protection as well as for virtual memory. Multitasking is supported, and four privilege levels with a complete privilege checking mechanism provide data security and confidentiality. After the computer starts, it first enters real mode; only after the corresponding registers are set can it enter protected mode (introduced later). Protected mode is a way of working as a whole, but discussing its parts separately makes it easier to learn.

The storage mode is mainly reflected in how memory is accessed. For compatibility and because of IA-32 architecture constraints, protected mode keeps the real-mode seg:offset form. In fact, seg:offset is only a shell in protected mode; the internal mechanism is different from real mode. In protected mode, a logical address is not converted directly to a physical address. Instead, it is first converted to a linear address and then to a physical address.

The linear address is a new concept, but it is not complicated. It is simply the continuous range 00000000h ~ FFFFFFFFh (0 ~ 4 GB) that can be expressed in 32 bits. It is, however, a conceptual, abstract address that does not physically exist. Linear addresses exist mainly because of paging. After obtaining a logical address, the processor first converts it to a linear address through the segmentation mechanism, and the linear address is then converted to a physical address through the paging mechanism so that the data can finally be read.
 
The segmentation mechanism is mandatory and the paging mechanism is optional. When paging is not used, linear addresses map directly to physical addresses. Paging exists mainly to implement virtual memory (it is described later). The following describes how a logical address is converted to a linear address.
Segmentation cannot be bypassed in protected mode, which brings us back to the seg:offset address structure. In protected mode, seg has a new name: the segment selector. The segment selector, the GDT, and the LDT together form the storage structure of protected mode. GDT and LDT stand for Global Descriptor Table and Local Descriptor Table. A descriptor table is a linear table (an array) that stores descriptors.
The descriptor is a new concept in protected mode. It is an 8-byte data structure that describes a segment. The segment base address recorded in the descriptor is added to the offset of the logical address (sel:offset) to produce a linear address. A descriptor consists of three parts: the segment base address (Base), the segment limit (Limit), and the segment attributes (Attr). A task involves multiple segments, and each segment needs a descriptor. To make them easier to organize and manage, the 80386 and later processors collect descriptors into tables, the descriptor tables. In protected mode there are three kinds of descriptor table: the Global Descriptor Table (GDT), the Local Descriptor Table (LDT), and the Interrupt Descriptor Table (IDT) (the IDT will be discussed later).
(1) Global Descriptor Table (GDT). There is only one GDT in the whole system. The GDT can be placed anywhere in memory, but the CPU must know where it starts, that is, its base address. Intel's designers provide a register, GDTR, to hold the GDT base address. After the programmer places the GDT somewhere in memory, the LGDT instruction loads its base address into this register, and from then on the CPU uses the register's content to access the GDT. GDTR holds the base address of the GDT in memory and its table limit.
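
Since the GDT is just an array of 8-byte descriptors bounded by a limit, looking one up can be sketched as follows. The text does not give the byte layout, so this example assumes the standard x86 layout (limit bits 0..15 in bytes 0-1, base bits 0..23 in bytes 2-4, access byte in byte 5, limit bits 16..19 and flags in byte 6, base bits 24..31 in byte 7) and assumes that the GDTR limit is the table size in bytes minus one. The function names, the way `attr` is packed, and the sample table are illustrative only.

```python
def decode_descriptor(raw):
    """Decode an 8-byte segment descriptor into (base, limit, attr)."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    limit = raw[0] | (raw[1] << 8) | ((raw[6] & 0x0F) << 16)
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16) | (raw[7] << 24)
    attr = raw[5] | ((raw[6] & 0xF0) << 4)   # access byte plus flag nibble (packing chosen here)
    return base, limit, attr

def read_descriptor(gdt_image, gdtr_limit, index):
    """Fetch descriptor `index` from a GDT byte image, honouring the GDTR limit."""
    start = index * 8
    if start + 7 > gdtr_limit:
        raise IndexError("descriptor lies beyond the GDT limit")
    return decode_descriptor(gdt_image[start:start + 8])

# Hypothetical two-entry GDT: a null descriptor, then a segment with base 11111111h.
gdt_image = bytes(8) + bytes([0xFF, 0xFF, 0x11, 0x11, 0x11, 0x92, 0xCF, 0x11])
gdtr_limit = len(gdt_image) - 1

base, limit, attr = read_descriptor(gdt_image, gdtr_limit, 1)
print(hex(base), hex(limit), hex(attr))  # 0x11111111 0xfffff 0xc92
```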

(2) Segment selector (Selector). The GDT located through GDTR is accessed by means of a segment selector (the segment register of real mode), step 3. A segment selector is a 16-bit value held in a segment register (the same size as a segment register in real mode).

A segment selector consists of three parts: the descriptor index, TI, and the requested privilege level (RPL). The descriptor index gives the position of the required segment descriptor in the descriptor table; together with the table base address stored in GDTR, it locates the descriptor (step 3). The segment base address in that descriptor plus the OFFSET of the logical address (SEL:OFFSET) then gives the linear address (step 3). TI is a single bit: 0 means the descriptor is in the GDT, and 1 means it is in the LDT. The requested privilege level (RPL) gives the privilege level of the selector; there are four privilege levels (0, 1, 2, and 3). For example, given the logical address 21h:12345678h, convert it to a linear address:
A. The selector SEL = 21h = 0000000000100 0 01b. The index is 100b = 4, so descriptor 4 in the table is selected; TI = 0 means the descriptor is in the GDT; the 01b on the right means RPL = 1.
B. OFFSET = 12345678h. If the base address of the segment described by GDT descriptor 4 is 11111111h, the linear address = 11111111h + 12345678h = 23456789h.
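
The 21h:12345678h example can be reproduced with a short sketch. The GDT is modelled here as a dictionary from descriptor index to segment base address, which is a simplification for illustration; the function names are made up, and limit and privilege checks are omitted.

```python
def decode_selector(sel):
    """Split a 16-bit selector into (index, TI, RPL)."""
    index = sel >> 3        # bits 15..3
    ti = (sel >> 2) & 1     # bit 2: 0 = GDT, 1 = LDT
    rpl = sel & 3           # bits 1..0
    return index, ti, rpl

def to_linear(sel, offset, gdt_bases):
    """Logical address (sel:offset) -> linear address, GDT selectors only."""
    index, ti, _rpl = decode_selector(sel)
    if ti != 0:
        raise ValueError("this sketch only handles selectors that refer to the GDT")
    return (gdt_bases[index] + offset) & 0xFFFFFFFF

gdt_bases = {4: 0x11111111}   # base address assumed in the example above
print(decode_selector(0x21))                         # (4, 0, 1)
print(hex(to_linear(0x21, 0x12345678, gdt_bases)))   # 0x23456789
```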
(3) Local Descriptor Table (LDT). There can be several local descriptor tables, one per task. We can think of it this way: the GDT is a first-level descriptor table and the LDT is a second-level descriptor table.

The LDT and GDT are essentially the same, except that the LDT is nested inside the GDT. LDTR records the starting position of the local descriptor table. Unlike GDTR, LDTR holds a segment selector. Because an LDT is itself a block of memory, and therefore a segment, it has a descriptor of its own. That descriptor is stored in the GDT, and there is a selector that refers to it; LDTR is loaded with such a selector. LDTR can be changed at any time in a program with the lldt instruction. If Selector 2 is loaded, LDTR points to the table LDT2. For example, suppose we want to access offset 12345678h in the segment described by the third descriptor in LDT2:
1. First load LDTR so that it points to LDT2, using the lldt instruction to load Selector 2 into LDTR.
2. When accessing memory through the logical address (SEL:OFFSET), index = 3 selects the third descriptor and TI = 1 means the descriptor is in the LDT. Since LDTR now points to LDT2, the descriptor is taken from LDT2. The SEL value is therefore 1Ch (binary 11 1 00b), and OFFSET = 12345678h, giving the logical address 1Ch:12345678h.
3. SEL selects the descriptor, and OFFSET is added to the base address in the descriptor to obtain the linear address. For example, if the base address is 11111111h, the linear address = 11111111h + 12345678h = 23456789h.
4. If you then want to access the third descriptor in LDT1, use the lldt instruction to load Selector 1 and repeat steps 2 and 3 (because LDTR now points to LDT1).
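
The four steps above can be sketched as a toy model. Everything here is a simplification for illustration: `ToyCpu` is an invented class, tables map a descriptor index directly to a segment base address, the numeric values chosen for Selector 1 and Selector 2 are hypothetical, and the base address used for LDT1 is made up so that the two tables give different results.

```python
def make_selector(index, ti, rpl):
    """Pack a 16-bit selector: index in bits 15..3, TI in bit 2, RPL in bits 1..0."""
    return (index << 3) | (ti << 2) | rpl

class ToyCpu:
    """Toy model: each table maps descriptor index -> segment base address."""

    def __init__(self, gdt, ldt_by_selector):
        self.gdt = gdt
        self.ldt_by_selector = ldt_by_selector  # stands in for the LDT descriptors in the GDT
        self.ldtr = None

    def lldt(self, selector):
        """Model of the lldt instruction: load a selector into LDTR."""
        self.ldtr = selector

    def linear(self, sel, offset):
        index, ti = sel >> 3, (sel >> 2) & 1
        table = self.ldt_by_selector[self.ldtr] if ti else self.gdt
        return (table[index] + offset) & 0xFFFFFFFF

SELECTOR_1, SELECTOR_2 = 0x08, 0x10       # hypothetical selectors naming LDT1 and LDT2
cpu = ToyCpu(gdt={}, ldt_by_selector={
    SELECTOR_1: {3: 0x20000000},           # LDT1: made-up base for descriptor 3
    SELECTOR_2: {3: 0x11111111},           # LDT2: base used in the example above
})

sel = make_selector(index=3, ti=1, rpl=0)
cpu.lldt(SELECTOR_2)                                 # step 1
print(hex(sel), hex(cpu.linear(sel, 0x12345678)))    # 0x1c 0x23456789 (steps 2 and 3)
cpu.lldt(SELECTOR_1)                                 # step 4
print(hex(cpu.linear(sel, 0x12345678)))              # 0x32345678
```

The same logical address 1Ch:12345678h resolves differently after LDTR changes, which is the point of the example.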
Each process has its own code segment, data segment, and stack segment. With a local descriptor table, the code, data, and stack segments of each process can be encapsulated together, and switching to a different process's segments only requires changing LDTR.
The storage mode is the basis of protected mode, and it is best learned by comparing it with the storage mode of real mode. The general idea is: use the segment selector to find the descriptor of the segment in the descriptor table, determine the segment's location from the base address in the descriptor, and then compute the linear address as OFFSET plus the segment base address.
