Using WinDbg to Observe the Paging Mechanism with PAE Enabled
I introduced the CPU's paging mechanism in section 2.7 of Software Debugging as supporting material for the book. Because of length constraints, I did not discuss the case where PAE is enabled. Since the book was published, many readers have shown interest in this topic, and some have run into systems with PAE enabled. I therefore decided to write this short article to give an overview of PAE and to show how to perform the experiment from section 2.7.5 on a system with PAE enabled.
PAE stands for Physical Address Extension. Simply put, it extends the physical addressing capability of IA-32 processors from the original 4 GB to 64 GB. Addressing 4 GB requires a 32-bit physical address; likewise, addressing 64 GB requires a 36-bit physical address. For this reason, PAE is also known as PAE-36bit.
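As a quick check of this arithmetic, the following Python sketch computes how much memory a given physical address width can reach. The helper name addressable_gb is made up for this illustration.

```python
GB = 1 << 30  # number of bytes in one GB (2**30)

def addressable_gb(address_bits: int) -> int:
    """Return how many GB can be addressed with the given number of address bits."""
    return (1 << address_bits) // GB

print(addressable_gb(32))  # 4
print(addressable_gb(36))  # 64
```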
Judging from the CPU pins, a CPU that supports PAE should in principle have 36 address lines. However, because the front-side bus of IA-32 CPUs is 64 bits wide and can transfer 8 bytes at a time, the low three address lines are omitted. Take the Pentium 4 as an example: its address lines are A[35:3]#. That is, when the CPU accesses memory, the address it puts on the front-side bus always has its low 3 bits equal to 0, which in effect aligns the address on an 8-byte boundary.
PAE was introduced with the Pentium Pro, the first processor of the P6 generation (see page 35 of Software Debugging), and IA-32 CPUs have supported it ever since. System software can use the CPUID instruction to check whether the CPU it is running on supports PAE. Bit 5 of control register CR4 (CR4[5]) is used to enable PAE (see page 44 of Software Debugging).
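The two bit tests can be sketched in Python as follows. The article does not say which CPUID bit reports PAE; the sketch assumes the Intel manual's definition (CPUID leaf 1, EDX bit 6). The function names and the sample register values are hypothetical and only exercise the helpers; reading the real registers requires CPUID and ring-0 access that Python does not provide.

```python
CPUID_1_EDX_PAE = 1 << 6  # CPUID leaf 1, EDX bit 6: PAE supported (per the Intel manual)
CR4_PAE = 1 << 5          # CR4 bit 5: PAE enabled

def cpu_supports_pae(cpuid1_edx: int) -> bool:
    """Test the PAE feature flag in a CPUID.1:EDX value."""
    return bool(cpuid1_edx & CPUID_1_EDX_PAE)

def pae_enabled(cr4: int) -> bool:
    """Test the PAE enable bit in a CR4 value."""
    return bool(cr4 & CR4_PAE)

# Hypothetical register values, chosen only for illustration.
sample_edx = 0x00000040
sample_cr4 = 0x00000020
print(cpu_supports_pae(sample_edx), pae_enabled(sample_cr4))
```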
When an IA-32 CPU works in 32-bit mode, software uses 32-bit virtual and linear addresses, which raises a question: how is a 32-bit linear address translated into a 36-bit physical address? The answer is to extend the original two-level mapping to a three-level mapping, that is, to add one more level on top of the original page directory and page table, called the page-directory-pointer table. A memory page is either 4 KB or 2 MB in size.
Figure 1 Linear address translation to physical address when PAE is enabled (4 KB memory page) (from IA-32 Manual Volume 3A)
Figure 1 shows the case of 4 KB memory pages. PDPTR stands for page-directory-pointer-table register; it is the alias of the CR3 register after PAE is enabled. The 32-bit linear address is divided into the following four parts:
- 2-bit (bits 30-31) page-directory-pointer-table index, used to select the entry in the page-directory-pointer table.
- 9-bit (bits 21-29) page directory index, used to select the entry in the page directory.
- 9-bit (bits 12-20) page table index, used to select the entry in the page table.
- 12-bit (bits 0-11) intra-page offset, the same as before.
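The four fields listed above can be extracted with shifts and masks. The following is a minimal Python sketch; the type and function names are made up for this illustration, and the sample address is the one used in the experiment later in this article.

```python
from typing import NamedTuple

class PaeLinear4K(NamedTuple):
    pdpt_index: int  # bits 30-31
    pd_index: int    # bits 21-29
    pt_index: int    # bits 12-20
    offset: int      # bits 0-11

def split_linear_4k(linear: int) -> PaeLinear4K:
    """Split a 32-bit linear address for PAE paging with 4 KB pages."""
    return PaeLinear4K(
        pdpt_index=(linear >> 30) & 0x3,
        pd_index=(linear >> 21) & 0x1FF,
        pt_index=(linear >> 12) & 0x1FF,
        offset=linear & 0xFFF,
    )

parts = split_linear_4k(0x000AB048)
print([hex(p) for p in parts])  # ['0x0', '0x0', '0xab', '0x48']
```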
Compared with the 4 KB case where PAE is not enabled (see section 2.7.4 of Software Debugging), there are two differences:
- The number of table levels has changed from two to three.
- Because the addresses stored in the tables are physical addresses, each entry in all three tables grows from 32 bits to 64 bits. The specific format is shown in Figure 2.
Figure 2 Address translation table entry format when PAE is enabled (4 KB memory page) (from IA-32 Manual Volume 3A)
That is to say, each 64-bit entry uses 24 bits (bits 12-35) to hold a base address. These 24 bits are the high 24 bits of the 36-bit physical address, and the low 12 bits are 0. This means that the base addresses of page directories, page tables, and memory pages are all 4 KB aligned.
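Extracting the base address from an entry is a single mask operation. The Python sketch below assumes the layout just described (bits 12-35 hold the base) plus the present flag in bit 0 from the Intel manual; the names and the first sample entry are hypothetical, while the second sample is a page table entry from the experiment later in this article.

```python
BASE_MASK = 0xFFFFFF << 12  # bits 12-35 of a 64-bit entry

def entry_base(entry: int) -> int:
    """Return the 4 KB aligned physical base address stored in an entry."""
    return entry & BASE_MASK

def entry_present(entry: int) -> bool:
    """Bit 0 is the present (P) flag."""
    return bool(entry & 1)

print(hex(entry_base(0x0000000312345067)))  # 0x312345000 (hypothetical entry above 4 GB)
print(hex(entry_base(0x8000000046852067)))  # 0x46852000
```

Note that bits above bit 35, such as bit 63 in the second sample, are not part of the base address and are masked off.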
Figure 3 Linear address translation to physical address when PAE is enabled (2 MB memory page) (from IA-32 Manual Volume 3A)
Figure 3 shows the case of 2 MB memory pages. As with large pages when PAE is not enabled, the page table level is no longer needed.
With this foundation, we now perform the experiment from section 2.7.5 on a Windows XP SP2 system with PAE enabled.
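For comparison with the 4 KB case, here is a sketch of the 2 MB split in Python: with the page table level gone, its 9 index bits join the offset, giving a 21-bit intra-page offset. The function name and the sample address are hypothetical.

```python
def split_linear_2m(linear: int) -> tuple[int, int, int]:
    """Split a 32-bit linear address for PAE paging with 2 MB pages."""
    pdpt_index = (linear >> 30) & 0x3   # bits 30-31
    pd_index = (linear >> 21) & 0x1FF   # bits 21-29
    offset = linear & 0x1FFFFF          # bits 0-20 (21-bit offset)
    return pdpt_index, pd_index, offset

pdpt_index, pd_index, offset = split_linear_2m(0x80345678)  # hypothetical address
print(pdpt_index, pd_index, hex(offset))  # 2 1 0x145678
```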
Following steps 1-4, find the address of the gpsznum variable, 000ab048. Its content is as follows:
0:002> db 000ab048
000ab048 31 00 32 00 33 00 34 00-35 00 36 00 37 00 38 00 1.2.3.4.5.6.7.8.
000ab058 39 00 2E 00 00 00 00 00 00 00 00 00 00 00 00 9 ...............
000ab068 00 00 00 00 00 00 00 00 00 00 00 00 1E 68 72 28 ...... HR (
000ab078 02 00 07 00 3E 01 08 00-90 B0 0a 00 00 B0 0a 00 ......> ...........
000ab088 04 00 02 00 20 01 0C 00-01 00 00 01 00 00 00 ...............
000ab098 00 00 00 00 00 00 00 00 00 00 00 57 00 53 00.
000ab0a8 02 00 04 00 24 01 0C 00-30 00 00 00 78 01 0a 00 ...... $ ...... 0 ...... x...
000ab0b8 02 00 02 00 26 01 08 00-d0 B0 0a 00 F0 B0 0a 00 ....&...........
Start a local kernel debugging session and display a summary of the calc process:
lkd> !process 0 0 calc.exe
PROCESS 896acb08  SessionId: 0  Cid: 1020    Peb: 7ffd5000  ParentCid: 0fc8
    DirBase: 1b1c0aa0  ObjectTable: e3265fb8  HandleCount:  48.
    Image: calc.exe
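The DirBase value above is the value of the CR3 register, that is, the content of the PDPTR. In PAE mode, bits 5-31 of CR3 hold the base address of the page-directory-pointer table, which is therefore 32-byte aligned; the low 5 bits do not belong to the address.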
Display the linear address to be translated, 000ab048, in binary:
lkd> .formats 000ab048
Evaluate expression:
  Hex:     000ab048
  Binary:  00000000 00001010 10110000 01001000
According to Figure 1, it is divided into the following four parts:
- The top 2 bits are 00, which is the index into the page-directory-pointer table, that is, 0.
- The next 9 bits (000000 000) are the page directory index, that is, 0.
- The next 9 bits (01010 1011) are the page table index, that is, 0xab:
lkd> ? 0y010101011
Evaluate expression: 171 = 000000ab
- The last 12 bits (0000 01001000) are the intra-page offset, that is, 0x48.
Because the low 5 bits of the DirBase value (1b1c0aa0) are 0, the base address of the page-directory-pointer table of the calculator process is 0x1b1c0aa0. Observe the content near this address:
lkd> !dd 0x1b1c0aa0
#1b1c0aa0 6408b001 00000000 34bcc001 00000000
#1b1c0ab0 3d00d001 00000000 4430a001 00000000
#1b1c0ac0 5ae85001 00000000 12a46001 00000000
#1b1c0ad0 41807001 00000000 5e284001 00000000
#1b1c0ae0 35377001 00000000 3bcb8001 00000000
Each page-directory-pointer table has four entries in total, each 8 bytes (64 bits) long. Therefore, the first two rows above are the page-directory-pointer table of the calculator process.
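The table base and the physical addresses of its four entries can be sketched in Python as follows. The sketch assumes the PAE layout of CR3 from the Intel manual (bits 5-31 hold the table base); the function names are made up for this illustration, and the input is the DirBase value shown above.

```python
PDPT_ENTRIES = 4   # entries in a page-directory-pointer table
ENTRY_SIZE = 8     # bytes per entry (64 bits)

def pdpt_base(cr3: int) -> int:
    """Clear the low 5 bits of CR3 to get the 32-byte aligned table base."""
    return cr3 & 0xFFFFFFE0

def pdpt_entry_addresses(cr3: int) -> list[int]:
    """Physical address of each of the four table entries."""
    base = pdpt_base(cr3)
    return [base + i * ENTRY_SIZE for i in range(PDPT_ENTRIES)]

for address in pdpt_entry_addresses(0x1B1C0AA0):
    print(hex(address))
# 0x1b1c0aa0
# 0x1b1c0aa8
# 0x1b1c0ab0
# 0x1b1c0ab8
```

The four addresses span exactly the first two rows of the !dd output, 16 bytes per row.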
According to the decomposition above, the linear address selects entry 0 of this table, located at 1b1c0aa0, that is, 00000000`6408b001. As shown in Figure 2, bits 12 to 35 are the high 24 bits of the base address of the page directory. Therefore, the base address of the corresponding page directory is 0x6408b000. Observe its entry 0:
lkd> !dq 0x6408b000
#6408b000 00000000`42d20067 00000000`3c5b6067
#6408b010 00000000`3f11b067 00000000`1e551067
#6408b020 00000000 '0e0000067 00000000 '2cecc067
#6408b030 00000000 '39d4e067 100' 00000000
#6408b040 00000000 '0b6db067 100' 00000000
#6408b050 00000000 '2017 00000000 '00000000
#6408b060 00000000 '000000' 00000000
#6408b070 00000000 '2017 00000000 '00000000
It can be seen that the page directory entry for the linear address we want to translate is 00000000`42d20067, in which bits 12 to 35 are the high 24 bits of the base address of the page table. Therefore, the base address of the page table we are looking for is 0x42d20000. Observe its entry 0xab:
lkd> !dq 42d20000+0xab*8
#42d20558 80000000`46852067 80000000`2a3db067
#42d20568 80000000`4009c067 80000000`43263067
#42d20578 80000000`444e4067 80000000`7b165067
#42d20588 80000000`0b92e067 80000000`3d12f067
#42d20598 80000000 '283b0067 100' 80000000
#42d205a8 80000000`348ba067 80000000`72cbb067
#42d205b8 80000000`421fd067 80000000`7223e067
#42d205c8 00000000 '2017 00000080 00000000 '00000000
The page table entry is therefore the first quadword displayed, 80000000`46852067, in which bits 12 to 35 are the high 24 bits of the base address of the memory page. That is, the base address of the memory page for linear address 000ab048 is 0x46852000. Adding the intra-page offset 0x48 gives the final physical address, 0x46852048. Display its content:
lkd> !db 46852048
#46852048 31 00 32 00 33 00 34 00-35 00 36 00 37 00 38 00 1.2.3.4.5.6.7.8.
#46852058 39 00 2E 00 00 00 00 00 00 00 00 00 00 00 9 ...............
#46852068 00 00 00 00 00 00 00 00 00 00 00 00 1E 68 72 28 ...... HR (
#46852078 02 00 07 00 3E 01 08 00-90 B0 0a 00 00 B0 0a 00 ......> ...........
#46852088 04 00 02 00 20 01 0C 00-01 00 00 00 01 00 00 ...............
#46852098 00 00 00 00 00 00 00 00-00 00 00 57 00 53 00 ............ w.s.
#468520a8 02 00 04 00 24 01 0C 00-30 00 00 00 78 01 0a 00 ...... $ ...... 0 ...... x...
#468520b8 02 00 00 26 01 08 00-d0 B0 0a 00 F0 B0 0a 00 ....&...........
It can be seen that the content is identical to what the user-mode debugger displayed at linear address 000ab048.
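The whole walk can be replayed offline with a short Python sketch. It is only an illustration under stated assumptions: the PHYS_QWORDS dictionary stands in for physical memory and contains just the three entries read from the !dd and !dq output above, the function names are made up, and the code handles only 4 KB pages without checking the present or page-size flags.

```python
# Quadwords copied from the debugger output above, keyed by physical address.
PHYS_QWORDS = {
    0x1B1C0AA0: 0x000000006408B001,  # page-directory-pointer table entry 0
    0x6408B000: 0x0000000042D20067,  # page directory entry 0
    0x42D20558: 0x8000000046852067,  # page table entry 0xab
}

BASE_MASK = 0xFFFFFF << 12  # bits 12-35 of an entry hold the base address
ENTRY_SIZE = 8              # each entry is 64 bits

def read_qword(physical: int) -> int:
    """Stand-in for a physical memory read such as !dq."""
    return PHYS_QWORDS[physical]

def translate(cr3: int, linear: int) -> int:
    """Walk the three PAE table levels for a 4 KB page."""
    pdpt = cr3 & 0xFFFFFFE0
    pdpte = read_qword(pdpt + ((linear >> 30) & 0x3) * ENTRY_SIZE)
    page_directory = pdpte & BASE_MASK
    pde = read_qword(page_directory + ((linear >> 21) & 0x1FF) * ENTRY_SIZE)
    page_table = pde & BASE_MASK
    pte = read_qword(page_table + ((linear >> 12) & 0x1FF) * ENTRY_SIZE)
    return (pte & BASE_MASK) | (linear & 0xFFF)

print(hex(translate(0x1B1C0AA0, 0x000AB048)))  # 0x46852048
```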