In-depth understanding of Linux memory mapping mechanism

Source: Internet
Author: User

I. Introduction

In disassembled programs we often see addresses such as 0x32118965. The operating system calls these linear addresses, or virtual addresses. What are virtual addresses for, and how is a virtual address translated into a physical memory address? This chapter gives a brief account.

1.1 Linux Memory Addressing Overview

Modern operating systems run in 32-bit protected mode, and each process can generally address 4 GB of virtual address space. Physical memory, however, is typically only a few hundred MB, so how does a process get a 4 GB space? This is one benefit of virtual addresses: with a technique called virtual memory, part of the hard disk can be used as if it were memory. In addition, operating systems are now divided into system space and user space, and virtual addresses make it possible to protect kernel space from being corrupted by user space.

How is a virtual address converted to a physical address? The conversion is carried out jointly by the operating system and the CPU: the operating system sets up the page tables for the CPU, and the CPU translates addresses through its MMU.

1.2 Tools for browsing the kernel code

The kernel is now very large, so some tooling is needed to read the source. Kernel developers commonly browse the code with vim + ctags + cscope, and ready-made Makefiles that generate the ctags / cscope / etags indexes can be found online.

First, usage:

Find an empty directory and copy the attached Makefile into it. Then, in that directory, run one of the following make commands as needed:

$ make

This processes the source files under /usr/src/linux and generates the ctags and cscope indexes in the current directory.

Note: SRCDIR specifies the kernel source directory; if it is not given, the default is /usr/src/linux/

1) Create ctags only

$ make SRCDIR=/usr/src/linux-2.6.12/ tags

2) Create cscope only

$ make SRCDIR=/usr/src/linux-2.6.12/ cscope

3) Create ctags and cscope

$ make SRCDIR=/usr/src/linux-2.6.12/

4) Create etags only

$ make SRCDIR=/usr/src/linux-2.6.12/ TAGS

Second, the kernel source files that are indexed:

1) The drivers and sound directories are excluded

2) Architecture directories for unrelated architectures are excluded

3) Under fs, only the top-level directory and the ext2 and proc directories are included

Third, the simplest ctags commands

1) Jump to a tag

After starting vim, use

:tag func_name

to jump to the function func_name

2) Look up the function (identifier) under the cursor

To jump to the function under the cursor, use CTRL + ]

3) Go back

Go back with CTRL + T

1.3 Choice of kernel version

The analysis in this article is based on the linux-2.6.10 kernel. The latest kernel release is 2.6.25, but most mainstream servers currently run RedHat AS4, which uses the 2.6.9 kernel. I chose 2.6.10 because it is very close to 2.6.9, and Red Hat Enterprise Linux 4, built on the Linux 2.6.9 kernel, is currently the most stable and powerful commercial product. During 2004, open source projects such as Fedora provided an environment in which Linux 2.6 kernel technology matured, so the Red Hat Enterprise Linux v.4 kernel offers more and better features and algorithms than previous versions, including:

• Generic logical CPU scheduler: handles multi-core and hyper-threaded CPUs.

• Object-based reverse mapping of virtual memory: improves performance on memory-constrained systems.

• Read-Copy-Update: an SMP algorithm optimization for operating system data structures.

• Multiple I/O schedulers: selectable according to the application environment.

• Enhanced SMP and NUMA support: improved performance and scalability on large servers.

• Network interrupt mitigation (NAPI): improves performance on high-traffic networks.

The Linux 2.6 kernel uses many techniques to improve its handling of large amounts of memory, making Linux more suitable for the enterprise than ever before. These include reverse mapping, larger memory pages, page table entries stored in high memory, and a more stable memory manager. For these reasons I chose linux-2.6.10 as the kernel version to analyze.

II. x86 hardware addressing

Please refer to the Intel x86 manuals ^_^

III. Kernel page table setup

Before the CPU can translate addresses, the operating system must prepare the kernel page tables for it. The kernel sets up its page tables in two stages: once at the very start of boot and once during system initialization.

3.1 Several memory-related macros

These macros convert an unsigned integer into the corresponding type:

#define __pte(x) ((pte_t) { (x) } )
#define __pmd(x) ((pmd_t) { (x) } )
#define __pgd(x) ((pgd_t) { (x) } )
#define __pgprot(x) ((pgprot_t) { (x) } )

These convert x back into the corresponding unsigned integer:

#define pte_val(x) ((x).pte_low)
#define pmd_val(x) ((x).pmd)
#define pgd_val(x) ((x).pgd)
#define pgprot_val(x) ((x).pgprot)

Converts a linear address in kernel space to a physical address:

#define __pa(x) ((unsigned long)(x)-PAGE_OFFSET)

Converts a physical address to a linear address:

#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))
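
The two conversions are plain offset arithmetic. The following user-space sketch mimics them on 32-bit integers; the names sketch_pa and sketch_va are hypothetical, and the value 0xC0000000 for PAGE_OFFSET is an assumption taken from the 3 GB / 1 GB split discussed in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: kernel space starts at 3 GB, as elsewhere in this article. */
#define SKETCH_PAGE_OFFSET 0xC0000000u

/* Mimics __pa(): kernel linear address -> physical address. */
static uint32_t sketch_pa(uint32_t vaddr)
{
    /* Only meaningful for addresses inside the directly mapped kernel range. */
    assert(vaddr >= SKETCH_PAGE_OFFSET);
    return vaddr - SKETCH_PAGE_OFFSET;
}

/* Mimics __va(): physical address -> kernel linear address. */
static uint32_t sketch_va(uint32_t paddr)
{
    /* Anything at or above 1 GB would wrap around past 0xFFFFFFFF. */
    assert(paddr < 0x40000000u);
    return paddr + SKETCH_PAGE_OFFSET;
}
```

For example, under these assumptions the kernel linear address 0xC0100000 corresponds to physical address 0x00100000, and converting back gives the original value.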

x is a page table entry value; pte_pfn gives the corresponding physical page frame number, and pfn_to_page then gives the corresponding physical page descriptor:

#define pte_page(x) pfn_to_page(pte_pfn(x))

Returns 1 if the page table entry value is 0 (the entry is empty):

#define pte_none(x) (!(x).pte_low)

x is a page table entry value; shifting it right by 12 bits gives the corresponding physical page frame number:

#define pte_pfn(x) ((unsigned long)(((x).pte_low >> PAGE_SHIFT)))

Combines a page frame number and page attribute value into a page table entry value:

#define pfn_pte(pfn, prot) __pte(((pfn) << PAGE_SHIFT) | pgprot_val(prot))

Combines a page frame number and page attribute value into a page middle directory entry value:

#define pfn_pmd(pfn, prot) __pmd(((pfn) << PAGE_SHIFT) | pgprot_val(prot))

Write the given value into an entry:

#define set_pte(pteptr, pteval) (*(pteptr) = pteval)
#define set_pte_atomic(pteptr, pteval) set_pte(pteptr,pteval)
#define set_pmd(pmdptr, pmdval) (*(pmdptr) = pmdval)
#define set_pgd(pgdptr, pgdval) (*(pgdptr) = pgdval)

Takes the upper 10 bits of a linear address, which are its index into the page directory:

#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))

Builds a page table entry value from a page descriptor and attributes:

#define mk_pte(page, pgprot) pfn_pte(page_to_pfn(page), (pgprot))

3.2 Kernel page table initialization

When the kernel first enters protected mode, paging is not yet enabled. The complete memory mapping is only established later in initialization, but until then the kernel still needs page tables to translate the addresses it uses, so it first sets up a temporary kernel page table. The temporary page table is initialized in arch/i386/kernel/head.S:

swapper_pg_dir is the temporary page global directory; it is statically initialized when the kernel is compiled.

pg0 is the start of the first page table; it is likewise initialized when the kernel is compiled.

The kernel builds the temporary page table with the following code:

ENTRY(startup_32)

/* Byte offset of the first kernel directory entry: (0xc0000000 >> 22) * 4 = 768 * 4. The kernel therefore builds its mapping starting at entry 768 of swapper_pg_dir, which corresponds to linear addresses from 0xc0000000 upward; in other words, the kernel is initializing its own page table here */
page_pde_offset = (__PAGE_OFFSET >> 20);

/* The address of pg0 was fixed when the kernel was linked and already includes 0xc0000000; subtracting 0xc0000000 gives the corresponding physical address */
	movl $(pg0 - __PAGE_OFFSET), %edi

/* Load the physical address of the page directory into %edx. The first store through (%edx) fills directory entry 0, so the kernel also maps linear addresses from 0x00000000 upward. This ensures a smooth transition from fetching instructions at physical addresses to fetching them at linear addresses in system space, as explained in detail below */
	movl $(swapper_pg_dir - __PAGE_OFFSET), %edx
	movl $0x007, %eax
10:
	leal 0x007(%edi),%ecx
	movl %ecx,(%edx)
	movl %ecx,page_pde_offset(%edx)
	addl $4,%edx
	movl $1024, %ecx
11:
	stosl
	addl $0x1000,%eax
	loop 11b

/* How many page tables the kernel builds, that is, how much memory it maps, is decided by the test below. The mapping must cover the kernel itself (code segment and data segment), the initial page tables, and a further 128 KB used for dynamic data structures during kernel initialization */
	leal (INIT_MAP_BEYOND_END+0x007)(%edi),%ebp
	cmpl %ebp,%eax
	jb 10b
	movl %edi,(init_pg_tables_end - __PAGE_OFFSET)

Why does the code above point the first few directory entries of both user space and kernel space at the same page tables? Although the kernel has already entered protected mode in head.S, it is still running with segment-based addressing only. Paging is not yet enabled, so instructions are fetched by physical address, and any symbol address used in the code must have 0xc0000000 subtracted from it; once paging is enabled this is no longer necessary. At the moment paging is switched on, however, the CPU's instruction pointer EIP still points into the low address range. If only the kernel-space mapping had been established, those low addresses could no longer be translated once paging was on, because no page table would cover them. EIP stays in the low range until the code performs an absolute jump or a subroutine call to a symbol address. With both mappings in place, the kernel enables the CPU's paging mechanism as early as possible.

	movl $swapper_pg_dir-__PAGE_OFFSET,%eax
	movl %eax,%cr3		/* cr3 holds the address of the page directory */
	movl %cr0,%eax		/* paging is turned on through the highest bit of cr0 */
	orl $0x80000000,%eax
	movl %eax,%cr0
	ljmp $__BOOT_CS,$1f	/* Clear prefetch and normalize %eip */
1:
	lss stack_start,%esp

The instruction ljmp $__BOOT_CS,$1f moves the CPU into system space to continue execution: __BOOT_CS is the kernel code segment selector, and the label 1 is a symbolic address that lies above 0xc0000000.
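
To see why the double mapping keeps both the low and the high address usable across the switch to paging, here is a toy model in C. It is only a sketch under simplifying assumptions: a single page table covering the first 4 MB (head.S builds as many as it needs), arrays standing in for swapper_pg_dir and pg0, and hypothetical function names.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES     1024u   /* entries per page directory or page table */
#define KERNEL_PDE  768u    /* 0xC0000000 >> 22 */
#define ATTR        0x007u  /* the attribute value used in head.S */

static uint32_t toy_pg_dir[ENTRIES];  /* stands in for swapper_pg_dir */
static uint32_t toy_pg0[ENTRIES];     /* stands in for pg0 (one table only) */

/* Fill one page table mapping physical 0..4 MB and hook it into two
 * directory slots: entry 0 (low addresses) and entry 768 (kernel space).
 * table_phys is the pretend physical address of the page table. */
static void build_boot_mapping(uint32_t table_phys)
{
    uint32_t pde = table_phys | ATTR;

    toy_pg_dir[0] = pde;
    toy_pg_dir[KERNEL_PDE] = pde;
    for (uint32_t i = 0; i < ENTRIES; i++)
        toy_pg0[i] = (i << 12) | ATTR;  /* page frame i plus attribute bits */
}

/* Software walk over the toy tables; returns 0xFFFFFFFF if unmapped. */
static uint32_t toy_translate(uint32_t linear)
{
    uint32_t pde = toy_pg_dir[linear >> 22];

    if (!(pde & 1u))                    /* present bit clear: no mapping */
        return 0xFFFFFFFFu;
    uint32_t pte = toy_pg0[(linear >> 12) & 0x3FFu];
    return (pte & 0xFFFFF000u) | (linear & 0xFFFu);
}
```

After build_boot_mapping() runs, linear addresses 0x00100000 and 0xC0100000 both translate to physical 0x00100000, while an address in between, such as 0x40000000, has no mapping at all.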

After head.S has built the temporary kernel page table, it carries on with initialization: it sets up INIT_TASK, the first process after the system starts, builds the complete interrupt handling mechanism, reloads the GDT descriptor, and finally jumps to the start_kernel function in init/main.c to continue initialization.

3.3 Building the complete kernel page table

The kernel continues with the second phase of initialization in start_kernel(). At this stage the kernel is already running in protected mode, but only the simple temporary page table has been set up. Before it can go on, the kernel must build a complete page table, because memory addressing is a prerequisite for everything that follows.

The code for pagetable_init() is in arch/i386/mm/init.c:

[start_kernel() > setup_arch() > paging_init() > pagetable_init()]

For simplicity, I have left out the code that supports the PAE option.

static void __init pagetable_init(void)
{
	...
	pgd_t *pgd_base = swapper_pg_dir;
	...
	kernel_physical_mapping_init(pgd_base);
	...
}

In this function the pgd_base variable points to swapper_pg_dir, the start address of the kernel page directory. pagetable_init() builds the complete kernel page table through the kernel_physical_mapping_init() function.

The kernel_physical_mapping_init function is in the same file; again I have omitted the code associated with PAE mode:

static void __init kernel_physical_mapping_init(pgd_t *pgd_base)
{
	unsigned long pfn;
	pgd_t *pgd;
	pmd_t *pmd;
	pte_t *pte;
	int pgd_idx, pmd_idx, pte_ofs;

	pgd_idx = pgd_index(PAGE_OFFSET);
	pgd = pgd_base + pgd_idx;
	pfn = 0;

	for (; pgd_idx < PTRS_PER_PGD; pgd++, pgd_idx++) {
		pmd = one_md_table_init(pgd);
		if (pfn >= max_low_pfn)
			continue;
		for (pmd_idx = 0; pmd_idx < PTRS_PER_PMD && pfn < max_low_pfn; pmd++, pmd_idx++) {
			unsigned int address = pfn * PAGE_SIZE + PAGE_OFFSET;
			...
			pte = one_page_table_init(pmd);
			for (pte_ofs = 0; pte_ofs < PTRS_PER_PTE && pfn < max_low_pfn; pte++, pfn++, pte_ofs++) {
				if (is_kernel_text(address))
					set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
				else
					set_pte(pte, pfn_pte(pfn, PAGE_KERNEL));
			}
		}
	}
}

As the comments in the kernel source indicate, this function maps physical memory into kernel space, starting at 0xc0000000 and continuing until the mapping of physical memory is complete. The function is fairly long and uses many memory-management macros; once you understand it, you have a good picture of how the kernel builds its page tables and of the abstract model behind them. The function is analyzed in detail below:

The function begins by defining four variables: pgd_t *pgd, pmd_t *pmd, pte_t *pte and pfn.

pgd points to a page directory entry, pmd points to a page middle directory entry, pte points to a page table entry, and pfn, the page frame number, starts at 0. pgd_idx, computed with the pgd_index macro, is 768, so setup begins at entry 768 of the kernel page directory. The 256 entries from 768 to 1023 are reserved by the Linux kernel for kernel space, and the lower 768 entries are used by user space. After pgd = pgd_base + pgd_idx; pgd points to entry 768.

The function then enters a loop that fills in the 256 directory entries from 768 to 1023.
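
The figure 768 follows directly from the pgd_index arithmetic. A small user-space check, assuming the non-PAE values PGDIR_SHIFT = 22 and PTRS_PER_PGD = 1024 and using hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed non-PAE i386 values: each directory entry covers 4 MB. */
#define SKETCH_PGDIR_SHIFT   22u
#define SKETCH_PTRS_PER_PGD  1024u

/* Same arithmetic as the pgd_index macro: the upper 10 bits of the address. */
static uint32_t pgd_index_sketch(uint32_t address)
{
    return (address >> SKETCH_PGDIR_SHIFT) & (SKETCH_PTRS_PER_PGD - 1u);
}

/* Number of directory entries from page_offset up to the end of the table. */
static uint32_t kernel_pgd_entries(uint32_t page_offset)
{
    uint32_t first = pgd_index_sketch(page_offset);

    assert(first < SKETCH_PTRS_PER_PGD);
    return SKETCH_PTRS_PER_PGD - first;
}
```

With PAGE_OFFSET taken as 0xC0000000, pgd_index_sketch gives 768 and kernel_pgd_entries gives 256, matching the entry range 768 to 1023 described above.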

The one_md_table_init() function finds the pmd table that a pgd entry points to.

It is defined in the same file:

static pmd_t * __init one_md_table_init(pgd_t *pgd)
{
	pmd_t *pmd_table;

#ifdef CONFIG_X86_PAE
	pmd_table = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE);
	set_pgd(pgd, __pgd(__pa(pmd_table) | _PAGE_PRESENT));
	if (pmd_table != pmd_offset(pgd, 0))
		BUG();
#else
	pmd_table = pmd_offset(pgd, 0);
#endif

	return pmd_table;
}

As can be seen, when the kernel is built without the PAE option, the function simply returns the address of the pgd entry itself via pmd_offset. Linux uses a two-level mapping model here, so the pmd level in the middle is ignored.

Next comes a check:

>> if (pfn >= max_low_pfn)

>> continue;

This is the key point. max_low_pfn is the number of physical page frames that the kernel maps directly (high memory is not included). Once pfn reaches max_low_pfn, the kernel has mapped all of that memory into system space, so the remaining unfilled directory entries are simply skipped; there is nothing left for them to map.
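
As a rough illustration of when the check starts to fire, the sketch below computes how many page tables are filled for a given amount of directly mapped memory. The function names are hypothetical, and the constants assume 4 KB pages with 1024 entries per page table.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT    12u    /* 4 KB pages */
#define SKETCH_PTRS_PER_PTE  1024u  /* entries per page table */

/* Number of whole page frames in `bytes` of directly mapped memory. */
static uint32_t low_pfn_count(uint32_t bytes)
{
    return bytes >> SKETCH_PAGE_SHIFT;
}

/* Number of page tables (and so directory entries) that get filled
 * before pfn reaches max_low_pfn; the last one may be partly filled. */
static uint32_t page_tables_needed(uint32_t max_low_pfn)
{
    return (max_low_pfn + SKETCH_PTRS_PER_PTE - 1u) / SKETCH_PTRS_PER_PTE;
}
```

For a machine with 512 MB of directly mapped memory this gives 131072 page frames and 128 page tables, so under these assumptions directory entries 768 to 895 are filled and the loop skips the remaining ones with continue.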

Next comes the second for loop. In the Linux three-level mapping model this loop would set up the pmd table, but with two-level mapping the pmd level is ignored, so the loop runs only once and goes straight on to set up the page table (pte).

>> address = pfn * PAGE_SIZE + PAGE_OFFSET;

address is a linear address. The statement above shows that it starts at 0xc0000000, that is, at the beginning of kernel space. It is used later when the attributes of the page table entries are set.

>> pte = one_page_table_init(pmd);

This allocates a page table for the pmd entry. The code is in the same file:

static pte_t * __init one_page_table_init(pmd_t *pmd)
{
	if (pmd_none(*pmd)) {
		pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE);
		set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE));
		if (page_table != pte_offset_kernel(pmd, 0))
			BUG();

		return page_table;
	}

	return pte_offset_kernel(pmd, 0);
}

The pmd_none macro tests whether the pmd entry is empty. If it is, alloc_bootmem_low_pages allocates one 4 KB physical page, and set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE)); fills in the pmd entry. page_table is a linear address, so it is first converted to a physical address with the __pa macro and then ORed with the _PAGE_TABLE macro. At this point the value is still an unsigned integer; __pmd converts it to the pmd type. The result is an entry complete with attribute bits, which the set_pmd macro then writes into the pmd.

Then comes a loop that sets the 1024 page table entries.

The is_kernel_text function uses the address value mentioned earlier to decide whether the linear address belongs to the kernel code segment. It is defined in the same file:

static inline int is_kernel_text(unsigned long addr)
{
	if (addr >= (unsigned long)_stext && addr <= (unsigned long)__init_end)
		return 1;
	return 0;
}

_stext and __init_end are kernel symbols generated when the kernel is linked; they mark the start and end addresses of the kernel code segment, respectively.

If address belongs to the kernel code segment, the page table entry is set with the PAGE_KERNEL_EXEC attribute; otherwise it is set with the PAGE_KERNEL attribute.



#define _PAGE_KERNEL \


Finally, the page table entry is written with set_pte(pte, pfn_pte(pfn, PAGE_KERNEL));. The pfn_pte macro first combines the page frame number and the attribute value into a page table entry value, and the set_pte macro then writes that value into the page table entry.
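
The packing done by pfn_pte, and undone by pte_pfn, can be tried out in user space. This is a minimal sketch on plain 32-bit integers with hypothetical function names; the attribute value 0x063 used in the example is only an illustrative assumption, not a value taken from the kernel source quoted here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u

/* Mimics pfn_pte(): frame number in the upper 20 bits, attributes in the lower 12. */
static uint32_t make_pte(uint32_t pfn, uint32_t prot)
{
    assert(pfn < (1u << 20));                  /* 20-bit frame number */
    assert(prot < (1u << SKETCH_PAGE_SHIFT));  /* 12 attribute bits */
    return (pfn << SKETCH_PAGE_SHIFT) | prot;
}

/* Mimics pte_pfn(): drop the attribute bits to recover the frame number. */
static uint32_t pte_to_pfn(uint32_t pte)
{
    return pte >> SKETCH_PAGE_SHIFT;
}

/* Mimics pte_none(): an all-zero entry maps nothing. */
static int pte_is_none(uint32_t pte)
{
    return pte == 0;
}
```

With page frame number 0x620 and attribute bits 0x063, make_pte returns 0x620063, and pte_to_pfn applied to that value returns 0x620 again.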

When pagetable_init() returns, the kernel page table has been set up, and the kernel then calls load_cr3(swapper_pg_dir);

#define load_cr3(pgdir) \
	asm volatile("movl %0,%%cr3": :"r" (__pa(pgdir)))

This loads the address of swapper_pg_dir into control register cr3. Whenever cr3 is reloaded, the CPU discards the translations cached in its TLB and fetches them again from the page directory that cr3 points to. The mapping directory in memory has just changed, so the CPU must be made to load it again. Because paging was already enabled, from this instruction on the mapped region of system space grows to cover the whole of physical memory (except high memory). Translations cached from the old directory entries may still be held in the CPU, so the kernel also calls __flush_tlb_all() to make sure the cached translations are consistent with the page directory in memory.

3.4 Summary: how the page table is built

The analysis of pagetable_init() above shows that building the kernel page table amounts to nothing more than writing the address of the next level, plus attribute bits, into the appropriate entries. Part of kernel memory is set aside to hold the kernel page tables, and whenever the CPU translates an address, whether in kernel space or in user space, it goes through page tables of this kind. Since this function has the kernel map the whole of physical memory, does that leave nothing for user-space processes to map when they need physical memory? It does not. The kernel has only created a mapping, and mapping a page is not the same as using it. The mapping exists simply to make it convenient for the kernel to manage memory.

IV. A worked example of the mapping mechanism

4.1 Sample Code

With the theory above in hand, we write a simple program to analyze how the kernel maps linear addresses to physical addresses.

[root@localhost temp]# cat test.c

#include <stdio.h>

void test(void)
{
	printf("hello, world.\n");
}

int main(void)
{
	test();
}

The code is very simple. We deliberately have main call the test function in order to see how the virtual address of test is mapped to a physical address.

4.2 Segment Mapping Analysis

First we compile the program, then disassemble the resulting test binary:

[root@localhost temp]# gcc -o test test.c

[root@localhost temp]# objdump -d test

08048368 <test>:
 8048368:	55			push   %ebp
 8048369:	89 e5			mov    %esp,%ebp
 804836b:	83 ec 08		sub    $0x8,%esp
 804836e:	83 ec 0c		sub    $0xc,%esp
 8048371:	68 84 84 04 08		push   $0x8048484
 8048376:	e8 35 ff ff ff		call   80482b0 <printf@plt>
 804837b:	83 c4 10		add    $0x10,%esp
 804837e:	c9			leave
 804837f:	c3			ret

08048380 <main>:
 8048380:	55			push   %ebp
 8048381:	89 e5			mov    %esp,%ebp
 8048383:	83 ec 08		sub    $0x8,%esp
 8048386:	83 e4 f0		and    $0xfffffff0,%esp
 8048389:	b8 00 00 00 00		mov    $0x0,%eax
 804838e:	83 c0 0f		add    $0xf,%eax
 8048391:	83 c0 0f		add    $0xf,%eax
 8048394:	c1 e8 04		shr    $0x4,%eax
 8048397:	c1 e0 04		shl    $0x4,%eax
 804839a:	29 c4			sub    %eax,%esp
 804839c:	e8 c7 ff ff ff		call   8048368 <test>
 80483a1:	c9			leave
 80483a2:	c3			ret
 80483a3:	90			nop

As the output shows, ld assigned the address 0x08048368 to the test() function. For executables in ELF format, ld always lays out the program's code segment starting from 0x8000000, and this is true of every program. Where the program actually sits in physical memory is arranged by the kernel when it sets up the memory mapping, and the specific address depends on which physical pages are allocated. Assume that the program is already running, that the whole mapping mechanism is in place, and that the CPU is executing call 8048368 in main(), transferring control to virtual address 0x08048368. The process of mapping this virtual address to a physical address is described in detail below.

First comes the segment mapping stage. 0x08048368 is the entry address of a function and, more to the point, the address that the CPU's instruction pointer EIP holds during execution, so it lies in the code segment. The i386 CPU therefore uses the current value of the code segment register CS as the selector for segment mapping, that is, as an index into the segment descriptor table. What is the value of CS?

Debugging test with GDB:

(gdb) info reg
eax            0x10        16
ecx            0x1         1
edx            0x9d915c    10326364
ebx            0x9d6ff4    10317812
esp            0xbfedb480  0xbfedb480
ebp            0xbfedb488  0xbfedb488
esi            0xbfedb534  -1074940620
edi            0xbfedb4c0  -1074940736
eip            0x804836e   0x804836e
eflags         0x282       642
cs             0x73        115
ss             0x7b        123
ds             0x7b        123
es             0x7b        123
fs             0x0         0
gs             0x33        51

You can see that the value of CS is 0x73. Written out in binary:

0000 0000 0111 0011

The lowest 2 bits are 3, so the RPL is 3. This is expected, because our program runs in user space.

The third bit (bit 2) is 0, which means the index refers to the GDT.

The upper 13 bits are 14, so the segment descriptor is entry 14 of the GDT. We can check this against the kernel code:

In include/asm-i386/segment.h:



You can see that the segment descriptor is indeed the 14th entry in the GDT table.
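
The same breakdown of a selector can be written as a few shifts and masks. A minimal sketch, with a hypothetical struct and function name; the field layout (index in bits 15..3, table indicator in bit 2, RPL in bits 1..0) is the standard x86 selector format.

```c
#include <assert.h>
#include <stdint.h>

struct selector_fields {
    unsigned index;  /* bits 15..3: entry number in the descriptor table */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* bits 1..0: requested privilege level */
};

/* Split a 16-bit segment selector into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = (unsigned)(sel >> 3);
    f.ti    = (unsigned)((sel >> 2) & 1u);
    f.rpl   = (unsigned)(sel & 3u);
    assert(f.index < 8192u);  /* 13-bit index */
    return f;
}
```

For the CS value 0x73 shown by GDB this yields index 14, TI 0 and RPL 3; for the data selector 0x7b it yields index 15 with the same TI and RPL.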

Now let's look in the GDT to see what that entry contains. The contents of the GDT are defined in arch/i386/kernel/head.S:

ENTRY(cpu_gdt_table)
	.quad 0x0000000000000000	/* NULL descriptor */
	.quad 0x0000000000000000	/* 0x0b reserved */
	.quad 0x0000000000000000	/* 0x13 reserved */
	.quad 0x0000000000000000	/* 0x1b reserved */
	.quad 0x0000000000000000	/* 0x20 unused */
	.quad 0x0000000000000000	/* 0x28 unused */
	.quad 0x0000000000000000	/* 0x33 TLS entry 1 */
	.quad 0x0000000000000000	/* 0x3b TLS entry 2 */
	.quad 0x0000000000000000	/* 0x43 TLS entry 3 */
	.quad 0x0000000000000000	/* 0x4b reserved */
	.quad 0x0000000000000000	/* 0x53 reserved */
	.quad 0x0000000000000000	/* 0x5b reserved */

	.quad 0x00cf9a000000ffff	/* 0x60 kernel 4GB code at 0x00000000 */
	.quad 0x00cf92000000ffff	/* 0x68 kernel 4GB data at 0x00000000 */
	.quad 0x00cffa000000ffff	/* 0x73 user 4GB code at 0x00000000 */
	.quad 0x00cff2000000ffff	/* 0x7b user 4GB data at 0x00000000 */

	.quad 0x0000000000000000	/* 0x80 TSS descriptor */
	.quad 0x0000000000000000	/* 0x88 LDT descriptor */

	/* Segments used for calling PnP BIOS */
	.quad 0x00c09a0000000000	/* 0x90 32-bit code */
	.quad 0x00809a0000000000	/* 0x98 16-bit code */
	.quad 0x0080920000000000	/* 0xa0 16-bit data */
	.quad 0x0080920000000000	/* 0xa8 16-bit data */
	.quad 0x0080920000000000	/* 0xb0 16-bit data */

	/*
	 * The APM segments have byte granularity and their bases
	 * and limits are set at run time.
	 */
	.quad 0x00409a0000000000	/* 0xb8 APM CS code */
	.quad 0x00009a0000000000	/* 0xc0 APM CS 16 code (16 bit) */
	.quad 0x0040920000000000	/* 0xc8 APM DS data */

	.quad 0x0000000000000000	/* 0xd0 - unused */
	.quad 0x0000000000000000	/* 0xd8 - unused */
	.quad 0x0000000000000000	/* 0xe0 - unused */
	.quad 0x0000000000000000	/* 0xe8 - unused */
	.quad 0x0000000000000000	/* 0xf0 - unused */
	.quad 0x0000000000000000	/* 0xf8 - GDT entry 31: double-fault TSS */

The entry we are interested in is:

	.quad 0x00cffa000000ffff	/* 0x73 user 4GB code at 0x00000000 */

We expand this value into binary:

0000 0000 1100 1111 1111 1010 0000 0000 0000 0000 0000 0000 1111 1111 1111 1111

Reading the fields of the segment descriptor from this value, we can draw the following conclusions:

B0-B15 and B16-B31 are all 0, so the base address is 0.

L0-L15 and L16-L19 are all 1, so the segment limit is 0xfffff.

The G bit is 1, so the limit is counted in units of 4 KB.

The D bit is 1, so accesses to the segment use 32-bit instructions.

The P bit is 1, so the segment is present in memory.

DPL is 3, so the privilege level is 3.

The S bit is 1, so this is a code segment or data segment.

Type is 1010: a code segment that is readable and executable and has not yet been accessed.

This descriptor describes a segment that covers the entire 4 GB virtual address space starting at address 0, so a logical address is translated directly into a linear address of the same value.

After segment mapping, then, the logical address has become a linear address, which is why in Linux a logical address is equivalent to a linear address.
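
The field-by-field reading of the descriptor can be double-checked with a short decoder. This is a sketch with hypothetical names; the bit positions follow the standard x86 segment descriptor layout (limit in bits 0..15 and 48..51, base in bits 16..39 and 56..63, type, S, DPL and P in bits 40..47, D and G in bits 54 and 55).

```c
#include <assert.h>
#include <stdint.h>

struct seg_desc {
    uint32_t base;   /* 32-bit segment base address */
    uint32_t limit;  /* 20-bit segment limit */
    unsigned type;   /* 4-bit type field */
    unsigned s;      /* 1 = code or data segment */
    unsigned dpl;    /* descriptor privilege level */
    unsigned p;      /* present bit */
    unsigned d;      /* default operand size: 1 = 32-bit */
    unsigned g;      /* granularity: 1 = limit counted in 4 KB units */
};

/* Unpack a 64-bit GDT entry into its fields. */
static struct seg_desc decode_descriptor(uint64_t raw)
{
    struct seg_desc r;

    r.limit = (uint32_t)(raw & 0xFFFFu) | ((uint32_t)((raw >> 48) & 0xFu) << 16);
    r.base  = (uint32_t)((raw >> 16) & 0xFFFFFFu) | ((uint32_t)((raw >> 56) & 0xFFu) << 24);
    r.type  = (unsigned)((raw >> 40) & 0xFu);
    r.s     = (unsigned)((raw >> 44) & 1u);
    r.dpl   = (unsigned)((raw >> 45) & 3u);
    r.p     = (unsigned)((raw >> 47) & 1u);
    r.d     = (unsigned)((raw >> 54) & 1u);
    r.g     = (unsigned)((raw >> 55) & 1u);
    return r;
}

/* Highest valid offset in the segment, taking granularity into account. */
static uint32_t highest_offset(struct seg_desc s)
{
    assert(s.limit <= 0xFFFFFu);
    return s.g ? ((s.limit << 12) | 0xFFFu) : s.limit;
}
```

Decoding 0x00cffa000000ffff gives base 0, limit 0xfffff, type 0xa (binary 1010), S = 1, DPL = 3, P = 1, D = 1 and G = 1, and highest_offset returns 0xffffffff, which is the flat 4 GB segment described above.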

4.3 Page Mapping Analysis

Now we come to page mapping. Every process in Linux has its own page directory (PGD), and a pointer to that directory is stored in the process's mm_struct data structure. Whenever a process is scheduled to run, the kernel sets the control register cr3 for the incoming process, and the MMU hardware always obtains the pointer to the current page directory from cr3. By the time our program transfers control to address 0x08048368, the process is already running, so cr3 was set up long ago and points to the page directory of our process. First, expand the linear address 0x08048368 into binary:

0000 1000 0000 0100 1000 0011 0110 1000

Comparing this with the linear address format, we can see that the highest 10 bits are binary 0000100000, which is decimal 32, so the MMU uses 32 as the index to find the directory entry in the page directory. The upper 20 bits of this directory entry point to a page table; the CPU appends 12 zero bits to them to obtain the page table pointer. Having found the page table, the CPU looks at the middle 10 bits of the linear address, 0001001000, which is decimal 72, and uses that as the index to find the corresponding entry in the page table. The upper 20 bits of that entry point to a physical memory page; appending 12 zero bits gives the start address of the physical page. Suppose that address is 0x620000. The lowest 12 bits of the linear address are 0x368, so the entry address of the test() function is 0x620000 + 0x368 = 0x620368.
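
The arithmetic of this walk can be reproduced in a few lines of C. This is only a sketch of the address split, with hypothetical names; the page frame address 0x620000 is the assumed value from the text, not something read from real page tables.

```c
#include <assert.h>
#include <stdint.h>

struct linear_parts {
    unsigned dir;     /* upper 10 bits: index into the page directory */
    unsigned table;   /* middle 10 bits: index into the page table */
    unsigned offset;  /* lowest 12 bits: offset within the page */
};

/* Split a 32-bit linear address the way two-level paging does. */
static struct linear_parts split_linear(uint32_t addr)
{
    struct linear_parts p;

    p.dir    = (unsigned)(addr >> 22);
    p.table  = (unsigned)((addr >> 12) & 0x3FFu);
    p.offset = (unsigned)(addr & 0xFFFu);
    return p;
}

/* Combine the start address of a page frame with the offset of addr. */
static uint32_t physical_address(uint32_t frame_base, uint32_t addr)
{
    assert((frame_base & 0xFFFu) == 0);  /* frames are 4 KB aligned */
    return frame_base | (addr & 0xFFFu);
}
```

For 0x08048368 the split gives directory index 32, table index 72 and offset 0x368, and with the assumed frame address 0x620000 the result is 0x620368, as computed above.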
