Virtual memory mapping mechanism of Linux programs

Source: Internet
Author: User

1) Explanation of virtual memory:

The core concept of virtual memory is that the memory addresses used by code are independent of physical addresses.
In user space, a virtual address A in one process and the same virtual address A in another process refer to different physical memory; neither process reaches the other's memory through it.
Every time the CPU issues an instruction that accesses memory, the hardware translates the virtual address into a physical address.
This translation from virtual address to physical address is done by the Memory Management Unit (MMU).
A virtual address can also be called a logical address.
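
The point that one virtual address can refer to different physical memory in different processes can be observed with fork(). The following is a minimal sketch, assuming a POSIX system; the names observation, counter, and observe_after_fork are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* What one process saw: the virtual address of `counter` and its value. */
struct observation {
    uintptr_t address;
    int value;
};

/* One variable; after fork() the child's write lands in its own physical page. */
static int counter;

/*
 * Fork, let the child overwrite `counter`, and collect what the parent and
 * the child each see. Returns 0 on success and -1 on failure.
 */
int observe_after_fork(struct observation *parent, struct observation *child)
{
    int fds[2];

    counter = 1;
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: change its private copy and report back through the pipe. */
        struct observation seen;
        close(fds[0]);
        counter = 2;
        seen.address = (uintptr_t)&counter;
        seen.value = counter;
        ssize_t written = write(fds[1], &seen, sizeof seen);
        _exit(written == (ssize_t)sizeof seen ? 0 : 1);
    }

    /* Parent: read the child's report, then record its own view. */
    close(fds[1]);
    ssize_t got = read(fds[0], child, sizeof *child);
    close(fds[0]);
    waitpid(pid, NULL, 0);

    parent->address = (uintptr_t)&counter;
    parent->value = counter;
    return got == (ssize_t)sizeof *child ? 0 : -1;
}
```

Both processes report the same virtual address for counter, yet the parent still reads 1 while the child reads 2, so the two identical addresses must be backed by different physical memory.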

2) Memory management unit (MMU):

The MMU is part of the CPU. As a rule of thumb, a CPU that has a cache also has an MMU, and vice versa.
The MMU can map accesses by two processes to the same logical address onto different physical addresses.
The MMU works closely with the cache, and data moves between RAM and the cache as required.
The MMU divides memory into pages, the smallest unit in which physical memory is managed. On x86, each page covers 4 KB of address space.
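
As a small illustration of page granularity, the sketch below asks the running system for its page size and rounds an address down to the start of its page. It assumes a POSIX system that provides sysconf(_SC_PAGESIZE); the helper names system_page_size and page_start are hypothetical.

```c
#include <stdint.h>
#include <unistd.h>

/* Page size reported by the running system, or 4096 if the query fails. */
unsigned long system_page_size(void)
{
    long size = sysconf(_SC_PAGESIZE);
    return size > 0 ? (unsigned long)size : 4096ul;
}

/* Start address of the page containing addr; page_size must be a power of two. */
uintptr_t page_start(uintptr_t addr, unsigned long page_size)
{
    return addr & ~((uintptr_t)page_size - 1);
}
```

With 4 KB pages, page_start(0x08048354, 4096) is 0x08048000: the low 12 bits are the offset inside the page and are cleared.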

3) Mapping of virtual memory to physical memory:

3.1) The mapping process:

How a virtual address is translated into a physical address depends on the architecture. On x86 CPUs the translation goes through two stages, segmentation and paging:

Virtual address (logical address) --- segment mapping ---> linear address --- page mapping ---> physical address

Linux uses this segment-plus-page scheme only because the hardware architecture of Intel's x86 CPUs requires it. The double mapping is unnecessary in itself, and Linux sets it up so that the segment stage has no effect.
A virtual address can therefore be understood as a linear address.
The following program is used to trace the mapping from virtual address to linear address to physical address, taking x86 as the example:


vi hello.c

#include <stdio.h>

int greeting(void) {
    printf("Hello world!\n");
    return 0;
}

int main(void) {
    greeting();
    return 0;
}

Write this Hello World program and compile it with gcc hello.c -o hello, then disassemble it:

objdump -xd hello

Here we mainly look at the code of main and greeting:

08048354 <greeting>:
 8048354: 55                      push   %ebp
 8048355: 89 e5                   mov    %esp,%ebp
 8048357: 83 ec 08                sub    $0x8,%esp
 804835a: c7 04 24 70 84 04 08    movl   $0x8048470,(%esp)
 8048361: e8 2e ff ff ff          call   8048294 <puts@plt>
 8048366: b8 00 00 00 00          mov    $0x0,%eax
 804836b: c9                      leave
 804836c: c3                      ret

0804836d <main>:
 804836d: 8d 4c 24 04             lea    0x4(%esp),%ecx
 8048371: 83 e4 f0                and    $0xfffffff0,%esp
 8048374: ff 71 fc                pushl  0xfffffffc(%ecx)
 8048377: 55                      push   %ebp
 8048378: 89 e5                   mov    %esp,%ebp
 804837a: 51                      push   %ecx
 804837b: 83 ec 04                sub    $0x4,%esp
 804837e: e8 d1 ff ff ff          call   8048354 <greeting>
 8048383: b8 00 00 00 00          mov    $0x0,%eax
 8048388: 83 c4 04                add    $0x4,%esp
 804838b: 59                      pop    %ecx
 804838c: 5d                      pop    %ebp
 804838d: 8d 61 fc                lea    0xfffffffc(%ecx),%esp
 8048390: c3                      ret

The function main() invokes greeting() through the instruction call 8048354 <greeting>.

First, note that the linker ld assigned greeting() the address 0x08048354. In an ELF executable, ld lays out the code segment in the region starting at 0x08000000 (by default at 0x08048000 on i386), and it does so for every program.
Where the program actually sits in physical memory at run time is decided by the kernel when it sets up the memory mapping for the process; the specific address depends on which physical pages were allocated at that moment. This is completely transparent to us.
By the time the program runs, the mapping mechanism is already in place.


3.2) Segment Mapping

In the example above, when greeting() is invoked the current address is 0x08048354, which is the value of the EIP register. What, then, is the value of CS?
The CS register holds a segment selector for the segment mapping; it can be thought of as an index.
Linux uses only four selectors:

Selector      Value   Index (13 bits)    TI   RPL
__KERNEL_CS   0x10    0000 0000 00010    0    00
__KERNEL_DS   0x18    0000 0000 00011    0    00
__USER_CS     0x23    0000 0000 00100    0    11
__USER_DS     0x2b    0000 0000 00101    0    11

That is:
__KERNEL_CS   index=2   TI=0   RPL=0
__KERNEL_DS   index=3   TI=0   RPL=0
__USER_CS     index=4   TI=0   RPL=3
__USER_DS     index=5   TI=0   RPL=3
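
The split of a selector into index, TI, and RPL can be sketched in a few lines of C. This is an illustration only; the struct selector_fields and the function decode_selector are hypothetical names, not kernel code.

```c
#include <stdint.h>

/* The three fields packed into a 16-bit segment selector. */
struct selector_fields {
    unsigned index; /* bits 15-3: entry number in the descriptor table */
    unsigned ti;    /* bit 2: 0 selects the GDT, 1 selects the LDT */
    unsigned rpl;   /* bits 1-0: requested privilege level */
};

struct selector_fields decode_selector(uint16_t selector)
{
    struct selector_fields fields;
    fields.index = (unsigned)selector >> 3;
    fields.ti = ((unsigned)selector >> 2) & 1u;
    fields.rpl = (unsigned)selector & 3u;
    return fields;
}
```

For example, decode_selector(0x23) gives index 4, TI 0, and RPL 3, which matches the __USER_CS row.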

Notes on the selectors:
1) The segment registers are assigned as follows:
CS = __USER_CS
DS = __USER_DS
ES = __USER_DS
SS = __USER_DS
Because our program runs in user space, both the code segment and the data segment use the __USER_xx selectors.

2) The TI bit selects the descriptor table: the GDT (global descriptor table) or the LDT (local descriptor table).
TI = 0 selects the GDT.
TI = 1 selects the LDT.
In Linux, TI is almost always 0. The kernel makes essentially no use of the LDT; it is used only in VM86 mode and by emulators such as Wine that run Windows or DOS applications on Linux.

3) For the RPL, Linux uses only two levels, 0 and 3.
0 means kernel level and 3 means user level.

From this analysis, our program is clearly a user process, so its code segment selector is __USER_CS.
The value in the CS register is therefore 0x23, and the index is 4: binary 100 = decimal 4.

So what is entry 4 of the GDT?
Let's take a look at the GDT:

ENTRY(gdt_table)
    .quad 0x0000000000000000 /* NULL descriptor */
    .quad 0x0000000000000000 /* not used */
    .quad 0x00cf9a000000ffff /* 0x10 kernel 4GB code at 0x00000000 */
    .quad 0x00cf92000000ffff /* 0x18 kernel 4GB data at 0x00000000 */
    .quad 0x00cffa000000ffff /* 0x23 user 4GB code at 0x00000000 */
    .quad 0x00cff2000000ffff /* 0x2b user 4GB data at 0x00000000 */
    .quad 0x0000000000000000 /* not used */
    .quad 0x0000000000000000 /* not used */

The GDT entry with index 4 is:
    .quad 0x00cffa000000ffff /* 0x23 user 4GB code at 0x00000000 */
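
The lookup itself is plain array indexing. The sketch below copies the eight descriptor values from the listing into a C array; the helper names gdt_byte_offset and gdt_lookup are hypothetical, and returning the null descriptor for an out-of-range index is a choice made only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define GDT_ENTRIES 8u

/* The eight descriptors from the gdt_table listing, in order. */
static const uint64_t gdt_table[GDT_ENTRIES] = {
    0x0000000000000000ULL, /* null descriptor */
    0x0000000000000000ULL, /* not used */
    0x00cf9a000000ffffULL, /* 0x10 kernel code */
    0x00cf92000000ffffULL, /* 0x18 kernel data */
    0x00cffa000000ffffULL, /* 0x23 user code */
    0x00cff2000000ffffULL, /* 0x2b user data */
    0x0000000000000000ULL, /* not used */
    0x0000000000000000ULL  /* not used */
};

/* Each descriptor is 8 bytes, so the byte offset is index * 8. */
size_t gdt_byte_offset(uint16_t selector)
{
    return (size_t)(selector >> 3) * sizeof(uint64_t);
}

/* Descriptor selected by the index field; 0 if the index is out of range. */
uint64_t gdt_lookup(uint16_t selector)
{
    unsigned index = (unsigned)selector >> 3;
    return index < GDT_ENTRIES ? gdt_table[index] : 0;
}
```

Because the index occupies bits 15-3 of the selector, clearing the low three bits of 0x23 gives 0x20, which is exactly the byte offset of entry 4.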

Now expand these four descriptors bit by bit:

Bits                             63-60 59-56 55-52 51-48 47-44 43-40 39-36 35-32 31-28 27-24 23-20 19-16 15-12 11-8  7-4   3-0
KERNEL_CS  0x00cf9a000000ffff    0000  0000  1100  1111  1001  1010  0000  0000  0000  0000  0000  0000  1111  1111  1111  1111
KERNEL_DS  0x00cf92000000ffff    0000  0000  1100  1111  1001  0010  0000  0000  0000  0000  0000  0000  1111  1111  1111  1111
USER_CS    0x00cffa000000ffff    0000  0000  1100  1111  1111  1010  0000  0000  0000  0000  0000  0000  1111  1111  1111  1111
USER_DS    0x00cff2000000ffff    0000  0000  1100  1111  1111  0010  0000  0000  0000  0000  0000  0000  1111  1111  1111  1111

The descriptor fields are interpreted as follows:
Bits 63-56 hold bits 31-24 of the base address; they are all 0.
Bit 55 is the G (granularity) bit. It is 1 for all four segments in Linux: 1 means the segment limit is counted in units of 4 KB, 0 means it is counted in bytes.
Bit 54 is the D bit. It is 1 for all four segments in Linux: 1 means the segment is accessed with 32-bit instructions, 0 means 16-bit instructions.
Bit 53 is 0.
Bit 52 is ignored by the CPU and is available to software.
Bits 51-48 hold bits 19-16 of the segment limit; they are all 1.
Bit 47 is the P (present) bit. It is 1 in Linux, meaning all four segments are present in memory.
Bits 46-45 are the DPL, the privilege level of the segment. Only the combinations 00 (level 0) and 11 (level 3) occur.
Bit 44 is the S bit: 1 means an ordinary code or data segment, 0 means a system segment used for system management, such as the various descriptor tables.
Bits 43-41 are the type bits, which are closely related to one another:
If bit 43 (the E bit) is 1, the segment is a code segment. Bit 42 is then the C (conforming) bit: 0 means a non-conforming segment, which can be entered directly only from code at the same privilege level; 1 means a conforming segment, which less privileged code may also enter and which then runs at the caller's privilege level. Bit 41 is then the R bit: 1 means the segment is readable, 0 means it is not readable.
If bit 43 is 0, the segment is a data segment. Bit 42 is then the ED bit: 0 means the segment grows upward (data segment), 1 means it grows downward (stack segment). Bit 41 is then the W bit: 1 means writable, 0 means not writable.
Bit 40 is the A (accessed) bit. The CPU sets it to 1 once the segment has been accessed; in the table above it is initialized to 0.
Bits 39-16 hold bits 23-0 of the base address; they are all 0.
Bits 15-0 hold bits 15-0 of the segment limit; they are all 1.

Conclusion: each segment covers the entire 4 GB virtual address space starting at address 0, so the mapping from virtual address to linear address leaves the value unchanged.
As a result, when discussing page mapping in the Linux kernel, a linear address can be treated as the virtual address. The two are exactly the same.
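
This conclusion can be checked by decoding a descriptor in code. The following is a minimal sketch based on the field layout described in this section; the names segment, decode_descriptor, segment_size, and to_linear are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* The descriptor fields needed to reason about the segment mapping. */
struct segment {
    uint32_t base;    /* descriptor bits 63-56 and 39-16 */
    uint32_t limit;   /* descriptor bits 51-48 and 15-0 */
    unsigned g;       /* bit 55: 1 means the limit counts 4 KB units */
    unsigned dpl;     /* bits 46-45: privilege level */
    unsigned is_code; /* bit 43: 1 for a code segment, 0 for a data segment */
};

struct segment decode_descriptor(uint64_t d)
{
    struct segment s;
    s.base = (uint32_t)(((d >> 16) & 0xffffffu) | (((d >> 56) & 0xffu) << 24));
    s.limit = (uint32_t)((d & 0xffffu) | (((d >> 48) & 0xfu) << 16));
    s.g = (unsigned)((d >> 55) & 1u);
    s.dpl = (unsigned)((d >> 45) & 3u);
    s.is_code = (unsigned)((d >> 43) & 1u);
    return s;
}

/* Size of the segment in bytes: (limit + 1) units of 4 KB or of 1 byte. */
uint64_t segment_size(const struct segment *s)
{
    uint64_t units = (uint64_t)s->limit + 1;
    return s->g ? units << 12 : units;
}

/* Segment mapping: linear address = segment base + offset. */
uint32_t to_linear(const struct segment *s, uint32_t offset)
{
    return s->base + offset;
}
```

Decoding 0x00cffa000000ffff yields base 0, limit 0xfffff, and G = 1, that is a segment of 0x100000 units of 4 KB, or 4 GB. With a base of 0, to_linear returns 0x08048354 for the offset 0x08048354.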


3.3) Page Mapping

3.3.1) The concept of page mapping:

1) On the i386 CPU, the basic idea of paging is that the mapping from linear address to physical address is done in two levels, through the page directory and the page table.
2) Linux has to accommodate many different CPUs, so it designs a generic model around a hypothetical, virtual CPU and MMU and then maps that model onto each specific CPU.
For this reason the Linux kernel's mapping mechanism is designed with three levels: a "middle directory" is inserted between the page directory and the page table.
On i386, the logical three-level mapping collapses to two levels for the CPU and MMU by skipping the middle directory (PMD), but the software structures still keep the three-level framework.
3) The page directory is called the PGD, the middle directory is called the PMD, and the page table is called the PT. An entry in the PT is called a PTE.
The page directory, the middle directory, and the page table are all arrays.
4) Logically, a linear address is divided into four fields: the index into the page directory (PGD), the index into the middle directory (PMD), the index into the page table (PT), and the offset within the physical page. Because the i386 CPU has no middle directory,
its linear address is divided into three fields: the index into the PGD, the index into the PT, and the offset within the physical page.
5) Each process has its own page directory and page tables. On a process switch, the starting address of the new process's page directory is loaded into the CR3 register.
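
How a three-level framework folds down to two levels can be sketched with a few constants. The names below imitate the style of kernel macros, but they are illustrative stand-ins, and the values assume classic 32-bit i386 paging with 4 KB pages and a middle directory that has a single entry.

```c
#include <stdint.h>

/* Assumed layout: 10-bit PGD index, folded PMD, 10-bit PT index, 12-bit offset. */
enum {
    PAGE_SHIFT = 12,
    PMD_SHIFT = 22,     /* same as PGDIR_SHIFT: the PMD level is folded away */
    PGDIR_SHIFT = 22,
    PTRS_PER_PTE = 1024,
    PTRS_PER_PMD = 1,   /* a single entry, so the PMD index is always 0 */
    PTRS_PER_PGD = 1024
};

unsigned pgd_index(uint32_t addr)
{
    return (unsigned)(addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

unsigned pmd_index(uint32_t addr)
{
    return (unsigned)(addr >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}

unsigned pte_index(uint32_t addr)
{
    return (unsigned)(addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

unsigned offset_in_page(uint32_t addr)
{
    return (unsigned)addr & ((1u << PAGE_SHIFT) - 1);
}
```

The code still asks for a PMD index, which keeps the three-level structure visible in software, but with PTRS_PER_PMD set to 1 the answer is always 0, so only the PGD index, the PT index, and the page offset carry information.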


3.3.2) Linear address to physical address mapping:

1) The starting address of the process's page directory is loaded into register CR3.
2) The first field of the linear address (the PGD index) selects a page directory entry, which gives the physical address of the page table. The page directory is 4 KB, exactly one page, and contains 1024 entries of 4 bytes (32 bits) each.
3) The second field of the linear address (the PT index) selects a page table entry. A page table is also 4 KB and also contains 1024 entries of 4 bytes (32 bits) each.
4) The high 20 bits of the page table entry, followed by 12 zero bits, form the starting physical address of the page, so these 20 bits are the high 20 bits of the physical address. Adding the third field of the linear address, the 12-bit offset, gives the final physical address.
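
The four steps above can be simulated in user space. The sketch below models a tiny physical memory as an array, with the page directory assumed to sit at physical address 0x1000 and one page table at 0x2000. All sim_ names are hypothetical, the "present" flag in bit 0 comes from the i386 entry format rather than from this text, and the return value SIM_NO_MAPPING is an invention of the sketch that stands in for a page fault.

```c
#include <stdint.h>

#define SIM_ENTRIES 1024u
#define SIM_PRESENT 0x1u           /* bit 0 of an i386 entry: present */
#define SIM_FRAME_MASK 0xfffff000u /* high 20 bits: physical frame address */
#define SIM_NO_MAPPING 0xffffffffu /* hypothetical marker for "would fault" */

/* Simulated physical memory covering 0x0000-0x2fff (three 4 KB frames). */
static uint32_t sim_ram[3 * SIM_ENTRIES];

static uint32_t sim_read32(uint32_t phys)
{
    return sim_ram[phys / 4];
}

static void sim_write32(uint32_t phys, uint32_t value)
{
    sim_ram[phys / 4] = value;
}

/* Install linear -> frame using the page table at pt_phys (must lie in sim_ram). */
void sim_map(uint32_t cr3, uint32_t pt_phys, uint32_t linear, uint32_t frame)
{
    sim_write32(cr3 + (linear >> 22) * 4, pt_phys | SIM_PRESENT);
    sim_write32(pt_phys + ((linear >> 12) & 0x3ffu) * 4, frame | SIM_PRESENT);
}

/* Walk page directory and page table the way steps 1) to 4) describe. */
uint32_t sim_translate(uint32_t cr3, uint32_t linear)
{
    /* Step 2: the high 10 bits index the page directory. */
    uint32_t pde = sim_read32(cr3 + (linear >> 22) * 4);
    if (!(pde & SIM_PRESENT))
        return SIM_NO_MAPPING;

    /* Step 3: the middle 10 bits index the page table. */
    uint32_t pte = sim_read32((pde & SIM_FRAME_MASK) + ((linear >> 12) & 0x3ffu) * 4);
    if (!(pte & SIM_PRESENT))
        return SIM_NO_MAPPING;

    /* Step 4: high 20 bits of the entry plus the 12-bit offset. */
    return (pte & SIM_FRAME_MASK) | (linear & 0xfffu);
}
```

After sim_map(0x1000, 0x2000, 0x08048354, 0x740000), where 0x740000 is a made-up frame address, sim_translate(0x1000, 0x08048354) returns 0x740354. An address whose entries were never installed returns SIM_NO_MAPPING.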


3.3.3) The mapping process illustrated with an example:

Step one: find the page table through the page directory
Take the program above as the example again.
When the hello program runs and calls the function greeting, the virtual address, which is also the linear address, is 0x08048354:
call 08048354 <greeting>
In binary this address is:
0000 1000 0000 0100 1000 0011 0101 0100
First field (high 10 bits):
0000 1000 00
This is 32 in decimal, so entry 32 of the page directory holds the physical address of the page table, that is, a pointer to the page table. The low 12 bits of that pointer are 0, because a page table is 4 KB in size and must be aligned on a 4 KB boundary.

Step two: find the starting physical address of the page through the page table
Next comes the second field of the linear address (middle 10 bits):
00 0100 1000
This is 72 in decimal, so entry 72 of the page table just found holds the starting physical address of the target page: the high 20 bits are the valid address bits and the low 12 bits are filled with 0.

Step three: get the final physical address
The final physical address is the starting physical address of the page plus the third field of the linear address, the offset.
For example:
Third field: 0011 0101 0100
In hexadecimal this is 0x354.
If the starting physical address of the target page is 0x740000, then the final physical address is:
0x740000 + 0x354 = 0x740354

This article is from the CSDN blog. When reproducing it, please credit the source: http://blog.csdn.net/wishfly/archive/2010/05/21/5613931.aspx
