Article title: Linux memory View (1).
Contents
Each subsystem provides interfaces to the other subsystems. You do not need to go deep into the details of every subsystem; understanding the interfaces between them is enough to develop kernel-level programs.
Kernel address space layout
Initialization and fixed mapping
Boot memory (bootmem)
High memory
VM and vmalloc
Physical memory management
Slab management
Page cache
Swap cache and swap file
Virtual memory management (vma)
Swap out
Swap in
MM fault handling
mmap
My understanding is as follows:
1. The topic can be divided into two parts:
Memory management of kernel space
Memory management of user space
2.
As you said, the core of user-space management is mapping. The mapping operation itself is carried out automatically by the CPU, but what the mapping is gets decided by Linux.
As in mathematics, a mapping has three elements:
Domain
|
| Mapping rule
V
Range
Therefore, to define a mapping, you must:
Allocate a domain in user space (vm_area_struct allocation and related operations);
Allocate a range in the physical address space (allocation on the kernel side, by the page-level allocator);
Define the mapping rule (page table operations).
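The three steps can be observed from user space on Linux. The following is a minimal sketch, not kernel code: it assumes a Linux system with glibc, and the helper names page_is_resident() and map_touch_and_check() are made up for illustration. mmap() only reserves the domain; the frame (range) and the page table entry (rule) normally appear at the first touch, which mincore() lets us check.

```c
#define _DEFAULT_SOURCE          /* MAP_ANONYMOUS and mincore() on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if the page at 'addr' is resident in physical memory,
 * 0 if it is not, and -1 on error. 'addr' must be page-aligned.
 */
static int page_is_resident(void *addr)
{
    unsigned char vec = 0;
    long page = sysconf(_SC_PAGESIZE);

    if (mincore(addr, (size_t)page, &vec) != 0)
        return -1;
    return vec & 1;
}

/*
 * Walks through the three elements for one anonymous page.
 * Stores the residency before the first touch in *before and
 * returns the residency after the first touch (-1 on error).
 */
static int map_touch_and_check(int *before)
{
    long page = sysconf(_SC_PAGESIZE);
    char *p;
    int after;

    assert(before != NULL);

    /* 1. Domain: reserve a range of virtual addresses (a new VMA). */
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *before = page_is_resident(p);

    /* 2 and 3. The first write faults; the kernel allocates a frame
     * (range) and installs the page table entry (rule). */
    p[0] = 42;

    after = page_is_resident(p);
    munmap(p, (size_t)page);
    return after;
}
```

Calling map_touch_and_check(&before) from a small main() should return 1 on a typical system, while before is usually 0 because nothing has been faulted in yet.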
3.
Other operations can also be viewed in terms of these three elements.
Take swapping as an example:
Swapping out "moves" part of the range onto a peripheral device: the preimage end of the mapping stays fixed, while the image end is moved to the device.
A page fault during swap-in
simply moves part of the "range" back into memory.
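A toy model may make this view of swapping more concrete. It is only a sketch under simplifying assumptions (tiny 16-byte "pages", fixed arrays standing in for RAM and the swap device, hypothetical names such as toy_pte and toy_swap_out); the real kernel structures are far richer. Note that the page table entry, the preimage side, never moves; only what it points at changes.

```c
#include <assert.h>
#include <string.h>

#define PAGE_BYTES 16   /* tiny "pages" keep the toy small */
#define NFRAMES    2    /* toy physical memory */
#define NSLOTS     4    /* toy swap device */

/* Toy PTE: points at a frame when present, otherwise at a swap slot. */
struct toy_pte { int present; int index; };

static char frames[NFRAMES][PAGE_BYTES];
static char swapdev[NSLOTS][PAGE_BYTES];

/* Swap out: copy the frame to a slot and retarget the PTE at the slot. */
static void toy_swap_out(struct toy_pte *pte, int slot)
{
    assert(pte->present && slot >= 0 && slot < NSLOTS);
    memcpy(swapdev[slot], frames[pte->index], PAGE_BYTES);
    pte->present = 0;
    pte->index = slot;
}

/* Swap in: bring the data back into a free frame and fix up the PTE. */
static void toy_swap_in(struct toy_pte *pte, int frame)
{
    assert(!pte->present && frame >= 0 && frame < NFRAMES);
    memcpy(frames[frame], swapdev[pte->index], PAGE_BYTES);
    pte->present = 1;
    pte->index = frame;
}

/* Access through the mapping; a non-present PTE "faults" and swaps in. */
static char *toy_access(struct toy_pte *pte, int free_frame)
{
    if (!pte->present)
        toy_swap_in(pte, free_frame);
    return frames[pte->index];
}
```

After toy_swap_out() the data lives only in swapdev[]; the next toy_access() plays the role of the page fault handler and restores it, possibly into a different frame than before.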
Kernel Address Space Layout
Let's calculate how much space the page tables occupy if the full 4 GB space is mapped. Each page table (pte table) is 4 KB and maps 4 MB. The pgd is itself one 4 KB page with 1024 entries, each pointing to one page table. So 4 KB * (1024 + 1) = 4 MB + 4 KB of space is required.
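The same arithmetic can be written as a small function. This is a sketch assuming classic two-level i386 paging with 4 KB pages and 4-byte entries; the name page_table_overhead() is invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096u   /* 4 KB pages */
#define ENTRY_SIZE_BYTES  4u      /* 32-bit pgd/pte entries */

/* Bytes of page-table memory needed to map 'mapped_bytes' with 2-level paging. */
static uint64_t page_table_overhead(uint64_t mapped_bytes)
{
    uint64_t entries_per_table = PAGE_SIZE_BYTES / ENTRY_SIZE_BYTES;   /* 1024 */
    uint64_t bytes_per_table   = entries_per_table * PAGE_SIZE_BYTES;  /* 4 MB */
    uint64_t pte_tables;

    assert(mapped_bytes > 0);
    /* Number of pte tables, rounding up to a whole table. */
    pte_tables = (mapped_bytes + bytes_per_table - 1) / bytes_per_table;
    return (pte_tables + 1) * PAGE_SIZE_BYTES;   /* +1 page for the pgd */
}
```

For the full 4 GB this gives 1024 pte tables plus the pgd, that is 4 MB + 4 KB; for the kernel's 1 GB alone it gives 256 tables plus the pgd.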
The kernel pgd is swapper_pg_dir, which is statically allocated. During system initialization its first 768 entries are empty; that is, only the space from 3 GB upward is initialized, and the kernel's virtual addresses are fixed to start at 3 GB at compile time. The kernel uses this page directory for addressing. The mapping set up during initialization is called the pre-mapping. The pre-mapping maps physical memory into the kernel (up to the size of the physical zone described below), and the physical-to-virtual conversion is very simple, so the kernel does not need to maintain a virtual address space of its own and can easily access user space.
As we all know, the __pa macro relies on this pre-mapping. The kernel has its own pgd, so in that sense the kernel's virtual space is independent of every other program. Why, then, do we say that user space is the low 3 GB and the kernel the top 1 GB, shared by all processes? In fact, the first 768 entries of a process's page directory point to that process's user space. When the process needs to access kernel space, for example in a system call, the entries from 768 onward in its page directory point to the same page tables as the entries from 768 onward in swapper_pg_dir, and kernel space is reached through them. Once a process traps into the kernel, the kernel mappings of swapper_pg_dir are used for addressing. This does not happen by switching to swapper_pg_dir directly; rather, the entries from 768 onward in the process's pgd are kept consistent with those of swapper_pg_dir and so share the kernel page tables. (Is that because the pgd is not switched? See the related handling in do_page_fault ^_^)
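The simple physical-to-virtual conversion and the "entry 768" boundary can be sketched as follows. This assumes the usual i386 values (kernel base at 3 GB, 4 MB per pgd entry); the toy_ prefixed names are illustrative stand-ins, not the kernel's own macros, and the copy function only imitates the idea of sharing kernel entries between page directories.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_OFFSET   0xC0000000u  /* assumed kernel base: 3 GB */
#define PGDIR_SHIFT   22           /* each pgd entry covers 4 MB */
#define PTRS_PER_PGD  1024

/* Pre-mapping: virtual = physical + PAGE_OFFSET, so conversion is one add/sub. */
static uint32_t toy_pa(uint32_t vaddr)
{
    assert(vaddr >= PAGE_OFFSET);
    return vaddr - PAGE_OFFSET;
}

static uint32_t toy_va(uint32_t paddr)
{
    return paddr + PAGE_OFFSET;
}

/* Index of the pgd entry that covers a virtual address. */
static unsigned toy_pgd_index(uint32_t vaddr)
{
    return vaddr >> PGDIR_SHIFT;
}

/* Make a process pgd share the kernel part (entries 768..1023) of the kernel pgd. */
static void toy_share_kernel_entries(uint32_t *proc_pgd, const uint32_t *kernel_pgd)
{
    unsigned first = toy_pgd_index(PAGE_OFFSET);   /* 768 */

    memcpy(proc_pgd + first, kernel_pgd + first,
           (PTRS_PER_PGD - first) * sizeof(uint32_t));
}
```

Because every process pgd carries the same upper 256 entries, entering the kernel needs no page directory switch: the kernel addresses are already mapped in whichever pgd is current.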
Linux divides its 1 GB of kernel linear space into several parts:
1) Linux divides the entire 4 GB linear address space into user space and kernel space. The kernel address space is further divided into the "physical memory zone", the "virtual memory allocation zone", the "high-memory page mapping zone", the "dedicated page mapping zone", and the "system reserved mapping zone".
2) In the standard configuration, the maximum length of the physical zone is 896 MB, and the system's physical memory is mapped sequentially into it. On machines that support page size extension (PSE) and global pages (PGE), the physical zone uses 4 MB pages and treats them as global pages. When the system has more than 896 MB of physical memory, the part that does not fit in the physical zone is called high memory. The boundary between low memory and high memory is given by the highmem_start_page variable. The kernel must map high memory into the "high-memory page mapping zone" before it can access it.
3) Linux keeps the topmost part of the kernel space as a reserved area (128 K in the layout diagram below). Immediately below the reserved area is the dedicated page mapping zone, whose total size and the purpose of each page are fixed at compile time by the fixed_addresses enumeration. __fix_to_virt(index) returns the virtual address of a predefined page in this zone. A high-memory mapping page is predefined for each CPU in the dedicated zone; it is used to map high-memory pages during interrupt handling.
4) A 4 MB segment starting 32 MB below the top of the kernel space is the high-memory mapping zone. It occupies exactly the linear range covered by one page table, so it can hold the mappings of 1024 high-memory pages. The virtual memory allocation zone lies between the physical zone and the high-memory mapping zone and is used by vmalloc(). An 8 MB guard gap separates its start from the physical zone, and an 8 KB gap separates its end from the high-memory mapping zone (4 KB in 2.4?).
5) When the system has more than 4 GB of physical memory, the 64-bit page directory entries provided by the CPU's Physical Address Extension (PAE) mode must be used to reach memory above 4 GB. In PAE mode, linear-to-physical address translation uses a three-level page table. The first-level page directory is indexed by the top 2 bits of the linear address, and each entry covers 1 GB of address space. The second-level page directory is indexed by the next 9 bits, and each entry covers 2 MB. The third-level page table is indexed by the following 9 bits, and each entry maps one 4 KB page frame. Apart from the physical address field being widened to 36 bits, 64-bit entries have the same structure as 32-bit ones. In PAE mode, the page size mapped by a middle-level directory entry with the PSE bit set drops from 4 MB to 2 MB.
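The 2/9/9/12 split described in item 5 can be illustrated with a little bit arithmetic. This is a sketch only; the struct pae_index type and the pae_split()/pae_join() helpers are hypothetical names, and nothing here touches real page tables.

```c
#include <assert.h>
#include <stdint.h>

/* Indices extracted from a 32-bit linear address in PAE mode. */
struct pae_index {
    unsigned pgd;     /* top 2 bits: 1 GB per entry */
    unsigned pmd;     /* next 9 bits: 2 MB per entry */
    unsigned pte;     /* next 9 bits: 4 KB per entry */
    unsigned offset;  /* low 12 bits: byte within the page */
};

static struct pae_index pae_split(uint32_t vaddr)
{
    struct pae_index ix;

    ix.pgd    = (vaddr >> 30) & 0x3;
    ix.pmd    = (vaddr >> 21) & 0x1FF;
    ix.pte    = (vaddr >> 12) & 0x1FF;
    ix.offset = vaddr & 0xFFF;
    return ix;
}

/* Inverse of pae_split(), useful as a sanity check. */
static uint32_t pae_join(struct pae_index ix)
{
    assert(ix.pgd < 4 && ix.pmd < 512 && ix.pte < 512 && ix.offset < 4096);
    return ((uint32_t)ix.pgd << 30) | ((uint32_t)ix.pmd << 21) |
           ((uint32_t)ix.pte << 12) | ix.offset;
}
```

With this split, the kernel base 0xC0000000 falls at the start of the fourth first-level entry (index 3), which matches the 3 GB / 1 GB division described earlier.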
Figure: the kernel's 1 GB linear space (gray indicates that a mapping has been established; only the physical zone is fully mapped).
Linear space, from low to high addresses: physical zone | 8 M isolation gap | vmalloc zone | 8 K isolation gap | 4 M high-memory mapping zone | fixed mapping zone | 128 K reserved area
Physical memory, as shown beneath the linear space: memory corresponding to the physical zone | high memory currently mapped into the high-memory mapping zone | other high physical memory
The following code is given for reference (the analysis above does not seem to match 2.4.0 ^_^):
The following code is taken from include/asm-i386/pgtable.h
/* Just any arbitrary offset to the start of the vmalloc VM area: the
 * current 8 MB value just means that there will be a 8 MB "hole" after the
 * physical memory until the kernel virtual memory starts. That means that
 * any out-of-bounds memory accesses will hopefully be caught.
 * The vmalloc() routines leaves a hole of 4kB between each vmalloced
 * area for the same reason. ;)
 */
#define VMALLOC_OFFSET (8*1024*1024)
#define VMALLOC_START (((unsigned long) high_memory + 2*VMALLOC_OFFSET-1) & ~(VMALLOC_OFFSET-1))
#define VMALLOC_VMADDR(x) ((unsigned long)(x))
#define VMALLOC_END (FIXADDR_START)
Here we can see the hole between the physical zone and the VM (vmalloc) zone. There should also be a 4 KB hole between the vmalloc zone and the fixed mapping zone!
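To see how large the hole actually is, the VMALLOC_START expression can be evaluated for a couple of sample values. The sketch below reuses the formula quoted above with 32-bit arithmetic; toy_vmalloc_start() and the sample high_memory values are illustrative assumptions, not taken from a running kernel.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VMALLOC_OFFSET (8u * 1024 * 1024)   /* 8 MB, as in pgtable.h */

/* Same rounding as VMALLOC_START: skip at least 8 MB, then align to 8 MB. */
static uint32_t toy_vmalloc_start(uint32_t high_memory)
{
    /* Guard against 32-bit wrap-around for absurdly high inputs. */
    assert(high_memory <= 0xFFFFFFFFu - 2 * TOY_VMALLOC_OFFSET);
    return (high_memory + 2 * TOY_VMALLOC_OFFSET - 1) & ~(TOY_VMALLOC_OFFSET - 1);
}
```

With 64 MB of RAM (high_memory = 0xC4000000) the vmalloc area starts at 0xC4800000, an 8 MB hole. With 65 MB (high_memory = 0xC4100000) it starts at 0xC5000000, a 15 MB hole. In general the hole is at least 8 MB and less than 16 MB, because the start is rounded up to an 8 MB boundary.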
fixmap.h
From the fixed_addresses enumeration you can see that the high-memory mapping area belongs to the fixed mapping area, and that each CPU has one.
enum fixed_addresses {
#ifdef CONFIG_X86_LOCAL_APIC
    FIX_APIC_BASE, /* local (CPU) APIC) -- required for SMP or not */
#endif
#ifdef CONFIG_X86_IO_APIC
    FIX_IO_APIC_BASE_0,
    FIX_IO_APIC_BASE_END = FIX_IO_APIC_BASE_0 + MAX_IO_APICS-1,
#endif
#ifdef CONFIG_X86_VISWS_APIC
    FIX_CO_CPU, /* Cobalt timer */
    FIX_CO_APIC, /* Cobalt APIC Redirection Table */
    FIX_LI_PCIA, /* Lithium PCI Bridge A */
    FIX_LI_PCIB, /* Lithium PCI Bridge B */
#endif
#ifdef CONFIG_HIGHMEM
    FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */
    FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
#endif
    __end_of_fixed_addresses
};
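Since fixmap indices grow downward from the top of the address space, the virtual address of a slot is a simple subtraction. The sketch below assumes the FIXADDR_TOP value of 0xffffe000 from this header and 4 KB pages; the toy_ names and the per-CPU index formula (slot type plus KM_TYPE_NR times the CPU number, as suggested by FIX_KMAP_END above) are written for illustration, and the parameter values in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT   12            /* 4 KB pages */
#define TOY_FIXADDR_TOP  0xffffe000u   /* assumed: value of FIXADDR_TOP in fixmap.h */

/* Index 0 sits at the top; each further index is one page lower. */
static uint32_t toy_fix_to_virt(unsigned idx)
{
    return TOY_FIXADDR_TOP - ((uint32_t)idx << TOY_PAGE_SHIFT);
}

/* Fixmap index of one CPU's temporary kernel-mapping slot of a given type. */
static unsigned toy_kmap_index(unsigned kmap_begin, unsigned km_type_nr,
                               unsigned type, unsigned cpu)
{
    assert(type < km_type_nr);
    return kmap_begin + type + km_type_nr * cpu;
}
```

For example, with a hypothetical FIX_KMAP_BEGIN of 0 and six slot types per CPU, slot type 2 on CPU 1 has index 8, which toy_fix_to_virt() places at 0xffff6000. Each CPU gets its own run of slots, so no locking is needed between CPUs when a temporary mapping is set up.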
The following definitions in this file are also worth noting:
/*
 * used by vmalloc.c.
 *
 * Leave one empty page between vmalloc'ed areas and
 * the start of the fixmap, and leave one page empty
 * at the top of mem..
 */
#define FIXADDR_TOP (0xffffe000UL)
#define FIXADDR_SIZE (__end_of_fixed_addresses << PAGE_SHIFT)
......
    current->need_resched = 1;
    cpu_idle();
}
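The setup_arch() excerpt that follows relies on three page-frame-number helpers and on a boot-time bitmap returned by init_bootmem(). As a rough sketch, assuming 4 KB pages and a bitmap with one bit per page frame, the arithmetic looks like this. The TOY_ macros mirror PFN_UP, PFN_DOWN and PFN_PHYS; toy_bootmap_bytes() is a hypothetical helper and ignores any extra alignment the real allocator may apply.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/* Byte address -> page frame number, rounding up or down. */
#define TOY_PFN_UP(x)   (((x) + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT)
#define TOY_PFN_DOWN(x) ((x) >> TOY_PAGE_SHIFT)
/* Page frame number -> byte address of the start of that frame. */
#define TOY_PFN_PHYS(x) ((x) << TOY_PAGE_SHIFT)

/* Bitmap size in bytes for frames [start_pfn, end_pfn): one bit per frame. */
static unsigned long toy_bootmap_bytes(unsigned long start_pfn,
                                       unsigned long end_pfn)
{
    assert(end_pfn >= start_pfn);
    return (end_pfn - start_pfn + 7) / 8;
}
```

For 896 MB of low memory there are 229376 page frames, so the bitmap needs 28672 bytes, or seven pages. PFN_UP is used for the first usable frame after the kernel image because a partially used page cannot be handed to the allocator.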
arch/i386/kernel/setup.c
void __init setup_arch(char **cmdline_p)
{
    unsigned long bootmap_size;
    unsigned long start_pfn, max_pfn, max_low_pfn;
    int i;
    .......
    setup_memory_region(); // the e820 map from some systems is unreliable, so a BIOS e820 map may be faked
    .......
    init_mm.start_code = (unsigned long) &_text; // initialize init_mm
    ......
    code_resource.start = virt_to_bus(&_text);
    ......
    data_resource.start = virt_to_bus(&_etext);
    ......
#define PFN_UP(x) (((x) + PAGE_SIZE-1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
#define PFN_PHYS(x) ((x) << PAGE_SHIFT)
    ......
#ifdef CONFIG_HIGHMEM
    ......
    if (max_pfn > MAXMEM_PFN) {
        highstart_pfn = MAXMEM_PFN;
        printk(KERN_NOTICE "%ldMB HIGHMEM available.\n",
            pages_to_mb(highend_pfn - highstart_pfn));
    }
#endif
    /*
     * Initialize the boot-time allocator (with low memory only):
     */
    bootmap_size = init_bootmem(start_pfn, max_low_pfn);
    /*
     * Register all available low memory with the bootmem allocator.
     */
    .......
    /*
     * Reserve the bootmem bitmap itself as well. We do this in two
     * steps (first step was init_bootmem()) because this catches
     * the (very unlikely) case of us accidentally initializing the
     * bootmem allocator with an invalid RAM area.
     */
    reserve_bootmem(HIGH_MEMORY, (PFN_PHYS(start_pfn) +
        bootmap_size + PAGE_SIZE-1) - (HIGH_MEMORY));
    /*
     * Reserve physical page 0 - it's a special BIOS page on many boxes,