Memory management--detecting memory

Source: Internet
Author: User
Tags: ranges

After the bootloader loads the Linux kernel into memory, the CPU first runs the setup code, such as start_of_setup in header.S, and then jumps to main() in main.c. One of the first things main() does is call detect_memory() to probe the memory:

int detect_memory(void)
{
    int err = -1;

    if (detect_memory_e820() > 0)
        err = 0;

    if (!detect_memory_e801())
        err = 0;

    if (!detect_memory_88())
        err = 0;

    return err;
}

The Linux kernel obtains memory information through the detect_memory_xxx functions. Each of them issues a BIOS int 0x15 call; before the call, AX is set to 0xe820 or 0xe801, or AH is set to 0x88, respectively.
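How the three probe results combine can be illustrated in user space. The following is only a sketch, not kernel code: `struct probe_results` and `combine_probes()` are hypothetical names, and the probe outcomes are passed in as plain integers instead of coming from the BIOS.

```c
/* Hypothetical container for the outcome of each probe (not a kernel type). */
struct probe_results {
    int e820_entries; /* number of entries returned by the E820 probe */
    int e801_rc;      /* 0 on success, -1 on failure */
    int e88_rc;       /* 0 on success, -1 on failure */
};

/*
 * Mirrors the return logic of detect_memory(): all three probes are
 * always attempted, and the result is 0 if at least one succeeded.
 */
int combine_probes(const struct probe_results *r)
{
    int err = -1;

    if (r->e820_entries > 0)
        err = 0;
    if (!r->e801_rc)
        err = 0;
    if (!r->e88_rc)
        err = 0;

    return err;
}
```

Note that the probes are not an early-exit chain: a later probe still runs even when E820 succeeded, so its result is available as a fallback.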


For e820:

struct e820entry {
    __u64 addr;    /* start of memory segment */
    __u64 size;    /* size of memory segment */
    __u32 type;    /* type of memory segment */
} __attribute__((packed));

struct e820map {
    __u32 nr_map;
    struct e820entry map[E820_X_MAX];
};
type: the kind of memory segment. It is one of: usable (normal) RAM, reserved (unusable), ACPI reclaimable memory, ACPI NVS memory, or an area containing bad memory. To collect information about all memory segments, detect_memory_e820() uses a do-while loop that repeatedly triggers int 0x15, retrieving one segment per call and saving it in an array of struct e820entry.
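To make the type field concrete, here is a small user-space sketch that names each type and formats one entry as text. The numeric codes 1 to 5 are the standard E820 values; `struct demo_e820entry`, `demo_e820_type_name()` and `demo_e820_format()` are hypothetical names introduced for this illustration, and the packed attribute assumes GCC or Clang.

```c
#include <stdint.h>
#include <stdio.h>

/* Type codes as reported by int 0x15, AX=0xe820. */
enum {
    E820_RAM      = 1, /* usable (normal) RAM */
    E820_RESERVED = 2, /* reserved, unusable */
    E820_ACPI     = 3, /* ACPI reclaimable memory */
    E820_NVS      = 4, /* ACPI NVS memory */
    E820_UNUSABLE = 5  /* area containing bad memory */
};

/* Illustrative copy of the 20-byte packed entry layout. */
struct demo_e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
} __attribute__((packed));

/* Hypothetical helper: map a type code to a readable name. */
const char *demo_e820_type_name(uint32_t type)
{
    switch (type) {
    case E820_RAM:      return "usable";
    case E820_RESERVED: return "reserved";
    case E820_ACPI:     return "ACPI data";
    case E820_NVS:      return "ACPI NVS";
    case E820_UNUSABLE: return "unusable";
    default:            return "unknown";
    }
}

/* Format one entry as "[mem start-end] name"; returns snprintf's result. */
int demo_e820_format(const struct demo_e820entry *e, char *out, size_t len)
{
    return snprintf(out, len, "[mem 0x%016llx-0x%016llx] %s",
                    (unsigned long long)e->addr,
                    (unsigned long long)(e->addr + e->size - 1),
                    demo_e820_type_name(e->type));
}
```

Called on an entry with addr 0x100000, size 0x7ff00000 and type 1, the formatter describes the range 0x100000 to 0x7fffffff as usable.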

static int detect_memory_e820(void)
{
    int count = 0;
    struct biosregs ireg, oreg;
    struct e820entry *desc = boot_params.e820_map;
    static struct e820entry buf; /* static so it is zeroed */

    initregs(&ireg);
    ireg.ax  = 0xe820;
    ireg.cx  = sizeof buf;
    ireg.edx = SMAP;
    ireg.di  = (size_t)&buf;

    /*
     * Note: at least one BIOS is known which assumes that the
     * buffer pointed to by one e820 call is the same one as
     * the previous call, and only changes modified fields. Therefore,
     * we use a temporary buffer and copy the results entry by entry.
     *
     * This routine deliberately does not try to account for
     * ACPI extended attributes. This is because there are
     * BIOSes in the field which report zero for the valid bit for
     * all ranges, and we don't currently make any use of the
     * other attribute bits. Revisit this if we see the extended
     * attribute bits deployed in a meaningful way in the future.
     */

    do {
        /*
         * Registers passed to the BIOS call:
         *   EAX = 0xe820
         *   EDX = 'SMAP'
         *   EDI = address of the buffer that receives the entry
         *   EBX = continuation value ("next")
         *   ECX = buffer size
         * Registers returned to the C code:
         *   EAX = signature ("id"), EBX = next continuation value,
         *   ECX = size; the carry flag signals an error.
         * The entry itself is written to the buffer that EDI pointed to.
         */
        intcall(0x15, &ireg, &oreg);    /* trigger interrupt 0x15 */
        ireg.ebx = oreg.ebx;            /* for next iteration... */

        /* BIOSes which terminate the chain with CF = 1 as opposed
           to %ebx = 0 don't always report the SMAP signature on
           the final, failing, probe. */
        if (oreg.eflags & X86_EFLAGS_CF)
            break;

        /* Some BIOSes stop returning SMAP in the middle of
           the search loop. We don't know exactly how the BIOS
           screwed up the map at that point, we might have a
           partial map, the full map, or complete garbage, so
           just return failure. */
        if (oreg.eax != SMAP) {
            count = 0;
            break;
        }

        *desc++ = buf;    /* save the retrieved memory segment */
        count++;          /* one more memory segment */
    } while (ireg.ebx && count < ARRAY_SIZE(boot_params.e820_map));

    /* keep the number of memory segments in boot_params */
    return boot_params.e820_entries = count;
}

static int detect_memory_e801(void)
{
    struct biosregs ireg, oreg;

    initregs(&ireg);
    ireg.ax = 0xe801;
    intcall(0x15, &ireg, &oreg);

    if (oreg.eflags & X86_EFLAGS_CF)
        return -1;

    /* Do we really need to do this? */
    if (oreg.cx || oreg.dx) {
        oreg.ax = oreg.cx;
        oreg.bx = oreg.dx;
    }

    if (oreg.ax > 15*1024) {
        return -1;    /* Bogus! */
    } else if (oreg.ax == 15*1024) {
        boot_params.alt_mem_k = (oreg.bx << 6) + oreg.ax;
    } else {
        /*
         * This ignores memory above 16MB if we have a memory
         * hole there. If someone actually finds a machine
         * with a memory hole at 16MB and no support for
         * 0E820h they should probably generate a fake e820
         * map.
         */
        boot_params.alt_mem_k = oreg.ax;
    }

    return 0;
}

static int detect_memory_88(void)
{
    struct biosregs ireg, oreg;

    initregs(&ireg);
    ireg.ah = 0x88;
    intcall(0x15, &ireg, &oreg);

    boot_params.screen_info.ext_mem_k = oreg.ax;

    return -(oreg.eflags & X86_EFLAGS_CF); /* 0 or -1 */
}
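The arithmetic in detect_memory_e801() is easier to follow with the units spelled out: the E801 call reports the memory between 1 MB and 16 MB in KB (AX, at most 15*1024) and the memory above 16 MB in 64 KB blocks (BX), so shifting BX left by 6 converts blocks to KB. The helper below is a hypothetical user-space restatement of that calculation, not a kernel function.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the arithmetic in detect_memory_e801().
 *   ax: KB of memory between 1 MB and 16 MB (at most 15*1024)
 *   bx: number of 64 KB blocks above 16 MB
 * Returns the extended memory size in KB, or -1 for a bogus AX value.
 */
long e801_to_kb(uint16_t ax, uint16_t bx)
{
    if (ax > 15 * 1024)
        return -1;                    /* more than 15 MB below 16 MB: bogus */
    if (ax == 15 * 1024)
        return ((long)bx << 6) + ax;  /* 64 KB blocks -> KB, plus the 1-16 MB part */
    return ax;                        /* hole below 16 MB: ignore memory above it */
}
```

For example, AX = 15360 and BX = 256 give 256 * 64 + 15360 = 31744 KB, that is, 31 MB of extended memory on a machine with 32 MB of RAM.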

On a 32-bit system, the call chain arch/x86/boot/main.c:main() ---> arch/x86/boot/pm.c:go_to_protected_mode() ---> arch/x86/boot/pmjump.S:protected_mode_jump() ---> arch/x86/boot/compressed/head_32.S:startup_32() ---> arch/x86/kernel/head_32.S:startup_32() ---> arch/x86/kernel/head32.c:i386_start_kernel() ---> init/main.c:start_kernel() leads to the well-known kernel startup function start_kernel(). There, setup_arch() is called to carry out the architecture-specific initialization, including the various kinds of memory initialization such as building the memory map and initializing the memory zones. For the x86 architecture, setup_arch() is in arch/x86/kernel/setup.c, as follows:

void __init setup_arch(char **cmdline_p)
{
    /* ... */
    x86_init.oem.arch_setup();

    setup_memory_map();    /* build the memory map */
    e820_reserve_setup_data();
    /* ... */

    /*
     * partially used pages are not usable - thus
     * we are rounding upwards:
     */
    max_pfn = e820_end_of_ram_pfn();    /* find the highest usable page frame number */
    /* ... */

#ifdef CONFIG_X86_32
    /* max_low_pfn get updated here */
    find_low_pfn_range();    /* find the highest page frame number of low memory */
#else
    num_physpages = max_pfn;
    /* ... */
#endif

    /* max_pfn_mapped is updated here */
    /* initialize the memory mapping mechanism */
    max_low_pfn_mapped = init_memory_mapping(0, max_low_pfn<<PAGE_SHIFT);
    max_pfn_mapped = max_low_pfn_mapped;
    /* ... */
    initmem_init(0, max_pfn);    /* start the memory allocator */
    /* ... */
    x86_init.paging.pagetable_setup_start(swapper_pg_dir);
    paging_init();    /* create a complete page table */
    x86_init.paging.pagetable_setup_done(swapper_pg_dir);
    /* ... */
}


start_kernel() ---> setup_arch() ---> setup_memory_map():

void __init setup_memory_map(void)
{
    char *who;

    who = x86_init.resources.memory_setup();
    memcpy(&e820_saved, &e820, sizeof(struct e820map));
    printk(KERN_INFO "e820: BIOS-provided physical RAM map:\n");
    e820_print_map(who);
}
On x86, the memory_setup hook is assigned in x86_init.c:

/*
 * The platform setup functions are preset with the default functions
 * for standard PC hardware.
 */
struct x86_init_ops x86_init __initdata = {

    .resources = {
        .probe_roms            = probe_roms,
        .reserve_resources     = reserve_standard_io_resources,
        .memory_setup          = default_machine_specific_memory_setup,
    },

    .mpparse = {
        .mpc_record            = x86_init_uint_noop,
        .setup_ioapic_ids      = x86_init_noop,
        .mpc_apic_id           = default_mpc_apic_id,
        .smp_read_mpc_oem      = default_smp_read_mpc_oem,
        .mpc_oem_bus_info      = default_mpc_oem_bus_info,
        .find_smp_config       = default_find_smp_config,
        .get_smp_config        = default_get_smp_config,
    },

    .irqs = {
        .pre_vector_init       = init_ISA_irqs,
        .intr_init             = native_init_IRQ,
        .trap_init             = x86_init_noop,
    },

    .oem = {
        .arch_setup            = x86_init_noop,
        .banner                = default_banner,
    },

    .mapping = {
        .pagetable_reserve     = native_pagetable_reserve,
    },

    .paging = {
        .pagetable_setup_start = native_pagetable_setup_start,
        .pagetable_setup_done  = native_pagetable_setup_done,
    },

    .timers = {
        .setup_percpu_clockev  = setup_boot_APIC_clock,
        .tsc_pre_init          = x86_init_noop,
        .timer_init            = hpet_time_init,
        .wallclock_init        = x86_init_noop,
    },

    .iommu = {
        .iommu_init            = iommu_init_noop,
    },

    .pci = {
        .init                  = x86_default_pci_init,
        .init_irq              = x86_default_pci_init_irq,
        .fixup_irqs            = x86_default_pci_fixup_irqs,
    },
};

So the function that actually gets called is default_machine_specific_memory_setup():

char *__init default_machine_specific_memory_setup(void)
{
    char *who = "BIOS-e820";
    u32 new_nr;

    /*
     * Try to copy the BIOS-supplied E820-map.
     *
     * Otherwise fake a memory map; one section from 0k->640k,
     * the next section from 1mb->appropriate_mem_k
     */
    new_nr = boot_params.e820_entries;
    sanitize_e820_map(boot_params.e820_map,    /* eliminate overlapping memory segments */
                      ARRAY_SIZE(boot_params.e820_map),
                      &new_nr);
    boot_params.e820_entries = new_nr;

    /* copy the memory layout from boot_params.e820_map into struct e820map e820 */
    if (append_e820_map(boot_params.e820_map, boot_params.e820_entries) < 0) {
        u64 mem_size;

        /* compare results from other methods and take the greater */
        if (boot_params.alt_mem_k < boot_params.screen_info.ext_mem_k) {
            mem_size = boot_params.screen_info.ext_mem_k;
            who = "BIOS-88";
        } else {
            mem_size = boot_params.alt_mem_k;
            who = "BIOS-e801";
        }

        e820.nr_map = 0;
        e820_add_region(0, LOWMEMSIZE(), E820_RAM);
        e820_add_region(HIGH_MEMORY, mem_size << 10, E820_RAM);
    }

    /* In case someone cares... */
    return who;
}

1. Eliminate the overlapping portions of memory segments.

2. Copy the memory layout information from boot_params.e820_map to the global e820.
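When the copy in step 2 fails, default_machine_specific_memory_setup() builds a fake two-region map from the older probes instead. The sketch below restates that fallback in user space. `demo_fake_map()` and `struct demo_region` are hypothetical names, and the two constants are assumptions standing in for the kernel's LOWMEMSIZE() and HIGH_MEMORY (taken here as 0x9f000 and 1 MB).

```c
#include <stdint.h>

#define DEMO_LOWMEMSIZE  0x9f000ULL           /* assumed end of conventional memory */
#define DEMO_HIGH_MEMORY (1024ULL * 1024ULL)  /* assumed start of extended memory: 1 MB */

/* Hypothetical region type for this illustration. */
struct demo_region {
    uint64_t start;
    uint64_t size;
};

/*
 * Take the larger of the E801 and 0x88 results (both in KB) and build
 * two RAM regions: conventional memory and extended memory above 1 MB.
 * Returns the name of the probe whose value was used.
 */
const char *demo_fake_map(uint32_t alt_mem_k, uint32_t ext_mem_k,
                          struct demo_region out[2])
{
    uint64_t mem_size;
    const char *who;

    if (alt_mem_k < ext_mem_k) {
        mem_size = ext_mem_k;
        who = "BIOS-88";
    } else {
        mem_size = alt_mem_k;
        who = "BIOS-e801";
    }

    out[0].start = 0;
    out[0].size  = DEMO_LOWMEMSIZE;
    out[1].start = DEMO_HIGH_MEMORY;
    out[1].size  = mem_size << 10;   /* KB -> bytes */

    return who;
}
```

The shift by 10 is the KB-to-bytes conversion: both fallback probes report sizes in KB, while the e820 map stores byte addresses and lengths.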

append_e820_map(boot_params.e820_map, boot_params.e820_entries) in turn calls the following function:

static int __init __append_e820_map(struct e820entry *biosmap, int nr_map)
{
    while (nr_map) {
        u64 start = biosmap->addr;
        u64 size = biosmap->size;
        u64 end = start + size;
        u32 type = biosmap->type;

        /* Overflow in 64 bits? Ignore the memory map. */
        if (start > end)
            return -1;

        /* loop nr_map times, adding each memory block to e820 */
        e820_add_region(start, size, type);

        biosmap++;
        nr_map--;
    }
    return 0;
}

void __init e820_add_region(u64 start, u64 size, int type)
{
    __e820_add_region(&e820, start, size, type);
}

struct e820map e820;

At this point the physical memory layout has been read from the BIOS and stored in the global variable e820.
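The overflow test in __append_e820_map() relies on unsigned wraparound: if start + size exceeds 64 bits, the sum wraps and end becomes smaller than start. A minimal user-space sketch of the same copy loop follows. All `demo_` names are hypothetical, and rejecting a full destination map with -1 is a choice made for this sketch only.

```c
#include <stdint.h>

#define DEMO_MAX 128  /* arbitrary capacity for this sketch */

struct demo_entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

struct demo_map {
    uint32_t nr_map;
    struct demo_entry map[DEMO_MAX];
};

/* Copy nr entries from src into dst, rejecting ranges that wrap around. */
int demo_append(struct demo_map *dst, const struct demo_entry *src, int nr)
{
    while (nr) {
        uint64_t start = src->addr;
        uint64_t end = start + src->size;  /* wraps on 64-bit overflow */

        if (start > end)
            return -1;                     /* overflow: ignore the map */
        if (dst->nr_map >= DEMO_MAX)
            return -1;                     /* sketch-only: map is full */

        dst->map[dst->nr_map++] = *src;
        src++;
        nr--;
    }
    return 0;
}
```

An entry starting 10 bytes below the top of the 64-bit range with size 100 wraps to a small end value and is rejected, while ordinary entries are appended in order.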

After the memory map has been built:

setup_arch() ---> e820_end_of_ram_pfn():

/*
 * partially used pages are not usable - thus
 * we are rounding upwards:
 */
max_pfn = e820_end_of_ram_pfn();

static unsigned long __init e820_end_pfn(unsigned long limit_pfn, unsigned type)
{
    int i;
    unsigned long last_pfn = 0;
    unsigned long max_arch_pfn = MAX_ARCH_PFN;    /* number of page frames in the 4G address space */

    for (i = 0; i < e820.nr_map; i++) {    /* walk the memory layout array */
        struct e820entry *ei = &e820.map[i];
        unsigned long start_pfn;
        unsigned long end_pfn;

        if (ei->type != type)
            continue;

        start_pfn = ei->addr >> PAGE_SHIFT;
        end_pfn = (ei->addr + ei->size) >> PAGE_SHIFT;

        if (start_pfn >= limit_pfn)    /* segment starts at or above the limit: ignore it */
            continue;
        if (end_pfn > limit_pfn) {     /* segment ends above the limit: clamp to the limit */
            last_pfn = limit_pfn;
            break;
        }
        if (end_pfn > last_pfn)        /* segment ends above the highest frame seen so far */
            last_pfn = end_pfn;
    }

    if (last_pfn > max_arch_pfn)       /* beyond the 4G space */
        last_pfn = max_arch_pfn;

    printk(KERN_INFO "last_pfn = %#lx max_arch_pfn = %#lx\n",
           last_pfn, max_arch_pfn);
    return last_pfn;    /* return the last page frame number */
}

unsigned long __init e820_end_of_ram_pfn(void)
{
    return e820_end_pfn(MAX_ARCH_PFN, E820_RAM);
}
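The scan performed by e820_end_pfn() can be tried outside the kernel. The following is a user-space sketch under stated assumptions: 4 KB pages (a shift of 12), a caller-supplied array instead of the global e820, and hypothetical `demo_` names throughout.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12  /* assumed 4 KB pages */
#define DEMO_E820_RAM   1

struct demo_range {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/*
 * Return the highest page frame number (exclusive) covered by entries of
 * the given type, clamped to limit_pfn.
 */
unsigned long demo_end_pfn(const struct demo_range *map, size_t n,
                           unsigned long limit_pfn, uint32_t type)
{
    unsigned long last_pfn = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long start_pfn, end_pfn;

        if (map[i].type != type)
            continue;

        start_pfn = (unsigned long)(map[i].addr >> DEMO_PAGE_SHIFT);
        end_pfn = (unsigned long)((map[i].addr + map[i].size) >> DEMO_PAGE_SHIFT);

        if (start_pfn >= limit_pfn)
            continue;               /* entirely above the limit */
        if (end_pfn > limit_pfn) {
            last_pfn = limit_pfn;   /* crosses the limit: clamp and stop */
            break;
        }
        if (end_pfn > last_pfn)
            last_pfn = end_pfn;
    }
    return last_pfn;
}
```

With a map holding RAM at 0 to 0x9f000 and at 1 MB to 1 GB, plus a reserved range in between, the result is 0x40000 frames under a 4 GB limit; the reserved range is skipped because its type does not match.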


#define MAXMEM (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)

Here __VMALLOC_RESERVE is 128M, which follows from how the 4GB address space is partitioned.

It follows that MAXMEM is slightly less than 896M (896M - 8K - 4M - 4M). This value is the upper limit of low memory, that is, the address where high memory starts.
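The figure quoted above can be worked out numerically. This sketch only evaluates the breakdown given in the text (896M - 8K - 4M - 4M) and converts it to a page frame number assuming 4 KB pages; the actual value of MAXMEM depends on the kernel configuration, and the `demo_` names are hypothetical.

```c
#include <stdint.h>

#define DEMO_KB(x) ((uint64_t)(x) << 10)
#define DEMO_MB(x) ((uint64_t)(x) << 20)
#define DEMO_PAGE_SHIFT 12  /* assumed 4 KB pages */

/* MAXMEM according to the breakdown quoted in the text. */
uint64_t demo_maxmem(void)
{
    return DEMO_MB(896) - DEMO_KB(8) - DEMO_MB(4) - DEMO_MB(4);
}

/* The corresponding page frame number, i.e. the low/high memory boundary. */
uint64_t demo_maxmem_pfn(void)
{
    return demo_maxmem() >> DEMO_PAGE_SHIFT;
}
```

Under these assumptions MAXMEM comes to 931127296 bytes (888 MB minus 8 KB), and the boundary frame number is 227326.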


setup_arch() ---> find_low_pfn_range(). This function draws the boundary between low memory and high memory, which determines the start address of high memory.

/* max_low_pfn get updated here */
find_low_pfn_range();

/*
 * Determine low and high memory ranges:
 */
void __init find_low_pfn_range(void)
{
    /* it could update max_pfn */

    if (max_pfn <= MAXMEM_PFN)    /* actual physical memory is at most the 896M of low memory */
        lowmem_pfn_init();
    else
        highmem_pfn_init();
}
/*
 * We have more RAM than fits into lowmem - we try to put it into
 * highmem, also taking the highmem=x boot parameter into account:
 */
/* The number of high-memory pages can be configured at boot; if it is not, the size is set here. */
void __init highmem_pfn_init(void)
{
    /*
     * MAXMEM_PFN corresponds to the maximum address minus
     * (4M + 4M + 8K + 128M), so low memory is actually slightly
     * smaller than the 896M usually quoted.
     */
    max_low_pfn = MAXMEM_PFN;    /* set the boundary between high and low memory */

    if (highmem_pages == -1)     /* number of high-memory pages not set at boot */
        highmem_pages = max_pfn - MAXMEM_PFN;    /* total pages minus low-memory pages */

    /*
     * If highmem_pages was set on the boot command line, it is checked
     * here, because it may be inconsistent with the real memory size.
     */
    if (highmem_pages + MAXMEM_PFN < max_pfn)
        max_pfn = MAXMEM_PFN + highmem_pages;

    if (highmem_pages + MAXMEM_PFN > max_pfn) {
        printk(KERN_WARNING MSG_HIGHMEM_TOO_SMALL,
               pages_to_mb(max_pfn - MAXMEM_PFN),
               pages_to_mb(highmem_pages));
        highmem_pages = 0;
    }
#ifndef CONFIG_HIGHMEM
    /* Maximum memory usable is what is directly addressable */
    printk(KERN_WARNING "Warning only %ldMB will be used.\n", MAXMEM>>20);
    if (max_pfn > MAX_NONPAE_PFN)
        printk(KERN_WARNING "Use a HIGHMEM64G enabled kernel.\n");
    else
        printk(KERN_WARNING "Use a HIGHMEM enabled kernel.\n");
    max_pfn = MAXMEM_PFN;
#else /* !CONFIG_HIGHMEM */
    /* high memory is available */
#ifndef CONFIG_HIGHMEM64G
    /* without the 64G configuration, memory cannot exceed 4G */
    if (max_pfn > MAX_NONPAE_PFN) {
        max_pfn = MAX_NONPAE_PFN;
        printk(KERN_WARNING MSG_HIGHMEM_TRIMMED);
    }
#endif /* !CONFIG_HIGHMEM64G */
#endif /* !CONFIG_HIGHMEM */
}
When the actual memory is less than 896M:

void __init lowmem_pfn_init(void)
{
    /* max_low_pfn is 0, we already have early_res support */
    /*
     * Initialize the boundary to the highest page frame number of the
     * actual physical memory. Since the system has less than 896M, all
     * memory is low memory; if high memory is wanted, part of it is
     * carved out of low memory.
     */
    max_low_pfn = max_pfn;

    if (highmem_pages == -1)
        highmem_pages = 0;
#ifdef CONFIG_HIGHMEM
    /* if the user asked for highmem, high memory has to be carved out */
    if (highmem_pages >= max_pfn) {
        /* more high-memory pages requested than page frames exist: cannot be allocated */
        printk(KERN_ERR MSG_HIGHMEM_TOO_BIG,
               pages_to_mb(highmem_pages), pages_to_mb(max_pfn));
        highmem_pages = 0;
    }
    if (highmem_pages) {
        /* this condition guarantees that low memory is never less than 64M */
        if (max_low_pfn - highmem_pages < 64*1024*1024/PAGE_SIZE) {
            printk(KERN_ERR MSG_LOWMEM_TOO_SMALL,
                   pages_to_mb(highmem_pages));
            highmem_pages = 0;
        }
        max_low_pfn -= highmem_pages;    /* set the boundary between low and high memory */
    }
#else
    if (highmem_pages)
        printk(KERN_ERR "ignoring highmem size on non-highmem kernel!\n");
#endif
}
When the actual physical memory is greater than 896M, the split is made by highmem_pfn_init():

void __init highmem_pfn_init(void)
{
    max_low_pfn = MAXMEM_PFN;    /* set the boundary between high and low memory */

    if (highmem_pages == -1)     /* number of high-memory page frames not specified */
        highmem_pages = max_pfn - MAXMEM_PFN;    /* default: highest page frame number minus MAXMEM_PFN */

    if (highmem_pages + MAXMEM_PFN < max_pfn)    /* high-memory frames plus MAXMEM_PFN below the maximum */
        max_pfn = MAXMEM_PFN + highmem_pages;    /* reduce the maximum to the sum of the two */

    if (highmem_pages + MAXMEM_PFN > max_pfn) {  /* requested high memory is out of range: allocate none */
        printk(KERN_WARNING MSG_HIGHMEM_TOO_SMALL,
               pages_to_mb(max_pfn - MAXMEM_PFN),
               pages_to_mb(highmem_pages));
        highmem_pages = 0;
    }
#ifndef CONFIG_HIGHMEM
    /* Maximum memory usable is what is directly addressable */
    printk(KERN_WARNING "Warning only %ldMB will be used.\n", MAXMEM>>20);
    if (max_pfn > MAX_NONPAE_PFN)
        printk(KERN_WARNING "Use a HIGHMEM64G enabled kernel.\n");
    else
        printk(KERN_WARNING "Use a HIGHMEM enabled kernel.\n");
    max_pfn = MAXMEM_PFN;
#else /* !CONFIG_HIGHMEM */
#ifndef CONFIG_HIGHMEM64G
    if (max_pfn > MAX_NONPAE_PFN) {
        max_pfn = MAX_NONPAE_PFN;
        printk(KERN_WARNING MSG_HIGHMEM_TRIMMED);
    }
#endif /* !CONFIG_HIGHMEM64G */
#endif /* !CONFIG_HIGHMEM */
}
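The decision made by find_low_pfn_range() can be condensed into a few lines for the common case. This is a simplified user-space sketch, not the kernel logic in full: it assumes a kernel built with CONFIG_HIGHMEM and CONFIG_HIGHMEM64G and no highmem= boot parameter, so none of the warning or trimming branches apply. `struct demo_split` and `demo_find_low_pfn_range()` are hypothetical names.

```c
/* Hypothetical result type: where the low/high boundary ends up. */
struct demo_split {
    unsigned long max_pfn;        /* highest page frame number of RAM */
    unsigned long max_low_pfn;    /* boundary between low and high memory */
    unsigned long highmem_pages;  /* page frames that end up in high memory */
};

/*
 * Simplified split: if all RAM fits below maxmem_pfn it is all low memory,
 * otherwise everything above maxmem_pfn becomes high memory.
 */
struct demo_split demo_find_low_pfn_range(unsigned long max_pfn,
                                          unsigned long maxmem_pfn)
{
    struct demo_split s;

    s.max_pfn = max_pfn;
    if (max_pfn <= maxmem_pfn) {
        s.max_low_pfn = max_pfn;      /* lowmem_pfn_init() case */
        s.highmem_pages = 0;
    } else {
        s.max_low_pfn = maxmem_pfn;   /* highmem_pfn_init() case */
        s.highmem_pages = max_pfn - maxmem_pfn;
    }
    return s;
}
```

With a boundary of 227326 frames (the 896M - 8K - 4M - 4M figure at 4 KB per page), a 2 GB machine (0x80000 frames) gets 296962 high-memory frames, while a 512 MB machine (0x20000 frames) has none.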



