For processors that have an MMU (memory management unit, the hardware that assists the operating system with memory management, for example by translating virtual addresses to physical ones), Linux provides a sophisticated memory management system. Under it, each process on a 32-bit system can address up to 4 GB of memory.
The 4 GB address space of a process is divided into two parts: user space and kernel space. User space occupies addresses from 0 to 3 GB (PAGE_OFFSET, which equals 0xC0000000 on x86), and 3 GB to 4 GB is kernel space.
In kernel space, the range from 3G to VMALLOC_START is the physical memory mapping area (it contains the kernel image, the physical page frame table mem_map, and so on). For example, the VMware virtual machine we use has 160 MB of memory, so 3G to 3G + 160M is mapped to physical memory. The vmalloc area follows the physical memory mapping area. On a 160 MB system, VMALLOC_START should be around 3G + 168M (an 8 MB gap between the physical memory mapping area and VMALLOC_START guards against out-of-bounds accesses), and VMALLOC_END lies close to 4 GB (the system reserves a small region at the top of the address space for special-purpose page mappings).
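The layout arithmetic above can be checked with a small user-space sketch. It is only an illustration: the helper names direct_map_end() and vmalloc_start() are made up for this example, the rounding is modelled loosely on how older i386 kernels derive VMALLOC_START (the exact formula differs between kernel versions), and it assumes all RAM fits in the direct mapping.

```c
#include <stdint.h>

#define PAGE_OFFSET    0xC0000000u          /* 3 GB on 32-bit x86 */
#define VMALLOC_OFFSET (8u * 1024 * 1024)   /* the 8 MB guard gap */

/* End of the physical memory mapping area for a machine with
 * ram_bytes of RAM (assumption: no high memory is involved). */
static uint32_t direct_map_end(uint32_t ram_bytes)
{
    return PAGE_OFFSET + ram_bytes;
}

/* Simplified model: skip the gap, then align down to an 8 MB boundary. */
static uint32_t vmalloc_start(uint32_t ram_bytes)
{
    return (direct_map_end(ram_bytes) + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET - 1);
}
```

For 160 MB of RAM, direct_map_end() gives 0xCA000000 (3G + 160M) and vmalloc_start() gives 0xCA800000 (3G + 168M).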
Memory obtained from kmalloc() and __get_free_page() lies in the physical memory mapping area and is physically contiguous. Its virtual addresses differ from the real physical addresses only by a fixed offset, so the conversion is simple. virt_to_phys() converts a kernel virtual address to a physical address:
#define __pa(x) ((unsigned long)(x) - PAGE_OFFSET)
extern inline unsigned long virt_to_phys(volatile void *address)
{
    return __pa(address);
}
The conversion simply subtracts 3G (PAGE_OFFSET = 0xC0000000) from the virtual address.
Its counterpart is phys_to_virt(), which converts a physical address to a kernel virtual address:
#define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET))
extern inline void *phys_to_virt(unsigned long address)
{
    return __va(address);
}
Both virt_to_phys() and phys_to_virt() are defined in include/asm-i386/io.h.
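The two conversions can be tried out in user space with plain integers. This is a sketch, not the kernel code: sim_virt_to_phys() and sim_phys_to_virt() are hypothetical names, and 32-bit integers stand in for pointers so that the arithmetic behaves the same on a 64-bit host.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u   /* 3 GB, as on 32-bit x86 */

/* Model of __pa(): kernel virtual address -> physical address. */
static uint32_t sim_virt_to_phys(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}

/* Model of __va(): physical address -> kernel virtual address. */
static uint32_t sim_phys_to_virt(uint32_t paddr)
{
    return paddr + PAGE_OFFSET;
}
```

For example, sim_virt_to_phys(0xC7997000) yields 0x07997000, and passing that value to sim_phys_to_virt() returns the original address.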
Memory obtained from vmalloc() lies between VMALLOC_START and VMALLOC_END, and there is no simple conversion between its addresses and physical addresses. The pages are contiguous in the virtual address space but need not be physically contiguous.
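Why no fixed offset can work for the vmalloc area is easy to see in a toy model where each virtual page has its own entry naming a physical frame. Everything here is hypothetical (struct toy_vmap, toy_vmalloc_to_phys(), the frame numbers); in the kernel the real lookup goes through the page tables, for example via vmalloc_to_page().

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* A virtually contiguous range backed by arbitrary physical frames. */
struct toy_vmap {
    uint32_t vstart;          /* first virtual address of the range */
    const uint32_t *frames;   /* physical frame number of each page */
    size_t npages;            /* number of pages in the range */
};

/* Translate by per-page lookup. Returns 0 on success and stores the
 * physical address in *paddr, or -1 if vaddr is outside the range. */
static int toy_vmalloc_to_phys(const struct toy_vmap *m, uint32_t vaddr,
                               uint32_t *paddr)
{
    uint32_t off;
    size_t idx;

    if (vaddr < m->vstart)
        return -1;
    off = vaddr - m->vstart;
    idx = off >> PAGE_SHIFT;
    if (idx >= m->npages)
        return -1;
    *paddr = (m->frames[idx] << PAGE_SHIFT) | (off & (PAGE_SIZE - 1));
    return 0;
}
```

With frames {0x1234, 0x0042, 0x0999} and vstart 0xCABEB000, two neighbouring virtual pages translate to physical addresses that are far apart, which a single subtraction could never produce.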
The following program demonstrates the differences between kmalloc(), __get_free_page(), and vmalloc():
#include <linux/init.h>
#include <linux/module.h>
#include <linux/gfp.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

MODULE_LICENSE("GPL");

static unsigned char *pagemem;
static unsigned char *kmallocmem;
static unsigned char *vmallocmem;

static int __init mem_module_init(void)
{
    /* It is best to check every allocation for failure; */
    /* this demo code omits the checks. */
    /* On kernels that hash %p output, use %px to see the raw address. */
    pagemem = (unsigned char *)__get_free_page(GFP_KERNEL);
    printk(KERN_ALERT "pagemem addr = %p\n", pagemem);
    kmallocmem = (unsigned char *)kmalloc(100, GFP_KERNEL);
    printk(KERN_ALERT "kmallocmem addr = %p\n", kmallocmem);
    vmallocmem = (unsigned char *)vmalloc(1000000);
    printk(KERN_ALERT "vmallocmem addr = %p\n", vmallocmem);
    return 0;
}

static void __exit mem_module_exit(void)
{
    free_page((unsigned long)pagemem);
    kfree(kmallocmem);
    vfree(vmallocmem);
}

module_init(mem_module_init);
module_exit(mem_module_exit);
Our system has 160 MB of memory. Running the program above, we find that pagemem is at 0xc7997000 (about 3G + 121M), kmallocmem is at 0xc9bc1380 (about 3G + 155M), and vmallocmem is at 0xcabeb000 (about 3G + 171M), which matches the memory layout described above.
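The three addresses can be checked against the layout with a short user-space sketch. It assumes the 160 MB machine mentioned earlier and takes the start of the vmalloc area as a parameter; the names classify(), offset_mb(), and enum region are invented for this illustration.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u
#define MB          (1024u * 1024u)

enum region { USER_SPACE, DIRECT_MAP, GAP, VMALLOC_AREA };

/* Offset of a kernel virtual address above 3G, in whole megabytes. */
static uint32_t offset_mb(uint32_t vaddr)
{
    return (vaddr - PAGE_OFFSET) / MB;
}

/* Decide which part of the address space vaddr falls into, given the
 * amount of RAM and the (assumed) start of the vmalloc area. */
static enum region classify(uint32_t vaddr, uint32_t ram_bytes,
                            uint32_t vmalloc_start)
{
    if (vaddr < PAGE_OFFSET)
        return USER_SPACE;
    if (vaddr < PAGE_OFFSET + ram_bytes)
        return DIRECT_MAP;
    if (vaddr < vmalloc_start)
        return GAP;
    return VMALLOC_AREA;
}
```

With ram_bytes = 160 * MB and vmalloc_start = 0xCA800000, the first two addresses classify as DIRECT_MAP (offsets 121 and 155 MB) and the third as VMALLOC_AREA (offset 171 MB).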
1. kmalloc() allocates physically contiguous memory and is meant for small allocations.
2. __get_free_page() allocates physically contiguous memory in whole pages.
The notes below explain why these functions allocate physically contiguous memory, and whether they return physical or virtual addresses.
kmalloc() is implemented on top of the slab allocator, an efficient mechanism for allocating small pieces of memory. The slab allocator is not independent, however: it carves finer-grained memory for callers out of pages obtained from the page allocator. In other words, the system first uses the page allocator to obtain physically contiguous memory in units of pages, the smallest unit it handles, and kmalloc() then subdivides that memory according to the caller's needs.
To verify this, look at the implementation of kmalloc(). It is implemented in __do_kmalloc(), which ends by calling __cache_alloc() to allocate from a slab. Functions such as kmem_cache_alloc() call the same function. Following the call path of __cache_alloc(), we find that kmem_getpages() is used to allocate physical pages. The alloc_pages_node() function called by kmem_getpages() eventually uses __alloc_pages() to return a struct page, the structure the system uses to describe a physical page. So, as stated above, slabs are built on physical pages, and the memory that kmalloc() hands out is physical memory.
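The idea of carving small objects out of a page can be sketched in a few lines of user-space C. This is a deliberately naive model and not the kernel's slab implementation: struct toy_slab and its functions are hypothetical, there is no free list or object reuse, and the caller supplies the backing page.

```c
#include <stddef.h>

#define PAGE_SIZE 4096u

/* One page split into equally sized objects. */
struct toy_slab {
    unsigned char *page;   /* backing page from the "page allocator" */
    size_t obj_size;       /* size of each object, must be non-zero */
    size_t capacity;       /* how many objects fit in the page */
    size_t next_free;      /* index of the next never-used object */
};

static void toy_slab_init(struct toy_slab *s, void *page, size_t obj_size)
{
    s->page = page;
    s->obj_size = obj_size;
    s->capacity = PAGE_SIZE / obj_size;
    s->next_free = 0;
}

/* Hand out the next object, or NULL once the page is used up. */
static void *toy_slab_alloc(struct toy_slab *s)
{
    if (s->next_free >= s->capacity)
        return NULL;
    return s->page + s->obj_size * s->next_free++;
}
```

Because every object lies inside the same page, objects returned one after another are physically contiguous whenever the backing page is, which is the property the text attributes to kmalloc().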
__get_free_page() is the low-level memory allocation function that the page allocator offers to callers, and it allocates physically contiguous memory. It is built on the buddy system, in whose physical memory management the smallest allocation unit is a page. Looking at its implementation, __get_free_page() is only a thin wrapper: all it does is call __alloc_pages() unconditionally to allocate the physical memory. The discussion of kmalloc() above noted that the slab allocator also obtains its physical pages through __alloc_pages().
So from which region does this function take physical pages? Only the implementation can answer that. __alloc_pages() makes several attempts to call get_page_from_freelist(), which picks a zone from the zonelist and returns an available struct page from it (the call branches differ depending on the flags). A physical page is therefore allocated from a zone in the zonelist (an array of zone structures).
How are the zonelist and its zones associated with physical pages, and how are they initialized? Look at free_area_init_nodes(), which is called indirectly through zone_sizes_init() during system initialization. zone_sizes_init() fills in three zones (ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM) and passes them to free_area_init_nodes() as a parameter. That function sets up a pglist_data structure, which contains the zonelist and zone structures as well as the struct page entries that describe physical pages. At the end it calls free_area_init_node() with this structure as a parameter. There, calculate_node_totalpages() marks the relevant regions of pglist_data, alloc_node_mem_map() initializes the struct page entries in pglist_data, and finally free_area_init_core() associates pglist_data with the zonelist.
That completes the picture of how __get_free_page() allocates physical memory. It raises new questions, though: how are the physical pages it allocates mapped, and where are they mapped to? To answer them we have to look at the boot code related to virtual memory management (VMM).
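The zonelist walk described above can be pictured with a toy model that only counts free pages per zone. It is a rough sketch under strong assumptions: struct toy_zone and toy_get_page_from_freelist() are invented names, there are no watermarks, flags, or free lists per order, and a request of a given order simply consumes 2^order pages from the first zone that has enough.

```c
#include <stddef.h>

/* A zone reduced to a name and a free-page counter. */
struct toy_zone {
    const char *name;
    unsigned long free_pages;
};

/* Walk the zones in preference order and take 2^order pages from the
 * first zone that can satisfy the request. Returns the zone used, or
 * NULL if no zone has enough free pages. */
static struct toy_zone *toy_get_page_from_freelist(struct toy_zone *zonelist,
                                                   size_t nzones,
                                                   unsigned int order)
{
    unsigned long need = 1ul << order;
    size_t i;

    for (i = 0; i < nzones; i++) {
        if (zonelist[i].free_pages >= need) {
            zonelist[i].free_pages -= need;
            return &zonelist[i];
        }
    }
    return NULL;
}
```

With a zonelist of { "Normal", 1 free page } followed by { "DMA", 8 free pages }, the first order-0 request is served from Normal and the next one falls back to DMA, mirroring how the allocator tries the zones one after another.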
Before turning to the VMM-related boot code, let's look at what virt_to_phys() and phys_to_virt() do. As the names suggest, they convert a virtual address to a physical address and a physical address to a virtual address. Their implementation is very simple: the former calls __pa(address) to convert a virtual address to a physical address, and the latter calls __va(address) to convert a physical address to a virtual address. Here are the two macros, __pa and __va:
#define __pa(x) ((unsigned long)(x) - PAGE_OFFSET)
#define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET))
We can see that they merely add PAGE_OFFSET to the address or subtract it, and PAGE_OFFSET is defined as 0xC0000000 on x86. This raises another question. Anyone who has written a Linux driver knows that after allocating memory with kmalloc() or __get_free_page(), you must convert the result with virt_to_phys() to get the correct physical address. Why is this step necessary? Did we not allocate physical memory? Why is a conversion still needed after the allocation? And if what is returned is a virtual address, then, going by the analysis of virt_to_phys() above, why is an operation on PAGE_OFFSET alone enough to convert it? Does a conversion between virtual and physical addresses not require looking up the page table? With these questions in mind, let's look at the VMM-related boot code.
Start from start_kernel(), the kernel boot entry, and look for the VMM-related parts. The first function of interest is setup_arch(). It calls paging_init() to initialize and map the hardware page tables (8 MB of memory is already mapped before this initialization; that is not covered here). paging_init() in turn calls pagetable_init() to map the kernel's physical memory and initialize the related memory. pagetable_init() first checks and sets up PAE/PSE/PGE and then calls kernel_physical_mapping_init() to map the kernel's physical memory. In that function we can clearly see that pgd_idx is computed with PAGE_OFFSET as the starting address; that is, the loop maps all physical addresses starting at PAGE_OFFSET. Reading on, after the PMD is initialized, all address calculations are increments on top of PAGE_OFFSET. The analysis makes it clear that physical memory is mapped into the virtual address space starting at PAGE_OFFSET.
This answers all the questions above. The physical pages allocated by kmalloc() and __get_free_page() are mapped at virtual addresses starting at PAGE_OFFSET, so physical addresses and virtual addresses correspond one to one. Because of this mapping, allocating a virtual address in the region starting at PAGE_OFFSET also allocates a physical address (within limits, of course: the address must lie between PAGE_OFFSET and VMALLOC_START, the latter being where the memory handed out by vmalloc() begins). It also explains why virt_to_phys() and phys_to_virt() can convert between virtual and physical addresses simply by adding or subtracting PAGE_OFFSET: the relationship is fixed, so no page table lookup is needed. And it answers the initial question: kmalloc() and __get_free_page() allocate physical memory but return virtual addresses (awkward as that sounds). Because of this mapping, you subtract PAGE_OFFSET from their return addresses to obtain the real physical address.
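The conclusion can be reproduced with a toy two-level page table in user space. This sketch assumes classic non-PAE x86 paging with 4 KB pages and no PSE, models only 16 MB of pretend RAM, and uses invented names (toy_physical_mapping_init(), toy_walk()); it imitates the shape of kernel_physical_mapping_init() and is not the kernel code.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_OFFSET   0xC0000000u
#define PAGE_SHIFT    12
#define PGDIR_SHIFT   22
#define PTRS_PER_PGD  1024
#define PTRS_PER_PTE  1024
#define TOY_RAM_PAGES 4096u      /* 16 MB of pretend RAM */
#define TOY_PRESENT   1u         /* "present" bit of a toy PTE */

static uint32_t toy_pte_tables[TOY_RAM_PAGES / PTRS_PER_PTE][PTRS_PER_PTE];
static uint32_t *toy_pgd[PTRS_PER_PGD];

/* Map physical frame 0, 1, 2, ... at PAGE_OFFSET, PAGE_OFFSET + 4K, ... */
static void toy_physical_mapping_init(void)
{
    uint32_t pfn = 0;
    size_t pgd_idx = PAGE_OFFSET >> PGDIR_SHIFT;   /* 768 */
    size_t t, i;

    for (t = 0; pfn < TOY_RAM_PAGES; t++, pgd_idx++) {
        toy_pgd[pgd_idx] = toy_pte_tables[t];
        for (i = 0; i < PTRS_PER_PTE && pfn < TOY_RAM_PAGES; i++, pfn++)
            toy_pte_tables[t][i] = (pfn << PAGE_SHIFT) | TOY_PRESENT;
    }
}

/* Full table walk. Returns 0 and stores the physical address in
 * *paddr, or -1 if the address is not mapped. */
static int toy_walk(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t *pte_table = toy_pgd[vaddr >> PGDIR_SHIFT];
    uint32_t pte;

    if (!pte_table)
        return -1;
    pte = pte_table[(vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)];
    if (!(pte & TOY_PRESENT))
        return -1;
    *paddr = (pte & ~0xFFFu) | (vaddr & 0xFFFu);
    return 0;
}
```

After toy_physical_mapping_init(), walking the tables for any mapped address gives the same result as subtracting PAGE_OFFSET, for example 0xC0123456 translates to 0x00123456. That is why the shortcut in __pa() is safe for this region.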