Linux kernel memory allocation

Source: Internet
Author: User
Kernel memory is usually allocated through kmalloc/kfree, but there are other ways to obtain memory; together they make up the kernel's interface for allocating and freeing memory.

I. kmalloc/kfree

kmalloc/kfree, similar to malloc/free in standard C, is the interface the kernel uses for general-purpose memory allocation.

kmalloc/kfree is built on the slab allocator. At system startup, kmem_cache_init is called and creates several general-purpose buffer pools whose slab object sizes are powers of 2: the smallest size is 1 << KMALLOC_SHIFT_LOW, the largest is 1 << KMALLOC_SHIFT_HIGH, and the number of pools is KMALLOC_SHIFT_HIGH - KMALLOC_SHIFT_LOW + 1. These pools are collectively called the general-purpose buffer pools and serve kmalloc and kfree. It follows from this mechanism that kmalloc cannot be used to request a block larger than 1 << KMALLOC_SHIFT_HIGH bytes.

When memory is requested through kmalloc, the kernel picks the most suitable general-purpose buffer pool for the requested size: the pool with the smallest slab object size that is greater than or equal to the request.
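The pool selection can be modeled in user space as follows. This is only a sketch of the rule, not kernel code: the SHIFT_LOW and SHIFT_HIGH values are assumptions chosen for illustration (the real KMALLOC_SHIFT_LOW and KMALLOC_SHIFT_HIGH depend on the kernel version and configuration), and pick_pool_shift is a hypothetical helper name.

```c
#include <stddef.h>

/* Assumed values for illustration only; the real KMALLOC_SHIFT_LOW and
 * KMALLOC_SHIFT_HIGH depend on kernel version and configuration. */
#define SHIFT_LOW   3   /* smallest pool: 1 << 3  = 8 bytes */
#define SHIFT_HIGH 22   /* largest pool:  1 << 22 = 4 MiB   */

/* Hypothetical helper: return the shift of the smallest pool whose
 * object size (1 << shift) is at least `size`, or -1 if the request
 * is larger than the largest pool. */
int pick_pool_shift(size_t size)
{
    for (int shift = SHIFT_LOW; shift <= SHIFT_HIGH; shift++) {
        if (((size_t)1 << shift) >= size)
            return shift;
    }
    return -1;
}
```

Under these assumed limits, a 100-byte request lands in the 128-byte pool (shift 7), so 28 bytes of the object go unused; that rounding cost is the price of using power-of-2 pools.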

An allocation can take flags that control its behavior. The most common is GFP_KERNEL, with which the allocation may sleep; another is GFP_ATOMIC, with which the allocation never sleeps and which is meant for atomic context. There are many other flags; see the file include/linux/gfp.h, which lists the allocation flags and their meanings.

II. Dedicated buffer pools

The general-purpose buffer pools are usually sufficient, but a kernel component that repeatedly allocates and frees memory blocks of one fixed size can create its own dedicated buffer pool and then allocate from and free to that pool. This involves three APIs:

2.1 Creating a dedicated buffer pool

III. Memory pools

typedef struct mempool_s {
	spinlock_t lock;
	int min_nr;		/* nr of elements at *elements */
	int curr_nr;		/* Current nr of elements at *elements */
	void **elements;

	void *pool_data;
	mempool_alloc_t *alloc;
	mempool_free_t *free;
	wait_queue_head_t wait;
} mempool_t;

The idea behind the memory pool is as follows. When a memory pool is created, a specified number of memory objects are allocated up front and kept in the pool as a reserve. On allocation, the normal allocator is tried first, and only if it fails is an object taken from the reserve. On release, if the pool currently holds fewer reserved objects than the specified number, the freed object goes back into the reserve without being really released; if the reserve is already at the specified number, the object is really released.
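The reserve logic described above can be sketched as a small user-space model. This is not the kernel's mempool implementation: every name here (model_pool, model_pool_create, backing_alloc, and so on) is hypothetical, malloc stands in for the real allocator, there is no locking, and where the kernel may wait for an element to be returned this model simply gives up and returns NULL.

```c
#include <stdlib.h>

/* User-space model of the reserve logic; all names are hypothetical. */
typedef void *(*alloc_fn_t)(void *pool_data);
typedef void (*free_fn_t)(void *element, void *pool_data);

struct model_pool {
    int min_nr;          /* number of elements to keep in reserve */
    int curr_nr;         /* elements currently held in reserve */
    void **elements;     /* the reserve itself */
    void *pool_data;     /* passed through to alloc_fn/free_fn */
    alloc_fn_t alloc_fn;
    free_fn_t free_fn;
};

/* Really release everything still in reserve, then the pool itself. */
void model_pool_destroy(struct model_pool *p)
{
    while (p->curr_nr > 0)
        p->free_fn(p->elements[--p->curr_nr], p->pool_data);
    free(p->elements);
    free(p);
}

/* Create a pool and pre-allocate min_nr (>= 1) reserved elements. */
struct model_pool *model_pool_create(int min_nr, alloc_fn_t a,
                                     free_fn_t f, void *data)
{
    if (min_nr < 1)
        return NULL;
    struct model_pool *p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    p->elements = calloc((size_t)min_nr, sizeof(void *));
    if (!p->elements) {
        free(p);
        return NULL;
    }
    p->min_nr = min_nr;
    p->pool_data = data;
    p->alloc_fn = a;
    p->free_fn = f;
    while (p->curr_nr < min_nr) {
        void *e = a(data);
        if (!e) {
            model_pool_destroy(p);
            return NULL;
        }
        p->elements[p->curr_nr++] = e;
    }
    return p;
}

/* Try the normal allocator first; fall back to the reserve on failure. */
void *model_pool_alloc(struct model_pool *p)
{
    void *e = p->alloc_fn(p->pool_data);
    if (e)
        return e;
    if (p->curr_nr > 0)
        return p->elements[--p->curr_nr];
    return NULL;   /* simplification: this model does not wait and retry */
}

/* Refill the reserve if it is below min_nr; otherwise really free. */
void model_pool_free(void *element, struct model_pool *p)
{
    if (p->curr_nr < p->min_nr)
        p->elements[p->curr_nr++] = element;
    else
        p->free_fn(element, p->pool_data);
}

/* Hypothetical backing allocator with a switch to simulate failure. */
struct backing {
    size_t size;
    int fail;
};

void *backing_alloc(void *data)
{
    struct backing *b = data;
    return b->fail ? NULL : malloc(b->size);
}

void backing_free(void *element, void *data)
{
    (void)data;
    free(element);
}
```

With a pool of two 64-byte elements, setting the fail switch makes the next two allocations come out of the reserve and the third return NULL; freeing elements afterwards refills the reserve before anything is really released.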

The memory reserved by a memory pool is effectively wasted, so it is best to avoid memory pools where possible.

3.1 Creating a memory pool

mempool_t *mempool_create(int min_nr, mempool_alloc_t *alloc_fn, mempool_free_t *free_fn, void *pool_data);

This function creates a memory pool. It first uses alloc_fn to allocate min_nr memory objects and saves them in the pool.

min_nr: the minimum number of memory objects the pool keeps in reserve.

alloc_fn: the function that performs the real memory allocation; mempool_alloc_slab is generally suitable.

free_fn: the function that performs the real memory release; mempool_free_slab is generally suitable.

pool_data: the argument passed to the user-supplied functions (alloc_fn, free_fn).

A memory pool that is no longer needed can be destroyed with void mempool_destroy(mempool_t *pool).

3.2 Allocating memory from and freeing memory to the memory pool

void *mempool_alloc(mempool_t *pool, int gfp_mask);

Allocates memory from the memory pool pool using the flags gfp_mask. If the normal allocation fails (that is, the user-supplied allocation function given at pool creation fails), an object from the reserve is returned instead. gfp_mask is similar to the flags of kmalloc.

void mempool_free(void *element, mempool_t *pool);

Returns the memory element to the memory pool pool. If the pool currently holds fewer than min_nr reserved objects, the memory is returned to the pool's reserve; otherwise the real release function (the user-supplied release function given at pool creation) is called.

IV. Allocating large chunks of memory

If a kernel component or module needs large chunks of memory, page-oriented techniques are usually preferable, since kmalloc limits the maximum request size.

4.1 Allocating and freeing pages (using address pointers)

get_zeroed_page(gfp_t gfp_mask);

Returns a pointer to a new page and fills the page with zeros.

__get_free_page(gfp_t gfp_mask);

Similar to get_zeroed_page, but does not clear the page.

__get_free_pages(unsigned int gfp_mask, unsigned int order);

Allocates a memory area that may span several physically contiguous pages and returns a pointer to its first byte; the area is not zeroed.

order: the exponent of a power of two that specifies how many pages to allocate; for example, order 0 means 2 to the power 0, that is, 1 page.

gfp_mask: same as the flags of kmalloc.

The allocation may fail, so the caller must handle the failure.
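The relationship between order, page count, and byte size is simple arithmetic, sketched here in user space. MODEL_PAGE_SIZE is an assumed 4 KiB page (the real PAGE_SIZE is architecture-dependent), and pages_for_order and order_for_size are hypothetical helper names; the kernel provides its own get_order() helper for the size-to-order direction.

```c
#include <stddef.h>

/* Assumed page size for illustration; the real PAGE_SIZE depends on
 * the architecture and kernel configuration. */
#define MODEL_PAGE_SIZE 4096UL

/* Number of pages covered by an allocation of the given order. */
unsigned long pages_for_order(unsigned int order)
{
    return 1UL << order;
}

/* Smallest order whose 2^order pages can hold `size` bytes.
 * Sketch only: does not guard against overflow for huge sizes. */
unsigned int order_for_size(size_t size)
{
    unsigned int order = 0;

    while ((MODEL_PAGE_SIZE << order) < size)
        order++;
    return order;
}
```

Because the page count is always a power of two, a request for three pages (12288 bytes) needs order 2 and therefore occupies four pages.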

void free_page(unsigned long addr);

void free_pages(unsigned long addr, unsigned long order);

These two functions free pages. Note that if an order was specified at allocation, the same value must be used when freeing. Note also that the APIs in this section cannot be used to allocate high memory.

4.2 Allocating and freeing pages (using struct page)

static inline struct page *alloc_pages(gfp_t gfp_mask, unsigned int order)

#define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)

These two interfaces also allocate pages, but they return a pointer to the struct page data structure rather than the starting address of the pages. gfp_mask is similar to the flags of kmalloc, and order is similar to the order parameter of __get_free_pages.

The memory pages that are allocated using them should be returned to the system using the following interface:

void __free_page(struct page *page);

void __free_pages(struct page *page, unsigned int order);

void free_hot_page(struct page *page);

void free_cold_page(struct page *page);

The APIs here can be used for high memory.

4.3 vmalloc/vfree

vmalloc allocates a region that is contiguous in the virtual address space. The pages may not be contiguous in physical memory (each page is obtained with alloc_page), but the kernel sees them as one contiguous address range.

Memory obtained from vmalloc is slightly less efficient to use, so it is not recommended when an alternative exists. In addition, an address returned by vmalloc must be translated through the page tables to find the real physical address, so a kernel component that needs a real physical address cannot allocate its memory with vmalloc.

void *vmalloc(unsigned long size); allocates memory.

void vfree(void *addr); releases memory allocated by vmalloc.

4.3.1 Differences between vmalloc, kmalloc, and __get_free_pages

vmalloc returns a virtual address, but so do kmalloc, __get_free_pages, and the related functions. Why, then, is vmalloc inefficient while the other two are not? Because the virtual addresses they return are of different kinds.

The addresses returned by vmalloc lie between VMALLOC_START and VMALLOC_END. Page tables must be created for this range when it is used, and the physical addresses behind it may be discontiguous, so every access goes through those page tables. The virtual addresses returned by kmalloc and __get_free_pages belong to the regular kernel virtual address space, whose defining property is that each address differs from the real physical address by a fixed offset, so the two correspond one-to-one.

It also follows that, although both __get_free_pages and vmalloc can return large chunks of memory, the memory returned by vmalloc may consist of several scattered physical pages and needs page tables to be created (after obtaining pages with alloc_page(s), vmalloc uses map_vm_area to create the page table entries), whereas __get_free_pages returns a physically contiguous set of regular memory pages whose page tables are already established.

vmalloc cannot be used in atomic context: it creates page tables and therefore uses kmalloc to allocate memory for them, a step that may sleep.
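The fixed-offset relationship of the regular kernel address space can be illustrated with plain arithmetic. The sketch below is a user-space model only: MODEL_PAGE_OFFSET is an assumed value (the classic 3G/1G split on 32-bit x86), and model_va and model_pa are hypothetical stand-ins for the kernel's own address conversion helpers.

```c
#include <stdint.h>

/* Assumed offset for illustration (classic 32-bit x86 3G/1G split);
 * the real value depends on architecture and configuration. */
#define MODEL_PAGE_OFFSET 0xC0000000ULL

/* Physical -> regular kernel virtual address: add a fixed offset. */
uint64_t model_va(uint64_t phys)
{
    return phys + MODEL_PAGE_OFFSET;
}

/* Regular kernel virtual -> physical address: subtract the offset. */
uint64_t model_pa(uint64_t virt)
{
    return virt - MODEL_PAGE_OFFSET;
}
```

Because the conversion is a constant offset, two addresses that are adjacent in this virtual range are also adjacent physically. No such arithmetic exists for a vmalloc address, which can only be resolved by walking the page tables.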

4.4 ioremap

Like vmalloc, ioremap creates new page tables; unlike vmalloc, it does not actually allocate any memory. The return value of ioremap is a special virtual address used to access a specific physical address range, and an address obtained this way must be released with iounmap. An address returned by ioremap is best accessed through the I/O read/write functions rather than dereferenced directly.

V. Acquiring large chunks of contiguous physical memory

Once the kernel has booted, and especially after it has been running for a while, large chunks of contiguous physical memory are hard to obtain with the methods above. The only one of them that returns large physically contiguous areas is __get_free_pages, and after the kernel has run for some time it may no longer find a large physically contiguous region. If a kernel component really needs a large chunk of physically contiguous memory, the best approach is to allocate it during boot and keep it for its own use.

The way to allocate and preserve memory during startup is to invoke the following APIs:

#include <linux/bootmem.h>

void *alloc_bootmem(unsigned long size);

This function allocates a memory region of the specified size.

void *alloc_bootmem_low(unsigned long size);

This function allocates memory of the specified size in the low address area, that is, at addresses below ARCH_LOW_ADDRESS_LIMIT.

void *alloc_bootmem_pages(unsigned long size);

This function allocates a memory region of the specified size, and the returned address is page-aligned.

void *alloc_bootmem_low_pages(unsigned long size);

This function allocates memory of the specified size in the low address area (addresses below ARCH_LOW_ADDRESS_LIMIT), and the returned address is page-aligned.

Note that this kind of allocation is restricted: code that uses it must be built into the kernel and run at system startup, so a module cannot use this technique. In addition, memory allocated this way is invisible to the memory management subsystem, so it reduces the memory available to the system.

Memory allocated with this technique can be freed with free_bootmem, but doing so does not hand the memory back to the memory management subsystem (unless it is returned before the memory management subsystem is initialized).

Besides bootmem, newer kernels introduce another mechanism for allocating reserved memory during boot: memblock. During initialization, memblock collects information about all of the system's physical memory and divides it into two classes, regular memory and reserved memory; memory is regular when it is first discovered. A kernel component can request and reserve a memory area with the memblock_alloc API and release it with memblock_free. The working mechanism is similar to bootmem, with these differences:

bootmem can only be used after the bootmem allocator has been initialized. In the code, bootmem is initialized in start_kernel -> setup_arch -> do_init_bootmem, and it is usable only after that step. memblock is initialized in early_setup -> early_init_devtree -> early_init_dt_scan_memory_ppc (on PPC), and on PPC early_setup runs before start_kernel. The ordering also follows from what the two do: bootmem is a memory allocator that needs physical memory to manage, while memblock performs the detection of physical memory, so bootmem can only become available after memblock.

memblock tracks reserved areas and regular memory as lists of regions, whereas bootmem manages memory with a bitmap, so memblock is comparatively more flexible. On each allocation, bootmem may start searching for a contiguous area that satisfies the request from the address where the previous allocation ended, which can introduce fragmentation, and when there are many allocations, searching the bitmap is also slow. These are the drawbacks of a simple allocator; see the code for details.
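To make the bitmap approach concrete, here is a minimal user-space model of a bitmap page allocator. It is a sketch under stated assumptions, not the bootmem code: it manages only 64 hypothetical pages in one 64-bit word, all names (model_bitmap, bitmap_alloc, bitmap_free) are invented for illustration, and it always searches from page 0 (first fit) rather than from the end of the previous allocation.

```c
#include <stdint.h>

#define MODEL_NPAGES 64   /* assumed: the model manages 64 pages */

/* One bit per page: 1 = reserved, 0 = free. */
uint64_t model_bitmap;

int page_is_used(int idx)
{
    return (int)((model_bitmap >> idx) & 1);
}

/* Reserve `count` contiguous pages using first fit.
 * Returns the index of the first page, or -1 if no run is found. */
int bitmap_alloc(int count)
{
    if (count <= 0 || count > MODEL_NPAGES)
        return -1;

    for (int start = 0; start + count <= MODEL_NPAGES; start++) {
        int run = 0;

        /* Measure the free run beginning at `start`. */
        while (run < count && !page_is_used(start + run))
            run++;

        if (run == count) {
            for (int i = 0; i < count; i++)
                model_bitmap |= (uint64_t)1 << (start + i);
            return start;
        }
        /* Skip to the used page that ended the run; the loop's
         * start++ then moves past it. */
        start += run;
    }
    return -1;
}

/* Mark `count` pages starting at `start` as free again. */
void bitmap_free(int start, int count)
{
    for (int i = 0; i < count; i++)
        model_bitmap &= ~((uint64_t)1 << (start + i));
}
```

The model also shows how fragmentation arises: after allocating 4 pages and then 2 pages and freeing the first 4, a request for 8 pages cannot use the 4-page hole at the start and is placed at page 6 instead. Every allocation also scans the bitmap linearly, which is the slowness mentioned above.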
