Linux Memory Management: Block Memory Allocation and the Slab Allocator


Linux Memory Management: Block Memory Allocation



Buddy System

The buddy system is the scheme Linux uses to satisfy requests to allocate and free memory blocks of different sizes.

Memory zones

Linux divides memory into three management zones, ZONE_DMA, ZONE_NORMAL and ZONE_HIGHMEM, and manages each of them with its own zone descriptor.

Each zone descriptor contains a free_area array with 11 elements, which correspond to blocks of 1, 2, 4, 8, 16, ... page frames. Each element is a struct named free_area, defined in include/linux/mmzone.h.

struct free_area {
    struct list_head free_list;
    unsigned long nr_free;
};

For example, the free_list of the element at index 4 of the array heads the list of free blocks of 2 to the power of 4, that is 16, page frames, and nr_free stores the number of free blocks in that list.
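
As a rough illustration of how an index into free_area relates to block size, the following user-space C sketch converts between an order and a page count. The names MAX_ORDER, block_pages and order_for_pages are chosen for this example and are not taken from the kernel source quoted here; MAX_ORDER simply mirrors the 11 array elements described above.

```c
#define MAX_ORDER 11  /* assumption: 11 free lists, orders 0..10 */

/* Number of pages in a block of the given order: 2^order. */
unsigned long block_pages(unsigned int order)
{
    return 1UL << order;
}

/*
 * Smallest order whose block holds at least nr_pages pages,
 * or -1 if the request is zero or larger than the largest block.
 */
int order_for_pages(unsigned long nr_pages)
{
    unsigned int order;

    if (nr_pages == 0)
        return -1;
    for (order = 0; order < MAX_ORDER; order++)
        if (block_pages(order) >= nr_pages)
            return (int)order;
    return -1;
}
```

Under these assumptions, block_pages(4) is 16, and a request for 3 pages maps to order 2, a block of 4 pages.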


Block memory allocation function

Code location: mm/page_alloc.c

__rmqueue takes a pointer to the zone descriptor and the order of the requested block.

static struct page *__rmqueue(struct zone *zone, unsigned int order);

The function starts with the free list of blocks of 2^order page frames. If that list is empty, it moves on to order+1 and keeps going until it finds a non-empty free list, then takes the first page frame descriptor on it.

page = list_entry(area->free_list.next, struct page, lru);

list_entry is a macro that, given a pointer to a member variable, returns a pointer to the structure that contains it. Here it returns a pointer to the page structure whose lru member is area->free_list.next.
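
To make the pointer arithmetic concrete, here is a minimal user-space sketch of the same idea. It assumes a simplified list_entry built on offsetof and a hypothetical struct toy_page standing in for struct page; the kernel's own macro and structure are more elaborate.

```c
#include <stddef.h>

/* Minimal doubly linked list node, modeled on the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Given a pointer to a member, step back by the member's offset
 * to reach the start of the containing structure. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct toy_page {
    unsigned long order;     /* order of the free block this page starts */
    struct list_head lru;    /* links the page into a free list */
};

/* Return the toy_page that owns the first node on the list,
 * or NULL if the list is empty (head points to itself). */
struct toy_page *first_page(struct list_head *head)
{
    if (head->next == head)
        return NULL;
    return list_entry(head->next, struct toy_page, lru);
}
```

The list stores pointers to the embedded lru nodes, not to the pages themselves, which is why the offset subtraction is needed to get back to the page.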


It then removes this page frame descriptor from the list and reduces the free_pages value in the zone descriptor.

This shows what happens when a free block of a given order is requested but no free block of that size exists: a block of the requested size is split off from a larger free block, and the remainder is recorded in the free_area array of the zone descriptor according to the rules of the buddy system. The expand function does this. It halves the larger block repeatedly, and each time it puts the unused half on the free list one order down and stores that order in the private field of the half's first page frame descriptor (private holds the order of a free block).

return expand(zone, page, order, current_order, area);

At this point the required free block has been found, and the function returns a pointer to its first page frame descriptor.
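
The search-and-split behaviour described above can be modelled in a few lines of user-space C. This is a toy under stated assumptions, not the kernel code: struct toy_zone, toy_add_free and toy_rmqueue are hypothetical names, free lists are small arrays of block start indices rather than linked page descriptors, and no locking or accounting is modelled.

```c
#define MAX_ORDER  11   /* assumption: orders 0..10 */
#define MAX_BLOCKS 64   /* assumption: toy capacity of each free list */

/* Toy zone: for each order, a small stack of block start indices. */
struct toy_zone {
    unsigned long blocks[MAX_ORDER][MAX_BLOCKS];
    unsigned int nr_free[MAX_ORDER];
};

/* Put the block starting at idx on the free list of the given order. */
void toy_add_free(struct toy_zone *z, unsigned int order, unsigned long idx)
{
    if (z->nr_free[order] < MAX_BLOCKS)
        z->blocks[order][z->nr_free[order]++] = idx;
}

/*
 * Allocate a block of 2^order pages.
 * Returns the index of its first page, or -1 if no block is free.
 */
long toy_rmqueue(struct toy_zone *z, unsigned int order)
{
    unsigned int cur;

    for (cur = order; cur < MAX_ORDER; cur++) {
        unsigned long idx;

        if (z->nr_free[cur] == 0)
            continue;           /* empty list: try the next larger order */
        idx = z->blocks[cur][--z->nr_free[cur]];

        /* Split step: hand back the upper half until the block fits. */
        while (cur > order) {
            cur--;
            toy_add_free(z, cur, idx + (1UL << cur));
        }
        return (long)idx;
    }
    return -1;
}
```

Starting from a single free block of 16 pages at index 0, toy_rmqueue(&z, 1) returns index 0 and leaves free blocks at index 8 (order 3), index 4 (order 2) and index 2 (order 1).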


Block memory release function

Code location: mm/page_alloc.c

__free_pages_bulk takes a pointer to the first page frame descriptor of the block being freed, the zone_mem_map field (a pointer to the first page frame descriptor in the zone), the zone descriptor, and the order of the block.

static inline void __free_pages_bulk (struct page *page, struct page *base,
struct zone *zone, unsigned int order)

Leaving aside the error-handling code, the main job is to find the buddy of the block being freed and merge the two.

In this part of the implementation, the code that finds the buddy is a little obscure but very clever.

buddy_idx = (page_idx ^ (1 << order));
buddy = base + buddy_idx;

page_idx is the index of the freed block in zone_mem_map. To find the index of its buddy block, page_idx is XORed with the block size, which flips bit number order of page_idx. The result is an index either one block size above or one block size below page_idx, and that is the index of the buddy block in zone_mem_map. The function then calls page_is_buddy to check whether the buddy can be merged. If it cannot, the loop exits. If it can, order is incremented and the loop continues, looking for a larger buddy block to merge with.
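
The XOR step is easy to check by hand with a small user-space helper. buddy_index and merged_index below are hypothetical names for this sketch; the second function shows one way to obtain the start of the merged block, by clearing the same bit instead of flipping it.

```c
/* Index of the buddy of the block that starts at page_idx
 * and has the given order: flip bit number `order`. */
unsigned long buddy_index(unsigned long page_idx, unsigned int order)
{
    return page_idx ^ (1UL << order);
}

/* Index of the block formed by merging the two buddies:
 * the lower of the two indices, i.e. bit number `order` cleared. */
unsigned long merged_index(unsigned long page_idx, unsigned int order)
{
    return page_idx & ~(1UL << order);
}
```

For example, a block of order 2 at index 12 has its buddy at index 8, and a block of order 3 at index 0 has its buddy at index 8. Merging the order-2 blocks at 8 and 12 gives an order-3 block starting at index 8.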

coalesced = base + page_idx;
set_page_order(coalesced, order);
list_add(&coalesced->lru, &zone->free_area[order].free_list);
zone->free_area[order].nr_free++;

When the loop ends, page_idx holds the index in zone_mem_map of the first page frame descriptor of the merged block. Adding it to base, which is zone_mem_map itself, gives coalesced, a pointer to the first page frame descriptor of the merged block (the name is rather vague). The merged block becomes the block that is finally freed: the private field of its first page frame descriptor is updated, it is inserted into the appropriate free list, and that list's count of free blocks is incremented.
At this point the release of the block is complete.
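
The whole merge loop can be sketched as a user-space model. It rests on several simplifications: struct toy_map, toy_map_init and toy_free are hypothetical names, the zone has only 16 pages, a plain array records the order of each free block in place of page descriptors and free lists, and the page_is_buddy check is reduced to asking whether a free block of the same order starts at the buddy index.

```c
#define TOY_PAGES     16  /* assumption: a tiny zone of 16 pages */
#define TOY_MAX_ORDER 5   /* orders 0..4, so the largest block is 16 pages */

/* free_order[i] is the order of the free block starting at page i,
 * or -1 if no free block starts there. */
struct toy_map {
    int free_order[TOY_PAGES];
};

void toy_map_init(struct toy_map *m)
{
    for (int i = 0; i < TOY_PAGES; i++)
        m->free_order[i] = -1;
}

/*
 * Free the block at page_idx (which must be aligned to 2^order),
 * merging with free buddies of the same order for as long as possible.
 * Returns the order of the final merged block.
 */
unsigned int toy_free(struct toy_map *m, unsigned long page_idx,
                      unsigned int order)
{
    while (order < TOY_MAX_ORDER - 1) {
        unsigned long buddy_idx = page_idx ^ (1UL << order);

        /* Simplified page_is_buddy: free and of the same order? */
        if (m->free_order[buddy_idx] != (int)order)
            break;
        m->free_order[buddy_idx] = -1;  /* take the buddy off its list */
        page_idx &= buddy_idx;          /* merged block starts at the lower index */
        order++;
    }
    m->free_order[page_idx] = (int)order;  /* record the final block */
    return order;
}
```

With blocks of order 2 at index 0 and order 3 at index 8 already free, freeing the order-2 block at index 4 merges twice and ends with a single order-4 block at index 0.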


Linux Memory Management: Slab Allocator



Memory areas and memory objects

The buddy system is the scheme Linux uses to satisfy requests to allocate and free memory blocks of different sizes, and it serves the slab allocator's requests for page frames.

If we need a sequence of memory cells of arbitrary length at contiguous physical addresses, that is, memory whose size is not fixed in advance, a finer-grained management scheme is needed on top of the buddy system.

When Linux allocates memory, it initializes that memory as a particular type, that is, as a memory object such as a signal, a process descriptor or a file descriptor, and the object is destroyed when it is released.

However, initialization and destruction can take longer than allocating the memory itself, so a mechanism is needed that avoids them. This helps the performance of the whole cycle from allocation through use to release.


Slab Allocator

The slab allocator grew out of these two requirements.

It was first introduced by Jeff Bonwick in the SunOS 5.4 (Solaris 2.4) kernel and is now widely used in Unix-like systems.

It acts as a middle layer between the buddy system and the memory area allocation interface.
As a complete example, assume that no cache is available yet when the slab allocator receives its first memory allocation request. The buddy system is called first to allocate page frames of sufficient size, and that memory is then initialized according to the rules for the object type and returned. When the memory is released, the slab allocator does not destroy it right away; it only marks it as available. The next allocation request of the same type is served directly from this available, still initialized memory. In this way the cache keeps a list of available memory areas of one type, and as long as that list is not empty, allocation requests of that type are satisfied without the initialization step.

So the core of the slab allocator is caching, a clever mechanism that trades space for time.
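
The reuse idea can be demonstrated with a toy object cache in user-space C. Everything here is an assumption made for illustration: struct toy_cache and the toy_cache_* functions are hypothetical, malloc stands in for the page allocator, and a fixed-size array stands in for the list of available objects. The point is only that the constructor runs when the cache grows, not on every allocation.

```c
#include <stddef.h>
#include <stdlib.h>

#define GROW_OBJS 4    /* assumption: objects constructed per growth step */
#define MAX_FREE  64   /* assumption: capacity of the available-object array */

struct toy_cache {
    size_t obj_size;            /* size of one object in bytes */
    void (*ctor)(void *);       /* constructor, may be NULL */
    void *free_objs[MAX_FREE];  /* constructed objects ready for reuse */
    size_t nr_free;             /* number of entries in free_objs */
    unsigned long constructed;  /* objects constructed so far (demo only) */
};

/* Example constructor: initializes an int-sized object to 42. */
void toy_int_ctor(void *obj)
{
    *(int *)obj = 42;
}

/* Grow the cache: obtain raw memory and construct each new object once. */
static void toy_cache_grow(struct toy_cache *c)
{
    for (int i = 0; i < GROW_OBJS; i++) {
        void *obj = malloc(c->obj_size);

        if (!obj)
            return;
        if (c->ctor)
            c->ctor(obj);
        c->constructed++;
        c->free_objs[c->nr_free++] = obj;
    }
}

/* Hand out an available object, growing the cache only when none is left. */
void *toy_cache_alloc(struct toy_cache *c)
{
    if (c->nr_free == 0)
        toy_cache_grow(c);
    if (c->nr_free == 0)
        return NULL;
    return c->free_objs[--c->nr_free];
}

/* Return an object without destroying it, so it can be reused as is. */
void toy_cache_free(struct toy_cache *c, void *obj)
{
    if (c->nr_free == MAX_FREE) {
        free(obj);              /* array full: give the memory back */
        return;
    }
    c->free_objs[c->nr_free++] = obj;
}
```

Allocating an object, freeing it and allocating again hands back the same, still initialized object, and the constructed counter does not change between the two allocations.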


Cache

At this point a few words about caches are in order.

Because coherence between the levels of a multilevel cache is handled by the hardware, Linux sees the CPU as having a single logical cache. Its unit is the line, which generally consists of a few dozen consecutive bytes. There is also a cache controller that holds the table mapping cache lines to main memory, and this table determines whether a memory access hits the cache.

During system initialization, Linux calls kmem_cache_init() and kmem_cache_sizes_init() to set up a series of caches and create the descriptors that the slab allocator uses. These slab caches are kernel data structures in ordinary memory and should not be confused with the hardware cache.

A slab itself lives in page frames belonging to its cache. When its objects are accessed, their contents pass through the hardware cache like any other memory, and the cache controller keeps the hardware cache consistent with main memory.


Core processing flow


Allocating page frames for a new slab
kmem_getpages() calls the buddy system's alloc_pages() interface and returns a pointer to the first page frame descriptor.

Assigning the slab to a cache
cache_grow() adds a new slab to the cache and calls cache_init_objs(), whose main code is as follows

static void cache_init_objs(struct kmem_cache *cachep, struct page *page)
{
    int i;

    for (i = 0; i < cachep->num; i++) {
        void *objp = index_to_obj(cachep, page, i);

        /* run the constructor registered with the cache, if any */
        if (cachep->ctor)
            cachep->ctor(objp);
        /* mark object i as free and record it as available */
        set_obj_status(page, i, OBJECT_FREE);
        set_free_obj(page, i, i);
    }
}

The first parameter is a pointer to the descriptor of the cache, and the second is a pointer to the first page frame descriptor allocated by the buddy system in the previous step.
The function runs the constructor registered with the cache on every object in the new slab and marks all of them free.
This shows that the object type is bound to the cache: all the objects in the slabs of the cache that a cache descriptor describes are of the same type.
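
How an object index could map to an address inside a slab is sketched below in user-space C. This is an assumption-laden model rather than the kernel's implementation: struct toy_slab and the toy_* functions are hypothetical, and the sketch simply assumes that objects of equal size are packed back to back from a start address.

```c
#include <stddef.h>

/* Hypothetical, simplified slab: objects packed back to back in one buffer. */
struct toy_slab {
    unsigned char *mem;    /* start of the object area */
    size_t obj_size;       /* size of one object in bytes */
    unsigned int num;      /* number of objects in the slab */
};

/* Address of object number idx. */
void *toy_index_to_obj(const struct toy_slab *s, unsigned int idx)
{
    return s->mem + (size_t)idx * s->obj_size;
}

/* Inverse mapping: index of the object that contains this address. */
unsigned int toy_obj_to_index(const struct toy_slab *s, const void *obj)
{
    size_t offset = (size_t)((const unsigned char *)obj - s->mem);

    return (unsigned int)(offset / s->obj_size);
}

/* Example constructor: writes a marker into the first byte of an object. */
void toy_mark_ctor(void *obj)
{
    *(unsigned char *)obj = 0xAB;
}

/* Run ctor on every object in the slab, in the spirit of the loop above. */
void toy_init_objs(const struct toy_slab *s, void (*ctor)(void *))
{
    for (unsigned int i = 0; i < s->num; i++)
        if (ctor)
            ctor(toy_index_to_obj(s, i));
}
```

With a 64-byte buffer divided into four 16-byte objects, object 3 starts at offset 48, and any address from offset 32 to 47 maps back to object 2.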

Allocating slab objects


Suppose the previous step initialized a slab that serves objects of type A, so that we now have a set of free objects of type A.
When a memory allocation request arrives, kmem_cache_alloc(cachep, flags) is called with the descriptor of the cache for the wanted type (caches for different types have different constructors), and it returns a pointer to a free memory object of that type.

Summary

First, the system initializes two groups of caches. The first group contains one cache for each specific object type (process descriptor (PCB), signal, and so on). The second group is a series of caches for general-purpose objects whose sizes are powers of two, ranging from 32 to 131072 bytes. When other kernel modules request memory, they specify the cache that corresponds to the type of memory they need. For example, creating a new process uses the process descriptor cache from the first group, while the kmalloc function uses a cache from the second group. The slab allocator takes a free memory object from the specified cache and returns it to the requester.
The slab allocator takes care of allocating and freeing memory objects and keeps the work of constructing new objects and destroying old ones to a minimum, which is how it trades space for time.
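
How a request size could be matched to one of the general-purpose caches is sketched below. general_cache_size is a hypothetical helper, not a kernel function; it only assumes the power-of-two sizes from 32 to 131072 bytes stated above and picks the smallest one that fits.

```c
#include <stddef.h>

#define MIN_GENERAL_SIZE 32UL       /* smallest general-purpose cache, per the text */
#define MAX_GENERAL_SIZE 131072UL   /* largest general-purpose cache, per the text */

/*
 * Object size of the general-purpose cache that would serve a request
 * of `size` bytes: the smallest power of two in the range above that is
 * at least `size`. Returns 0 if the request is zero or too large.
 */
size_t general_cache_size(size_t size)
{
    size_t cache_size;

    if (size == 0 || size > MAX_GENERAL_SIZE)
        return 0;
    for (cache_size = MIN_GENERAL_SIZE; cache_size < size; cache_size <<= 1)
        ;   /* double until the object fits */
    return cache_size;
}
```

A 100-byte request would land in the 128-byte cache, so 28 bytes of each such object go unused. That waste is the space given up in exchange for fast allocation.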

