From: https://www.ibm.com/developerworks/cn/linux/l-linux-slab-allocator/
Good operating system performance depends in part on the operating system's ability to manage resources efficiently. Historically, the heap memory manager was the de facto norm, but its performance suffered from memory fragmentation and the overhead of memory reclamation. Today, the Linux kernel uses a method that originated in Solaris, but which has also been used in embedded systems for quite some time: allocating memory as objects grouped by size. This article explores the ideas behind the slab allocator and introduces the interfaces it provides and their usage.
Dynamic Memory Management
The goal of memory management is to provide a method by which memory can be dynamically shared among a variety of users for a variety of purposes. A memory management method should accomplish two things:

- Minimize the time required to manage memory
- Maximize the memory available for general use (minimize management overhead)

Memory management is ultimately a zero-sum game of trade-offs. You can develop an algorithm that uses little memory for management but spends more time managing the available memory, or one that manages memory efficiently in time but consumes more memory doing so. In the end, the requirements of the particular application drive the choice of trade-off.
Most memory managers use a heap-based allocation policy. In this method, a large block of memory (called the heap) is used to provide memory for user-defined purposes. When a piece of memory is needed, the user requests an allocation of a given size. The heap manager looks at the available memory (using a particular algorithm) and returns a block. Some of the algorithms used in this search are first-fit (the first block found in the heap that satisfies the request) and best-fit (the most suitable block in the heap that satisfies the request). When the user is finished with the memory, it is returned to the heap.
The fundamental problem with this heap-based allocation policy is fragmentation. Because blocks of memory are allocated and returned in different orders and at different times, holes are left in the heap, and it takes time to manage the free memory efficiently. This scheme is generally quite memory-efficient (it allocates just the memory that is needed), but at the cost of more time spent managing the heap.
Another approach, called buddy memory allocation, is a faster memory allocation technique that divides memory into power-of-two partitions and attempts to satisfy requests using a best-fit approach. When the user frees a block of memory, its buddy block is checked to see whether it has also been freed. If so, the blocks are merged to minimize fragmentation. This algorithm is more time-efficient, but the power-of-two rounding of the best-fit approach can waste memory.
This article focuses on memory management in the Linux kernel, and in particular on the mechanism known as slab allocation.
Slab Cache
The slab allocator used in Linux is based on an algorithm first introduced by Jeff Bonwick for the SunOS operating system. Jeff's allocator revolves around object caching. In the kernel, a great deal of memory is allocated for a finite set of objects, such as file descriptors and other common structures. Jeff found that the time required to initialize a common kernel object exceeded the time required to allocate and free it. He therefore concluded that instead of freeing memory back to a global pool, the memory should remain initialized for its intended purpose. For example, if memory is allocated for a mutex, the mutex initialization function (mutex_init) needs to run only once, when the memory is first allocated; subsequent allocations do not need to run it, because the object is already in the desired state from the last free and destructor call.

The Linux slab allocator uses these ideas and others to build a memory allocator that is efficient in both space and time.
Figure 1 shows a high-level view of the slab structure. At the highest level is the cache_chain, a linked list of slab caches. It is useful for the best-fit algorithm, which traverses the list to find the cache that best fits the desired allocation size. Each element of cache_chain is a reference to a kmem_cache structure (called a cache), which defines a pool of objects of a given size to be managed.
Figure 1. The major structures of the slab allocator
Each cache contains a list of slabs, which are contiguous blocks of memory (typically pages). Three kinds of slab exist:

- slabs_full: fully allocated slabs
- slabs_partial: partially allocated slabs
- slabs_empty: empty slabs (no objects allocated)
Note: slabs on the slabs_empty list are candidates for reaping. Through this process, the memory used by a slab is returned to the operating system for other uses.
Each slab on a slab list is a contiguous block of memory (one or more contiguous pages) that is divided into objects. These objects are the fundamental elements that are allocated and freed from a given cache. Note that the slab is the smallest unit the slab allocator operates on; if a cache needs to grow, a whole slab is the minimum by which it is extended. Typically, each slab holds several objects.
Because objects are allocated from and freed back to slabs, individual slabs move between the slab lists. For example, when the last free object in a slab on the slabs_partial list is consumed, the slab moves to the slabs_full list. When a slab is fully allocated and one of its objects is freed, it moves from slabs_full to slabs_partial. When all of its objects have been freed, it moves from slabs_partial to slabs_empty.
Motivation behind Slab
The slab allocator provides several advantages over traditional memory management schemes. First, the kernel commonly relies on small objects that are allocated countless times over the system's lifetime. The slab allocator serves this need by caching objects of similar size, avoiding the usual fragmentation problems. The slab allocator also supports initialization of common objects, avoiding repeated initialization of the same memory for the same purpose. Finally, the slab allocator supports hardware cache alignment and coloring, which lets objects in different caches use the hardware cache lines more effectively, improving cache utilization and performance.
API functions
Now let's look at the application programming interface (API) for creating new slab caches, adding memory to a cache, destroying a cache, and allocating and freeing objects within a slab.
The first step is to create a slab cache reference, which can be declared statically as follows:

```c
struct kmem_cache *my_cachep;
```

Note: You can find the slab cache source code in ./linux/mm/slab.c, where the kmem_cache structure is also defined. This article focuses on the implementation in Linux kernel 2.6.21.
This reference is then used by the other slab cache functions for creation, deletion, and allocation. The kmem_cache structure contains per-CPU data, a set of tunable parameters (accessible through the proc file system), statistics, and the elements needed to manage the slab cache.
kmem_cache_create

The kernel function kmem_cache_create creates a new cache. This is typically done at kernel initialization time or when a kernel module is first loaded. Its prototype is defined as follows:

```c
struct kmem_cache *kmem_cache_create( const char *name,
                                      size_t size,
                                      size_t align,
                                      unsigned long flags,
                                      void (*ctor)(void*, struct kmem_cache *, unsigned long),
                                      void (*dtor)(void*, struct kmem_cache *, unsigned long) );
```
The name parameter defines the cache name, which the proc file system uses (in /proc/slabinfo) to identify the cache. The size parameter specifies the size of the objects created in this cache, and the align parameter defines the alignment required for each object. The flags parameter specifies options to enable for the cache; see Table 1.
Table 1. Some of the options for kmem_cache_create (specified in the flags parameter)

| Option | Description |
| --- | --- |
| SLAB_RED_ZONE | Inserts markers at the head and tail of each object to support detection of buffer overruns. |
| SLAB_POISON | Fills the slab with a known pattern to allow monitoring of objects in the cache (objects are owned, but could still be modified externally). |
| SLAB_HWCACHE_ALIGN | Specifies that cache objects must be aligned to the hardware cache line. |
The ctor and dtor parameters define an optional object constructor and destructor. The constructor and destructor are user-provided callback functions; when a new object is allocated from the cache, it can be initialized with the constructor.
After the cache is created, kmem_cache_create returns a reference to it. Note that this function allocates no memory for the cache. Instead, when an attempt is made to allocate an object from the (initially empty) cache, a refill operation allocates memory for it. The same operation is also used to add memory when all of a cache's objects have been consumed.
kmem_cache_destroy

The kernel function kmem_cache_destroy destroys a cache. This call is typically made by a kernel module when it is unloaded. The cache must be empty when this function is called.

```c
void kmem_cache_destroy( struct kmem_cache *cachep );
```
kmem_cache_alloc

To allocate an object from a named cache, use the kmem_cache_alloc function. The caller provides the cache from which to allocate and a set of flags:

```c
void *kmem_cache_alloc( struct kmem_cache *cachep, gfp_t flags );
```
This function returns an object from the cache. Note that if the cache is currently empty, it may call cache_alloc_refill to add memory to the cache. The flags options for kmem_cache_alloc are the same as those for kmalloc; Table 2 lists some of them.
Table 2. Flags options for the kmem_cache_alloc and kmalloc kernel functions

| Flag | Description |
| --- | --- |
| GFP_USER | Allocates memory on behalf of a user (the call may sleep). |
| GFP_KERNEL | Allocates memory from kernel RAM (the call may sleep). |
| GFP_ATOMIC | Forces the call not to sleep (useful in interrupt handlers). |
| GFP_HIGHUSER | Allocates from high memory. |
Note: NUMA slab allocation — for NUMA (non-uniform memory access) architectures, the node-specific allocation function is kmem_cache_alloc_node.
kmem_cache_zalloc

The kernel function kmem_cache_zalloc is similar to kmem_cache_alloc, except that it performs a memset on the object to zero it before returning it to the caller.
kmem_cache_free

To free an object back to its slab, use kmem_cache_free. The caller provides the cache reference and the object to be freed.

```c
void kmem_cache_free( struct kmem_cache *cachep, void *objp );
```
kmalloc and kfree

The most commonly used memory management functions in the kernel are kmalloc and kfree. Their prototypes are:

```c
void *kmalloc( size_t size, int flags );
void kfree( const void *objp );
```
Note that kmalloc and kfree work much like the slab cache functions defined earlier. However, instead of requiring a named slab cache for the objects to be allocated, kmalloc iterates over the available caches to find one that can satisfy the size constraint. Once found, an object is allocated (with __kmem_cache_alloc). To free an object with kfree, the cache from which the object was allocated is determined with a call to virt_to_cache. This function returns a cache reference, which is then used in a call to __cache_free to release the object.
Note: General-purpose object allocation — in the slab source code, kmem_find_general_cachep performs the cache search to find the slab cache that best fits the desired object size.
Other functions

The slab cache API offers several other useful functions. kmem_cache_size returns the size of the objects managed by a cache. You can also call kmem_cache_name to retrieve the name of a given cache (defined when the cache was created). A cache can be shrunk by releasing its free slabs with kmem_cache_shrink. Note that this operation (called reaping) is also performed periodically by the kernel itself (through kswapd).

```c
unsigned int kmem_cache_size( struct kmem_cache *cachep );
const char *kmem_cache_name( struct kmem_cache *cachep );
int kmem_cache_shrink( struct kmem_cache *cachep );
```
Slab cache example usage
The following code snippets show the process of creating a new slab cache, allocating and freeing objects from the cache, and then destroying the cache. First, a kmem_cache reference must be defined and then initialized (see Listing 1). This particular cache holds 32-byte objects aligned to the hardware cache line (as specified by the SLAB_HWCACHE_ALIGN flag).
Listing 1. Creating a new slab cache

```c
static struct kmem_cache *my_cachep;

static void init_my_cache( void )
{
    my_cachep = kmem_cache_create(
                    "my_cache",          /* Name */
                    32,                  /* Object size */
                    0,                   /* Alignment */
                    SLAB_HWCACHE_ALIGN,  /* Flags */
                    NULL, NULL );        /* Constructor/destructor */
    return;
}
```
With the slab cache created, you can now allocate objects from it. Listing 2 provides an example of allocating and freeing an object. It also demonstrates two of the other functions.
Listing 2. Allocating and freeing objects

```c
int slab_test( void )
{
    void *object;

    printk( "Cache name is %s\n", kmem_cache_name( my_cachep ) );
    printk( "Cache object size is %d\n", kmem_cache_size( my_cachep ) );

    object = kmem_cache_alloc( my_cachep, GFP_KERNEL );

    if (object) {
        kmem_cache_free( my_cachep, object );
    }

    return 0;
}
```
Finally, Listing 3 demonstrates destroying a slab cache. The caller must ensure that no objects remain allocated from the cache during the destroy operation.
Listing 3. Destroying the slab cache

```c
static void remove_my_cache( void )
{
    if (my_cachep)
        kmem_cache_destroy( my_cachep );
    return;
}
```
Slab proc Interface
The proc file system provides a simple way to monitor all the slab caches active in the system, through the file /proc/slabinfo. Besides offering some tunable parameters writable from user space, it presents detailed information about each active slab cache. The current version of slabinfo provides a header line, which makes the output more readable. For each slab cache in the system, the file reports the number of objects, the number of active objects, and the object size (along with the objects and pages per slab). A set of tunables and per-slab data is provided as well.
To tune a slab cache, you simply write the cache name and three tunable values, as a string, to /proc/slabinfo. The following example increases the limit and batchcount values while leaving the shared factor unchanged (the format is "cache-name limit batchcount shared"):

```shell
# echo "my_cache 128 64 8" > /proc/slabinfo
```
The limit field is the maximum number of objects that may be cached per CPU. The batchcount field is the maximum number of objects transferred from the global cache to each per-CPU cache when the per-CPU cache is empty. The shared parameter controls the sharing behavior on symmetric multiprocessing (SMP) systems.
Note that you must have superuser privileges to tune slab cache parameters through the proc file system.
SLOB allocator

For small embedded systems, a slab emulation layer called SLOB exists. This slab alternative has advantages in small embedded Linux systems because it conserves memory, but it still suffers from fragmentation and scales poorly. With CONFIG_SLAB disabled, the kernel falls back to the SLOB allocator. See the Resources section for more information.
Conclusion

The slab cache allocator's source code is actually among the more readable parts of the Linux kernel. Aside from some indirection in its function calls, the source is quite intuitive and, on the whole, well commented. If you want to learn more about the slab allocator, I recommend starting with the source code, since it is the most up-to-date documentation of the mechanism. The Resources section below lists references that introduce the slab allocator, but unfortunately they are out of date with respect to the current 2.6 implementation.