I found that I was often asked about this topic in interviews. I had also studied the slab mechanism of memcached and noticed that many servers use this mechanism to allocate memory, so I decided to learn about it properly.
First, a preliminary look at the buddy system in memory allocation:
In programming and in server software, you often need to allocate groups of contiguous page frames. When contiguous blocks of different sizes are requested and released frequently, many free page frames inevitably end up scattered among the allocated blocks. Even though these page frames are free, it may then be impossible to allocate a large contiguous block. Linux uses the buddy system to solve this problem. All free page frames are grouped into 11 linked lists, which hold blocks of 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024 contiguous page frames respectively. A single request can therefore cover at most 1024 contiguous page frames, which corresponds to 4 MB of contiguous memory with 4 KB pages. The physical address of the first page frame of each block is an integer multiple of the block size. For example, the starting address of a block of 16 page frames is a multiple of 16 × 2^12.
Suppose you request a block of 256 page frames. The allocator first searches the list of 256-frame blocks for a free block. If there is none, it searches the list of 512-frame blocks. If a block is found there, it is split into two 256-frame blocks: one is allocated to the request and the other is moved to the 256-frame list. If the 512-frame list has no free block either, the search continues in the 1024-frame list. If that list is also empty, an error is returned.
When a block is released, the kernel actively tries to merge two buddy blocks into one larger block. After a successful merge, it looks for the buddy of the new block and merges again, until the block size reaches the upper limit or no free buddy exists. Two blocks are buddies only if they meet the following conditions: 1. The two blocks have the same size; 2. The physical addresses of the two blocks are contiguous; 3. The physical address of the first page frame of the first block is an integer multiple of the combined size of the two blocks.
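To make the split and merge steps concrete, here is a minimal sketch in Python. It is not the kernel's code: the class name BuddyAllocator, the use of sets instead of linked lists, and the choice to work with page frame numbers rather than physical addresses are all assumptions made for illustration. The buddy of a block is found by flipping one bit of its starting frame number, which is one common way to satisfy the three buddy conditions at once.

```python
MAX_ORDER = 10  # largest block is 2**10 = 1024 page frames, as in the text


class BuddyAllocator:
    """Toy buddy allocator working on page frame numbers (hypothetical)."""

    def __init__(self, total_frames):
        # free_lists[k] holds the first frame number of every free block
        # of 2**k page frames (11 lists for orders 0..10).
        self.free_lists = [set() for _ in range(MAX_ORDER + 1)]
        block = 1 << MAX_ORDER
        assert total_frames % block == 0, "sketch assumes whole 1024-frame blocks"
        for start in range(0, total_frames, block):
            self.free_lists[MAX_ORDER].add(start)

    def alloc(self, order):
        """Return the first frame of a free 2**order block, or None on failure."""
        for k in range(order, MAX_ORDER + 1):
            if self.free_lists[k]:
                start = min(self.free_lists[k])
                self.free_lists[k].remove(start)
                # Split until the block has the requested size; the upper
                # half of each split goes back to the next smaller list.
                while k > order:
                    k -= 1
                    self.free_lists[k].add(start + (1 << k))
                return start
        return None  # no list had a large enough free block

    def free(self, start, order):
        """Release a block and merge it with its buddy while possible."""
        while order < MAX_ORDER:
            buddy = start ^ (1 << order)  # same size, adjacent, aligned pair
            if buddy not in self.free_lists[order]:
                break  # buddy is in use (or split), so stop merging
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)  # merged block starts at the lower one
            order += 1
        self.free_lists[order].add(start)


if __name__ == "__main__":
    zone = BuddyAllocator(1024)
    first = zone.alloc(8)   # 256 page frames
    second = zone.alloc(8)  # another 256 page frames
    zone.free(first, 8)
    zone.free(second, 8)    # merges back up to one 1024-frame block
    print(first, second, zone.free_lists[MAX_ORDER])
```

A request for order 8 corresponds to the 256-frame example above: the 1024-frame block is split into 512 + 256 + 256, and freeing both 256-frame blocks lets the merges run all the way back up.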
The slab allocation mechanism is an improvement on the buddy algorithm. The design of the slab allocator is based on object caching. The basic idea is to avoid repeatedly initializing and cleaning up objects. Slab is mainly used for memory objects that are frequently allocated and released, where it replaces malloc/free.
The improvements are as follows:
It does not need to initialize or reclaim the memory area on every use. For efficiency, Linux does not call the object's constructor or destructor, but sets the pointers to both functions to null. The main purpose of introducing slab in Linux is to reduce the number of calls to the buddy algorithm.
In fact, the kernel often uses the same kind of memory area repeatedly. For example, whenever the kernel creates a new process, it has to allocate memory areas for the data structures related to that process (task_struct, open file objects, and so on). When the process ends, these memory areas are reclaimed. Because processes are created and destroyed frequently, earlier Linux versions spent a lot of time allocating and reclaiming these memory areas over and over. Starting with Linux 2.2, frequently used pages are kept in a cache and reused. Memory areas can be classified by how often they are used. For memory areas that are expected to be used frequently, a set of dedicated caches of exactly the required size can be created, which avoids internal fragmentation. For memory areas that are rarely used, a set of general-purpose caches (such as the power-of-2 sizes used in Linux 2.0) can be used instead. Even if this approach produces fragments, it has little impact on the performance of the whole system.
The hardware cache provides another reason to minimize the number of calls to the buddy algorithm: each call to the buddy algorithm "dirties" the hardware cache, which increases the average number of accesses to main memory.
The slab allocator groups objects into caches (the "buffer zones" mentioned below).
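The difference between dedicated and general-purpose caches can be sketched as follows. This is only an illustration in Python, not kernel code: the names GENERAL_SIZES, general_size, and ObjectCache are hypothetical, and the size range of 32 bytes to 128 KB is an assumption chosen for the example. The dedicated cache hands out fixed-size slots and reuses freed ones, while the general caches round every request up to a power of 2 and so waste the difference.

```python
# Hypothetical general-purpose size classes: 32 B, 64 B, ..., 128 KB.
GENERAL_SIZES = [32 << i for i in range(13)]


def general_size(nbytes):
    """Smallest general-purpose (power-of-2) size that fits nbytes."""
    for size in GENERAL_SIZES:
        if size >= nbytes:
            return size
    raise ValueError("request too large for the general caches")


def general_waste(nbytes):
    """Bytes lost to internal fragmentation when a general cache is used."""
    return general_size(nbytes) - nbytes


class ObjectCache:
    """Toy dedicated cache: fixed-size slots, freed slots are reused."""

    def __init__(self, obj_size):
        self.obj_size = obj_size
        self.free_slots = []          # slots released earlier
        self.next_slot = 0            # next slot that was never used
        self.fresh_allocations = 0    # how often new memory was needed

    def alloc(self):
        if self.free_slots:
            return self.free_slots.pop()  # reuse, no new memory needed
        self.fresh_allocations += 1
        slot = self.next_slot
        self.next_slot += 1
        return slot

    def free(self, slot):
        self.free_slots.append(slot)


if __name__ == "__main__":
    print(general_size(1000), general_waste(1000))
    cache = ObjectCache(1000)
    for _ in range(5):                # create and destroy an object 5 times
        cache.free(cache.alloc())
    print(cache.fresh_allocations)
```

In this sketch a 1000-byte object placed in a general cache occupies a 1024-byte slot, while the dedicated cache only has to obtain new memory once, no matter how many times the object is created and destroyed.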
For small objects, the slab descriptor structure slab_t is placed inside the slab itself. For large objects, the slab descriptor is taken out of the slab and stored centrally. The coloring area of a slab works as follows: each slab has a small area at its head, called the "coloring area". The size of the coloring area is chosen so that the starting address of every object in the slab is aligned to a cache line of the hardware cache (the first-level cache line of the 80386 is 16 bytes, and that of the Pentium is 32 bytes). Because a slab consists of one or more pages (up to 32), each slab starts on a page boundary and is therefore naturally aligned to a cache line. The object size in a slab, however, is not known in advance. The purpose of the coloring area is to push the starting address of the first object in the slab back to a position that is aligned to a cache line. Because a cache (buffer zone) contains multiple slabs, the coloring areas of the slabs in the same cache should be given different sizes as far as possible, so that objects at the same relative position in different slabs are staggered in the hardware cache, which improves cache access efficiency. There is also a small wasted area after the last object of each slab. It compensates for the size of the coloring area, and its size depends on the size of the coloring area and on the size of the slab relative to its objects. The sum of this wasted area and the coloring area, however, is a constant for all slabs holding the same kind of object. The size of each object is basically the size of the data structure it holds. Only when that size is not a multiple of the cache line size are a few bytes added to pad it. As a result, the starting addresses of all objects in a slab are aligned to cache lines.
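The arithmetic behind the coloring area can be sketched like this. It is a simplified model in Python, not the kernel's implementation: the function names align_up, slab_layout, and slab_offsets are hypothetical, the slab descriptor is assumed to be stored outside the slab, and the default cache line size of 32 bytes is taken from the Pentium figure above.

```python
def align_up(nbytes, line):
    """Round nbytes up to the next multiple of the cache line size."""
    return (nbytes + line - 1) // line * line


def slab_layout(slab_bytes, raw_obj_size, line=32):
    """Return (object size, objects per slab, leftover bytes, color count)."""
    obj_size = align_up(raw_obj_size, line)   # pad object to a cache line
    num_objs = slab_bytes // obj_size
    leftover = slab_bytes - num_objs * obj_size
    colours = leftover // line + 1            # distinct coloring offsets
    return obj_size, num_objs, leftover, colours


def slab_offsets(slab_index, slab_bytes, raw_obj_size, line=32):
    """Return (coloring area, trailing waste, object start offsets)."""
    obj_size, num_objs, leftover, colours = slab_layout(
        slab_bytes, raw_obj_size, line)
    colour_off = (slab_index % colours) * line  # differs from slab to slab
    tail_waste = leftover - colour_off          # compensates the coloring
    starts = [colour_off + i * obj_size for i in range(num_objs)]
    return colour_off, tail_waste, starts


if __name__ == "__main__":
    # One 4 KB page per slab, 200-byte objects (both hypothetical values).
    print(slab_layout(4096, 200))
    for index in range(4):
        colour_off, tail_waste, starts = slab_offsets(index, 4096, 200)
        print(index, colour_off, tail_waste, starts[0])
```

In this model the coloring area and the trailing waste always add up to the same leftover value, and successive slabs cycle through the available offsets so that objects at the same relative position land on different cache lines.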
References:
In-depth analysis of the Linux kernel source: http://oss.org.cn/kernel-book/
Memcached source code analysis
How to Reduce memory fragments when allocating memory