Memory management under glibc


We also ran into a similar problem in our decoding process: the first decode, of roughly 60 seconds of audio, consumed about 3 GB of memory, and after a reset that memory was not returned to the operating system; the problem persisted on the second decode, even for a clip of only about 10 seconds of audio.

I mentioned a few weeks ago that the project team had assigned me to research the memory management mechanism under Linux, because our product had run into some "strange" problems related to it. These issues and the surrounding facts can be summarized as follows:

    • First, some background. Since ours is 3D software, users frequently need to import/export various kinds of geometry. A file storing this data may contain more than one geometry, and each geometry may contain a variety of basic elements, such as thousands of patches/polygons. Each element is small in itself, but they are very large in number.
    • When a geometry is imported for the first time, it occupies a lot of memory (for example, 1.5 GB). No kind of "clean" operation releases that memory short of shutting down the software. If the same geometry is then imported again, memory usage does not rise noticeably; but if the import is repeated yet again, a large amount of memory (about 1 GB or more) is occupied once more.
    • A variant of the above experiment: first import geometry 1, then clean up the scene, then import geometry 2. The memory consumed by geometry 2 in this case is smaller than when geometry 2 is imported on its own for the first time.
    • Valgrind is a tool suite frequently used under Linux to check all kinds of memory management issues. We ran a dedicated memory-leak test with Valgrind's Memcheck component and did not find a clear leak.
    • Our product also has a version on the Mac platform. Repeating the experiment on Mac OS X, we found that the same code behaves differently there: each time the scene is cleaned, a considerable amount of memory (about 600-800 MB) is returned to the operating system (OS), though not exactly down to the level before the geometry was imported.
    • We can determine that malloc on Linux is implemented by glibc's ptmalloc, whereas the Mac does not use glibc and has its own implementation (whose details we have not traced).
    • Our product does "wrap" the allocations for these needs, but the wrapper mainly exists to check whether memory is exhausted so that a timely warning can be given; in the end it goes through the standard glibc allocator (__libc_malloc(size_t)). A minimal sketch of such a wrapper follows this list.
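
      The sketch below is illustrative only and assumes nothing about the product's real code beyond what is stated above: it checks for exhaustion and otherwise defers entirely to the standard allocator. The name checked_malloc is made up for this example.

        #include <cstddef>
        #include <cstdio>
        #include <cstdlib>

        // Hypothetical wrapper: warn when an allocation fails (memory exhausted),
        // but otherwise let the standard glibc allocator do all the work.
        void* checked_malloc(std::size_t bytes) {
            void* p = std::malloc(bytes);   // ultimately served by glibc's __libc_malloc
            if (p == nullptr)
                std::fprintf(stderr, "warning: allocation of %zu bytes failed\n", bytes);
            return p;
        }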

The descriptions above are objective facts. My main means of exploration was to search the Internet (Google/Baidu) starting from these facts, and a few days of digging proved quite fruitful. The following summarizes some of what I found.

  • A similar case:
    • A "memory leak" caused by the glibc memory allocation mechanism

      The database-like system we are developing has a memory module with a suspected "memory leak". The symptom is as follows: memory used by the module is not returned to the operating system. For example, when the module occupies 10 GB and then releases that memory, the usage seen through the top command or /proc/pid/status is sometimes still 10 GB, sometimes 5 GB, sometimes 3 GB, and so on; the memory-release behavior is indeterminate.

    • A large number of related questions can be found on StackOverflow; you can search for yourself. For example: Linux Allocator Does Not Release Small Chunks of Memory.
  • malloc()/free(), mmap(), brk(), and the relationship: user program --> glibc --> Linux kernel
    • malloc()/free() are the two very basic C functions responsible for allocating/deallocating memory. However, the C standard (ANSI C) does not specify how they must be implemented, so on each platform (Windows, Mac, Linux, etc.) the underlying memory operations performed when these two functions are called are not the same.
    • Under Linux, the implementation of malloc()/free() belongs to the glibc library. This fairly low-level library communicates with the underlying system (issuing system calls) according to certain policies. Because of glibc's role, the user program does not interact with the Linux kernel directly; memory is managed through glibc, so where memory management is concerned glibc can be regarded as providing a default memory manager. The relationship looks like this: user program ----> glibc ----> Linux kernel.
    • glibc uses ptmalloc as its memory manager implementation. I read quite a few tutorials on how ptmalloc manages memory; the one I found clearest is worth reading for anyone who wants the full story. Below is only a rough summary of my own, not suitable as learning material (the tutorial covers it all).

      • Chunks allocated via brk live on a linear heap that can only be released from the top down. A chunk freed in the middle cannot be returned to the OS; instead it is linked into the bins/fast bins containers.
      • Memory allocated via mmap is, in effect, a chunk mapped directly from the system; when this memory is freed, it can be returned directly to the OS.
      • Whether a given memory request is served by brk or by mmap is determined by glibc's policy mechanism.
      • There is a threshold that regulates this policy. By default, requests smaller than 128 KB are allocated via brk, and requests greater than or equal to 128 KB are allocated via mmap.
      • Modern glibc implementations (I have not checked exactly since which version) support dynamic adjustment of this threshold. By default, on a 64-bit system, the threshold can be raised dynamically from 128 KB up to 32 MB. The adjustment strategy can roughly be summarized as: when a freed block larger than the current threshold (say 256 KB) is observed, the threshold is raised to that size, and so on up to 32 MB.
      • This threshold can also be controlled explicitly. See the link below for details.
      • I wrote a small program to verify the points above and found them to be true. Roughly, the test loads chunks of a chosen size into a container (std::deque); according to the instruction given, it either appends a chunk at the tail, pops a chunk from the tail, or pops a chunk from the head, while memory usage is observed. I found that for small chunks, popping elements from the tail lets the memory be released, while popping chunks from the head does not release it; if the chunks are large enough, the memory is released whether they are popped from the tail or from the head. A sketch of such a test follows this list.
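
      A minimal sketch of such an experiment (my own reconstruction, not the original program; the chunk size and count are arbitrary). The blocks are written to so that they actually become resident; watch RES in top or VmRSS in /proc/<pid>/status at each pause.

        #include <cstddef>
        #include <cstdio>
        #include <cstdlib>
        #include <cstring>
        #include <deque>

        int main() {
            const std::size_t kChunkSize = 4 * 1024;   // "small": served from the brk heap;
                                                       // try 256 * 1024 for the mmap case
            const int kCount = 100000;                 // roughly 400 MB in total

            std::deque<char*> chunks;
            for (int i = 0; i < kCount; ++i) {
                char* p = static_cast<char*>(std::malloc(kChunkSize));
                std::memset(p, 1, kChunkSize);         // fault the pages in
                chunks.push_back(p);
            }
            std::puts("all chunks allocated - check memory, then press Enter");
            std::getchar();

            // Pop from the head: these are the earliest (lowest) chunks on the brk
            // heap, so for small chunks the freed memory is NOT returned to the OS.
            for (int i = 0; i < kCount / 2; ++i) {
                std::free(chunks.front());
                chunks.pop_front();
            }
            std::puts("front half freed - check memory, then press Enter");
            std::getchar();

            // Pop from the tail: this frees the top of the heap, so glibc can trim
            // it and the memory IS returned to the OS.
            while (!chunks.empty()) {
                std::free(chunks.back());
                chunks.pop_back();
            }
            std::puts("all chunks freed - check memory, then press Enter");
            std::getchar();
            return 0;
        }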

      glibc uses these two mechanisms to manage the user program's memory, and this is a deliberate design. After all, talking to the underlying system is expensive; if every operation on a small chunk of memory went straight to a system call, the program would be making system calls constantly and its efficiency would obviously suffer. Keeping small chunks on a heap maintained via brk is essentially a cache: they can be accumulated and returned to the system together. To be fair, this design is smart.

      However, it is not smart enough to handle every case well. First, because the implementation is relatively simple, it only maintains a pointer to the top of the heap, so memory can only be returned to the system from the top downwards. Imagine the situation where a block at the top of the heap is still in use while all the memory below it has been freed. Can the memory below be returned to the system? Unfortunately, with this design the answer is no. This is the "hole" problem, illustrated by the small sketch below.
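
      A tiny illustration of the pinning effect (my own sketch; sizes and counts are arbitrary, and it assumes the last allocation ends up nearest the top of the brk heap, which is typically the case for a fresh single-threaded process):

        #include <cstddef>
        #include <cstdio>
        #include <cstdlib>
        #include <cstring>
        #include <vector>

        int main() {
            std::vector<char*> blocks;
            for (int i = 0; i < 100000; ++i) {             // ~400 MB of 4 KB blocks
                char* p = static_cast<char*>(std::malloc(4 * 1024));
                std::memset(p, 1, 4 * 1024);               // fault the pages in
                blocks.push_back(p);
            }

            char* pinned = blocks.back();                  // topmost block stays in use
            for (std::size_t i = 0; i + 1 < blocks.size(); ++i)
                std::free(blocks[i]);                      // everything below it is freed
            std::puts("all but the top block freed - RSS stays high; press Enter");
            std::getchar();

            std::free(pinned);                             // the heap top becomes free,
            std::puts("top block freed - the heap can now be trimmed; press Enter");
            std::getchar();                                // RSS should drop here
            return 0;
        }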

      In addition, this design is not friendly to user programs that, because of their business requirements, frequently request and release small chunks of memory. 3D software like ours is a typical case: a huge geometry is actually made up of thousands of small patches, each of which is tiny but which together are enormous in number. So our software faces the strange problem of "the memory has been freed, but is not returned to the system". The best strategy for dealing with this would have been to design a dedicated memory pool suited to our software early on: request one contiguous chunk of memory, "cut" it up manually for the many small facets, and return it to the system in batches as the situation requires. In short, a memory management scheme of your own design is always the most flexible, built according to the needs of the project.
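
      A minimal sketch of the pool idea (the class name, sizes, and the bump-pointer strategy here are illustrative assumptions, not our product's actual design). One large allocation well above the mmap threshold is carved up by hand, so destroying the pool hands the whole slab straight back to the OS:

        #include <cstddef>
        #include <cstdlib>

        // Trivial bump-pointer pool: alignment and growth are ignored for brevity.
        class PatchPool {
        public:
            explicit PatchPool(std::size_t bytes)
                : slab_(static_cast<char*>(std::malloc(bytes))), size_(bytes), used_(0) {}
            ~PatchPool() { std::free(slab_); }          // one free() returns everything

            void* allocate(std::size_t n) {
                if (slab_ == nullptr || used_ + n > size_)
                    return nullptr;                     // pool exhausted
                void* p = slab_ + used_;
                used_ += n;
                return p;
            }

        private:
            char*       slab_;
            std::size_t size_;
            std::size_t used_;
        };

        // Usage: give each imported geometry its own pool, e.g.
        //   PatchPool pool(64 * 1024 * 1024);           // 64 MB slab, served via mmap
        //   void* patch = pool.allocate(128);           // one small facet
        // and let the pool's destruction return the whole slab at once.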

      In other words, although glibc has settled on this somewhat "stubborn" memory management scheme, it also provides some ways to adjust the related thresholds. We cannot interfere with how it manages memory, but at least through these methods we can decide questions such as "how big counts as big, and how small counts as small" and "how much to accumulate before returning it".

  • mallopt() and malloc_trim(0)
    • mallopt() is a function specifically for adjusting the relevant thresholds. I won't go into the details; the man page is quite clear, and the relevant excerpt is reproduced below, followed by a short usage sketch. If you want to know more, please click here.

      #include <malloc.h>

      int mallopt(int param, int value);

      M_MMAP_THRESHOLD

      For allocations greater than or equal to the limit specified (in bytes) by M_MMAP_THRESHOLD that can't be satisfied from the free list, the memory-allocation functions employ mmap(2) instead of increasing the program break using sbrk(2).

      Allocating memory using mmap(2) has the significant advantage that the allocated memory blocks can always be independently released back to the system. (By contrast, the heap can be trimmed only if memory is freed at the top end.) On the other hand, there are some disadvantages to the use of mmap(2): deallocated space is not placed on the free list for reuse by later allocations; memory may be wasted because mmap(2) allocations must be page-aligned; and the kernel must perform the expensive task of zeroing out memory allocated via mmap(2). Balancing these factors leads to a default setting of 128*1024 for the M_MMAP_THRESHOLD parameter.

      The lower limit for this parameter is 0. The upper limit is DEFAULT_MMAP_THRESHOLD_MAX: 512*1024 on 32-bit systems or 4*1024*1024*sizeof(long) on 64-bit systems.

      Note: Nowadays, glibc uses a dynamic mmap threshold by default. The initial value of the threshold is 128*1024, but when blocks larger than the current threshold and less than or equal to DEFAULT_MMAP_THRESHOLD_MAX are freed, the threshold is adjusted upward to the size of the freed block. When dynamic mmap thresholding is in effect, the threshold for trimming the heap is also dynamically adjusted to be twice the dynamic mmap threshold. Dynamic adjustment of the mmap threshold is disabled if any of the M_TRIM_THRESHOLD, M_TOP_PAD, M_MMAP_THRESHOLD, or M_MMAP_MAX parameters is set.
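
      A small usage sketch (the 64 KB value is arbitrary). Note that, per the excerpt above, setting M_MMAP_THRESHOLD explicitly also switches off the dynamic threshold adjustment:

        #include <malloc.h>   // glibc-specific: mallopt(), M_MMAP_THRESHOLD
        #include <cstdlib>

        int main() {
            // Make every allocation of 64 KB or more go through mmap(2), so that it
            // is handed back to the OS as soon as it is freed.
            if (mallopt(M_MMAP_THRESHOLD, 64 * 1024) != 1)
                return 1;                          // mallopt() returns 1 on success, 0 on error

            void* p = std::malloc(256 * 1024);     // now served by mmap
            std::free(p);                          // returned directly to the OS
            return 0;
        }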

    • malloc_trim() is a very interesting function. "Interesting" because I still don't quite understand how it works, and this is also where I would like to ask for your help (if you have insight, please advise). According to the man page, it tells glibc how much free space to leave at the top of the brk-maintained heap, and the rest of the free space at the top is returned to the system. The manual states clearly that it cannot return memory anywhere other than the top of the heap. The following is the official description from the man page:

      The malloc_trim() function attempts to release free memory at the top of the heap (by calling sbrk(2) with a suitable argument).

      The pad argument specifies the amount of free space to leave untrimmed at the top of the heap. If this argument is 0, only the minimum amount of memory is maintained at the top of the heap (i.e., one page or less). A nonzero argument can be used to maintain some trailing space at the top of the heap in order to allow future allocations to be made without having to extend the heap with sbrk(2).

      According to this description, malloc_trim(0) should simply return all the free memory at the top of the heap to the system, and it should not be able to return the free memory lower down in the heap (those "holes"). However, my own small-program experiments overturned this assertion. When I called malloc_trim(0), I found that all the free memory in the heap was returned to the system, including the holes. These addresses are still maintained in the free lists (bins/fast bins), and when a small block is requested again, the hole at the front is first "taken back" from the system and then handed to the caller. This makes malloc_trim(0) look very advanced, and I certainly welcome such excellent behavior, but because it contradicts the description in the official manual, it leaves my mental model thoroughly confused...

      My experimental platform was Linux RH5. The code is also posted (poorly written); since it would look bloated here, I have shared it here. Note that that version replaced the doubly-linked container with static arrays, purely for experimentation, and the effect is the same. A reconstruction of the idea is sketched below.
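
      The sketch below is my own reconstruction of the idea, not the code referred to above. As for the discrepancy with the man page, one plausible explanation, which I have not verified, is that newer glibc releases implement malloc_trim() by walking the free lists and releasing whole free pages inside the heap with madvise(MADV_DONTNEED), rather than only shrinking the program break, which would match the observed behavior:

        #include <cstddef>
        #include <cstdio>
        #include <cstdlib>
        #include <cstring>
        #include <malloc.h>   // glibc-specific: malloc_trim()

        static char* g_blocks[100000];                     // static array, as in the test

        int main() {
            const std::size_t kChunkSize = 4 * 1024;       // small, brk-backed blocks
            for (int i = 0; i < 100000; ++i) {
                g_blocks[i] = static_cast<char*>(std::malloc(kChunkSize));
                std::memset(g_blocks[i], 1, kChunkSize);   // fault the pages in
            }

            // Free everything except the topmost block: plain free() cannot trim
            // the heap below a live block, so these freed chunks become "holes".
            for (int i = 0; i + 1 < 100000; ++i)
                std::free(g_blocks[i]);
            std::puts("holes created - RSS stays high; press Enter to call malloc_trim(0)");
            std::getchar();

            malloc_trim(0);                                // observed: RSS drops sharply
            std::puts("malloc_trim(0) done - check RSS again; press Enter to exit");
            std::getchar();
            return 0;
        }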

    • Some lessons drawn from all this
      • When writing geometry-related functions later, use std::vector operations as carefully as possible. Try to reserve a batch of memory up front and use that, and avoid repeatedly push_back-ing single elements into a container that is already at capacity; the repeated reallocations are very prone to producing fragmentation.
      • In addition, even if a std::vector is itself allocated on the stack (meaning the object is reclaimed with the stack), note that the buffer it maintains is allocated on the heap. In other words, the memory touched by such a temporary object can still produce fragmentation, and if such a function is called frequently there will be a lot of it.
      • And even if we do a shrink_to_fit job (std::vector<T*>(v).swap(v)), any fragmentation that has already formed will still sit in the free lists of the brk-maintained heap and will not be released. A small illustration follows.
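
      A small sketch contrasting the two growth patterns and showing the classic pre-C++11 shrink idiom (the element type and count are illustrative):

        #include <cstddef>
        #include <vector>

        struct Patch { float data[16]; };   // stand-in for a small geometry element

        int main() {
            const std::size_t n = 100000;

            // Growing by single push_back: each time capacity is exceeded the buffer
            // is reallocated, and every discarded buffer becomes a free chunk that
            // may linger in the brk heap.
            std::vector<Patch> grown;
            for (std::size_t i = 0; i < n; ++i)
                grown.push_back(Patch{});

            // Reserving up front performs one allocation of the right size and
            // leaves far fewer stale buffers behind.
            std::vector<Patch> reserved;
            reserved.reserve(n);
            for (std::size_t i = 0; i < n; ++i)
                reserved.push_back(Patch{});

            // The copy-and-swap shrink idiom drops grown's excess capacity, but the
            // discarded buffer only goes back to glibc's free lists, not necessarily
            // back to the OS.
            std::vector<Patch>(grown).swap(grown);
            return 0;
        }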
