Linux heap allocation

Last Update:2015-05-25 Source: Internet

Author: User

User-level heap allocation: the malloc function allocates heap memory

void *malloc(size_t size);
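As a minimal sketch of how the prototype above is typically used, the helper below (copy_string is a hypothetical name, not something from the original article) requests memory with malloc, checks for a NULL return, and leaves it to the caller to release the block with free.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration: returns a heap-allocated copy of
 * `src`, or NULL if malloc fails. The caller must free() the result. */
char *copy_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* include the terminating '\0' */
    char *dst = malloc(len);        /* malloc returns void *, NULL on failure */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len);
    return dst;
}
```

A caller would write something like char *s = copy_string("heap"); and later free(s);.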

Virtual memory address layout of a process:

For users, the primary focus is user space. Looked at in detail, user space is mainly divided into the following sections:

Code: The lowest-address part of user space, holding the instructions (the executable machine code the program was compiled into)
Data: Initialized global variables are stored here
BSS: Uninitialized global variables are stored here
Heap: The focus of this article. The heap grows from low addresses toward high addresses, and the brk-related system calls discussed below allocate memory from here
Mapping area: The region associated with the mmap system call. Most real malloc implementations allocate large chunks of memory through mmap, which this article touches on only briefly. This region grows from high addresses toward low addresses
Stack: The stack area, which grows from high addresses toward low addresses
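One rough way to look at the regions listed above is to sample one address from each. The sketch below does that; struct layout and sample_layout are hypothetical names chosen for illustration, and the numeric ordering of the addresses depends on the platform, the linker and address space randomization, so no particular ordering is assumed.

```c
#include <stdint.h>
#include <stdlib.h>

int initialized_global = 42;   /* stored in the data section */
int uninitialized_global;      /* stored in BSS */

/* One sample address per region (hypothetical struct for illustration). */
struct layout {
    uintptr_t code, data, bss, heap, stack;
};

/* Fills `out` with one address from each region.
 * Returns 0 on success, -1 if the heap allocation fails. */
int sample_layout(struct layout *out)
{
    int local = 0;             /* lives on the stack */
    void *p = malloc(16);      /* a small block, normally taken from the heap */
    if (p == NULL)
        return -1;
    out->code  = (uintptr_t)&sample_layout;
    out->data  = (uintptr_t)&initialized_global;
    out->bss   = (uintptr_t)&uninitialized_global;
    out->heap  = (uintptr_t)p;
    out->stack = (uintptr_t)&local;
    free(p);
    return 0;
}
```

Printing the five fields in hexadecimal on a typical Linux system is a quick way to compare them with the layout described above.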

Heap memory grows from low addresses to high addresses. The malloc function is primarily concerned with allocating linear (virtual) addresses.

Also note that because Linux maps memory in pages, if the break is not aligned to the page size, the system still maps a whole page at the end, so the memory actually mapped extends past the address the break points to. Using addresses after the break is nevertheless dangerous, even though a small amount of usable memory may exist there.
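The page-alignment point can be made concrete with a little arithmetic. This is a sketch only: round_up_to_page and slack_after_break are hypothetical helper names, and the code assumes a Linux system where sbrk(0) reports the current break and the page size is a power of two.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose sbrk() when compiling in strict ISO C mode */
#endif
#include <stdint.h>
#include <unistd.h>

/* Rounds `addr` up to the next multiple of `page` (a power of two). */
uintptr_t round_up_to_page(uintptr_t addr, uintptr_t page)
{
    return (addr + page - 1) & ~(page - 1);
}

/* Bytes between the current break and the end of the page it falls in.
 * Returns -1 if the page size or the break cannot be read. */
long slack_after_break(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *brk_now = sbrk(0);          /* sbrk(0) only reads the break */
    if (page <= 0 || brk_now == (void *)-1)
        return -1;
    uintptr_t b = (uintptr_t)brk_now;
    return (long)(round_up_to_page(b, (uintptr_t)page) - b);
}
```

With 4096-byte pages, a break at address 4097 would sit in a page that ends at 8192, so up to 4095 bytes after the break could be mapped even though the program has no right to rely on them.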

Virtual addresses can be used by the process only after they are mapped, page by page, to physical memory. Because physical memory is limited, not all of the heap's virtual address space is necessarily mapped to physical memory. Linux maintains a break pointer that points to an address in the heap's linear address space. The range from the heap start address up to the break is mapped and can be accessed by the process; the range from the break to the top of the heap region is unmapped, and accessing it causes the classic segmentation fault.
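The role of the break pointer can be sketched with sbrk directly. The function below (grow_and_shrink_break is a hypothetical name) raises the break, writes to the range that has just become accessible, and then lowers the break again. It assumes a single-threaded Linux process; moving the break by hand in a program that also uses malloc is generally discouraged and is done here only for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose sbrk() when compiling in strict ISO C mode */
#endif
#include <string.h>
#include <unistd.h>

/* Raises the break by `bytes`, writes to the new range, then restores the
 * break. Returns how far the break moved up, or -1 on failure. */
long grow_and_shrink_break(long bytes)
{
    if (bytes <= 0)
        return -1;
    char *old_brk = sbrk(bytes);            /* returns the previous break */
    if (old_brk == (char *)-1)
        return -1;
    char *new_brk = sbrk(0);                /* read the new break */
    memset(old_brk, 0xAB, (size_t)bytes);   /* [old_brk, new_brk) is accessible now */
    long moved = (long)(new_brk - old_brk);
    if (sbrk(-bytes) == (void *)-1)         /* hand the range back */
        return -1;
    return moved;
}
```

Writing to the same addresses before the first sbrk call, or after the last one, would touch memory beyond the break and could end in the segmentation fault described above.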

From the operating system's perspective, a process allocates memory in two ways, through two system calls: brk() and mmap() (leaving shared memory aside).

1, brk pushes _edata, the pointer to the highest address of the data segment (.data), toward higher addresses;

2, mmap finds a free piece of virtual memory in the process's virtual address space (in the area between the heap and the stack, called the file mapping area).

Both methods allocate virtual memory only, not physical memory. (This is a simplification, since the system calls run kernel functions that themselves allocate memory.) A page fault occurs when the allocated virtual address space is first accessed; the operating system then allocates physical memory and establishes the mapping between virtual and physical memory.
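The lazy mapping described above can be observed, roughly, by counting page faults around the first access to a fresh mapping. This is a sketch under assumptions: faults_on_first_touch is a hypothetical helper, it relies on the Linux getrusage counter ru_minflt, and the count it reports varies with the kernel, the page size and the environment.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS in strict ISO C mode */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Maps `len` bytes anonymously, writes one byte per page, and returns the
 * number of minor page faults recorded while writing. Returns -1 on error. */
long faults_on_first_touch(size_t len)
{
    struct rusage before, after;
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;                      /* only virtual memory was requested */
    if (getrusage(RUSAGE_SELF, &before) != 0) {
        munmap((void *)p, len);
        return -1;
    }
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = 1;                     /* first write to a page faults it in */
    int rc = getrusage(RUSAGE_SELF, &after);
    munmap((void *)p, len);             /* releases virtual and physical memory */
    if (rc != 0)
        return -1;
    return after.ru_minflt - before.ru_minflt;
}
```

With 4 KB pages a 1 MB mapping spans 1M/4K = 256 pages, so a result in that neighbourhood would be consistent with the description, although the exact figure is not guaranteed.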

The difference between the two ways of allocating memory:

1, For large memory requests, glibc uses the mmap system call directly to set aside a separate virtual address range for the process; when that block is freed, the memory (both virtual and physical) is released with the munmap system call, and no memory fragmentation is left behind in the process.

2, For small memory requests: after the program starts, the process gets a heap at low addresses. When the process requests memory, glibc raises the heap top as needed to expand the memory space, which is what is meant by the heap growing upward. Operating on these small blocks leads to memory fragmentation. In fact, the brk and sbrk calls simply adjust the heap top pointer (the break pointer).

(Note that the memory fragmentation described here still concerns physical memory.)
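To see which of the two paths a given request took, one rough heuristic is to compare the returned pointer with the current break. The sketch below assumes glibc on Linux with default settings; allocated_below_break is a hypothetical helper, and other allocators or address layouts may behave differently.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose sbrk() when compiling in strict ISO C mode */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if a malloc(size) block lies below the current break (the brk
 * heap), 0 if it lies above it (for example in an mmap region), and -1 if
 * the allocation or the break query fails. */
int allocated_below_break(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return -1;
    void *brk_now = sbrk(0);
    if (brk_now == (void *)-1) {
        free(p);
        return -1;
    }
    int below = (uintptr_t)p < (uintptr_t)brk_now;
    free(p);
    return below;
}
```

Under the assumptions above, a request of a few dozen bytes would be expected to report 1, while a 1 MB request would be expected to report 0 because it exceeds the default mmap threshold.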

When is heap memory allocated with brk released?

When glibc finds that more than 128K of contiguous free space has accumulated at the top of the heap, it calls brk or sbrk to lower the heap top and return the memory to the system. At that point the kernel frees the physical memory by removing the corresponding linear region.

Now consider the memory hole problem:

Scenario: a block at the top of the heap is still in use, and a large amount of contiguous memory below it has been freed. Can the freed memory be released? Can the corresponding physical memory be released?

Unfortunately, no.

In other words, as long as the block at the top of the heap is still in use, no matter how much memory below it is freed, none of it is returned to the system, and it continues to occupy physical memory. Why?

Root cause: the kernel's handling of the heap is very simple. It can only change the linear region occupied by the program by moving the heap top pointer, and memory is freed only by shrinking that linear region. So as long as the heap top is not lowered, the memory is not released.

Suppose B lies between A and D and has been released with free(B). At this point the linear and physical memory of B is not released; it is only marked as free, and the break pointer (_edata) does not move back. In most malloc implementations, memory released by free is not returned directly to the operating system (that is, physical memory is not freed) but is placed on a free list. The virtual and physical memory behind B cannot be released because there is only one _edata pointer: if it were pushed back past B, what would happen to D?
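The A, B, D scenario can be sketched by recording the break at each step. This assumes glibc on Linux with default settings and a 40K block size, which is below the default mmap threshold; struct breaks and hole_demo are hypothetical names, and whether the break drops at the end depends on how much free space has accumulated at the heap top.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose sbrk() when compiling in strict ISO C mode */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Break values observed at each step (hypothetical struct for illustration). */
struct breaks {
    uintptr_t start, after_free_b, after_free_d;
};

/* Allocates A, B and D, frees B and then D, and records the break after
 * each step. Returns 0 on success, -1 if an allocation fails. */
int hole_demo(struct breaks *out)
{
    const size_t size = 40 * 1024;
    char *a = malloc(size);
    char *b = malloc(size);
    char *d = malloc(size);
    if (a == NULL || b == NULL || d == NULL) {
        free(a); free(b); free(d);
        return -1;
    }
    out->start = (uintptr_t)sbrk(0);
    free(b);                                  /* hole between A and D */
    out->after_free_b = (uintptr_t)sbrk(0);   /* D still pins the heap top */
    free(d);                                  /* B and D now border the heap top */
    out->after_free_d = (uintptr_t)sbrk(0);   /* may drop if enough space is free */
    free(a);
    return 0;
}
```

In this sketch after_free_b is expected to equal start, since freeing B alone cannot move the break while D is still in use.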

Of course, B can be reused: if a 40K request arrives (the same size as B), malloc will probably hand B's memory back.

So when the next allocation request arrives, malloc first checks the free list for a suitable block that is free; if one exists, it returns that linear address range. If not, it allocates starting from the break pointer.
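The reuse of B can be checked with a small experiment. The sketch below (freed_block_is_reused is a hypothetical name) frees the middle block and then requests the same size again; whether the same address comes back is allocator-dependent behaviour, so the function reports what it observed rather than assuming an outcome.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if a new request of the same size gets B's old address,
 * 0 if it gets a different address, -1 if an allocation fails. */
int freed_block_is_reused(void)
{
    const size_t size = 40 * 1024;
    char *a = malloc(size);
    char *b = malloc(size);
    char *d = malloc(size);
    if (a == NULL || b == NULL || d == NULL) {
        free(a); free(b); free(d);
        return -1;
    }
    uintptr_t b_addr = (uintptr_t)b;   /* remember B's address before freeing */
    free(b);                           /* B goes onto the allocator's free list */
    char *c = malloc(size);            /* same size as B */
    if (c == NULL) {
        free(a); free(d);
        return -1;
    }
    int reused = ((uintptr_t)c == b_addr);
    free(a); free(c); free(d);
    return reused;
}
```

On glibc a result of 1 would match the behaviour described above, but the article's own wording ("probably") is the right level of confidence.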

In summary: the virtual linear address space can also become fragmented (here, fragments arise because freed virtual memory is not released back to the system and cannot always be used by later allocations). Linear address space and physical memory are released together.

Memory fragmentation and memory holes both refer to this situation.

Question: since heap memory obtained with brk and sbrk cannot be released directly, why not allocate everything with mmap and release it directly with munmap?
Fragments in the heap cannot be released directly, which can look like a "memory leak". So why does malloc not use mmap for everything (memory allocated with mmap can be truly released with munmap), instead of using mmap only for large blocks of more than 128K?

In fact, sbrk, mmap and munmap, the interfaces through which a process requests and releases address space from the OS, are all system calls, and frequent system calls are relatively expensive. In addition, after memory obtained with mmap has been released with munmap, requesting it again causes more page faults. For example, allocating 1M with mmap causes a large number of page faults on first access (1M/4K = 256), and if 1M is allocated again after munmap, the same large number of page faults occurs again. Page faults are handled by the kernel, so they consume CPU time in kernel mode. Furthermore, using mmap for small allocations fragments the address space more and increases the kernel's management burden. The heap, by contrast, is a contiguous space, and although heap fragments are not returned to the OS, if they can be reused, accessing that memory again is likely to require no system calls and no page faults, which significantly reduces CPU consumption. For this reason, the glibc malloc implementation weighs the differences and trade-offs between sbrk and mmap: by default it uses mmap to obtain address space only for large blocks (128K and above), and this threshold can be changed with mallopt(M_MMAP_THRESHOLD, <size>).
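The threshold change mentioned above can be sketched as follows. This assumes glibc, whose malloc.h declares mallopt and M_MMAP_THRESHOLD; set_mmap_threshold is a hypothetical wrapper, and the 256K value used in the note below is an arbitrary example rather than a recommendation.

```c
#include <malloc.h>   /* glibc-specific: mallopt, M_MMAP_THRESHOLD */

/* Asks glibc to use mmap only for requests of at least `bytes`.
 * Returns 1 on success and 0 on error, following mallopt's convention. */
int set_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes);
}
```

Calling set_mmap_threshold(256 * 1024) early in main would, under these assumptions, make requests smaller than 256K come from the brk heap instead of a separate mmap region.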

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
