This article covers four questions: How is the Linux virtual address space laid out? How do malloc and free allocate and release memory? How can I see the fragmentation of memory inside the heap? And since heap memory obtained via brk/sbrk cannot be released directly, why not allocate everything with mmap and release it directly with munmap?
Linux virtual memory management rests on a few key concepts:
1. Each process has its own virtual address space; the addresses a process accesses are not real physical addresses.
2. A virtual address is translated to a physical address through the process's page table (kept in the kernel portion of each process's address space).
3. If the physical page backing a virtual address is not in physical memory, a page fault occurs: physical memory is allocated on the spot and the process's page table is updated. If physical memory is exhausted, a page-replacement algorithm evicts some pages to disk.
First, how is the Linux virtual address space distributed?
Linux uses a virtual address space, which greatly enlarges a process's addressable range. From low addresses to high addresses, the layout is:
1. Read-only segment: this space is read-only, not writable; it includes the code (text) segment and the rodata segment (C string literals and constants defined with #define).
2. Data segment: holds global variables and static variables.
3. Heap: the usual "dynamic memory"; malloc/new allocations mostly come from here. The top of the heap can be moved dynamically with the brk and sbrk functions.
4. File mapping area: mappings of physical space such as dynamic libraries and shared memory, usually virtual address ranges allocated by the mmap function.
5. Stack: maintains function-call context; typically 8 MB, viewable with ulimit -s.
6. Kernel virtual space: memory invisible to user code, managed by the kernel (the page tables are stored here).
A typical 32-bit virtual address space layout (from *Computer Systems: A Programmer's Perspective*) looks as follows. A 32-bit system has a 4 GB address space: 0x08048000~0xBFFFFFFF is user space, and 0xC0000000~0xFFFFFFFF is kernel space, which holds kernel code and data plus process-related structures such as page tables and kernel stacks. %esp points to the top of the stack, which grows toward lower addresses, while brk/sbrk move _edata, the top of the heap, toward higher addresses.
What about 64-bit systems? Does a 64-bit system have a 2^64 address space? In fact, the virtual address space division of 64-bit systems is different:
1. The address space is neither 2^32 nor 2^64 but generally 2^48, because a full 2^64 address space is unnecessary and would only waste resources. 64-bit Linux typically uses 48 bits for virtual addresses and 40 bits for physical addresses; this can be checked with cat /proc/cpuinfo (the "address sizes" line).
2. 0x0000000000000000~0x00007FFFFFFFFFFF is user space, and 0xFFFF800000000000~0xFFFFFFFFFFFFFFFF is kernel space, 256 TB (2^48) of address space in total. These two ranges share a property: bits 48~63 are copies of bit 47. If those bits are all 0 the address is in user space; otherwise it is in kernel space.
3. From low addresses to high, user space still consists of the read-only segment, data segment, heap, file mapping area, and stack.
Second, how do malloc and free allocate and release memory?
How do I see how many page faults a process has taken?
Use the command ps -o majflt,minflt -C program
majflt stands for major fault; minflt stands for minor fault. Both are cumulative counts of page faults since the process started, and they are worth watching when stress-testing a program with high performance requirements.
If a process uses mmap to map a large data file into its virtual address space, pay particular attention to majflt: compared with minflt, its cost to performance is fatal, since a random disk read takes on the order of milliseconds, whereas minflt hurts performance only when it occurs in very large numbers.
After a page fault, what actions are performed?
When a process takes a page fault, it traps into kernel mode and performs the following actions:
1. Check that the virtual address being accessed is legal.
2. Find or allocate a physical page.
3. Fill the physical page (read it from disk, zero it, or do nothing).
4. Establish the mapping from the virtual address to the physical address.
5. Re-execute the instruction that triggered the fault.
If step 3 has to read the disk, the fault is a majflt; otherwise it is a minflt.
The principle of memory allocation
From the operating system's perspective, a process allocates memory in two ways, corresponding to two system calls: brk and mmap (leaving shared memory aside).
1. brk pushes _edata, the pointer to the highest address of the data segment (.data), toward higher addresses.
2. mmap finds a free region in the process's virtual address space (between the heap and the stack, in the file mapping area).
Both methods allocate only virtual memory; no physical memory is allocated. On first access to the allocated virtual address range a page fault occurs; the operating system then allocates physical memory and establishes the mapping between virtual and physical memory.
The standard C library's malloc/free functions, which allocate and release memory, are implemented on top of the brk, mmap, and munmap system calls.
Here's an example of how memory allocation works:
Case one: for requests under 128K, malloc allocates with brk, pushing _edata toward higher addresses. Only virtual space is obtained, with no corresponding physical memory (hence no initialization); on the first read or write of the data, the kernel takes a page fault, allocates the corresponding physical memory, and establishes the mapping with the virtual address space, as follows:
1. When the process starts, the initial layout of its (virtual) memory space is as shown in figure 1. The mmap memory-mapped files (for example libc-2.2.93.so and other data files) sit between the heap and the stack; they are omitted here for simplicity. The _edata pointer (defined inside glibc) points to the highest address of the data segment.
2. After the process calls A = malloc(30K), memory space is as in figure 2. The malloc function issues the brk system call, pushing _edata toward higher addresses by 30K and completing the virtual memory allocation.
You might ask: does simply moving _edata up by 30K really complete the allocation? In fact, _edata+30K only completes the allocation of virtual addresses; memory A still has no physical pages behind it. Only when the process first reads or writes A does a page fault occur, at which point the kernel allocates the physical pages corresponding to A. In other words, if malloc allocates a block that is never accessed, no physical pages are ever assigned to it.
3. After the process calls B = malloc(40K), memory space is as in figure 3.
Case two: for requests over 128K, malloc allocates with mmap, finding a free region between the heap and the stack (an independent mapping, initialized to zero), as follows:
4. After the process calls C = malloc(200K), memory space is as in figure 4. By default, when the requested size exceeds 128K (tunable via the M_MMAP_THRESHOLD option), malloc does not push _edata but instead uses the mmap system call to allocate a piece of virtual memory between the heap and the stack.
This is done mainly because memory allocated by brk can only be returned after the memory above it is freed (for example, A cannot be returned before B is freed; this is also why memory fragmentation arises and why trimming is needed, see below), whereas mmap'd memory can be released on its own. There are other pros and cons as well; interested readers can study the malloc code inside glibc.
5. After the process calls D = malloc(100K), memory space is as in figure 5.
6. The process calls free(C); C's virtual memory and physical memory are released together.
7. The process calls free(B), as shown in figure 7. B's virtual and physical memory are not released, because there is only one _edata pointer: if it were pushed back down, what would become of D's memory? B's memory can of course be reused; if a 40K request arrived now, malloc would very likely hand B back.
8. The process calls free(D), as shown in figure 8. B and D join into a single 140K block of free memory.
9. By default, when the free memory at the highest addresses of the heap exceeds 128K (tunable via the M_TRIM_THRESHOLD option), a trim operation is performed. During the previous free, glibc found more than 128K free at the top of the heap, and trimming produced the state shown in figure 9.
The truth
With the allocation principle laid out, the reason the test module's kernel CPU consumption was so high becomes clear: each request malloc'd a 2 MB block, so by default malloc called mmap to allocate it and munmap to release it when the request ended. Assuming each request needs 6 physical pages, each request produces 6 page faults; at a load of 2000 requests per second, that is on the order of 10,000 page faults per second. These faults need no disk access, so they are minflts; but page faults execute in kernel mode, so the process's kernel CPU consumption is large. Because the faults are scattered throughout the handling of each request, the allocation statement's own cost (10us) looks tiny next to the request's total processing time (1000us).
Solutions
Option 1: change dynamic allocation to static allocation, or malloc once per thread at startup and keep the buffer in thread data. However, due to the particulars of this module, neither static allocation nor allocation at startup is workable. Also, the default stack size limit under Linux is 10 MB, so allocating several MB on the stack is risky.
Option 2: forbid malloc from calling mmap to allocate memory, and disable memory trimming.
At process startup, add the following two lines of code:
mallopt(M_MMAP_MAX, 0);         // disable malloc's use of mmap
mallopt(M_TRIM_THRESHOLD, -1);  // disable memory trimming
Effect: after adding these two lines, observing with the ps command shows that under steady load both majflt and minflt stay at 0, and the process's system CPU drops from 20 to 10.
Third, how do I see the fragmentation of memory inside the heap?
glibc provides the following structure and interfaces for viewing the memory usage of the heap and of mmap.
struct mallinfo {
    int arena;    /* non-mmapped space allocated from system */
    int ordblks;  /* number of free chunks */
    int smblks;   /* number of fastbin blocks */
    int hblks;    /* number of mmapped regions */
    int hblkhd;   /* space in mmapped regions */
    int usmblks;  /* maximum total allocated space */
    int fsmblks;  /* space available in freed fastbin blocks */
    int uordblks; /* total allocated space */
    int fordblks; /* total free space */
    int keepcost; /* top-most, releasable (via malloc_trim) space */
};

/* returns the heap (main_arena) usage as a mallinfo structure */
struct mallinfo mallinfo(void);

/* prints the usage of heap and mmap to stderr */
void malloc_stats(void);
The following example verifies the output of mallinfo and malloc_stats.
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

size_t heap_malloc_total, heap_free_total, mmap_total, mmap_count;

void print_info(void)
{
    struct mallinfo mi = mallinfo();
    printf("count by itself:\n");
    printf("\theap_malloc_total=%zu heap_free_total=%zu heap_in_use=%zu\n\tmmap_total=%zu mmap_count=%zu\n",
           heap_malloc_total * 1024, heap_free_total * 1024,
           heap_malloc_total * 1024 - heap_free_total * 1024,
           mmap_total * 1024, mmap_count);
    printf("count by mallinfo:\n");
    printf("\theap_malloc_total=%d heap_free_total=%d heap_in_use=%d\n\tmmap_total=%d mmap_count=%d\n",
           mi.arena, mi.fordblks, mi.uordblks, mi.hblkhd, mi.hblks);
    printf("from malloc_stats:\n");
    malloc_stats();
}

#define ARRAY_SIZE 200

int main(int argc, char **argv)
{
    char *ptr_arr[ARRAY_SIZE];
    int i;

    for (i = 0; i < ARRAY_SIZE; i++) {
        ptr_arr[i] = malloc(i * 1024);
        if (i < 128) {              /* glibc uses mmap at 128K and above by default */
            heap_malloc_total += i;
        } else {
            mmap_total += i;
            mmap_count++;
        }
    }
    print_info();

    for (i = 0; i < ARRAY_SIZE; i++) {
        if (i % 2 == 0)             /* free only the odd-indexed blocks */
            continue;
        free(ptr_arr[i]);
        if (i < 128) {
            heap_free_total += i;
        } else {
            mmap_total -= i;
            mmap_count--;
        }
    }
    printf("\nafter free\n");
    print_info();

    return 0;
}
In this example, the first loop allocates to each element of the pointer array a block whose size equals its index in KB, and counts heap versus mmap allocations with 128 as the boundary. The second loop frees the odd-indexed blocks while updating the counts. The program's own counts are then compared against the mallinfo/malloc_stats results, with print_info writing everything to the terminal.
The following is an execution result:
count by itself:
        heap_malloc_total=8323072 heap_free_total=0 heap_in_use=8323072
        mmap_total=12054528 mmap_count=72
count by mallinfo:
        heap_malloc_total=8327168 heap_free_total=2032 heap_in_use=8325136
        mmap_total=12238848 mmap_count=72
from malloc_stats:
Arena 0:
system bytes     =    8327168
in use bytes     =    8325136
Total (incl. mmap):
system bytes     =   20566016
in use bytes     =   20563984
max mmap regions =         72
max mmap bytes   =   12238848

after free
count by itself:
        heap_malloc_total=8323072 heap_free_total=4194304 heap_in_use=4128768
        mmap_total=6008832 mmap_count=36
count by mallinfo:
        heap_malloc_total=8327168 heap_free_total=4197360 heap_in_use=4129808
        mmap_total=6119424 mmap_count=36
from malloc_stats:
Arena 0:
system bytes     =    8327168
in use bytes     =    4129808
Total (incl. mmap):
system bytes     =   14446592
in use bytes     =   10249232
max mmap regions =         72
max mmap bytes   =   12238848
As the results show, the program's own statistics basically coincide with the information obtained from mallinfo; heap_free_total is the total of the freed fragments inside the heap.
To estimate how many fragments the heap holds, look at fsmblks, smblks, and ordblks in the mallinfo structure, which give the total number of fragments in different size intervals: 0~80 bytes, 80~512 bytes, and 512~128K respectively. If fsmblks and smblks grow too large, the fragmentation problem may be getting serious.
The mallinfo structure has one fatal problem, however: its members are all defined as int. In a 64-bit environment, uordblks/fordblks/arena/usmblks easily overflow. This is a historical legacy; beware when using it!
Fourth, since heap memory from brk and sbrk cannot be released directly, why not allocate everything with mmap and release it directly with munmap?
Since the fragments inside the heap cannot be returned directly, leading to suspected "memory leaks", why doesn't malloc implement everything with mmap (memory allocated by mmap can be truly released on free via munmap), instead of using mmap only for large blocks over 128K?
In fact, the interfaces through which a process requests and returns address space from the OS, sbrk/mmap/munmap, are all system calls, and frequent system calls consume system resources. Moreover, after mmap'd memory is munmap'd, requesting it again brings the page faults back: mmap'ing 1 MB of space, for example, produces a large number of page faults (1M/4K) on first use, and after munmap, allocating another 1 MB repeats them all. Page faults are kernel behavior and burn CPU in kernel mode. In addition, using mmap for small allocations fragments the address space and increases the kernel's management burden.
The heap, by contrast, is a contiguous space, and although heap fragments are not returned to the OS, they can be reused; accessing reused memory very likely triggers no system call and no page fault, which significantly reduces CPU consumption. The glibc malloc implementation therefore weighs the behaviors, strengths, and weaknesses of sbrk and mmap: by default only large blocks (128K and up) use mmap to obtain address space, and this threshold can be changed with mallopt(M_MMAP_THRESHOLD, <size>).
Fifth, how do I view a process's page fault information?
The following commands show page fault counts:
ps -o majflt,minflt -C <program_name>
ps -o majflt,minflt -p <pid>
where majflt stands for major fault and minflt for minor fault; both values are cumulative counts of page faults since the process started.
The difference between majflt and minflt is that a majflt involves disk I/O: either the faulting page must be loaded into physical memory from disk, or physical memory is so short that some physical pages must first be evicted to disk.
See: http://blog.163.com/[email protected]/blog/static/132229655201210975312473/
Sixth, besides glibc's malloc/free, are there other third-party implementations?
Indeed, many have criticized glibc's memory management implementation, especially its performance under high concurrency and its memory fragmentation, so a number of third-party allocators have appeared to replace it; the best known are Google's tcmalloc and Facebook's jemalloc. There are plenty of resources about them online (and a third-party library's malloc can be used without modifying your code).
References:
Computer Systems: A Programmer's Perspective, chapter 10
http://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt
https://www.ibm.com/developerworks/cn/linux/l-lvm64/
http://www.kerneltravel.net/journal/v/mem.htm
http://blog.csdn.net/baiduforum/article/details/6126337
http://www.nosqlnotes.net/archives/105
Original address: http://blog.163.com/[email protected]/blog/static/132229655201210975312473/