Problem:
When writing a program, we often find that it uses more memory than we actually requested. We would like to reduce that overhead, but sometimes nothing in our own code seems left to optimize. What then? Let's turn our attention to malloc. After all, the memory we request from the system is obtained through it, and unless we understand how it works, we cannot thoroughly optimize memory usage.
Here is a small example.
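// Compile: g++ -o malloc_addr_vec malloc_addr_vec.cpp
#include <cstdlib>   // atoi
#include <iostream>
using namespace std;
int main(int argc, char* argv[])
{
    if (argc < 2) {
        cerr << "usage: " << argv[0] << " <bytes per allocation>" << endl;
        return 1;
    }
    int malloc_size = atoi(argv[1]);
    char* malloc_char;
    // Allocate 1024*1024 blocks and deliberately never free them.
    for (size_t i = 0; i < 1024 * 1024; ++i) {
        malloc_char = new char[malloc_size];
    }
    while (1) {}  // spin so the memory usage can be checked with top
    return 0;
}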
The test environment for this article is 64-bit Linux. After compiling the program with g++, run the executable with different arguments and use the top command to check how much memory it occupies. Here we mainly look at the RES column:
RES -- Resident size (kb)
The non-swapped physical memory a task has used.
Test cases:
1. new 1 byte, repeated 1024*1024 times
./malloc_addr_vec 1
Memory usage after the program starts: 32 MB
2. new 24 bytes, repeated 1024*1024 times
./malloc_addr_vec 24
Memory usage after the program starts: 32 MB
3. new 25 bytes, repeated 1024*1024 times
./malloc_addr_vec 25
Memory usage after the program starts: 48 MB
Why does allocating 1 byte at a time consume the same memory as allocating 24 bytes at a time? And why does allocating 25 bytes at a time consume a different amount than allocating 24 bytes?
I don't know whether you have ever noticed this when writing programs. The first time I ran into it, I blurted out: what the fuck, malloc.
Cause analysis:
In most cases, the compiler and the C library handle alignment issues for you transparently. POSIX specifies that the addresses returned by malloc(), calloc(), and realloc() are suitably aligned for any C type.
The alignment parameter (for example, the one passed to posix_memalign()) must meet two requirements.
1. It must be a power of 2.
2. It must be an integer multiple of sizeof(void *).
As for why it has to be an integer multiple of sizeof(void *), I am not sure yet. I leave that for you to find out...
Following this principle, the alignment unit in glibc is 8 bytes on 32-bit systems and 16 bytes on 64-bit systems.
However, this alone does not explain the test results above, because the minimum unit (MINSIZE) that the system malloc allocates is not the alignment unit.
To learn more, download the glibc source code from the GNU website and look at its malloc.c file.
#ifndef INTERNAL_SIZE_T
#define INTERNAL_SIZE_T size_t
#endif
#define SIZE_SZ (sizeof(INTERNAL_SIZE_T))
#ifndef MALLOC_ALIGNMENT
#define MALLOC_ALIGNMENT (2 * SIZE_SZ)
#endif


struct malloc_chunk {
  INTERNAL_SIZE_T prev_size;   /* Size of previous chunk (if free). */
  INTERNAL_SIZE_T size;        /* Size in bytes, including overhead. */
  struct malloc_chunk* fd;     /* double links -- used only if free. */
  struct malloc_chunk* bk;
};

/*
  An allocated chunk looks like this:

      chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
              | Size of previous chunk, if allocated          |
              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
              | Size of chunk, in bytes                   |M|P|
        mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
              | User data starts here...                      .
              .                                               .
              . (malloc_usable_size() bytes)                  .
              .                                               |
  nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
              | Size of chunk                                 |
              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*/

#define MALLOC_ALIGN_MASK (MALLOC_ALIGNMENT - 1)
#define MIN_CHUNK_SIZE (sizeof(struct malloc_chunk))
#define MINSIZE \
  (unsigned long)(((MIN_CHUNK_SIZE + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK))
/* pad request bytes into a usable size -- internal version */
#define request2size(req) \
  (((req) + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE) ? \
   MINSIZE : \
   ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK)
Here, the request2size macro is glibc's memory alignment operation, and MINSIZE is the smallest amount of memory a single malloc can occupy. According to the macro definitions, MINSIZE is 16 bytes on 32-bit systems and generally 32 bytes on 64-bit systems. request2size tells us the rest. On a 64-bit system, a request of 1 to 24 bytes consumes 32 bytes, and a request of 25 bytes consumes 48 bytes. On a 32-bit system, a request of 1 to 12 bytes consumes 16 bytes, and a request of 13 bytes consumes 24 bytes.
In general, the difference between the two platforms comes down to the pointer size. The formula is
chunk size = max(MINSIZE, in_use_size)
in_use_size = (requested size + 2 * pointer size - pointer size), rounded up to a multiple of MALLOC_ALIGNMENT
(For the reasoning behind this calculation, see section 4, "chunk", of the article on glibc memory pool management with ptmalloc, and read the internal implementation of malloc in the source code.)
To verify this theory, we need to measure how much memory a single malloc actually consumes. We run the following code on both 32-bit and 64-bit Linux.
#include <stdio.h>
#include <stdlib.h>
int main()
{
    char *p1;
    char *p2;
    int i = 1;
    printf("%zu\n", sizeof(char *));
    for (; i < 100; i++)
    {
        p1 = NULL;
        p2 = NULL;
        /* Two back-to-back allocations: the distance between them is
           the space the first one really occupies. */
        p1 = (char *) malloc(i * sizeof(char));
        p2 = (char *) malloc(1 * sizeof(char));
        printf("i = %d %ld\n", i, (long) (p2 - p1));
    }

    getchar();
    return 0;
}
The test results are as follows:
32bit
---------------------
Linux 32bit
---------------------

i = 1 16
i = 2 16
i = 3 16
i = 4 16
i = 5 16
i = 6 16
i = 7 16
i = 8 16
i = 9 16
i = 10 16
i = 11 16
i = 12 16
i = 13 24
i = 14 24
i = 15 24
i = 16 24
i = 17 24
i = 18 24
i = 19 24
i = 20 24
i = 21 32
i = 22 32
i = 23 32
i = 24 32
i = 25 32
i = 26 32
i = 27 32
i = 28 32
i = 29 40
i = 30 40
i = 31 40
i = 32 40
i = 33 40
i = 34 40
i = 35 40
i = 36 40
i = 37 48
i = 38 48
i = 39 48
i = 40 48
i = 41 48
i = 42 48
i = 43 48
i = 44 48
i = 45 56
i = 46 56
i = 47 56
i = 48 56
i = 49 56
i = 50 56
i = 51 56
i = 52 56
i = 53 64
i = 54 64
i = 55 64
i = 56 64
i = 57 64
i = 58 64
i = 59 64
i = 60 64
i = 61 72
i = 62 72
i = 63 72
i = 64 72
i = 65 72
i = 66 72
i = 67 72
i = 68 72
i = 69 80
i = 70 80
i = 71 80
i = 72 80
i = 73 80
i = 74 80
i = 75 80
i = 76 80
i = 77 88
i = 78 88
i = 79 88
i = 80 88
i = 81 88
i = 82 88
i = 83 88
i = 84 88
i = 85 96
i = 86 96
i = 87 96
i = 88 96
i = 89 96
i = 90 96
i = 91 96
i = 92 96
i = 93 104
i = 94 104
i = 95 104
i = 96 104
i = 97 104
i = 98 104
i = 99 104
64bit
-------------------
Linux 64bit
-------------------
8
i = 1 32
i = 2 32
i = 3 32
i = 4 32
i = 5 32
i = 6 32
i = 7 32
i = 8 32
i = 9 32
i = 10 32
i = 11 32
i = 12 32
i = 13 32
i = 14 32
i = 15 32
i = 16 32
i = 17 32
i = 18 32
i = 19 32
i = 20 32
i = 21 32
i = 22 32
i = 23 32
i = 24 32
i = 25 48
i = 26 48
i = 27 48
i = 28 48
i = 29 48
i = 30 48
i = 31 48
i = 32 48
i = 33 48
i = 34 48
i = 35 48
i = 36 48
i = 37 48
i = 38 48
i = 39 48
i = 40 48
i = 41 64
i = 42 64
i = 43 64
i = 44 64
i = 45 64
i = 46 64
i = 47 64
i = 48 64
i = 49 64
i = 50 64
i = 51 64
i = 52 64
i = 53 64
i = 54 64
i = 55 64
i = 56 64
i = 57 80
i = 58 80
i = 59 80
i = 60 80
i = 61 80
i = 62 80
i = 63 80
i = 64 80
i = 65 80
i = 66 80
i = 67 80
i = 68 80
i = 69 80
i = 70 80
i = 71 80
i = 72 80
i = 73 96
i = 74 96
i = 75 96
i = 76 96
i = 77 96
i = 78 96
i = 79 96
i = 80 96
i = 81 96
i = 82 96
i = 83 96
i = 84 96
i = 85 96
i = 86 96
i = 87 96
i = 88 96
i = 89 112
i = 90 112
i = 91 112
i = 92 112
i = 93 112
i = 94 112
i = 95 112
i = 96 112
i = 97 112
i = 98 112
i = 99 112
Once we understand how malloc uses memory, we can optimize a program's memory usage in a targeted way. We can size our requests according to the alignment rules and build an efficient memory pool that avoids this hidden waste.
For example, the STL memory pool is 8-byte aligned, and the block sizes of its free lists are
free_list[0] --------> 8 bytes
free_list[1] --------> 16 bytes
free_list[2] --------> 24 bytes
free_list[3] --------> 32 bytes
......
free_list[15] --------> 128 bytes
We can optimize these sizes as follows:
32-bit OS: 16 - 4 + n * 8
64-bit OS: 32 - 8 + n * 16
n = 0, 1, 2, 3, ..., max
This way, a memory pool of the same total size gets better utilization...
From bear | Zealot Yin