Introduction to C language memory
Memory allocation and memory alignment are fundamental to systems programming. How well a programmer understands memory allocation directly affects code quality, correctness, and efficiency, as well as the ability to reason about memory usage, overflows, and leaks. Memory alignment is often overlooked, yet understanding its principles and methods helps programmers recognize illegal memory accesses. The memory occupied by a C/C++ program is generally divided into the following five areas:
1. Stack: allocated automatically by the system, and created and released automatically as the program runs. It holds function parameters, local variables, return values, and similar information.
2. Heap: used freely; the size does not have to be known in advance. The programmer must request and release this memory manually. If it is not released, the operating system reclaims it when the program ends. For example, s = (char *)malloc(10);
3. Static area/global area (static): stores global variables and static variables. It is released by the system when the program ends.
4. Constant area: stores constants, such as string literals.
5. Code area: stores the program's code.
For example:
#include <stdio.h>
#include <stdlib.h>

int quanju; /* global variable: global/static area */
void fun(int f_jubu); /* the function's code is in the code area */

int main(void)
{
    int m_jubu; /* stack */
    static int m_jingtai; /* static variable: global/static area */
    char *m_zifum, *m_zifuc = "hello"; /* both pointers are on the stack; m_zifuc points to the string "hello" in the constant area */
    void (*pfun)(int); /* stack */

    pfun = &fun;
    m_zifum = (char *)malloc(sizeof(char) * 10); /* the pointer holds the address of the allocated block, which is on the heap */
    pfun(1);
    printf("&quanju: %p\n", (void *)&quanju);
    printf("&m_jubu: %p\n", (void *)&m_jubu);
    printf("&m_jingtai: %p\n", (void *)&m_jingtai);
    printf("m_zifuc: %p\n", (void *)m_zifuc);
    printf("&m_zifuc: %p\n", (void *)&m_zifuc);
    printf("m_zifum: %p\n", (void *)m_zifum);
    printf("&m_zifum: %p\n", (void *)&m_zifum);
    printf("pfun: %p\n", (void *)pfun); /* converting a function pointer to void * is a common extension, not strict ISO C */
    printf("&pfun: %p\n", (void *)&pfun);
    free(m_zifum); /* release the heap block */
    getchar(); /* wait for Enter before exiting */
    return 0;
}

void fun(int f_jubu)
{
    static int f_jingtai;
    printf("&f_jingtai: %p\n", (void *)&f_jingtai);
    printf("&f_jubu: %p\n", (void *)&f_jubu); /* stack, but at a different address from m_jubu in main */
}
Stack and heap
1. How memory is requested
Stack:
Allocated automatically by the system. For example, when a function declares a local variable char c; the system automatically reserves space for c on the stack.
Heap:
The programmer must request the memory manually and specify its size. In C this is done with the malloc function, for example p1 = (char *)malloc(10);
2. System response to a request
Stack:
As long as the remaining stack space is larger than the requested space, the system provides the memory. Otherwise the program fails with a stack overflow.
Heap:
Most systems keep a linked list of free memory blocks. When a request arrives, the allocator traverses the list to find the first free node whose space is large enough for the request, removes that node from the free list, and gives its space to the program. On most systems the size of the allocation is also recorded at the start of the block, so that free can release the memory correctly. Because the node found is not necessarily the same size as the request, the system puts the excess part back on the free list.
3. Size limits
Stack:
In Windows, the stack is a data structure that grows toward lower addresses and occupies a contiguous region of memory. This means that the address of the top of the stack and the maximum stack capacity are fixed in advance by the system. In Windows the default stack size is a constant (1 MB with the Microsoft linker's default settings, though 2 MB is also cited) that is fixed when the program is built. If the requested space exceeds the remaining stack space, a stack overflow occurs. The space available from the stack is therefore small.
Heap:
The heap is a data structure that grows toward higher addresses and occupies non-contiguous regions of memory. This is because the system tracks free memory with a linked list, which is naturally non-contiguous, and the list is traversed from low addresses to high addresses. The heap size is limited by the virtual memory available in the system. The heap therefore offers more flexible and much larger space.
4. Allocation efficiency
Stack:
Allocated automatically by the system, so it is fast, but the programmer has no control over it.
Heap:
Memory allocated manually by the programmer is generally slower and prone to fragmentation, but it is the most convenient to use.
5. What the stack and heap store
Stack:
During a function call, the stack receives the function's parameters (with most C compilers they are pushed from right to left), the address of the instruction that follows the call statement (the return address), and then the function's local variables. The exact order depends on the calling convention; under the common x86 cdecl convention the parameters are pushed before the return address. Note that static variables are not stored on the stack. When the call ends, the local variables are released first, the return address is popped so that the program resumes at the instruction after the call, and the parameters are removed.
Heap:
Generally, the size of a heap block is stored in a header at the start of the block. The rest of the block's content is arranged by the programmer.
Memory alignment issues
In modern computers, memory is addressed byte by byte. In theory, a variable of any type could be accessed starting at any address. In practice, variables of particular types are often accessed only at particular addresses, so data of each type must be laid out in memory according to certain rules rather than simply placed one after another. This is alignment. Usually we do not need to think about alignment when writing a program, because the compiler chooses an alignment policy suited to the target platform. We can, however, use a preprocessor directive to tell the compiler to change the alignment used for specific data.
1. Reasons for memory alignment
Hardware platforms differ greatly in how they handle memory. Some platforms can access certain types of data only at certain addresses. Other platforms have no such restriction, but the most common problem is that data not aligned as the platform requires is slower to access. For example, some platforms read data starting at an even address on every read. If an int (assume 32 bits) is stored starting at an even address, one read cycle is enough. If it is stored starting at an odd address, two read cycles may be needed, and the high and low bytes of the two results must be spliced together to obtain the int. Reading efficiency clearly drops. Alignment is therefore a trade-off between space and time.
2. Correct handling of byte alignment
For a basic data type, the address only needs to be an integer multiple of the type's length. Other data types are aligned as follows:
A. Array: aligned according to its element type. Once the first element is aligned, the following elements are aligned naturally.
B. Union: aligned according to the member type with the largest length.
C. Struct: every member of the struct must be aligned.
Starting from the first address of the struct, find for each member in turn the first address x that satisfies x % N = 0, where N is the alignment value of that member. In addition, the length of the whole struct must be the smallest sufficient integer multiple of the largest alignment value used by any member.
3. Alignment rules
Every compiler on a given platform has its own default "alignment coefficient" (also called the alignment modulus). Programmers can change it with the preprocessor directive #pragma pack(n), n = 1, 2, 4, 8, 16, where n is the alignment coefficient to use. The alignment rules are as follows:
A. Member alignment: for the data members of a struct (or union), the first member is placed at offset 0. Each later member is aligned to the smaller of the value specified by #pragma pack (or the default value) and the length of the member's own type, and is placed at the first address after the previous member that is divisible by that alignment value.
B. Overall alignment of a struct (or union): after the members have been aligned, the struct (or union) itself must also be aligned, which determines whether padding bytes are added after the last member. The overall alignment is the smaller of the value specified by #pragma pack (or the default value) and the largest member type length in the struct (or union).
C. From A and B it follows that when the n value of #pragma pack is equal to or greater than the length of every member type, n has no effect.
4. Four basic concepts:
1. Natural alignment value of a data type: the alignment of the basic data types described above.
2. Specified alignment value: the value given in #pragma pack(value).
3. Natural alignment value of a struct or class: the largest natural alignment value among its members.
4. Effective alignment value of a data member, struct, or class: the smaller of its natural alignment value and the specified alignment value.
Because results differ between platforms and compilers, I used gcc version 4.1.2 20080704 (Red Hat 4.1.2-52) to examine how the compiler aligns each member of a struct. For example:
1. struct A {
    int a;
    char b;
    short c;
};
struct A contains a four-byte int, a one-byte char, and a two-byte short, so its members need 7 bytes. However, because the compiler aligns the data members in memory, sizeof(struct A) is 8.
2. struct B {
    char b;
    int a;
    short c;
};
Assume that B is laid out starting at address 0x0000. No alignment value is specified in this example; in the author's environment the default is 4. The first member, b, has a natural alignment value of 1, which is smaller than the default value 4, so its effective alignment value is 1 and it is stored at 0x0000, since 0x0000 % 1 = 0. The second member, a, has a natural alignment value of 4, so its effective alignment value is 4. It can only be stored in the four consecutive bytes from 0x0004 to 0x0007, since 0x0004 % 4 = 0 and this is the nearest suitable address after the first member. The third member, c, has a natural alignment value of 2, so its effective alignment value is also 2, and it is stored in the two bytes from 0x0008 to 0x0009, since 0x0008 % 2 = 0. The members of B therefore occupy 0x0000 to 0x0009. Next, the natural alignment value of struct B is the largest alignment value among its members (here, that of a), which is 4, so the effective alignment value of the struct is also 4. The struct size must be a multiple of this value: 0x0000 to 0x0009 is 10 bytes, and (10 + 2) % 4 = 0, so 0x000A to 0x000B are also occupied by struct B. B therefore takes 12 bytes, from 0x0000 to 0x000B, and sizeof(struct B) = 12.
3. #pragma pack(2) /* align to 2 bytes */
struct C {
    char b;
    int a;
    short c;
};
#pragma pack() /* cancel the specified alignment and restore the default */
The preprocessor directive #pragma pack(value) tells the compiler which alignment value to use. The first member, b, has a natural alignment value of 1 and the specified alignment value is 2, so the effective alignment value of b is 1. Suppose C starts at 0x0000; then b is stored at 0x0000, which satisfies 0x0000 % 1 = 0. The second member, a, has a natural alignment value of 4 and the specified alignment value is 2, so its effective alignment value is 2. It is stored in the four consecutive bytes 0x0002, 0x0003, 0x0004, and 0x0005, which satisfies 0x0002 % 2 = 0. The third member, c, has a natural alignment value of 2, so its effective alignment value is 2, and it is stored in 0x0006 and 0x0007, which satisfies 0x0006 % 2 = 0. The members of C therefore occupy the eight bytes from 0x0000 to 0x0007. The natural alignment value of C is 4, so its effective alignment value is 2. Since 8 % 2 = 0, C occupies exactly the eight bytes from 0x0000 to 0x0007, and sizeof(struct C) = 8.
4. #pragma pack(1) /* align to 1 byte */
struct D {
    char b;
    int a;
    short c;
};
#pragma pack() /* cancel the specified alignment and restore the default */
With 1-byte alignment no padding is inserted, so sizeof(struct D) is 7.
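5. union E {
    int a[5];
    char b;
    double c;
};
The members of a union share memory, so the union must be at least as large as its largest member, which is 20 bytes here. In addition, the overall alignment of E follows the member with the largest alignment, the 8-byte double, so the size is rounded up to a multiple of 8 and sizeof(union E) = 24. This assumes that double is aligned to 8 bytes; on ABIs that align double to 4 bytes inside aggregates, as 32-bit x86 Linux does by default, the result may be 20.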
Note:
1. The alignment value of an array is min(alignment value of the array's element type, specified alignment value). The elements of the array are stored contiguously, and the array occupies its actual length.
For example, char t[9] has an alignment value of 1 and occupies 9 consecutive bytes. The number of padding bytes before the next member is then determined by the alignment value of that next member.
2. Nested structs. Suppose:
struct A
{
    ......
    struct B b;
    ......
};
The alignment value of member b (of type struct B) inside struct A is min(the alignment value of struct B, the specified alignment value). The alignment value of struct B itself is the one given by the overall alignment rule for structs described above.