Storage regions of a C program
After a C program is compiled and linked, it forms a single file made up of several parts, and several more parts are created when the program runs. Each part corresponds to a different storage region:
- Code segment (Code or Text): the code segment consists of the machine code of the program. In C, program statements are compiled into machine code. During execution, the CPU's program counter points to each instruction in the code segment in turn, and the processor executes them in sequence.
- Read-only data segment (RO data): data that the program never changes. It is typically used the way a lookup table is: because these values do not need to change, they only need to be placed in read-only memory.
- Initialized read/write data segment (RW data): variables that are declared in the program with initial values. They occupy storage space in the file, and during execution they must be located in read/write memory and hold their initial values so that the program can read and write them.
- Uninitialized read/write data segment (BSS): variables that are declared in the program but given no initial value. They do not need to occupy storage space before the program runs.
- Heap: heap memory exists only while the program is running. It is usually allocated and released by the programmer. If the programmer does not release it, the operating system can reclaim the memory after the program ends.
- Stack: stack memory exists only while the program is running. Variables used inside a function, function parameters, and return values use stack space, which is allocated and released automatically by compiler-generated code.
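The practical difference between the static, heap, and stack regions is how long an object lives. The following is a minimal sketch of that idea, not code from the original article; the function names `static_greeting`, `heap_copy`, and `stack_length` are hypothetical and chosen only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Static storage: the array exists for the whole run of the program,
   so returning its address is safe. */
const char *static_greeting(void)
{
    static const char greeting[] = "hello";
    return greeting;
}

/* Heap storage: the block stays valid after the function returns,
   until the caller passes it to free(). Returns NULL on failure. */
char *heap_copy(const char *src)
{
    size_t len = strlen(src) + 1;   /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, src, len);
    }
    return copy;
}

/* Stack storage: `local` disappears when the function returns, so only
   its value, never its address, may be handed back to the caller. */
size_t stack_length(const char *src)
{
    size_t local = strlen(src);
    return local;
}
```

A caller would typically write `char *c = heap_copy(static_greeting());`, use `c`, and then call `free(c)`; the pointer returned by `static_greeting` must never be passed to `free`.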
Memory layout of a C object file
The code segment, read-only data segment, read/write data segment, and uninitialized data segment belong to the static region, while the heap and stack belong to the dynamic region. The code, read-only data, and read/write data segments are generated at link time; the uninitialized data segment is set up during program initialization; the heap and stack are allocated and released while the program runs.
A C program exists in two states: image and runtime. The compiled and linked image contains only the code segment, read-only data segment, and read/write data segment. Before the program starts running, the uninitialized data segment is created dynamically, and the heap and stack regions are formed dynamically while the program runs.
In general, each part of the static image file is called a section, and each part at runtime is called a segment.
Sections of a C program
- Code segment: the code segment is generated from the functions of the program. Each statement of a function is compiled into binary machine code (the compiler emits machine code for the specific target architecture).
- Read-only data segment (RO Data): generated from data used in the program that does not need to change at run time, so the compiler places it in a read-only part. Read-only global variables, read-only static local variables, and constants such as string literals are typically placed in the read-only data zone at compile time (a non-static const local variable is usually still an ordinary stack variable). Note: defining the global variable const char a[100] = {"ABCDEFG"} generates a 100-byte read-only data zone initialized with "ABCDEFG" (the remaining bytes are zero). Defining it as const char a[] = {"ABCDEFG"} instead generates an 8-byte read-only data segment, sized from the string length (7 characters plus '\0'). Read-only data therefore generally needs to be fully initialized, because it cannot be modified later.
- Read/write data segment (RW Data): the part of the object file that can be both read and written; it is sometimes also called the initialized data segment. Like the code segment and the read-only data segment, it belongs to the static region of the program, but it is writable. Initialized global variables and initialized local static variables are generally placed here, for example static char b[100] = {"ABCDEFG"}; defined inside a function. Data in this segment must have an initializer in the program: a variable that is only defined, with no initial value, is not placed in the read/write data zone but in the uninitialized data zone (BSS). Adding static to a global variable (a variable defined outside any function) means it can be used only within that file and cannot be referenced from other files.
- Uninitialized data segment (BSS): like the read/write data segment, it is a static data segment, but its data has no initial values. The object file therefore only records it (its size) rather than storing its contents as a segment; the segment itself is created at run time, during the initialization phase, so its size does not affect the size of the object file.
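The size rules for initialized arrays and the zero-filled behaviour of uninitialized static data can be checked with `sizeof` and a simple loop. This is a small sketch under stated assumptions: the names `fixed_ro`, `auto_ro`, `zeroed`, and the helper functions are hypothetical, and which section each object actually lands in depends on the toolchain. The code only relies on sizes and values that the C language itself guarantees.

```c
#include <stddef.h>

const char fixed_ro[100] = {"ABCDEFG"};  /* 100 bytes; unused bytes are zero */
const char auto_ro[]     = {"ABCDEFG"};  /* 7 characters + '\0' = 8 bytes */
static char zeroed[100];                 /* no initializer: typically BSS */

/* Returns 1 if every byte of `zeroed` is 0. C guarantees this for
   objects with static storage duration that have no initializer. */
int bss_all_zero(void)
{
    for (size_t i = 0; i < sizeof zeroed; i++) {
        if (zeroed[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Size of the array with an explicit length. */
size_t fixed_ro_size(void)
{
    return sizeof fixed_ro;
}

/* Size of the array whose length comes from its initializer. */
size_t auto_ro_size(void)
{
    return sizeof auto_ro;
}
```

On toolchains that provide a `size` utility, comparing its output before and after enlarging `zeroed` is one way to see that BSS grows without the file's stored data growing; that step is outside what this sketch checks.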
Note the following points when using variables in C language programs:
- Variables defined inside a function body are usually on the stack. The program does not need to manage them; the compiler handles their allocation and release.
- Memory allocated by functions such as malloc, calloc, and realloc is on the heap. The program must release it with free; otherwise a memory leak may occur.
- Global variables, which are defined outside all functions, and variables declared static, whether inside or outside a function, are placed in the global zone.
- Variables defined with const at file scope, or as static locals, are typically stored in the read-only data zone of the program.
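The heap rule in the list above can be sketched as follows. The example uses `calloc` for a zero-filled block and grows it with `realloc` through a temporary pointer, so that the original block is not lost if the call fails. The helper names `make_counters` and `grow_counters` are hypothetical, and the sketch assumes the requested sizes are small enough that the size multiplication does not overflow.

```c
#include <stdlib.h>

/* Allocates `n` zero-initialized ints on the heap; returns NULL on failure. */
int *make_counters(size_t n)
{
    return calloc(n, sizeof(int));
}

/* Grows the block from `old_n` to `new_n` ints and zero-fills the added
   part. On failure it returns NULL and leaves the original block
   untouched, so the caller can still use and free it. */
int *grow_counters(int *old, size_t old_n, size_t new_n)
{
    int *tmp = realloc(old, new_n * sizeof *tmp);
    if (tmp == NULL) {
        return NULL;            /* `old` is still valid here */
    }
    for (size_t i = old_n; i < new_n; i++) {
        tmp[i] = 0;             /* realloc does not zero the new bytes */
    }
    return tmp;
}
```

Whatever pointer the caller ends up holding (the original block after a failed grow, or the new block after a successful one) must be passed to `free` exactly once.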
Use of segments in a program
The following simple example illustrates the correspondence between variables and segments in C. The global zone (static zone) of a C program corresponds to three segments: RO Data, RW Data, and BSS. In general, a global variable that is simply defined goes in the uninitialized data zone (BSS); if it is initialized, it goes in the initialized data zone (RW Data); and if it is const, it goes in the read-only data zone (RO Data).
    #include <stdlib.h>
    #include <string.h>

    const char ro[] = {"this is read only data"};             // read-only data segment
    static char rw_1[] = {"this is global read write data"};  // initialized read/write data segment
    char BSS_1[100];                                          // uninitialized data segment
    const char *ptrconst = "constant data";                   // the string is placed in the read-only data segment

    int main()
    {
        short b;              // occupies 2 bytes on the stack
        char a[100];          // 100 bytes are allocated on the stack; the value of a is the address of its first element
        char s[] = "abcdefg"; // s is an 8-byte array on the stack (7 characters plus '\0'), initialized from the literal
        char *p1;             // p1 is a pointer on the stack (4 bytes on a 32-bit target)
        char *p2 = "123456";  // p2 is on the stack; the content it points to must not be modified
                              // "123456" is in the read-only data segment
        static char rw_2[] = {"this is local read write data"}; // local initialized read/write data segment
        static char BSS_2[100]; // local uninitialized data segment
        static int c = 0;       // global (static) zone; zero-initialized statics often end up in BSS

        p1 = (char *)malloc(10 * sizeof(char)); // allocate memory in the heap
        if (p1 == NULL)
            return 1;         // allocation failed
        strcpy(p1, "xxxx");   // the literal "xxxx" (5 bytes) is in read-only data; it is copied into the heap block
        free(p1);             // use free to release the memory pointed to by p1
        return 0;
    }
The read/write data segment contains the initialized global variable static char rw_1[] and the local static variable static char rw_2[]. The difference between them is scope at compile time: rw_2 can be used only inside the function, whereas rw_1 can be used throughout the file. rw_1[] is placed in the read/write data zone whether or not it is declared static; static only determines whether other files can reference it. rw_2[] is different: it is a local static variable placed in the read/write data zone, and without static its meaning changes completely, because it becomes an ordinary local variable allocated on the stack rather than a static variable. Neither rw_1[] nor rw_2[] gives an explicit size, so the space each takes in the static zone is determined by the length of its initializer string.
The uninitialized data BSS_1[100] and BSS_2[100] differ in that the former is a global variable usable from all files, while the latter is a local static variable usable only within the function. Because uninitialized data has no initializer to take a size from, the size must be given explicitly; the compiler increases the length of BSS by that amount.
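The effect of static on a local variable, and the way an array takes its size from its initializer, can be observed directly. This is a minimal sketch; the function names `count_static`, `count_automatic`, and `local_array_size` are hypothetical.

```c
#include <stddef.h>

/* Static local: lives in the static region, is initialized once,
   and keeps its value between calls. */
int count_static(void)
{
    static int calls = 0;
    calls++;
    return calls;
}

/* Automatic local: created on the stack and re-initialized on every call. */
int count_automatic(void)
{
    int calls = 0;
    calls++;
    return calls;
}

/* With no explicit size, the array length comes from the initializer:
   7 characters plus the terminating '\0'. */
size_t local_array_size(void)
{
    char s[] = "abcdefg";
    return sizeof s;
}
```

Calling `count_static` repeatedly yields an increasing count, while `count_automatic` returns the same value each time, which is the behavioural difference that removing static from a local variable produces.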
Stack space is mainly used to store the following three kinds of data:
- Automatic (local) variables inside a function
- Function parameters
- Function return values
Stack space is allocated and reclaimed dynamically. When function calls are nested many levels deep, the required stack space grows with each level. Passing or returning a large struct by value also makes the stack usage correspondingly large.
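One common way to limit stack usage with large structs is to pass a pointer instead of the struct itself. The sketch below contrasts the two and adds a small recursive function to show nested calls. The type `struct samples` and the function names are hypothetical, and how much stack each call really consumes depends on the compiler and calling convention.

```c
#include <stddef.h>

#define SAMPLE_COUNT 256

struct samples {
    int values[SAMPLE_COUNT];   /* deliberately large struct */
};

/* By value: the whole struct is copied for the call, typically onto the stack. */
long sum_by_value(struct samples s)
{
    long total = 0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++) {
        total += s.values[i];
    }
    return total;
}

/* By pointer: only an address is passed, regardless of the struct size. */
long sum_by_pointer(const struct samples *s)
{
    long total = 0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++) {
        total += s->values[i];
    }
    return total;
}

/* Nested calls: each level that is still waiting for its callee keeps
   its own frame alive. Returns the number of nested levels, n. */
int call_depth(int n)
{
    if (n <= 0) {
        return 0;
    }
    return 1 + call_depth(n - 1);
}
```

Both sum functions return the same result; the pointer version simply avoids copying `sizeof(struct samples)` bytes on every call.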