I. Storage areas of C language programs
1. An executable program (a binary file) is built from C language code (text files) in three phases: compilation, assembly, and linking. Compilation turns the C source text into assembly code. Assembly turns the assembly code into binary machine code (an object file). Linking combines the object files generated from the various source files into one file.
2. After a C program is compiled and linked, a single file is formed, which consists of several parts. When the program runs, several other parts are created. Each part represents a different storage area:
1> code segment (code or text)
The code segment consists of the machine code the program executes. In C, program statements are compiled into machine code. During execution, the CPU's program counter points to each instruction in the code segment in turn, and the processor executes them in sequence.
2> read-only data segment (RO data)
The read-only data segment holds data that the program never changes. It is used much like a lookup table. Because this data does not need to change, it only needs to be placed in read-only memory.
3> initialized read/write data segment (RW data)
Initialized data is declared in the program with an initial value. These variables occupy storage space in the file. While the program runs they must reside in read/write memory, where they start with their initial values and can then be read and written.
4> uninitialized data segment (BSS)
Uninitialized data is declared in the program without an initial value. These variables do not need to occupy storage space before the program runs.
5> heap
Heap memory exists only while the program is running. It is generally allocated and released by the programmer. If an operating system is present and the program does not release the memory, the operating system may reclaim it after the program (for example, a process) ends.
6> stack
Stack memory exists only while the program is running. Variables used inside a function, function parameters, and return values use stack space, which is allocated and released automatically by compiler-generated code.
3. The code segment, read-only data segment, read/write data segment, and uninitialized data segment belong to the static region, while the heap and stack belong to the dynamic region. The code, read-only data, and read/write data segments are produced by linking; the uninitialized data segment is set up during program initialization; the heap and stack are allocated and released during program execution.
4. A C program has two states: image and runtime. The compiled and linked image contains only the code segment (text), the read-only data segment (RO data), and the read/write data segment (RW data). The uninitialized data segment (BSS) is created dynamically before the program runs, and the heap and stack areas are also created dynamically at run time.
Note: 1. In general, each part of a static image file is called a section, and each part at runtime is called a segment. When the distinction does not matter, both are referred to as segments.
2. After a C program is compiled and linked, the code segment (text), read-only data segment (RO data), and read/write data segment (RW data) are generated. At run time, in addition to these three regions, there are also the uninitialized data segment (BSS), the heap, and the stack.
II. Segments of C language programs
1. Classification of segments
The object code generated from each source file contains all the information and functionality that the source file expresses. The segments in the object code are generated as follows:
1> code segment (code)
The code segment is generated from the functions in the program. The statements of each function are compiled and assembled into binary machine code.
2> read-only data segment (RO data)
The read-only data segment is generated from data used in the program that does not need to change at run time, so the compiler places it in a read-only part. Certain C constructs generate read-only data.
2. read-only data segment (RO data)
The compiler places data that does not need to change at run time in the read-only part. Read-only data is generated in the following cases.
- Read-only global variables
Defining the global variable const char a[100] = "abcdefg" generates a 100-byte read-only data area initialized with the string "abcdefg". If it is defined as const char a[] = "abcdefg" with no size given, an 8-byte read-only area is generated from the length of the string "abcdefg" (7 characters plus the terminating '\0').
- Read-only local variables
For example, const char B[100] = "9876543210" defined inside a function is initialized in the same way as the global variable. (Its initializer comes from read-only data; where the array itself is stored is up to the compiler, because a non-static local variable has automatic storage.)
- Constants used in the program
For example, if the program calls printf("information\n"), which contains a string constant, the compiler automatically places the constant "information\n" in the read-only data area.
Note: const char a[100] = {"abcdefg"} defines 100 bytes of data, but only the first eight bytes are explicitly initialized (7 characters and the terminator '\0'). The remaining bytes are filled with zeros and, because the array is read-only, can never be written by the program, so they are wasted. Read-only data should therefore generally be fully initialized.
3. read/write data segment (RW data)
The read/write data segment is the data area in the object file that can be both read and written. It is sometimes also called the initialized data segment. Like the code and read-only data segments, it belongs to the static area of the program, but it is writable.
- Initialized global variables
For example, the global variable char a[100] = "abcdefg" defined outside any function.
- Initialized local static variables
For example, static char B[100] = "9876543210" defined inside a function. Data and arrays declared static inside a function are compiled into the read/write data segment.
Note:
Data in the read/write data area must be initialized in the program. If a variable is only defined, with no initial value, it does not go into the read/write data area but into the uninitialized data area (BSS). If the static modifier is added to a global variable (a variable defined outside any function), as in static char a[100], the variable can be used only within its own file and not by other files.
4. uninitialized data segment (BSS)
The uninitialized data segment is often called BSS (short for Block Started by Symbol). Like the read/write data segment, it belongs to the static data area, but its data has no initial values. It is therefore only recorded in the object file rather than stored there as a full segment, and the segment itself is created at run time. Because the uninitialized data segment is created only during program initialization, its size does not affect the size of the object file.
III. Notes on using variables in C language programs
1. Variables defined inside a function body are usually on the stack. The program does not need to manage them; the compiler handles them.
2. Memory allocated by functions such as malloc, calloc, and realloc is on the heap. The program must release it with free after use; otherwise a memory leak may occur.
3. Global variables defined outside all function bodies, and variables with the static modifier whether defined inside or outside a function, are stored in the global area (static area).
4. Variables defined with const are stored in the program's read-only data area.
Note:
C lets you define static variables. A static variable defined inside a function body is visible only within that function body. A static variable defined outside all function bodies is visible only within its file and cannot be used in other source files. A global variable without the static modifier can be used in other source files. These differences are compile-time concepts: if a variable is used against these rules, the compiler (or linker) reports an error. Global variables, with or without static, are all placed in the global (static) area of the program.
IV. Use of segments in a program
The global area (static area) in C corresponds to the following segments:
Read-only data segment: RO data
Read/write data segment: RW data
Uninitialized data segment: BSS data
Generally, a global variable that is defined without an initial value is in the uninitialized data area. If it is initialized, it is in the initialized data area (RW data). If it has the const modifier, it is placed in the read-only area (RO data).
E.g.:
#include <stdlib.h>
#include <string.h>

const char Ro[] = "this is readonly data"; // read-only data segment; the contents of the Ro array cannot be changed
char rw1[] = "this is global readwrite data"; // initialized read/write data segment; the contents of rw1 can be changed. rw1 receives a copy of the characters, not the address of the literal. The literal itself is a constant in the read-only data segment and cannot be changed.
char bss_1[100]; // uninitialized data segment
const char *ptrconst = "constant data"; // "constant data" is in the read-only data segment. ptrconst holds its address, so the content ptrconst points to cannot be changed. The pointer value of ptrconst itself can be changed, because ptrconst is stored in the read/write data segment.
int main(void)
{
    short B; // B is on the stack and occupies 2 bytes
    char A[100]; // 100 bytes are reserved on the stack; the value of A is the address of its first element
    char s[] = "ABCDE"; // s occupies 6 bytes on the stack, initialized from the literal "ABCDE" (6 bytes in the read-only data area). s is an array name, so its address cannot be changed: s++ is an error.
    char *P1; // P1 occupies 4 bytes on the stack (on a 32-bit system)
    char *P2 = "123456"; // "123456" is in the read-only data area and occupies 7 bytes. P2 is on the stack; the content P2 points to cannot be changed, but the address held in P2 can, so P2++ is correct.
    static char bss_2[100]; // local static, uninitialized data segment
    static int C = 0; // local static, initialized data area (many compilers place zero-initialized statics in BSS instead)
    P1 = (char *)malloc(10 * sizeof(char)); // the allocated memory is on the heap
    if (P1 == NULL)
        return 1; // allocation failed
    strcpy(P1, "XXX"); // "XXX" is in the read-only data area and occupies 4 bytes
    free(P1); // release the memory P1 points to
    return 0;
}
Note:
1. The read-only data segment includes the const data defined in the program (for example, const char Ro[]) and the constant data the program uses, such as "123456". For both const char Ro[] and const char *ptrconst, the memory they refer to is in the read-only data area and its content cannot be modified. The difference is that the former does not allow the program to change Ro itself (it is an array name), while the latter allows the program to change the value of ptrconst. To forbid changing the value of ptrconst as well, declare it as follows:
const char *const ptrconst = "const data";
2. The read/write data segment includes initialized global variables such as static char rw1[] and local static variables such as static char rw2[]. The difference between rw1 and rw2 is only a compile-time one: whether the variable can be used inside one function or throughout the file. For rw1, the static modifier controls whether other files of the program can access it: with static, rw1 cannot be used in other C source files. This affects compiling and linking, but rw1 is placed in the read/write data segment with or without static. rw2 is a local static variable and is placed in the read/write data area. Without static its meaning changes completely: it becomes a local variable allocated on the stack rather than a static variable.
3. The uninitialized data segment: bss_1[100] and bss_2[100] in the example are uninitialized data. The difference is that the former is a global variable that can be used in all files, while the latter is a local variable used only inside the function. Uninitialized data has no initial value, so the size of the region must be given as a number, and the compiler adds that length to BSS.
4. The stack space includes the variables used inside the function, such as short B and char A[100], and the value of P1 in char *P1.
1. The memory that P1 points to is on the heap. Heap memory can only be used within the program, but heap memory (for example, the memory P1 points to) can be passed to other functions for processing, for instance as a return value.
2. Stack space is mainly used to store the following three kinds of data:
A. automatic (local) variables in a function
B. function parameters
C. the return value of a function
3. Stack space is mainly used for the automatic variables in a function. Their space is reserved when the function is entered, and it is reclaimed automatically when the function exits.
4. Let's look at an example:
#include <stdio.h>
int main(void)
{
    char *P = "tiger";
    P[1] = 'I'; // writes into the read-only string literal
    P++;
    printf("%s\n", P);
    return 0;
}
The program compiles, but running it typically reports: Segmentation fault
Analysis:
char *P = "tiger"; reserves 4 bytes on the stack for the value of P (on a 32-bit system). The string "tiger" is stored in the read-only data area, so its content cannot be changed. The initialization is an address assignment, so P points into the read-only area, and changing the content P points to (P[1] = 'I') causes a segmentation fault. Because P itself is stored on the stack, the value of P can be changed, so P++ is correct.
V. Use of const
1. Preface:
const is a keyword in C. It specifies that a variable cannot be changed. Using const can improve the robustness of a program. Understanding what const does also helps when reading other people's code.
2. const variables and constants
(1) const variable: its value is stored in the read-only data segment and cannot be changed after initialization. It is called a read-only variable.
The form is const int A = 5; after which A can be used in place of 5.
(2) constant: it is also stored in the read-only data segment, and its value cannot be changed. Examples are "ABC" and 5.
3. What const restricts
Let's first look at an example:
#include <stdio.h>
typedef char *pstr;
int main(void)
{
    char string[6] = "tiger";
    const char *P1 = string;
    const pstr P2 = string;
    P1++;
    P2++;
    printf("P1=%s\nP2=%s\n", P1, P2);
    return 0;
}
Compiling this program fails with the error:
error: increment of read-only variable 'P2'
1> the basic form of const is const char m;
It makes m unchangeable.
2> Replace m in form 1 with *pm: const char *pm;
It makes *pm unchangeable. pm itself can still change, so P1++ is correct.
3> Replace char in form 1 with a new type: const newtype m;
It makes m unchangeable. In the example, pstr is a new type, so P2 itself is unchangeable and P2++ is an error.
4. const and pointers
In a declaration, const marks something as constant. It can be written in two ways:
1> const in front of the type
const int nvalue; // nvalue is const
const char *Pcontent; // *Pcontent is const, Pcontent is variable
const (char *) Pcontent; // Pcontent is const, *Pcontent is variable (notation only; in real code use a typedef such as pstr)
char *const Pcontent; // Pcontent is const, *Pcontent is variable
const char *const Pcontent; // both Pcontent and *Pcontent are const
2> const after the type, equivalent to the declarations above
int const nvalue; // nvalue is const
char const *Pcontent; // *Pcontent is const, Pcontent is variable
(char *) const Pcontent; // Pcontent is const, *Pcontent is variable (notation only)
char *const Pcontent; // Pcontent is const, *Pcontent is variable
char const *const Pcontent; // both Pcontent and *Pcontent are const
Note: const combined with pointers is a common source of confusion in C. The following two rules help:
(1) Draw a line through the * sign. If const is on the left of *, const modifies what the pointer points to, that is, the pointer points to a constant. If const is on the right of *, const modifies the pointer itself, that is, the pointer itself is a constant. Reading the declarations above with this rule makes their meaning clear at a glance.
(2) For const (char *): because char * is treated as a whole, like a single type such as char, const here restricts the pointer itself.