Introduction to memory allocation in C programs


Storage areas of a C program

After compiling and linking, a C program becomes a single file made up of several parts, and several more parts are created when the program runs. Each part corresponds to a different storage area:

Code segment (code or text): The code segment holds the program's machine code. In C, program statements are compiled into machine instructions. While the program runs, the CPU's program counter points to each instruction in this segment in turn, and the processor executes them in order.
Read-only data segment (RO data): Read-only data is data that the program uses but never changes, much like a lookup table. Because it never needs to change, it can be placed in read-only memory.
Initialized read-write data segment (RW data): Initialized data consists of variables that are declared with an initial value. The initial values take up space in the program file, and the variables must sit in a read-write memory area when the program executes, so that they start with those values and can be both read and written at run time.
Uninitialized read-write data segment (BSS): Uninitialized data consists of variables that are declared without an initial value. They do not need to occupy storage space in the file before the program runs.
Heap: Heap memory exists only while the program is running and is normally allocated and released by the programmer. On a system with an operating system, the operating system can reclaim memory that the programmer did not free once the program has finished.
Stack: Stack memory also exists only while the program is running. Variables used inside functions, function parameters, and return values use stack space, which is allocated and released automatically by compiler-generated code.
The memory layout of a C object file is shown in the figure:

[Figure: memory layout of a C object file]

The code segment, read-only data segment, read-write data segment, and uninitialized data segment belong to the static region, while the heap and the stack belong to the dynamic region. The code segment and the read-only and read-write data segments are produced at link time, the uninitialized data segment is set up when the program is initialized, and the heap and the stack are allocated and released as the program runs.

A C program exists in two states: the image and the running program. The image produced by compiling and linking contains only the code segment, the read-only data segment, and the read-write data segment. The uninitialized data segment is created dynamically just before the program runs, and the heap and stack areas are formed dynamically as the program runs.

In general, each part of the static image file is called a section, while each part of the running program is called a segment. Sometimes both are simply called segments.

Sections of a C program

Code segment: The code segment is produced from the program's functions. Every statement in a function is eventually compiled and assembled into binary machine code (the compiler determines which architecture's machine code is generated).
Read-only data segment (RO data): The read-only data segment is produced from data used in the program that never needs to change at run time, so the compiler places it in a read-only section. Read-only global variables, read-only local variables declared static, and constants used in the program (such as string literals) are typically placed in the read-only data area at compile time. Note: defining the global variable const char a[100]={"ABCDEFG"} produces a 100-byte read-only data area initialized with "ABCDEFG". Defining it as const char a[]={"ABCDEFG"} instead produces an 8-byte read-only data area sized from the string (7 characters plus the terminating '\0'). Because read-only data cannot be assigned later, it normally has to be fully initialized where it is defined.
Read-write data segment (RW data): The read-write data segment is the part of the object file that can be both read and written. It is sometimes also called the initialized data segment. Like the read-only data segment it belongs to the program's static area, but it is writable. Initialized global variables and initialized static local variables are normally placed here, for example static char b[100]={"ABCDEFG"} defined inside a function. The defining feature of this area is that its variables are initialized in the program: a variable that is only defined, with no initial value, does not go into the read-write data area and is placed in the uninitialized data area (BSS) instead. Adding the static modifier to a global variable (one defined outside any function) means that it can be used only within its own file and not by other files.
Uninitialized data segment (BSS): Like the read-write data segment, this is a static data area, but the data in it has no initializer. It is therefore only recorded in the object file rather than stored there as a real section, and it is created at run time. Because the uninitialized data segment is only set up (and zero-filled) during the initialization phase of the runtime, its size does not affect the size of the object file.
When using variables in a C program, also note the following points:

Variables defined in a function body are normally on the stack. The program does not need to manage them; the compiler handles them.
Memory obtained from the allocation functions malloc, calloc, and realloc is on the heap. The program must release it with free, otherwise a memory leak occurs.
Variables defined outside all function bodies are global variables. Static variables, whether defined inside or outside a function, are placed in the global (static) area.
Variables with static storage duration that are defined with const are typically placed in the read-only data area of the program. A const local variable without static still lives on the stack.

Use of the segments in a program

The following simple example shows how variables in a C program correspond to segments. The global (static) area of a C program actually corresponds to the following segments: RO data, RW data, and BSS. In general, a global variable defined without an initializer goes in the uninitialized data area (BSS), an initialized one goes in the initialized data area (RW data), and adding const places it in the read-only data area.

    #include <stdlib.h>
    #include <string.h>

    const char ro[] = {"This is read only data"};            // read-only data area
    static char rw_1[] = {"This is global read write data"}; // initialized read-write data segment
    char bss_1[100];                                         // uninitialized data segment (BSS)
    const char *PTRCONST = "constant data";                  // the string is placed in the read-only segment

    int main(void) {
        short b;                // on the stack, occupies 2 bytes
        char a[100];            // 100 bytes on the stack; the value of a is the address of its first element
        char s[] = "ABCDEFG";   // s is an array on the stack, 8 bytes including '\0';
                                // the literal it is copied from may be stored in read-only data
        char *p1;               // p1 on the stack, occupies 4 bytes on a 32-bit target
        char *p2 = "123456";    // p2 on the stack; the data p2 points to must not be modified,
                                // "123456" is in the read-only data area
        static char rw_2[] = {"This is local read write data"}; // local initialized read-write data segment
        static char bss_2[100]; // local uninitialized data segment
        static int c = 0;       // global (static) area; zero-initialized, so many toolchains place it in BSS
        p1 = (char *)malloc(10 * sizeof(char)); // allocate memory from the heap
        if (p1 == NULL) {
            return 1;           // allocation failed
        }
        strcpy(p1, "xxxx");     // "xxxx" is in the read-only data area, 5 bytes
        free(p1);               // release the memory that p1 points to
        return 0;
    }

The read-write data segment contains the initialized global variable static char rw_1[] and the initialized static local variable static char rw_2[]. They differ in scope, which is resolved at compile time: rw_2 can be used only inside its function, while rw_1 can be used throughout the file. rw_1[] is placed in the read-write data area with or without the static modifier; static only decides whether other files can reference it. rw_2[] is different: as a static local variable it is placed in the read-write data area, but without static its meaning changes completely, and it becomes an ordinary local variable allocated on the stack rather than a static variable. Neither rw_1[] nor rw_2[] has a number between its brackets, which means the size of the static area is determined by the length of the string that follows.

The uninitialized data areas bss_1[100] and bss_2[100] differ in that the former is a global variable that can be used from all files, while the latter is a static local variable that can be used only inside its function. Variables in the uninitialized data segment have no initializer, so their size must be given as a number, and the compiler increases the length of the BSS accordingly.

Stack space is mainly used to store three kinds of data:

    1. Automatic (local) variables inside a function
    2. Parameters of the function
    3. The return value of the function

Stack space is allocated and reclaimed dynamically. The deeper the chain of nested function calls, the more stack space is needed. Passing large structures as parameters or return values also makes stack usage correspondingly large.

Comparison between the heap and the stack

1. How memory is requested

Stack: Allocated automatically by the system. For example, when a function declares a local variable int b; the system automatically sets aside space for b on the stack.

Heap: The programmer must request the memory and specify its size, with the malloc function in C or the new operator in C++.

For example, in C:

    p1 = (char *)malloc(10);

and in C++:

    p2 = new char[20];

Note, however, that p1 and p2 themselves are on the stack.

2. System response to a request

Stack: As long as the remaining stack space is larger than the space requested, the system provides the memory; otherwise a stack overflow is reported.

Heap: You should first know that the system keeps a linked list of free memory blocks. When it receives a request from the program, it walks the list to find the first free block that is at least as large as the requested size, removes that block from the free list, and hands its space to the program.

On most systems, the size of the allocation is recorded at the start of the block, so that free in C (or delete in C++) can release the memory correctly.

Because the block that is found is not necessarily exactly the requested size, the system automatically puts the surplus back on the free list.

3. Limits on request size

Stack: On Windows, the stack grows toward lower addresses and is a contiguous region of memory. The address of the top of the stack and the maximum capacity of the stack are fixed in advance. On Windows the stack size is a constant fixed when the program is built (1 MB by default with the Microsoft linker, although 2 MB is also sometimes quoted). If a request exceeds the remaining stack space, the stack overflows. The space available from the stack is therefore relatively small.

Heap: The heap grows toward higher addresses and is a non-contiguous region of memory. This is because the system tracks free memory with a linked list, which is naturally non-contiguous, and the list is traversed from low addresses to high addresses. The size of the heap is limited by the virtual memory available on the system. The heap therefore provides space more flexibly, and in larger amounts.

4. Comparison of allocation efficiency

Stack: Allocated automatically by the system, which is fast, but the programmer has no control over it.

Heap: Memory allocated with malloc or new. It is generally slower and prone to fragmentation, but it is the most convenient to use.

In addition, on Windows memory can be allocated with VirtualAlloc. This memory comes from neither the heap nor the stack; instead a block is reserved directly in the process's address space. It is the least convenient to use, but it is fast and the most flexible.

5. What the stack and the heap store

Stack: When a function is called, the stack holds the return address (the address of the instruction in the caller that follows the call), the function's arguments, which most C compilers push from right to left, and then the function's local variables. The exact order depends on the calling convention; with the common x86 cdecl convention the arguments are pushed before the return address. Note that static variables are not stored on the stack.

When the call finishes, the local variables are released first, and the return address and arguments are removed; execution then continues in the caller at the saved return address.

Heap: The size of a heap block is usually stored in a small header at the start of the block. The rest of the block's contents are arranged by the programmer.

6. Comparison of access efficiency

    char s1[] = "a";
    char *s2 = "b";

The contents of s1 are assigned at run time, while the literal "b" is fixed at compile time. In later accesses, however, the array on the stack is faster than the string that the pointer points to (or, for example, data on the heap).

For example:

    int main(void) {
        char a = 1;
        char c[] = "1234567890";
        char *p = "1234567890";
        a = c[1];
        a = p[1];
        return 0;
    }

The corresponding assembly code:

    a = c[1];
    00401067 8A 4D F1    mov cl,byte ptr [ebp-0Fh]
    0040106A 88 4D FC    mov byte ptr [ebp-4],cl
    a = p[1];
    0040106D 8B 55 EC    mov edx,dword ptr [ebp-14h]
    00401070 8A 42 01    mov al,byte ptr [edx+1]
    00401073 88 45 FC    mov byte ptr [ebp-4],al

The first statement reads the array element directly into register cl, while the second must first load the pointer value into edx and then read the character through edx, which costs an extra memory access.

7. Summary

The main differences between the heap and the stack are:

(1) how they are managed;
(2) the amount of space available;
(3) whether fragmentation occurs;
(4) the direction of growth;
(5) how memory is allocated;
(6) allocation efficiency.

Management: The stack is managed automatically by the compiler, with no manual control needed. For the heap, releasing memory is the programmer's responsibility, which makes memory leaks easy to introduce.

Space: On a 32-bit system the heap can in principle approach the 4 GB address space, so from this point of view heap memory is almost unlimited. The stack, however, has a fixed amount of space; under VC6, for example, the default stack size is 1 MB. This value can be changed.

Fragmentation: On the heap, frequent new/delete inevitably leaves the memory space discontinuous, producing many fragments and lowering the program's efficiency. The stack does not have this problem, because it is a last-in, first-out structure: blocks are released in exactly the reverse order of allocation, so a block can never be removed from the middle of the stack. Everything pushed after it has already been popped before it is popped. See a data structures text for details.

Growth direction: The heap grows upward, toward higher memory addresses; the stack grows downward, toward lower memory addresses.

Allocation: The heap is always allocated dynamically; there is no statically allocated heap. The stack can be allocated in two ways, statically and dynamically. Static allocation is done by the compiler, as with local variables. Dynamic allocation is done with the alloca function, but unlike heap allocation, dynamically allocated stack memory is released automatically when the function returns, with no manual release needed.

Allocation efficiency: The stack is a data structure provided by the machine itself, and the hardware supports it at the lowest level: a dedicated register holds the stack address, and pushes and pops have dedicated instructions, which makes the stack very efficient. The heap is provided by the C/C++ runtime library, and its mechanism is complex. To allocate a block, the library function searches the heap for a large enough free block using some algorithm (see a data structures or operating systems text for specific algorithms). If there is not enough space (possibly because of fragmentation), it may call a system function to enlarge the program's data segment so that it has a chance of obtaining enough memory, and then returns. The heap is clearly much less efficient than the stack.

From this we can see that, compared with the stack, the heap is prone to heavy fragmentation because of frequent new/delete, is less efficient because it has no dedicated hardware support, and can make allocation more expensive because of possible switches between user mode and kernel mode. The stack is therefore the most widely used structure in a program: even function calls are carried out with it, and the parameters, return address, EBP, and local variables of a call are all stored on the stack. For these reasons, prefer the stack to the heap where possible.

Although the stack has so many advantages, it is less flexible than the heap, and when a large amount of memory has to be allocated the heap is still the better choice.

Whether you use the heap or the stack, guard against out-of-bounds accesses (unless you are doing so deliberately), because the result is either a program crash or corruption of the program's heap or stack structures, with unpredictable results.
