Unix System programming () process memory layout

Source: Internet
Author: User

The memory allocated to each process is composed of several parts, usually referred to as segments.

The text segment contains the machine-language instructions of the program that the process is running. The text segment is read-only, so that a process cannot accidentally modify its own instructions through a bad pointer.

Because many processes may run the same program at the same time, the text segment can also be shared, so that a single copy of the program code is mapped into the virtual address space of all of those processes.

The initialized data segment contains global and static variables that are explicitly initialized. The values of these variables are read from the executable file when the program is loaded into memory.

The uninitialized data segment contains global and static variables that are not explicitly initialized. Before the program starts, the system initializes all memory in this segment to 0.

For historical reasons, this segment is often called the BSS segment, a name derived from an old assembler mnemonic for "block started by symbol".

Initialized and uninitialized global and static variables are kept in separate segments mainly because a program stored on disk needs no space for its uninitialized data. Instead, the executable merely records the location and required size of the uninitialized data segment, and the program loader allocates this space at run time.

The stack is a segment that grows and shrinks dynamically and consists of stack frames. The system allocates one stack frame for each function that is currently being called. A frame stores the function's local variables (so-called automatic variables), its arguments, and its return value.

The heap is an area from which memory (for variables) can be dynamically allocated at run time. The top end of the heap is called the program break.

Less commonly used, but more descriptive, names for the initialized and uninitialized data segments are the user-initialized data segment and the zero-initialized data segment.

The size command displays the sizes of the text, initialized data, and uninitialized data (BSS) segments of a binary executable.

The term "segment" as used here should not be confused with the hardware segmentation used on some hardware architectures, such as x86-32. Rather, segments are logical divisions of a process's virtual memory on UNIX systems. Sometimes the term "section" is used instead of "segment", because "section" is more consistent with the terminology of the now widespread ELF specification for executable file formats.

The following program shows various types of C variables, with comments indicating the segment in which each variable is located.

These comments assume a nonoptimizing compiler and an application binary interface (ABI) in which all arguments are passed on the stack. In practice, an optimizing compiler may keep frequently used variables in registers or remove a variable altogether. In addition, some ABIs require function arguments and results to be passed in registers rather than on the stack. Nevertheless, the example serves to show the mapping between C variables and the segments of a process.

An application binary interface (ABI) is a set of rules specifying how a binary executable should exchange information at run time with some service, such as the kernel or a library. Among other things, an ABI specifies which registers and stack locations are used to exchange this information, and what the exchanged values mean. Once compiled for a particular ABI, a binary executable should be able to run on any system that presents the same ABI. By contrast, a standardized API (such as SUSv3) guarantees portability only for applications compiled from source code.

Although not specified in SUSv3, the C program environment on most UNIX implementations (including Linux) provides three global symbols: etext, edata, and end. A program can use these symbols to obtain the address of the next byte past, respectively, the end of the program text, the end of the initialized data segment, and the end of the uninitialized data segment.

To use these symbols, they must be explicitly declared as follows:

extern char etext, edata, end;

The figure shows the arrangement of the various memory segments on the x86-32 architecture. The space labeled argv, environ at the top of the figure holds the program's command-line arguments (available in C through the argv argument of the main() function) and the process environment list (discussed later). The hexadecimal addresses shown in the figure may vary, depending on the kernel configuration and the program's link options.

The grayed-out areas in the figure represent ranges that are not valid in the process's virtual address space, that is, areas for which no page tables have been created.
