Anatomy of a Program's Memory Layout

Source: Internet
Author: User

Original article title: Anatomy of a Program in Memory

Original article address: http://duartes.org/gustavo/blog/

 

[Translator's note: My own expertise is limited, so I pick excellent articles by foreign experts to translate. I review each translation myself before sharing it with you.]

 

Memory management is the heart of an operating system, and it matters a great deal for both programming and system administration. In the next few articles I will focus on practical memory issues without shying away from the technical internals. Although the concepts are general, most examples are taken from Linux and Windows on the 32-bit x86 platform. This first article in the series describes how an application is laid out in memory.

 

Every process in a multitasking operating system runs in its own memory sandbox. This sandbox is the virtual address space, which in 32-bit mode is always a 4 GB block of memory addresses. These virtual addresses are mapped to physical memory through page tables, which are maintained by the operating system kernel and consulted by the processor. Each process has its own set of page tables, but there is a catch: once virtual addressing is enabled, it applies to all software running on the machine, including the kernel itself. Therefore, part of the virtual address space must be reserved for the kernel:

 

 

 

 

 

This does not mean that the kernel uses that much physical memory. It only means that the kernel has that much address space available and can map it to whatever physical memory it needs. Kernel space is flagged in the page tables as exclusive to privileged code (ring 2 or lower), so a page fault is triggered whenever a user-mode program tries to touch these pages. In Linux, kernel space is always present and maps the same physical memory in all processes. Kernel code and data are always addressable, ready to handle interrupts and system calls at any time. By contrast, the mapping of the user-mode portion of the address space changes whenever a process switch happens:

 

 

 

 

Blue regions represent virtual addresses that are mapped to physical memory, and white regions are unmapped. In the example above, Firefox uses a considerable amount of its virtual address space because of its legendary appetite for memory. Each band in the address space corresponds to a different memory segment, such as the heap or the stack. Keep in mind that these segments are simply ranges of memory addresses and have nothing to do with Intel processor segments. In any case, here is the standard segment layout of a Linux process:

 

 

 

Back when computing was happy, safe, and cuddly, the starting virtual address of each segment was exactly the same for nearly every process on a machine. This made it easy to exploit security vulnerabilities remotely. An exploit often needs to reference absolute memory locations, such as an address on the stack or the address of a library function. Remote attackers have to choose these locations blindly, counting on the fact that address spaces are all laid out the same way. When they guess right, someone gets compromised. As a result, address space layout randomization has become popular. Linux randomizes the stack, the memory mapping segment, and the heap by adding offsets to their starting addresses. Unfortunately, the 32-bit address space is quite tight, which leaves little room for randomization and weakens its effectiveness.

 

The topmost segment in the process address space is the stack, which stores local variables and function parameters in most programming languages. Calling a method or function pushes a new stack frame onto the stack, and the frame is destroyed when the function returns. This simple design is possible because the data obeys strict LIFO order, so no complex data structure is needed to track stack contents; a simple pointer to the top of the stack is enough. Pushing and popping are therefore very fast and deterministic. In addition, the constant reuse of stack regions tends to keep active stack memory in the CPU caches, which speeds up access. Every thread in a process has its own stack.

 

It is possible to exhaust the area mapping the stack by pushing more data than it can hold. This triggers a page fault, which Linux handles in expand_stack(), which in turn calls acct_stack_growth() to check whether it is appropriate to grow the stack. If the stack size is below RLIMIT_STACK (usually 8 MB), the stack normally grows and the program continues happily, unaware of what just happened. This is the normal mechanism by which the stack size adjusts to demand. However, if the maximum stack size has been reached, we have a stack overflow and the program receives a segmentation fault. Although the mapped stack area expands to meet demand, it does not shrink back when the stack gets smaller. Like the federal budget, it only grows.

 

Dynamic stack growth is the only situation in which access to an unmapped memory region (the white areas in the figure) may be valid. Any other access to unmapped memory triggers a page fault that results in a segmentation fault. Some mapped areas are read-only, so attempts to write to them also cause segmentation faults.

 

Below the stack is the memory mapping segment. Here the kernel maps the contents of files directly into memory. Any application can request such a mapping through the Linux mmap() system call, or through CreateFileMapping() / MapViewOfFile() on Windows. Memory mapping is a convenient and high-performance way to do file I/O, so it is used to load dynamic libraries. It is also possible to create an anonymous memory mapping that does not correspond to any file and is used for program data instead. In Linux, if you request a large block of memory via malloc(), the C library creates such an anonymous mapping instead of using heap memory. "Large" means larger than MMAP_THRESHOLD bytes, which is 128 KB by default and can be adjusted with mallopt().

 

Speaking of the heap, it comes next in the address space. Like the stack, the heap provides memory allocation at runtime. Unlike the stack, it is meant for data that must outlive the function doing the allocation. Most languages provide heap management to programs, so satisfying memory requests is a joint effort between the language runtime and the kernel. In C, the interface to heap allocation is malloc() and its family of functions, whereas in a garbage-collected language such as C# the interface is the new keyword.

 

If the heap has enough space to satisfy a memory request, the request can be handled by the language runtime without kernel involvement. Otherwise, the heap is enlarged through the brk() system call to make room for the requested block. Heap management is complex and requires sophisticated algorithms that strive for speed and efficient memory usage in the face of our programs' chaotic allocation patterns. The time needed to service a heap request can vary substantially. Real-time systems use special-purpose allocators to deal with this problem. The heap can also become fragmented, as the accompanying figure illustrates.

 

 

 

Finally, we reach the lowest segments of memory: BSS, data, and program text (code). In C, both BSS and data store the contents of static (global) variables. The difference is that BSS stores the contents of uninitialized static variables, whose values are not set by the programmer in source code. The BSS memory area is anonymous: it does not map any file. If you write static int cntActiveUsers, the contents of cntActiveUsers live in the BSS.

 

The data segment, on the other hand, holds the contents of static variables that are initialized in source code. This memory area is not anonymous. It maps the part of the program's binary image that contains the initial static values given in source code. So if you write static int cntWorkerBees = 10, the contents of cntWorkerBees live in the data segment and start out as 10. Even though the data segment maps a file, it is a private memory mapping, which means that updates to memory are not reflected in the underlying file. This has to be the case; otherwise, assigning a value to a global variable would change the binary image on your disk, which is inconceivable.

 

The data segment example in the figure is trickier because it uses a pointer. In that case, the contents of the pointer gonzo (a 4-byte memory address) live in the data segment. The actual string it points to does not. The string lives in the text segment, which is read-only and stores all of your code along with odds and ends such as string literals. The text segment also maps your binary file into memory, but writes to this area earn your program a segmentation fault. This helps prevent pointer bugs, though not as effectively as avoiding C in the first place. The accompanying diagram shows these segments and the example variables.

 

 

 

You can examine the memory areas of a Linux process by reading the file /proc/pid_of_process/maps. Keep in mind that a segment may contain many areas. For example, each memory-mapped file normally has its own area in the mmap segment, and dynamic libraries have extra areas similar to BSS and data. The next article explains what an "area" really means. Also, people sometimes say "data segment" to mean all of data + BSS + heap.

 

You can use the nm and objdump commands to examine binary images and display symbols, their addresses, segments, and other information. Finally, note that the virtual address layout described above is the "flexible" layout in Linux, which has been the default for some years. It assumes that we have a value for RLIMIT_STACK. When that is not the case, Linux falls back to the "classic" layout, shown below:

 

 

 

That concludes our look at the virtual address space layout. The next article discusses how the kernel keeps track of these memory areas. After that, we will look at memory mapping, how file reads and writes tie into it, and what memory usage figures really mean.
