20135302 Wei Quiet: "Computer Systems: A Programmer's Perspective" Chapter 7 study notes

"Computer Systems: A Programmer's Perspective", Chapter 7: Linking

The main contents of this chapter:

    • Linking: static linking and dynamic linking (linking involves two main tasks: symbol resolution and relocation)
    • Symbols: global and local symbols, the symbol table, symbol resolution
    • Creating and referencing libraries: gcc, ar rcs, and the -shared and -fPIC options
    • Relocation: relocation entries, relocating symbol references (PC-relative and absolute references)
    • Object files: relocatable object files (with details of the structure and format of ELF relocatable files), executable object files, shared object files
Linking is the process of collecting and combining various pieces of code and data into a single file that can be loaded (copied) into memory and executed.

Linking can be performed at compile time, when the source code is translated into machine code; at load time, when the program is loaded into memory and executed by the loader; or even at run time, by application programs.
1. Compiler driver

Most compilation systems provide a compiler driver that invokes the language preprocessor, compiler, assembler, and linker as needed on behalf of the user.

The GNU compilation system translates source code in four steps:

    • First, it runs the C preprocessor (cpp), which translates the .c file into a .i file;
    • Next, it runs the C compiler (cc1), which translates the .i file into an ASCII assembly-language .s file;
    • Then, it runs the assembler (as), which translates the .s file into a relocatable object file (.o);
    • Finally, it runs the linker (ld), which combines the .o files to create an executable object file.
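
As a rough illustration of the four steps above, here is a small module whose header comment lists the individual commands that correspond to each stage. The file name sum.c, the function sum, and main.o (some other module that defines main) are assumptions made for this sketch. The gcc options -E, -S, and -c stop the driver after preprocessing, compiling, and assembling respectively.

```c
/*
 * sum.c (hypothetical file name)
 *
 * The driver can be stopped after each stage:
 *   gcc -E sum.c -o sum.i      preprocess only (cpp)
 *   gcc -S sum.i -o sum.s      compile to assembly (cc1)
 *   gcc -c sum.s -o sum.o      assemble to a relocatable object file (as)
 *   gcc main.o sum.o -o prog   link (ld); main.o is assumed to define main()
 */
#include <stddef.h>

/* Return the sum of the n elements of a. */
int sum(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Adding -v to any of these commands makes gcc print the sub-commands it actually runs on a given system.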
2. Static linking

The UNIX static linker ld takes a set of relocatable object files and command-line arguments as input and generates a fully linked executable object file that can be loaded and run. The input relocatable object files consist of various code and data sections: instructions are in one section, initialized global variables in another, and uninitialized variables in yet another.

In order to construct an executable file, the linker must complete two main tasks:

    • Symbol resolution. Object files define and reference symbols. The purpose of symbol resolution is to associate each symbol reference with exactly one symbol definition.
    • Relocation. The compiler and assembler generate code and data sections that start at address 0. The linker relocates these sections by associating a memory location with each symbol definition and then modifying all references to those symbols so that they point to that memory location.
3. Object files

Object files come in three forms:

    • Relocatable object file. Can be combined with other relocatable object files at compile time to create an executable object file.
    • Executable object file. Can be copied directly into memory and executed.
    • Shared object file. A special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run time.

Compilers and assemblers generate relocatable object files (including shared object files). Linkers generate executable object files.
Modern UNIX systems use the Executable and Linkable Format (ELF).

Relocatable object files

A typical ELF relocatable object file contains the following sections:
.text: The machine code of the compiled program.
.rodata: Read-only data, such as the format strings in printf statements.
.data: Initialized global C variables. Local C variables are kept on the stack at run time and appear in neither the .data section nor the .bss section.
.bss: Uninitialized global C variables. This section occupies no actual space in the object file; it is only a placeholder.
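
To make the section list concrete, the sketch below annotates where a typical GCC build on Linux tends to place each object. All names are hypothetical, and the actual placement can vary with the compiler, its version, and its flags. For example, an uninitialized non-static global may be recorded as COMMON rather than .bss in the relocatable file when compiled with -fcommon.

```c
const char greeting[] = "hello";   /* read-only data: .rodata */
int initialized_count = 42;        /* initialized global: .data */
int uninitialized_count;           /* uninitialized global: .bss (or COMMON) */
static int private_total;          /* uninitialized static: .bss */

/* The machine code for this function goes to .text. */
int bump(int step)
{
    int local = step * 2;          /* local variable: stack or register, no section */
    private_total += local;
    initialized_count += 1;
    return private_total;
}
```

Compiling this with gcc -c and inspecting the result with readelf -S or objdump -h shows the sections that were actually produced.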

4. Symbols and symbol tables

Each relocatable object module m has a symbol table that contains information about the symbols defined and referenced by m.

    • In the context of a linker, there are three kinds of symbols:
      • 1. Global symbols that are defined by module m and can be referenced by other modules. Global linker symbols correspond to non-static C functions and to global variables defined without the C static attribute.
      • 2. Global symbols that are referenced by module m but defined by some other module. Such symbols are called externals and correspond to C functions and variables defined in other modules.
      • 3. Local symbols that are defined and referenced exclusively by module m. Some local linker symbols correspond to C functions and global variables defined with the static attribute.
    • Symbol table
      • Each symbol is associated with a section of the object file, indicated by the section field.
      • The section field can also name one of three special pseudo-sections:
        • ABS: symbols that should not be relocated.
        • UNDEF: undefined symbols, that is, symbols referenced in this object module but defined elsewhere.
        • COMMON: uninitialized data objects that have not yet been allocated.
      • In the book's readelf example, Ndx=1 denotes the .text section and Ndx=3 denotes the .data section.
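
The three kinds of linker symbols can be seen together in one small module. In this sketch all names are hypothetical except strlen, which stands in for an external symbol because it is defined in the C library rather than in this module.

```c
#include <string.h>   /* declares strlen, which is defined in another module (libc) */

int call_count = 0;               /* global symbol: visible to other modules */

static int clamp(int v, int max)  /* local symbol: static function */
{
    return v > max ? max : v;
}

static int longest_seen = 0;      /* local symbol: static global variable */

/* Global symbol: non-static function. It references the external strlen. */
int record_length(const char *s)
{
    int len = clamp((int)strlen(s), 100);  /* len is a stack variable: no symbol */
    if (len > longest_seen)
        longest_seen = len;
    call_count++;
    return longest_seen;
}
```

When this is compiled without optimization, nm on the object file would typically show call_count and record_length with uppercase type letters (global), clamp and longest_seen with lowercase letters (local), and strlen as U (undefined in this module).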
5. Symbol resolution
    • Multiply defined global symbols
      • Strong symbols: functions and initialized global variables
      • Weak symbols: uninitialized global variables

      • Rules:

        Rule 1: Multiple strong symbols are not allowed.
        Rule 2: Given one strong symbol and multiple weak symbols, choose the strong symbol.
        Rule 3: Given multiple weak symbols, choose any one of the weak symbols.
    • Linking with static libraries
All compilation systems provide a mechanism for packaging related object modules into a single file called a static library (an archive file on Linux, a .lib file on Windows), which can then be supplied as input to the linker.
      • When the linker builds the output executable, it copies only the object modules in the static library that are referenced by the application.
      • Archive: a collection of concatenated relocatable object files, with a header that describes the size and location of each member object file. Archive file names carry the suffix .a.
      • The -static argument tells the compiler driver that the linker should build a fully linked executable object file that can be loaded into memory and run without any further linking at load time.
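
Returning to the three rules for multiply defined global symbols earlier in this section, the following is a small model of how a linker might apply them to the definitions of one symbol name. The types and the function resolve_symbol are invented for illustration; a real linker works on ELF symbol tables, and for Rule 3 this sketch simply picks the first weak definition it sees.

```c
#include <stddef.h>

enum strength { WEAK, STRONG };

struct definition {
    const char *module;      /* module that defines the symbol */
    enum strength strength;  /* STRONG: function or initialized global */
};

/*
 * Pick one definition among n definitions of the same global symbol.
 * Returns the index of the chosen definition, or -1 if there is none
 * or if Rule 1 is violated (more than one strong definition).
 */
int resolve_symbol(const struct definition *defs, size_t n)
{
    int chosen = -1;
    int strong_seen = 0;

    for (size_t i = 0; i < n; i++) {
        if (defs[i].strength == STRONG) {
            if (strong_seen)
                return -1;       /* Rule 1: multiple strong symbols */
            strong_seen = 1;
            chosen = (int)i;     /* Rule 2: the strong symbol wins */
        } else if (chosen == -1) {
            chosen = (int)i;     /* Rule 3: any weak one; take the first */
        }
    }
    return chosen;
}
```

Given one weak definition in foo.o, one strong definition in bar.o, and another weak one in baz.o, the function selects bar.o; given two strong definitions it reports an error.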
6. Relocation
    • Relocating sections and symbol definitions:
      • The linker merges all sections of the same type into a new aggregate section of that type. It then assigns run-time memory addresses to the new aggregate sections, to each section defined by the input modules, and to each symbol defined by the input modules.
      • When this step is complete, every instruction and global variable in the program has a unique run-time memory address.
    • Relocating symbol references within sections:
      • The linker modifies every symbol reference in the bodies of the code and data sections so that it points to the correct run-time address.
      • To do this, the linker relies on data structures in the relocatable object modules known as relocation entries.
    • Relocating symbol references
      • PC-relative references
      • Absolute references
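
The two kinds of reference differ only in how the patched value is computed. The sketch below follows the general shape of a relocation pass: for each relocation entry it computes a 32-bit value and writes it at the entry's offset within the section. The struct layout and all names are simplifications invented here, not the ELF definitions, and the addresses in the note after the code are example values only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum reloc_type { RELOC_PC32, RELOC_ABS32 };

struct reloc_entry {
    uint64_t offset;       /* offset of the reference within the section */
    enum reloc_type type;  /* PC-relative or absolute */
    uint64_t symbol_addr;  /* run-time address chosen for the symbol */
    int64_t addend;        /* constant added to the symbol address */
};

/* Patch every reference in one section, which will run at section_addr. */
void relocate_section(uint8_t *section, uint64_t section_addr,
                      const struct reloc_entry *relocs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const struct reloc_entry *r = &relocs[i];
        uint64_t target = r->symbol_addr + (uint64_t)r->addend;
        uint32_t value;

        if (r->type == RELOC_PC32) {
            /* Run-time address of the reference itself. */
            uint64_t refaddr = section_addr + r->offset;
            value = (uint32_t)(target - refaddr);
        } else {
            value = (uint32_t)target;
        }
        memcpy(section + r->offset, &value, sizeof value);
    }
}

/* Read back a patched 32-bit reference (helper for inspection). */
uint32_t read_u32(const uint8_t *section, uint64_t offset)
{
    uint32_t value;
    memcpy(&value, section + offset, sizeof value);
    return value;
}
```

For example, with a section placed at 0x4004d0, a PC-relative entry at offset 0xf with addend -4 that targets a symbol at 0x4004e8 is patched to 0x4004e8 - 4 - 0x4004df = 0x5. An absolute entry with addend 0 that targets 0x601018 is patched to 0x601018 itself.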
7. Executable object files and loading
(1) Executable object files
    • A C program starts out as a collection of ASCII text files and is transformed into a single binary file that contains all the information needed to load the program into memory and run it.

    • Segment header table: the executable is organized so that contiguous chunks of the file map to contiguous memory segments; the segment header table describes this mapping.

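(2) Loading executable object files
The loader copies the code and data in the executable object file from disk into memory and then runs the program by jumping to its first instruction, or entry point. This process of copying the program into memory and then running it is known as loading.
Run-time memory image of a UNIX program: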

    • The user stack always starts at the largest legal user address and grows downward, toward lower memory addresses. The segment above the stack is reserved for the code and data of the memory-resident part of the operating system, the kernel.

    • When the loader runs, it creates the memory image described above. Guided by the segment header table in the executable, it copies chunks of the executable into the code and data segments.
    • Next, the loader jumps to the program's entry point, which is always the address of the _start symbol. The startup code at the _start address is defined in the object file crt1.o and is the same for all C programs.

8. Dynamic linking with shared libraries
    • A shared library is an object module that, at run time, can be loaded at an arbitrary memory address and linked with a program in memory. This process is known as dynamic linking and is performed by a program called a dynamic linker.
    • Shared libraries are also referred to as shared objects, and on Unix systems they are typically denoted by the .so suffix. Microsoft operating systems make heavy use of shared libraries, which they refer to as DLLs (dynamic link libraries).
    • Shared libraries are "shared" in two different ways.
      • First, in any given file system, there is exactly one .so file for a particular library. The code and data in this .so file are shared by all of the executable object files that reference the library, as opposed to the contents of static libraries, which are copied and embedded in the executables that reference them.
      • Second, a single copy of the .text section of a shared library in memory can be shared by different running processes.
    • Position-independent code (PIC)
        Library code is compiled so that it can be loaded and executed at any address without being modified by the linker.
      • Users direct the GNU compilation system to generate PIC code with the -fPIC option to gcc.
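
Linking at run time, mentioned at the start of these notes, is available to applications on Linux through the dlopen interface. The following is a minimal sketch, assuming a glibc-based Linux system where the math library is installed as libm.so.6; the wrapper name call_cos_dynamically is hypothetical. On older glibc versions the program may also need to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Load the math library at run time, look up cos, and call it.
 * Returns 0 on success and stores cos(x) in *result; returns -1 on failure.
 */
int call_cos_dynamically(double x, double *result)
{
    /* Assumption: glibc names the shared math library libm.so.6. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL)
        return -1;

    dlerror();  /* clear any stale error state before calling dlsym */
    double (*cos_fn)(double);
    *(void **)(&cos_fn) = dlsym(handle, "cos");
    if (dlerror() != NULL || cos_fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = cos_fn(x);
    dlclose(handle);
    return 0;
}
```

A shared library of one's own would be built along the lines of gcc -shared -fPIC -o libexample.so example.c, where libexample.so and example.c are placeholder names.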
9. Tools for manipulating object files
    • ar: Creates static libraries, and inserts, deletes, lists, and extracts members.
    • readelf: Displays the complete structure of an object file, including all of the information encoded in the ELF header. Subsumes the functionality of size and nm.
    • objdump: The mother of all binary tools. Can display all of the information in an object file. Its most useful function is disassembling the binary instructions in the .text section.
    • ldd: Lists the shared libraries that an executable needs at run time.
    • strings: Lists all of the printable strings contained in an object file.
    • strip: Deletes symbol table information from an object file.
    • nm: Lists the symbols defined in the symbol table of an object file.
    • size: Lists the names and sizes of the sections in an object file.
