"Computer Systems: A Programmer's Perspective", Chapter 7 study notes

Source: Internet
Author: User

Chapter 7: Linking

Name: Wang Wei No.: 20135116

I. About linking

1. Definition

Linking is the process of collecting and combining various pieces of code and data into a single file that can be loaded (copied) into memory and executed. Linking is performed automatically by programs called linkers.

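2. When linking can happen

    • Compile time, when the source code is translated into machine code
    • Load time, when the program is loaded into memory and executed by the loader
    • Run time, by the application program itself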

II. Compiler drivers

A compiler driver invokes the language preprocessor, compiler, assembler, and linker, as needed, on behalf of the user.

III. Static linking

1. Static linkers

  A static linker such as the Unix ld program takes a collection of relocatable object files and command-line arguments as input and generates a fully linked executable object file that can be loaded and run. The input relocatable object files consist of various code and data sections: instructions are in one section, initialized global variables in another, and uninitialized variables in yet another.

2. The two tasks of the linker

    • Symbol resolution: associate each symbol reference with exactly one symbol definition.
    • Relocation: the linker associates a memory location with each symbol definition, and then modifies every reference to that symbol so that it points to this memory location.

IV. Object files

Three forms of object file

    • Relocatable object file (generated by compilers and assemblers)
    • Executable object file (generated by linkers)
    • Shared object file (generated by compilers and assemblers)

V. Relocatable object files

A typical ELF relocatable object file contains the following sections:

  • .text: The machine code of the compiled program.
  • .rodata: Read-only data, such as the format strings in printf statements and the jump tables for switch statements.
  • .data: Initialized global C variables.
  • .bss: Uninitialized global C variables. This section occupies no actual space in the object file; it is merely a placeholder.
  • .symtab: A symbol table with information about functions and global variables that are defined and referenced in the program.
  • .rel.text: A list of locations in the .text section that must be modified when the linker combines this object file with others.
  • .rel.data: Relocation information for any global variables that are referenced or defined by the module.
  • .debug: A debugging symbol table with entries for local variables and type definitions defined in the program, global variables defined and referenced in the program, and the original C source file. It is present only if the compiler driver is invoked with the -g option.
  • .line: A mapping between line numbers in the original C source program and machine code instructions in the .text section.
  • .strtab: A string table for the symbol tables in the .symtab and .debug sections and for the section names in the section headers. A string table is a sequence of null-terminated strings.
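As a rough illustration of the section list above, the sketch below annotates a few C definitions with the section a typical ELF toolchain would place them in. All names are hypothetical, and the exact placement can vary with the compiler and its options; a tool such as readelf or objdump can confirm it for a given build.

```c
#include <stdio.h>

int initialized_global = 42;        /* .data: initialized global variable */
int uninitialized_global;           /* .bss (or COMMON): uninitialized global */
static int static_counter;          /* .bss: uninitialized static variable */
const char greeting[] = "hello";    /* .rodata: read-only data */

/* The machine code of this function goes into .text.
 * The format string literal is typically stored in .rodata. */
int describe_sections(void)
{
    static_counter++;
    uninitialized_global = initialized_global + static_counter;
    printf("%s: %d\n", greeting, uninitialized_global);
    return uninitialized_global;
}
```

Compiling this file with the -c option and inspecting the resulting .o file is one way to see which sections actually appear.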

VI. Symbols and symbol tables

1. In the context of a linker, there are three kinds of symbols for a module m:

    • Global symbols that are defined by module m and can be referenced by other modules. Global linker symbols correspond to nonstatic C functions and to global variables defined without the C static attribute.
    • Global symbols that are referenced by module m but defined by some other module. Such symbols are called externals and correspond to C functions and variables defined in other modules.
    • Local symbols that are defined and referenced exclusively by module m. Some local linker symbols correspond to C functions and global variables defined with the static attribute.
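A minimal sketch of the three kinds of symbols, seen from a single hypothetical module. The function and variable names are made up for illustration; strlen stands in for a symbol defined elsewhere (here, in the C library), and a compiler may choose to inline such a call.

```c
#include <string.h>   /* declares strlen, which is defined in another module */

int call_count = 0;                   /* global symbol: defined here, visible to other modules */

static int scale = 2;                 /* local symbol: static variable, visible only in this file */

static size_t doubled(size_t n)       /* local symbol: static function */
{
    return n * (size_t)scale;
}

size_t doubled_length(const char *s)  /* global symbol: nonstatic function */
{
    int attempts = 1;                 /* not a linker symbol: managed on the stack at run time */
    call_count += attempts;
    return doubled(strlen(s));        /* strlen: external symbol, resolved by the linker */
}
```

Local nonstatic variables such as attempts do not appear in the symbol table at all, which is why they are not one of the three kinds listed above.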

2. Symbol table

VII. Symbol resolution

1. How the linker resolves multiply defined global symbols

(1) Strong symbols: functions and initialized global variables

(2) Weak symbols: uninitialized global variables

(3) Rules:

    • Multiple strong symbols with the same name are not allowed.
    • Given a strong symbol and multiple weak symbols, choose the strong symbol.
    • Given multiple weak symbols, choose any one of them.
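The three rules can be modeled in a few lines of C. This is only a simulation under simplifying assumptions, not how a real linker is implemented: each array element stands for one definition of the same symbol name, and the struct, enum, and function names are hypothetical.

```c
#include <stddef.h>

enum strength { WEAK, STRONG };

/* One definition of a given symbol name, as seen in one input module. */
struct definition {
    const char *module;       /* hypothetical module name, e.g. "foo.o" */
    enum strength strength;
};

/* Returns the index of the definition chosen for the symbol, or -1 if
 * the name has more than one strong definition (rule 1) or none at all. */
int resolve(const struct definition *defs, size_t n)
{
    int chosen = -1;
    size_t strong_count = 0;

    for (size_t i = 0; i < n; i++) {
        if (defs[i].strength == STRONG) {
            strong_count++;
            chosen = (int)i;          /* rule 2: a strong symbol wins */
        } else if (chosen == -1) {
            chosen = (int)i;          /* rule 3: any weak symbol will do; take the first */
        }
    }
    return strong_count > 1 ? -1 : chosen;
}
```

Taking the first weak definition is an arbitrary choice made by this sketch; rule 3 allows any of them to be picked.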

2. Linking with static libraries

All compilation systems provide a mechanism for packaging related object modules into a single file called a static library, which can then be supplied as input to the linker. On Linux, static libraries are stored as archive files (.a); on Windows they are .lib files.

During the symbol resolution phase, the linker scans the relocatable object files and archives from left to right, in the same order that they appear on the compiler driver's command line. (The driver automatically translates any .c files on the command line into .o files.) During this scan, the linker maintains a set E of relocatable object files that will be merged to form the executable, a set U of unresolved symbols (symbols referenced but not yet defined), and a set D of symbols that have been defined in previous input files. Initially, E, U, and D are all empty.
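The left-to-right scan can be simulated with bitmasks, one bit per symbol. The notes above only describe the initial state of E, U, and D, so the remaining steps in this sketch follow the usual textbook description and should be read as an assumption: an object file is always added to E, while an archive member is added only if it defines a symbol currently in U. All type, function, and symbol names are hypothetical, and archives are limited to 32 members here.

```c
#include <stddef.h>

/* Each symbol is one bit in a mask. */
enum { SYM_MAIN = 1 << 0, SYM_ADDVEC = 1 << 1, SYM_MULTVEC = 1 << 2 };

struct object {              /* a relocatable object file or an archive member */
    unsigned defs;           /* symbols it defines */
    unsigned refs;           /* symbols it references */
};

struct input {               /* one command-line input */
    int is_archive;          /* 0: single object file, 1: archive */
    const struct object *members;
    size_t count;
};

/* Scans the inputs left to right and returns the set U of symbols that
 * are still unresolved at the end (0 means the link would succeed). */
unsigned scan_inputs(const struct input *inputs, size_t n)
{
    unsigned U = 0, D = 0;   /* E is implicit: every object "added" below */

    for (size_t i = 0; i < n; i++) {
        unsigned taken = 0;  /* members of this input already added to E */
        int changed;
        do {
            changed = 0;
            for (size_t m = 0; m < inputs[i].count; m++) {
                const struct object *o = &inputs[i].members[m];
                if (taken & (1u << m))
                    continue;
                /* Archive members are skipped unless they define
                 * a symbol that is currently unresolved. */
                if (inputs[i].is_archive && !(o->defs & U))
                    continue;
                taken |= 1u << m;
                D |= o->defs;
                U = (U | o->refs) & ~D;
                changed = 1;
            }
        } while (changed && inputs[i].is_archive);  /* rescan archives until nothing changes */
    }
    return U;
}
```

With an object file that references SYM_ADDVEC followed by an archive that defines it, the function returns 0. With the order reversed it returns SYM_ADDVEC, because U is still empty when the archive is scanned, which is why libraries are usually placed at the end of the command line.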

VII. Relocation

1. The two steps of relocation

(1) Relocating sections and symbol definitions

    • The linker merges all sections of the same type into a new aggregate section of the same type. It then assigns run-time memory addresses to the new aggregate sections, to each section defined by the input modules, and to each symbol defined by the input modules.
    • When this step is complete, every instruction and global variable in the program has a unique run-time memory address.

(2) Relocating symbol references within sections

    • The linker modifies every symbol reference in the code and data sections so that it points to the correct run-time address.
    • To perform this step, the linker relies on data structures in the relocatable object modules known as relocation entries.

2. Relocation entries

(1) Whenever the assembler encounters a reference to an object whose ultimate location is unknown, it generates a relocation entry that tells the linker how to modify the reference when it merges the object file into an executable.

(2) Relocation entries for code are placed in .rel.text.

(3) Relocation entries for initialized data are placed in .rel.data.

(4) ELF defines 11 different relocation types. The two most basic ones are:

    • R_386_PC32: relocate a reference that uses a 32-bit PC-relative address.
    • R_386_32: relocate a reference that uses a 32-bit absolute address.

Each relocation entry has three fields:

    • Offset: the section offset of the reference that needs to be modified
    • Symbol: identifies the symbol that the modified reference should point to
    • Type: tells the linker how to modify the reference

3. Relocating symbol references

(1) Relative references

(2) Absolute references
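A minimal sketch of how the two relocation types could be applied to a 32-bit reference. The struct is a simplified stand-in for a relocation entry with the three fields listed earlier, not the exact ELF layout, and the function name and parameters are hypothetical. The numeric type codes are the ones used by 32-bit x86 ELF.

```c
#include <stdint.h>

/* Relocation type codes for 32-bit x86 ELF. */
enum { R_386_32 = 1, R_386_PC32 = 2 };

/* Simplified relocation entry. */
struct reloc_entry {
    uint32_t offset;   /* offset of the reference within its section */
    uint32_t symbol;   /* index of the symbol the reference should point to */
    uint8_t  type;     /* how to modify the reference */
};

/* Computes the new value of a 32-bit reference.
 * section_addr: run-time address assigned to the section
 * symbol_addr:  run-time address assigned to the target symbol
 * current:      value currently stored at the reference */
uint32_t relocate(const struct reloc_entry *r, uint32_t section_addr,
                  uint32_t symbol_addr, uint32_t current)
{
    if (r->type == R_386_PC32) {
        uint32_t ref_addr = section_addr + r->offset;  /* run-time address of the reference */
        return symbol_addr + current - ref_addr;       /* relative reference */
    }
    return symbol_addr + current;                      /* R_386_32: absolute reference */
}
```

As an illustration with made-up addresses: for a section at 0x80483b4, a reference at offset 0x7 that currently holds -4, and a target symbol at 0x80483c8, the relative result is 0x9. The initial -4 accounts for the program counter already pointing past the 4-byte reference when the instruction executes. For an absolute reference that currently holds 0, the result is simply the symbol's address.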

VIII. Executable object files

IX. Loading executable object files

The loader copies the code and data in the executable object file from disk into memory, and then runs the program by jumping to its first instruction, or entry point. This process of copying the program into memory and then running it is known as loading.

    • On 32-bit Linux systems, the code segment always starts at address 0x08048000.
    • The data segment starts at the next 4 KB aligned address.
    • The run-time heap starts at the first 4 KB aligned address past the read/write segment and grows upward through calls to the malloc library.
    • One segment is reserved for shared libraries.
    • The user stack always starts at the largest legal user address and grows downward (toward lower memory addresses). The segment above the stack is reserved for the code and data of the memory-resident part of the operating system (the kernel).
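One way to get a feel for this layout is to print a representative address from each region. The sketch below is only an illustration with hypothetical names; on modern 64-bit systems with address space randomization the numbers will differ from the 32-bit values above and change from run to run, although code, data, heap, and stack usually still appear in that relative order.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int layout_global = 1;   /* initialized global: read/write (data) segment */

/* Prints one representative address from each region.
 * Returns 0 on success, -1 if the heap allocation fails. */
int print_layout(void)
{
    int on_stack = 0;                 /* local variable: user stack */
    void *on_heap = malloc(16);       /* malloc: run-time heap */
    if (on_heap == NULL)
        return -1;

    printf("code : 0x%" PRIxPTR "\n", (uintptr_t)&print_layout);
    printf("data : 0x%" PRIxPTR "\n", (uintptr_t)&layout_global);
    printf("heap : 0x%" PRIxPTR "\n", (uintptr_t)on_heap);
    printf("stack: 0x%" PRIxPTR "\n", (uintptr_t)&on_stack);

    free(on_heap);
    return 0;
}
```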

X. Dynamic linking with shared libraries

1. Disadvantages of static libraries:

    • When a static library is updated, every program that uses it must be explicitly relinked against the updated library.
    • Programs that use a static library copy the object modules they reference into the executable at link time. As a result, the code for functions such as printf and scanf is duplicated in the text segment of every running process, which wastes scarce memory.

2. Shared libraries

    • A shared library is an object module that, at run time, can be loaded at an arbitrary memory address and linked with a program in memory. This process is known as dynamic linking and is performed by a program called a dynamic linker.
    • Shared libraries are also referred to as shared objects, and on Unix systems they are typically denoted by the .so suffix. Microsoft operating systems make heavy use of shared libraries, which they refer to as DLLs (dynamic link libraries). Windows distinguishes two ways of linking against a DLL: "implicit linking" at load time and "explicit linking" at run time.
    • Shared libraries are "shared" in two different ways.
    • First, in any given file system, there is exactly one .so file for a particular library. The code and data in this .so file are shared by all of the executable object files that reference the library, whereas the contents of a static library are copied and embedded in the executables that reference them. Second, a single copy of the .text section of a shared library in memory can be shared by different running processes.

XI. Loading and linking shared libraries from applications

#include <dlfcn.h>

void *dlopen(const char *filename, int flag); // returns a pointer to a handle on success, NULL on error

void *dlsym(void *handle, const char *symbol); // returns a pointer to the symbol on success, NULL on error

int dlclose(void *handle); // returns 0 on success, nonzero on error

const char *dlerror(void); // returns a message describing the most recent error from dlopen, dlsym, or dlclose, or NULL if there was none
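A minimal usage sketch of the four functions, loading the math library at run time and calling cos through a pointer. The library name "libm.so.6" is an assumption that holds on typical glibc-based Linux systems, and the wrapper function name is hypothetical. On older glibc versions the program must also be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Calls cos(x) through a pointer obtained at run time.
 * Returns 0 on success, -1 on failure. */
int dynamic_cos(double x, double *result)
{
    /* "libm.so.6" is an assumed library name (glibc-based Linux). */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "%s\n", dlerror());
        return -1;
    }

    dlerror();  /* clear any stale error before calling dlsym */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    const char *error = dlerror();
    if (error != NULL) {
        fprintf(stderr, "%s\n", error);
        dlclose(handle);
        return -1;
    }

    *result = cosine(x);

    if (dlclose(handle) != 0) {
        fprintf(stderr, "%s\n", dlerror());
        return -1;
    }
    return 0;
}
```

Checking dlerror after dlsym, rather than testing the returned pointer against NULL, is the pattern the man page suggests, because a symbol's value can legitimately be NULL.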

XII. Position-independent code (PIC)

  • Library code is compiled so that it can be loaded and executed at any address without being modified by the linker. Such code is known as position-independent code (PIC).
  • Users direct the GNU compilation system to generate PIC code with the -fPIC option to GCC.
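As a small sketch of how the -fPIC option is typically used, the hypothetical library source below could be built into a shared library with the commands shown in its header comment. The file and library names are assumptions chosen for illustration.

```c
/* vector.c: hypothetical library source.
 *
 * Possible build commands with GCC:
 *   gcc -shared -fPIC -o libvector.so vector.c    (build the shared library)
 *   gcc -o prog main.c ./libvector.so             (link a program against it)
 */

/* Adds two n-element vectors x and y element by element into z. */
void addvec(const int *x, const int *y, int *z, int n)
{
    for (int i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}
```

The source itself contains nothing PIC-specific; position independence is a property of the machine code the compiler generates when the option is given.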

XIII. Tools for manipulating object files

XIV. Summary

  Linking can be performed at compile time by static linkers, and at load time and run time by dynamic linkers. Linkers manipulate binary files called object files, which come in three forms: relocatable, executable, and shared. Relocatable object files are combined by a static linker into an executable object file that can be loaded into memory and executed. Shared object files (shared libraries) are linked and loaded by a dynamic linker at run time, either implicitly when the calling program is loaded and begins executing, or on demand when the program calls functions from the dlopen library.
