The linking process and the compile-and-link process

Source: Internet
Author: User


Note: there is a lot more that could be added here, including the contents of a relocatable object file.

Symbols and symbol tables

Each relocatable object module m has a symbol table that contains information about the symbols defined and referenced by m. In the context of a linker such as ld, there are three kinds of symbols:

Global symbols that are defined by module m and can be referenced by other modules. These correspond to non-static C functions and to global variables defined without the C static attribute.

Global symbols that are referenced by module m but defined in some other module. These are called externals and correspond to C functions and variables defined in other modules.

Local symbols that are defined and referenced only within module m. These correspond to C functions and global variables defined with the static attribute. They are visible anywhere inside module m but cannot be referenced by other modules. The sections of the object file that belong to module m, and the name of the source file that corresponds to it, also get local symbols.
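The three kinds of symbols can be seen side by side in one small module. This is only a sketch: the file name symbols.c and the names call_count, scale, doubled, and scaled_length are made up for illustration, and strlen from the C library stands in for "a symbol defined in another module".

```c
/* symbols.c: a hypothetical module m, used only for illustration. */
#include <string.h>   /* declares strlen, which is defined in libc, not here */

/* Global symbol defined by this module: other modules may reference it. */
int call_count = 0;

/* Local symbol: static makes this variable private to the module. */
static int scale = 2;

/* Local symbol: a static function, also private to the module. */
static int doubled(int n)
{
    return n * scale;
}

/* Global symbol defined by this module: a non-static function. */
int scaled_length(const char *s)
{
    call_count++;
    /* strlen is an external: referenced here, defined in another module. */
    return doubled((int)strlen(s));
}
```

Compiling this without optimization (for example with gcc -c) and listing the symbols with readelf -s or nm should show call_count and scaled_length as global, scale and doubled as local, and strlen as undefined in this module. With optimization enabled the compiler may inline or remove the static symbols, so the listing can differ.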

The symbol table in .symtab does not contain any symbols that correspond to local non-static program variables. These are managed on the stack at run time, and the linker has no interest in them.

Local procedure variables defined with the static attribute are not managed on the stack. Instead, the compiler allocates space in .data or .bss for each definition and creates a local linker symbol with a unique name in the symbol table.
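A minimal sketch of the difference between static and ordinary local variables follows. The function names f, g, and h are illustrative. Both f and g define a static local called x, so the compiler has to give the two objects distinct linker names (something like x.1 and x.2, though the exact names depend on the compiler).

```c
/* Each function has its own static local named x. Neither lives on the
   stack; the compiler gives each one storage in .data or .bss. */
int f(void)
{
    static int x = 0;    /* persists across calls to f */
    return ++x;
}

int g(void)
{
    static int x = 10;   /* a different object from the x in f */
    return ++x;
}

int h(void)
{
    int y = 0;           /* ordinary local: stack or register, no .symtab entry */
    return ++y;
}
```

Calling f repeatedly returns 1, 2, 3, and so on, because its x keeps its value between calls. g counts up from 11 independently of f, and h returns 1 every time because y is created afresh on each call.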

Global variables and functions declared with the static attribute are private to the module. Conversely, any global variable or function declared without the static attribute can be accessed by other modules.

The symbol table is built by the assembler, using the symbols that the compiler exports into the assembly-language .s file.

Each entry in this table contains a name and a value. The value is the address of the symbol: for relocatable modules, it is an offset from the beginning of the section in which the object is defined. The size is the size of the object in bytes. Each symbol is also associated with a section of the object file through the Ndx field, which is an index into the section header table. For example, Ndx 1 may denote the .text section and Ndx 3 the .data section.
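The fields described above can be pictured as a C structure. This is a simplified sketch and not the exact ELF layout: the type SymEntry, its field names, and the helper sym_name are assumptions made for illustration. It does follow ELF in one respect: the name is stored as a byte offset into a separate string table, not as the string itself.

```c
#include <stdint.h>

/* Simplified sketch of a symbol table entry. The field names are
   illustrative and do not match the real ELF definitions. */
typedef struct {
    uint32_t name;       /* byte offset of the name in the string table */
    uint8_t  is_global;  /* 1 for a global symbol, 0 for a local one */
    uint16_t section;    /* section header index (the Ndx column) */
    uint64_t value;      /* offset within the section, for relocatable modules */
    uint64_t size;       /* size of the object in bytes */
} SymEntry;

/* Return the NUL-terminated name of an entry, given the string table. */
const char *sym_name(const SymEntry *e, const char *strtab)
{
    return strtab + e->name;
}
```

With a string table holding "\0main\0buf", an entry whose name field is 6 refers to the string "buf".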

The linker resolves symbol references by associating each reference with exactly one symbol definition from the symbol tables of its input relocatable object files.

Symbol resolution

First, I already have a working picture of the symbol table, that is, the symbol table of a relocatable object file (including the symbol definitions). The first time I used this table it seemed quite novel. Now it has become a powerful tool for learning about symbol resolution.
Why do symbol references need to be resolved? Because a symbol can be used in one part of the code and defined in another, so every use has to be tied to its definition.
How does symbol resolution differ for different kinds of symbols? Resolving references to local symbols, which are defined and referenced in the same module, is straightforward.
What happens when the compiler encounters a global symbol that is not defined in the current module? It assumes that the symbol is defined in some other module, generates a linker symbol table entry, and leaves it for the linker to handle. If the linker cannot find a definition for the referenced symbol in any of its input modules, it prints an error message.
(Note: this case is very simple and comes up often. c.learncodethehardway also covers compiling and linking static and dynamic libraries in some detail, which is worth reading for a deeper understanding.)
Because C++ and Java allow overloading, the compiler uses a technique called name mangling: it encodes each function name together with its parameter list into a unique name for the linker. (It would be interesting to look for more information on this.)

How does the linker work?
What are strong and weak symbols? Functions and initialized global variables are strong symbols. An uninitialized global variable is a weak symbol.
The logic of the linker is not complex.
But in the special case of duplicate names, how does the linker resolve a global symbol that has multiple definitions?
Rules:
1. Multiple strong symbols with the same name are not allowed.
2. If there is one strong symbol and multiple weak symbols, choose the strong symbol. (Working through an example of the output here is very interesting.)
3. If there are multiple weak symbols, choose any one of them. (Duplicate symbols with different types can behave very differently here.)
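The three rules can be written down as a small decision function. This is a toy model for reasoning about the rules, not how ld is implemented, and the names Strength, Definition, and choose_definition are invented for the sketch. Note also that recent GCC releases default to -fno-common, in which case duplicate uninitialized globals are reported as multiple definitions instead of being merged, so the weak-symbol cases may only be observable with -fcommon.

```c
#include <stddef.h>

typedef enum { WEAK, STRONG } Strength;

/* One definition of a given global symbol name, as seen in some input
   module. All names here are illustrative. */
typedef struct {
    const char *module;   /* module that contains the definition */
    Strength strength;
} Definition;

/* Apply the three rules to n definitions of the same symbol name.
   Returns the index of the chosen definition, or -1 when rule 1 is
   violated (more than one strong definition) or when n is 0. */
int choose_definition(const Definition *defs, size_t n)
{
    int chosen = -1;
    size_t strong_count = 0;

    for (size_t i = 0; i < n; i++) {
        if (defs[i].strength == STRONG) {
            strong_count++;
            chosen = (int)i;     /* rule 2: the strong definition wins */
        } else if (chosen == -1) {
            chosen = (int)i;     /* rule 3: any weak one will do; take the first */
        }
    }
    return strong_count > 1 ? -1 : chosen;
}
```

For one strong and two weak definitions the function returns the index of the strong one. For two strong definitions it returns -1, which corresponds to the linker's "multiple definition" error. Rule 3 allows an arbitrary choice, and this sketch simply picks the first weak definition.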

Linking with static libraries (previously written modules)

Note: there are many examples and tips on this topic (these are just some of my own experiences in this area).
Besides static libraries, what alternative approaches are there?
Why do systems support the notion of a library? Standard C defines an extensive collection of standard I/O, string manipulation, and math functions, and so on. These are provided by the C library, libc. Without static libraries, there are several ways to make them available. One is to have the compiler recognize calls to the standard functions and generate the corresponding code directly. The added complexity in the compiler would be a disaster, and there are other drawbacks as well.
What about another approach? Put all the standard C functions in a single relocatable object module that application programmers link into their executables. (Compared with the previous approach, this at least separates the implementation of the compiler from the implementation of the standard functions.) What are the disadvantages? Every executable contains a complete copy of the standard function collection, which wastes disk space. Worse, every running program holds its own copy of these functions in memory, which is a great waste.
Of course, we could also create a separate relocatable file for each standard function and store them in a well-known directory, which solves some of these problems. (However, this requires application programmers to explicitly link the appropriate object modules into their executables.)
What is a static library? What are the benefits?
On Unix systems, static libraries are stored on disk in a particular file format known as an archive. An archive is a collection of concatenated relocatable object files, with a header that describes the size and location of each member object file. Archive file names are marked by the .a suffix.
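As a sketch of what goes into an archive, here are two small routines that would normally live in two separate source files and become two members of one static library. The names addvec, multvec, and libvector.a are illustrative. The two files are shown in one block only to keep the example short.

```c
/* addvec.c: would be compiled to addvec.o, one member of the archive. */
void addvec(const int *x, const int *y, int *z, int n)
{
    for (int i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}

/* multvec.c: would be compiled to multvec.o, another member. */
void multvec(const int *x, const int *y, int *z, int n)
{
    for (int i = 0; i < n; i++)
        z[i] = x[i] * y[i];
}
```

Assuming the two routines are in separate files, gcc -c addvec.c multvec.c followed by ar rcs libvector.a addvec.o multvec.o would build the archive. A program that calls only addvec and is linked against libvector.a gets only addvec.o copied into its executable, which is the main advantage over a single monolithic object module.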

Now comes a very important question: how does the linker use static libraries to resolve references?
...

Relocation
This is the step that follows symbol resolution. The linker merges the input modules and assigns a run-time address to each symbol.
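The address assignment can be sketched in a few lines. This is a deliberately simplified model: it ignores alignment and the patching of references inside code and data, and the types InputSection and Symbol as well as the functions layout and assign are names invented for the illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* A section contributed by one input module (illustrative model). */
typedef struct {
    uint64_t size;      /* bytes this module contributes to the merged section */
    uint64_t base;      /* run-time start address, filled in by layout() */
} InputSection;

/* A symbol defined at some offset inside one input section. */
typedef struct {
    size_t   section;   /* index into the InputSection array */
    uint64_t offset;    /* offset from the start of that input section */
    uint64_t address;   /* run-time address, filled in by assign() */
} Symbol;

/* Merge step: place the input sections back to back, beginning at start. */
void layout(InputSection *secs, size_t n, uint64_t start)
{
    uint64_t next = start;
    for (size_t i = 0; i < n; i++) {
        secs[i].base = next;
        next += secs[i].size;
    }
}

/* Assignment step: run-time address = section base + offset in section. */
void assign(const InputSection *secs, Symbol *syms, size_t n)
{
    for (size_t i = 0; i < n; i++)
        syms[i].address = secs[syms[i].section].base + syms[i].offset;
}
```

For example, take two input sections of 0x20 and 0x10 bytes laid out from an assumed start address of 0x400000. A symbol at offset 0x4 in the second section ends up at 0x400024.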

Executable object files
Our C program started out as a collection of ASCII text files and has been transformed into a single binary file that contains all the information needed to load the program into memory and run it.

Loading executable object files
The loader is the operating system code that runs executable programs.

Dynamic linking with shared libraries

Loading and linking shared libraries from applications

P.S. I have covered this too briefly. When I have time, I will describe symbol resolution and the remaining topics in more detail.

References

CSAPP: Computer Systems: A Programmer's Perspective
http://c.learncodethehardway.org/book/ex28.html
http://c.learncodethehardway.org/book/ex29.html
