Deep Understanding of Computer Systems (3): Linking [with diagrams]

Source: Internet
Author: User
Overview

● This chapter mainly describes the structure of ELF files.

● The concept of a static library.

● The concept of a dynamic library (also called a shared library). Dynamic libraries are mostly used by operating systems and matter less for ordinary applications.

● Program loading process.

The book's explanation of linking is not detailed enough. At the end of the chapter, the author admits that linking is not well documented in the computer systems literature. Linking sits at the intersection of compilers, computer architecture, and operating systems, so it requires an understanding of code generation, machine-language programming, program instantiation, and virtual memory. It does not fall neatly into any of the usual computer systems specialties.

This chapter describes Linux x86 systems that use the standard ELF object file format. The details may vary on other systems, but the concepts are the same.

Even after reading this chapter, I still find the concept of a "symbol" rather vague.

7.1 Compiler drivers

Let's start with the compilation system. Most compilation systems provide a compiler driver that invokes the language preprocessor, compiler, assembler, and linker as needed. I drew a structure diagram of this myself.

7.2 Static linking

7.3 Object files

There are three kinds of object files: relocatable object files, executable object files, and shared object files (that is, dynamic link libraries). Object file formats differ from system to system: early Unix used the a.out format, Windows NT uses PE (Portable Executable), and modern Unix uses ELF (Executable and Linkable Format).

The following describes in detail the relocatable object file, shown in the leftmost part of the figure. The figure covers three stages: the relocatable object file, the executable file generated from it, and that file mapped into memory.

The ELF header describes the word size and byte order of the system that generated the file. Each part between the ELF header and the section header table is called a section:

● .text: the machine code of the compiled program.

● .rodata: read-only data, such as the format strings in printf statements.

● .data: initialized global C variables. Local variables are kept on the stack at run time, so they appear in neither the .data section nor the .bss section.

● .bss: uninitialized global C variables. This section occupies no actual space in the object file; it is only a placeholder, so uninitialized variables take up no disk space. (C++ de-emphasizes the .bss section, or may not use it at all.)

● .symtab: a symbol table that stores "information about functions and global variables defined and referenced in the program".

● .rel.text: a list of locations in the .text section that must be modified when the linker combines this object file with others (kept for later relocation).

● .rel.data: relocation information for any global variables referenced or defined by the module.

● .debug: a debugging symbol table whose entries describe the local variables and type definitions in the program.

● .line: a mapping between line numbers in the C source program and machine instructions in the .text section.

● .strtab: a string table.
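To make the section descriptions above concrete, here is a small sketch of where typical C declarations usually end up. The names (counter, scratch, calls, bump) are made up for illustration, and the placements in the comments are what common compilers typically do, not a guarantee.

```c
#include <stdio.h>

int counter = 7;      /* initialized global: typically placed in .data */
int scratch[1024];    /* uninitialized global: typically .bss, no space on disk */
static int calls;     /* uninitialized static: typically .bss as well */

/* Adds step to the global counter and reports the new value. */
int bump(int step)
{
    /* Local variable: lives on the stack (or in a register) at run time,
     * so it appears in neither .data nor .bss. */
    int local = counter + step;

    calls++;
    /* The format string literal below typically lands in .rodata. */
    printf("after %d call(s): %d\n", calls, local);
    counter = local;
    return local;
}
```

Compiling a file like this to a relocatable object file and inspecting it with a tool such as readelf or objdump is a good way to check these placements on your own system.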

Once you can place each item in this object file structure, you have a solid understanding of the code segment, data segment, BSS segment, and symbol table.
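The ELF header mentioned above starts with a 16-byte identification array. As a minimal sketch, the function below reads the word size and byte order from those bytes. The function name elf_word_size and the enum names are my own; the byte offsets and values follow the ELF format (magic bytes 0x7f 'E' 'L' 'F', class at byte 4, data encoding at byte 5).

```c
#include <stddef.h>
#include <string.h>

/* Offsets within the 16-byte identification array at the start of an ELF file. */
enum { ID_CLASS = 4, ID_DATA = 5, ID_SIZE = 16 };

/* Returns the word size in bits (32 or 64) recorded in the header, or 0 if
 * buf does not look like the start of an ELF file. On success,
 * *little_endian is set to 1 for little-endian and 0 for big-endian. */
int elf_word_size(const unsigned char *buf, size_t len, int *little_endian)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    int bits;

    if (len < ID_SIZE || memcmp(buf, magic, sizeof magic) != 0)
        return 0;

    if (buf[ID_CLASS] == 1)        /* 1 means 32-bit objects */
        bits = 32;
    else if (buf[ID_CLASS] == 2)   /* 2 means 64-bit objects */
        bits = 64;
    else
        return 0;

    if (buf[ID_DATA] != 1 && buf[ID_DATA] != 2)
        return 0;                  /* 1 = little-endian, 2 = big-endian */

    *little_endian = (buf[ID_DATA] == 1);
    return bits;
}
```

To try it on a real file, read the first 16 bytes of any object file or executable into a buffer and pass it in.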

7.4 Relocatable object files: see 7.3

7.5 Symbols and symbol tables

A symbol table is an array of structures of the following form:

typedef struct {
    int name;        /* String table offset */
    int value;       /* Section offset, or VM address */
    int size;        /* Object size in bytes */
    char type:4,     /* Data, func, section, or src file name (4 bits) */
         binding:4;  /* Local or global (4 bits) */
    char reserved;   /* Unused */
    char section;    /* Section header index, ABS, UNDEF */
} Elf_Symbol;
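The name field is not a string but an offset into the string table (.strtab). As a rough sketch of how the two tables work together, the hypothetical helper find_symbol below looks up a symbol by name. It repeats the struct from above (with unsigned char bit-fields) so that it compiles on its own.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    int name;                  /* String table offset */
    int value;                 /* Section offset, or VM address */
    int size;                  /* Object size in bytes */
    unsigned char type:4,      /* Data, func, section, or src file name */
                  binding:4;   /* Local or global */
    char reserved;             /* Unused */
    char section;              /* Section header index, ABS, UNDEF */
} Elf_Symbol;

/* Scans the symbol table for an entry whose name, read from the string
 * table at offset symtab[i].name, equals wanted. Returns a pointer to the
 * entry, or NULL if no such symbol exists. */
const Elf_Symbol *find_symbol(const Elf_Symbol *symtab, size_t count,
                              const char *strtab, const char *wanted)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(strtab + symtab[i].name, wanted) == 0)
            return &symtab[i];
    }
    return NULL;
}
```

Storing offsets instead of strings keeps every symbol table entry the same size, which is what lets the table be a plain array.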

7.6 Symbol resolution

The compiler allows only one definition of each local symbol per module. Resolving global symbols is harder, because several object files may define the same symbol. C++ and Java use name mangling to support overloading.

As an example of a multiply defined global symbol, consider the following program:

/* foo.c */
#include <stdio.h>
void f(void);
int x = 15213;
int main()
{
    f();
    printf("x=%d\n", x);
    return 0;
}

/* bar.c */
int x;
void f()
{
    x = 15212;
}

As you might guess, the output is x=15212. The global variable x in bar.c is not initialized, so the function f ends up writing to the x defined in foo.c. (Note that GCC 10 and later default to -fno-common, under which these two files fail to link with a multiple-definition error unless -fcommon is passed.)

Unix linkers use the following rules to handle multiply defined symbols:

● Rule 1: Multiple strong symbols with the same name are not allowed.

● Rule 2: Given one strong symbol and multiple weak symbols, choose the strong symbol. (This is the answer to the question above: the initialized int x = 15213 is a strong symbol, and int x; is a weak symbol.)

● Rule 3: Given multiple weak symbols, choose any one of them (what a terrible situation).
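The three rules above can be sketched as a small function. This is only an illustration of the decision logic, not how a real linker is written; the names Strength and resolve are made up, and picking the first weak definition for Rule 3 is my own assumption, since the rule allows any of them.

```c
typedef enum { WEAK, STRONG } Strength;

/* defs[i] is the strength of the i-th definition of one symbol name.
 * Returns the index of the definition the linker would keep, or -1 if
 * there are no definitions or more than one strong definition. */
int resolve(const Strength *defs, int n)
{
    int chosen = -1;
    int seen_strong = 0;

    for (int i = 0; i < n; i++) {
        if (defs[i] == STRONG) {
            if (seen_strong)
                return -1;      /* Rule 1: multiple strong symbols are an error */
            seen_strong = 1;
            chosen = i;         /* Rule 2: a strong symbol beats any weak one */
        } else if (chosen < 0) {
            chosen = i;         /* Rule 3: any weak one will do; keep the first */
        }
    }
    return chosen;
}
```

In the foo.c and bar.c example, the definitions are {STRONG, WEAK}, so the definition in foo.c wins.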

Static libraries

A set of relocatable object files written in advance is packaged into a single file, which can then be used as input to the linker. When the linker builds the output executable, it copies only those object modules in the static library that the application actually references. (Dynamic link libraries, also known as shared libraries, are explained later.)

On Unix systems, static libraries are stored on disk in a special file format called an archive. An archive is a collection of concatenated relocatable object files, with a header that describes the size and location of each member object file. Archive file names end with the .a suffix. Can the structure of a .a file be understood like this? (My own drawing.)
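As a rough sketch of the archive structure described above: in the common Unix ar format, the file starts with the 8-byte magic string "!<arch>\n", and each member is preceded by a 60-byte text header whose size field is 10 ASCII decimal characters at offset 48, followed by a 2-byte terminator. The function name ar_member_size and the enum names are my own.

```c
#include <stdlib.h>
#include <string.h>

/* Layout of the 60-byte member header in the common Unix ar format. */
enum { AR_HDR_LEN = 60, AR_SIZE_OFF = 48, AR_SIZE_LEN = 10, AR_FMAG_OFF = 58 };

/* hdr points to a 60-byte member header. Returns the member's size in
 * bytes, or -1 if the header does not end with the expected terminator. */
long ar_member_size(const char *hdr)
{
    char field[AR_SIZE_LEN + 1];

    if (hdr[AR_FMAG_OFF] != '`' || hdr[AR_FMAG_OFF + 1] != '\n')
        return -1;

    /* The size is space-padded decimal text, so copy it out and
     * NUL-terminate it before converting. */
    memcpy(field, hdr + AR_SIZE_OFF, AR_SIZE_LEN);
    field[AR_SIZE_LEN] = '\0';
    return strtol(field, NULL, 10);
}
```

A reader that walks the whole archive would use this size to skip from one member header to the next.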

The following shows the process of linking with a static library:

7.7 Relocation

7.8 Executable object files

See the middle part of the figure in section 7.3. An executable file is very similar to a relocatable object file. The differences are that the executable adds an .init section and a segment header table, and no longer has the .rel.text and .rel.data sections, because it is fully linked and needs no relocation information.

7.9 Loading executable files

In the figure in section 7.3, the rightmost part is the memory image of the loaded program. ELF executables are designed to be easy to load into memory. On 32-bit Linux x86 systems, the code segment of a program always starts at address 0x08048000 (virtual memory is what makes this possible). The data segment starts at the next 4 KB-aligned address. The run-time heap starts at the first 4 KB-aligned address after the read/write segment and grows upward through calls to the malloc library. The stack always grows downward.
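The phrase "next 4 KB-aligned address" is simple arithmetic. Here is a minimal sketch; the function name align_up is my own, and it assumes the alignment is a power of two.

```c
#include <assert.h>

/* Rounds addr up to the next multiple of alignment.
 * alignment must be a power of two, for example 4096 for 4 KB. */
unsigned long align_up(unsigned long addr, unsigned long alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

For example, if a segment ended at 0x08048123, the next 4 KB-aligned address would be 0x08049000. An address that is already aligned is returned unchanged.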

7.10 Dynamic libraries (shared libraries)

Dynamic libraries were designed to address two drawbacks of static libraries: 1) when a static library is updated, every program that uses it must obtain the new library and be relinked; 2) different programs may use the same static library, so the same library code is loaded into memory many times, once for each running program.

Shared libraries are a modern innovation that addresses these drawbacks. A shared library is an object module that can be loaded at any memory address at run time and linked with a program in memory. This process is called dynamic linking and is performed by a program called the dynamic linker.

A shared library is shared in two ways: 1) all programs that reference the library share the code and data in a single .so file, instead of each getting its own copy as with a static library; 2) in memory, a single copy of the library's .text section can be shared by different running processes, which saves valuable memory. (On Unix, dynamic libraries use the .so suffix.)

It is important to understand that dynamic library and shared library mean the same thing. Dynamic libraries are mostly favored by large software packages and operating systems. Ordinary applications rarely have libraries that others will use; most of their code is for their own use, so static libraries are enough.

7.11 Loading and linking shared libraries from applications

An application can also load and link arbitrary shared libraries while it is running, without having linked those libraries at compile time (this is awesome)!

Most software updates on Windows rely on this technique. It is also used to build high-performance web servers.

Linux provides a simple interface to the dynamic linker:

#include <dlfcn.h>

void *dlopen(const char *filename, int flag);    // load and link a shared library; returns a handle
void *dlsym(void *handle, const char *symbol);   // given a library handle and a symbol name, return the symbol's address
int dlclose(void *handle);                       // unload the shared library
char *dlerror(void);                             // describe the most recent error
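Here is a minimal sketch of how these four calls fit together. It loads the C math library at run time and calls its cos function. The function name cos_via_dlopen is made up, and the library name "libm.so.6" is an assumption that holds on typical glibc-based Linux systems but may differ elsewhere.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Loads the math library at run time and calls cos(x) through it.
 * Returns 0 and stores the result in *out on success, -1 on failure.
 * "libm.so.6" is the usual glibc name on Linux (an assumption). */
int cos_via_dlopen(double x, double *out)
{
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL)
        return -1;

    dlerror();  /* clear any stale error state before calling dlsym */
    double (*fn)(double) = (double (*)(double))dlsym(handle, "cos");
    if (dlerror() != NULL || fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = fn(x);       /* call the function found at run time */
    dlclose(handle);    /* unload the library once we are done */
    return 0;
}
```

On older glibc versions the program has to be linked with -ldl. Note that nothing here links against the math library at compile time; the symbol is found purely by name while the program runs.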

Java defines a standard calling convention called the Java Native Interface (JNI), which allows Java programs to call native C and C++ functions. The basic idea of JNI is to compile a native C function, say foo, into a shared library, say foo.so. When a running Java program tries to call foo, the Java interpreter uses the dlopen interface (or something similar) to dynamically load and link foo.so, and then calls foo.

7.12 Position-independent code (PIC)

7.13 Tools for manipulating object files
