GCC and OBJ files, dynamically linked files, and ELF files

Source: Internet
Author: User

1. OBJ file
The end product of writing a program is a binary file. If we declare a variable char c, the declaration requires 8 bits of space, so we need to tell the system to reserve those 8 bits. How is this done? By compiling the source into a special binary file, the OBJ file. The executable that GCC produces from C code contains not only CPU instructions but also a lot of other information, and it comes in several formats such as COFF and ELF. In the last step of compilation, the linker ld writes a pile of information into the executable file. For example, suppose compilation produces several relocatable .o files to link. The variable and function names in each of these files only have positions relative to their own .o file, so the files carry information that tells the link editor how to modify section contents in order to relocate them, that is, to re-resolve the addresses and combine everything into a new executable file.

An OBJ file has two important phases. One is link time, when it sits on disk; the other is execution time, when it is in memory. What we usually call the ld linker is really a link editor. In the final compilation step, ld writes information into the executable file. With static linking, it goes to the libxxx.a library files, copies the required pieces of program code into the executable, relocates them, and writes the resolved jump and call references into the executable. The file can then be executed.

2. Dynamically linked files
Static linking copies the library code into the executable file; dynamic linking does not. Instead, the link editor writes some information into the executable file, such as the names of the libraries and functions it needs. At execution time, the dynamic linker is invoked as the program interpreter. It uses the recorded library names to locate the libraries and the recorded function names to build the executable image in memory. A dynamically linked executable is therefore run by the OS exec family of system calls working together with a dynamic linker such as ld.so.

The dynamic linker usually does the following work:
(1) Load the contents of the executable file into the process image
(2) Load the required shared OBJ files into the process image
(3) Perform the relocations

The virtual addresses in these OBJ files are originally just offsets, while an executable file usually starts at 0x08048000. This is an absolute virtual address, and it only applies to executable files. For example, a Linux executable file usually looks like this:
File offset    Virtual address
------------------------------
0x0            0x08048000
0x100          0x08048100

The program code in a shared OBJ library must be position independent code (PIC), which means that its address may differ from process to process. For example, if a program only uses libc.so and ld-linux.so, libc.so usually starts at 0x40017000. If another program additionally uses libm.so, libc.so may start at 0x40034000 instead, so the printf references in the two programs resolve to different addresses. The dynamic library must therefore carry information indicating that its code is PIC.

3. ELF File
(1) Introduction
The most commonly used executable format today is ELF (Executable and Linkable Format). ELF defines a number of fields and records that make dynamic linking more flexible. An ELF binary can be one of several types; the common ones in spec 1.1 are:

Relocatable: the .o file generated during compilation. It contains code and data (used when it is linked together with other relocatable files and shared OBJ files).

Executable: the final executable file, containing code and data.

Shared OBJ: the dynamically linked library files under /lib and /usr/lib. They contain code and data (used by the linker ld at link time and by the runtime dynamic linker).

Core: the file generated on a core dump, containing a raw dump of the process's memory.

Note: these ELF files are binary files in a broad sense, not just executable files.

(2) Composition of ELF
An ELF OBJ file has different requirements, and its parts have different names, depending on the phase it is in. At link time it sits on disk and consists of:
ELF header
Program header table (optional)
Section 0
Section 1
Section 2
Section 3
Section ...
Section n
Section header table

The ELF header contains the identification bytes defined by ELF (commonly known as the magic number) and general information about the OBJ file (whether it is a shared OBJ, relocatable, or executable). The program header table is an array of structures describing the segments, plus some information prepared for running the program. A segment is not the same as a section: it is a unit of the execution phase, so it is needed at execution time but not at link time. If a file is only executed and never linked, the program header table alone is enough. The section header table is an index table that records each section. Sections are small groups of data divided by attribute and use: .bss, .data, .init, .debug, .dynamic, .fini, .text, and so on. Among them, the most important ones are:

.text
Holds the actual CPU instructions
.bss
Holds uninitialized data,
mainly declared global and static variables
.data
Holds initialized data

The function and variable names used by a program may be spread across multiple source files, so reference information is needed to connect these names. Symbols are what the linker uses to make these connections. Because the OBJ files are separate, their code has to be combined. The string table contains many strings, each terminated by a null byte, and each string is the name of a symbol or a section. The symbol table holds the definition and reference information of the symbols that need to be located or relocated. The OBJ file of a shared library also contains the section .dynsym, which holds the dynamic symbol table used for dynamic linking. In addition, if you want to debug the program with a debugger later, you must add the -g option when compiling. It puts the required information into the OBJ file, building on the symbol and string tables. Most of this information is stored in a format named stabs, which also increases the size of the executable file by nearly three times.

All the ELF file types contain the information defined by ELF (header, sections, and so on); only the values differ.

On Unix/Linux, a program usually starts at a _start function rather than at main; _start calls main later. So if you want to slim a program down, you can skip the normal GCC startup code and write _start directly (^_^). In addition, parts such as the section header table, which is not needed if the file will not be linked, and the symbol table of an executable file can be left out. In fact all of these can be dropped, but then the program has to be written for and assembled with gas to produce the executable file. A lot is included by default, which is why a dynamically linked file that calls no functions at all still shows ld-linux.so and libc.so when inspected with ldd.

A process image in memory is as follows:
Elf Header
Program header table
Segment 0
Segment 1
Segment 2
Segment...
Segment n
Section header table (optional)

Segments include text and data, and their exact definition differs between operating systems. The text segment is built from .text, .fini, and other sections of the file on disk. The data segment is built from .data, .bss, and other sections. A segment usually contains one or more sections, and it is the sections that matter more to programmers.

In systems that support ELF, a program consists of an executable file and possibly shared OBJ files. To execute such a program, the system uses those files to create the memory image of the process. To be loaded into memory, an ELF file must have a program header table (an array of structures describing the segments, plus some information prepared for running the program). The ELF specification defines several special sections. The following are especially useful to programs:

.fini
Holds the process termination code.
When a program exits normally, the system arranges to execute the code in this section.
.init
Holds executable instructions that make up the initialization code of the process.
When a program starts to run, the system arranges to execute the code in this section
before the main entry point is called (main in C).

The .init and .fini sections exist for a special purpose. If a function is placed in the .init section, the system executes it before main is executed. Similarly, if a function is placed in the .fini section, it is executed after main returns. C++ compilers use this feature to run global constructors and destructors.

When an ELF executable file is run, the system loads all the related shared OBJ files before handing control to the executable file. With correctly constructed .init and .fini sections, constructors and destructors are called in the correct order.

The virtual memory usage in Unix/Linux is as follows:

User area

0x00000000 ~ 0xbfffffff -> 3 GB

Kernel area

0xc0000000 ~ 0xffffffff -> 1 GB

The following program code is used as an example:
int global;

int func2(void);    /* declared here because func1 calls it before its definition */

static int func1(void)
{
    static int b;
    int *c;
    int d;
    func2();
    return 1;
}

int func2(void)
{
    int c;
    static int d;
    return 2;
}

int main(void)
{
    int a;
    static int b;
    int init = 3;
    func1();
    return 3;
}

Loaded from a Linux executable file, it looks like this in memory:

i386 Linux execution image

Virtual address allocation

       |--------------------------------| 0x00000000
       |                                |
       | Thread stack                   |
       |                                |
       |--------------------------------| 0x08048000
       | Executable                     |   text
       |                                |   data
       |                                |   ......
       |--------------------------------|
       |                                |
       |--------------------------------| 0x40000000
       | Shared lib                     |   ld-linux.so
       |                                |   libm.so
       |                                |   libc.so
       |                                |
       | Stack                          |
       |                                |
  3 GB |--------------------------------| 0xc0000000
       |                                |
       | Kernel code and data           |
       |                                |
  4 GB |--------------------------------| 0xffffffff

The regions starting at 0x08048000, 0x40000000, and 0xc0000000 are laid out in this way.

Image from the C perspective:

0x08048000
|-------------------------------------------------|
| main()                                          |
|   XXXX                                   text   |
| func1()                                         |
|   XXXX                                          |
| func2()                                         |
|   XXXX                                          |
|-------------------------------------------------|
| int global                               data   |
| static int b (main)   static int b (func1)      |
| static int d (func2)                            |
|-------------------------------------------------|
| malloc(int)                              heap   |
|-------------------------------------------------|
|                       \/                        |
|-------------------------------------------------|
0x40000000
|                                                 |
|-------------------------------------------------|
|                       /\                        |
|-------------------------------------------------|
| func2: int c                           stack 2  |
|-------------------------------------------------|
| func1: int *c, int d                   stack 1  |
|-------------------------------------------------|
| main(): argv[0] argv[1] ...                     |
|-------------------------------------------------|
0xbfffffff

From this we can clearly understand the lifetime (storage class) of different variables (global, static, or auto) as well as their scope.
 
Of course, all of these addresses, including those of the kernel code and data, are virtual, so the actual address must be translated through the page table. For the range 0x0 ~ 0xbfffffff, each process has its own page table entries, but the entries from 0xc0000000 upward are the same in every process.
