GCC and object files, dynamically linked files, and ELF files

Source: Internet
Author: User

1. Object file

What a programmer ultimately produces is a binary file. If we declare a variable char c, the declaration requires 8 bits of space, so we need to tell the system to reserve those 8 bits. How is this done? By compiling the source into a special binary file, the object file. The executable obtained by compiling C code with GCC contains not only CPU instructions but also a lot of other information, in formats such as COFF and ELF. In the last step of compilation, the linker ld writes a pile of information into the executable. For example, if compilation produces several relocatable .o files, the variable and function names in each file are only located relative to that .o file, so there must be some information that tells the link editor how to resolve them.

There are two important periods in the life of an object file: while it is being linked it resides on disk, and while it is being executed it resides in memory. What we usually call the linker, ld, is properly named the link editor. In the final compilation step, ld writes information into the executable. For a static link, ld goes to the library archive libxxx.a, copies the required code into the executable, relocates it, and writes the resolved jump and call references into the executable. The file can then be executed.

2. Dynamically linked file

Unlike static linking, which copies the library code into the executable, dynamic linking does not. The link editor only writes some information into the executable, such as the names of the required libraries and functions. At execution time a dynamic linker, recorded in the executable as the program interpreter, must be invoked. The dynamic linker builds an executable image in memory based on the names of the required libraries. Running a dynamically linked executable is therefore generally a joint effort between the operating system's exec family of system calls and the dynamic linker, such as ld.so.

The dynamic linker usually does the following work:

(1) Load the contents of the executable file into the process image

(2) Load the required shared objects into the process image

(3) Perform the relocations

The virtual addresses in an object file are normally offsets from the start of the file. In an executable, the first address is usually 0x08048000. This is an absolute virtual address, which is only suitable for executable files. For example, a Linux executable is usually laid out as:

File offset    Virtual address
-----------    ---------------
0x0            0x08048000
0x100          0x08048100

The code in a shared object library must be position-independent code (PIC); that is, its address may differ from one process to another. For example, if a program uses only libc.so and ld-linux.so, libc.so usually starts at 0x40017000. But if another program additionally uses libm.so, libc.so may start at 0x40034000, so references to printf in the two programs resolve to different addresses. The dynamic library must therefore carry information indicating that its code is PIC.

3. ELF File

(1) Introduction

Currently, the most commonly used executable format is ELF (Executable and Linkable Format). ELF defines a number of fields and pieces of information that make dynamic linking more flexible. An ELF binary file can be of several types; the ones commonly encountered in version 1.1 of the specification are:


Relocatable: the .o file produced by compilation. It contains code and data suitable for linking together with other relocatable files and shared object files.

Executable: the final executable file, containing code and data.

Shared object: the dynamically linked library files under /lib and /usr/lib. They contain code and data used by the link editor ld at link time and by the runtime dynamic linker.

Core: the file produced by a core dump. It holds a snapshot of the process's memory at the time it crashed.

Note: These ELF files are binary files in a broad sense, not just executable files.



(2) Composition of an ELF file

An ELF object file has different requirements, and different names for its parts, depending on the period it is in. At link time it resides on disk and consists of:

    ELF header
    Program header table (optional)
    Section 0
    Section 1
    Section 2
    Section 3
    Section ...
    Section n
    Section header table

The ELF header contains the identification bytes defined by ELF (commonly called the magic number) and general information about the object file (whether it is a shared object, relocatable, or executable). The program header table is an array of structures describing the segments, together with other information needed to prepare the program for execution. A segment is not the same as a section: it is a unit of the execution period, so it is required at execution time and not at link time. A file that will only be linked therefore does not need a program header table, although it may have one. The section header table is an index table that records each section. A section is a small collection of data grouped by attribute and purpose, such as .bss, .data, .init, .debug, .dynamic, .fini, .text, and so on. The most important are:

.text

    Holds the actual CPU instructions

.bss

    Holds uninitialized data, mainly declared global and static variables

.data

    Holds initialized data


The function and variable names used by a program may be spread across multiple source files, so reference information is needed to connect these names. Symbols are what the linker uses to make these connections: because the object files are separate, their code must be combined.

The string table holds many strings, each terminated by a null byte; each string is the name of a symbol or a section. The symbol table holds the definition and reference information for symbols that will be located or relocated later. The object file of a shared library also contains a .dynsym section, which holds the dynamic symbol table used for dynamic linking. In addition, if you want to debug the program with a debugger later, add the -g option when compiling. This places debugging information, alongside the symbol table and string table, into the object file. In older toolchains most of this information is stored in a format called stabs (current GCC versions default to DWARF), and it can increase the size of the executable to nearly three times.

All of the ELF file types carry the same kinds of information defined by ELF (header, sections, and so on); only the values differ.

On Unix/Linux, execution usually begins at a _start function rather than at main; _start calls main later. So if you want to slim a program down, you can bypass the startup code that GCC normally adds and write _start yourself (^_^). Likewise, the section header table is unnecessary if the file will not be linked again, and the same goes for the symbol table of an executable. All of these can be left out, but then you have to assemble with gas and link by hand to produce the executable. The startup code does quite a lot, which is why a dynamically linked executable that calls no library function at all still shows ld-linux.so and libc.so when inspected with ldd.

A process image in memory is as follows:

    ELF header
    Program header table
    Segment 0
    Segment 1
    Segment 2
    Segment ...
    Segment n
    Section header table (optional)

Segments include text and data, and their exact definition differs between operating systems. The text segment is built from the .text, .fini, and similar sections of the file on disk; the data segment is built from .data, .bss, and similar sections. A segment usually contains one or more sections; sections are what matter more to programmers.

In systems that support ELF, a program consists of an executable file and, possibly, shared object files. To execute such a program, the system uses those files to create the memory image of the process. To be loaded into memory, an ELF file must have a program header table (an array of structures describing the segments, together with other information needed to prepare the program for execution). The ELF documentation defines several special sections. The following are especially useful to programs:

.fini

    Holds the instructions that make up the process termination code. When a program exits normally, the system arranges to execute the code in this section.

.init

    Holds executable instructions that make up the process initialization code. When a program starts to run, the system arranges to execute the code in this section before the main entry point (called main in C) is called.


The .init and .fini sections exist for a special purpose. If a function is placed in the .init section, the system executes it before the main function runs. Similarly, a function placed in the .fini section is executed after the main function returns. C++ compilers use this feature to run global constructors and destructors.

When an ELF executable is run, the system loads all of its shared objects before transferring control to the executable. With correctly constructed .init and .fini sections, the constructors and destructors are called in the correct order.

The virtual memory usage in Unix/Linux is as follows:


User area:   0x00000000 ~ 0xbfffffff (3 GB)

Kernel area: 0xc0000000 ~ 0xffffffff (1 GB)


The following program code is used as an example:

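    int global;

    int func2(void);    /* declared first because func1 calls it */

    static int func1(void)
    {
        static int b;   /* static: stored with the program's data */
        int *c;         /* automatic: stored on the stack */
        int d;          /* automatic */

        func2();
        return 1;
    }

    int func2(void)
    {
        int c;          /* automatic */
        static int d;   /* static */

        return 2;
    }

    int main(void)
    {
        int a;          /* automatic */
        static int b;   /* static */
        int init = 3;   /* automatic, initialized at run time */

        func1();
        return 3;
    }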


In memory, a Linux executable built from this code looks like this:

i386 Linux execution image
Virtual address allocation

      |------------------------------| 0x0
      |                              |
      |------------------------------|
      | Thread stack                 |
      |------------------------------|
      |                              |
      |------------------------------| 0x08048000   text
      | Executable                   |              data
      |                              |              ......
      |------------------------------|
      |                              |
      |------------------------------| 0x40000000   ld-linux.so
      | Shared lib                   |              libm.so
      |                              |              libc.so
      |                              |
      | Stack                        |
      |                              |
 3 GB |------------------------------|
      |                              |
      |------------------------------| 0xc0000000
      | Kernel code and data         |
      |------------------------------|
 4 GB |------------------------------| 0xffffffff


The regions starting at 0x08048000, 0x40000000, and 0xc0000000 are laid out in this way.


Image from the C perspective:


0x08048000
|--------------------------------------------------|
| |----------------------------------------------| |
| | main()                                       | |
| |   xxxx                              text     | |
| | func1()                                      | |
| |   xxxx                                       | |
| | func2()                                      | |
| |   xxxx                                       | |
| |----------------------------------------------| |
|                                                  |
| |----------------------------------------------| |
| | int global                          data     | |
| | static int b (main)                          | |
| | static int b (func1)                         | |
| | static int d (func2)                         | |
| |----------------------------------------------| |
|                                                  |
| |----------------------------------------------| |
| | malloc(int)                         heap     | |
| |----------------------------------------------| |
|                        |                         |
|                       \|/                        |
|--------------------------------------------------|
| 0x40000000                                       |
|                                                  |
|--------------------------------------------------|
|                       /|\                        |
|                        |                         |
| |----------------------------------------------| |
| | func2: int c                        stack 2  | |
| |----------------------------------------------| |
|                                                  |
| |----------------------------------------------| |
| | func1: int *c, int d                stack 1  | |
| |----------------------------------------------| |
|                                                  |
| |----------------------------------------------| |
| | main(): argv[0] argv[1] ...                  | |
| |----------------------------------------------| |
|--------------------------------------------------|
0xbfffffff


From this we can clearly see the lifetime (storage class) and the visibility (scope) of the different kinds of variables (global, static, and auto).

The kernel code and data are, of course, also present in memory, and every one of these virtual addresses must in fact be translated to a physical address through the page tables. For the range 0x0 ~ 0xbfffffff each process has its own page tables, but the page table entries for addresses from 0xc0000000 upward are the same in every process.
