ELF Format Analysis

Source: Internet
Author: User

I recently studied the ELF file format and found that much of the available material is rather cumbersome, which can seriously dampen a learner's enthusiasm. Here I share the results of my study, and I hope my description is concise.

First, the basic knowledge

ELF is the file format used to store programs on Linux. What information does it contain? Roughly speaking, it holds the prepared machine instructions and data. When the program is needed, the computer reads the file into memory, and the CPU then fetches instructions from memory one at a time and runs them.

So to understand the ELF format, we should first know what information the computer needs in order to run a program. This section therefore reviews some basics of computer systems.

Process and virtual memory:

On 32-bit Linux, the system gives each process a 4 GB address space. The range 0xC0000000 to 0xFFFFFFFF is reserved for the system (the Linux kernel), mainly for communication and data exchange between the kernel and the process. The user program can use the remaining 3 GB (0x00000000 to 0xBFFFFFFF).
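The sizes of the two ranges can be checked with a little arithmetic. The following is a minimal Python sketch, assuming the 32-bit split described above; the names region_size and is_user_address are made up for this illustration.

```python
# Address ranges taken from the text (32-bit layout, assumed here).
USER_START, USER_END = 0x00000000, 0xBFFFFFFF
KERNEL_START, KERNEL_END = 0xC0000000, 0xFFFFFFFF
GIB = 1 << 30


def region_size(start: int, end: int) -> int:
    """Size in bytes of an inclusive address range."""
    return end - start + 1


def is_user_address(addr: int) -> bool:
    """True if addr lies in the part of the address space a program may use."""
    return USER_START <= addr <= USER_END


print(region_size(USER_START, USER_END) // GIB)      # 3
print(region_size(KERNEL_START, KERNEL_END) // GIB)  # 1
print(is_user_address(0xC0000000))                   # False
```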

In practice a computer does not have that much memory. A machine may have only 2 GB, and older ones only a few hundred MB. A computer also runs more than one process: at 4 GB each, 10 processes would need 40 GB. Where would so much memory come from? It does not matter, because what the operating system gives the user is virtual memory, and every program can use its full 3 GB. How the operating system translates virtual memory into physical memory is not something an application developer needs to care about. We can use virtual memory directly, without worrying about other processes intruding on our memory space.

Process creation and execution:

A process basically goes through the following steps:

1. When the user requests that a program be run, the operating system reads the executable file stored on disk (on Linux this is our ELF file) and allocates a 4 GB virtual address space for the process.

2. Following the information in the file, it places the different parts of the file's contents into the 3 GB of virtual memory allocated to the user.

3. Then, following the information in the file, the system sets up the code segment and data segment registers.

4. Then, following the information in the file, it jumps to the entry address of the user's code (conceptually our main function; strictly speaking, the entry is usually a startup symbol such as _start, which in a C program eventually calls main).

5. Starting from main, the computer runs the instructions we wrote and processes our data until our program ends. During this time the system switches to other processes many times, but this has no effect on the user program, which can behave as if the computer served it alone.

We have seen repeatedly above that the computer acts according to what the file indicates, so to learn ELF we first need to understand what information an ELF file carries.

Second, executable ELF files

There are three types of ELF files: 1. relocatable object files (usually .o); 2. executable files (the programs we run); 3. shared libraries (.so).
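The file type is recorded in the e_type field of the ELF header, a 2-byte value at offset 16. Below is a minimal Python sketch that reads it. The function name elf_type is made up for this illustration, and only the three types listed above are named. Note that on modern systems position-independent executables are also marked with the same value as shared libraries.

```python
import struct

# e_type values from the ELF specification: ET_REL = 1, ET_EXEC = 2, ET_DYN = 3.
ET_NAMES = {
    1: "relocatable object file (.o)",
    2: "executable file",
    3: "shared library (.so)",
}


def elf_type(header: bytes) -> str:
    """Classify an ELF file from the first 18 bytes (or more) of its contents."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] is EI_DATA: 1 means little endian, 2 means big endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return ET_NAMES.get(e_type, f"other (e_type={e_type})")


# Hand-built header prefix for a 64-bit little-endian executable.
sample = b"\x7fELF\x02\x01\x01" + bytes(9) + struct.pack("<H", 2)
print(elf_type(sample))  # executable file
```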

Let's talk about the executable file first.

An executable file is generally divided into 4 parts. There can be more, but understanding these 4 is enough for us.

1. ELF file header: this describes the overall information of the ELF file. It is 52 bytes on a 32-bit system and 64 bytes on a 64-bit system.

For an executable file, the file header includes the information related to starting the process:

e_entry: program entry address

e_phoff: file offset of the segment table

e_phnum: number of segments
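These fields can be read straight out of the first 64 bytes of a file. The sketch below uses Python's struct module and assumes a 64-bit little-endian ELF file; the names EHDR64 and read_load_info are made up for this illustration, and the sample header is built by hand instead of being read from disk.

```python
import struct

# Elf64_Ehdr, little endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx. Total size: 64 bytes.
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")


def read_load_info(header: bytes) -> dict:
    """Return the header fields the loader needs to start a process."""
    fields = EHDR64.unpack_from(header)
    if fields[0][:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {
        "e_entry": fields[4],
        "e_phoff": fields[5],
        "e_phentsize": fields[9],
        "e_phnum": fields[10],
    }


# Hand-built header with illustrative values for a small x86-64 executable.
ident = b"\x7fELF\x02\x01\x01" + bytes(9)
sample = EHDR64.pack(ident, 2, 62, 1, 0x4000B0, 64, 240, 0, 64, 56, 2, 64, 6, 3)
info = read_load_info(sample)
print(hex(info["e_entry"]), info["e_phoff"], info["e_phnum"])  # 0x4000b0 64 2
```

For a real file, the same function could be applied to the first 64 bytes read with open(path, "rb").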

2. Segment table: this table gives the loading instructions used by the operating system (more precisely, by the loader; some ELF files, such as the operating system kernel, are loaded by other programs). The structure of each table entry is very important:

typedef struct
{
    Elf64_Word  p_type;   /* Segment type */
    Elf64_Word  p_flags;  /* Segment flags (permissions): 6 = readable and writable, 5 = readable and executable */
    Elf64_Off   p_offset; /* Segment file offset */
    Elf64_Addr  p_vaddr;  /* Segment virtual address */
    Elf64_Addr  p_paddr;  /* Segment physical address; this field is of no use to applications */
    Elf64_Xword p_filesz; /* Segment size in file */
    Elf64_Xword p_memsz;  /* Segment size in memory; generally equal to p_filesz */
    Elf64_Xword p_align;  /* Segment alignment */
} Elf64_Phdr;
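One entry of this table can be decoded as follows. This is a minimal Python sketch assuming a little-endian 64-bit file; PHDR64, flags_to_str and parse_phdr are names made up for this illustration. The flag bits are those of the ELF specification (execute = 1, write = 2, read = 4), which is why 5 means readable and executable and 6 means readable and writable.

```python
import struct

# Eight fields in the same order as Elf64_Phdr: two 4-byte words, six 8-byte values.
PHDR64 = struct.Struct("<IIQQQQQQ")
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4
PT_LOAD = 1  # p_type of a loadable segment

FIELD_NAMES = ("p_type", "p_flags", "p_offset", "p_vaddr",
               "p_paddr", "p_filesz", "p_memsz", "p_align")


def flags_to_str(p_flags: int) -> str:
    """Render p_flags as three characters, in the style readelf uses."""
    return (("R" if p_flags & PF_R else " ")
            + ("W" if p_flags & PF_W else " ")
            + ("E" if p_flags & PF_X else " "))


def parse_phdr(data: bytes, offset: int = 0) -> dict:
    """Decode one segment table entry starting at the given offset."""
    return dict(zip(FIELD_NAMES, PHDR64.unpack_from(data, offset)))


entry = PHDR64.pack(PT_LOAD, 5, 0, 0x400000, 0x400000, 0xBC, 0xBC, 0x200000)
phdr = parse_phdr(entry)
print(PHDR64.size, repr(flags_to_str(phdr["p_flags"])), hex(phdr["p_vaddr"]))
# 56 'R E' 0x400000
```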


3. ELF body: for an executable file, the most basic parts are the data segment and the code segment.

4. Section table: it is not needed to run an executable file. It is used at link time, where it describes the code and data from the linker's point of view.

The composition of the entire ELF file can be shown in a figure:

[Figure: layout of an ELF file, not reproduced here]

The figure is taken from the book Linux C Programming by the author Song.

The Program Header Table is what we call the segment table. Segments describe the ELF file from the execution point of view, while sections describe it from the linking point of view.

In this section we only cover running ELF files, so we only discuss segment-related content.

We will use an example to explain how the system loads an ELF file (on a 64-bit platform).

First we write a simple assembly program.

.section .data
.global data_item
data_item:
.long 3,67,28
.section .text
.global _start
_start:
mov $1,%eax
mov $4,%ebx
int $0x80

After assembling and linking this to produce the hello file, we analyze the hello file.

Run: readelf -h ./asm/hello (readelf -h is the command that reads the ELF file header)

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x4000b0    // the entry address of the program is 0x4000b0
  Start of program headers:          64 (bytes into file)    // the segment table is at byte offset 64 in the file
  Start of section headers:          240 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)    // each segment table entry is 56 bytes (32 bytes on a 32-bit system)
  Number of program headers:         2
  Size of section headers:           64 (bytes)
  Number of section headers:         6
  Section header string table index: 3
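The first lines of this output (Class, Data, Version, OS/ABI) all come from the 16 magic bytes, known as e_ident. Below is a minimal Python sketch of that decoding, using the e_ident of a well-formed 64-bit little-endian file; decode_ident and the two lookup tables are names made up for this illustration.

```python
# Meaning of e_ident[4] (EI_CLASS) and e_ident[5] (EI_DATA) in the ELF specification.
EI_CLASS = {1: "ELF32", 2: "ELF64"}
EI_DATA = {1: "little endian", 2: "big endian"}


def decode_ident(e_ident: bytes) -> dict:
    """Decode the 16-byte identification block at the start of an ELF file."""
    if len(e_ident) != 16 or e_ident[:4] != b"\x7fELF":
        raise ValueError("bad e_ident")
    return {
        "class": EI_CLASS.get(e_ident[4], "unknown"),
        "data": EI_DATA.get(e_ident[5], "unknown"),
        "version": e_ident[6],
        "osabi": e_ident[7],  # 0 is UNIX System V
    }


ident = bytes.fromhex("7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00")
print(decode_ident(ident))
# {'class': 'ELF64', 'data': 'little endian', 'version': 1, 'osabi': 0}
```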

For the loading of the program, we care about these three items:

Entry point address: 0x4000b0    // the entry address of the program is 0x4000b0

Start of program headers: 64 (bytes into file)    // the segment table is at byte offset 64 in the file

Size of program headers: 56 (bytes)    // each segment table entry is 56 bytes (32 bytes on a 32-bit system)

The above tells us that the segment table starts at byte 64 of the file, so let's look at what is stored there.

Run readelf -l ./asm/hello to output the segment information (readelf -l reads the segments).

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000000bc 0x00000000000000bc  R E    200000
  LOAD           0x00000000000000bc 0x00000000006000bc 0x00000000006000bc
                 0x000000000000000c 0x000000000000000c  RW     200000

 Section to Segment mapping:
  Segment Sections...
   00     .text
   01     .data

We see that the program has two segments. We will call them .text and .data, after the section each one contains.

The .text segment has offset 0, FileSiz 0xbc, MemSiz 0xbc, VirtAddr 0x400000, and flags R E. This means the loader maps the ELF file's contents from byte 0 up to byte 0xbc into virtual memory at 0x400000, occupying 0xbc bytes of memory. The permissions on this memory are R E (readable, executable). This range is exactly the ELF header, the segment table, and the code.

Now look at e_entry in the ELF header: it is 0x4000b0, which is exactly the starting address of the code.
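The entry address follows from the sizes reported by readelf: the code starts right after the ELF header and the two segment table entries. A short Python check of that arithmetic, using only the numbers from the output above (the variable names are made up for this illustration):

```python
EHDR_SIZE = 64         # Size of this header
PHDR_SIZE = 56         # Size of program headers
PHDR_COUNT = 2         # Number of program headers
TEXT_VADDR = 0x400000  # VirtAddr of the first LOAD segment
TEXT_FILESZ = 0xBC     # FileSiz of the first LOAD segment

# The code follows the header and the segment table in the file.
code_offset = EHDR_SIZE + PHDR_COUNT * PHDR_SIZE
entry = TEXT_VADDR + code_offset
code_size = TEXT_FILESZ - code_offset

print(hex(code_offset))  # 0xb0
print(hex(entry))        # 0x4000b0
print(code_size)         # 12
```

The remaining 12 bytes match the three instructions of the example program: two 5-byte mov instructions and one 2-byte int instruction.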

The .data segment has offset 0xbc, FileSiz 0x0c, MemSiz 0x0c, VirtAddr 0x6000bc, and flags RW. This means the loader maps the ELF file's contents from byte 0xbc up to byte 0xbc + 0xc into virtual memory at 0x6000bc, occupying 0x0c bytes of memory. The permissions on this memory are RW (readable, writable).

Why is the actual address of the data segment 0x6000bc rather than 0x600000? This is determined by Align. Here Align is 0x200000 (2 MB), and the virtual address and the file offset of a segment must have the same remainder modulo Align. In the file, .data and .text sit in the same page, and when mapping, the whole page is mapped to 0x600000. The data therefore lands at 0x6000bc, and the contents from 0x600000 to 0x6000bc are not used.
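The alignment rule can be checked for both segments. The following is a minimal Python sketch using the values from the readelf -l output; is_congruent and aligned_base are names made up for this illustration.

```python
def is_congruent(p_offset: int, p_vaddr: int, p_align: int) -> bool:
    """ELF rule: p_vaddr and p_offset leave the same remainder modulo p_align."""
    if p_align in (0, 1):  # 0 or 1 means no alignment is required
        return True
    return p_vaddr % p_align == p_offset % p_align


def aligned_base(p_vaddr: int, p_align: int) -> int:
    """Round a virtual address down to a multiple of p_align."""
    return p_vaddr - (p_vaddr % p_align)


ALIGN = 0x200000
print(is_congruent(0x00, 0x400000, ALIGN))  # True
print(is_congruent(0xBC, 0x6000BC, ALIGN))  # True
print(is_congruent(0xBC, 0x600000, ALIGN))  # False
print(hex(aligned_base(0x6000BC, ALIGN)))   # 0x600000
```

The third line shows why the linker could not simply place the data at 0x600000 while keeping it at file offset 0xbc.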

With the above, the system is able to create a process based on the elf file.
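To tie the pieces together, here is a toy model of the loading step in Python. It is only a sketch under simplifying assumptions: memory is a dictionary from address to byte, permissions are ignored, and the 0xc8-byte image is a hand-built stand-in for the hello file whose last 12 bytes hold the three .long values. The function name load is made up for this illustration; a real loader maps whole pages instead of copying bytes.

```python
import struct


def load(image: bytes, segments: list) -> dict:
    """Copy each segment's file bytes to its virtual address; return {address: byte}."""
    memory = {}
    for seg in segments:
        chunk = image[seg["offset"]:seg["offset"] + seg["filesz"]]
        chunk += bytes(seg["memsz"] - seg["filesz"])  # zero-fill if memsz > filesz
        for i, value in enumerate(chunk):
            memory[seg["vaddr"] + i] = value
    return memory


# Stand-in file: 0xbc bytes of header, segment table and code, then the data.
image = bytes(0xBC) + struct.pack("<3I", 3, 67, 28)
segments = [
    {"offset": 0x00, "vaddr": 0x400000, "filesz": 0xBC, "memsz": 0xBC},
    {"offset": 0xBC, "vaddr": 0x6000BC, "filesz": 0x0C, "memsz": 0x0C},
]

memory = load(image, segments)
first_item = bytes(memory[0x6000BC + i] for i in range(4))
print(struct.unpack("<I", first_item)[0])  # 3
print(0x600000 in memory)                  # False
```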

In the next section, we'll cover the process of static linking.










