Introduction to the ELF file format
2014-11-02
ELF stands for Executable and Linkable Format, a file format used for executable files, object files, and libraries, similar to the PE file format on Windows. The ELF format was developed and published by UNIX System Laboratories as part of the ABI (Application Binary Interface) and has long been the standard binary format on Linux. This article uses the following simple program to describe the format of an ELF file concretely; it is recommended to follow along with the program's binary while reading.
#include <stdio.h>
#include <stdlib.h>   /* exit() */

/* Print a message, then return the sum of a and b. */
int Add(int a, int b)
{
    printf("number are added together\n");
    return a + b;
}

int main()
{
    int a, b;
    a = 3;
    b = 4;
    int ret = Add(a, b);
    printf("result:%d\n", ret);
    exit(0);
}
gcc test.c -o test
gcc test.c -c -o test.o
I. ELF Overview
ELF mainly covers three types of files:
Relocatable files: the .o files generated by the compiler and assembler, which are then processed by the linker
Executable files: the output the linker produces from .o files, that is, the process image
Shared object files: dynamic library files, that is, .so files
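One way to tell the three file types apart is the e_type field of the ELF header. The following is a minimal sketch, assuming a little-endian ELF image that has already been read into memory. The names elf_file_kind and read_le16 are made up for illustration; the e_type values and the position of the field (right after the 16-byte e_ident array) come from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* e_type values defined by the ELF specification. */
enum { ET_REL = 1, ET_EXEC = 2, ET_DYN = 3 };

/* Read a 16-bit little-endian value (assumption: little-endian ELF, e.g. x86). */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Classify an ELF image by e_type, which sits at byte offset 16.
 * Returns NULL if the buffer is too short or lacks the ELF magic number. */
const char *elf_file_kind(const unsigned char *image, size_t size)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (image == NULL || size < 18 || memcmp(image, magic, 4) != 0)
        return NULL;

    switch (read_le16(image + 16)) {
    case ET_REL:  return "relocatable file";
    case ET_EXEC: return "executable file";
    case ET_DYN:  return "shared object file";
    default:      return "other";
    }
}
```

Passing the first bytes of test.o to elf_file_kind should give "relocatable file". Note that an executable built as a position-independent executable also carries the shared object type, so the result for test depends on how gcc is configured.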
The following are examples of the three types:
The ELF layout is as follows:
As you can see from the diagram, an ELF file conceptually consists of 5 parts:
ELF header, which describes basic information such as architecture and operating system, and gives the location of the section header table and program header table in the file
Program header table, which views the ELF file from the runtime perspective and mainly describes each segment; it is not used during assembly and linking
Section header table, which holds the information about all sections and views the ELF file from the perspective of compilation and linking
Sections, that is, the sections themselves
Segments, that is, the segments used at run time
Note that sections and segments actually occupy the same place in the file, as described above. They are two views of the same content, one for linking and one for loading. The left side of the diagram is the link view and the right side is the load view. Sections are visible to the programmer and are the concept used by the linker; segments are not visible to the programmer and are the concept used by the loader. Typically, a segment contains multiple sections. Windows PE does not separate a program header table from a section header table; both are unified as sections, which are processed only at load time. So the program header table and the section header table are each optional: a relocatable file needs no program header table, and an executable can be loaded without a section header table.
II. The structure of an ELF file
Before going into this part, here are the sizes of the data types used in the structure definitions.
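For reference, the basic ELF32 types can be sketched with fixed-width integers as follows. The widths follow the ELF specification; expressing them through stdint.h typedefs is a choice made for this sketch, and a real system header may define them differently.

```c
#include <stdint.h>

typedef uint32_t Elf32_Addr;   /* unsigned program address, 4 bytes */
typedef uint16_t Elf32_Half;   /* unsigned medium integer,  2 bytes */
typedef uint32_t Elf32_Off;    /* unsigned file offset,     4 bytes */
typedef int32_t  Elf32_Sword;  /* signed large integer,     4 bytes */
typedef uint32_t Elf32_Word;   /* unsigned large integer,   4 bytes */
```

Besides these, the structures use plain unsigned char for single bytes.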
(1) ELF header
The ELF header describes basic information such as architecture and operating system, and indicates where the section header table and program header table are located in the file. The meaning of each member is given in the comments.
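As a reference point, the 32-bit ELF header can be written out as below. The member names and their order follow the ELF specification; the stdint.h typedefs are an assumption of this sketch that keeps it self-contained, and the comments are brief summaries rather than the full definitions.

```c
#include <stdint.h>

#define EI_NIDENT 16

typedef uint32_t Elf32_Addr;
typedef uint16_t Elf32_Half;
typedef uint32_t Elf32_Off;
typedef uint32_t Elf32_Word;

typedef struct
{
    unsigned char e_ident[EI_NIDENT]; /* magic number, class, byte order, version */
    Elf32_Half    e_type;      /* file type: relocatable, executable, shared object */
    Elf32_Half    e_machine;   /* target architecture */
    Elf32_Word    e_version;   /* ELF version */
    Elf32_Addr    e_entry;     /* virtual address of the entry point */
    Elf32_Off     e_phoff;     /* file offset of the program header table */
    Elf32_Off     e_shoff;     /* file offset of the section header table */
    Elf32_Word    e_flags;     /* processor-specific flags */
    Elf32_Half    e_ehsize;    /* size of this header in bytes */
    Elf32_Half    e_phentsize; /* size of one program header entry */
    Elf32_Half    e_phnum;     /* number of program header entries */
    Elf32_Half    e_shentsize; /* size of one section header entry */
    Elf32_Half    e_shnum;     /* number of section header entries */
    Elf32_Half    e_shstrndx;  /* index of the section name string table */
} Elf32_Ehdr;
```

With these member sizes the header occupies 52 bytes and needs no padding.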
Here is a brief explanation of the last field, e_shstrndx. It is the last member of Elf32_Ehdr, and its name is short for "section header string table index". The section header string table is itself an ordinary section in the ELF file, and its name is usually ".shstrtab". e_shstrndx is the index of ".shstrtab" in the section header table, that is, the index of the section header string table within the section header table.
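The role of e_shstrndx can be sketched as a lookup: find the section header it points to, then use each section's sh_name as an offset into that string table. This is a minimal sketch, assuming a little-endian ELF32 image held in memory; elf32_section_name, rd16, and rd32 are hypothetical names, and the byte offsets are those of the Elf32_Ehdr and Elf32_Shdr layouts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian readers (assumption: little-endian ELF32, e.g. x86). */
static uint32_t rd16(const unsigned char *p) { return (uint32_t)p[0] | ((uint32_t)p[1] << 8); }
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the name of section `index`, or NULL if anything is out of range. */
const char *elf32_section_name(const unsigned char *image, size_t size, uint32_t index)
{
    if (image == NULL || size < 52 || memcmp(image, "\x7f" "ELF", 4) != 0)
        return NULL;

    uint32_t shoff     = rd32(image + 32);  /* e_shoff */
    uint32_t shentsize = rd16(image + 46);  /* e_shentsize */
    uint32_t shnum     = rd16(image + 48);  /* e_shnum */
    uint32_t shstrndx  = rd16(image + 50);  /* e_shstrndx */

    if (index >= shnum || shstrndx >= shnum || shentsize < 40)
        return NULL;
    /* The whole section header table must fit inside the image. */
    if (shoff > size || (uint64_t)shnum * shentsize > size - shoff)
        return NULL;

    const unsigned char *strtab_hdr = image + shoff + (size_t)shstrndx * shentsize;
    const unsigned char *target_hdr = image + shoff + (size_t)index * shentsize;

    uint32_t str_off  = rd32(strtab_hdr + 16);  /* sh_offset of .shstrtab */
    uint32_t str_size = rd32(strtab_hdr + 20);  /* sh_size of .shstrtab */
    uint32_t name     = rd32(target_hdr);       /* sh_name: offset into .shstrtab */

    if (str_off > size || str_size > size - str_off || name >= str_size)
        return NULL;
    /* Require the table to end with NUL so the returned string is terminated. */
    if (image[str_off + str_size - 1] != '\0')
        return NULL;

    return (const char *)(image + str_off + name);
}
```

Calling this for the index stored in e_shstrndx itself should return ".shstrtab" on a typical file.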
Here are the values of each member of the ELF header for test:
You can see the basic information of this ELF file, such as architecture and operating system. The section header table has 30 entries, starts at file offset 4420, and each entry is 40 bytes; the program header table has 9 entries (segments), each 32 bytes. Then look at the details in the bytes above. Some structures are marked and can be compared with the structure definition above.
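The numbers above determine how much of the file each table covers: the section header table takes 30 x 40 = 1200 bytes, from offset 4420 up to offset 5620, and the program header table takes 9 x 32 = 288 bytes. A small sketch of that arithmetic follows; table_end is a hypothetical helper, and since the start offset of the program header table is not stated above, it is left as a parameter.

```c
#include <stdint.h>

/* File offset one past the last byte of a header table that starts at
 * `offset` and holds `count` entries of `entsize` bytes each. */
uint32_t table_end(uint32_t offset, uint32_t entsize, uint32_t count)
{
    return offset + entsize * count;
}
```

For test, table_end(4420, 40, 30) gives the end of the section header table, and table_end(e_phoff, 32, 9) gives the end of the program header table once e_phoff is read from the header.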
(2) Program header table and program header entry
The program header table views the ELF file from the loading perspective; object files do not have this table. Each entry gives the size, location, flags, and access permissions of a segment, along with its addresses in the virtual and physical address spaces. As shown above, test has 9 segments, as in the following figure:
Some of these are briefly described below.
PHDR holds the program header table itself.
INTERP specifies the interpreter that must be invoked after the program has been mapped from the executable file into memory. Here, "interpreter" does not mean that the contents of the binary must be interpreted by another program. It refers to a program that resolves unresolved references by linking in other libraries. Typically this is a library such as /lib/ld-linux.so.2 or /lib/ld-linux-ia-64.so.2, which maps the dynamic libraries the program needs into its virtual address space. Almost every program needs at least the C standard library mapped; other libraries that may be needed include GTK, the math library, libjpeg, and so on.
LOAD represents a segment that needs to be mapped from the binary file into the virtual address space. It holds constant data (such as strings), the program's object code, and so on.
DYNAMIC holds information used by the dynamic linker (that is, the interpreter specified in INTERP).
NOTE holds proprietary information.
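The segment kinds described above are distinguished by the p_type field of each program header entry. The following sketch maps the numeric values from the ELF specification to the names used above; segment_type_name is a hypothetical helper, and GNU-specific types are simply reported as "other".

```c
#include <stdint.h>

/* p_type values defined by the ELF specification. */
enum {
    PT_NULL    = 0,  /* unused entry */
    PT_LOAD    = 1,  /* loadable segment */
    PT_DYNAMIC = 2,  /* dynamic linking information */
    PT_INTERP  = 3,  /* path of the interpreter */
    PT_NOTE    = 4,  /* auxiliary information */
    PT_PHDR    = 6   /* the program header table itself */
};

/* Translate a p_type value into a short name. */
const char *segment_type_name(uint32_t p_type)
{
    switch (p_type) {
    case PT_NULL:    return "NULL";
    case PT_LOAD:    return "LOAD";
    case PT_DYNAMIC: return "DYNAMIC";
    case PT_INTERP:  return "INTERP";
    case PT_NOTE:    return "NOTE";
    case PT_PHDR:    return "PHDR";
    default:         return "other";
    }
}
```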
Each entry corresponds to one segment and is represented by the following data structure:
typedef struct
{
    /* Segment type: PT_LOAD = 1 is a loadable segment */
    Elf32_Word p_type;
    /* Offset from the beginning of the file to the first byte of the segment */
    Elf32_Off p_offset;
    /* Virtual address at which the first byte of the segment is placed in memory */
    Elf32_Addr p_vaddr;
    /* On Linux this member has no meaning; its value is the same as p_vaddr */
    Elf32_Addr p_paddr;
    /* Number of bytes the segment occupies in the file image */
    Elf32_Word p_filesz;
    /* Number of bytes the segment occupies in the memory image */
    Elf32_Word p_memsz;
    /* Segment flags */
    Elf32_Word p_flags;
    /* Alignment of the segment; p_vaddr is aligned according to this value */
    Elf32_Word p_align;
} Elf32_Phdr;
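Two members of this structure are easier to grasp with a little code: p_flags is a bit mask of permissions, and the difference between p_memsz and p_filesz is the part of the segment that has no bytes in the file. The sketch below uses the PF_X, PF_W, and PF_R bit values from the ELF specification; segment_flags_string and segment_zero_fill are hypothetical helper names, and the three-character output loosely imitates the style of readelf.

```c
#include <stdint.h>

/* Permission bits of p_flags as defined by the ELF specification. */
enum { PF_X = 0x1, PF_W = 0x2, PF_R = 0x4 };

/* Write the flags as three characters plus a terminating NUL,
 * for example "R E" for a readable and executable segment. */
void segment_flags_string(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & PF_R) ? 'R' : ' ';
    out[1] = (p_flags & PF_W) ? 'W' : ' ';
    out[2] = (p_flags & PF_X) ? 'E' : ' ';
    out[3] = '\0';
}

/* Number of bytes at the end of the segment that exist only in memory.
 * The loader fills them with zeros (this is where .bss data ends up). */
uint32_t segment_zero_fill(uint32_t p_filesz, uint32_t p_memsz)
{
    return p_memsz > p_filesz ? p_memsz - p_filesz : 0;
}
```

A code segment would typically come out as "R E" with nothing to zero-fill, while a data segment would typically be "RW " with p_memsz larger than p_filesz.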
(3) Section header table and section header entry
The section header table describes the sections in the file. Each section has a type that defines the semantics of its data, and each section specifies its size and its offset within the binary file. As shown above, test has 30 sections, as in the following figure:
Some of these are briefly described below:
.interp holds the file name of the interpreter, as an ASCII string.
.data holds initialized data, which is part of the normal program data and can be modified while the program runs.
.rodata holds read-only data, which can be read but not modified. For example, the compiler places all static strings that appear in printf statements in this section.
.init and .fini hold the code used for process initialization and termination, which is usually added automatically by the compiler.
.gnu.hash is a hash table that allows quick access to symbol table entries without a linear search of the whole table.
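To connect these sections with source code, the sketch below marks where a few C objects would typically be placed. The placement described in the comments is the usual behaviour of gcc on Linux rather than a guarantee, and all names (banner, counter, scratch, bump_counter, report) are invented for this example.

```c
#include <stdio.h>

const char banner[] = "result";  /* read-only data: typically .rodata */
int counter = 3;                 /* initialized and writable: typically .data */
int scratch[16] = { 0 };         /* all zero: typically .bss, no file space */

/* Writing to an object in .data at run time is allowed. */
int bump_counter(void)
{
    return ++counter;
}

/* The format string literal typically lands in .rodata as well. */
void report(void)
{
    printf("%s:%d\n", banner, counter);
}
```

After compiling this with gcc -c, tools such as nm or readelf -S can be used to check where each object actually ended up.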
The structure of a section header entry is defined as follows:
typedef struct {
    /* Section name (offset into the section header string table) */
    Elf32_Word sh_name;
    /* Section type: PROGBITS - program-defined information, NOBITS - occupies no file space (.bss), REL - relocation entries */
    Elf32_Word sh_type;
    /* Each bit carries one piece of information, such as whether the section contents can be modified or executed */
    Elf32_Word sh_flags;
    /* If the section appears in the memory image of the process, the address at which its first byte should reside */
    Elf32_Addr sh_addr;
    /* Offset from the beginning of the file to the first byte of the section */
    Elf32_Off sh_offset;
    /* Section length in bytes; for NOBITS this value may be nonzero, but the section occupies no space in the file */
    Elf32_Word sh_size;
    /* Section header table index link */
    Elf32_Word sh_link;
    /* Additional section information */
    Elf32_Word sh_info;
    /* Address alignment constraint of the section */
    Elf32_Word sh_addralign;
    /* Some sections hold fixed-size entries, such as a symbol table; this member gives the size of each entry */
    Elf32_Word sh_entsize;
} Elf32_Shdr;
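A few of these members can be turned into small checks. The sketch below uses the SHT_ and SHF_ values from the ELF specification; the helper names are hypothetical. The last function tests whether a section's address range lies inside a segment's memory range, which is one way to see that a segment typically contains several sections; it only makes sense for sections that appear in the memory image.

```c
#include <stdint.h>

/* Selected sh_type and sh_flags values from the ELF specification. */
enum { SHT_PROGBITS = 1, SHT_NOBITS = 8, SHT_REL = 9 };
enum { SHF_WRITE = 0x1, SHF_ALLOC = 0x2, SHF_EXECINSTR = 0x4 };

/* Bytes the section occupies in the file: NOBITS sections take none. */
uint32_t section_file_bytes(uint32_t sh_type, uint32_t sh_size)
{
    return sh_type == SHT_NOBITS ? 0 : sh_size;
}

/* Non-zero if the section appears in the memory image of the process. */
int section_is_loaded(uint32_t sh_flags)
{
    return (sh_flags & SHF_ALLOC) != 0;
}

/* Non-zero if the section contents can be modified at run time. */
int section_is_writable(uint32_t sh_flags)
{
    return (sh_flags & SHF_WRITE) != 0;
}

/* Non-zero if [sh_addr, sh_addr + sh_size) lies inside
 * [p_vaddr, p_vaddr + p_memsz). Written to avoid integer overflow. */
int section_in_segment(uint32_t sh_addr, uint32_t sh_size,
                       uint32_t p_vaddr, uint32_t p_memsz)
{
    return sh_addr >= p_vaddr && sh_size <= p_memsz &&
           sh_addr - p_vaddr <= p_memsz - sh_size;
}
```

Running section_in_segment over every loaded section and every LOAD segment would give a section-to-segment mapping for the file.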
This is the general structure of ELF; when there is time, I will summarize several of the more important sections.