Interpreting ELF files by example: reading notes (1)
Bkbll (bkbll@cnhonker.net, bkbll@tom.com)
2003/09/09
I. Prerequisites
Many articles on the Internet describe the ELF file format and how ELF files are loaded. I found these two useful:
1. breadbox's <Executable and Linkable Format (ELF)>. The English document can be downloaded from many places; alert7's home page hosts both the original English text and a Chinese translation.
English: http://elfhack.whitecell.org/mydocs/elf.txt
Chinese: http://elfhack.whitecell.org/mydocs/ELF_chinese.txt
2. alert7's <ELF dynamic symbol resolution process (revised)>
http://elfhack.whitecell.org/mydocs/ELF_symbol_resolve_process1.txt
However, 2 only covers how dynamic symbols are resolved, and 1 lists a great many structures and parameters and the links between them. Reading them cold is confusing. I needed to work with some ELF files for a program I wanted to write, and I suspect many people are as confused as I was. Treat this article as the notes I took along the way; if it also helps you, so much the better.
This article assumes that you have read articles such as 1 and 2 and are familiar with Linux and GDB.
II. Analysis Platform
[netconf@linux1 elf]$ uname -a
Linux linux1 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux
[netconf@linux1 elf]$ cat /proc/version
Linux version 2.4.18-14 (bhcompile@stripples.devel.redhat.com) (gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)) #1 Wed Sep 4 13:35:50 EDT 2002
[netconf@linux1 elf]$ rpm -qf /usr/bin/readelf /usr/bin/hexdump
binutils-2.13.90.0.2-2
util-linux-2.11r-10
[netconf@linux1 elf]$
III. Example Program
[netconf@linux1 elf]$ cat elf8.c
#include <stdio.h>
#include <elf.h>
/* Prototypes, so that each function is declared before it is called. */
int foo1(void), foo2(void), foo3(void), foo4(void);
int foo1()
{
    printf("[+] foo1 addr: %p\n", foo1);
    foo2();
}
int foo2()
{
    printf("[+] foo2 addr: %p\n", foo2);
    foo3();
}
int foo3()
{
    printf("[+] foo3 addr: %p\n", foo3);
    foo4();
}
int foo4()
{
    printf("[+] foo4 addr: %p\n", foo4);
}
int main()
{
    foo1();
}
[netconf@linux1 elf]$ gcc -o elf8 elf8.c
[netconf@linux1 elf]$ ./elf8
[+] foo1 addr: 0x8048328
[+] foo2 addr: 0x804834a
[+] foo3 addr: 0x804836c
[+] foo4 addr: 0x804838e
IV. Analysis Process
1. ELF Header
An ELF file begins with an Ehdr structure, which is defined in /usr/include/elf.h. Here is the structure:
typedef struct
{
  unsigned char e_ident[EI_NIDENT]; /* magic number and other info */
  Elf32_Half    e_type;             /* object file type */
  Elf32_Half    e_machine;          /* architecture */
  Elf32_Word    e_version;          /* object file version */
  Elf32_Addr    e_entry;            /* entry point virtual address */
  Elf32_Off     e_phoff;            /* program header table file offset */
  Elf32_Off     e_shoff;            /* section header table file offset */
  Elf32_Word    e_flags;            /* processor-specific flags */
  Elf32_Half    e_ehsize;           /* ELF header size in bytes */
  Elf32_Half    e_phentsize;        /* size of each program header */
  Elf32_Half    e_phnum;            /* number of program headers */
  Elf32_Half    e_shentsize;        /* size of each section header */
  Elf32_Half    e_shnum;            /* number of section headers */
  Elf32_Half    e_shstrndx;         /* section header string table index */
} Elf32_Ehdr;
Elf32_Half is two bytes (16 bits); the other Elf32_* types used here are four bytes (32 bits) each, and EI_NIDENT is 16.
Adding up the fields gives sizeof(Elf32_Ehdr) = 16 + 8*2 + 5*4 = 52 = 0x34.
Let's take a look at the 52-byte content of the elf8 header:
[netconf@linux1 elf]$ hexdump -s 0 -n 52 -C elf8
00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 03 00 01 00 00 00  78 82 04 08 34 00 00 00  |........x...4...|
00000020  78 21 00 00 00 00 00 00  34 00 20 00 06 00 28 00  |x!......4. ...(.|
00000030  22 00 1f 00                                       |"...|
OK, let's go through the structure field by field:
e_ident[EI_NIDENT]: 16 bytes: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
ELFMAG0      0x7f   e_ident[0]
ELFMAG1      'E'    e_ident[1]
ELFMAG2      'L'    e_ident[2]
ELFMAG3      'F'    e_ident[3]
ELFCLASS32   1      e_ident[4]
ELFDATA2LSB  1      e_ident[5]
EV_CURRENT   1      e_ident[6] (EI_VERSION)
The remaining bytes are all 0.
e_type: 2 bytes: 02 00, an executable file (ET_EXEC = 2)
e_machine: 2 bytes: 03 00, the architecture is Intel 80386 (EM_386 = 3)
e_version: 4 bytes: 01 00 00 00, same meaning as EI_VERSION in e_ident
e_entry: 4 bytes: 78 82 04 08, the program entry address, 0x08048278
e_phoff: 4 bytes: 34 00 00 00, the file offset at which the program header table starts (0x34)
e_shoff: 4 bytes: 78 21 00 00, the file offset at which the section header table starts (0x2178)
e_flags: 4 bytes: 00 00 00 00, no processor-specific flags
e_ehsize: 2 bytes: 34 00, the size of the ELF header, i.e. sizeof(Elf32_Ehdr)
e_phentsize: 2 bytes: 20 00, the size of each program header (0x20)
e_phnum: 2 bytes: 06 00, the number of program headers (6)
e_shentsize: 2 bytes: 28 00, the size of each section header (0x28)
e_shnum: 2 bytes: 22 00, the number of section headers (0x22 = 34)
e_shstrndx: 2 bytes: 1f 00, the index in the section header table of the entry for the section name string table (0x1f = 31); that entry gives the string table's position and size
Simple, isn't it? For comparison, here is how readelf analyzes the same header:
[netconf@linux1 elf]$ readelf -h elf8
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048278
  Start of program headers:          52 (bytes into file)
  Start of section headers:          8568 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         6
  Size of section headers:           40 (bytes)
  Number of section headers:         34
  Section header string table index: 31
2. Program Header
Only the ELF header sits at a fixed position, the very start of the file. The location of everything else is read, directly or indirectly, from the ELF header.
Here is the definition of the program header structure in elf.h:
typedef struct
{
  Elf32_Word p_type;   /* segment type */
  Elf32_Off  p_offset; /* segment file offset */
  Elf32_Addr p_vaddr;  /* segment virtual address */
  Elf32_Addr p_paddr;  /* segment physical address */
  Elf32_Word p_filesz; /* segment size in file */
  Elf32_Word p_memsz;  /* segment size in memory */
  Elf32_Word p_flags;  /* segment flags */
  Elf32_Word p_align;  /* segment alignment */
} Elf32_Phdr;
Let's compute the size of this structure: eight 4-byte fields give sizeof(Elf32_Phdr) = 0x20, which matches e_phentsize in the ELF header above. The header also tells us that the file contains six such phdr entries and that the first one sits at file offset 0x34. We can read them like this:
fseek(fp, 0x34, SEEK_SET);
fread(buffer, 0x20, 6, fp);
This reads all six phdr entries into buffer.
Let's analyze the first entry:
[netconf@linux1 elf]$ hexdump -s 52 -n 32 -C elf8
00000034  06 00 00 00 34 00 00 00  34 80 04 08 34 80 04 08  |....4...4...4...|
00000044  c0 00 00 00 c0 00 00 00  05 00 00 00 04 00 00 00  |................|
Field by field:
p_type: 4 bytes: 06 00 00 00, the segment type is PT_PHDR (the entry that describes the program header table itself)
p_offset: 4 bytes: 34 00 00 00, the segment's offset in the file (0x34)
p_vaddr: 4 bytes: 34 80 04 08, virtual address: 0x08048034
p_paddr: 4 bytes: 34 80 04 08, physical address: 0x08048034
p_filesz: 4 bytes: c0 00 00 00, size of the segment in the file: 0xc0
p_memsz: 4 bytes: c0 00 00 00, size of the segment in memory: 0xc0
p_flags: 4 bytes: 05 00 00 00, segment flags: 0x5 = PF_R | PF_X (readable and executable)
p_align: 4 bytes: 04 00 00 00, segment alignment: 0x4
The other phdr entries have the same layout. Each one describes one segment of the ELF file: its type, location, size, and attributes.
Now let's look at the PT_INTERP entry:
[netconf@linux1 elf]$ hexdump -s 84 -n 32 -C elf8
00000054  03 00 00 00 f4 00 00 00  f4 80 04 08 f4 80 04 08  |................|
00000064  13 00 00 00 13 00 00 00  04 00 00 00 01 00 00 00  |................|
This segment holds the path of the program interpreter that the executable depends on. The path is stored at file offset 0xf4 and is 0x13 bytes long:
[netconf@linux1 elf]$ hexdump -s 0xf4 -n 19 -C elf8
000000f4  2f 6c 69 62 2f 6c 64 2d  6c 69 6e 75 78 2e 73 6f  |/lib/ld-linux.so|
00000104  2e 32 00                                          |.2.|
So this ELF file relies on /lib/ld-linux.so.2 as its interpreter (the dynamic linker).
We can use readelf to check:
[netconf@linux1 elf]$ readelf -l elf8
Elf file type is EXEC (Executable file)
Entry point 0x8048278
There are 6 program headers, starting at offset 52
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x000c0 0x000c0 R E 0x4
  INTERP         0x0000f4 0x080480f4 0x080480f4 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x00454 0x00454 R E 0x1000
  LOAD           0x000454 0x08049454 0x08049454 0x00104 0x00108 RW  0x1000
  DYNAMIC        0x000464 0x08049464 0x08049464 0x000c8 0x000c8 RW  0x4
  NOTE           0x000108 0x08048108 0x08048108 0x00020 0x00020 R   0x4
 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata
   03     .data .eh_frame .dynamic .ctors .dtors .jcr .got .bss
   04     .dynamic
   05     .note.ABI-tag
[netconf@linux1 elf]$