ELF file format
Although this article is titled "Extracting the features of so files", we first have to cover the ELF file format, because .o files, .so files, and executable files are all stored in ELF format. ELF stands for Executable and Linking Format. It is a binary format that was chosen as a portable standard for object files, and an ELF file is one of three types:
- A relocatable file (*.o) contains code and data suitable for linking with other object files to create an executable or a shared object file.
- An executable file contains a program suitable for execution; the file specifies how exec() creates the program's process image.
- A shared object file (*.so) contains code and data that can be linked in two contexts. First, the link editor can process it together with other relocatable and shared object files to create another object file. Second, the dynamic linker combines it with an executable file and other shared objects to create a process image.
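The three types above are recorded in the `e_type` field of the ELF header. The following is a minimal sketch of how that field could be read in Python; the function name `elf_file_type` is hypothetical, and the code only looks at the first 18 bytes of the file.

```python
import struct

# e_type values defined by the ELF specification.
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object"}


def elf_file_type(data):
    """Return the kind of ELF file described by the leading bytes in `data`."""
    if len(data) < 18 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
    endian = "<" if data[5] == 1 else ">"
    # e_type is the 16-bit field that follows the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", data, 16)
    return ELF_TYPES.get(e_type, "other")
```

Note that position-independent executables are also marked as shared objects in `e_type`, so this check alone cannot tell them apart from a real .so library.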
ELF data types
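As a rough reference, the 32-bit ELF base types can be written down as Python `struct` format characters. The table name `ELF32_TYPES` is an assumption made for this sketch; the sizes are those given by the ELF specification for the 32-bit class.

```python
import struct

# ELF32 base types mapped to (struct format character, size in bytes).
ELF32_TYPES = {
    "Elf32_Addr":  ("I", 4),  # unsigned program address
    "Elf32_Half":  ("H", 2),  # unsigned medium integer
    "Elf32_Off":   ("I", 4),  # unsigned file offset
    "Elf32_Sword": ("i", 4),  # signed large integer
    "Elf32_Word":  ("I", 4),  # unsigned large integer
}

for name, (fmt, size) in ELF32_TYPES.items():
    # "<" selects standard sizes, so the check does not depend on the platform.
    assert struct.calcsize("<" + fmt) == size
```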
The section header structure of an ELF file
typedef struct {
    Elf32_Word sh_name;      /* Section name: an index into the section header string table; the name there is a NULL-terminated string. */
    Elf32_Word sh_type;      /* Section type. */
    Elf32_Word sh_flags;     /* Section flags. */
    Elf32_Addr sh_addr;      /* If the section appears in the memory image of the process, the address of its first byte; otherwise 0. */
    Elf32_Off  sh_offset;    /* Offset from the beginning of the file to the first byte of the section. */
    Elf32_Word sh_size;      /* Length of the section in bytes. */
    Elf32_Word sh_link;      /* A section header table index link; its interpretation depends on the section type. */
    Elf32_Word sh_info;      /* Additional information; its interpretation depends on the section type. */
    Elf32_Word sh_addralign; /* Address alignment constraint, for sections that have one. */
    Elf32_Word sh_entsize;   /* For sections that hold fixed-size entries, such as a symbol table, the size in bytes of each entry. */
} Elf32_Shdr;
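Since every member of the structure above is a 32-bit value, one section header occupies 40 bytes. A minimal sketch of decoding it in Python follows; the names `Shdr`, `SHDR_FORMAT`, and `parse_shdr` are hypothetical, and little-endian byte order is assumed.

```python
import struct
from collections import namedtuple

# Field names follow the Elf32_Shdr members, in declaration order.
Shdr = namedtuple(
    "Shdr",
    "sh_name sh_type sh_flags sh_addr sh_offset sh_size "
    "sh_link sh_info sh_addralign sh_entsize",
)

SHDR_FORMAT = "<10I"  # ten 32-bit little-endian fields, 40 bytes in total


def parse_shdr(data, offset=0):
    """Decode one section header starting at `offset` in `data`."""
    return Shdr(*struct.unpack_from(SHDR_FORMAT, data, offset))
```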
Every time I look at an ELF file, the thing that comes to mind is a worm, because both are made up of one segment after another. (Perhaps the comparison is not exact.) An ELF file consists of a number of sections; some of them are reserved by the system, and some are defined by the user.
What we need to highlight here are a few special sections reserved by the system, because the features we extract later come from them:
- .dynamic: dynamic linking information.
- .dynsym: the dynamic linking symbol table.
- .got: the global offset table.
- .plt: the procedure linkage table.
- .text: the executable instructions of the program.
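To see which of these special sections a file actually contains, the section names can be resolved through the section header string table. The sketch below assumes a 32-bit little-endian ELF image held in memory; the function name `section_names` is hypothetical.

```python
import struct


def section_names(data):
    """Return the names of all sections in a 32-bit little-endian ELF image."""
    # Fields of the ELF header that follow the 16-byte e_ident array.
    (_type, _machine, _version, _entry, _phoff, e_shoff, _flags,
     _ehsize, _phentsize, _phnum, e_shentsize, e_shnum, e_shstrndx) = \
        struct.unpack_from("<HHIIIIIHHHHHH", data, 16)

    def shdr(index):
        # One Elf32_Shdr: ten 32-bit fields.
        return struct.unpack_from("<10I", data, e_shoff + index * e_shentsize)

    # sh_offset (field 4) of the string table section locates the name bytes.
    strtab_offset = shdr(e_shstrndx)[4]
    names = []
    for i in range(e_shnum):
        start = strtab_offset + shdr(i)[0]   # sh_name indexes into the table
        end = data.index(b"\x00", start)     # names are NULL-terminated
        names.append(data[start:end].decode("ascii"))
    return names
```

With the file contents read as bytes, a check such as `".dynsym" in section_names(data)` then tells whether the dynamic symbol table is present.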
If you want to learn more about this, you can refer to this article:
ELF file format summary
The readelf tool
readelf is a Linux command for analyzing ELF files, and it is very useful when parsing the ELF file format. We also use this tool when extracting features from so files.
You can download it here: http://download.csdn.net/detail/grace_0642/8562495
Here is a brief introduction to its usage:
1. Display the ELF file header information
readelf -h file
===================================
2. View the file's program header table information
readelf -l file
===================================
3. Display the section information of a file
readelf -S file
===================================
4. Display the dynamic section information
readelf -d file
===================================
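The output of `readelf -h` is a list of "Label: value" lines, which makes it easy to post-process. The sketch below turns such text into a dictionary; the function name `parse_readelf_header` is hypothetical, and it works on text that has already been captured, so it does not run readelf itself.

```python
def parse_readelf_header(text):
    """Turn the 'Label: value' lines printed by `readelf -h` into a dict."""
    fields = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        # Skip lines without a colon and headings such as "ELF Header:".
        if sep and value.strip():
            fields[label.strip()] = value.strip()
    return fields
```

The text itself could be obtained with, for example, `subprocess.check_output(["readelf", "-h", path], text=True)`, assuming readelf is on the PATH.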
What else you need to know
1. Usage of awk
You can refer to this well-written article: http://coolshell.cn/articles/9070.html
AWK Concise Tutorial
Code structure description:
The Python script invokes the shell script, and the shell script calls the readelf tool to read information from the so file. The readelf tool and the scripts are placed in the same directory.
Python script code:
'''
@Author: chicho
@date: 2014-12-5
@function: elf parser
@running: python elf_extract.py /path/to/so
'''
import os
import sys

if len(sys.argv) < 2:
    print("*usage:python elf_extract.py /path/to/so")
else:
    path = sys.argv[1]
    fileList = os.listdir(path)
    '''
    We'll put the readelf file in the path of so files
    so, we can extract the features of ELF
    '''
    for filename in fileList:
        portion = os.path.splitext(filename)
        # find the so files
        if portion[1] == ".so":
            # a space must separate the script name from its argument
            os.system("./moreelf_finefeatures_extract.sh " + os.path.join(path, filename))
    print("The End")
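As a possible variant of the driver above, the sketch below passes the file name to the shell script as a separate argument through `subprocess.run`, so that names containing spaces are not split by the shell. The helper names `find_so_files` and `extract_all` are hypothetical; the script name is the one used in this article.

```python
import os
import subprocess


def find_so_files(directory):
    """Return the sorted paths of all *.so files directly inside `directory`."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.splitext(name)[1] == ".so"
    )


def extract_all(directory, script="./moreelf_finefeatures_extract.sh"):
    """Run the feature extraction script once for every so file found."""
    for so_path in find_so_files(directory):
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run([script, so_path], check=True)
```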
Shell script:
#!/bin/bash
# Extract header and dynamic linking features from one so file with readelf.
INPUT=$1
if [ $# -lt 1 ]; then
    echo "Usage: $0 /path/to/libxxx.so"
    exit 1
fi
READELF=./readelf

# Fields of the ELF header (readelf -h)
entry_point_addr=$($READELF -h "$INPUT" | grep "Entry point address:" | egrep -o "0x[0-9a-zA-Z]*")
start_section_headers=$($READELF -h "$INPUT" | grep "Start of section headers:" | egrep -o "[0-9]*")
num_programs=$($READELF -h "$INPUT" | grep "Number of program headers:" | egrep -o "[0-9]*")
size_section_headers=$($READELF -h "$INPUT" | grep "Size of section headers:" | egrep -o "[0-9]*")
num_section_headers=$($READELF -h "$INPUT" | grep "Number of section headers:" | egrep -o "[0-9]*")
string_table_index=$($READELF -h "$INPUT" | grep "Section header string table index:" | egrep -o "[0-9]*")
# Entry counts of the dynamic section, the dynamic symbol table, and the relocation sections
dynamic_section=$($READELF -d "$INPUT" | grep "Dynamic section at" | egrep -o "[0-9]* entries" | egrep -o "[0-9]*")
dynsym_entries=$($READELF -s "$INPUT" | grep "Symbol table '.dynsym' contains" | egrep -o "[0-9]*")
num_rel_dyn=$($READELF -r "$INPUT" | grep "Relocation section '.rel.dyn' at" | egrep -o "[0-9]* entries" | egrep -o "[0-9]*")
num_rel_plt=$($READELF -r "$INPUT" | grep "Relocation section '.rel.plt' at" | egrep -o "[0-9]* entries" | egrep -o "[0-9]*")

echo $entry_point_addr $start_section_headers $num_programs $size_section_headers $num_section_headers $string_table_index $dynamic_section $dynsym_entries $num_rel_dyn $num_rel_plt >> more_finefeatures_result.txt
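The header fields that the shell script greps out of `readelf -h` can also be read straight from the file bytes. The following is only a sketch under the assumption of a 32-bit little-endian ELF file; the function name `header_features` is hypothetical, and the dictionary keys reuse the variable names of the shell script.

```python
import struct


def header_features(data):
    """Read the ELF header features of a 32-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Fields of the ELF header that follow the 16-byte e_ident array.
    (_type, _machine, _version, e_entry, _phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx) = \
        struct.unpack_from("<HHIIIIIHHHHHH", data, 16)
    return {
        "entry_point_addr": hex(e_entry),
        "start_section_headers": e_shoff,
        "num_programs": e_phnum,
        "size_section_headers": e_shentsize,
        "num_section_headers": e_shnum,
        "string_table_index": e_shstrndx,
    }
```

The counts taken from the dynamic section, the symbol table, and the relocation sections are not covered here; they would require walking the section headers as well.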
If we want to do something more, such as labeling the features according to which particular so file they were extracted from, the script can be extended like this:
#!/bin/bash
# Extract the features of one so file and label them by the library name.
INPUT=$1
if [ $# -lt 1 ]; then
    echo "Usage: $0 /path/to/libxxx.so"
    exit 1
fi
READELF=./readelf

# Fields of the ELF header (readelf -h)
entry_point_addr=$($READELF -h "$INPUT" | grep "Entry point address:" | egrep -o "0x[0-9a-zA-Z]*")
start_section_headers=$($READELF -h "$INPUT" | grep "Start of section headers:" | egrep -o "[0-9]*")
num_programs=$($READELF -h "$INPUT" | grep "Number of program headers:" | egrep -o "[0-9]*")
size_section_headers=$($READELF -h "$INPUT" | grep "Size of section headers:" | egrep -o "[0-9]*")
num_section_headers=$($READELF -h "$INPUT" | grep "Number of section headers:" | egrep -o "[0-9]*")
string_table_index=$($READELF -h "$INPUT" | grep "Section header string table index:" | egrep -o "[0-9]*")
# Entry counts of the dynamic section, the dynamic symbol table, and the relocation sections
dynamic_section=$($READELF -d "$INPUT" | grep "Dynamic section at" | egrep -o "[0-9]* entries" | egrep -o "[0-9]*")
dynsym_entries=$($READELF -s "$INPUT" | grep "Symbol table '.dynsym' contains" | egrep -o "[0-9]*")
num_rel_dyn=$($READELF -r "$INPUT" | grep "Relocation section '.rel.dyn' at" | egrep -o "[0-9]* entries" | egrep -o "[0-9]*")
num_rel_plt=$($READELF -r "$INPUT" | grep "Relocation section '.rel.plt' at" | egrep -o "[0-9]* entries" | egrep -o "[0-9]*")

# Choose a label from the library name; libexecmain must be tested before libexec.
if [[ "$INPUT" == *libsecmain* ]]; then
    label="Bangcle1"
elif [[ "$INPUT" == *libsecexe* ]]; then
    label="Bangcle2"
elif [[ "$INPUT" == *libtup* ]]; then
    label="Tencent"
elif [[ "$INPUT" == *libprotectclass* ]]; then
    label="Qihoo"
elif [[ "$INPUT" == *libexecmain* ]]; then
    label="ijiami1"
elif [[ "$INPUT" == *libexec* ]]; then
    label="ijiami2"
elif [[ "$INPUT" == *libapkprotect* ]]; then
    label="APKProtect1"
elif [[ "$INPUT" == *libcube-jni* ]]; then
    label="APKProtect2"
elif [[ "$INPUT" == *libminimapv320* ]]; then
    label="APKProtect3"
elif [[ "$INPUT" == *libswiperctrl* ]]; then
    label="APKProtect4"
else
    label="Unknow"
fi

echo $entry_point_addr $start_section_headers $num_programs $size_section_headers $num_section_headers $string_table_index $dynamic_section $dynsym_entries $num_rel_dyn $num_rel_plt $label >> more_finefeatures_result.txt
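The chain of name tests in the script can also be expressed as a table, which is easier to extend. Below is a sketch in Python with the same library names and the same order of tests; the names `LABEL_RULES` and `label_for` are hypothetical, and the spelling of the labels is an assumption based on the script above.

```python
# (substring of the library name, label); the first match wins, so
# "libexecmain" has to come before the shorter "libexec".
LABEL_RULES = [
    ("libsecmain", "Bangcle1"),
    ("libsecexe", "Bangcle2"),
    ("libtup", "Tencent"),
    ("libprotectclass", "Qihoo"),
    ("libexecmain", "ijiami1"),
    ("libexec", "ijiami2"),
    ("libapkprotect", "APKProtect1"),
    ("libcube-jni", "APKProtect2"),
    ("libminimapv320", "APKProtect3"),
    ("libswiperctrl", "APKProtect4"),
]


def label_for(so_path):
    """Return the label of the first rule whose substring occurs in `so_path`."""
    for needle, label in LABEL_RULES:
        if needle in so_path:
            return label
    return "Unknow"  # same fallback label as the shell script
```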