Analysis of executable file formats on Linux/Unix platforms

Source: Internet
Author: User

Introduction: This article discusses three main executable file formats on Unix/Linux platforms: a.out (assembler and link editor output), COFF (Common Object File Format), and ELF (Executable and Linking Format). It first reviews executable file formats in general and uses the ELF loading process to show how the contents of an executable relate to the loading operation. It then discusses the three formats in turn, focusing on the dynamic linking mechanism of ELF files and weighing the strengths and weaknesses of each format along the way. Finally, it gives a brief summary of the three formats and some of the author's thoughts on evaluating executable file formats.

Overview of executable file formats

Executables are arguably the most important file type in an operating system, because they are what actually carries out operations. An executable's size, speed, resource usage, extensibility, and portability are closely tied to the definition of its file format and to the loading process. Studying executable file formats is useful both for writing high-performance programs and for applying certain hacker techniques.

Whatever the format, some basic elements are required. Obviously, the file must contain code and data. Because a file may reference symbols (variables and functions) defined in other files, relocation information and symbol information are also needed. Some auxiliary information, such as debugging information and hardware information, is optional. Essentially every executable file format stores this information in ranges called segments or sections. The exact meaning of "segment" and "section" differs subtly between file formats, but it is clear from context, so this is not a critical issue. Finally, an executable file usually has a file header that describes the overall structure of the file.
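As a small illustration of the file header, the sketch below checks whether a file begins with the ELF identification bytes. It assumes a Linux system that provides <elf.h>; the function name is_elf_file is hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <elf.h>     /* EI_NIDENT, ELFMAG, SELFMAG */
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the file at `path` starts with the
 * ELF magic bytes, 0 if it does not, and -1 if the file cannot be opened.
 */
int is_elf_file(const char *path)
{
    unsigned char ident[EI_NIDENT];
    size_t n;
    FILE *fp;

    assert(path != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* The first EI_NIDENT bytes of an ELF file identify the format. */
    n = fread(ident, 1, sizeof ident, fp);
    fclose(fp);

    if (n != sizeof ident)
        return 0;               /* too short to hold an ELF header */

    return memcmp(ident, ELFMAG, SELFMAG) == 0;
}
```

Calling is_elf_file("/proc/self/exe") from a running Linux program inspects that program's own executable.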

Three concepts are important with respect to executables: compiling, linking (also called connecting or joining), and loading. Source files are compiled into object files, multiple object files are linked into a final executable, and the executable is loaded into memory to run. Because this article focuses on executable file formats, the loading process receives particular attention. The following is a brief description of how an ELF file is loaded on Linux.

1: The kernel first reads the ELF file header, then reads the other data structures that the header points to, finds the segments marked as loadable, and calls mmap() to map their contents into memory. Before loading, the kernel passes each segment's flags directly to mmap(); the flags indicate whether the segment is readable, writable, and executable in memory. Typically, the text segment is read-only and executable, and the data segment is readable and writable. This approach takes advantage of the memory protection provided by modern operating systems and processors. The well-known technique of writing shellcode (reference 17) is a practical example of getting around this protection.

2: The kernel examines the segment marked PT_INTERP in the ELF file, which names the dynamic linker, and loads that dynamic linker. On modern Linux systems the dynamic linker is usually /lib/ld-linux.so.2 (on 32-bit x86); the details are described later.

3: The kernel places a set of tag-value pairs on the stack of the new process to direct the operation of the dynamic linker.

4: The kernel transfers control to the dynamic linker.

5: The dynamic linker checks the program's dependencies on external files (shared libraries) and loads them as needed.

6: The dynamic linker relocates the program's external references. In plain terms, it tells the program the addresses of the external variables and functions it refers to; these addresses lie in the memory range where the shared library is loaded. Dynamic linking also supports lazy binding: a reference is relocated only when the symbol is actually needed, which helps program efficiency considerably.

7: The dynamic linker executes the code in the .init section of the ELF file to initialize the program. On early systems the initialization code was the function _init(void), whose name was fixed; on modern systems the corresponding form is

void
__attribute__((constructor))
init_function(void)
{
    /* ... */
}

Here the function name is arbitrary.
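Steps 1 to 3 can be observed from user space. The following is a minimal sketch, assuming a Linux system with glibc or musl (for <link.h> and getauxval()) and a file of the machine's native ELF class and byte order. It counts the loadable segments, prints their permission flags, and extracts the dynamic linker path stored in PT_INTERP. The names scan_program_headers and kernel_reported_phnum are hypothetical, and the sketch reads the file directly rather than reproducing what the kernel does internally.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>      /* ElfW(): picks Elf32_* or Elf64_* for this machine */
#include <stdio.h>
#include <string.h>
#include <sys/auxv.h>  /* getauxval() */

/*
 * Hypothetical helper: counts the PT_LOAD program headers in `path` and
 * copies the PT_INTERP string into `interp` (left empty if the file has
 * no PT_INTERP entry). Returns -1 on any error.
 */
int scan_program_headers(const char *path, char *interp, size_t interp_len)
{
    ElfW(Ehdr) eh;
    ElfW(Phdr) ph;
    int loadable = 0;
    int ok = 1;
    FILE *fp;

    assert(path != NULL && interp != NULL && interp_len > 0);
    interp[0] = '\0';

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* Validate the header: magic bytes, native class, expected entry size. */
    if (fread(&eh, sizeof eh, 1, fp) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32) ||
        eh.e_phentsize != sizeof ph)
        ok = 0;

    for (unsigned i = 0; ok && i < eh.e_phnum; i++) {
        long off = (long)(eh.e_phoff + (unsigned long)i * eh.e_phentsize);

        if (fseek(fp, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, fp) != 1) {
            ok = 0;
        } else if (ph.p_type == PT_LOAD) {
            /* The flags say how the segment should be protected in memory. */
            loadable++;
            printf("LOAD %c%c%c vaddr=0x%lx\n",
                   (ph.p_flags & PF_R) ? 'r' : '-',
                   (ph.p_flags & PF_W) ? 'w' : '-',
                   (ph.p_flags & PF_X) ? 'x' : '-',
                   (unsigned long)ph.p_vaddr);
        } else if (ph.p_type == PT_INTERP) {
            /* The segment contents are the path of the dynamic linker. */
            size_t n = ph.p_filesz < interp_len - 1
                           ? (size_t)ph.p_filesz : interp_len - 1;

            if (fseek(fp, (long)ph.p_offset, SEEK_SET) != 0 ||
                fread(interp, 1, n, fp) != n)
                ok = 0;
            else
                interp[n] = '\0';
        }
    }

    fclose(fp);
    return ok ? loadable : -1;
}

/* One of the tag-value pairs from step 3: the number of program headers. */
unsigned long kernel_reported_phnum(void)
{
    return getauxval(AT_PHNUM);
}
```

Running scan_program_headers("/proc/self/exe", buf, sizeof buf) on a dynamically linked program should leave the dynamic linker path in buf; for a statically linked program buf stays empty.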
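To see the initialization in step 7 in action, the sketch below records that a constructor function has run. It assumes GCC or Clang, which support the constructor attribute; the flag init_ran and the helpers constructor_has_run and require_initialized are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Set by the constructor; still 0 if initialization never happened. */
static int init_ran = 0;

/* Runs during program initialization, before main() is entered. */
__attribute__((constructor))
static void init_function(void)
{
    init_ran = 1;
}

/* Reports whether the constructor has already executed. */
int constructor_has_run(void)
{
    return init_ran;
}

/* Aborts if code that depends on the initialization runs too early. */
void require_initialized(void)
{
    assert(init_ran && "constructor should have run before main()");
}
```

Calling constructor_has_run() at the very start of main() is enough to check that the initialization code ran first.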
