One. The ELF file format
ELF (Executable and Linking Format) is the common object file format on x86 Linux systems. There are three main types of ELF file:
- A relocatable file holds code and data that can be linked with other object files to create an executable or a shared object file.
- An executable file holds a program suitable for execution; it describes how the process image is created when the program is loaded into memory.
- A shared object file can be linked by the linker with other relocatable files and shared object files to create another object file.
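The three types are recorded in the e_type field of the ELF header. As a minimal sketch, a file's kind can be read from its first 18 bytes. The helper name elf_kind is invented for illustration; the byte offsets are those of the standard ELF header layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a kernel or libc function: describes the ELF
 * file type found in the first bytes of a file. */
const char *elf_kind(const unsigned char *buf, size_t len)
{
    uint16_t e_type;

    /* e_ident is 16 bytes; e_type is the 2-byte field that follows it. */
    if (len < 18 || memcmp(buf, "\177ELF", 4) != 0)
        return "not ELF";

    /* e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big. */
    if (buf[5] == 1)
        e_type = (uint16_t)(buf[16] | (buf[17] << 8));
    else if (buf[5] == 2)
        e_type = (uint16_t)((buf[16] << 8) | buf[17]);
    else
        return "unknown byte order";

    switch (e_type) {
    case 1:  return "relocatable";   /* ET_REL  */
    case 2:  return "executable";    /* ET_EXEC */
    case 3:  return "shared object"; /* ET_DYN  */
    default: return "other";
    }
}
```

Passing the first bytes of a file read with fread() is enough; the function never looks past byte 17.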
File format
The ELF header sits at the beginning of the file and describes the organization of the entire file. Sections hold the object file's information for linking. The program header table tells the system how to create the process image and contains one entry per program header. The section header table contains one entry per section, giving information such as the section's name and size.
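To make the layout concrete, here is a sketch that pulls the location of both tables out of a 32-bit little-endian ELF header. The struct and function names are invented for this example; the field offsets follow the standard 52-byte ELF32 header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative summary of where the two tables live (names are made up). */
struct elf32_layout {
    uint32_t entry;  /* e_entry: virtual address of the entry point */
    uint32_t phoff;  /* e_phoff: file offset of the program header table */
    uint16_t phnum;  /* e_phnum: number of program header entries */
    uint32_t shoff;  /* e_shoff: file offset of the section header table */
    uint16_t shnum;  /* e_shnum: number of section header entries */
};

static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf is not a 32-bit little-endian ELF header. */
int elf32_read_layout(const unsigned char *buf, size_t len,
                      struct elf32_layout *out)
{
    if (len < 52 || memcmp(buf, "\177ELF", 4) != 0)
        return -1;
    if (buf[4] != 1 || buf[5] != 1)  /* ELFCLASS32, ELFDATA2LSB */
        return -1;

    out->entry = le32(buf + 24);
    out->phoff = le32(buf + 28);
    out->shoff = le32(buf + 32);
    out->phnum = le16(buf + 44);
    out->shnum = le16(buf + 48);
    return 0;
}
```

Decoding the fields byte by byte keeps the sketch independent of the host's byte order and of any system header.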
Two. The loading process of ELF files
From the point of view of compiling, linking, and running, there are two ways to link an application with library code. The first is static linking, which is fixed at build time: the object code of the library functions that the application needs is extracted from the library and linked into the application's image. The second is dynamic linking: the library code is not copied into the application's image. Instead, the library image is shipped to the user alongside the application and is loaded into user space when the application starts.
The Linux kernel supports both statically linked and dynamically linked ELF images. Loading and starting the image must be done by the kernel, while dynamic linking can be implemented either in the kernel or in user space.
Kernel-space loading process
The function that actually carries out the execve() system call in the kernel is do_execve(). It first opens the target image file and reads the beginning of the file (which contains the ELF header), then calls search_binary_handler(). That function walks the list of executable file formats that Linux supports, looking for a handler that matches the file. When a format matches, the handler pointed to by its load_binary function pointer is called to process the image file. For ELF, that handler is load_elf_binary().
The kernel keeps one struct linux_binfmt for each type of executable that it supports. It is defined as follows:
struct linux_binfmt {
    struct linux_binfmt *next;
    struct module *module;
    int (*load_binary)(struct linux_binprm *, struct pt_regs *regs);
    int (*load_shlib)(struct file *);
    int (*core_dump)(long signr, struct pt_regs *regs, struct file *file);
    unsigned long min_coredump;
    int hasvdso;
};
Here the load_binary function pointer points to the handler that loads and executes a program of that format.
The ELF format is registered with the following definition:
static struct linux_binfmt elf_format = {
    .module       = THIS_MODULE,
    .load_binary  = load_elf_binary,
    .load_shlib   = load_elf_library,
    .core_dump    = elf_core_dump,
    .min_coredump = ELF_EXEC_PAGESIZE,
    .hasvdso      = 1
};
search_binary_handler() looks for the module that can parse the file's format, as follows:
...
list_for_each_entry(fmt, &formats, lh) {
    if (!try_module_get(fmt->module))
        continue;
    read_unlock(&binfmt_lock);
    bprm->recursion_depth++;
    retval = fmt->load_binary(bprm);
    read_lock(&binfmt_lock);
    put_binfmt(fmt);
    bprm->recursion_depth--;
    if (retval < 0 && !bprm->mm) {
        /* we got to flush_old_exec() and failed after it */
        read_unlock(&binfmt_lock);
        force_sigsegv(SIGSEGV, current);
        return retval;
    }
    if (retval != -ENOEXEC || !bprm->file) {
        read_unlock(&binfmt_lock);
        return retval;
    }
}
...
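The dispatch idea can be approximated in user space. The sketch below is not kernel code: the struct names, the two toy handlers, and find_handler are all invented for illustration. It keeps only the core rule that a handler returns -ENOEXEC to say "not my format", so the search moves on to the next one.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures (hypothetical names). */
struct image {
    const unsigned char *buf;  /* first bytes of the file */
    size_t len;
};

struct format {
    const char *name;
    int (*load_binary)(const struct image *img);
};

/* Toy handler: accepts files that start with "#!". */
static int load_script(const struct image *img)
{
    if (img->len < 2 || memcmp(img->buf, "#!", 2) != 0)
        return -ENOEXEC;
    return 0;
}

/* Toy handler: accepts files that start with the ELF magic number. */
static int load_elf(const struct image *img)
{
    if (img->len < 4 || memcmp(img->buf, "\177ELF", 4) != 0)
        return -ENOEXEC;
    return 0;
}

static const struct format formats[] = {
    { "script", load_script },
    { "elf",    load_elf },
};

/* Try each registered format in turn. The first handler that does not
 * answer -ENOEXEC decides the result; its name is stored in *matched. */
int find_handler(const struct image *img, const char **matched)
{
    size_t i;

    for (i = 0; i < sizeof formats / sizeof formats[0]; i++) {
        int ret = formats[i].load_binary(img);
        if (ret != -ENOEXEC) {
            *matched = formats[i].name;
            return ret;
        }
    }
    *matched = NULL;
    return -ENOEXEC;
}
```

The function-pointer table plays the role of the kernel's format list; locking, module reference counting, and recursion tracking are deliberately left out.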
The load_elf_binary() function is mainly concerned with parsing the ELF file.
elf_ppnt = elf_phdata;
...
for (i = 0; i < loc->elf_ex.e_phnum; i++) {
    if (elf_ppnt->p_type == PT_INTERP) {
        ...
        elf_interpreter = kmalloc(elf_ppnt->p_filesz, GFP_KERNEL);
        ...
        retval = kernel_read(bprm->file, elf_ppnt->p_offset,
                             elf_interpreter,
                             elf_ppnt->p_filesz);
        ...
        interpreter = open_exec(elf_interpreter);
        ...
        retval = kernel_read(interpreter, 0, bprm->buf,
                             BINPRM_BUF_SIZE);
        ...
        /* Get the exec headers */
        ...
        loc->interp_elf_ex = *((struct elfhdr *)bprm->buf);
        break;
    }
    elf_ppnt++;
}
The purpose of this for loop is to find and process the "interpreter" segment of the target image. The segment has type PT_INTERP. Once it is found, its entire content is read into a buffer using the segment's file offset p_offset and size p_filesz. The content is just a string naming the interpreter, such as "/lib/ld-linux.so.2". The interpreter file is then opened with open_exec().
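The same search can be tried on a file image held in memory. This is a user-space sketch, assuming a 32-bit little-endian ELF file; elf32_find_interp is a made-up name, and the offsets are those of the standard ELF32 header and 32-byte program header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the interpreter path stored in the
 * PT_INTERP segment, or NULL if there is none or the file is malformed. */
const char *elf32_find_interp(const unsigned char *file, size_t len)
{
    uint32_t phoff, i;
    uint16_t phentsize, phnum;

    if (len < 52 || memcmp(file, "\177ELF", 4) != 0 ||
        file[4] != 1 || file[5] != 1)
        return NULL;

    phoff = le32(file + 28);      /* e_phoff */
    phentsize = le16(file + 42);  /* e_phentsize */
    phnum = le16(file + 44);      /* e_phnum */
    if (phentsize < 32)
        return NULL;

    for (i = 0; i < phnum; i++) {
        uint64_t off = (uint64_t)phoff + (uint64_t)i * phentsize;
        const unsigned char *ph;
        uint32_t p_offset, p_filesz;

        if (off > len || len - off < 32)
            return NULL;
        ph = file + off;
        if (le32(ph) != 3)        /* p_type != PT_INTERP */
            continue;

        p_offset = le32(ph + 4);
        p_filesz = le32(ph + 16);
        if (p_filesz == 0 || p_offset > len || len - p_offset < p_filesz)
            return NULL;
        /* The segment must hold a NUL-terminated string. */
        if (file[(size_t)p_offset + p_filesz - 1] != '\0')
            return NULL;
        return (const char *)(file + p_offset);
    }
    return NULL;
}
```

Unlike the kernel excerpt, the sketch returns a pointer into the caller's buffer instead of copying the string, which keeps it short but ties the result's lifetime to that buffer.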
for (i = 0, elf_ppnt = elf_phdata;
     i < loc->elf_ex.e_phnum; i++, elf_ppnt++) {
    ...
    if (elf_ppnt->p_type != PT_LOAD)
        continue;
    ...
    error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt,
                    elf_prot, elf_flags);
    ...
}
This loop determines the load address of each PT_LOAD segment and then uses elf_map() to establish a mapping between a range of the user-space virtual address space and a contiguous range of the target image file. The return value of elf_map() is the starting address of the actual mapping.
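Because mappings work in whole pages, the address, file offset, and length of each segment have to be rounded before the mapping is made. The sketch below shows that arithmetic only. It assumes a 4096-byte page, a page-aligned load bias, and a segment whose p_offset and p_vaddr agree modulo the page size; the struct and function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SZ 4096u  /* assumed page size for this sketch */

struct map_request {
    uint32_t addr;    /* page-aligned start address of the mapping */
    uint32_t offset;  /* page-aligned file offset to map from */
    uint32_t size;    /* length of the mapping, a whole number of pages */
};

/* Work out the page-granular mapping that covers one PT_LOAD segment. */
struct map_request plan_segment_map(uint32_t load_bias, uint32_t p_vaddr,
                                    uint32_t p_offset, uint32_t p_filesz)
{
    uint32_t in_page = p_vaddr & (PAGE_SZ - 1);  /* offset inside its page */
    struct map_request r;

    r.addr = (load_bias + p_vaddr) & ~(PAGE_SZ - 1);  /* round down */
    r.offset = p_offset - in_page;                    /* round down */
    r.size = (p_filesz + in_page + PAGE_SZ - 1) & ~(PAGE_SZ - 1); /* round up */
    return r;
}
```

For example, a segment with p_vaddr 0x08049f08, p_offset 0xf08, and p_filesz 0x100 straddles a page boundary, so the request covers two pages starting at 0x08049000.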
if (elf_interpreter) {
    ...
    elf_entry = load_elf_interp(&loc->interp_elf_ex,
                                interpreter,
                                &interp_load_addr);
    ...
} else {
    elf_entry = loc->elf_ex.e_entry;
    ...
}
With dynamic linking, the interpreter has to be loaded as well. Its image is loaded by load_elf_interp(), which returns the entry address of the interpreter image. With static linking no interpreter is needed, and the entry address is the entry address of the target image itself.
create_elf_tables(bprm, &loc->elf_ex,
                  (interpreter_type == INTERPRETER_AOUT),
                  load_addr, interp_load_addr);
...
Before loading is complete, some information about the target image and the interpreter must be prepared for the user-space code that is about to start, such as the usual argc and envc. This information is copied to user space so that it is already on the user-space stack when control reaches the entry point of the interpreter or the target image. That is the role of create_elf_tables().
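The order of the words that a new program finds on its stack can be shown with a small model. This is a sketch of the conventional layout (argc, the argv pointers, a NULL, the environment pointers, a NULL, then the auxiliary vector ending in an AT_NULL pair), not the kernel's implementation. The function and struct names are invented, and host pointers stand in for user-space addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* One auxiliary vector entry: a type tag and its value. */
struct aux_pair {
    uintptr_t type;
    uintptr_t value;
};

/* Fill out[] with the initial stack words, lowest address first.
 * Returns the number of words written, or 0 if cap is too small. */
size_t build_initial_stack(uintptr_t *out, size_t cap,
                           int argc, char *const argv[],
                           int envc, char *const envp[],
                           const struct aux_pair *aux, size_t auxc)
{
    size_t need = 1 + (size_t)argc + 1 + (size_t)envc + 1 + 2 * (auxc + 1);
    size_t n = 0, i;

    if (cap < need)
        return 0;

    out[n++] = (uintptr_t)argc;
    for (i = 0; i < (size_t)argc; i++)
        out[n++] = (uintptr_t)argv[i];
    out[n++] = 0;                      /* end of argv */

    for (i = 0; i < (size_t)envc; i++)
        out[n++] = (uintptr_t)envp[i];
    out[n++] = 0;                      /* end of envp */

    for (i = 0; i < auxc; i++) {
        out[n++] = aux[i].type;
        out[n++] = aux[i].value;
    }
    out[n++] = 0;                      /* AT_NULL terminator: type */
    out[n++] = 0;                      /* AT_NULL terminator: value */
    return n;
}
```

With two arguments, one environment string, and two auxiliary entries, the model produces 12 words: 1 + 2 + 1 + 1 + 1 + 6.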
Finally, the start_thread() macro sets EIP and ESP to their new values, so that the CPU continues at the new entry address when it returns to user space.
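The final hand-off amounts to choosing an entry address and overwriting the saved user-mode registers. The toy model below illustrates only that idea; struct user_regs, pick_entry, and start_user_thread are invented names, and the real macro does more than set two registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the saved user-mode register state. */
struct user_regs {
    uint32_t eip;  /* instruction pointer restored on return to user space */
    uint32_t esp;  /* stack pointer restored on return to user space */
};

/* Dynamically linked programs start in the interpreter; statically
 * linked programs start at their own entry point. */
uint32_t pick_entry(int has_interp, uint32_t interp_entry,
                    uint32_t program_entry)
{
    return has_interp ? interp_entry : program_entry;
}

/* Redirect the saved registers so that execution resumes at new_eip
 * with the stack prepared at new_esp. */
void start_user_thread(struct user_regs *regs, uint32_t new_eip,
                       uint32_t new_esp)
{
    regs->eip = new_eip;
    regs->esp = new_esp;
}
```

Nothing jumps anywhere inside these functions: the switch takes effect only when the saved registers are restored on the way back to user space.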
Three. Summary of the ELF loading and linking experiment
The user runs a program from the shell, and the shell enters the kernel through the execve() system call. After a series of steps, sys_execve() loads the user program and the ELF interpreter into memory through the ELF handler load_elf_binary() and hands control to the interpreter. The ELF interpreter loads the required libraries and finally hands control to the user program.
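The user-space half of this path can be reproduced in a few lines. The sketch below assumes a POSIX system and uses a made-up function name, run_program. It does what a shell does for a simple command: fork a child, call execve() in it, and wait for the result.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run the program at path with the given argument vector.
 * Returns its exit status, or -1 if it could not be run or did not
 * exit normally. */
int run_program(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: on success execve() does not return, because the
         * kernel replaces this process image with the new program. */
        execve(path, argv, environ);
        _exit(127);  /* reached only if execve() failed */
    }

    /* Parent: wait for the child, as a shell does for a foreground job. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Everything described in section Two happens inside the single execve() call in the child.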
Linux Kernel Analysis, Week 8 assignment: how Linux loads and starts an executable program