Problem Introduction:
Linux ELF files are hard to understand at first. Some people turn to the book "Linkers and Loaders". It is a really good book, but it leaves many details unexplained, especially from the assembly language perspective. After reading it there were still many places I did not understand, so I went on to read the assembler design document for the IBM 360, which describes the design of a two-pass assembler and the concept of relocation in detail. Even that did not resolve my doubts, because what bothered me was one particular option, ld -Ttext=org. I did not understand how this org offset affects the program. At first I thought the option only changed information in the ELF file header, but I was wrong: it involves a fairly complex mechanism. So I ran a lot of experiments and finally arrived at the view presented here. I call it a personal view because some details of my understanding may still be inaccurate, and I hope readers will point them out to me. Thanks in advance. I would be honored if this article is of some help to you.
First, let's take a look at a simple assembly program:
    .section .data
    a:
        .int 2222
    .section .text
    .globl _start
    _start:
        mov a, %eax
        mov $1, %eax
        mov $250, %ebx
        int $0x80
This program is very simple and is, of course, written in AT&T syntax. It loads 2222 into eax, then exits with exit code 250.
Let's assemble it.
    as -o testelf.o testelf.s
Next comes linking. To make the result easier to read we pass a few ld options: -s and -X strip unnecessary symbol information, because this post focuses on the program entry point and relocation.
Okay, run the command.
    ld -o testelf testelf.o -s -X
Below we use
    readelf -a testelf
to look at the details of the ELF executable. (For brevity, I copy only the necessary parts of the output.)
    Entry point address: 0x8048074
    Program Headers:
      Type  Offset    VirtAddr    PhysAddr    FileSiz  MemSiz   Flg  Align
      LOAD  0x000000  0x08048000  0x08048000  0x00085  0x00085  R E  0x1000
      LOAD  0x000088  0x08049088  0x08049088  0x00004  0x00004  RW   0x1000
    Section to Segment mapping:
      Segment Sections...
       xx     .text
              .data
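The entry address 0x8048074 and the FileSiz of 0x85 can be checked by hand. The sketch below is a quick arithmetic check, assuming the standard 32-bit ELF structure sizes (52-byte file header, 32-byte program header entry) and the usual IA-32 encodings of the four instructions. The variable names are made up for this illustration.

```python
# Structure sizes defined by the 32-bit ELF format.
ELF32_EHDR_SIZE = 52   # sizeof(Elf32_Ehdr)
ELF32_PHDR_SIZE = 32   # sizeof(Elf32_Phdr)

base = 0x08048000      # VirtAddr of the first LOAD segment in the output above
num_phdrs = 2          # two LOAD entries in the output above

# The file header and program headers sit at the start of the first segment.
headers_size = ELF32_EHDR_SIZE + num_phdrs * ELF32_PHDR_SIZE
entry = base + headers_size

# Encoded lengths of the four instructions (IA-32):
#   mov a,%eax     -> A1 + 4-byte address   = 5 bytes
#   mov $1,%eax    -> B8 + 4-byte immediate = 5 bytes
#   mov $250,%ebx  -> BB + 4-byte immediate = 5 bytes
#   int $0x80      -> CD 80                 = 2 bytes
code_size = 5 + 5 + 5 + 2

text_segment_filesiz = headers_size + code_size

print(hex(headers_size))          # 0x74
print(hex(entry))                 # 0x8048074
print(hex(text_segment_filesiz))  # 0x85
```

So in the default link the code simply follows the headers, and the first LOAD segment covers both.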
Here is what the fields mean. I quote the original English instead of translating, since that is more accurate:
Offset: This member gives the offset from the beginning of the file at which the first byte of the segment resides.
VirtAddr: This member gives the virtual address at which the first byte of the segment resides in memory.
PhysAddr: On systems for which physical addressing is relevant, this member is reserved for the segment's physical address. Because System V ignores physical addressing for application programs, this member has unspecified contents for executable files and shared objects.
FileSiz: This member gives the number of bytes in the file image of the segment; it may be zero.
MemSiz: This member gives the number of bytes in the memory image of the segment; it may be zero.
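To see these fields without readelf, they can be read straight out of the file. The following is a minimal sketch, assuming a little-endian 32-bit ELF file like the one built here; the function name parse_elf32_phdrs is made up for this example, and the field names follow the Elf32_Ehdr and Elf32_Phdr structures.

```python
import struct

def parse_elf32_phdrs(data: bytes):
    """Return (entry, [program header dict, ...]) for a 32-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")

    # Elf32_Ehdr fields that follow the 16-byte e_ident array.
    (e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
     e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = struct.unpack_from("<HHIIIIIHHHHHH", data, 16)

    names = ("p_type", "p_offset", "p_vaddr", "p_paddr",
             "p_filesz", "p_memsz", "p_flags", "p_align")
    phdrs = []
    for i in range(e_phnum):
        # Each Elf32_Phdr is eight 32-bit words.
        fields = struct.unpack_from("<8I", data, e_phoff + i * e_phentsize)
        phdrs.append(dict(zip(names, fields)))
    return e_entry, phdrs
```

Calling it on the bytes of testelf (read in binary mode) should give the same Offset, VirtAddr, PhysAddr, FileSiz and MemSiz values that readelf prints.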
(Writing everything out in detail, hex dumps included, would be too much trouble. These are my own notes, so please bear with me.)
In short, -Ttext affects the value of VirtAddr. With -Ttext 0x22, VirtAddr becomes 0x22 (the two are not always exactly equal, as described below), but the entry point is 0x24 because of alignment.
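The step from 0x22 to 0x24 is ordinary round-up-to-alignment arithmetic. A minimal sketch, assuming the .text section is 4-byte aligned (an assumption that matches the numbers in this post, not something read from the section headers); align_up is a helper name made up for this example.

```python
def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (alignment must be a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)

print(hex(align_up(0x22, 4)))  # 0x24
print(hex(align_up(0x24, 4)))  # 0x24 (already aligned, unchanged)
```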
With -Ttext 0x22 you can see page alignment at work: the ELF file is laid out in 4 KB pages. In the first 4 KB, apart from the file header and program headers at the start, everything is zero; this padding fills the gap between page 0 and page 1. ld sees that you want the text section at offset 0x22 of page 0 in the file, which would overwrite the file header, so it moves the text to page 1, and that leaves the gap. Within page 1 the text is still at offset 0x22.
    Entry point address: 0x24
    Program Headers:
      Type  Offset    VirtAddr    PhysAddr    FileSiz  MemSiz   Flg  Align
      LOAD  0x001022  0x00000022  0x00000022  0x00013  0x00013  R E  0x1000
      LOAD  0x001038  0x00001038  0x00001038  0x00004  0x00004  RW   0x1000
See, this is the change. Originally the text was loaded together with the file header, so the segment started at file offset 0. Now only the text is loaded, so the segment starts at offset 0x22 of page 1. For alignment, the real code is pushed back by two bytes, so the entry point is 0x24. File bytes 0x001022 to 0x001034 are loaded at memory address 0x00000022, and the code entry is exactly 0x00000024.
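One way to read the offset 0x001022: a segment's file offset and its virtual address have to agree modulo the page size, so that the file can be mapped page by page. The sketch below is a rough model that reproduces the numbers seen so far; it is not ld's actual algorithm. The name place_text and the headers size of 0x74 (52-byte ELF header plus two 32-byte program headers) are assumptions of this example.

```python
PAGE = 0x1000
HEADERS_END = 0x74  # 52-byte ELF header + two 32-byte program headers

def place_text(vaddr: int, headers_end: int = HEADERS_END, page: int = PAGE) -> int:
    """Smallest file offset >= headers_end with offset % page == vaddr % page."""
    offset = vaddr % page
    while offset < headers_end:
        # This slot would land on top of the headers, so try the next page.
        offset += page
    return offset

print(hex(place_text(0x22)))        # 0x1022
print(hex(place_text(0x08048074)))  # 0x74
```

Under this model 0x22 collides with the headers on page 0 and moves to page 1, while the default address 0x08048074 fits right behind the headers.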
With -Ttext 0x128, the text can sit on page 0 without overwriting the file header, so this time ld puts the text on page 0 and fills the gap between the file header and the text with zeros. The program header contents are adjusted accordingly.
    Entry point address: 0x128
    Program Headers:
      Type  Offset    VirtAddr    PhysAddr    FileSiz  MemSiz   Flg  Align
      LOAD  0x000000  0x00000000  0x00000000  0x00139  0x00139  R E  0x1000
      LOAD  0x00013c  0x0000113c  0x0000113c  0x00004  0x00004  RW   0x1000
This time the segment is loaded from file offset 0, file header included, at address 0x00000000, and the entry point 0x128 lands exactly on the start of the code.
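The three experiments can be compared by translating each entry address back into a file offset using the LOAD entries printed above. A small sketch; vaddr_to_offset and the list names are made up for this example, and the tuples are copied from the readelf output in this post.

```python
def vaddr_to_offset(vaddr, segments):
    """Map a virtual address to a file offset.

    segments is a list of (Offset, VirtAddr, FileSiz) tuples for LOAD entries.
    """
    for p_offset, p_vaddr, p_filesz in segments:
        if p_vaddr <= vaddr < p_vaddr + p_filesz:
            return p_offset + (vaddr - p_vaddr)
    raise ValueError("address is not backed by file contents")

default_link = [(0x000000, 0x08048000, 0x85), (0x000088, 0x08049088, 0x4)]
ttext_0x22 = [(0x001022, 0x00000022, 0x13), (0x001038, 0x00001038, 0x4)]
ttext_0x128 = [(0x000000, 0x00000000, 0x139), (0x00013C, 0x0000113C, 0x4)]

print(hex(vaddr_to_offset(0x8048074, default_link)))  # 0x74
print(hex(vaddr_to_offset(0x24, ttext_0x22)))         # 0x1024
print(hex(vaddr_to_offset(0x128, ttext_0x128)))       # 0x128
```

In the first and third cases the code shares page 0 of the file with the headers; only with -Ttext 0x22 is it pushed out to page 1.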
If you use -N to stop ld from page-aligning the output, then -Ttext 0 forces ld to put the text at the beginning of page 0; to avoid overwriting the file header, ld places the text right after the file header. The data, however, is still on page 1 and does not immediately follow the text, because -Ttext 0 does not affect the data unless you also use -Tdata 0.
Basically, this option affects a lot of the ELF information, including the file layout, because ld adjusts the layout to the parameters it is given. These are just my notes; by the time I finished I was dizzy. Goodbye, everyone.
This article is from the "mirage1993" blog; please keep this source link: http://mirage1993.blog.51cto.com/2709744/1570527
Linux ELF file format: relocation and the entry point, from the perspective of GAS assembly language