1. Executable File Format
In traditional UNIX systems, the output file produced by compilation is named a.out by default. On modern systems, a file in a.out format is the output of the linker rather than of the assembler (in the early days of computing, a.out was the output of the assembler, because there was no linker at that time).
Object files and executable files come in several formats, and most systems now use ELF. You can use the following command to read about other formats (some systems may not have this manual page):
$ man a.out
On Linux, you can use the file command to inspect a file produced by compiling and linking:
$ file <executable file name>
The output shows that the file is in ELF format (here a relocatable kernel module rather than a fully linked executable):
Yuanlu@bear-labpc:~/Workspace/wifi/wifi_hello$ file hello.ko
hello.ko: ELF 32-bit LSB relocatable, ARM, version 1 (SYSV), not stripped
2. The Concept of Segments in UNIX
There are many different executable formats on UNIX systems, but they all share one concept: the segment.
A segment is a concept that belongs to the object file. An object file contains multiple segments. Each is a simple region of the binary file that holds all the information of one particular type, for example the symbol table entries. Do not confuse UNIX segments with Intel x86 segments. On x86, segments are a consequence of the memory model: the address space is not treated as a single whole, but is divided into fixed-size areas called segments.
For an object file, the size command reports the sizes of three segments: the text (code) segment, the data segment and the bss segment:
Yuanlu@bear-labpc:~/Workspace/wifi/wifi_hello$ size hello.ko
   text    data     bss     dec     hex filename
    224     300       0     524     20c hello.ko
The three segments store the following:
(1) Text segment: the code segment, which stores the instructions to be executed.
(2) Data segment: stores global or static variables that have been initialized.
(3) BSS segment: stores global or static variables that have not been initialized. Because these variables have no values yet, there is no need to store an image of them; the object file only records the size that the BSS segment will need at run time. The BSS segment therefore occupies no space in the object file.
An a.out executable file is composed of the following parts:
A. The a.out magic number
B. Other a.out contents
C. Size required for the BSS segment
D. Data segment (initialized global and static variables)
E. Text segment (executable instructions)
The file header contains the a.out magic number, a number used to identify the file format and tell it apart from an arbitrary collection of bits. We ignore it for the moment. Local variables do not appear in the file at all; they are created at run time. Here we use the simplest hello world driver to check the points above.
#include <linux/init.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");

/* the init function */
static int __init hello_init(void)
{
    printk(KERN_WARNING "Hello world!\n");

    return 0;
}

/* the exit (destroy) function */
static void __exit hello_exit(void)
{
    printk(KERN_WARNING "Goodbye!\n");
}

module_init(hello_init);
module_exit(hello_exit);
Compile it into hello.ko and run size hello.ko to view the size of each segment:
Yuanlu@bear-labpc:~/Workspace/wifi/wifi_hello$ size hello.ko
   text    data     bss     dec     hex filename
    224     300       0     524     20c hello.ko
The bss segment is currently empty: the 0 is its size, and the data segment is 300 bytes. We now add one initialized and one uninitialized variable of each kind (global and static), and declare a large array inside the function:
#include <linux/init.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");

int i;                /* uninitialized global */
int m = 2;            /* initialized global */

static int j;         /* uninitialized static */
static int n = 1;     /* initialized static */

/* the init function */
static int __init hello_init(void)
{
    printk(KERN_WARNING "Hello world!\n");

    return 0;
}

/* the exit (destroy) function */
static void __exit hello_exit(void)
{
    printk(KERN_WARNING "Goodbye!\n");
}

module_init(hello_init);
module_exit(hello_exit);
Use the size tool after compilation:
Yuanlu@bear-labpc:~/Workspace/wifi/wifi_hello$ size hello.ko
   text    data     bss     dec     hex filename
    224     300       8     532     214 hello.ko
Here we find that the bss segment has grown to 8 bytes, while the data segment has not grown. From this we can go on to examine the following points:
(1) Local variables in a function are not stored in the executable file.
(2) For global variables, uninitialized ones are stored in the bss segment. Initialized ones fall into two cases: if they are initialized to 0 they are still placed in the bss segment; otherwise they are placed in the data segment (this differs slightly from the description above).
(3) In this experiment, static variables appear not to be stored in the file, whether initialized or not.
The first two points are easy to understand. The third needs further verification, because the observed behavior differs clearly from the theory, which places static variables in the data or bss segment just like globals. Here I can only guess that it comes from compiler differences (my environment is a cross-compilation environment); for example, a compiler may discard static variables that are never referenced.
3. a.out Memory Layout
Segments map conveniently onto objects that the runtime linker can load directly. The loader simply takes the image of each segment in the executable file and places it in memory; in essence, each segment becomes a memory area of the running program. The runtime linker copies each segment from the file into memory, generally by calling the mmap() system call.
The following figure shows how the segments of the executable file are laid out in memory:
a.out file                                                          Address space of the process

                                                                    Stack segment
                                                                    (empty)
A. The a.out magic number
B. Other a.out contents
C. Size required for the BSS segment                      ------->  BSS segment
D. Data segment (initialized global and static variables) ------->  Data segment
E. Text segment (executable instructions)                 ------->  Text segment
The text segment contains the instructions to be executed. The data segment contains the initialized global and static variables together with their values (details may differ slightly between compilers). The size of the BSS segment is read from the executable file, and a block of that size is placed immediately after the data segment. When this block enters the program's address space, it is cleared to zero. The data segment and the BSS segment together are called the data area.
These areas are not enough to meet all the needs of a program, because a process also has to keep local variables, temporary variables, the parameters passed to functions, and return values. The stack segment serves this purpose. Heap space is also indispensable, to meet the process's need for dynamic memory allocation.
Note that the lowest part of a user process's virtual address space is not mapped: no page-table mapping to physical addresses is set up for the lowest few kilobytes of the process's virtual addresses. This makes it possible to catch memory accesses through null pointers and through pointers that hold small integer values.