Compiling High-Level Languages: An Introduction to Linking and Loading

### Introduction
As more and more powerful high-level languages emerge, and as server computing capacity stops being the bottleneck, many engineers reach first for a language backed by a powerful virtual machine (Java) or for a scripting language (Python, PHP), which offer high development efficiency, rather than for C/C++, which runs efficiently but is slower to develop in. These languages typically run inside a virtual machine or an interpreter and do not deal with the operating system directly.

Virtual machines and interpreters act as an intermediate layer for these languages. They hide the details of interacting with the operating system, reduce the number of low-level problems engineers have to handle, and greatly improve development efficiency. The downside is that engineers who work in high-level languages for a long time can find concepts such as link libraries, executable files, and CPU architectures unfamiliar.

Therefore, to give the reader an overall picture of how source code becomes a binary executable, this article describes the basic principles of compiling, linking, and loading from several angles. First we introduce different CPU architectures and the executable file formats of different operating systems. Then we use a few simple C programs to show what the compiler and the linker do to the source code. Finally, we look at what the loader does when the program is executed and what support the operating system provides.

### CPU Architecture
Most of the PCs and servers we use today have CPUs that implement the x86_64 instruction set architecture, which is a CISC (complex instruction set computer) design. As we learned in computer organization courses, there is another family of CPU architectures besides CISC: RISC (reduced instruction set computer). Sun's SPARC and IBM's PowerPC are RISC architectures. We will not go into the details of the various architectures. The question we care about is whether code compiled for one CPU architecture can run on another.

The answer is no. A so-called binary program is just a sequence of CPU instructions, and executing it means the CPU loads those instructions and runs them one after another. Different CPU architectures have different instruction sets, and the length and encoding of the instructions differ as well. So a SPARC CPU cannot execute a binary that was compiled into x86 instructions.

### Operating Systems
When Java was first becoming popular, one of its big selling points was cross-platform execution. Java programs can run across platforms because javac compiles them into bytecode files that a Java virtual machine can execute. As long as the matching Java virtual machine is installed on each operating system (Windows, Linux, macOS), it can run Java bytecode that was compiled on another operating system. Why, then, can a binary compiled with gcc/g++ not be executed across platforms?

As we just said, Java programs run across platforms because the Java virtual machines installed on the different platforms all recognize the same bytecode format. From this we can infer that binaries cannot be executed across operating systems because each operating system uses a different binary file format.

That is indeed the case. We all know that once a program has been compiled into a binary, it starts running at the main function. But how is the program loaded into memory, and how does the execution flow find the address of main? The operating system does this work for us. We can make an educated guess about what the operating system has to do before main runs. First, it must allocate a virtual address space for the process. Next, it loads the code and data from the binary file into that address space. Then, following a particular file format, it finds a specific location (the initialization section) and does some initialization before the program proper runs, such as setting up environment variables and processing global variables. Only then does it start executing our main function. That "particular file format" is the reason binaries cannot run across platforms.

In other words, each operating system has its own binary file format. After loading an executable into memory, the operating system looks for the various pieces it needs, such as the code section, the data section, and the initialization section, according to that format. This is why Windows executables (.exe), static libraries (.lib), and dynamic libraries (.dll) cannot run directly on Linux, and why macOS Mach-O executables, static libraries (.a), and dynamic libraries (.dylib) cannot run directly on Linux either. The reverse also holds: Linux ELF executables, static libraries (.a), and dynamic libraries (.so) cannot run on Windows.

### Compiling Source Code
Now that we have seen how the CPU architecture and the operating system affect the binary file format, let us use a few examples to see how a source file is processed and turned into an executable. Different systems have different compilers, such as the C++ compiler that ships with Visual Studio on Windows and the gcc/g++ compilers on Linux and Unix. There are also many programming languages, each with its own compiler that turns the corresponding source code into a binary executable. Although the implementation details differ, these compilers all work on the same principles and follow the same steps. Because Linux and C are widely used, and because the author is most familiar with Linux and C/C++, the rest of this article explains everything in terms of compiling C/C++ source code with gcc/g++ on Linux.

The purpose of this article is to give engineers a basic understanding of the whole path a program takes from source code, through the compiler, the linker, and the loader, to a process running on the system. It therefore does not cover how the compiler uses lexical, syntax, and semantic analysis to produce the object file. The rest of the article focuses on how gcc/g++ produces an ELF executable that Linux can recognize.

#### C Source Files
First, we briefly review the basic concepts of variables and functions in a C program.

Let us start by distinguishing declarations from definitions. In a C program we can declare a variable or a function, and we can define a variable or a function. Declaring a global variable or a function tells the compiler: this variable may be used, or this function may be called, in the current source file, but it is defined in some other file, so please do not report an error when compiling this file. Defining a variable tells the compiler to reserve space for it in the generated object file; if the variable has an initial value, the compiler stores that value in the object file as well. Defining a function asks the compiler to generate the binary code for that function in the object file of this source file.

Next we look at the kinds of variables and functions in a C program. Basic C (without C++ features) is fairly simple: we can declare and define global variables and local variables, and each can be static or non-static. In addition, we can request storage dynamically with malloc. They differ as follows. A non-static global variable exists for the whole lifetime of the running program and can be accessed from files other than the one that defines it. A static global variable also exists for the whole lifetime of the program, but only functions in the defining source file can access it. A non-static local variable exists only in the execution context of the function it belongs to (it lives in that function's stack frame on the call stack). A static local variable is stored like a global variable and exists for the whole lifetime of the program, but its scope is limited to the code block (the pair of curly braces) that defines it. A dynamically requested variable means that, at run time, a function requests a region of the process's address space and uses it to store data.

Functions can likewise be static or non-static. A non-static function is a global function that code in other source files can call. A static function can be called only by functions in the same source file.

Let us use the following small program to see what the compiler does with these variables.

int g_a = 1;            // defines a global variable with an initial value
int g_b;                // defines a global variable without an initial value
static int g_c;         // defines a static global variable
extern int g_x;         // declares a global variable
extern int sub();       // declares a function

int sum(int m, int n) { // defines a function
    return m + n;
}
int main(int argc, char* argv[]) {
    static int s_a = 0; // local static variable
    int l_a = 0;        // local non-static variable
    sum(g_a, g_b);
    return 0;
}

#### Object Files
Before we compile this program with gcc and look at what the compiler did, we can reason about what the compiler must do to make the program runnable. We take three things for granted. First, the CPU executes the program's code one instruction after another. Second, on a conditional or a function call, the instruction flow jumps. Third, the code operates on the data that the variables refer to. From these three assumptions we can infer the minimum the compiler has to do. First, the CPU does not understand high-level language code, so the compiler must translate the code into binary instructions. Second, for the CPU to find the target of a jump, the compiler defines a label at the location of every function definition; each label has an address, and a call to a function amounts to a jump to the address of its label. Third, for the CPU to find the data behind a variable, the compiler defines a label for every variable, and each label has an address that points to the variable's storage in memory.

Now let us look at what the compiler actually does. We start by compiling the program into an object file and inspecting it:
gcc -c test.c -o test.o && nm test.o

0000000000000000 D g_a
0000000000000004 C g_b
0000000000000000 b g_c
                 U g_x
0000000000000014 T main
0000000000000004 b s_a.1597
0000000000000000 T sum

First, we use gcc -c to compile the source file test.c into the object file test.o. Note that although an object file is a binary file, it is not the same as an executable: it contains only the compiled form of the current source file, has not been through the linking step, and cannot be executed. We then use the nm command to view the symbol information in the object file. By default nm prints three columns: the leftmost column is the relative address of the symbol, the middle column is the type of the section the symbol belongs to, and the right column is the symbol name. Let us look at each column in turn. The first column is the address of the symbol relative to its section. The relative addresses of g_a and g_c are the same, but this is not a conflict because they are in different sections (D and b indicate different sections of the object file). The second column is the section type. Here we see the types D, C, b, and T, and the toolchain supports more than these. We will not go into the meaning of every type; it is enough to know that different sections hold different kinds of data. For example, the D section is the data section, which holds global variables that have an initial value, and the T section is the code section, which holds all the instructions compiled from the code. Note that within one section, no two symbols can have the same relative address. The third column is the symbol name. The local static variable has been renamed to s_a.1597 by the compiler, and we can guess why.

s_a is a local static variable whose scope is the code block that defines it, so we can declare local static variables with the same name in different scopes; for example, we could declare another s_a in the sum function. But as mentioned above, a local static variable is stored like a global variable and exists for the whole lifetime of the program. To support this, the compiler appends a suffix to the name of each local static variable so that it can tell them apart.

The attentive reader will have noticed that the declared variable g_x has no address. As we said in the C Source Files section, declaring a variable or a function is essentially a promise to the compiler: although this file contains no definition, one exists in some other file. So when the compiler sees that the program needs the data of a variable that it cannot find in the current source file, it marks the symbol with the special type U (undefined). This means the linker must later find the definition in another object file or in a library before it can link everything into an executable binary.

From the above we can see that the compiler is not as mysterious as it may seem: everything it does is there to support language-level features.

### Linking Object Files
In the previous section we used a small program to discuss the basic building blocks of a C source file, how the compiler handles them, and why it does so. We also left behind a variable name whose definition has to be found in another object file. In this section we discuss what the linker has to do with the object files once the compiler has compiled each C source file into one.

First, let us try to link the object file from the previous section and see what happens: gcc test.o -o test

test.o: In function `main':
test.c:(.text+0x2c): undefined reference to `g_x'
collect2: ld returned 1 exit status

When we try to link the object file into an executable, the linker reports an error. It cannot produce a complete executable because the variable we promised earlier is not found in any other object file or library. Let us fix the problem with a second C file:

int g_x = 100;              // any constant initializer works here
int sub() { return 0; }

Compile this file into an object file: gcc -c test2.c -o test2.o; nm test2.o

0000000000000000 D g_x
0000000000000000 T sub

Now we link the two object files into an executable: gcc test.o test2.o -o test; nm test. The output contains far more information than that of the object file, because the executable defines many additional symbols needed to support various language-level features. Here we only care about the symbols and addresses of the variables and functions defined in our source files, shown below:

00000000004005e8 T _fini
0000000000400390 T _init
00000000004003d0 T _start
...
0000000000601018 D g_a
0000000000601038 B g_b
0000000000601030 b g_c
000000000060101c D g_x
00000000004004c8 T main
0000000000601034 b s_a.1597
0000000000400504 T sub
00000000004004b4 T sum

In the final executable we can see two things. First, the variable g_x and the function sub, which were only declared in the first source file, have found their definitions in the second object file. Second, variables defined in different object files, such as g_a and g_x, have been placed in the same data section (type D). In addition, the relative addresses the symbols had in the object files have all been changed to absolute addresses.

To summarize, the linker has to do the following with the object files: for symbols that are not defined in one object file, find the definition in the other object files; merge sections of the same type from the different object files; and relocate the addresses of the symbols from the different object files.

These are the most basic functions a linker has to implement.

### Loading and Running

In the preceding sections we discussed the most basic work the compiler does to turn a C source file into an object file, and the most basic functions the linker needs in order to link several object files into an executable. In this section we discuss how the system loads and runs the executable.

#### Dynamic Link Libraries
When writing a program we do not implement every function ourselves; in general we call system libraries and third-party libraries. To keep things simple, the example code in the previous two sections only declared and defined a few variables and functions and used no library functions. Now suppose we need to call a library function. What does the executable look like in that case? Let us look at a small example:

#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
    char buf[32];
    // strncpy takes the destination size as its third argument
    strncpy(buf, "Hello, world\n", sizeof(buf));
    printf("%s", buf);
    return 0;
}

We compile this file into an executable and look at its symbols with gcc test3.c -o test3; nm test3:

00000000004005b4 T main
                 U printf@@GLIBC_2.2.5
                 U strncpy@@GLIBC_2.2.5

We should see output similar to the above. We met this type of symbol in the Object Files section: it appeared in the object file without an address, and we said it was left for the linker to find the definition in another object file. But now we are looking at an executable, so why does it still contain such symbols?

As we said earlier, there is nothing magical about the toolchain: everything it does supports some language-level feature, and this case is no exception. The "undefined" symbols in the executable exist to support dynamic link libraries.

Let us review what a dynamic link library is. A dynamic link library is a library that is located when the program runs and is then linked into the virtual address space of the process. All executables that use a given dynamic link library share the same region of physical memory, which is loaded the first time the library is linked.

Now let us see how calls to functions in a dynamic link library are handled in the binary. Run objdump -d test3 | less and search for printf; we should see something like the following:

0000000000400490 <strncpy@plt>:
  400490:       ff 25 6a 0b 20 00       jmpq   *0x200b6a(%rip)        # 601000 <_GLOBAL_OFFSET_TABLE_+0x18>
  400496:       68 00 00 00 00          pushq  $0x0
  40049b:       e9 e0 ff ff ff          jmpq   400480 <_init+0x20>
...
00000000004004b0 <printf@plt>:
  4004b0:       ff 25 5a 0b 20 00       jmpq   *0x200b5a(%rip)        # 601010 <_GLOBAL_OFFSET_TABLE_+0x28>
  4004b6:       68 02 00 00 00          pushq  $0x2
  4004bb:       e9 c0 ff ff ff          jmpq   400480 <_init+0x20>

We see that the executable contains a proxy symbol for each of strncpy and printf (strncpy@plt and printf@plt), each made up of three instructions. The first instruction of each proxy jumps to the address stored at an offset in _GLOBAL_OFFSET_TABLE_. On Linux, _GLOBAL_OFFSET_TABLE_ is the table that position-independent code uses for dynamic address relocation. As mentioned above, a dynamic link library can be mapped at different virtual addresses in different processes, so it is position-independent code; the linker routes every call to such a function through this table, and the real address is filled in dynamically when the program runs.

Linux provides a convenient command for viewing the dynamic link libraries an executable depends on. Let us look at the dependencies of the current executable with ldd test3:

    linux-vdso.so.1 =>  (0x00007fff413ff000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe202ae7000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe202eb2000)

The ldd command simulates loading the dynamic link libraries that the executable needs, without executing the program itself. The address at the end of each line is the address at which the library was placed during the simulated load. If we run ldd several times, we find that the addresses differ each time, because they are assigned dynamically. In day-to-day work, if a binary fails with an error saying that a function definition cannot be found, we can use this command to check whether a dynamic link library is missing or not installed on the system.

We add a sleep(1000) to the small program above, run it, and look at its run-time memory map with cd /proc/21509 && cat maps (21509 is the process ID in our run). We should see the following lines:

7feeef61f000-7feeef7d4000 r-xp 00000000 fd:01 135891                     /lib/x86_64-linux-gnu/libc-2.15.so
7feeef7d4000-7feeef9d3000 ---p 001b5000 fd:01 135891                     /lib/x86_64-linux-gnu/libc-2.15.so
7feeef9d3000-7feeef9d7000 r--p 001b4000 fd:01 135891                     /lib/x86_64-linux-gnu/libc-2.15.so
7feeef9d7000-7feeef9d9000 rw-p 001b8000 fd:01 135891                     /lib/x86_64-linux-gnu/libc-2.15.so

We can see that at run time the system maps four regions into the process's address space for the libc library. Because each region has different permissions, they cannot be merged into one. Calls into the dynamic link library eventually jump to addresses inside the regions shown here.

Based on this information, here is a summary of what the linker does for dynamic link libraries. When the linker links the object files into an executable and meets a variable or function that it cannot find in any object file, it searches the dynamic libraries on gcc's predefined library search paths for a definition. If it finds the definition in a dynamic link library, it first records that library in the executable's list of dependencies and then generates a proxy symbol for the variable or function. The proxy symbol of a library function (such as strncpy or printf) jumps through the appropriate offset in _GLOBAL_OFFSET_TABLE_, and the entry in _GLOBAL_OFFSET_TABLE_ supplies the real jump target dynamically.

So far we have only talked about dynamic link libraries (.so). Every platform also has static link libraries, and linking against one is very similar to linking against an object file. Static libraries have some drawbacks, however. For example, every executable carries its own copy of the library, which makes upgrading the library troublesome. Static libraries are used much less today, so we do not go into them here.

#### Before the main Function
In the Operating Systems section we briefly mentioned that a process has to do some initialization before the program's main function executes, and only then calls main to run the program logic. In the Dynamic Link Libraries section we mentioned that the required dynamic libraries have to be linked into the process's address space when the process starts. In this section we put these steps together and trace, through the disassembly of the executable, how Linux loads an ELF file into memory and eventually calls main.

In the Linking Object Files section we showed part of the output of nm test and deliberately kept the symbol _start. For the ELF format, after Linux has assigned the process a virtual address space and loaded the code into memory, execution begins at the address of _start. This address is recorded in the ELF file header, so the system obtains it when it reads the ELF file. Let us start from the instructions at _start and follow the key points we are interested in.

0000000000400510 <_start>:
  ...
  400526:       48 c7 c1 70 06 40 00    mov    $0x400670,%rcx
  40052d:       48 c7 c7 f4 05 40 00    mov    $0x4005f4,%rdi
  400534:       e8 b7 ff ff ff          callq  4004f0 <__libc_start_main@plt>
  /* _start calls __libc_start_main in the libc library. Note the two values
     passed to it, 0x400670 and 0x4005f4: one is the address of
     __libc_csu_init, the other is the address of main. */
  ...
00000000004004f0 <__libc_start_main@plt>:
  4004f0:       ff 25 22 0b 20 00       jmpq   *0x200b22(%rip)        # 601018 <_GLOBAL_OFFSET_TABLE_+0x30>
  ...
0000000000400670 <__libc_csu_init>:
  ...
  4006b0:       e8 e3 fd ff ff          callq  400498 <_init>
  ...
0000000000400498 <_init>:
  ...
  4004a6:       e8 65 02 00 00          callq  400710 <__do_global_ctors_aux>
  ...
00000000004005f4 <main>:
  ...
  400626:       e8 95 fe ff ff          callq  4004c0 <strncpy@plt>
  ...
  40063f:       e8 9c fe ff ff          callq  4004e0 <printf@plt>
  ...
  400649:       e8 b2 fe ff ff          callq  400500 <sleep@plt>
  ...

Let us briefly explain the instructions above. After some setup, the code at _start calls __libc_start_main with the address of __libc_csu_init and the address of main as arguments. __libc_start_main is implemented in libc, which means every executable on Linux shares the same initialization code. For reasons of space we do not look at its implementation. What we need to know is that, after some processing, __libc_start_main first calls the code at __libc_csu_init and then calls the code at main.

The code at main is our own main function. __libc_csu_init in turn calls _init, which calls __do_global_ctors_aux, a symbol that C++ programmers should find familiar. The code at __do_global_ctors_aux performs the initialization of all global variables, including the construction of global objects in C++.

Based on this information, let us summarize what Linux does when we run a program from bash. First, bash makes a fork system call to create a child process. The child then runs the specified ELF binary through execve (on Linux, running a binary is ultimately done through execve), which invokes the system call that loads the ELF file into memory, including its code section (.text). If the binary depends on dynamic link libraries, the dynamic linker is invoked to locate and map them; the memory of a dynamic link library is shared by multiple processes. The kernel reads the address of _start from the ELF file header and schedules execution to begin there. The code at _start jumps to __libc_start_main, the common initialization code in libc, which does the initialization required before the program runs. While executing, __libc_start_main first jumps to _init to initialize global variables, then calls our main function, and execution enters the instruction flow of main.

At this point we have covered the whole path from the source code of a C program to a running process.

### A Small Example

Once we understand how the compiler turns source code into a binary executable, we can inspect how a piece of code is compiled and then write efficient code that suits the compiler's habits. In this section we analyze a small example from the web. Someone posted the two programs below, saying that an interviewer had asked about their pros and cons.

Program 1:
if (k > 8) {
    for (int h = 0; h < 100; h++) { /* do something */ }
} else {
    for (int h = 0; h < 100; h++) { /* do something */ }
}

Program 2:
for (int h = 0; h < 100; h++) {
    if (k > 8) { /* do something */ }
    else { /* do something */ }
}

From the point of view of coding style, program 2 is clearly better than program 1: when the "do something" part is complex, program 2 is more compact and less redundant, and the parts of "do something" that the if and else branches have in common can be factored out. But an experienced engineer will immediately see that although program 1 is slightly redundant, it executes faster than program 2. To see why, let us analyze the code the compiler generates. Our test programs are as follows:

Program 1:
if (type == 0) {
    for (i = 0; i < cnt; i++) {
        sum += data[i];
    }
} else if (type == 1) {
    for (i = 0; i < cnt; i++) {
        sum += (i & 0x01) ? (-data[i]) : data[i];
    }
}

Program 2:
for (i = 0; i < cnt; i++) {
    if (type == 0) {
        sum += data[i];
    } else {
        sum += (i & 0x01) ? (-data[i]) : data[i];
    }
}

The relevant fragments of the compiled executable are:

Program 1:

  4005d7:       83 7d ec 00             cmpl   $0x0,-0x14(%rbp)       /* type == 0 check */
  4005db:       75 29                   jne    400606 <calc_1+0x44>   /* check failed: jump to the else branch */
  4005dd:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
  /* ------------------------------ loop body begins ------------------------------ */
  4005e4:       eb 16                   jmp    4005fc <calc_1+0x3a>   /* jump to the loop condition comparison */
  4005e6:       8b 45 fc                mov    -0x4(%rbp),%eax        /* first instruction of the loop */
  ...
  4005f8:       83 45 fc 01             addl   $0x1,-0x4(%rbp)
  4005fc:       8b 45 fc                mov    -0x4(%rbp),%eax
  4005ff:       3b 45 e8                cmp    -0x18(%rbp),%eax       /* loop condition comparison */
  400602:       7c e2                   jl     4005e6 <calc_1+0x24>   /* jump back to the start of the loop */
  /* ------------------------------- loop body ends ------------------------------- */
  400604:       eb 4a                   jmp    400650 <calc_1+0x8e>
  400606:       83 7d ec 01             cmpl   $0x1,-0x14(%rbp)       /* else branch: type == 1 check */
  /* the type == 1 branch is basically the same as the type == 0 branch */

Program 2:

  400671:       eb 4d                   jmp    4006c0 <calc_2+0x6b>
  /* ------------------------------ loop body begins ------------------------------ */
  400673:       83 7d ec 00             cmpl   $0x0,-0x14(%rbp)       /* type == 0 check */
  400677:       75 14                   jne    40068d <calc_2+0x38>   /* check failed: jump to the else branch */
  ...
  400686:       8b 00                   mov    (%rax),%eax
  400688:       01 45 f8                add    %eax,-0x8(%rbp)
  40068b:       eb 2f                   jmp    4006bc <calc_2+0x67>
  40068d:       8b 45 fc                mov    -0x4(%rbp),%eax        /* else branch */
  400690:       83 e0 01                and    $0x1,%eax
  400693:       84 c0                   test   %al,%al
  400695:       74 13                   je     4006aa <calc_2+0x55>
  ...
  4006c0:       8b 45 fc                mov    -0x4(%rbp),%eax
  4006c3:       3b 45 e8                cmp    -0x18(%rbp),%eax       /* loop condition comparison */
  4006c6:       7c ab                   jl     400673 <calc_2+0x1e>   /* jump back to the start of the loop */
  /* ------------------------------- loop body ends ------------------------------- */

Comparing the source with the assembly, we have annotated the instructions of program 1 and program 2 above. In program 1, the condition is tested once; execution then jumps to the loop that belongs to the matching branch (if or else) and runs that whole loop without testing the condition again. In program 2, the conditional test and jump (if/else) sit inside the loop and are executed on every iteration.

This explains why program 1 is faster than program 2. Program 2 has to execute an extra comparison and an extra jump every time the loop body runs, so with a large iteration count (say, over a million) it executes millions of additional instructions. In addition, modern CPUs are pipelined and have instruction prefetch units: several instructions are in flight in the CPU at the same time, and prefetching is guided by branch prediction. When a jump occurs, the instructions fetched after it may all be flushed from the pipeline so that execution can restart at the new address, which wastes multiple CPU cycles.

From this analysis we can say that if this code is not a bottleneck in the overall project, preferring program 2 is acceptable. If it sits on a speed bottleneck, however, program 1 has the advantage.

### Conclusion

In this article we used the C language as an example to walk through how source code is compiled, linked, and loaded until it finally runs on a Linux system. In its details this is a long and complex process, but if you hold on to the main thread you will find that everything the compiler and linker do serves the functions and requirements we ask of them; however much the details vary, the essence stays the same.

In addition, although we used C for illustration, binaries compiled from other languages on the same system have the same format. For example, a program written in Go and compiled under Linux still ends up as a binary in ELF format, and we can inspect and debug it with the same set of tools (such as nm, objdump, and readelf). From this we can see that although the Go compiler and the GCC compiler differ in their implementation details, the work they do is basically the same.

Clearly, an article of this length cannot describe the complex process of compiling, linking, and loading exhaustively. Its purpose is to give readers an intuitive understanding of the process. There are many more details to explore for those who are interested, and the "References" section at the end lists a few good resources to start from.

### References

Yu Jiazi, Shi Fan, Pan Aimin: 程序员的自我修养 (The Self-Cultivation of Programmers)
http://www.lurklurk.org/linkers/linkers.html
John Levine: Linkers and Loaders
Original address: http://tech.meituan.com/linker.html
