The C Language Compilation and Execution Process

Source: Internet
Author: User

Understanding how a C program is compiled and executed is a good starting point for learning C.
Put simply, a C program goes through the following steps on its way from source code to execution:

C source code
Compile ----> produce object code, that is, code that runs on the target machine.
Link ----> combine the object code with the C function library, merging the library code used by the source program into the object code to form the final executable binary (the program).
Execute ----> run the C program in a specific machine environment.

The process is shown in this figure:

http://www.emacsvi.com/wp-content/uploads/2015/10/c_compiler_execute.jpg


Compilation: the compiler reads the source program as a character stream, performs lexical and syntactic analysis on it, and translates the high-level language statements into functionally equivalent assembly code. The assembler then translates that into machine language, and the linker combines the result into an executable program in the file format the operating system requires.
C source program and header files --> preprocessor (cpp) --> compiler proper --> optimizer --> assembler --> linker --> executable file

Preprocessing: the preprocessor reads the C source program and processes the pseudo-directives (lines beginning with #) and special symbols in it.
[Analysis] The pseudo-directives fall into the following four groups:
(1) Macro definition directives, such as #define Name TokenString and #undef. For the former, the preprocessor replaces every occurrence of Name in the program with TokenString, except where Name appears inside a string constant. The latter cancels the definition of a macro, so that later occurrences of the name are no longer replaced.
(2) Conditional compilation directives, such as #ifdef, #ifndef, #else, #elif, and #endif. These let the programmer decide which code the compiler processes by defining different macros. The preprocessor filters out the code that is not needed according to which macros are defined.
(3) Header file inclusion directives, such as #include "FileName" or #include <FileName>. A header file typically defines a large number of macros with #define (most commonly character constants), along with declarations of various external symbols. The main purpose of a header file is to make certain definitions available to many different C source programs: a source program that needs them simply adds an #include directive instead of repeating the definitions. The preprocessor copies all the definitions in the header file into the output file it produces, for the compiler to process.
Header files included in a C source program can be supplied by the system; these are typically placed in the /usr/include directory and are included with angle brackets (<>). Developers can also write their own header files, which generally sit in the same directory as the C source program and are included with double quotation marks ("").
(4) Special symbols. The preprocessor recognizes certain special symbols. For example, __LINE__ in the source program is replaced with the current line number (a decimal number), and __FILE__ with the name of the C source file being compiled. The preprocessor replaces each occurrence of these symbols in the source program with the appropriate value.
What the preprocessor does is essentially a textual substitution over the source program. The result is an output file with no macro definitions, no conditional compilation directives, and no special symbols. This file means the same thing as the unpreprocessed source file, but its content is different. Next, it is handed to the compiler, which translates it toward machine instructions.
Compilation proper: the preprocessed output contains only constants (such as numbers and strings), variable definitions, identifiers such as main, C keywords such as if, else, for, and while, and punctuation and operators such as {, }, +, -, *, and /. The compiler performs lexical and syntactic analysis and, once it has confirmed that the whole program conforms to the grammar, translates it into an equivalent intermediate representation or assembly code.
Optimization is one of the hardest parts of a compilation system. It depends not only on compiler technology itself but also heavily on the hardware environment of the machine. One kind of optimization works on the intermediate code and does not depend on any specific computer. The other kind mainly targets the generation of object code. Placing the optimization stage after the compiler, as in the pipeline above, is the more general way to present it.
For the first kind, the main work includes common subexpression elimination, loop optimization (hoisting code out of loops, strength reduction, transforming loop control conditions, folding known values, and so on), copy propagation, and deletion of useless assignments.
The second kind is closely tied to the hardware structure of the machine. The most important consideration is how to keep the values of variables in the machine's hardware registers so as to reduce the number of memory accesses. Another important research topic is how to select and adjust instructions according to the characteristics of the hardware (pipelining, RISC, CISC, VLIW, and so on) so that the object code is shorter and executes more efficiently.
The optimized assembly code must then be translated by the assembler into the corresponding machine instructions before the machine can execute it.
Assembly is the process of translating assembly language code into machine instructions for the target machine. Every C source file handled by the translation system ends up, after this step, as a corresponding object file. The object file holds machine language code equivalent to the source program.
An object file consists of segments. Typically there are at least two:
Code segment: contains mainly the program's instructions. It is generally readable and executable, but usually not writable.
Data segment: stores the global variables and static data the program uses. It is readable and writable; older systems often left it executable as well, though current systems usually do not.

There are three main types of object file in the UNIX environment:
(1) Relocatable file: holds code and data suitable for linking with other object files to create an executable or a shared object file.
(2) Shared object file: holds code and data suitable for linking in two contexts. In the first, the linker processes it together with other relocatable files and shared object files to create another object file. In the second, the dynamic linker combines it with an executable file and other shared object files to create a process image.
(3) Executable file: holds a program that the operating system can load to create a process.
The assembler actually produces the first type of object file. The other two require further processing, which is the job of the linker.
The object files produced by the assembler cannot be executed right away, because they may still contain many unresolved references. For example, a function in one source file may refer to a symbol (such as a variable or a function) defined in another source file, or a program may call a function from a library file. Resolving all of these is the linker's job.
The main task of the linker is to connect the relevant object files to one another: each symbol referenced in one file is bound to that symbol's definition in another file, so that all the object files become a unified whole that the operating system can load and execute.
Depending on how the developer chooses to link in library functions, linking is divided into two types:
(1) Static linking. The code of the function is copied from the static library in which it resides into the final executable. When the program runs, that code is loaded into the virtual address space of the process. A static library is in fact a collection of object files, each containing the code for one function or a group of related functions in the library.
(2) Dynamic linking. The code of the function is placed in an object file called a dynamic link library or shared object. The linker records only the name of the shared object and a small amount of other bookkeeping information in the final executable. When the executable runs, the entire dynamic link library is mapped into the virtual address space of the process, and the dynamic linker finds the appropriate function code based on the information recorded in the executable.
Each function call in an executable can be resolved by either dynamic or static linking. Dynamic linking makes the final executable smaller, and it saves memory when a shared object is used by multiple processes, because only one copy of the shared object's code is kept in memory. Dynamic linking is not always the better choice, however: in some cases it costs some performance.
After the five stages above (preprocessing, compilation, optimization, assembly, and linking), the C source program has been converted into an executable file. By default, the executable is named a.out.

This article was reposted from: http://www.vcgood.com/archives/1400


