C# compiling and operating principles

Last Update: 2017-02-28 Source: Internet

Author: User


About the relationship between compilation and memory, and the partitioning of memory at execution time

1. "Allocating space at compile time" refers to static allocation (as opposed to requesting space dynamically with new). For global variables and static variables (including constants of some complex types), the amount of space needed can be calculated exactly and never changes, so they can be stored directly in a specific section of the executable file, together with their initial values. When the program starts, that section is loaded directly into a specific memory segment, so no extra code has to run to create these variables.

In fact, at run time a "variable" no longer carries the attributes it had during compilation (name, type, scope, lifetime, and so on). It corresponds to nothing more than a block of memory, described only by a start address and a size. Requesting space dynamically at run time therefore requires extra bookkeeping code to ensure that different variables do not overlap in memory.

For example, writing new marks a block of memory as occupied, so other variables can no longer use it; writing delete marks the block as free, so other variables can use it again.

(We normally use memory through variables. From the programmer's point of view, a variable is a name for a memory block that lets us tell blocks apart.)
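To make the distinction concrete, here is a minimal C++ sketch. The names g_counter and sum_dynamic are hypothetical and chosen only for illustration. The global has its space planned in the executable image, while the array exists only between new[] and delete[].

```cpp
#include <cassert>   // used by the usage line shown after this block
#include <cstddef>

// Static allocation: the size is known at compile time, so the space for
// this variable (and its initial value) is described in the executable.
int g_counter = 42;

// Dynamic allocation: the size is only known at run time, so the program
// must request the block with new[] and hand it back with delete[].
int sum_dynamic(std::size_t n) {
    int* block = new int[n];          // the block is now occupied
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        block[i] = static_cast<int>(i) + 1;
        total += block[i];
    }
    delete[] block;                   // the block is free for reuse
    return total;
}
```

Calling sum_dynamic(4) fills the block with 1, 2, 3, 4 and returns their sum, so assert(sum_dynamic(4) == 10) holds.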

The timing of memory requests and releases matters: release too early and data is lost; release too late and memory is wasted. In certain cases the compiler can take over this tedious job by adding extra code that maintains the memory and performs the request and the release for us. In this sense, local automatic variables also have their space allocated by the compiler. Going one step further, this memory management relies on two data structures we talk about all the time: the heap and the stack.
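The following C++ sketch illustrates the compiler inserting the release for automatic variables. Tracer and run_scopes are hypothetical names introduced for this example; the destructor calls are never written by hand, yet they run when each scope ends.

```cpp
#include <cassert>   // used by the usage line shown after this block
#include <string>
#include <vector>

// Hypothetical helper: records when an automatic variable is created
// and when it is destroyed.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    Tracer(std::vector<std::string>& l, const std::string& n) : log(l), name(n) {
        log.push_back("enter " + name);
    }
    ~Tracer() { log.push_back("leave " + name); }
};

std::vector<std::string> run_scopes() {
    std::vector<std::string> log;
    {
        Tracer outer(log, "outer");      // automatic variable on the stack
        {
            Tracer inner(log, "inner");
        }                                // compiler-inserted cleanup for inner
    }                                    // compiler-inserted cleanup for outer
    return log;
}
```

The returned log contains four entries in stack order (last created, first destroyed), for example assert(run_scopes()[2] == "leave inner").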

Finally, "the compiler allocates space" is not a rigorous statement. A better way to put it is that, during compilation, the compiler plans the memory usage of these variables for you. That plan is written into the executable (which therefore contains some code you did not write yourself) and is only carried out when the program actually runs.

2. Compilation itself is only a scanning process: lexical and syntax checking plus code optimization. What is called "allocating memory at compile time" really means "assigning initial values at compile time". The compiler only produces a file and checks it for errors; it does not allocate any memory.

The system loads the program into memory only when you run it. A process (that is, a running program) consists mainly of the following five regions:
stack, heap, global/static data area, code area, constant area

The stack holds local variables, function parameters, and function return values (it also holds the return address, that is, the address of the next instruction in the calling function).
The heap holds the memory that the program requests dynamically.
The global/static area holds the program's global and static variables. Because their sizes are fixed, their space is laid out statically during compilation and never changes, which speeds up access to the data.
The code area holds the compiled binary code.
The constant area holds the constants we declare (const).

The compiled binary code is placed in the code area, and the variables and constants it uses are stored in the other four regions according to their kind. The system executes the code in order and reads or modifies the data as the code dictates. That is what running a program means.
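As a rough illustration of the five regions, the C++ sketch below puts one item in each. The names g_total, kGreeting and next_id are hypothetical, and the exact placement is decided by the compiler and the operating system, so treat the comments as the typical case rather than a guarantee.

```cpp
#include <cassert>   // used by the usage line shown after this block

int g_total = 0;                        // global/static data area
const char* const kGreeting = "hello";  // the literal "hello" is constant data

// The function body itself is machine code and lives in the code area.
int next_id() {
    static int id = 0;                  // static area: survives between calls
    int step = 1;                       // stack: created anew on each call
    int* scratch = new int(step);       // heap: requested at run time
    id += *scratch;
    delete scratch;                     // heap block handed back
    g_total += 1;
    return id;
}
```

Because id is static, two consecutive calls return 1 and then 2, while step starts from 1 on every call.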

Allocating memory at compile time
---------------
No memory is allocated at compile time. The compiler only reserves a placeholder based on the declared type; the memory is actually allocated when the program executes. Declarations are therefore for the compiler's benefit: a smart compiler can use them to help you find errors.

Allocating memory at run time
---------------
This is correct: to run, a program must be loaded into memory, because the CPU (which has a number of registers) deals only with memory. Before the program can be placed in memory, physical memory must first be allocated for it.

Compilation process
---------------
When the EXE file is executed, the program is loaded into memory and becomes a process. The program first initializes its global objects, then finds the entry function and begins executing the program's statements. From this point on, any additional memory can only be requested and released dynamically on the program's heap.

Compile process: the source files are translated into a target file on the hard disk. How much memory this takes does not matter to your program; it is the compiler's business. The target file consists of three parts:

1. File information, including the file type, file size, and so on. For example, the first two bytes of a DLL file are 0x4D 0x5A.

2. Code, that is, the program itself. Take Hdata data = new Hdata(); where Hdata is a user-defined class. This statement is converted into the following form, which occupies roughly 37 bytes.
00000040 B9 10 7F DA 00 mov ecx,0DA7F10h
00000045 E8 E2 4D D4 FB call FBD44E2C
0000004A 89 45 B4 mov dword ptr [ebp-4Ch],eax
0000004D 8B 4D B4 mov ecx,dword ptr [ebp-4Ch]
00000050 E8 0B F0 AB FB call FBABF060
00000055 8B 55 C4 mov edx,dword ptr [ebp-3Ch]
00000058 8B 45 B4 mov eax,dword ptr [ebp-4Ch]
0000005B 8D 92 84 01 00 00 lea edx,[edx+00000184h]
00000061 E8 3A 5B FE 74 call 74FE5BA0

3. Data, including global variables, static variables, and constants. Class member variables and method-local variables are not compiled into the file. For example, static int a = 0; occupies four bytes in the file, while int a = 1; occupies none. string str = "12345"; is a class member, but it implies the constant "12345", which occupies 5 bytes in the file.

Run process:
1. When you double-click the icon, the system loads the whole EXE file into memory, including all the code and the global variables. This part of memory stays occupied until the program exits.

2. The program runs. Take the same statement as an example, Hdata data = new Hdata(); The code of the Hdata class is already in memory, and all instances of Hdata share that one copy of the code. The system only allocates a block of memory for the data of this Hdata instance (mainly its member variables) and stores the start address of the block in data. Static variables are the exception: they were loaded into memory together with the EXE file. When data is no longer in use, this block of memory is freed.
For a local variable, as in void aaaa() { int a = 1; }, the body of the function takes about 5 bytes of code, and the variable a occupies no memory in the file. When aaaa() is called, an assembly instruction that adjusts the stack pointer (on x86, typically sub esp, 4) reserves four bytes on the stack for a. When aaaa() returns, the four bytes occupied by a are released.

3. When the program exits, the memory occupied by the EXE file is released.
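The run process above can be sketched in C++ as follows. This Hdata is a hypothetical stand-in for the class in the text, with invented members live, value and doubled. In C# the per-instance block is released by the garbage collector; the sketch uses explicit delete because C++ has no collector.

```cpp
#include <cassert>   // used by the usage line shown after this block

// Hypothetical stand-in for the Hdata class used in the text.
struct Hdata {
    static int live;      // one copy for the whole program, loaded with the image
    int value;            // one copy per instance, allocated when new runs

    explicit Hdata(int v) : value(v) { ++live; }
    ~Hdata() { --live; }
    int doubled() const { return value * 2; }  // code shared by all instances
};
int Hdata::live = 0;

int demo_instances() {
    Hdata* a = new Hdata(1);   // per-instance block allocated on the heap
    Hdata* b = new Hdata(5);
    int seen = Hdata::live;    // both instances see the same static (2 here)
    int result = a->doubled() + b->doubled();
    delete a;                  // blocks released once the objects are unused
    delete b;
    return seen * 100 + result;
}
```

Note that sizeof(Hdata) equals sizeof(int): the static member and the member function add nothing to the size of each instance, which matches the point that instances share code and statics.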

The above is only the broad principle; the actual situation is much more complex. For example, allocated memory can be moved, or even placed in virtual memory. But this is enough for understanding applications. Anything deeper is the concern of the people who build operating systems and compilers.

After writing this, I noticed the post dates from 2007. Oh well, I am posting it anyway.

This article is excerpted from the Web.

Second article

How a C program runs: a piece of printf code is compiled and then run on the computer to produce output. The compiler is the intermediary that translates the code into the 0s and 1s of machine code that the machine understands, and it is tightly coupled to the hardware.

The .NET compiler is implemented by Microsoft. Along with the .NET platform, Microsoft introduced a powerful framework. On Windows XP the framework has to be installed separately; on Windows 7 it is built in. The power of the framework lies in its object-oriented base class library and its runtime, the CLR. The base class library is a set of utility classes for use when programming and needs no further discussion; the focus here is the CLR. The runtime supervises the running program: how the program runs, whether it consumes too much memory, whether memory needs to be cleaned up, and how exceptions are handled are all dealt with by the runtime.

The compiler here is not quite the same as the compiler described above. It does not compile the source into code that the CPU understands, but into code that the CLR understands, and references to the framework library are added during compilation. The result is an EXE file, but this EXE contains IL. It cannot be executed by the CPU directly, only by the CLR. When you double-click the file, the CLR loads it into memory, and this is where the just-in-time (JIT) compiler comes in. The JIT compiler reads the IL and turns it into operations that the CPU carries out. In other words, the runtime converts the EXE into the 0s and 1s the CPU understands in order to drive the computer.

This brings in the concept of mixed-language programming on a multi-language platform. IL is a standard, a specification: any file that conforms to it can be recognized and run by the CLR. Code written in C#, F#, or VB can be run by the CLR as long as a corresponding compiler is provided. This also prepares the ground for cross-platform use. Because the IL specification is fixed, porting a C# program to Linux, for example, only requires a CLR that runs on Linux, and that CLR can run the IL compiled on Windows. Mono is a good example.

The execution efficiency of .NET is determined largely by the JIT. It compiles the intermediate language code at run time according to the current hardware and software environment and caches the result, so the JIT compiler can optimize for the actual operating system and hardware. The CLR runs managed code and adds an intermediate layer, so why is it still efficient? The JIT is the reason. It compiles for the actual hardware platform and does not necessarily compile every piece of code; for example, an empty for loop may be left out to improve efficiency. If your code calls a method many times, the CLR caches the native code produced the first time the method is JIT-compiled, and later calls run the cached code directly without recompiling. There is also a garbage collection mechanism that reclaims what is unused or rarely used (re-creating it if it is needed again) and defragments memory so that it stays contiguous.

This article is excerpted from the Web.

Third article

C++ programs typically run in an unmanaged environment. A class consists of a header file (.h) and an implementation file (.cpp), and each implementation file forms a separate compilation unit. When we compile the program, several basic components translate our source code into binary.

First comes the preprocessor. If the project contains header files and macros, the preprocessor is responsible for including the header files and expanding all the macros.

Next is the compiler, which does not generate binary code directly but generates assembly code (.s). This is essentially common ground for all modern unmanaged languages.

The assembler then translates the assembly code into object code (.o and .obj files, that is, machine instructions).

Finally, the linker links all the related object files together to produce the resulting executable file or library.

In short, our code is first translated into assembly code and then into machine instructions (binary code).
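The four stages can be followed on a tiny C++ file. This is a sketch under assumptions: SQUARE and square_of_next are hypothetical names, the file names in the comments are placeholders, and the commands assume the GCC toolchain (other compilers use different options).

```cpp
#include <cassert>   // used by the usage line shown after this block

// Stage 1, preprocessor: SQUARE(n + 1) is textually replaced by
// ((n + 1) * (n + 1)) before the compiler proper sees the code.
#define SQUARE(x) ((x) * (x))

// Stages 2 to 4: the compiler turns this function into assembly, the
// assembler turns that into an object file, and the linker combines
// object files into an executable. With GCC each stage can be run alone:
//   g++ -E demo.cpp -o demo.ii   (preprocess only)
//   g++ -S demo.cpp -o demo.s    (stop after generating assembly)
//   g++ -c demo.cpp -o demo.o    (stop after assembling)
//   g++ demo.o -o demo           (link)
int square_of_next(int n) {
    return SQUARE(n + 1);
}
```

For n = 2 the macro expands to ((2 + 1) * (2 + 1)), so assert(square_of_next(2) == 9) holds. The parentheses in the macro matter because the substitution is purely textual.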

Compilation process for managed environments (C#/Java)

In a managed environment the compilation process is slightly different. The managed languages we know best are C# and Java, so we take them as examples to describe compilation in a managed environment.

When we write code in our favorite IDE, the IDE is the first to check it (lexical analysis). The code is then compiled into a target file and linked into a dynamic/static library or an executable file, where it is checked again (parsing). The last check is the run-time check. What managed environments have in common is that the compiler does not produce machine code directly but intermediate code. In .NET it is called MSIL (Microsoft Intermediate Language); in Java it is bytecode.

After that, the run-time JIT (just-in-time) compiler translates MSIL into machine code. This means our code is compiled only when it is actually used, which allows the CLR (Common Language Runtime) to compile and optimize it for better performance, at the cost of a longer program startup time. We can also use NGen (Native Image Generator) to precompile our programs and shorten the startup time, but then we lose the benefit of run-time optimization. (Jeffwong's supplement: Java is compiled into bytecode by the compiler, and the bytecode is then interpreted as machine code at run time by the interpreter. C# is compiled into IL by the compiler, and the IL is then compiled into machine code by the CLR. So, strictly speaking, Java is a compiled-then-interpreted language, while C# is a purely compiled language that is compiled twice.)

The .NET Framework adds an abstraction layer on top of the Win32 core, which brings support for multiple languages, JIT optimization, automatic memory management, and improved security. Another complete solution is WinRT, but that is a separate topic and is not described in detail here.

The pros and cons of JIT compilation

JIT compilation brings many benefits. The biggest one, in my opinion, is performance: it lets the CLR (the common language runtime, which here plays the role of the assembler) compile only the code that is actually needed. For example, suppose we have a very large WPF application. The whole program is not translated at once; instead, as the CLR executes it, the different parts of our code are translated into native instructions efficiently, because the JIT can inspect the system and generate optimized code rather than follow a predefined pattern. Unfortunately, one drawback is that startup is slow, which makes it a poor fit for packages that take a long time to load.
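The idea of translating only what is needed, and only once, can be modeled in a few lines of C++. This is a toy model and not how the CLR is implemented: ToyJit, call, compile and compile_count are hypothetical names, and "compiling" here merely builds a std::function where a real JIT would emit native code.

```cpp
#include <cassert>   // used by the usage line shown after this block
#include <functional>
#include <map>
#include <string>

// Toy model only: it shows compile-on-first-use with a cache.
class ToyJit {
public:
    int compile_count = 0;

    int call(const std::string& method, int arg) {
        auto it = cache_.find(method);
        if (it == cache_.end()) {
            // First use: translate the method, then remember the result.
            it = cache_.emplace(method, compile(method)).first;
        }
        return it->second(arg);   // later calls reuse the cached code
    }

private:
    std::map<std::string, std::function<int(int)>> cache_;

    std::function<int(int)> compile(const std::string& method) {
        ++compile_count;
        if (method == "double") return [](int x) { return x * 2; };
        return [](int x) { return x; };   // fallback: identity
    }
};
```

Calling jit.call("double", 4) and then jit.call("double", 5) on the same ToyJit leaves compile_count at 1, and methods that are never called are never compiled at all.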

The JIT alternative: NGen

If Visual Studio were compiled by the JIT, we would need to wait a few minutes every time it started. If, on the contrary, it is compiled with NGen (Native Image Generator), a pure binary executable is created. If speed is the only concern, that is definitely the right choice.

In an unmanaged environment, compilation is divided into two stages, compiling and linking. The compile stage transforms the source program (*.c, *.cpp, or *.h) into object code (*.o or *.obj files); it corresponds to the first three phases of the C/C++ compilation process described above. The link stage combines that object code with the code of the library functions our program calls to form the executable file (EXE file).

In a managed environment, the compilation process can be divided into lexical analysis, syntax analysis, intermediate code generation, code optimization, target code generation, and so on. Whether in .NET or Java, the compiler generates intermediate code (MSIL or bytecode) and writes the optimized intermediate code out as the target code. Finally, when the program runs, the JIT translates the IL into machine code.

Whether the language is managed or unmanaged, compilation translates the high-level language into machine code that the computer can understand. Because the compilation process involves a wide range of knowledge (compiler principles and hardware) and my ability is limited, I can only describe these processes briefly. If you want to learn more, I recommend reading a textbook on compiler principles.

Reference

[1] http://www.developingthefuture.net/compilation-process-and-jit-compiler/

Updated: 07/31/2013

Author of this article: Jk_rush

Original address: http://www.cnblogs.com/rush/

This article is excerpted from cnblogs (Blog Park).


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
