CLR and C++: managed and unmanaged code

Last Update:2018-12-05 Source: Internet

Author: User

I. Compiling source code into managed modules
1. The CLR (Common Language Runtime) is a runtime that can be used by multiple programming languages. Its core features (memory management, assembly loading, security, exception handling, and thread synchronization) are available to every language that targets the CLR. For example, the runtime uses exceptions to report errors, so every language that targets the runtime also reports errors through exceptions.
2. You can think of a compiler as a syntax checker and a "correct code" analyzer. It checks your source code, makes sure that everything you wrote makes sense, and then emits code that describes your intent. Different programming languages let you develop with different syntaxes. Do not underestimate the value of this choice. For example, in mathematical or financial applications, expressing your intent in APL syntax can save considerable development time compared with expressing the same intent in Perl.
3. Microsoft has created several compilers that target the runtime, including C++/CLI, C#, Visual Basic, F#, IronPython, IronRuby, and an Intermediate Language (IL) assembler. Other companies and universities have produced compilers for languages such as Ada, APL, COBOL, and PHP. Whichever compiler you use, the result is a managed module. A managed module is a standard 32-bit Windows portable executable (PE32) file or a 64-bit PE32+ file that requires the CLR to execute.
4. A managed module consists of four parts:
① PE32 or PE32+ header: the standard Windows PE file header. If the header uses the PE32 format, the file can run on a 32-bit or 64-bit version of Windows. If it uses PE32+, the file runs only on a 64-bit version of Windows. The header also identifies the file type (GUI, CUI, or DLL) and contains a timestamp that records when the file was built. For modules that contain only IL, most of the information in the PE32(+) header is ignored. For modules that contain native CPU code (such as those written in C++), the header contains information about that native code.
② CLR header: contains the information that makes this a managed module (it is interpreted by the CLR and by some utilities). The header includes the required CLR version, some flags, the MethodDef metadata token of the managed module's entry-point method (Main), and the location and size of the module's metadata, resources, strong name, some flags, and other less important data items.
③ Metadata: every managed module contains metadata tables. There are two main kinds of tables: those that describe the types and members defined in your source code, and those that describe the types and members referenced by your source code.
④ IL (intermediate language) code: the code the compiler produces when it compiles the source code. At run time, the CLR compiles the IL into native CPU instructions.
5. A native-code compiler produces code for a specific CPU architecture (such as x86, x64, or IA64), whereas a compiler that targets the CLR produces IL. IL code is sometimes called managed code because the CLR manages its execution.
6. In addition to IL, the compiler emits complete metadata (part ③ of the managed module described above) into every managed module. Metadata is a set of data tables that describe what the module defines, such as types and their members, and what it references, such as imported types and their members. Metadata is always embedded in the same managed module as the IL code. Metadata has many uses:
A. At compile time, metadata removes the need for native C/C++ header and library files, because all the information about the referenced types and members is contained in the file that holds the IL implementing them. Compilers can read metadata directly from managed modules.
B. Visual Studio uses metadata to help you write code. Its IntelliSense feature parses metadata to tell you which methods, properties, events, and fields a type offers.
C. The CLR's code verification process uses metadata to ensure that your code performs only type-safe operations.
D. Metadata allows an object's fields to be serialized into a memory block, sent to another machine, and then deserialized, re-creating the object's state on the remote machine.
E. Metadata allows the garbage collector to track the lifetime of objects. For any object, the garbage collector can determine the object's type and, from the metadata, know which fields of that object refer to other objects.
7. By default, Microsoft's C++ compiler produces EXE/DLL modules that contain unmanaged (native) code and manipulate unmanaged memory at run time. These modules do not need the CLR to execute. However, when you specify the /CLR switch, the C++ compiler produces modules that contain managed code, and that code runs only on a machine where the CLR is installed. Of all the compilers mentioned, the C++ compiler is the unusual one, because it lets a developer write both managed and unmanaged code and emit both into a single module.
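
The PE32 versus PE32+ distinction in item 4 can be checked by reading a few header fields. The following is a minimal sketch in standard C++, assuming the whole file has already been read into a byte vector. The names PeKind, read_le, and classify_pe are made up for this illustration and are not part of any Windows or CLR API. The sketch inspects only the optional-header magic value and does not look at the CLR header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative result type: which PE flavor the image claims to be.
enum class PeKind { NotPe, Pe32, Pe32Plus };

// Read a little-endian integer of `size` bytes (at most 4) at `offset`.
// Returns false when the buffer is too short.
static bool read_le(const std::vector<std::uint8_t>& buf, std::size_t offset,
                    std::size_t size, std::uint32_t& out) {
    assert(size <= 4);
    if (offset > buf.size() || size > buf.size() - offset) return false;
    out = 0;
    for (std::size_t i = 0; i < size; ++i)
        out |= static_cast<std::uint32_t>(buf[offset + i]) << (8 * i);
    return true;
}

PeKind classify_pe(const std::vector<std::uint8_t>& image) {
    std::uint32_t value = 0;

    // The DOS header starts with the bytes 'M' 'Z'.
    if (!read_le(image, 0, 2, value) || value != 0x5A4D) return PeKind::NotPe;

    // The 4-byte field at offset 0x3C holds the file offset of the PE signature.
    std::uint32_t pe_offset = 0;
    if (!read_le(image, 0x3C, 4, pe_offset)) return PeKind::NotPe;

    // The PE signature is the bytes 'P' 'E' 0 0.
    if (!read_le(image, pe_offset, 4, value) || value != 0x00004550)
        return PeKind::NotPe;

    // A 20-byte COFF file header follows the 4-byte signature. The optional
    // header comes next, and its first 2-byte field is the magic value.
    const std::size_t magic_offset = static_cast<std::size_t>(pe_offset) + 24;
    if (!read_le(image, magic_offset, 2, value)) return PeKind::NotPe;

    if (value == 0x10B) return PeKind::Pe32;      // 32-bit optional header
    if (value == 0x20B) return PeKind::Pe32Plus;  // 64-bit optional header
    return PeKind::NotPe;
}
```

A caller would load a file into a std::vector<std::uint8_t> and pass it to classify_pe. Deciding whether the file is also a managed module would require locating the CLR header, which this sketch leaves out.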

II. Combining managed modules into assemblies
1. The CLR does not actually work with modules; it works with assemblies. An assembly is an abstract concept that can be difficult to grasp at first. First, an assembly is a logical grouping of one or more modules or resource files. Second, an assembly is the smallest unit of reuse, security, and versioning. In the CLR world, an assembly is roughly what you might call a component. Chapter 2 covers assemblies in more depth, so the details are skipped here.
2. Figure 1-2 helps explain what assemblies are. Some managed modules and resource (or data) files are processed by a tool that produces a single PE32(+) file representing the logical grouping of files. This PE32(+) file contains a block of data called the manifest. The manifest is a set of metadata tables that describe the files that make up the assembly, the publicly exported types implemented by the files in the assembly (that is, the public types defined in the assembly, which are visible outside it), and the resource or data files associated with the assembly.

By default, the compiler itself turns the managed module it produces into an assembly.

III. Loading the Common Language Runtime
1. To determine whether the .NET Framework is installed, look for the MSCorEE.dll file in the %SystemRoot%\System32 directory. If the file exists, the .NET Framework is installed. Several versions can be installed on one machine. To find out which versions are installed, examine the subkeys of the following registry key:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\NET Framework Setup\NDP
This key does not exist on Windows XP. Use the following key instead:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework
2. The .NET Framework SDK includes the CLRVer.exe utility, which shows the installed CLR versions.
> clrver // list all CLR versions installed on the local machine
> clrver -all // list the CLR version used by every running .NET process
> clrver 322 // show the CLR version used by the process whose ID is 322

IV. Executing your assembly's code
To execute a method, its IL must first be converted to native CPU instructions. This is the job of the CLR's JIT (just-in-time) compiler.

Figure 1-4 shows what happens when a method is called for the first time.

Just before the Main method executes, the CLR detects all the types referenced by Main's code. This causes the CLR to allocate an internal data structure that it uses to manage access to the referenced types. In Figure 1-4, the Main method refers to a single type, Console, so the CLR allocates a single internal structure. This structure contains an entry for each method defined by the Console type. Each entry holds the address at which the method's implementation can be found. When initializing the structure, the CLR sets every entry to an internal, undocumented function contained in the CLR itself. I call this function JITCompiler.
When Main makes its first call to WriteLine, the JITCompiler function is called. The JITCompiler function is responsible for compiling a method's IL code into native CPU instructions. Because the IL is compiled "just in time", this component of the CLR is often called a JITter or a JIT compiler.

Note: If the application runs on an x86 version of Windows or in WoW64, the JIT compiler produces x86 instructions. If the application runs as a 64-bit application on an x64 or Itanium version of Windows, the JIT compiler produces x64 or IA64 instructions, respectively.
When called, the JITCompiler function knows which method is being called and which type defines it. JITCompiler then searches the metadata of the defining assembly for the called method's IL. Next, it verifies the IL and compiles it into native CPU instructions, which are saved in a dynamically allocated block of memory. JITCompiler then goes back to the entry for the called method in the type's internal data structure and replaces the reference that initially pointed to JITCompiler with the address of the memory block containing the native CPU instructions it just compiled. Finally, JITCompiler jumps to the code in the memory block. This code is the implementation of the WriteLine method (the version that takes a single String parameter). When this code returns, it returns to the code in Main, which continues executing as normal.
Main now calls WriteLine a second time. This time, the code for WriteLine has already been verified and compiled, so the call goes directly to the block of memory and skips the JITCompiler function entirely. After WriteLine executes, it returns to Main. Figure 1-5 shows what the process looks like when WriteLine is called the second time.
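
The compile-on-first-call behavior described above can be modeled in a few lines. This is only a toy sketch in standard C++: the real CLR structures are internal and undocumented, and the names MethodTable, compile_method, add_method, and call are invented here. An empty entry stands in for "still points at JITCompiler", and compile_method stands in for verifying and compiling IL.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Toy model of the per-type structure: one entry per method.
struct MethodTable {
    // An empty std::function means the entry has not been compiled yet.
    std::map<std::string, std::function<int(int)>> entries;
    int compile_count = 0;  // how many times the stand-in "JIT" has run
};

// Stand-in for turning a method's IL into native code.
std::function<int(int)> compile_method(const std::string& name) {
    if (name == "Double") return [](int x) { return x * 2; };
    return [](int x) { return x; };  // any other method just echoes its input
}

// Register a method; its entry starts out uncompiled.
void add_method(MethodTable& table, const std::string& name) {
    table.entries[name] = nullptr;
}

// Call a method through the table. The first call compiles the method and
// patches the entry; later calls run the stored code directly.
int call(MethodTable& table, const std::string& name, int arg) {
    auto it = table.entries.find(name);
    assert(it != table.entries.end());
    if (!it->second) {
        it->second = compile_method(name);
        ++table.compile_count;
    }
    return it->second(arg);
}
```

Calling call(table, "Double", 21) twice on a table that registered "Double" compiles the method once and reuses the patched entry the second time, which mirrors the difference between Figure 1-4 and Figure 1-5.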

A method incurs a performance hit only the first time it is called. All subsequent calls execute at the full speed of native code, because the IL does not need to be verified and compiled again.
The JIT compiler stores the native CPU instructions in dynamic memory, so the compiled code is discarded when the application terminates. If you run the application again later, or run two instances of it at the same time (in two separate operating system processes), the JIT compiler must compile the IL to native instructions again.
For most applications, the performance hit incurred by JIT compilation is not significant. Most applications call the same methods over and over, so each method costs compilation time only once while the application runs. In addition, the time spent inside a method is likely to be far greater than the time spent calling it.
You should also be aware that the CLR's JIT compiler optimizes the native code, just as the back end of an unmanaged C++ compiler does. Producing optimized code may take more time, but the optimized code runs faster than it would have without optimization.

V. The native code generator: NGen.exe
The NGen.exe tool that ships with the .NET Framework can compile IL code to native code when an application is installed on a user's machine. Because the code is compiled at install time, the CLR's JIT compiler does not have to compile the IL at run time, which can improve the application's performance. NGen.exe is useful in two scenarios:
• Improving an application's startup time. Running NGen.exe can speed up startup because the code is already compiled to native code and does not have to be compiled at run time.
• Reducing an application's working set. If an assembly will be loaded into multiple processes at the same time, running NGen.exe on that assembly can reduce the applications' working set. NGen.exe compiles the IL to native code and saves the result in a separate file. That file can be memory-mapped into multiple process address spaces at once, so the code is shared and each process does not need its own copy.

VI. The Framework Class Library
The .NET Framework includes the Framework Class Library (FCL). The FCL is a set of DLL assemblies that contain several thousand type definitions, each of which exposes some functionality. It is not the only library Microsoft ships: others include the Windows SideShow Managed API SDK and the DirectX SDK. These additional libraries provide even more types and functionality. In fact, Microsoft is producing libraries at a phenomenal rate, making it easier than ever for developers to use various Microsoft technologies.

VII. The Common Type System
By now it should be obvious that the CLR is all about types. Types expose functionality to your applications and to other types. Types are the mechanism by which code written in one programming language can talk to code written in another. Because types are at the root of the CLR, Microsoft created a formal specification, the Common Type System (CTS), that describes how types are defined and how they behave.
The CTS specification states that a type can contain zero or more members. These members are discussed in more detail in Part II, "Designing Types". For now, here is a brief introduction to them.
• Field: a data variable that is part of the object's state. Fields are identified by their name and type.
• Method: a function that performs an operation on the object, often changing the object's state. A method has a name, a signature, and one or more modifiers. The signature specifies the number of parameters (and their order), the types of the parameters, whether the method returns a value, and, if so, the type of the returned value.
• Property: to the caller, this member looks like a field. To the type implementer, it looks like a method (or two methods, a getter and a setter). Properties let an implementer validate input parameters and object state before accessing the value, and/or calculate a value only when necessary. They also give users of the type a simplified syntax. Finally, properties let you create read-only or write-only "fields".
• Event: a notification mechanism between an object and other interested objects. For example, a button could offer an event that notifies other objects when the button is clicked.
The CTS also specifies the rules for type visibility and for access to the members of a type. For example, marking a type as public (the public modifier in C#) makes it visible and accessible to any assembly. Marking a type as assembly (the internal modifier in C#) makes it visible and accessible only to code within the same assembly. Thus, the CTS establishes the rules by which assemblies form a boundary of visibility for a type, and the CLR enforces those rules. A type that is visible to a caller can still restrict the caller's access to its members. The following options control access to a member:
• Private: accessible only by other members of the same class type.
• Family: accessible by derived types, regardless of whether they are in the same assembly. Many languages (such as C++ and C#) refer to family as protected.
• Family and assembly: accessible by derived types, but only if the derived type is defined in the same assembly. Many languages have not offered this access control (C# gained it only in version 7.2, as private protected). IL assembly language makes it available.
• Assembly: accessible by any code in the same assembly. Many languages refer to assembly as internal.
• Family or assembly: accessible by derived types in any assembly and by any type in the same assembly. C# refers to this as protected internal.
• Public: accessible by any code in any assembly.
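
The six accessibility levels can be summarized as a small decision function. The sketch below is a simplified model in standard C++, not how the CLR implements the check: the Caller struct and can_access function are hypothetical, and the model ignores nested types and the visibility of the declaring type itself.

```cpp
#include <cassert>

// The six CTS member accessibilities listed above.
enum class Access {
    Private, Family, FamilyAndAssembly, Assembly, FamilyOrAssembly, Public
};

// Hypothetical description of the calling code relative to the type that
// declares the member.
struct Caller {
    bool same_type;      // the caller is code inside the declaring type
    bool derived_type;   // the caller is code inside a derived type
    bool same_assembly;  // the caller lives in the same assembly
};

bool can_access(Access access, const Caller& caller) {
    if (caller.same_type) return true;  // a type can always reach its own members
    switch (access) {
        case Access::Private:           return false;
        case Access::Family:            return caller.derived_type;
        case Access::FamilyAndAssembly: return caller.derived_type && caller.same_assembly;
        case Access::Assembly:          return caller.same_assembly;
        case Access::FamilyOrAssembly:  return caller.derived_type || caller.same_assembly;
        case Access::Public:            return true;
    }
    return false;  // not reached for valid enum values
}
```

For example, a derived type in a different assembly passes the Family and FamilyOrAssembly checks but fails FamilyAndAssembly, which is the difference between the "and" and "or" variants.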
In addition, CTS defines rules for type inheritance, virtual methods, and object lifetime. These rules are designed to adapt to the semantics that can be expressed in modern programming languages.

VIII. The Common Language Specification
COM allows objects created in different languages to communicate with one another. The CLR now integrates all languages and lets code written in one language use objects created in another. This integration is possible because of the CLR's standard set of types, its metadata (self-describing type information), and its common execution environment.
Language integration is an ambitious goal, and the hardest problem is that programming languages differ greatly from one another. For example, some languages do not treat symbols case-sensitively, and some do not support unsigned integers, operator overloading, or methods that take a variable number of arguments.
To create types that are easily accessible from other programming languages, you must use only those features of your own language that are guaranteed to be available in all other languages. To help with this, Microsoft defined the Common Language Specification (CLS), which details the minimum set of features. A compiler must support this minimum set if the types it generates are to be compatible with components generated by other CLS-compliant languages that target the CLR.
As Figure 1-6 shows, the CLR/CTS offers a set of features. Some languages expose a large subset of the CLR/CTS. A developer who writes in IL assembly language, for example, can use every feature the CLR/CTS offers. Most other languages (such as C#, Visual Basic, and Fortran) expose only a subset of the CLR/CTS features to the developer. The CLS defines the minimum set of features that all languages must support.
If you define a type in one language and expect it to be used from another language, do not use any features outside the CLS in its public and protected members. Otherwise, developers writing code in other languages may be unable to access your type's members.
The following C# code defines a type that is intended to be CLS-compliant. However, the type contains a few constructs that are not CLS-compliant, so the C# compiler reports warnings:
using System;

// Tell the compiler to check for CLS compliance
[assembly: CLSCompliant(true)]

namespace SomeLibrary {
    // Warnings appear because the class is public
    public sealed class SomeLibraryType {

        // Warning: the return type of SomeLibrary.SomeLibraryType.Abc()
        // is not CLS-compliant
        public UInt32 Abc() { return 0; }

        // Warning: the identifier SomeLibrary.SomeLibraryType.abc()
        // differs only in case, which is not CLS-compliant
        public void abc() { }

        // No warning: this method is private
        private UInt32 ABC() { return 0; }
    }
}
The code above applies the [assembly: CLSCompliant(true)] attribute to the assembly. This attribute tells the compiler to check every publicly exposed type for constructs that would prevent the type from being accessed from other programming languages. When this code is compiled, the C# compiler reports two warnings. The first is reported because the Abc method returns an unsigned integer, and some languages cannot work with unsigned integer values. The second is reported because the type exposes two public methods, Abc and abc, that differ only by case and return type. Visual Basic and some other languages cannot distinguish between these two methods.
Interestingly, if you delete the public keyword in front of sealed class SomeLibraryType and recompile, both warnings go away. The SomeLibraryType type then defaults to internal (rather than public) and is no longer exposed outside the assembly.
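
The second warning, about public members whose names differ only by case, comes down to a case-insensitive duplicate check. The sketch below shows one way such a check could be written in standard C++. It is not the C# compiler's algorithm: the function names are invented, the member names passed in are hypothetical, and only ASCII letters are folded.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Lowercase an identifier. Illustration only: handles ASCII letters.
std::string to_lower_ascii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char ch) {
        return static_cast<char>(std::tolower(ch));
    });
    return s;
}

// Return pairs of names that are spelled differently but become equal once
// case is ignored. Each later name is paired with the first spelling seen.
std::vector<std::pair<std::string, std::string>>
find_case_collisions(const std::vector<std::string>& public_names) {
    std::map<std::string, std::string> first_seen;  // lowercase -> first spelling
    std::vector<std::pair<std::string, std::string>> collisions;
    for (const std::string& name : public_names) {
        const std::string key = to_lower_ascii(name);
        auto it = first_seen.find(key);
        if (it == first_seen.end()) {
            first_seen.emplace(key, name);
        } else if (it->second != name) {
            collisions.emplace_back(it->second, name);
        }
    }
    return collisions;
}
```

Only publicly visible names would be passed to such a check, which is why a private method, or a type that is internal to its assembly, does not trigger the warning.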

IX. Interoperability with unmanaged code
The .NET Framework offers many advantages over other development platforms. However, very few companies can afford to redesign and reimplement all of their existing code. Microsoft recognizes this and has built the CLR so that an application can consist of both managed and unmanaged parts. Specifically, the CLR supports three interoperability scenarios:
• Managed code can call an unmanaged function in a DLL. Managed code can call functions contained in DLLs by using a mechanism called P/Invoke (Platform Invoke). After all, many of the types defined in the FCL internally call functions exported from Kernel32.dll, User32.dll, and so on. Many programming languages expose a mechanism that makes it easy for managed code to call unmanaged functions contained in DLLs. For example, a C# application can call the CreateSemaphore function exported from Kernel32.dll.
• Managed code can use an existing COM component (server). Many companies have already implemented a large number of unmanaged COM components. Using the type library from these components, you can create a managed assembly that describes the COM component. Managed code can then access the types in that assembly just like any other managed type. See the TlbImp.exe tool that ships with the .NET Framework SDK for more information. At times, you may not have a type library, or you may want more control over what TlbImp.exe produces. In those cases, you can manually build a type in source code that the CLR can use to achieve proper interoperability. For example, you can use DirectX COM components from a C# application.
• Unmanaged code can use a managed type (server). A lot of existing unmanaged code requires that you supply a COM component for the code to work correctly. These components are much easier to implement in managed code, which avoids having to deal with reference counting and interfaces. For example, you can create an ActiveX control or a shell extension in C#. See the TlbExp.exe and RegAsm.exe tools that ship with the .NET Framework SDK for more information.
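
For the first scenario, it helps to see what the unmanaged side of a P/Invoke call can look like. The sketch below shows a function that a native C++ DLL might export. The names SKETCH_EXPORT and sketch_add are hypothetical. C linkage is used so that the exported name is not decorated by the C++ compiler, since the managed declaration identifies the function by its exported name.

```cpp
#include <cassert>

// Export macro for this sketch. On Windows the function is marked for export
// from the DLL; elsewhere the macro only requests C linkage.
#if defined(_WIN32)
#define SKETCH_EXPORT extern "C" __declspec(dllexport)
#else
#define SKETCH_EXPORT extern "C"
#endif

// A plain unmanaged function with a C-compatible signature. Simple value
// types such as int need no special marshaling between managed and
// unmanaged code.
SKETCH_EXPORT int sketch_add(int a, int b) {
    return a + b;
}
```

On the managed side, a C# caller would declare a matching static extern method marked with the DllImport attribute and naming the DLL. On 32-bit x86, the calling convention in that declaration must also match the one used by the native function.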

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
