This article discusses:
Language definition
Compiler stages
The CLR abstract stack
Tools for getting your IL right
This article uses the following technologies:
The .NET Framework
Compiler hackers are celebrities in the world of computer science. I once watched Anders Hejlsberg give a talk at the Professional Developers Conference; as he walked off the stage, a crowd immediately gathered to ask him to sign books and pose for photographs. There is a certain intellectual mystique about people who devote themselves to learning the ins and outs of lambda expressions, type systems, and assembly languages. Now you can share in some of that glory by writing your own compiler for the Microsoft .NET Framework.
There are hundreds of compilers, covering dozens of languages, that target the .NET Framework. The .NET CLR brings these languages together so that their code can run side by side and interoperate without conflict. A skilled developer can take advantage of this when building a large software system, mixing a bit of C# with a bit of Python. Such developers are certainly impressive, but they do not compare to the true masters, the compiler hackers, because it is they who have a deep understanding of virtual machines, language design, and the nuts and bolts of these languages and compilers.
In this article, I'll walk you through the code for a compiler written in C# (aptly named the "Good for Nothing" compiler), and along the way introduce the high-level architecture, theory, and .NET Framework APIs needed to build your own .NET compiler. I will start with a language definition, then explore the compiler architecture, and finally walk through the code generation subsystem that emits a .NET assembly. The goal is for you to understand the foundations of compiler development and to gain a solid, high-level view of how languages target the CLR efficiently. I won't be developing a replacement for C# 4.0 or IronRuby, but the discussion still covers enough little-known technical detail to spark your enthusiasm for the craft of compiler development.
Language definition
Software languages begin with a specific purpose. That purpose varies widely: it can be expressiveness (such as Visual Basic), productivity (such as Python, which aims to get the most out of every line of code), specialization (such as Verilog, a hardware description language used by processor manufacturers), or simply satisfying the author's personal preferences (the creator of Boo, for example, likes the .NET Framework but was not happy with any of the available languages).
Once you have stated the purpose, you can design the language; think of this step as drawing up the language's blueprint. A computer language must be very precise, so that the programmer can express exactly what is required and the compiler can understand it and generate executable code for exactly what was expressed. The blueprint must therefore be fully specified, to remove ambiguity while the compiler is being implemented. For this you use a metasyntax, which is a syntax for describing the syntax of languages. There are quite a few metasyntaxes around, so you can choose one according to your own taste. I will specify the Good for Nothing language using a metasyntax called EBNF (Extended Backus-Naur Form).
It is worth mentioning that EBNF has very reputable roots: it extends Backus-Naur Form, the notation associated with John Backus, Turing Award winner and lead developer of FORTRAN. A deep discussion of EBNF is beyond the scope of this article, but I will explain the basic concepts.