Compilation is one of my biggest headaches. I have studied it three times without really understanding it (though I still wrote up some "experiences" that said nothing). Now I am on my fourth pass. My deepest realization about computer science is that you can only truly learn something by doing it yourself, and so far I have had little hands-on practice with compiler principles, which is why they are hard to grasp.
Here is a summary of my learning experience so far.
The textbook I am currently using is Compiler Design in C. Its biggest advantage is that it provides complete compiler code. However, that code is written in C, and my C foundation is weak.
At work I use C#. I cannot really write C programs, but I can more or less read them. I am currently at the part of lexical analysis that converts an NFA into a DFA.
In addition, using the ready-made C# grammar and the ANTLR tool, I developed a simple code-analysis tool (it counts the lines of code and lines of comments in each class and method), as well as a small tool that inserts statements at the beginning and end of a method to measure performance.
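The counting side of such a tool boils down to classifying each source line. As a minimal sketch (my own illustration, not code from that tool; a real version built on the ANTLR parse tree must also handle block comments and string literals), a line classifier might look like this:

```c
#include <string.h>

/* Return 1 if the line is a "//" comment line (after leading
 * whitespace), 0 otherwise. Block comments and comments embedded
 * in strings are deliberately ignored in this sketch. */
static int is_comment_line(const char *line)
{
    while (*line == ' ' || *line == '\t')
        line++;
    return line[0] == '/' && line[1] == '/';
}
```

A driver would call this on every line of a method body and keep two counters, one for comment lines and one for the rest.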
I also read part of the ANTLR source and the compiler code it generates, and was mostly left confused.
The first chapter of Compiler Design in C presents a simple compiler, which I have rewritten in C#. It may come in handy for simple applications later.
Since the book was published quite a while ago, some of its techniques are dated. For example, its input subsystem uses a buffering scheme that reads the input in segments, which made sense given the hardware of the time. The older ANTLR code seems to use a similar ring-like structure for caching (Java has no pointers), but the newer version simply allocates one large array and reads all the characters at once.
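The "one large array" approach is easy to sketch in C (my own illustration of the idea, not ANTLR's actual code, which is Java): slurp the whole file into a single NUL-terminated buffer and let the lexer walk it with plain pointer arithmetic, no refill logic needed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file into one malloc'd, NUL-terminated buffer,
 * instead of maintaining a segmented input buffer that must be
 * refilled as the lexer advances. Returns NULL on any error;
 * the caller frees the result. */
static char *read_whole_file(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return NULL;
    fseek(fp, 0, SEEK_END);
    long n = ftell(fp);
    rewind(fp);
    char *buf = malloc((size_t)n + 1);
    if (buf && fread(buf, 1, (size_t)n, fp) == (size_t)n) {
        buf[n] = '\0';
    } else {
        free(buf);
        buf = NULL;
    }
    fclose(fp);
    return buf;
}
```

The trade-off is memory: fine for source files on modern machines, but exactly what the book's segmented buffer was designed to avoid.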
In the book's simple compiler, the parser calls the lexer to read a token only when it needs one. In ANTLR's code, the first time the parser calls the lexer, all the tokens are extracted up front, which is probably more efficient.
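The pull-style interface from the book can be sketched as follows (a minimal illustration of the pattern, with made-up token names, not the book's actual code): the parser drives everything by calling `next_token` whenever it needs the next token, and the lexer advances through the source one token at a time.

```c
#include <ctype.h>

/* Illustrative token kinds for a tiny expression language. */
enum tok { TOK_NUM, TOK_PLUS, TOK_OTHER, TOK_EOF };

static const char *src;  /* current read position in the source text */

/* Pull-style lexer: each call consumes and returns exactly one
 * token. A parser calls this on demand instead of receiving a
 * pre-built token array. */
static enum tok next_token(void)
{
    while (*src == ' ')
        src++;
    if (*src == '\0')
        return TOK_EOF;
    if (*src == '+') {
        src++;
        return TOK_PLUS;
    }
    if (isdigit((unsigned char)*src)) {
        while (isdigit((unsigned char)*src))
            src++;
        return TOK_NUM;
    }
    src++;  /* skip anything unrecognized */
    return TOK_OTHER;
}
```

The eager alternative simply loops `next_token` until `TOK_EOF` once, stores the results in an array, and lets the parser index into it, which also makes arbitrary lookahead and backtracking cheap.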
For lexical analysis, Compiler Design in C seems to build the NFA directly from the regular expression, while ANTLR builds it from an abstract syntax tree, which may be more convenient.
A simple lexer can be hard-coded by hand. NFAs and DFAs seem to exist mainly to suit automatic tools such as lex; in real work, complicated lexers are more likely to be generated by a tool than built by implementing the NFA-to-DFA conversion manually. Studying them is mostly about "knowing why" and picking up ideas (for example, the DFA transition table looks applicable to all sorts of state-machine problems), not necessarily something you will implement yourself.
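To make the transition-table idea concrete, here is a small table-driven DFA of my own (an illustration of the pattern lex-style generators emit, not code from the book or ANTLR) that accepts C-style identifiers:

```c
#include <ctype.h>

/* Map a character to an input class: 0 = letter or '_',
 * 1 = digit, 2 = anything else. */
static int input_class(char c)
{
    if (isalpha((unsigned char)c) || c == '_')
        return 0;
    if (isdigit((unsigned char)c))
        return 1;
    return 2;
}

/* Table-driven DFA for identifiers: a letter or '_' followed by
 * letters, digits, or '_'. States: 0 = start, 1 = accepting,
 * 2 = dead. The whole recognizer is the delta table plus one loop. */
static int is_identifier(const char *s)
{
    static const int delta[3][3] = {
        /* letter  digit  other */
        {  1,      2,     2 },  /* state 0: start     */
        {  1,      1,     2 },  /* state 1: accepting */
        {  2,      2,     2 },  /* state 2: dead      */
    };
    int state = 0;
    for (; *s; s++)
        state = delta[state][input_class(*s)];
    return state == 1;
}
```

The same shape, a table indexed by (state, input class) plus a loop, carries over directly to protocol handlers, parsers for simple formats, and other state-related problems, which is why the technique is worth knowing even if you never generate a DFA from an NFA yourself.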
These are my notes so far. Learning this material takes real effort; I will add more as I make progress.