There is already a question about completing an operating system:
How can a senior computer science student complete a simple operating system within six months (probably working only at night)? What should they learn?
In the answers there, I saw many people recommend writing a compiler instead. Suppose I wanted to write a JavaScript or Python compiler: what would I need to do, and how should I plan it?
Reply content:
If you are not set on a mainstream language, you can read chapters 1, 2 and 4 of SICP and write a Scheme interpreter afterwards.
I do not recommend the Dragon Book, the Tiger Book and the like, because the barrier to entry is high. Books that lean that far toward theory are hard to follow for students with little practical experience. Add the slow pace and the late sense of accomplishment, and the project is likely to be abandoned halfway. So I am only offering fast-track advice here.
Chapters 1 and 2 of SICP, as a tutorial on functional programming and programming technique, have already benefited many people.
Students with some experience, after reading the brief description of how closures are implemented in chapter 2 (2 sentences), will probably be itching to write a simple Scheme interpreter. At this stage, though, your interpreter will be redundant, slow, and without tail-call support, so you need to read chapter 4. If you want to go further, chapter 4 is the essence of the book: lazy evaluation, pattern matching, continuations, and partial evaluation can all draw inspiration from it.
What can you gain by implementing a Scheme interpreter?
You get a working feel for the basic principles of interpreters and how to optimize them.
If you go straight to the closure implementation in the Lua source, you may get lost in all kinds of details. Even after working hard to sort out the ins and outs, you may just conclude that it is too sophisticated. After all, a critical facility in an industrial-grade project is often the product of several rounds of revision and heavy optimization.
But if you have already had the experience, in Scheme, of implementing a concise and correct closure in a handful of lines of code, then going back to read the Lua source, or disassembling .NET bytecode to see how C# implements closures, becomes very easy. After all, it is often better to get it working first and make it good afterwards.
If you have reasonable follow-through, then in 1-2 months you should be able to read the relevant parts of the book and build one or more interpreters, so the effort will not be wasted.
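To make "a concise, correct closure in a few lines" concrete, here is a minimal sketch in Python rather than Scheme. It is not the SICP evaluator: the nested-list program representation, the `Closure` class, and the `GLOBAL_ENV` table are assumptions made for illustration, and environments are copied dictionaries, which only works because there is no assignment.

```python
import operator

# Hypothetical global environment with two builtins.
GLOBAL_ENV = {"+": operator.add, "*": operator.mul}


class Closure:
    """A lambda paired with the environment in which it was created."""

    def __init__(self, params, body, env):
        self.params, self.body, self.env = params, body, env


def evaluate(expr, env):
    if isinstance(expr, (int, float)):      # literal
        return expr
    if isinstance(expr, str):               # variable reference
        return env[expr]
    if expr[0] == "lambda":                 # abstraction: capture env
        _, params, body = expr
        return Closure(params, body, env)
    # application: evaluate operator and operands, then call
    fn = evaluate(expr[0], env)
    args = [evaluate(arg, env) for arg in expr[1:]]
    if isinstance(fn, Closure):
        # Extend the captured environment, not the caller's.
        new_env = {**fn.env, **dict(zip(fn.params, args))}
        return evaluate(fn.body, new_env)
    return fn(*args)


# (lambda (n) (lambda (x) (+ x n))) returns a closure that remembers n.
make_adder = ["lambda", ["n"], ["lambda", ["x"], ["+", "x", "n"]]]
print(evaluate([[make_adder, 10], 32], GLOBAL_ENV))  # 42
```

The inner lambda still sees `n` after `make_adder` has returned, because the `Closure` object keeps the environment it was created in. Everything a production implementation adds (upvalues, display classes, escape analysis) is an optimization of this one idea.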
Here are some steps for your reference:
- Read the relevant chapters of SICP and write a Scheme interpreter in Scheme. Because the list is the basic structure in Scheme, it is easy to construct an abstract syntax tree. Add Scheme's dynamic typing and strong pattern matching, and a most basic self-hosted interpreter takes only a few dozen lines of code (with full closure support). You will find that the key branches of this interpreter correspond exactly to the 3 elements of the lambda calculus, which are also the "basic element", "abstraction", and "application" that SICP emphasizes again and again.
- Look at how Peter Norvig implements a Scheme interpreter in a few dozen lines of Python: (How to Write a (Lisp) Interpreter (in Python)). Now you know how to implement an interpreter in a mainstream language; welcome back to Earth... Norvig also introduces a technique that lets your interpreter support tail recursion.
- Implement the interpreter in C++. Implementing a garbage-collected language in a non-GC language for the first time, you have to start thinking seriously about memory management. I recommend not implementing assignment for now. That way, in your functional interpreter, new objects only ever refer to old objects, no reference cycles arise, and reference counting can be used directly.
- Make it faster and better.
- The interpreter implemented so far works by matching on the abstract syntax tree, which is equivalent to the visitor pattern in OO design patterns. What if you traverse the syntax tree once, compile it to bytecode, save that, and afterwards only interpret the bytecode? You can obtain significant performance gains. This preprocessing is a form of partial evaluation.
- Found a performance bottleneck in variable lookup? You need to introduce lexical addressing. As you optimize step by step, you will find your environment chain collapsing into the mainstream form used by Lua and C#.
- Drop reference counting and move to a real GC? Mark-and-sweep, copying GC, mark-compact GC, generational GC? How do you register a C function in your environment while making sure the temporary GC objects allocated in its body remain valid? When moving a GC object, how do you ensure that all pointers to it are updated (especially pointer variables on the stack)?
- Obviously, programs are not always written in tail-recursive form. How do you keep non-tail-recursive programs from overflowing the stack when interpreted? The technique Norvig introduces is not enough. You need to consider implementing a CPS interpreter or a whole-program CPS transformation, so that stack space is moved onto the heap and the stack only ever holds one frame.
- Try implementing library forms such as let, let* and cond in Scheme itself, by translating them into basic forms? Then you start to appreciate the power of macros: syntactic abstraction, a powerful and commonly used weapon beyond data abstraction and functional abstraction.
- Implement call/cc in the CPS interpreter? Or, via a whole-program CPS transformation, emit ordinary Scheme source that contains no call/cc and feed it to your direct-style (DS) interpreter?
- If you want to do better, there are many more questions to think about, and working through them is excellent exercise.
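Two of the steps above, the one-time preprocessing pass and lexical addressing, can be sketched together. This is only an illustration under stated assumptions: instead of real bytecode it compiles each expression once into a Python callable (a swapped-in technique usually called closure compilation), and the names `compile_expr` and `BUILTINS` are hypothetical. Programs are nested lists such as `["lambda", ["x"], body]`.

```python
import operator

BUILTINS = {"+": operator.add, "*": operator.mul}  # hypothetical global table


def compile_expr(expr, scope):
    """Translate expr once into a Python callable that takes a runtime env.

    scope is a compile-time list of parameter-name lists, innermost first.
    A runtime env is a pair (frame, parent_env) with the same shape.
    """
    if isinstance(expr, (int, float)):
        return lambda env: expr
    if isinstance(expr, str):
        for depth, names in enumerate(scope):
            if expr in names:
                index = names.index(expr)

                def lookup(env, depth=depth, index=index):
                    for _ in range(depth):   # hop out `depth` frames
                        env = env[1]
                    return env[0][index]     # then index; no name search
                return lookup
        value = BUILTINS[expr]               # not local: resolve it right now
        return lambda env: value
    if expr[0] == "lambda":
        _, params, body = expr
        body_code = compile_expr(body, [params] + scope)

        def make_closure(env):
            # Calling the closure pushes one new frame onto the captured env.
            return lambda *args: body_code((args, env))
        return make_closure
    fn_code = compile_expr(expr[0], scope)
    arg_codes = [compile_expr(arg, scope) for arg in expr[1:]]
    return lambda env: fn_code(env)(*[code(env) for code in arg_codes])


# ((lambda (x) ((lambda (y) (+ x y)) 2)) 40)
program = [["lambda", ["x"], [["lambda", ["y"], ["+", "x", "y"]], 2]], 40]
code = compile_expr(program, [])
print(code(None))  # 42
```

After compilation, `x` has become the address (depth 1, index 0) and `y` the address (depth 0, index 0). Names are never looked up at run time, and the syntax tree is never inspected again, which is where the speedup comes from.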
The JavaScript specification has more than 200 pages (Harmony doubles the page count); reading the spec alone would take you a month...
Write a Lisp dialect. Follow these two diagrams and unlock the skills shown in them.
Isn't there this book?
Turing Community: Books: Two weeks homemade scripting language
And this one:
Turing Community: Books: Home-made programming languages
Your graduation project is a compiler, right? Feeling nervous? O (≧v≦) o
Let's be clear: are you sure you want to write a compiler? If you have not thought it through, take a look at the article I recommend
Early: Compilers
If you are still fine after reading the above, then do the following, since your time is very short. With only evenings available, schedule your study time to start at 19:00 (damn it, don't tell me that's too early!!) and end at 23:00 (many schools cut off the network at that time).
Basic knowledge
- I want to write a compiler. Besides compiler-theory material, what else can I refer to? Where should I start? (in C++)
- How difficult is it to develop a C++ compiler, and where does the difficulty lie?
- How do you write a compiler (a C compiler written in C)? What groundwork is needed, and can you recommend relevant websites and books?
- How does a compiler work at the hardware level?
- Writing a Compiler from Scratch series - Ordinary and Extraordinary - column
The first 4 items are what you need to work through on the first evening. That is 4 hours; if something is unclear, look it up right away and Google it (Google; no proxy needed for this one). On the second evening, do not move on to item 5. Go back over items 1-4 from the start, check whether you missed anything, and take notes.
Once you have finished the 4 items, decide which language you want to write a compiler for, and then you can move on to item 5. By the way, you need to buy a book. Do not listen to people who tell you to buy the Dragon Book, the Tiger Book and so on; those masterpieces are too difficult. Buy an introductory one instead. Recommended: Turing Community: Books: Two weeks homemade scripting language.
Writing a compiler is dull work, and visible results are slow to come. It boils down to "tinkering".
I still think you should start by writing a compiler for a Lisp-family language such as Scheme. It does not have to be a full implementation; you can choose a subset and leave out GC, closures, and so on.
You can write an interpreter first, then try compiling to another language, and then try compiling to binary.
There is too much material on this; search for it yourself.
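As a small illustration of the middle step, "compiling to another language", here is a sketch that translates a tiny Lisp-like subset into Python source text instead of interpreting it. The nested-list input format and the function name `to_python` are assumptions for this example; only numbers, variables, `lambda`, calls, and three binary operators are handled.

```python
def to_python(expr):
    """Translate a nested-list Lisp expression into Python source text."""
    if isinstance(expr, (int, float)):
        return repr(expr)
    if isinstance(expr, str):                 # variable name, emitted as-is
        return expr
    head = expr[0]
    if head == "lambda":
        _, params, body = expr
        return "(lambda " + ", ".join(params) + ": " + to_python(body) + ")"
    if head in ("+", "-", "*"):               # binary operators become infix
        left, right = expr[1], expr[2]
        return "(" + to_python(left) + " " + head + " " + to_python(right) + ")"
    # anything else is a function call
    args = ", ".join(to_python(arg) for arg in expr[1:])
    return to_python(head) + "(" + args + ")"


# ((lambda (x) (* x x)) 7)
square = ["lambda", ["x"], ["*", "x", "x"]]
source = to_python([square, 7])
print(source)        # (lambda x: (x * x))(7)
print(eval(source))  # 49
```

The target language does the heavy lifting here (closures, memory management, arithmetic), which is why this step is much easier than compiling to binary, where you have to provide all of that yourself.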
As for books, I once asked a question about that and many of the answers are very good; go look for it. I wrote a C compiler, which really just translates C into assembly. To make a compiler good, the most important thing is code optimization, but for a novice, getting the translation working comes first.
Lexical analysis is simple and can be written by hand. For syntax, you can find a standard C grammar and use a tool such as bison to generate the parse table (though, like Wheel Brother, I wrote my own LR(1) parse-table generator). The rest is simple, and you end up with an AST. Translating into assembly is then a traversal of the AST, and I did not handle all of C's syntax, such as structs. The basic C constructs are intricate but not difficult, and if you are proficient in assembly it is even easier. (In fact, it is best to generate an intermediate representation, IR, first and then translate that into assembly.)
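The pipeline just described (handwritten lexer, AST, then a traversal that emits code) can be sketched for arithmetic expressions. This is not the author's compiler: it uses a recursive-descent parser instead of a bison or LR(1) table, and it targets a hypothetical stack machine (`PUSH`, `ADD`, `SUB`, `MUL`, `DIV`) rather than real assembly. All names are assumptions for illustration.

```python
def tokenize(text):
    """Hand-written lexer: integers and single-character operators."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            start = i
            while i < len(text) and text[i].isdigit():
                i += 1
            tokens.append(("num", int(text[start:i])))
        elif ch in "+-*/()":
            tokens.append(("op", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    return tokens


class Parser:
    """Recursive descent; AST nodes are ("num", n) or (op, left, right)."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):      # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):      # term := factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):    # factor := NUMBER | '(' expr ')'
        kind, value = self.advance()
        if kind == "num":
            return ("num", value)
        if (kind, value) == ("op", "("):
            node = self.expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")


OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(node, out):
    """Post-order AST traversal: operands first, then the operator."""
    if node[0] == "num":
        out.append(f"PUSH {node[1]}")
    else:
        op, left, right = node
        generate(left, out)
        generate(right, out)
        out.append(OPCODES[op])
    return out


def compile_source(text):
    return generate(Parser(tokenize(text)).expr(), [])


print(compile_source("1 + 2 * 3"))
# ['PUSH 1', 'PUSH 2', 'PUSH 3', 'MUL', 'ADD']
```

For brevity the sketch does not check for leftover tokens after the expression. Once types enter the picture, each AST node would also need to carry its type so that the traversal can choose between integer and floating-point instructions.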
While writing it, the design of the AST nodes is key, because they have to store a lot of information; the design of the symbol table is also very important. During translation I really felt that types are a factor you must consider at all times: the same addition or subtraction looks very different at the assembly level for integer and floating-point types. In addition, I have always felt that the syntax-directed translation taught in textbooks is too contrived. For many grammars you have to rewrite the grammar to make it work, so it is more pleasant to honestly build an AST and translate from that.
Start with a C compiler, and do not write Python or JavaScript compilers, especially not a JavaScript compiler.
I was on my phone yesterday; let me explain today.
JavaScript and Python are not context-free languages and are relatively difficult to handle. C89, on the other hand, has a complete LALR(1)-compatible EBNF grammar, which is also easy to find online.
For the study of grammars and parsing methods, see one of my answers:
How to create a computer language?
The simplest way to learn to write a compiler is to read someone else's directly, though of course the main thing is to study compiler principles. I implemented my own when I was finishing the compiler principles course. Really, I read the code, thought it through, and then wrote it out myself; the whole thing took just a few weeks.
A Pascal subset implemented in C++, about 1200 lines
Duduscript/pl0 GitHub
Forks welcome.
I do not have the ability to write a compiler, but I have heard that the Dragon Book, the Tiger Book, and the Whale Book are very good. Reading them will not necessarily enable you to write one, but they will certainly help!
"A lot of people recommend writing compilers": I estimate that 80% of these so-called "a lot of people" are just showing off.