How can a senior computer science student complete a compiler in six months? What needs to be learned?

Source: Internet
Author: User
A related question already exists: how can a senior computer science student complete a simple operating system within six months, working only in the evenings, and what should they learn? In the answers there, many people recommend writing a compiler. I want to know what a compiler needs to do, for example one for JavaScript or Python, and how I should plan the work. The operating system has already been completed.

Reply: If you are not set on a mainstream language, you can read Chapter 1, Chapter 2, and Chapter 4, and then write a Scheme interpreter.

I do not recommend the Dragon Book or the Tiger Book, because the barrier to entry is high. Theory-heavy books are hard to digest for students who lack practice. Add the slow pace and the lack of any sense of accomplishment, and you are likely to give up halfway. So here I will only give suggestions that are short and fast.

Chapter 1 and Chapter 2 are useful for functional programming techniques.
Readers with some experience may be unable to resist writing a simple Scheme interpreter right after reading the brief description of how closures are implemented in Chapter 2 (two sentences). At that stage, however, your interpreter will be redundant, slow, and without tail-call support, so you still need to read Chapter 2. If you want to go further, Chapter 1 is the essence of the book: lazy evaluation, pattern matching, continuations, and partial evaluation can all draw inspiration from it.

What do you gain by implementing a Scheme interpreter?
You get a feel for the basic principles of an interpreter and for how it is optimized.
If you dive straight into the closure implementation in the Lua source code, you may get lost in the details. Even once you have worked out how everything fits together, it may still feel overly intricate. After all, the key facilities in a near-industrial project are usually the product of several rounds of revision and heavy optimization.
In Scheme, a few lines of code are enough for a concise and correct closure implementation. After that, reading the Lua source code, or disassembling .NET bytecode to see how C# implements closures, becomes much easier. After all, getting it right first and making it good afterwards is usually the smoother path.
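To make the "few lines for a correct closure" point concrete, here is a minimal sketch in Python rather than Scheme. The tuple-based AST encoding and the names `Closure` and `evaluate` are illustrative assumptions, not taken from the book or from the Lua source. The idea is simply that a closure is the lambda's parameter and body paired with the environment in which the lambda was evaluated.

```python
from dataclasses import dataclass


@dataclass
class Closure:
    param: str    # name of the single parameter
    body: object  # unevaluated body expression
    env: dict     # environment captured when the lambda was evaluated


def evaluate(expr, env):
    """Evaluate a tiny lambda-calculus-style AST (hypothetical encoding).

    int                      -> literal
    str                      -> variable reference
    ("lambda", param, body)  -> abstraction
    ("add", a, b)            -> primitive addition
    (fn, arg)                -> application
    "lambda" and "add" are reserved and cannot be used as variable names.
    """
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    if expr[0] == "lambda":
        _, param, body = expr
        return Closure(param, body, env)  # capture the defining environment
    if expr[0] == "add":
        _, a, b = expr
        return evaluate(a, env) + evaluate(b, env)
    fn, arg = expr
    closure = evaluate(fn, env)
    value = evaluate(arg, env)
    # The body runs in the captured environment, extended with the argument.
    return evaluate(closure.body, {**closure.env, closure.param: value})


# (lambda x. lambda y. x + y) applied to 5 yields a closure remembering x = 5.
make_adder = ("lambda", "x", ("lambda", "y", ("add", "x", "y")))
add5 = evaluate((make_adder, 5), {})
result = evaluate(("add5", 10), {"add5": add5})
```

Copying the environment dictionary on every call is wasteful, which is exactly the kind of thing the later optimization steps (environment chains, lexical addressing) are meant to address.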

If you have reasonable discipline, then within one to two months you should be able to read the relevant material in the book and build one or more interpreters.

The following is a suggested route:
  1. After reading the relevant sections of the book, write a Scheme interpreter in Scheme. Because the list is Scheme's basic data structure, building an abstract syntax tree is easy. Add Scheme's dynamic typing and powerful pattern matching, and a basic metacircular (self-hosting) interpreter takes only a few dozen lines of code, with full support for closures. You will find that the key branches of this interpreter correspond exactly to the three elements of lambda calculus, which are also the three elements the book stresses repeatedly: basic elements, abstraction, and application.
  2. Peter Norvig implements a Scheme interpreter in a few dozen lines of Python: (How to Write a (Lisp) Interpreter (in Python)). From it you learn how to implement an interpreter in a mainstream language. Welcome back to Earth... Norvig also presents a technique that lets your interpreter support tail recursion.
  3. Implement the interpreter in C/C++. Implementing a GC language in a non-GC language for the first time, you have to start thinking carefully about memory management. I suggest leaving out assignment for now. Then, in your functional-language interpreter, new objects only ever reference older objects, circular references cannot arise, and you can use reference counting directly.
  4. Make it faster and better.
    • The interpreters so far all work by pattern matching on the abstract syntax tree, which corresponds to the visitor pattern in OO design. Instead of interpreting while traversing the syntax tree, what if you first translate the code into bytecode, save it, and then interpret only the bytecode? You can get a significant performance improvement. This kind of preprocessing is a form of partial evaluation.
    • Is variable lookup the performance bottleneck? Introduce lexical addressing. As you optimize the environment step by step, you will find your environment chain collapsing into the form used by mainstream implementations such as Lua/C.
    • Drop reference counting and move to a real GC? Mark-and-sweep, copying GC, mark-compact GC, or generational GC? How do you keep temporary GC objects that are allocated inside the body of a C function registered in your environment reachable at all times? When a GC object is moved, how do you make sure every pointer to it is updated (especially pointer variables on the stack)?
    • Obviously, not every program can be written in tail-recursive form. How do you let non-tail-recursive programs run without overflowing the stack? Norvig's technique is no longer enough. You need to consider implementing a CPS interpreter or performing a whole-program CPS transformation. The stack space then moves to the heap, and the stack only ever holds one frame.
    • Try preprocessing library forms such as let, let*, and cond into the basic forms? This is where you start to appreciate the power of macros: syntactic abstraction, a general-purpose weapon alongside data abstraction and procedural abstraction.
    • How do you implement call/cc in a CPS interpreter? Or can you run a whole-program CPS transformation and feed the resulting plain Scheme source, which no longer contains call/cc, to your direct-style (DS) interpreter?
    • If you want to go further, there are many more problems to think about, and working through them is good exercise.
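The first two bullets of step 4 (compile to bytecode once, then resolve variables ahead of time) can be sketched roughly as follows. This is a minimal Python illustration under simplifying assumptions: arithmetic expressions only, a single flat frame instead of a chain of environments, and made-up opcode names (`PUSH_CONST`, `LOAD_SLOT`, `BINOP`). It is not how Lua or any particular Scheme does it.

```python
import operator

BINOPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def compile_expr(expr, slots, code=None):
    """Flatten a nested-tuple AST into a list of (opcode, operand) pairs.

    `slots` maps variable names to frame indices, so each name is
    resolved once here instead of being looked up on every evaluation.
    """
    if code is None:
        code = []
    if isinstance(expr, (int, float)):
        code.append(("PUSH_CONST", expr))
    elif isinstance(expr, str):
        code.append(("LOAD_SLOT", slots[expr]))
    else:
        op, left, right = expr
        compile_expr(left, slots, code)
        compile_expr(right, slots, code)
        code.append(("BINOP", op))
    return code


def run(code, frame):
    """Execute compiled instructions on a simple operand stack."""
    stack = []
    for opcode, operand in code:
        if opcode == "PUSH_CONST":
            stack.append(operand)
        elif opcode == "LOAD_SLOT":
            stack.append(frame[operand])
        elif opcode == "BINOP":
            right = stack.pop()
            left = stack.pop()
            stack.append(BINOPS[operand](left, right))
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# (x + 2) * y is compiled once and then run against different frames.
program = compile_expr(("*", ("+", "x", 2), "y"), {"x": 0, "y": 1})
first = run(program, [3, 4])    # x = 3, y = 4
second = run(program, [10, 1])  # x = 10, y = 1
```

The tree walk and the name-to-slot lookup happen only in `compile_expr`; `run` is a flat loop over instructions, which is where the speedup over re-traversing the AST comes from.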
The JavaScript specification runs to more than 200 pages (for Harmony the page count has to be doubled). Reading the spec alone will take you a month...
As a lisp dialect, the NLP skills are highlighted along the two pictures below in two aspects.

Isn't there a book for this?

Turing community: Books: two-week self-made scripting language

And this book:

Turing community: Books: Self-made programming languages

Is this about writing a compiler? Feeling nervous?
Are you sure you want to write a compiler? If you haven't made up your mind, take a look at the articles I recommend below.
Preliminary stage:
Inspector
If, after reading the above, you are still sure you want to do this, plan carefully, because your time is very short. If you are only free at night, learn to keep a fixed schedule: start at a set time every evening (and don't tell me that's too early!) and stop at a set time (many schools turn off the lights and cut the network at that hour).



Basic knowledge
  1. If you want to write a compiler from scratch, what can you consult besides compiler-principles textbooks? Where do you start? (C/C++)
  2. How difficult is it to develop a C++ compiler?
  3. How do you write a compiler (a C compiler written in C)? What background knowledge is required? Can you recommend related websites and books?
  4. How does a compiler work at the hardware level?
  5. "Writing a compiler from scratch" series - ordinary and extraordinary - Zhihu column

The first four are what you need to get through on the first night. If you can't understand them within 4 hours, look things up in documentation and on Google. Do not move on to new material the next night; first check whether you missed anything in items 1 to 4.
Once you are through those four, decide which language your compiler will handle. If you still want to go ahead, start on item 5. You will also need to buy a book. The Dragon Book and the Tiger Book that others recommend are too difficult; you can buy the one recommended by @Zhao Yi: Turing community: Books: two-week self-made scripting language.

Writing a compiler is dry work, and there is no flashy interface to show for it. In the end it comes down to tinkering. I would suggest starting with a compiler for a Lisp-family language such as Scheme. It does not need to be a full implementation; pick a subset, plus GC and closures.

You can write an interpreter first, then try compiling to another language, and then try compiling to binary.
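As a rough illustration of how small the step from interpreter to "compile to another language" can be, here is a Python sketch. The tuple AST and the function names `interpret`, `to_c`, and `emit_c_function` are made up for this example. Both functions walk the same tree; the interpreter returns a value, the translator returns C source text.

```python
def interpret(expr, env):
    """Evaluate the AST directly."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    a, b = interpret(left, env), interpret(right, env)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op}")


def to_c(expr):
    """Translate the same AST into a fully parenthesized C expression."""
    if isinstance(expr, int):
        return str(expr)
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    if op not in ("+", "-", "*"):
        raise ValueError(f"unknown operator: {op}")
    return f"({to_c(left)} {op} {to_c(right)})"


def emit_c_function(name, params, expr):
    """Wrap the translated expression in a C function taking int arguments."""
    args = ", ".join(f"int {p}" for p in params)
    return f"int {name}({args}) {{ return {to_c(expr)}; }}"


ast = ("-", ("*", "a", "a"), 1)
value = interpret(ast, {"a": 7})
c_source = emit_c_function("f", ["a"], ast)
```

Once the output is C source, an existing C compiler takes care of the "binary" part, so you can postpone writing your own code generator.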

There are too many materials. Search for them by yourself.

Well, I once asked a related question, and many of the answers under it are very good. I went on to write a C compiler myself. In essence, you are just translating C into assembly. What makes a compiler truly good is code optimization, but for a beginner, getting the translation working comes first.

Lexical analysis is simple and can be written by hand. For parsing, you can find the standard C grammar and use a tool such as Bison to generate the parse table (though I wrote my own LR(1) parse-table generator, like Wheel Brother). The rest is simple: generate an AST, then translate it to assembly by traversing the AST. I did not handle all of C's syntax; struct, for example, is not supported. Translating the basic C constructs is tedious but not difficult, and easier still if you are an assembly expert. (In fact, it is best to generate an intermediate representation (IR) first and then translate that into assembly.)
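For a sense of what a handwritten lexer looks like, here is a minimal sketch in Python for a small C-like subset. The token kinds and the keyword and operator sets are assumptions chosen for the example; a real C lexer also needs comments, string and character literals, floating-point numbers, and the preprocessor.

```python
KEYWORDS = {"int", "return", "if", "else", "while"}
TWO_CHAR_OPS = {"==", "!=", "<=", ">=", "&&", "||"}
ONE_CHAR_OPS = set("+-*/%<>=!(){};,")


def tokenize(source):
    """Split C-like source text into (kind, text) tokens."""
    tokens = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            start = i
            while i < n and source[i].isdigit():
                i += 1
            tokens.append(("NUMBER", source[start:i]))
        elif ch.isalpha() or ch == "_":
            start = i
            while i < n and (source[i].isalnum() or source[i] == "_"):
                i += 1
            word = source[start:i]
            kind = "KEYWORD" if word in KEYWORDS else "IDENT"
            tokens.append((kind, word))
        elif source[i:i + 2] in TWO_CHAR_OPS:
            # Try two-character operators before single characters.
            tokens.append(("OP", source[i:i + 2]))
            i += 2
        elif ch in ONE_CHAR_OPS:
            tokens.append(("OP", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    return tokens


tokens = tokenize("int x = a1 + 42; if (x >= 10) return x;")
```

The only subtle point is checking two-character operators before one-character ones, so that `>=` is not split into `>` and `=`.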

While writing it, the design of the AST nodes is critical, because they need to carry a lot of information. The design of the symbol table matters just as much. During translation I really felt that types have to be taken into account at every step. Even for a plain addition or subtraction, integer and floating-point types are very different at the assembly level. I have also always found the syntax-directed translation in textbooks too fiddly: for many constructs you have to change the grammar. It is easier to honestly build an AST and then translate it. Start with a C compiler. Do not start with a Python or JavaScript compiler, especially not JavaScript.
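The point about types showing up even in a simple addition can be sketched like this. It is an illustrative Python fragment, not the author's compiler: the virtual register names (`r0`, `x1`), the `[const ...]` operand syntax, and the helper names are invented, and the output is pseudo-assembly that borrows x86-64 mnemonics (`add`/`sub` for integers, `addsd`/`subsd` for doubles, `cvtsi2sd` for int-to-double conversion) without being meant to assemble.

```python
import itertools

INT_OPS = {"+": "add", "-": "sub"}
FP_OPS = {"+": "addsd", "-": "subsd"}


def promote(reg, out, counter):
    """Convert an int register into a fresh double register."""
    xreg = f"x{next(counter)}"
    out.append(f"cvtsi2sd {xreg}, {reg}")
    return xreg


def gen(expr, symtab, out, counter):
    """Emit pseudo-assembly for expr; return (type, virtual register)."""
    if isinstance(expr, int):
        reg = f"r{next(counter)}"
        out.append(f"mov {reg}, {expr}")
        return "int", reg
    if isinstance(expr, float):
        reg = f"x{next(counter)}"
        out.append(f"movsd {reg}, [const {expr}]")
        return "double", reg
    if isinstance(expr, str):
        ty = symtab[expr]  # the symbol table supplies the variable's type
        if ty == "int":
            reg = f"r{next(counter)}"
            out.append(f"mov {reg}, [{expr}]")
        else:
            reg = f"x{next(counter)}"
            out.append(f"movsd {reg}, [{expr}]")
        return ty, reg
    op, left, right = expr
    lt, lreg = gen(left, symtab, out, counter)
    rt, rreg = gen(right, symtab, out, counter)
    if lt == "int" and rt == "int":
        out.append(f"{INT_OPS[op]} {lreg}, {rreg}")
        return "int", lreg
    # At least one operand is a double: promote any int operand first.
    if lt == "int":
        lreg = promote(lreg, out, counter)
    if rt == "int":
        rreg = promote(rreg, out, counter)
    out.append(f"{FP_OPS[op]} {lreg}, {rreg}")
    return "double", lreg


def lower(expr, symtab):
    """Translate one expression with a fresh register counter."""
    out = []
    ty, _ = gen(expr, symtab, out, itertools.count())
    return ty, out


symtab = {"i": "int", "d": "double"}
int_type, int_code = lower(("+", "i", 1), symtab)
mixed_type, mixed_code = lower(("-", "i", "d"), symtab)
```

The same source-level operator turns into a different instruction, a different register class, and sometimes an extra conversion, which is why every AST node ends up needing its type recorded.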

I typed the above on my phone yesterday. Let me explain a bit more today.

Neither JavaScript nor Python is a context-free language, which makes them harder to handle. C89 has a complete LALR(1)-compatible EBNF grammar that is easy to find online.
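One concrete reason Python does not fit a plain context-free grammar over raw characters is significant indentation. Python's tokenizer deals with it before parsing, by keeping a stack of indentation widths and emitting INDENT and DEDENT tokens. Below is a simplified sketch of that idea; it assumes spaces only and ignores tabs, line continuations, brackets, and comments, and the token shapes and function name are made up for the example.

```python
def indent_tokens(source):
    """Turn leading whitespace into INDENT / DEDENT tokens around LINE tokens."""
    stack = [0]  # indentation widths of the currently open blocks
    tokens = []
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines do not affect block structure
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("INDENT",))
        while width < stack[-1]:
            stack.pop()
            tokens.append(("DEDENT",))
        if width != stack[-1]:
            raise IndentationError("unindent does not match any outer level")
        tokens.append(("LINE", raw.strip()))
    # Close every block that is still open at end of input.
    tokens.extend(("DEDENT",) for _ in stack[1:])
    return tokens


tokens = indent_tokens("if a:\n    b\n    if c:\n        d\ne\n")
```

After this pass the parser sees explicit block delimiters, much like the braces in C, but the lexer has had to carry state (the stack) that a purely context-free description of the character stream cannot express.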

For more information about grammar and analysis methods, see one of my answers:

How to create a computer language?

The simplest way to write a compiler is to look at what others have written. Of course, it helps to have studied compiler principles first. I implemented one myself at the end of last year while taking the compiler principles course. Thinking back, it took only a few weeks from start to finish, working from other people's code.
It implements a subset in C++ in about 1200 lines:
Duduscript/pl0 · GitHub. Forks are welcome.

I am not capable of writing a compiler myself, but I have heard that the Dragon Book, the Tiger Book, and the Whale Book are very good. You may not be able to write one after reading them, but they will certainly help! As for "many people recommend writing compilers": I would estimate that 80% of those so-called "many people" are just showing off.
