I promised myself I would finish this post before moving on. It was written somewhat hastily, so if you find a mistake, please point it out.
Lisp-like language syntax
(define fib (lambda (n) (if (< n 2) 1 (+ (fib (- n 1)) (fib (- n 2))))))
Consider the Fibonacci function above. It shows the main features of the syntax:
- Variables are bound to entities with the define keyword
- Every expression uses prefix notation
- There is no return keyword
- Loops are expressed through recursion
- Functions are called "lambda expressions" and are defined with the lambda keyword
- A function must have a return value
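For comparison, here is a minimal sketch of the same function in JavaScript. Only the name fib comes from the Lisp code; the rest is ordinary JavaScript written for illustration. The binding made by define becomes const, the prefix test (< n 2) becomes the infix n < 2, and the value of the conditional expression is what the function returns.

```javascript
// JavaScript counterpart of the Lisp fib above (illustrative only).
// Like the Lisp version it returns 1 for n < 2, so the sequence is 1, 1, 2, 3, 5, 8, ...
const fib = (n) => (n < 2 ? 1 : fib(n - 1) + fib(n - 2));

console.log(fib(5)); // 8
```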
Compiler basics
A program written in any language is essentially a string, and the task of a compiler or interpreter is to translate that string into an encoding that an "executor" can understand.
I think the biggest difference between a compiler and an interpreter is that the former has a clear output, namely a string (an encoding) that an executor can understand, while the latter has no obvious output: it directly carries out what the source program expresses (this view is open to challenge).
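The distinction can be sketched on a tiny hand-written tree for 1 + 2 * 3. The tree format and the names compile and interpret are assumptions made for this illustration, not part of the interpreter built later: one function emits program text and runs nothing, the other computes the value directly.

```javascript
// Hypothetical AST format: [operator, left, right], with numbers as leaves.
const ast = ['+', 1, ['*', 2, 3]];

// "Compiler": its result is an output program (here JavaScript source text).
function compile(node) {
  if (typeof node === 'number') return String(node);
  const [op, left, right] = node;
  return '(' + compile(left) + ' ' + op + ' ' + compile(right) + ')';
}

// "Interpreter": no program text is produced; the value is computed directly.
function interpret(node) {
  if (typeof node === 'number') return node;
  const [op, left, right] = node;
  const a = interpret(left);
  const b = interpret(right);
  if (op === '+') return a + b;
  if (op === '*') return a * b;
  throw new Error('unknown operator ' + op);
}

console.log(compile(ast));   // (1 + (2 * 3))
console.log(interpret(ast)); // 7
```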
This article only describes the basic composition of the interpreter.
The basic process, from receiving the source program to finally executing it, is:
- Lexical analysis. This step breaks the source code into the smallest constituent elements of the language (short strings). For example, var a = 1; is broken down into ['var', 'a', '=', '1']. Each of these short strings is called a "token".
- Syntax analysis. This step analyzes the token list according to the language's grammar rules and checks the source code for syntax errors. If there are none, it builds an abstract syntax tree (AST), a tree-shaped data structure that describes the semantics and the relationships between code blocks.
- Execution. Once the AST has been generated, the interpreter walks the AST and executes it according to the semantics it expresses; the result is what the source code was meant to produce. Some language features, such as closures, are implemented in the execution step.
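The three stages above can be sketched for the single statement var a = 1;. Everything here is an assumption made for illustration: the function names lex, parse and execute, the one grammar rule, and the shape of the AST node are hypothetical and unrelated to the Lisp interpreter built below.

```javascript
// Stage 1: lexical analysis. Split the source into tokens.
// Assumption: identifiers, integers and "=" are the only tokens; ";" is dropped.
function lex(source) {
  return source.match(/[A-Za-z_]\w*|\d+|=/g) || [];
}

// Stage 2: syntax analysis. Check the tokens against the single rule
//   var <name> = <integer>
// and build a small AST node (hypothetical shape).
function parse(tokens) {
  const [keyword, name, equals, value] = tokens;
  const ok = tokens.length === 4 && keyword === 'var' && equals === '=' &&
    /^[A-Za-z_]\w*$/.test(name) && /^\d+$/.test(value);
  if (!ok) throw new SyntaxError('expected: var <name> = <integer>');
  return { type: 'VarDeclaration', name: name, init: Number(value) };
}

// Stage 3: execution. Act on what the AST means: bind the name in an environment.
function execute(node, env) {
  env[node.name] = node.init;
  return env;
}

const tokens = lex('var a = 1;');
const tree = parse(tokens);
const env = execute(tree, {});
console.log(JSON.stringify(tokens)); // ["var","a","=","1"]
console.log(JSON.stringify(env));    // {"a":1}
```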
Implementation
Next we will implement an interpreter for a Lisp-like language that supports integer and floating-point arithmetic, an array type, lambda expressions, and basic logical statements.
Lexical analysis
Because the Lisp language itself is relatively concise in syntax, the implementation of lexical analysis can be relatively simple:
function tokenize(program) {
  return program.replace(/\(/g, ' ( ').replace(/\)/g, ' ) ').split(' ')
}
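A quick run shows what this tokenizer produces. Padding the parentheses and splitting on a single space leaves empty strings in the result, which is why the parser has to skip them. The filtered variant at the end is a hypothetical alternative, not the article's code; note also that split(' ') does not split on tabs or newlines.

```javascript
// The tokenizer from the article.
function tokenize(program) {
  return program.replace(/\(/g, ' ( ').replace(/\)/g, ' ) ').split(' ');
}

const raw = tokenize('(+ 1 (* 2 3))');
console.log(JSON.stringify(raw));
// ["","(","+","1","","(","*","2","3",")","",")",""]

// Hypothetical alternative: drop the empty strings up front.
const clean = raw.filter((token) => token !== '');
console.log(JSON.stringify(clean));
// ["(","+","1","(","*","2","3",")",")"]
```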
Syntax analysis
While generating the AST, the "identity" of each token must be determined: for example, var and a are symbols and 1 is an integer. Each token is converted to a value of the matching type. Here the AST is implemented with nested arrays.
// Generate the abstract syntax tree
function read_from_tokens(tokens) {
  if (tokens.length === 0) {
    throw new Error('unexpected EOF while reading')
  }
  let token = tokens.shift()
  // Skip the empty strings left over from split(' ')
  while (token === '') {
    token = tokens.shift()
  }
  if ('(' === token) {
    let L = []
    while (tokens[0] === '') {
      tokens.shift()
    }
    while (tokens[0] !== ')') {
      L.push(read_from_tokens(tokens))
      while (tokens[0] === '') {
        tokens.shift()
      }
    }
    tokens.shift() // drop the closing ')'
    return L
  } else if (')' === token) {
    throw new Error('unexpected )')
  } else {
    return atom(token)
  }
}

// Meta: the base class for all data types. It becomes useful as the interpreter
// is refined and is justified semantically, but it is not discussed in this article.
class Meta {
  constructor(value) {
    this.value = value
  }
}

// Symbol type: keywords such as if and define, as well as user-defined
// variable names, belong to this type.
class Sym extends Meta {
  constructor(value) {
    super(value)
  }
}

// Determine the concrete type of a token
function atom(token) {
  let temp = parseInt(token)
  if (isNaN(temp)) {
    return new Sym(token)
  } else if (token - temp === 0) {
    return temp
  } else {
    return parseFloat(token)
  }
}
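To see what the parser returns, here is a compact, self-contained sketch that prints the nested-array AST for one expression. The names parse, read and show are hypothetical, and the token list is filtered up front instead of skipping empty strings inside the parser; atom follows the same logic as the code above, with an explicit radix added to parseInt.

```javascript
class Sym {
  constructor(value) { this.value = value; }
}

// Same classification as the article's atom: integer, float, or symbol.
function atom(token) {
  const temp = parseInt(token, 10);
  if (isNaN(temp)) return new Sym(token);
  if (token - temp === 0) return temp;
  return parseFloat(token);
}

// Recursive descent over the token list: "(" opens a nested array.
function read(tokens) {
  if (tokens.length === 0) throw new Error('unexpected EOF while reading');
  const token = tokens.shift();
  if (token === '(') {
    const list = [];
    while (tokens[0] !== ')') list.push(read(tokens));
    tokens.shift(); // drop the closing ")"
    return list;
  }
  if (token === ')') throw new Error('unexpected )');
  return atom(token);
}

function parse(program) {
  const tokens = program.replace(/\(/g, ' ( ').replace(/\)/g, ' ) ')
    .split(' ').filter((t) => t !== '');
  return read(tokens);
}

// Render an AST as text: symbols by name, numbers as-is, arrays in brackets.
function show(ast) {
  if (Array.isArray(ast)) return '[' + ast.map(show).join(', ') + ']';
  return ast instanceof Sym ? ast.value : String(ast);
}

console.log(show(parse('(define x (+ 1 2.5))'))); // [define, x, [+, 1, 2.5]]
```

An unbalanced input such as (+ 1 runs out of tokens inside read and raises the "unexpected EOF while reading" error.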
Execution
The process of execution is the core of the interpreter, which is implemented here as an eval function.
eval function
Roughly speaking, the eval function is a state machine that walks the AST and dispatches on each node according to the grammar rules of the language.
// Note: this function shadows the built-in eval, which is only allowed in non-strict code.
function eval(x, env = global_env) {
  if (x instanceof Sym) { // x is a symbol
    // Look up the entity bound to the symbol, starting from the current scope
    return env.find(x.value)[x.value]
  } else if (!(x instanceof Array)) { // x is not an array
    return x // return it directly (at this point it must be an integer or floating-point number)
  } else if (x[0].value === 'if') { // if
    // Extract the parts in the order fixed by the language syntax
    let [sym, test, conseq, alt] = x
    let exp = (eval(test, env) ? conseq : alt)
    return eval(exp, env)
  } else if (x[0].value === 'define') { // define
    let [vari, exp] = x.slice(1)
    env.add(vari.value, eval(exp, env))
  } else if (x[0].value === 'lambda') { // lambda
    let [parms, body] = x.slice(1)
    return new Procedure(parms, body, env) // create a Procedure instance
  } else if (x[0].value === 'quote') { // quote
    let [sym, exp] = x
    return exp
  } else {
    // Otherwise x is a call: x[0] evaluates either to a built-in JavaScript
    // function or to a Procedure instance (described below)
    let proc = eval(x[0], env)
    let args = []
    x.slice(1).forEach(function (arg) {
      args.push(eval(arg, env))
    })
    if (proc instanceof Procedure) {
      return proc.execute.call(proc, args)
    }
    return proc.apply(this, args)
  }
}
Functions and environment
In this language, we stipulate that functions are the boundaries of scopes and that scoping is lexical (the same as in JavaScript).
So whenever a function is called, we need to create a new evaluation environment whose parent is the environment in which the function was defined (if it is not clear why, look up lexical scoping). When we look up a variable anywhere, we search up the environment chain (the scope chain) until we find it, and throw an exception if it is not found.
We can therefore think of the scope chain as a data structure similar to a singly linked list.
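Since the scoping rule is the same as JavaScript's, a plain JavaScript closure illustrates why the parent must be the defining environment rather than the caller's. The names makeAdder and addTwo are made up for this sketch.

```javascript
// Each call to makeAdder creates a new environment that holds its parameter n.
function makeAdder(n) {
  // The returned function keeps a link to the environment it was DEFINED in,
  // so it can still see this n after makeAdder has returned.
  return function (x) {
    return x + n;
  };
}

const n = 100;               // a different n, in the caller's scope
const addTwo = makeAdder(2);
console.log(addTwo(5));      // 7, not 105: lookup follows the defining environment
```

With the caller's environment as parent (dynamic scoping), the inner n would resolve to 100 instead.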
// Procedure class: the implementation of functions in the language
class Procedure {
  constructor(parms, body, env) {
    this.parms = parms
    this.body = body
    this.env = env // the environment in which the function was defined
  }
  execute(args) {
    return eval(this.body, new Env(this.parms, args, this.env))
  }
}

// Env class: the implementation of the evaluation environment in the language
class Env {
  constructor(parms = [], args = [], outer = null) {
    // No prototype, so inherited names such as toString are never found by mistake
    this.e = Object.create(null)
    this.init(parms, args)
    this.outer = outer // parent environment
  }
  // Find the innermost environment in which a variable is bound
  find(vari) {
    if ((!(vari in this.e)) && (!this.outer)) {
      throw new ReferenceError('variable ' + vari + ' is undefined.')
    }
    return vari in this.e ? this.e : this.outer.find(vari)
  }
  init(keys, values) {
    keys.forEach((key, index) => {
      this.e[key.value] = values[index]
    })
  }
  assign(subenv) {
    Object.assign(this.e, subenv)
  }
  add(key, value) {
    this.e[key] = value
  }
}

// Initialize the global environment
// (baseEnv holds the built-in definitions; it is part of the complete code)
let global_env = new Env()
global_env.assign(baseEnv)
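To try the pieces together, here is a condensed, self-contained sketch of the whole pipeline. Several things are assumptions and differ from the article's code: the evaluator is named evaluate rather than eval (declaring a function named eval is a syntax error in strict mode), atoms are classified with Number, and the contents of baseEnv are hypothetical, since the article's base environment is not shown here.

```javascript
class Sym { constructor(value) { this.value = value; } }

function read(tokens) {
  if (tokens.length === 0) throw new Error('unexpected EOF while reading');
  const token = tokens.shift();
  if (token === '(') {
    const list = [];
    while (tokens[0] !== ')') list.push(read(tokens));
    tokens.shift();
    return list;
  }
  if (token === ')') throw new Error('unexpected )');
  const num = Number(token); // assumption: Number() instead of parseInt/parseFloat
  return Number.isNaN(num) ? new Sym(token) : num;
}

function parse(program) {
  return read(program.replace(/\(/g, ' ( ').replace(/\)/g, ' ) ')
    .split(/\s+/).filter((t) => t !== ''));
}

class Env {
  constructor(parms = [], args = [], outer = null) {
    this.e = Object.create(null);
    parms.forEach((p, i) => { this.e[p.value] = args[i]; });
    this.outer = outer; // parent environment
  }
  find(name) { // walk up the chain until the name is found
    if (name in this.e) return this.e;
    if (!this.outer) throw new ReferenceError('variable ' + name + ' is undefined.');
    return this.outer.find(name);
  }
}

class Procedure {
  constructor(parms, body, env) { this.parms = parms; this.body = body; this.env = env; }
  execute(args) { return evaluate(this.body, new Env(this.parms, args, this.env)); }
}

// Hypothetical built-ins; the real baseEnv is not shown in the article.
const baseEnv = {
  '+': (a, b) => a + b,
  '-': (a, b) => a - b,
  '*': (a, b) => a * b,
  '<': (a, b) => a < b,
};
const globalEnv = new Env();
Object.assign(globalEnv.e, baseEnv);

function evaluate(x, env = globalEnv) {
  if (x instanceof Sym) return env.find(x.value)[x.value];
  if (!Array.isArray(x)) return x; // number literal
  const head = x[0] instanceof Sym ? x[0].value : null;
  if (head === 'if') {
    const [, test, conseq, alt] = x;
    return evaluate(evaluate(test, env) ? conseq : alt, env);
  }
  if (head === 'define') {
    const [, name, exp] = x;
    env.e[name.value] = evaluate(exp, env);
    return undefined;
  }
  if (head === 'lambda') {
    const [, parms, body] = x;
    return new Procedure(parms, body, env); // captures the defining environment
  }
  if (head === 'quote') return x[1];
  const proc = evaluate(x[0], env);
  const args = x.slice(1).map((arg) => evaluate(arg, env));
  return proc instanceof Procedure ? proc.execute(args) : proc(...args);
}

const run = (source) => evaluate(parse(source));

run('(define fib (lambda (n) (if (< n 2) 1 (+ (fib (- n 1)) (fib (- n 2))))))');
console.log(run('(fib 10)')); // 89

// Closure: the inner lambda still sees n after make-adder has returned.
run('(define make-adder (lambda (n) (lambda (x) (+ x n))))');
run('(define add2 (make-adder 2))');
console.log(run('(add2 5)')); // 7
```

The make-adder example works only because each Procedure stores the environment it was created in and uses it as the parent of every call environment.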
End
At this point, this simple interpreter is complete. Further details, such as the definition of exceptions and of the global environment, are left out; the complete code can be viewed here.
If you are interested, I suggest implementing it yourself; you will gain a deeper understanding of how interpreters and closures work.
This article was finished in a hurry. If anything is unclear, read the Python implementation of this interpreter: its author explains everything in great detail and provides many test cases. I simply wrote a JavaScript version after reading that article.
If you find any errors, please correct them yourself or contact me by email ([email protected]) or on GitHub (Sevenskey).
Thanks for reading Qwq
A Lisp-like interpreter implemented in JavaScript