Writing an Interpreter in C (1): Our Goal


Preface

To improve the quality of teaching, my college is preparing its own C language teaching materials. Once finished, this series of articles, "Writing an Interpreter in C", will be included in the book's "Comprehensive Experiment" chapter. The series is therefore aimed mainly at students who have just finished learning C (no knowledge of data structures or similar topics is required), so the wording is rather verbose; please bear with me. My own knowledge is limited, so if you find anything described inappropriately or incorrectly, please let me know. I state this here up front.

Motivation

Recently a teacher from our college contacted me, hoping I could provide a BASIC interpreter written in C for use in the C language course design project. At the time I happened to be fascinated by programming languages themselves and had already been meaning to write an interpreter, so I gladly accepted.

I once saw the new book "Programming Master Proverbs" in the library. Its fourth chapter, "Programming language operating mechanism", includes the code of a BASIC interpreter written in C, but the code does not seem to be complete (I leafed through it several times and could not find the implementation of the function Get_token). Besides, the code will be put to other uses this time, and it is better not to get involved in copyright issues. The final reason is that I simply had the urge to write code ^_^. In short, I decided to write a BASIC interpreter in C from scratch.

Prerequisites

1. To write an interpreter, the first thing to be clear about is what an interpreter is (see Wikipedia for a detailed explanation: http://zh.wikipedia.org/zh-cn/interpreter). To borrow a line from "Programming Master Proverbs": an interpreter is a string interpreter (p. 165 explains the principle). So if it were only for my own use, I would rather use Lex & Yacc, or even Perl, than write it in pure C.

2. As mentioned earlier, this program will serve as a comprehensive experiment for junior students learning C. You therefore need to be familiar with C syntax, inserting and deleting nodes in a singly linked list, and the concept of a stack (most of which can be found in a C textbook). Relatively obscure techniques (such as setjmp/longjmp) will not appear in the program.

About languages

In my article "My opinion of programming and language" I mentioned that programming is a very broad concept. In a sense, every piece of software is a language of its own, but according to how flexible the program is, software can be divided into four categories: "hard-coded", "configurable", "controllable" and "programmable" (see "Four Types of programs"). If a program's flexibility reaches "programmable", its configuration file can be regarded as a "programming language" and the program itself as an "interpreter".

To be "programmable", a program should provide at least four functions: "input/output", "expression evaluation", "memory management" and "conditional jumps" (see "Using DOS batch to do Digital Image processing"). This corresponds to the structure of the von Neumann computer: it is centered on the arithmetic unit and the control unit, and data transferred between the input/output devices and memory must pass through the arithmetic unit. Each part is described below.

Our goal

Since we are writing an interpreter, we naturally cannot avoid the four functions above. The grammar is modeled on BASIC, but because this is a language of our own design we are free to "spice it up" (for example, by adding the long-awaited factorial operator to expressions ^_^). Here is a sample BASIC program (Example.bas):

    0009 n = 0
    0010 while n < 1 OR n > 20
    0011   PRINT "Please enter a number between 1-20"
    0012   input N
    0013 WEND
    0020 for I = 1 to N
    0030   l = "*"
    0040   for j = 1 to n-i
    0050     L = " " + L
    0060   NEXT
    0070   for j = 2 to 2 * I-1 STEP 2
    0080     l = l + "* *"
    0090   NEXT
    0100   PRINT L
    0110 NEXT
    0120 I = N-1
    0130 L = ""
    0140 for J = 1 to n-i
    0150   L = l + " "
    0160 NEXT
    0170 for J = 1 to ((2*i)-1)
    0180   L = l + "*"
    0190 NEXT
    0200 PRINT L
    0210 I = I-1
    0220 IF I > 0 THEN
    0230   GOTO 130
    0240 ELSE
    0250   PRINT "by Redraiment"
    0260 END IF

BASIC syntax requires each line to begin with a number between 1 and 9999 as its line number (a line's number must not be less than that of the previous line); GOTO statements refer to these numbers when jumping. BASIC's syntax is stricter than C's, which not only reduces the complexity of the code but also makes the language itself easier to learn. The program above covers almost all the features we need to implement. If it is interpreted correctly, it prompts for a number between 1 and 20, prints a figure built from asterisks, and finishes with the line "by Redraiment".
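The line-number rule above can be illustrated with a small C sketch. It is only an illustration under stated assumptions, not the code this series goes on to develop: the helper names parse_line_number and find_line are hypothetical, and the program's line numbers are assumed to be kept in a plain array in source order.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: read the leading line number of one BASIC source
 * line. Returns the number (1..9999), or -1 if there is no valid number. */
int parse_line_number(const char *text)
{
    int number = 0;
    int digits = 0;

    while (isspace((unsigned char)*text)) {
        text++;
    }
    while (isdigit((unsigned char)*text)) {
        number = number * 10 + (*text - '0');
        text++;
        digits++;
        if (number > 9999) {
            return -1;          /* outside the permitted range */
        }
    }
    if (digits == 0 || number < 1) {
        return -1;
    }
    return number;
}

/* Hypothetical helper: find the position of the line a GOTO should jump to.
 * `numbers` holds the line numbers in program order; returns -1 if the
 * target line number does not exist. */
int find_line(const int *numbers, size_t count, int target)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (numbers[i] == target) {
            return (int)i;
        }
    }
    return -1;
}
```

For the sample program, parse_line_number applied to the text of line 0230 yields 230, and a GOTO 130 would be resolved by calling find_line with 130 as the target.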


The sections below discuss each feature to be implemented in turn.

Input/output (IO)

Interacting with other programs or with people through input/output is the most basic requirement for getting beyond "hard-coded". Input/output is also a very abstract concept: it is not limited to standard input and output (keyboard, display and so on); data can also be obtained through files, the Internet and other channels (so in C, besides scanf, printf and the like, the #include directive can in fact also be regarded as an IO operation). IO is not the focus of this program, so only two instructions are required, INPUT and PRINT, which read data from the keyboard and print to the screen respectively. Their formats are as follows:

INPUT var[, var ...]
    var is a variable name (the same applies below); variables are separated by commas.
    Function: reads one or more values from the keyboard and assigns them to the corresponding variables. When several variables are read at once, the values entered are separated by spaces, newlines or tabs.
    Example: INPUT A, B, C

PRINT expression[, expression ...]
    expression is an expression (the same applies below); expressions are separated by commas.
    Function: evaluates the expressions and prints the results to the screen, followed by a newline. If there are several expressions, their results are separated by tabs (\t).
    Example: PRINT I * 3 + 1, (A + B) * (C + D)
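As a rough illustration of the PRINT rule (results separated by tabs, then a newline), here is a minimal C sketch. It assumes the expressions have already been evaluated to numbers and that the "%g" format is acceptable for display; the function name format_print_line is hypothetical and strings are left out for brevity.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: build the text PRINT would emit for `count`
 * already-evaluated numeric results. Returns 0 on success, or -1 if
 * the buffer `out` of `size` bytes is too small. */
int format_print_line(const double *values, size_t count,
                      char *out, size_t size)
{
    size_t used = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        /* every result except the first is preceded by a tab */
        int n = snprintf(out + used, size - used, "%s%g",
                         i == 0 ? "" : "\t", values[i]);
        if (n < 0 || (size_t)n >= size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    if (size - used < 2) {
        return -1;              /* no room for newline and terminator */
    }
    out[used] = '\n';
    out[used + 1] = '\0';
    return 0;
}
```

INPUT could be sketched in a similar spirit with scanf, which already treats spaces, newlines and tabs as separators between values.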

Expression evaluation

In the DOS batch article I called this "arithmetic operations". But for a computer, the operations involved include not only arithmetic such as the four basic operations, but also relational operations and logical operations. To avoid ambiguity, it is renamed "expression evaluation" here. Expression evaluation is the core of the whole program, the equivalent of a computer's arithmetic unit. Our program needs to implement the following operators:

Symbol  Name                      Priority  Associativity
(       Left parenthesis          17        Left to right
)       Right parenthesis         17        Left to right
+       Add                       12        Left to right
-       Subtract                  12        Left to right
*       Multiply                  13        Left to right
/       Divide                    13        Left to right
%       Modulo                    13        Left to right
^       Exponentiation            14        Left to right
+       Positive sign             16        Right to left
-       Negative sign             16        Right to left
!       Factorial                 16        Left to right
>       Greater than              10        Left to right
<       Less than                 10        Left to right
=       Equals                    9         Left to right
<>      Not equal to              9         Left to right
<=      Less than or equal to     10        Left to right
>=      Greater than or equal to  10        Left to right
AND     Logical AND               5         Left to right
OR      Logical OR                4         Left to right
NOT     Logical NOT               15        Right to left
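One way to carry this table into C is a static array of records, sketched below. The type and function names (OperatorInfo, find_operator, applies_before) are hypothetical, and the comparison rule in applies_before is the usual one for stack-based expression evaluators; it is an assumption here, not something this article specifies. An is_unary flag is needed because + and - appear twice in the table.

```c
#include <stddef.h>
#include <string.h>

typedef enum { LEFT_TO_RIGHT, RIGHT_TO_LEFT } Associativity;

typedef struct {
    const char *symbol;     /* operator text as written in the source   */
    int is_unary;           /* 1 for sign operators, factorial and NOT  */
    int priority;           /* larger value binds more tightly          */
    Associativity assoc;
} OperatorInfo;

/* Values copied from the operator table in the article. */
static const OperatorInfo OPERATORS[] = {
    { "(",   0, 17, LEFT_TO_RIGHT }, { ")",   0, 17, LEFT_TO_RIGHT },
    { "+",   0, 12, LEFT_TO_RIGHT }, { "-",   0, 12, LEFT_TO_RIGHT },
    { "*",   0, 13, LEFT_TO_RIGHT }, { "/",   0, 13, LEFT_TO_RIGHT },
    { "%",   0, 13, LEFT_TO_RIGHT }, { "^",   0, 14, LEFT_TO_RIGHT },
    { "+",   1, 16, RIGHT_TO_LEFT }, { "-",   1, 16, RIGHT_TO_LEFT },
    { "!",   1, 16, LEFT_TO_RIGHT }, { ">",   0, 10, LEFT_TO_RIGHT },
    { "<",   0, 10, LEFT_TO_RIGHT }, { "=",   0,  9, LEFT_TO_RIGHT },
    { "<>",  0,  9, LEFT_TO_RIGHT }, { "<=",  0, 10, LEFT_TO_RIGHT },
    { ">=",  0, 10, LEFT_TO_RIGHT }, { "AND", 0,  5, LEFT_TO_RIGHT },
    { "OR",  0,  4, LEFT_TO_RIGHT }, { "NOT", 1, 15, RIGHT_TO_LEFT },
};

/* Look up an operator by its text and arity; NULL if it is unknown. */
const OperatorInfo *find_operator(const char *symbol, int is_unary)
{
    size_t i;

    for (i = 0; i < sizeof OPERATORS / sizeof OPERATORS[0]; i++) {
        if (OPERATORS[i].is_unary == is_unary &&
            strcmp(OPERATORS[i].symbol, symbol) == 0) {
            return &OPERATORS[i];
        }
    }
    return NULL;
}

/* Assumed rule: an operator already waiting on the stack is applied
 * before the incoming one if it has higher priority, or equal priority
 * and the incoming operator associates left to right. */
int applies_before(const OperatorInfo *stacked, const OperatorInfo *incoming)
{
    if (stacked->priority != incoming->priority) {
        return stacked->priority > incoming->priority;
    }
    return incoming->assoc == LEFT_TO_RIGHT;
}
```

With these values, a waiting * is applied before an incoming +, while a waiting + is not applied before an incoming *.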

Memory management

In our mini-interpreter we only implement simple variable management and do not consider dynamic allocation of memory. By default there are 26 weakly typed variables, A through Z, each of which can be assigned an integer, a floating-point number or a string at will. A variable must be assigned a value before it can be used; otherwise it is unavailable (which is why the first line of the sample program assigns 0 to N). The format of the assignment statement is:
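The 26 weakly typed variables could be represented as a fixed array of tagged unions, as in the sketch below. All names here (Variable, variable_slot, assign_integer, assign_string, is_assigned) and the fixed string capacity of 128 bytes are assumptions made for illustration; the sketch also assumes an ASCII-like character set in which 'A' to 'Z' are contiguous.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define VARIABLE_COUNT  26
#define STRING_CAPACITY 128     /* assumed fixed size, no dynamic allocation */

typedef enum { VAR_UNASSIGNED, VAR_INTEGER, VAR_REAL, VAR_STRING } VarType;

typedef struct {
    VarType type;               /* which member of `value` is valid */
    union {
        long   integer;
        double real;
        char   string[STRING_CAPACITY];
    } value;
} Variable;

/* Static storage is zero-initialised, so every slot starts out as
 * VAR_UNASSIGNED (the first enumerator, value 0). */
static Variable variables[VARIABLE_COUNT];

/* Map 'A'..'Z' or 'a'..'z' to its slot; NULL for any other character. */
Variable *variable_slot(char name)
{
    int upper = toupper((unsigned char)name);

    if (upper < 'A' || upper > 'Z') {
        return NULL;
    }
    return &variables[upper - 'A'];
}

/* Returns 1 if the variable has been assigned a value, otherwise 0. */
int is_assigned(char name)
{
    const Variable *slot = variable_slot(name);

    return slot != NULL && slot->type != VAR_UNASSIGNED;
}

/* Store an integer; returns 0 on success, -1 for an invalid name. */
int assign_integer(char name, long value)
{
    Variable *slot = variable_slot(name);

    if (slot == NULL) {
        return -1;
    }
    slot->type = VAR_INTEGER;
    slot->value.integer = value;
    return 0;
}

/* Store a string; returns -1 for an invalid name or an overlong string. */
int assign_string(char name, const char *text)
{
    Variable *slot = variable_slot(name);

    if (slot == NULL || strlen(text) >= STRING_CAPACITY) {
        return -1;
    }
    slot->type = VAR_STRING;
    strcpy(slot->value.string, text);
    return 0;
}
```

Because the type tag is stored with the value, the same variable can hold a number at one moment and a string the next, which is what "weakly typed" means here.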

[LET] var = expression
    LET is an optional keyword. BASIC does not allow chained assignment such as var1 = var2 = expression, because inside an expression "=" is interpreted as "equals"; this is also why assignment does not appear in the operator table above.
    Function: evaluates the expression and assigns the result to the variable var.
    Example: I = (123 + 456) * 0.09

Conditional jumps

If we were designing the simplest possible language, its control statements would only need to provide a set of jump instructions such as JMP and JNZ, which can be used to simulate IF, WHILE, FOR, GOTO and other control statements. But BASIC, as a high-level language, needs to provide higher-level, more abstract statements. We will implement the following four:
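The claim that plain jumps are enough to simulate a loop can be seen in C itself, using goto in place of JMP and a guarded goto in place of a conditional jump. The two functions below are a hypothetical illustration and are meant to behave identically.

```c
/* A counting loop written with a structured while statement. */
int count_with_while(int start, int limit)
{
    int n = start;

    while (n < limit) {
        n = n + 1;
    }
    return n;
}

/* The same loop expressed with only a conditional jump and an
 * unconditional jump, the way a JMP/JNZ-style instruction set would. */
int count_with_jumps(int start, int limit)
{
    int n = start;

loop_test:
    if (!(n < limit)) {
        goto loop_end;          /* conditional jump out of the loop */
    }
    n = n + 1;                  /* loop body */
    goto loop_test;             /* unconditional jump back to the test */

loop_end:
    return n;
}
```

The structured version is shorter and harder to get wrong, which is the reason for offering WHILE and FOR rather than jumps alone.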

1) GOTO expression
    expression is a numeric expression whose result must be an existing line number. Because it is an expression, subroutine calls can be simulated by computing the target dynamically.
    Function: jumps unconditionally to the specified line.
    Example: GOTO 120+10

2) IF expression THEN
      sentence1
   [ELSE
      sentence2]
   END IF
    sentence is a statement block (the same applies below) consisting of one or more statements. ELSE is optional.
    Function: branch structure. When the value of the expression is true (a number not equal to 0, or a non-empty string), statement block 1 is run; otherwise the ELSE block is run, if there is one.
    Example:
        IF 1=1 THEN
          PRINT "TRUE"
        ELSE
          PRINT "FALSE"
        END IF

3) FOR var = expression TO expression [STEP expression]
      sentence
   NEXT
    All expressions are numeric expressions. STEP is optional and gives the step size of the iteration. The value of the STEP expression must not be 0.
    Function: loop iteration structure.
    Example:
        FOR I = 1 TO 10 STEP 3
          PRINT I
        NEXT

4) WHILE expression
      sentence
   WEND
    Function: runs the statement block repeatedly until the value of the expression is false.
    Example:
        WHILE N < 10
          N = N + 1
        WEND
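Two of the rules above lend themselves to a short C sketch: the truth test used by IF and WHILE, and the continuation test of a FOR loop. The function names are hypothetical. The handling of a negative STEP (counting downwards until the variable drops below the limit) follows common BASIC behaviour and is an assumption, since the text only states that STEP must not be 0.

```c
/* A number is true when it is not equal to 0. */
int number_is_true(double value)
{
    return value != 0.0;
}

/* A string is true when it is not empty. */
int string_is_true(const char *text)
{
    return text[0] != '\0';
}

/* Decide whether a FOR loop should run its body for `current`.
 * Returns 1 to continue, 0 when the loop is finished, and -1 if the
 * step is 0, which the language does not allow. */
int for_should_continue(double current, double limit, double step)
{
    if (step == 0.0) {
        return -1;
    }
    /* assumed: a negative step counts downwards */
    return step > 0.0 ? current <= limit : current >= limit;
}

/* Count how many times the body of
 * FOR var = start TO limit STEP step would run; -1 for a zero step. */
int count_for_iterations(double start, double limit, double step)
{
    int count = 0;
    double current;

    if (step == 0.0) {
        return -1;
    }
    for (current = start;
         for_should_continue(current, limit, step) == 1;
         current += step) {
        count++;
    }
    return count;
}
```

Under these assumptions, the FOR example in the list runs its body for I = 1, 4, 7 and 10, that is, four times.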

Other details
    1. BASIC source code is not case-sensitive;
    2. The implementation does not handle character escapes, so it cannot output double quotes. Once the complete source code has been presented, you can try to improve this yourself if you are interested;
    3. Likewise, the program does not handle comments (the REM keyword). This is actually very easy, but it is also left for you to deal with ^_^;
    4. You might also be interested in adding the GOSUB and RETURN keywords, so that subroutines no longer have to be simulated with GOTO.
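Case-insensitivity (item 1) usually comes down to comparing words without regard to letter case. A minimal sketch follows, with the hypothetical name keyword_equals; it relies only on toupper from the standard <ctype.h> header.

```c
#include <ctype.h>

/* Compare a word from the source with a keyword, ignoring letter case.
 * Returns 1 if they match over their whole length, otherwise 0. */
int keyword_equals(const char *word, const char *keyword)
{
    while (*word != '\0' && *keyword != '\0') {
        if (toupper((unsigned char)*word) !=
            toupper((unsigned char)*keyword)) {
            return 0;
        }
        word++;
        keyword++;
    }
    /* both strings must end at the same position */
    return *word == '\0' && *keyword == '\0';
}
```

With such a helper, "wend", "Wend" and "WEND" are all recognised as the same keyword.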

Summary

This article has mainly described the features our interpreter is to implement; a series of follow-up articles will walk through the implementation in detail. The core of the interpreter, expression evaluation, is introduced in the next article. Stay tuned for "Writing an Interpreter in C" (2).

Copyright notice

Please respect original work. When reposting, keep the article intact and credit the original author "Redraiment" with a hyperlink to the main site, so that other readers can ask questions and send corrections.

Contact information

My e-mail (letters welcome): [email protected]
My Blogger (sub-clear line): http://redraiment.blogspot.com/
My Google Sites (sub-clear line): https://sites.google.com/site/redraiment
My CSDN blog (Dream Ting Xuan): http://blog.csdn.net/redraiment
My Baidu space (Dream Ting Xuan): http://hi.baidu.com/redraiment
