Learning compiler principles: implementing a lexical scanner for the TINY language
Recently I have become very interested in interpreters (such as Python, or the bc calculator on Linux), so I started studying compiler principles. Today I implemented a lexical scanner for the TINY language. Most of it follows the book Compiler Construction: Principles and Practice, with a few small improvements of my own.
Let's talk about TINY:
1. Comments: enclosed in a pair of braces. In the book, comments cannot be nested; I changed the scanner to allow nesting.
2. Keywords: read write if then else end repeat until
3. Types: only integer and Boolean types are supported.
4. Operators: + - * / ( ) < = :=, where := is assignment and = is the equality test. There are no >, <= or >= operators.
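The nesting rule in item 1 comes down to a depth counter: every `{` goes one level deeper, every `}` closes one level, and the comment ends when the depth returns to zero. Here is a minimal sketch of that idea in plain C, working on a string rather than on flex's input stream. The helper name `skip_nested_comment` and its signature are assumptions made for illustration, not part of the post's sources.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the post.
 * text[start] must be '{'. Returns the index just past the matching '}',
 * or -1 if the text ends before the comment is closed.
 * Every newline seen inside the comment is added to *lines.
 */
long skip_nested_comment(const char *text, size_t start, int *lines)
{
    assert(text[start] == '{');
    int depth = 0;
    for (size_t i = start; text[i] != '\0'; i++) {
        if (text[i] == '\n') {
            (*lines)++;             /* keep the line count accurate */
        } else if (text[i] == '{') {
            depth++;                /* one level deeper */
        } else if (text[i] == '}') {
            depth--;                /* one level closed */
            if (depth == 0)
                return (long)(i + 1);
        }
    }
    return -1;                      /* unterminated comment */
}
```

Returning -1 for an unterminated comment lets the caller decide how to report the error instead of silently running off the end of the input.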
An example TINY program:
test.tine (from Compiler Construction: Principles and Practice):
{ Sample program in TINY language - computes factorial}
read x; { input an integer }
if 0 < x then { don't compute if x <= 0 }
  fact := 1;
  repeat
    fact := fact * x;
    x := x - 1;
  until x = 0;
  write fact { output factorial of x }
end
The shared type declarations live in globals.h:
#ifndef GLOBALS_H
#define GLOBALS_H

#include <stdio.h>

typedef enum {
    ENDFILE, ERROR,
    IF, THEN, ELSE, END, REPEAT, UNTIL, READ, WRITE,
    ID, NUM,
    ASSIGN, EQ, LT, PLUS, MINUS, TIMES, OVER, LPAREN, RPAREN, SEMI
} TokenType;

/* Current source line number; defined in another source file */
extern int lineno;

/* The max size of an identifier or reserved word */
#define MAXTOKENLEN 50

#endif
The flex input that generates the scanner is the core of the program:
tiny.l:
%{
#include <stdio.h>
#include <string.h>
#include "globals.h"
#include "util.h"

char tokenString[MAXTOKENLEN + 1];
%}

digit       [0-9]
number      {digit}+
letter      [a-zA-Z]
identifier  {letter}[a-zA-Z0-9]*
newline     \n
whitespace  [ \t]

%%

"if"            {return IF;}
"then"          {return THEN;}
"else"          {return ELSE;}
"end"           {return END;}
"repeat"        {return REPEAT;}
"until"         {return UNTIL;}
"read"          {return READ;}
"write"         {return WRITE;}
":="            {return ASSIGN;}
"="             {return EQ;}
"<"             {return LT;}
"+"             {return PLUS;}
"-"             {return MINUS;}
"*"             {return TIMES;}
"/"             {return OVER;}
"("             {return LPAREN;}
")"             {return RPAREN;}
";"             {return SEMI;}
{number}        {return NUM;}
{identifier}    {return ID;}
{newline}       {lineno++;}
{whitespace}    { /* Do nothing */ }
"{"             {
                    /* int rather than char, so the EOF comparison is reliable */
                    int c;
                    /* Nesting depth: the comment ends when it drops to 0 */
                    int count = 1;
                    do {
                        c = input();
                        /* Some flex versions return 0 rather than EOF at end of input */
                        if (c == EOF || c == 0) break;
                        else if (c == '\n') lineno++;
                        else if (c == '{') count++;
                        else if (c == '}') count--;
                    } while (count != 0);
                }
.               {return ERROR;}

%%

TokenType getToken(void)
{
    TokenType currentToken;
    currentToken = yylex();
    /* tokenString is zero-initialized and one byte longer than the copy limit,
       so it always stays NUL-terminated */
    strncpy(tokenString, yytext, MAXTOKENLEN);
    printf("%d: ", lineno);
    printToken(currentToken, tokenString);
    return currentToken;
}
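The keyword rules work because flex prefers the longest match and, when two rules match the same length, the one listed first. So "if" becomes IF, since the keyword rule precedes {identifier}, while "iffy" becomes ID, since the identifier rule matches more characters. A hand-written scanner usually gets the same result by reading a whole word first and then looking it up in a table. The following is a minimal sketch of that alternative. The names `WordToken`, `reserved_words` and `classify_word` are assumptions made for illustration and do not appear in the post's code.

```c
#include <assert.h>
#include <string.h>

/* Subset of the post's TokenType, repeated so the sketch compiles on its own. */
typedef enum { IF, THEN, ELSE, END, REPEAT, UNTIL, READ, WRITE, ID } WordToken;

/* Table of TINY reserved words and the token each one maps to. */
static const struct { const char *text; WordToken token; } reserved_words[] = {
    {"if", IF}, {"then", THEN}, {"else", ELSE}, {"end", END},
    {"repeat", REPEAT}, {"until", UNTIL}, {"read", READ}, {"write", WRITE}
};

/* Hypothetical helper: classify a word that has already been scanned in full. */
WordToken classify_word(const char *word)
{
    assert(word != NULL);
    size_t count = sizeof reserved_words / sizeof reserved_words[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(word, reserved_words[i].text) == 0)
            return reserved_words[i].token;
    }
    return ID;  /* not reserved, so it is an ordinary identifier */
}
```

With only eight reserved words a linear search is enough; a larger language might switch to a sorted table or a hash.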
The printToken function is declared in util.h and implemented in util.c.
util.h:
#ifndef UTIL_H
#define UTIL_H

#include "globals.h"

void printToken(TokenType token, char* tokenString);
TokenType getToken(void);

#endif
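util.c:
#include "util.h"
#include <stdio.h>
#include "globals.h"

/* Print the token's category and its text on one line */
void printToken(TokenType token, char* tokenString)
{
    switch (token) {
    case IF:
    case THEN:
    case ELSE:
    case END:
    case REPEAT:
    case UNTIL:
    case READ:
    case WRITE:
        printf("\treserved word: %s\n", tokenString);
        break;
    case ID:
        printf("\tidentifier: %s\n", tokenString);
        break;
    case NUM:
        printf("\tnumber: %s\n", tokenString);
        break;
    case ASSIGN:
    case EQ:
    case LT:
    case PLUS:
    case MINUS:
    case TIMES:
    case OVER:
    case LPAREN:
    case RPAREN:
    case SEMI:
        printf("\toperator: %s\n", tokenString);
        break;
    case ERROR:
        /* A character that no other rule matched */
        printf("\terror: %s\n", tokenString);
        break;
    case ENDFILE:
        /* End of input: yylex() returned 0 */
        printf("\tEOF\n");
        break;
    }
}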
Those are all the source files apart from main.c, which is not listed here. Finally, the makefile:
scanner.exe: main.o lex.yy.o util.o
	gcc main.o lex.yy.o util.o -o scanner.exe -lfl

main.o: main.c globals.h util.h
	gcc main.c -c

util.o: util.c util.h globals.h
	gcc util.c -c

lex.yy.o: tiny.l
	flex tiny.l
	gcc lex.yy.c -c
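The makefile compiles a main.c that the post does not show. Here is a rough sketch of what such a driver might contain, assuming it only has to define lineno and keep calling getToken() until ENDFILE comes back. So that the sketch compiles on its own, the loop takes the token source as a function pointer and a stand-in source replaces the flex scanner. `scan_all`, `demo_next` and the cut-down enum are illustrative names, not the author's code.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down copy of TokenType from globals.h. ENDFILE is 0, which is also
   the value yylex() returns at end of input. */
typedef enum { ENDFILE, ERROR, READ, ID, SEMI } TokenType;

int lineno = 1;   /* the definition that the extern declaration refers to */

/* Pull tokens until ENDFILE and return how many came before it. */
int scan_all(TokenType (*next_token)(void))
{
    assert(next_token != NULL);
    int count = 0;
    while (next_token() != ENDFILE)
        count++;
    return count;
}

/* Stand-in for getToken(): replays the tokens of "read x;" once. */
TokenType demo_next(void)
{
    static const TokenType tokens[] = { READ, ID, SEMI, ENDFILE };
    static size_t pos = 0;
    if (pos < sizeof tokens / sizeof tokens[0] - 1)
        return tokens[pos++];
    return ENDFILE;
}
```

Under these assumptions, the real main.c would include util.h instead of redefining the enum, and its main function would reduce to a call such as scan_all(getToken).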
With that, a simple lexical scanner is complete.
Because the scanner uses flex's default input, which is standard input, you can type a TINY program directly at the keyboard.
You can also feed it a file with shell redirection, for example scanner.exe < test.tine.