(Part 2)
5. Implementation of top-down syntax analysis programs
After four steps of careful preparation, the most exciting moment has arrived. Most of the code in "compilation principles" textbooks is pseudocode that cannot run on a machine. What you will see here is a practical, runnable program that checks for errors and uses a top-down syntax analysis algorithm, evaluating as it parses, to calculate arithmetic expressions.
Without loss of generality, we stipulate that arithmetic expressions may contain only the four arithmetic operations on integers (plus parentheses), so we need to implement the following three functions:
int E_AddSub(); // function for the production of the nonterminal E
int T_MulDiv(); // function for the production of the nonterminal T
int F_Number(); // function for the production of the nonterminal F
As you can see, each of these functions returns an int, because we need the three functions to return the calculated value. To calculate the value of each subexpression, the E_AddSub() and T_MulDiv() functions use a variable rtn to hold the value on the left of the operator and a variable opr2 to hold the value on the right, and then perform the operation that the operator calls for.
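The rtn/opr2 scheme can be illustrated in isolation. The following is a minimal sketch, not the program's own code: the function name evalAddSub and the restriction to single-digit operands are assumptions made to keep it short. It shows that accumulating into rtn inside a loop applies operators of equal precedence from left to right.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helper: evaluates single-digit operands joined by '+' or '-'.
int evalAddSub(const char* s)
{
    int pos = 0;
    assert(std::isdigit(static_cast<unsigned char>(s[pos]))); // must start with a digit
    int rtn = s[pos++] - '0';               // value on the left of the operator
    while (s[pos] == '+' || s[pos] == '-')
    {
        char op = s[pos++];                 // take the operator and advance
        assert(std::isdigit(static_cast<unsigned char>(s[pos])));
        int opr2 = s[pos++] - '0';          // value on the right of the operator
        if (op == '+')
            rtn += opr2;
        else
            rtn -= opr2;
    }
    return rtn;
}
```

With this sketch, evalAddSub("8-3-2") computes (8-3)-2 = 3 rather than 8-(3-2) = 7, which is the usual left-to-right meaning of subtraction.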
To hold the input arithmetic expression, I use a global static character array expr as the input character buffer and an integer pos as the character indicator. The advance() operation, which moves the indicator to the next character, can then be replaced by pos++, and the character under the indicator is simply expr[pos].
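As a small illustration of the buffer-plus-indicator idea, the sketch below reads one unsigned integer from a buffer and leaves the indicator on the character after it. The helper name readNumber and passing the buffer and indicator as parameters (instead of using globals) are assumptions made so the example stands on its own.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>

// Hypothetical helper: reads the unsigned integer that starts at buf[pos]
// and leaves pos on the first character after the digits.
int readNumber(const char* buf, int& pos)
{
    assert(std::isdigit(static_cast<unsigned char>(buf[pos]))); // caller must be on a digit
    int value = std::atoi(buf + pos);   // atoi stops at the first non-digit character
    while (std::isdigit(static_cast<unsigned char>(buf[pos])))
        pos++;                          // "advance()" is just pos++
    return value;
}
```

For the buffer "12+345" and pos starting at 0, the first call returns 12 and leaves pos at 2, where buf[pos] is '+'.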
To report errors, I use macros to define seven error codes (0 through 6), together with the seven corresponding error message strings. The error() function is changed accordingly:
void Error(int ErrCode);
In this way, the program can respond to an error from the error code passed in: it marks the error location, displays the error message, and prompts the user. In addition, I declare a static variable errjb as the error jump buffer. errjb has type std::jmp_buf. The setjmp() macro records the current execution state of the program in errjb; when an error occurs, the longjmp() function jumps directly back to the place in the main program where setjmp() was called, instead of returning through the function in which the error occurred.
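The jump mechanism can be tried out on its own. The following is a minimal sketch under stated assumptions: fail(), inner(), outer(), attempt() and lastError are hypothetical names that stand in for Error() and the parsing functions; only errjb, setjmp() and longjmp() come from the text.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf errjb;   // error jump buffer, as in the text
static int lastError = -1;   // hypothetical: remembers the code passed to fail()

// Hypothetical stand-in for Error(): record the code, then jump back.
void fail(int errCode)
{
    assert(errCode >= 0);    // error codes are non-negative
    lastError = errCode;
    std::longjmp(errjb, 1);  // makes setjmp(errjb) return 1
}

// Hypothetical nested call chain that fails on request.
void inner(bool bad) { if (bad) fail(4); }
void outer(bool bad) { inner(bad); }

// Returns true if the call chain finished, false if fail() jumped back here.
bool attempt(bool bad)
{
    if (setjmp(errjb) == 0)  // first pass: state saved, no error yet
    {
        outer(bad);
        return true;
    }
    return false;            // reached only through longjmp
}
```

Here attempt(false) returns true, while attempt(true) returns false and leaves lastError at 4, without inner() or outer() ever returning normally. One caveat for C++ code: longjmp() does not run destructors, so this technique is only safe when the skipped functions hold no objects that need cleanup, as is the case in a program that uses plain int and char variables.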
With these pieces in place, a fully functional parser and evaluator for arithmetic expressions is complete. Note that a program constructed in this way cannot recognize unary operators. For example, it reports an error when "-1 + 1" is entered.
The following is a snippet of a sample run:
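For readers who want unary minus, one possible extension is to add a production F -> -F. The sketch below is not part of the program in this article: the names src, parseE, parseT, parseF and evaluate are hypothetical, and error reporting is reduced to assert() to keep the example short.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>

static const char* src;  // hypothetical input pointer (plays the role of expr)
static int pos;          // character indicator

int parseE();            // forward declaration: E -> T+E | T-E | T

// F -> -F | (E) | i, where "-F" is the assumed extra production
int parseF()
{
    if (src[pos] == '-') { pos++; return -parseF(); }
    if (src[pos] == '(')
    {
        pos++;
        int v = parseE();
        assert(src[pos] == ')');   // a real program would report an error here
        pos++;
        return v;
    }
    assert(std::isdigit(static_cast<unsigned char>(src[pos])));
    int v = std::atoi(src + pos);
    while (std::isdigit(static_cast<unsigned char>(src[pos])))
        pos++;
    return v;
}

int parseT()
{
    int rtn = parseF();
    while (src[pos] == '*' || src[pos] == '/')
    {
        char op = src[pos++];
        int opr2 = parseF();
        if (op == '*') rtn *= opr2; else rtn /= opr2;
    }
    return rtn;
}

int parseE()
{
    int rtn = parseT();
    while (src[pos] == '+' || src[pos] == '-')
    {
        char op = src[pos++];
        int opr2 = parseT();
        if (op == '+') rtn += opr2; else rtn -= opr2;
    }
    return rtn;
}

// Hypothetical entry point: evaluates a whole expression without spaces.
int evaluate(const char* text)
{
    src = text;
    pos = 0;
    int v = parseE();
    assert(src[pos] == '\0');      // the whole input must be consumed
    return v;
}
```

Because the minus sign is handled inside F, it binds more tightly than multiplication, so evaluate("-1+1") gives 0 and evaluate("2*-3") gives -6.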
1 + (
^ Syntax error !!! The expression is invalid or the expression is incomplete!
Enter again!
Enter an arithmetic expression (enter "Q" or "q" to exit ):
2 -()
^ Syntax error !!! There is no expression in the brackets or the expression is incomplete!
Enter again!
Enter an arithmetic expression (enter "Q" or "q" to exit ):
2 + (3 +
^ Syntax error !!! The expression is invalid or the expression is incomplete!
Enter again!
Enter an arithmetic expression (enter "Q" or "q" to exit ):
2 + (3*9) +
^ Syntax error !!! The expression is invalid or the expression is incomplete!
Enter again!
Enter an arithmetic expression (enter "Q" or "q" to exit ):
2*(2 + 4) 4
^ Syntax error !!! Invalid characters are connected after the right brackets!
Enter again!
The program listing is as follows:
/***** Analysis and calculation of arithmetic expressions, file name: Exp_c.cpp, code/comments: hifrog *****
 ***** Run *****/
#include <iostream>  // cout, cin
#include <string>    // string
#include <cctype>    // isdigit
#include <cstdlib>   // atoi
#include <csetjmp>   // jmp_buf, setjmp, longjmp
#define EXP_LEN 100 // length of the input character buffer
/* ------------ Macro definitions of the error codes -------------- */
#define INVALID_CHAR_TAIL 0 // the expression is followed by an invalid character
#define CHAR_AFTER_RIGHT  1 // an invalid character follows a right parenthesis
#define LEFT_AFTER_NUM    2 // a left parenthesis must not directly follow a number
#define INVALID_CHAR_IN   3 // the expression contains an invalid character
#define NO_RIGHT          4 // the right parenthesis is missing
#define EMPTY_BRACKET     5 // no expression inside the brackets
#define UNEXPECTED_END    6 // unexpected end of the arithmetic expression
using namespace std;
const string ErrCodeStr[] = // expression error messages, indexed by error code
{
    "The expression is followed by an invalid character!",
    "Invalid characters are connected after the right brackets!",
    "The left parenthesis is not directly connected after the number!",
    "The expression contains invalid characters!",
    "The right parenthesis is missing!",
    "There is no expression in the brackets or the expression is incomplete!",
    "The expression is invalid or the expression is incomplete!"
};
static char expr[EXP_LEN]; // input character buffer for the arithmetic expression
static int pos;            // character indicator: position of the character being analyzed
static jmp_buf errjb;      // error jump buffer
// ******** Function declarations ********
// Function for the production "E -> T+E | T-E | T": analyzes addition and subtraction expressions.
int E_AddSub();
// Function for the production "T -> F*T | F/T | F": analyzes multiplication and division expressions.
int T_MulDiv();
// Function for the production "F -> i | (E)": analyzes numbers and parenthesized expressions.
int F_Number();
// Error handling function: shows the error location and the error message.
void Error(int ErrCode);
int main()
{
    int ans;           // holds the result of the arithmetic expression
    bool quit = false; // whether to exit
    do
    {
        // Set a jump target here. If another function in this program calls longjmp,
        // execution jumps back to this point and continues from here.
        if (setjmp(errjb) == 0) // no error has occurred
        {
            pos = 0; // reset the character indicator to the first character of the input string
            cout << "Enter an arithmetic expression (enter \"Q\" or \"q\" to exit):" << endl;
            cin >> expr; // read the expression into the character buffer
            if (expr[0] == 'q' || expr[0] == 'Q')
                // the first character asks to exit
                quit = true;
            else
            {
                // Call the function for the production "E -> T+E | T-E | T",
                // beginning the analysis from the start symbol "E".
                ans = E_AddSub();
                // At this point the program considers syntax analysis of the expression complete;
                // the checks below determine the cause of any remaining error.
                // If a right parenthesis in the expression is followed by a number or another character,
                // report an error, because the number i is not in the FOLLOW set of ")".
                if (expr[pos - 1] == ')' && expr[pos] != '\0')
                    Error(CHAR_AFTER_RIGHT);
                // If a number or right parenthesis in the expression is followed by a left parenthesis,
                // report an error, because the left parenthesis is not in the FOLLOW(E) set.
                if (expr[pos] == '(')
                    Error(LEFT_AFTER_NUM);
                // If there are other invalid characters at the end
                if (expr[pos] != '\0')
                    Error(INVALID_CHAR_TAIL);
                cout << "The calculated expression value is: " << ans << endl;
            }
        }
        else
        {
            // setjmp(errjb) != 0: an error occurred and longjmp brought us back here
            cout << "Enter again!" << endl;
        }
    }
    while (!quit);
    return 0;
}
// Function for the production "E -> T+E | T-E | T": analyzes addition and subtraction expressions.
// Returns the calculated result.
int E_AddSub()
{
    int rtn = T_MulDiv(); // calculate the left operand of the addition or subtraction
    while (expr[pos] == '+' || expr[pos] == '-')
    {
        int op = expr[pos++];  // take the operator at the current position and advance the indicator
        int opr2 = T_MulDiv(); // calculate the right operand of the addition or subtraction
        // Calculate the value
        if (op == '+')   // if it is a "+" sign
            rtn += opr2; // add
        else             // otherwise it is a "-" sign
            rtn -= opr2; // subtract
    }
    return rtn;
}
// Function for the production "T -> F*T | F/T | F": analyzes multiplication and division expressions.
// Returns the calculated result.
int T_MulDiv()
{
    int rtn = F_Number(); // calculate the left operand of the multiplication or division
    while (expr[pos] == '*' || expr[pos] == '/')
    {
        int op = expr[pos++];  // take the operator at the current position and advance the indicator
        int opr2 = F_Number(); // calculate the right operand of the multiplication or division
        // Calculate the value
        if (op == '*')   // if it is a "*" sign
            rtn *= opr2; // multiply
        else             // otherwise it is a "/" sign
            rtn /= opr2; // integer division (note: division by zero is not checked)
    }
    return rtn;
}
// Function for the production "F -> i | (E)": analyzes numbers and parenthesized expressions.
int F_Number()
{
    int rtn; // holds the return value
    // Derive with the production F -> (E):
    if (expr[pos] == '(') // if the current character in the buffer is "("
    {
        pos++;            // move the indicator to the next symbol
        rtn = E_AddSub(); // call the function for the production "E -> T+E | T-E | T"
        if (expr[pos++] != ')') // if there is no ")" to match the "("
            Error(NO_RIGHT);    // report an error
        return rtn;
    }
    if (isdigit(expr[pos])) // if the current character in the buffer is a digit
    {
        // Derive with the production F -> i.
        // Convert the digits at the current position of the buffer to an integer.
        rtn = atoi(expr + pos);
        // Advance the indicator past the digits to the next input character.
        while (isdigit(expr[pos]))
            pos++;
    }
    else // anything other than a digit is an error
    {
        if (expr[pos] == ')')          // if it is ")"
            Error(EMPTY_BRACKET);      // the brackets are empty: no arithmetic expression inside them
        else if (expr[pos] == '\0')    // if the input string ends here
            Error(UNEXPECTED_END);     // the arithmetic expression ended unexpectedly
        else
            Error(INVALID_CHAR_IN);    // otherwise the input string contains an invalid character
    }
    return rtn;
}
// Error handling function: takes an error code, then shows the error location and the error message.
void Error(int ErrCode)
{
    cout << '\r'; // carriage return: go back to the start of the line
    while (pos--)
        cout << ' '; // print spaces to move the "^" marker under the error position in the input string
    cout << "^ Syntax error !!! "
         << ErrCodeStr[ErrCode] << endl;
    longjmp(errjb, 1); // jump to the setjmp call in main() and make setjmp(errjb) return 1
}
(End of article)