YACC usage collection

Source: Internet
Author: User
Index:
  1. Concept
  2. YACC File Format
  3. Definition
  4. Rule Section
  5. Part 3
  6. Recursive Processing
  7. If-else conflict
  8. Error Handling
  9. YACC source program Style
1. Concept

YACC uses BNF (Backus-Naur Form) notation to specify a context-free grammar.

The symbol on the left-hand side (LHS) of each production is a non-terminal symbol. The right-hand side (RHS) may contain both non-terminal and terminal symbols; terminal symbols appear only on the right-hand side.

Conflicts may occur during reduction. YACC resolves them with default rules: for a shift/reduce conflict it shifts, and for a reduce/reduce conflict it uses the first matching rule in the grammar.

 

2. YACC File Format

The YACC file is divided into three parts:

... definitions ...
%%
... rules ...
%%
... subroutines ...

 

3. Definition

The first part contains token declarations and C code (enclosed between "%{" and "%}").

For example, declare a token in the definition section:

%token  INTEGER

After running YACC (with the -d option), a header file containing the token definitions is generated, for example:

#ifndef YYSTYPE
#define YYSTYPE int
#endif
#define INTEGER 258
extern YYSTYPE yylval;

LEX includes this header file and uses its token definitions. To obtain a token, YACC calls yylex(), which lex generates. The value associated with the token is stored in the variable yylval, whose type is determined by YYSTYPE. The default type of YYSTYPE is int. For example:

[0-9]+  {yylval = atoi(yytext);return INTEGER;}

Token values 0-255 are reserved for character values, so generated token numbers typically start around 258. A single character can therefore be returned as its own token. For example:

[-+]  return *yytext;  /* return operator */

This rule returns the plus or minus sign itself. The minus sign is placed first in the character class so that it is not interpreted as a range operator.
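The two lex rules above can be imitated by a small hand-written scanner. The following is only a sketch in C, not the code lex generates: next_token is a hypothetical helper, and INTEGER and yylval are redefined locally to mirror the generated header.

```c
#include <ctype.h>

#define INTEGER 258   /* same number as in the generated header shown earlier */

int yylval;           /* stands in for the parser's yylval (YYSTYPE is int) */

/* Hypothetical helper: returns the next token from *input and advances it.
   Returns 0 at end of input, INTEGER for a run of digits (value in yylval),
   or the character itself for anything else. */
int next_token(const char **input)
{
    const char *p = *input;

    while (*p == ' ' || *p == '\t')      /* skip blanks */
        p++;

    if (*p == '\0') {
        *input = p;
        return 0;                        /* end of input */
    }

    if (isdigit((unsigned char)*p)) {
        int value = 0;
        while (isdigit((unsigned char)*p))
            value = value * 10 + (*p++ - '0');
        yylval = value;
        *input = p;
        return INTEGER;
    }

    *input = p + 1;
    return (unsigned char)*p;            /* single-character token, value 0-255 */
}
```

Because character tokens occupy 0-255 and INTEGER is 258, the caller can tell the two kinds of token apart by the return value alone.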

Operator associativity is declared with %left and %right: %left marks an operator as left-associative and %right marks it as right-associative. You can declare multiple %left or %right groups; a group declared later has higher precedence. For example:

%left '+' '-'
%left '*' '/'

With the declarations above, multiplication and division have higher precedence than addition and subtraction.
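The effect of these two declarations can be checked with a small hand-written evaluator. The sketch below uses precedence climbing, which is a different technique from the table-driven parser that YACC generates; it only reproduces the same grouping. The names prec_of, parse_primary and eval_expr are hypothetical, and operands are limited to unsigned integers.

```c
#include <ctype.h>

/* Precedence levels mirroring the declarations: a later %left line
   corresponds to a higher level. 0 means "not a binary operator". */
int prec_of(int op)
{
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;
    }
}

/* Reads an unsigned integer operand and advances *p past it. */
int parse_primary(const char **p)
{
    int value = 0;
    while (**p == ' ')
        (*p)++;
    while (isdigit((unsigned char)**p))
        value = value * 10 + (*(*p)++ - '0');
    return value;
}

/* Evaluates operators whose precedence is at least min_prec. */
int eval_expr(const char **p, int min_prec)
{
    int lhs = parse_primary(p);
    for (;;) {
        while (**p == ' ')
            (*p)++;
        int op = (unsigned char)**p;
        int prec = prec_of(op);
        if (prec == 0 || prec < min_prec)
            return lhs;
        (*p)++;
        /* Left associativity: the right operand may only contain
           operators of strictly higher precedence. */
        int rhs = eval_expr(p, prec + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs = (rhs != 0) ? lhs / rhs : 0; break; /* sketch: x/0 yields 0 */
        }
    }
}
```

Calling eval_expr with a minimum precedence of 1 evaluates a whole expression; "2+3*4" groups as 2+(3*4), and "10-4-3" groups as (10-4)-3.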

YACC maintains two stacks, the symbol (parse) stack and the value stack, which are always kept synchronized.

To change the type of YYSTYPE, define it with %union as follows:

%union {
    int     iValue;    /* integer value */
    char    sIndex;    /* symbol table index */
    nodeType *nPtr;    /* node pointer */
};

The content in the generated header file is:

typedef union {
    int     iValue;   /* integer value */
    char    sIndex;   /* symbol table index */
    nodeType *nPtr;   /* node pointer */
} YYSTYPE;
extern YYSTYPE yylval;

You can bind a token or a non-terminal symbol to a member of YYSTYPE. For example:

%token  <iValue>  INTEGER
%type   <nPtr>    expr

This binds expr to nPtr and INTEGER to iValue. YACC then inserts the matching member references when it processes the actions. For example:

expr:  INTEGER  { $$ = con($1); }

The conversion result is:

yyval.nPtr = con(yyvsp[0].iValue);

Here, yyvsp[0] addresses the top of the value stack, and yyval holds the value of the rule's left-hand side.
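The generated assignment can be tried outside a parser. The following is a minimal sketch under stated assumptions: nodeType is a placeholder struct and con is a hypothetical implementation, since neither is shown in the text, and reduce_expr_integer is an invented name that mimics the action for the rule expr: INTEGER. The local yyval stands for the value of the rule's left-hand side.

```c
#include <stdlib.h>

/* Placeholder node type: the real nodeType is not shown in the text. */
typedef struct nodeType {
    int value;
} nodeType;

typedef union {
    int       iValue;   /* integer value */
    char      sIndex;   /* symbol table index */
    nodeType *nPtr;     /* node pointer */
} YYSTYPE;

/* Hypothetical con(): builds a constant node, or returns NULL if
   allocation fails. */
nodeType *con(int value)
{
    nodeType *node = malloc(sizeof *node);
    if (node != NULL)
        node->value = value;
    return node;
}

/* Mimics the action of the rule  expr: INTEGER.
   yyvsp points at the top of the value stack, so the value of the
   first right-hand-side symbol is yyvsp[0]. */
YYSTYPE reduce_expr_integer(const YYSTYPE *yyvsp)
{
    YYSTYPE yyval;
    yyval.nPtr = con(yyvsp[0].iValue);   /* read as int, store as node pointer */
    return yyval;
}
```

The point of the sketch is that the same stack slot type (YYSTYPE) is read through iValue for the token and written through nPtr for the non-terminal, which is what the %token and %type bindings select.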

To give unary minus a higher precedence than the binary operators:

%left GE LE EQ NE '>' '<'
%left '+' '-'
%left '*'
%nonassoc UMINUS

%nonassoc indicates that the operator is non-associative. It is commonly used together with %prec, which gives a rule the precedence of the named token. For example:

expr: '-' expr %prec UMINUS { $$ = node(UMINUS, 1, $2); }

This gives the rule the same precedence as UMINUS. In the declarations above, UMINUS has higher precedence than all other operators, so unary minus binds more tightly than any of them.

 

4. Rule Section

The rules resemble BNF productions.

In a rule, the goal (a non-terminal symbol) is placed on the left, followed by a colon (:), then the right-hand side of the production, then the corresponding action (enclosed in braces {}). For example:

%{
#include <stdio.h>
int yylex(void);
int yyerror(char *s);
%}

%token INTEGER

%%

program:
        program expr '\n'   { printf("%d\n", $2); }
        |
        ;

expr:
        INTEGER             { $$ = $1; }
        | expr '+' expr     { $$ = $1 + $3; }
        | expr '-' expr     { $$ = $1 - $3; }
        ;

%%

int yyerror(char *s)
{
    fprintf(stderr, "%s\n", s);
    return 0;
}

int main(void)
{
    yyparse();
    return 0;
}

$1 denotes the value of the first symbol on the right-hand side, $2 the value of the second, and so on. $$ denotes the value of the left-hand side after the reduction.
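How the positional values relate to the value stack can be sketched with a toy stack of ints. This is an illustration only: ValueStack, push_value and reduce_add are hypothetical names, the stack has a fixed size without overflow checks, and the offsets shown are an assumption about how a typical generated parser lays out a three-symbol right-hand side.

```c
/* Value stack for a toy parser whose semantic values are plain ints. */
typedef struct {
    int values[16];
    int top;            /* index of the topmost value, -1 when empty */
} ValueStack;

/* Pushes one value (no overflow check in this sketch). */
void push_value(ValueStack *s, int v)
{
    s->values[++s->top] = v;
}

/* Reduce by the rule  expr: expr '+' expr.
   With three symbols on the right-hand side, the first, second and
   third values sit at top-2, top-1 and top. The action adds the first
   and third, then the three values are popped and the result pushed. */
void reduce_add(ValueStack *s)
{
    int *yyvsp = &s->values[s->top];
    int result = yyvsp[-2] + yyvsp[0];
    s->top -= 3;
    push_value(s, result);
}
```

After pushing the values for the input 7 + 5 (with a dummy value for the '+' token) and calling reduce_add, the stack holds a single value, the sum.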

 

5. Part 3

This part contains the supporting C functions. When YACC detects a parse error, it calls the yyerror() function; you can supply your own implementation of it. The main function calls yyparse(), the entry point of the generated parser.

 

6. Recursive Processing

Recursive processing includes left recursion and right recursion.

Left recursion:

list:
        item
        | list ',' item
        ;

Right recursion:

list:
        item
        | item ',' list

With right recursion, all items are pushed onto the stack before any reduction starts. With left recursion, there are never more than three symbols on the stack at the same time, because each item is reduced as soon as it is read.

Left recursion is therefore preferable: it keeps the stack small regardless of the length of the list.
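The difference in stack usage can be made concrete by counting symbols during a simulated shift-reduce parse of a list with n items separated by commas. This is a hand simulation of the two list grammars, not output from YACC, and the function names max_depth_left and max_depth_right are hypothetical.

```c
/* Maximum stack depth while parsing n comma-separated items with
   list: item | list ',' item   (left recursion). */
int max_depth_left(int n)
{
    int depth = 0, max = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0)
            depth++;                /* shift ',' */
        depth++;                    /* shift item */
        if (depth > max)
            max = depth;
        if (i == 0)
            depth = 1;              /* item -> list: pop 1, push 1 */
        else
            depth -= 2;             /* list ',' item -> list: pop 3, push 1 */
    }
    return max;
}

/* Maximum stack depth while parsing n comma-separated items with
   list: item | item ',' list   (right recursion). */
int max_depth_right(int n)
{
    int depth = 0, max = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0)
            depth++;                /* shift ',' */
        depth++;                    /* shift item; nothing can be reduced yet */
        if (depth > max)
            max = depth;
    }
    /* Only now do reductions start: the last item becomes a list
       (depth unchanged), then each  item ',' list  collapses into a
       list, removing two symbols at a time. */
    for (int i = 1; i < n; i++)
        depth -= 2;
    return max;
}
```

For a list of 100 items the left-recursive grammar peaks at 3 symbols, while the right-recursive one peaks at 199 (100 items plus 99 commas).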

 

7. If-else conflict

When there are two ifs and one else, it is ambiguous which if the else belongs to. There are two possibilities: the else pairs with the first if, or with the second (nearest) if. Modern programming languages pair the else with the nearest if, which is also the default behavior of YACC.

Although YACC does the right thing by default, it still reports a conflict. To avoid the warning, you can give the if-else rule a higher precedence than the plain if rule:

%nonassoc IFX
%nonassoc ELSE

stmt:
        IF expr stmt %prec IFX
        | IF expr stmt ELSE stmt
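The "nearest if" behavior can be illustrated with a tiny hand-written recursive-descent parser. This is a sketch under simplifying assumptions, not YACC's algorithm: the token stream is a string in which 'i' stands for an if with its condition, 'e' for else and 's' for a simple statement, and IfParser, parse_stmt and owner_of_else are hypothetical names. Only the owner of the last else consumed is recorded.

```c
/* Toy token stream: 'i' = "if (cond)", 'e' = "else", 's' = simple statement. */
typedef struct {
    const char *pos;    /* next unread token */
    int if_count;       /* number of ifs seen so far */
    int else_owner;     /* which if (1-based) took the else, 0 if none */
} IfParser;

void parse_stmt(IfParser *ps)
{
    if (*ps->pos == 'i') {
        int id = ++ps->if_count;
        ps->pos++;
        parse_stmt(ps);              /* body of this if */
        if (*ps->pos == 'e') {       /* take the else immediately, like a shift */
            ps->pos++;
            ps->else_owner = id;
            parse_stmt(ps);          /* else branch */
        }
    } else if (*ps->pos == 's') {
        ps->pos++;
    }
}

/* Returns the number of the if that the else was attached to. */
int owner_of_else(const char *tokens)
{
    IfParser ps = { tokens, 0, 0 };
    parse_stmt(&ps);
    return ps.else_owner;
}
```

For the stream "iises" (if if s else s) the inner if, number 2, consumes the else, because it sees the else first and takes it at once.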

 

8. Error Handling

When YACC detects a parse error, the default action is to call the yyerror() function and then return from yyparse() with a nonzero value. A friendlier approach is to skip the erroneous input and continue scanning. This can be implemented as follows:

stmt:
        ';'
        | expr ';'
        | PRINT expr ';'
        | VARIABLE '=' expr ';'
        | WHILE '(' expr ')' stmt
        | IF '(' expr ')' stmt %prec IFX
        | IF '(' expr ')' stmt ELSE stmt
        | '{' stmt_list '}'
        | error ';'
        | error '}'
        ;

The error token tells YACC that, when it finds an error, it should call yyerror(), skip input up to the next ';' or '}', and then continue scanning.
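The skip-and-resume idea can be sketched without a generated parser. The toy checker below is an assumption-laden illustration, not YACC's recovery mechanism: a valid statement is simply a run of digits followed by ';', and check_statements is a hypothetical name. On a malformed statement it counts an error, discards input up to and including the next ';' or '}', and carries on.

```c
#include <ctype.h>

/* Returns the number of well-formed statements in the input and stores
   the number of malformed ones in *errors. */
int check_statements(const char *p, int *errors)
{
    int ok = 0;
    *errors = 0;

    while (*p != '\0') {
        while (*p == ' ')
            p++;
        if (*p == '\0')
            break;

        const char *start = p;
        while (isdigit((unsigned char)*p))
            p++;

        if (p > start && *p == ';') {    /* digits followed by ';' */
            ok++;
            p++;
            continue;
        }

        /* Error: report it, then resynchronize on ';' or '}'. */
        (*errors)++;
        while (*p != '\0' && *p != ';' && *p != '}')
            p++;
        if (*p != '\0')
            p++;                         /* consume the synchronizing character */
    }
    return ok;
}
```

For the input "1; 2 x 3; 4;" the middle statement is rejected and skipped, and the statements on either side are still accepted.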

 

9. YACC source program Style

The following style is recommended:

  1. Write all terminal symbols in uppercase letters and all non-terminal symbols in lowercase letters;
  2. Place grammar rules and semantic actions on separate lines;
  3. Group rules with the same left-hand side together, writing the left-hand side only once and introducing each subsequent alternative with a vertical bar "|";
  4. Put the semicolon ";" that ends a rule on a line by itself;
  5. Use tabs to align rules and actions.

 
