Lex and Yacc help you write programs that transform structured input. Such programs range from a simple text search program that looks for patterns in its input file to a C compiler that transforms a source program into optimized object code.
Lex takes a set of descriptions of possible tokens and produces a C routine that identifies those tokens. We call that routine a lexical analyzer, a lexer, or a scanner.
The token descriptions that Lex uses are called regular expressions. Yacc takes a concise description of a grammar and generates a C routine that can parse that grammar: a parser. The Yacc parser automatically detects when a sequence of input tokens matches one of the rules in the grammar, and it reports a syntax error as soon as the input matches none of the rules.
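To make this division of labor concrete, here is a minimal hand-written sketch in C of the two jobs that Lex and Yacc automate. It is not what either tool generates. The token names, `next_token`, and `parse_sum` are hypothetical, and the grammar (numbers separated by `+`) is an assumption chosen only to keep the example short.

```c
#include <ctype.h>

/* Hypothetical token kinds for this sketch. */
enum token { TOK_NUMBER, TOK_PLUS, TOK_END, TOK_ERROR };

/* Lexer: skip blanks, classify the next token, and advance *p past it. */
enum token next_token(const char **p)
{
    while (**p == ' ' || **p == '\t')
        (*p)++;
    if (**p == '\0')
        return TOK_END;
    if (isdigit((unsigned char)**p)) {
        while (isdigit((unsigned char)**p))
            (*p)++;
        return TOK_NUMBER;
    }
    if (**p == '+') {
        (*p)++;
        return TOK_PLUS;
    }
    (*p)++;              /* consume the character no pattern recognizes */
    return TOK_ERROR;
}

/* Parser for the assumed grammar
 *     sum : NUMBER | sum '+' NUMBER ;
 * Returns 1 if the whole input matches, 0 on a syntax error. */
int parse_sum(const char *input)
{
    const char *p = input;

    if (next_token(&p) != TOK_NUMBER)
        return 0;
    for (;;) {
        enum token t = next_token(&p);
        if (t == TOK_END)
            return 1;    /* every token fit a rule */
        if (t != TOK_PLUS)
            return 0;    /* token sequence matches no rule */
        if (next_token(&p) != TOK_NUMBER)
            return 0;
    }
}
```

With this sketch, `parse_sum("1 + 22 + 3")` accepts its input, while `parse_sum("1 + + 2")` rejects it because the second `+` cannot follow the first under the assumed grammar.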
The simplest Lex program
%%
.|\n    ECHO;
%%
This program copies its standard input to its standard output.
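For comparison, the same observable behavior can be written by hand in C. This is only a sketch of what the rule amounts to, not the code that Lex generates; the function name `copy_stream` is hypothetical.

```c
#include <stdio.h>

/* Copy every character from in to out, newlines included, which is the
 * observable effect of the ".|\n ECHO;" rule. Returns the number of
 * characters copied. */
long copy_stream(FILE *in, FILE *out)
{
    long count = 0;
    int c;   /* int, not char, so that EOF can be told apart from data */

    while ((c = getc(in)) != EOF) {
        putc(c, out);
        count++;
    }
    return count;
}
```

Calling `copy_stream(stdin, stdout)` from `main` would behave like the Lex program for ordinary text input.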
Identify words with Lex
Next, build a simple program that recognizes different types of English words. The first version only distinguishes verbs from non-verbs.
%{
/*
 * This example demonstrates very simple recognition: verb / non-verb.
 *
 * The text between "%{" and "%}" is the definition section. It holds C
 * code that Lex copies verbatim into the generated C file. If the code
 * that follows needs a header file, include it here.
 *
 * Comments in a Lex file must be set off with whitespace; otherwise Lex
 * interprets them as something else.
 *
 * The first "%%" marks the end of this section. The next section is the
 * rules section. Each rule has two parts, a pattern and an action,
 * separated by whitespace. When the lexer generated by Lex recognizes a
 * pattern, it executes the corresponding action. Patterns are UNIX-style
 * regular expressions.
 *
 * The action "|" means that the rule uses the same action as the next
 * pattern, so all the verbs use the action given for the last verb.
 *
 * Why does "island" not match "is"?
 * The yytext array contains the text that matched the pattern.
 * Lex has a simple set of disambiguation rules:
 *   1. Lex patterns match a given input character or string only once.
 *   2. Lex performs the longest possible match for the current input.
 * Because "island" is a longer match than "is", Lex treats "island" as
 * matching the catch-all rule [a-zA-Z]+.
 *
 * The last rule is the default. "." matches any single character except
 * a newline, "\n" matches a newline, and ECHO prints the matched text.
 *
 * The final section is the user subroutines section, which consists of
 * any legal C code.
 *
 * Build commands:
 *   lex simple.lex            (generates lex.yy.c)
 *   gcc lex.yy.c -o simple
 */
#include <stdio.h>
%}
%%
[\t ]+      /* ignore whitespace */ ;
is |
am |
are |
were |
was |
be |
being |
been |
do |
does |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go          { printf("%s: is a verb\n", yytext); }
[a-zA-Z]+   { printf("%s: is not a verb\n", yytext); }
.|\n        { ECHO; /* the usual default */ }
%%
int main(void)
{
    yylex();
    return 0;
}

/* yywrap() must be defined; returning 1 means there is no more input. */
int yywrap(void)
{
    return 1;
}
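The longest-match rule is easier to see in a hand-written imitation. The sketch below is an assumption-laden stand-in, not Lex's actual algorithm (Lex compiles the patterns into a table-driven automaton). It first takes the longest run of letters, as `[a-zA-Z]+` would, and only then checks whether that whole word is one of the verbs. The names `longest_word`, `is_verb`, and `count_verbs` are hypothetical, and the verb table is modeled on the rules above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Verb list modeled on the Lex rules in this article. */
static const char *const verbs[] = {
    "is", "am", "are", "were", "was", "be", "being", "been", "do",
    "does", "did", "will", "would", "should", "can", "could", "has",
    "have", "had", "go"
};

/* Length of the longest run of letters at the start of s,
 * the counterpart of the [a-zA-Z]+ pattern. */
size_t longest_word(const char *s)
{
    size_t n = 0;
    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns 1 if the first len characters of s are exactly one verb. */
int is_verb(const char *s, size_t len)
{
    for (size_t i = 0; i < sizeof verbs / sizeof verbs[0]; i++)
        if (strlen(verbs[i]) == len && strncmp(verbs[i], s, len) == 0)
            return 1;
    return 0;
}

/* Count the verbs in a line, always consuming the longest word first. */
int count_verbs(const char *line)
{
    int count = 0;

    while (*line != '\0') {
        size_t len = longest_word(line);
        if (len == 0) {          /* not a letter: skip one character */
            line++;
            continue;
        }
        if (is_verb(line, len))
            count++;
        line += len;             /* each character is matched only once */
    }
    return count;
}
```

Because the whole word "island" is consumed before any lookup happens, its leading "is" is never considered on its own, which mirrors the behavior described in the comment of the Lex program.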
Here are the results of running the Lex program: