Action for flex

Source: Internet
Author: User
Actions performed by the lexical analyzer

When the lexical analyzer matches one of the extended regular expressions in the rules section of the description file, it executes the action associated with that expression. If the rules do not match every string in the input stream, the lexical analyzer copies the unmatched input to standard output. Therefore, do not create rules that only copy input to output. The default output can help you find gaps in the rules.

When you use the lex command to process input for a parser generated by the yacc command, provide rules that match all input strings. Those rules must generate output that the yacc command can interpret.

Null action

To ignore the input associated with an extended regular expression, use ; (the C language null statement) as the action. The following example ignores the three spacing characters (blank, tab, and newline):

[ \t\n] ;

Same as the next action

To avoid writing the same action repeatedly, use | (the pipe symbol). This character indicates that the action for this rule is the same as the action for the next rule. For example, the preceding rule that ignores blanks, tabs, and newlines can also be written as follows:

" "  |
"\t" |
"\n" ;

The quotation marks around \n and \t are not required.

Printing a matched string

To see which text matches an expression in the rules section of the description file, include a call to the C language printf subroutine as one of the actions for that expression. When the lexical analyzer finds a match in the input stream, it places the matched string in the external character (char) and wide character (wchar_t) arrays named yytext and yywtext, respectively. For example, the following rule prints the matched string:

[a-z]+ printf("%s", yytext);

The C language printf subroutine accepts a format argument and the data to be printed. In this example, the arguments to the printf subroutine have the following meanings:

%s       Converts the data to a string before printing.
%S       Converts the data to a wide character string (wchar_t) before printing.
yytext   The name of the array that contains the data to be printed.
yywtext  The name of the array that contains the wide character (wchar_t) data to be printed.

The lex command defines ECHO; as a special action that prints the contents of yytext. For example, the following two rules are equivalent:

[a-z]+ ECHO;
[a-z]+ printf("%s", yytext);

You can change how yytext is represented by using %array or %pointer in the definitions section of the lex description file:

%array    Defines yytext as a null-terminated character array. This is the default.
%pointer  Defines yytext as a pointer to a null-terminated character string.

Finding the length of a matched string

To find the number of characters that the lexical analyzer matched for a particular extended regular expression, use the yyleng or yywleng external variable.

yyleng   The number of bytes matched.
yywleng  The number of wide characters in the matched string. A multibyte character has a byte length greater than 1.

To count both the words in the input and the characters in those words, use the following action:

[a-zA-Z]+ {words++; chars += yyleng;}

This action adds the length of each matched word to chars, so chars holds the total number of characters in all matched words.

The following expression yields the last character of the matched string:

yytext[yyleng-1]

Matching strings within strings

The lex command partitions the input stream and does not search for every possible match of each expression. Each character is accounted for only once. To override this behavior and search for items that may overlap or include each other, use the REJECT action. For example, to count all instances of she and he, including the instances of he contained in she, use the following rules:

she {s++; REJECT;}
he  {h++;}
\n  |
.   ;

After counting an occurrence of she, the lexical analyzer rejects that match and goes on to count the occurrence of he inside it. Because he does not contain she, the he rule does not need a REJECT action.
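As a cross-check for the totals these rules are meant to produce, the following is a minimal plain C sketch that counts overlapping occurrences without using lex. The function name count_overlapping is hypothetical and is not part of lex; it only models the values expected in s and h.

```c
#include <string.h>

/* Hypothetical helper (not part of lex): counts every occurrence of
 * pattern in text, including occurrences that overlap or lie inside
 * other matches, by restarting the search one character further on. */
int count_overlapping(const char *text, const char *pattern)
{
    int count = 0;
    const char *p = text;

    if (*pattern == '\0')
        return 0;               /* an empty pattern is not counted */

    while ((p = strstr(p, pattern)) != NULL) {
        count++;
        p++;                    /* advance by one so inner matches are seen */
    }
    return count;
}
```

For the input "she sells the shells", this gives 2 for she and 3 for he, which is what the rules with REJECT should accumulate in s and h.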

Adding results to the yytext array

Typically, the next string from the input stream overwrites the current entry in the yytext array. If you call the yymore subroutine, the next string from the input stream is instead appended to the end of the current entry in the yytext array.

For example, the following lexical analyzer looks for strings:

%s instring
%%
<INITIAL>\"   { /* start of string */
    BEGIN instring;
    yymore();
    }
<instring>\"  { /* end of string */
    printf("matched %s\n", yytext);
    BEGIN INITIAL;
    }
<instring>.   {
    yymore();
    }
<instring>\n  {
    printf("error, new line in string\n");
    BEGIN INITIAL;
    }

Even though a string may be recognized by matching several rules, the repeated calls to the yymore subroutine ensure that the yytext array contains the entire string.

Returning characters to the input stream

To return characters to the input stream, use the following call:

yyless(n)

Here n is the number of characters of the current string to keep. Characters in the string beyond this number are returned to the input stream. The yyless subroutine provides the same kind of look-ahead function as the / (slash) operator, but it allows more control over its use.

Use the yyless subroutine to process text more than once. For example, when parsing a C language program, an expression such as x=-a is ambiguous. Does it mean x is equal to minus a, or x -= a (meaning decrease x by the value of a)? To treat this expression as x is equal to minus a, but print a warning message, use a rule such as the following:

=-[a-zA-Z] {
    printf("Operator (=-) ambiguous\n");
    yyless(yyleng-2);
    ... action for = ...
    }

Calling yyless(yyleng-2) keeps only the = and returns the minus sign and the letter to the input stream, so they are scanned again. To treat the operator as =- instead, call yyless(yyleng-1), which returns only the letter.

Input/output subroutines

The lex program allows a program to use the following input/output (I/O) subroutines:

input()     Returns the next input character.
output(c)   Writes the character c to the output.
unput(c)    Pushes the character c back onto the input stream to be read later by the input subroutine.
winput()    Returns the next multibyte input character.
woutput(C)  Writes the multibyte character C to the output stream.
wunput(C)   Pushes the multibyte character C back onto the input stream to be read by the winput subroutine.

The lex program provides these subroutines as macro definitions. The subroutines are coded in the lex.yy.c file. You can override them and provide other versions.

The winput, wunput, and woutput macros are defined to use the yywinput, yywunput, and yywoutput subroutines. For compatibility, the yy subroutines in turn use the input, unput, and output subroutines to read, replace, and write the necessary number of bytes in a complete multibyte character.

These subroutines define the relationship between external files and internal characters. If you change the subroutines, change them all in the same way. These subroutines should follow these rules:

    • All subroutines must use the same character set.
    • The input subroutine must return a value of 0 to indicate the end of the file.
    • Do not change the relationship between the unput subroutine and the input subroutine. Otherwise, the look-ahead functions do not work.
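
The rules above can be illustrated with a minimal C sketch of a replacement input/unput pair that reads from an in-memory string. The names my_input, my_unput, my_set_source, and PUSHBACK_MAX are hypothetical. A real replacement would redefine the input and unput macros in the description file, and the details differ between lex implementations, so treat this only as an outline of the contract: one shared pushback buffer, and 0 returned at end of input.

```c
#include <stddef.h>

#define PUSHBACK_MAX 200            /* assumed limit for this sketch */

static const char *src = "";        /* current read position in the source */
static char pushback[PUSHBACK_MAX]; /* characters returned by my_unput */
static size_t pushed = 0;           /* number of characters in pushback */

/* Hypothetical helper: selects the string to scan and clears the pushback. */
void my_set_source(const char *text)
{
    src = text;
    pushed = 0;
}

/* Returns the next character, or 0 at end of input.
 * Pushed-back characters are delivered first, most recent first. */
int my_input(void)
{
    if (pushed > 0)
        return (unsigned char)pushback[--pushed];
    if (*src == '\0')
        return 0;
    return (unsigned char)*src++;
}

/* Pushes c back so that the next my_input call returns it.
 * Returns 0 on success, or -1 if the pushback buffer is full. */
int my_unput(int c)
{
    if (pushed >= PUSHBACK_MAX)
        return -1;
    pushback[pushed++] = (char)c;
    return 0;
}
```

Because both functions share the same buffer, a character handed to my_unput is always the next one my_input returns, which is the relationship that look-ahead depends on.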

The lex.yy.c file allows the lexical analyzer to back up a maximum of 200 characters.

To read a file that contains null characters, create a different version of the input subroutine. In the normal version of the input subroutine, the value 0 returned for a null character is interpreted as the end of the file and ends the input.

Character set

The lexical analyzers that the lex command generates process character I/O through the input, output, and unput subroutines. Therefore, to return values in yytext, the lex command uses the character representation that these subroutines use. Internally, however, the lex command represents each character with a small integer. When the standard library is used, this integer is the value of the bit pattern that the computer uses to represent the character. Normally, the letter a is represented in the same form as the character constant 'a'. If you change this interpretation with different I/O subroutines, put a translation table in the definitions section of the description file. The translation table begins and ends with lines that contain only the following entry:

%T

The translation table contains additional lines that indicate the value associated with each character. For example:

%T
{integer} {character string}
%T

End-of-file processing

When the lexical analyzer reaches the end of a file, it calls the yywrap library subroutine. This call returns a value of 1, which tells the lexical analyzer to continue with its normal wrap-up at the end of input.

However, if the lexical analyzer receives input from more than one source, replace the yywrap subroutine. The new function must obtain the new input and return a value of 0 to the lexical analyzer. A return value of 0 indicates that the program should continue processing.

You can also include code in a new version of the yywrap subroutine to print summary reports and tables when the lexical analyzer ends. The yywrap subroutine is the only way to force the yylex subroutine to recognize the end of input.
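
A replacement yywrap for several input files might look like the following minimal sketch. It assumes the scanner reads from the yyin file pointer, as generated scanners commonly do; the helper set_pending_files and the variables pending_files, pending_count, pending_next, and files_opened are hypothetical names introduced only for this illustration. The yyin declaration here is a stand-in so that the fragment compiles on its own.

```c
#include <stdio.h>

/* Stand-in for the variable that lex.yy.c normally defines; make this an
 * extern declaration when linking with a generated scanner. */
FILE *yyin;

static const char **pending_files;  /* hypothetical list of remaining inputs */
static int pending_count;           /* number of entries in pending_files */
static int pending_next;            /* index of the next file to try */
static int files_opened;            /* counter for the summary report */

/* Hypothetical helper: registers the files to scan after the current one. */
void set_pending_files(const char **names, int count)
{
    pending_files = names;
    pending_count = count;
    pending_next = 0;
}

/* Called at end of file: returns 0 after switching yyin to the next
 * readable file, or 1 when no input is left. */
int yywrap(void)
{
    if (yyin != NULL && yyin != stdin)
        fclose(yyin);
    yyin = NULL;

    while (pending_next < pending_count) {
        FILE *next = fopen(pending_files[pending_next++], "r");
        if (next != NULL) {
            yyin = next;
            files_opened++;
            return 0;           /* more input: continue scanning */
        }
    }

    /* No input left: print a short summary, then request normal wrap-up. */
    fprintf(stderr, "additional files opened: %d\n", files_opened);
    return 1;
}
```

Files that cannot be opened are skipped silently in this sketch; a real program would probably report them.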
