Problem
Loglan is a synthetic speakable language designed to test some of the fundamental problems of linguistics, such as the Sapir-Whorf hypothesis. It is syntactically unambiguous, culturally neutral and metaphysically parsimonious. What follows is a gross over-simplification of an already very small grammar of some 200 rules.
Loglan sentences consist of a series of words and names, separated by spaces, and are terminated by a period (.). Loglan words all end with a vowel; names, which are derived extra-linguistically, end with a consonant. Loglan words are divided into two classes: little words, which specify the structure of a sentence, and predicates, which have the form CCVCV or CVCCV, where C represents a consonant and V represents a vowel (see the examples later).
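The shape rules above can be checked mechanically. The following is a minimal sketch, assuming lowercase ASCII input; the helper names `isVowel`, `isName` and `isPredicate` are hypothetical and are not taken from the solution further down.

```cpp
#include <cctype>
#include <string>

// Hypothetical helpers for illustration only.
bool isVowel(char c) {
    return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
}

// A name ends with a consonant, i.e. with anything that is not a vowel.
bool isName(const std::string& w) {
    return !w.empty() && !isVowel(w.back());
}

// A predicate has exactly five lowercase letters shaped CCVCV or CVCCV.
bool isPredicate(const std::string& w) {
    if (w.size() != 5) return false;
    std::string shape;
    for (char c : w) {
        if (!std::islower(static_cast<unsigned char>(c))) return false;
        shape += isVowel(c) ? 'V' : 'C';
    }
    return shape == "CCVCV" || shape == "CVCCV";
}
```

For instance, `mrenu` has the shape CCVCV and `mutce` has the shape CVCCV, so both count as predicates, while `djan` ends with a consonant and is therefore a name.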
The subset of Loglan that we are considering uses the following grammar:
A → a | e | i | o | u
MOD → ga | ge | gi | go | gu
BA → ba | be | bi | bo | bu
DA → da | de | di | do | du
LA → la | le | li | lo | lu
NAM → {all names}
PREDA → {all predicates}
<sentence> → <statement> | <predclaim>
<predclaim> → <predname> BA <preds> | DA <preds>
<preds> → <predstring> | <preds> A <predstring>
<predname> → LA <predstring> | NAM
<predstring> → PREDA | <predstring> PREDA
<statement> → <predname> <verbpred> <predname> | <predname> <verbpred>
<verbpred> → MOD <predstring>
In the grammar above, NAM stands for all names and PREDA stands for all predicates.
Write a program that will read a succession of strings and determine whether or not they are correctly formed Loglan sentences.
Input and Output
Each Loglan sentence will start on a new line and will be terminated by a period (.). The sentence may occupy more than one line, and words may be separated by more than one whitespace character. The input will be terminated by a line containing a single '#'. You can assume that all words will be correctly formed.
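One way to handle this input format is to ignore line boundaries altogether and split on whitespace. The sketch below takes the whole input as a string and returns one word list per sentence. The function name `readSentences` is hypothetical, and the sketch assumes that a period is always the last character of a token.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split raw input into sentences; each sentence is a list of words.
// Reading stops at the token "#". Assumes '.' only appears at the end of a token.
std::vector<std::vector<std::string>> readSentences(const std::string& text) {
    std::vector<std::vector<std::string>> sentences;
    std::vector<std::string> current;
    std::istringstream in(text);
    for (std::string tok; in >> tok && tok != "#";) {
        bool ends = tok.back() == '.';        // tok is never empty after >>
        if (ends) tok.pop_back();             // strip the period
        if (!tok.empty()) current.push_back(tok);
        if (ends) {                           // sentence complete
            sentences.push_back(current);
            current.clear();
        }
    }
    return sentences;
}
```

Because extraction with `>>` skips any run of whitespace, sentences spread over several lines and words separated by multiple spaces need no special treatment.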
Output will consist of one line for each sentence containing either 'Good' or 'Bad!'.
Sample Input
la mutce bunbo mrenu bi ditca.
la fumna bi le mrenu.
djan ga vedma le negro ketpi.
#
Sample output
Good
Bad!
Good
Analysis
For anyone who has not studied compiler theory, this problem is quite hard. The analysis below uses plain language wherever possible and introduces the related concepts only briefly.
The grammar in the problem is given as production rules, which borrow some basic notation from regular expressions. A production defines one derivation rule: the left-hand side is the symbol being defined, and the arrow points to a sequence of symbols. The vertical bar "|" means "or". For example:
A → a | e | i | o | u
This says that any one of the symbols a, e, i, o and u can be replaced by (reduced to) the symbol A. Another example:
<statement> → <predname> <verbpred> <predname> | <predname> <verbpred>
This says that the three symbols <predname> <verbpred> <predname>, in this left-to-right order, can be replaced by the symbol <statement>. Note that <predname> <verbpred> appears on both sides of the "|": <statement> can be derived as long as the prefix <predname> <verbpred> is present, whether or not a trailing <predname> follows. If a trailing <predname> is present, whether it is reduced together with the prefix depends on the reduction strategy we choose. The problem asks whether an arbitrary sentence is grammatical, that is, whether the grammar can reduce the sentence. If the symbol <sentence> can be derived, the sentence is correct.
Examining the productions shows that the grammar is unambiguous, but it is not in regular form and needs to be transformed. In the grammar, PREDA is a terminal (it stands for the predicates themselves and is not derived from any other symbol). PREDA derives <predstring>, and <predstring> followed by PREDA derives <predstring> again. The regular expression for <predstring> is therefore:
PREDA+
Here "+" denotes a string of one or more consecutive PREDA. Apart from its own production, <predstring> is only ever used as a suffix in other productions, so consecutive PREDA can be merged directly without causing ambiguity. The production is rewritten as:
PREDA → PREDA PREDA
In this way any run of consecutive PREDA is merged into a single PREDA, which is then converted to <predstring>:
<predstring> → PREDA
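The merging of consecutive PREDA symbols can be illustrated on its own. The sketch below works on a space-separated list of symbol names, which is an assumption made purely for readability (the solution further down uses an enumeration instead); the function name `collapsePreda` is hypothetical.

```cpp
#include <sstream>
#include <string>

// Collapse every run of consecutive "PREDA" symbols into a single "PREDA".
// Input and output are space-separated symbol names.
std::string collapsePreda(const std::string& symbols) {
    std::istringstream in(symbols);
    std::string out, prev;
    for (std::string s; in >> s; prev = s) {
        if (s == "PREDA" && prev == "PREDA") continue;  // same run: skip
        if (!out.empty()) out += ' ';
        out += s;
    }
    return out;
}
```

Applied to `LA PREDA PREDA PREDA BA PREDA`, this yields `LA PREDA BA PREDA`, after which each remaining PREDA can be rewritten to <predstring>.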
NAM, DA and BA are terminals, but their productions contain no self-loops and need no transformation. Explaining what a self-loop is would require many further concepts such as finite automata (DFA), closures and state diagrams, which are not the focus of this article. The symbol A also has to be handled: <predstring> directly derives <preds>, and <preds> followed by A <predstring> derives <preds> again, so the <preds> production can be rewritten as:
<predstring> → <predstring> A <predstring>
<preds> → <predstring>
Drawing a state diagram makes this easy to see. The last production to handle is:
<statement> → <predname> <verbpred> <predname> | <predname> <verbpred>
Clearly <predname> <verbpred> can be treated as a unit. To simplify the production we introduce a new symbol <predverb>:
<predverb> → <predname> <verbpred>
The statement production can then be rewritten in regular form:
<statement> → <predverb> <predname> | <predverb>
The transformed grammar is as follows:
A → a | e | i | o | u
MOD → ga | ge | gi | go | gu
BA → ba | be | bi | bo | bu
DA → da | de | di | do | du
LA → la | le | li | lo | lu
NAM → {all names}
PREDA → {all predicates} | PREDA PREDA
<sentence> → <statement> | <predclaim>
<predclaim> → <predname> BA <preds> | DA <preds>
<preds> → <predstring>
<predname> → LA <predstring> | NAM
<predstring> → PREDA | <predstring> A <predstring>
<predverb> → <predname> <verbpred>
<statement> → <predverb> <predname> | <predverb>
<verbpred> → MOD <predstring>
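Because the grammar contains no nesting, one way to cross-check a solution is to encode each word as a single letter and match the whole sentence against a regular expression. This is a different technique from the table-driven reduction used in this article, shown only as a sketch. The letter codes and the names `encodeWord` and `isGoodSentence` are assumptions introduced here, and the sentence is passed without its final period.

```cpp
#include <regex>
#include <sstream>
#include <string>

// One letter per terminal class: a=A, m=MOD, b=BA, d=DA, l=LA, n=NAM, p=PREDA.
char encodeWord(const std::string& w) {
    const std::string vowels = "aeiou";
    if (vowels.find(w.back()) == std::string::npos) return 'n';  // name
    if (w.size() == 1) return 'a';
    if (w.size() == 2) {
        switch (w[0]) {
            case 'g': return 'm';
            case 'b': return 'b';
            case 'd': return 'd';
            case 'l': return 'l';
        }
        return '?';  // unknown little word
    }
    return 'p';  // assumes every other vowel-final word is a well-formed predicate
}

bool isGoodSentence(const std::string& sentence) {
    static const std::regex grammar(
        "(n|lp+)mp+(n|lp+)?"        // <statement>
        "|((n|lp+)b|d)p+(ap+)*");   // <predclaim>
    std::istringstream in(sentence);
    std::string code;
    for (std::string w; in >> w;) code += encodeWord(w);
    return std::regex_match(code, grammar);
}
```

On the three sample sentences this sketch accepts the first and third and rejects the second, which agrees with the sample output.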
With this grammar the correctness of a sentence can be judged quickly. First, convert every input string into a word (a predicate or a little word) or a name. Whether a string is a word or a name is decided by whether its last letter is a vowel; a name is converted to NAM. A word would normally be looked up in a table built from the grammar above (in practice no table lookup is needed; see the code). Once the whole sentence has been converted, the symbols are derived in the following order:
1. PREDA → PREDA PREDA
2. <predstring> → PREDA
3. <predname> → NAM
4. <predname> → LA <predstring>
5. <verbpred> → MOD <predstring>
6. <predstring> → <predstring> A <predstring>
7. <preds> → <predstring>
8. <predclaim> → DA <preds>
9. <predclaim> → <predname> BA <preds>
10. <predverb> → <predname> <verbpred>
11. <statement> → <predverb> <predname>
12. <statement> → <predverb>
13. <sentence> → <predclaim>
14. <sentence> → <statement>
If <sentence> is derived at the end, the sentence is grammatical; otherwise it is not. The derivation is implemented with insertions and deletions on a dynamic array. If a symbol can be derived into another symbol directly, it is overwritten with the derived symbol. If it can only be derived together with a prefix, the prefix is deleted and the current symbol is changed to the derived symbol; a suffix is handled in the same way. For details of the algorithm, see the comments in the code.
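A single rule application of this kind can be sketched as follows. For readability the sketch uses space-separated symbol names rather than the enumeration of the actual solution, and the function name `applyRule` and its parameters are hypothetical. An empty `prefix` or `suffix` means that no neighbour is required.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Apply one rule to a space-separated symbol list and return the new list.
// Every `center` with the required neighbours is rewritten to `result`,
// and the neighbours it consumed are deleted from the array.
// Note: a rule with no neighbours whose result equals its center would
// never terminate; none of the rules in this article has that form.
std::string applyRule(const std::string& symbols, const std::string& center,
                      const std::string& prefix, const std::string& suffix,
                      const std::string& result) {
    std::vector<std::string> s;
    std::istringstream in(symbols);
    for (std::string w; in >> w;) s.push_back(w);

    for (std::size_t i = 0; i < s.size();) {
        bool ok = s[i] == center;
        if (ok && !prefix.empty()) ok = i > 0 && s[i - 1] == prefix;
        if (ok && !suffix.empty()) ok = i + 1 < s.size() && s[i + 1] == suffix;
        if (!ok) { ++i; continue; }
        if (!suffix.empty()) s.erase(s.begin() + i + 1);  // drop consumed suffix
        if (!prefix.empty()) s.erase(s.begin() + --i);    // drop consumed prefix
        s[i] = result;  // stay on i: the rewritten symbol may match again
    }

    std::string out;
    for (const std::string& w : s) out += (out.empty() ? "" : " ") + w;
    return out;
}
```

For example, `applyRule("LA PS BA PS A PS", "A", "PS", "PS", "PS")` merges the last three symbols and returns `LA PS BA PS`.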
The order of the rows in the program's conversion table is the same as the order of the derivation steps above, but many of the steps can be reordered. The constraint on the order is that, before a step is executed, every symbol on the right-hand side of its rule must already be fully derived, that is, no remaining production can still generate any of those symbols. This guarantees that each symbol is derived in a single pass. If this condition is not met, backtracking is required; otherwise errors may occur. For example, if the rule <predclaim> → DA <preds> is applied before <preds> → <predstring>, the sentence does not yet contain any <preds>, so DA can never derive <predclaim>. Likewise, if <preds> → <predstring> is applied before <predstring> → <predstring> A <predstring>, every <predstring> has already become <preds> and the A rule can never be completed, because A can only be reduced when it has a <predstring> on both sides.
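The effect of the ordering can be reproduced with a tiny reducer. The sketch below encodes symbols as single characters (D = DA, S = <predstring>, A = A, P = <preds>, C = <predclaim>); this encoding, the `Rule` struct and the function names are assumptions made for illustration, not part of the solution.

```cpp
#include <string>
#include <vector>

struct Rule { char center, prefix, suffix, result; };  // 0 = no neighbour needed

// Apply the rules once each, in the given order, to a string of symbol codes.
std::string reduce(std::string s, const std::vector<Rule>& rules) {
    for (const Rule& r : rules) {
        for (std::size_t i = 0; i < s.size();) {
            bool ok = s[i] == r.center
                   && (!r.prefix || (i > 0 && s[i - 1] == r.prefix))
                   && (!r.suffix || (i + 1 < s.size() && s[i + 1] == r.suffix));
            if (!ok) { ++i; continue; }
            if (r.suffix) s.erase(i + 1, 1);  // delete the consumed suffix
            if (r.prefix) s.erase(--i, 1);    // delete the consumed prefix
            s[i] = r.result;
        }
    }
    return s;
}

const Rule mergeA  {'A', 'S', 'S', 'S'};  // <predstring> -> <predstring> A <predstring>
const Rule toPreds {'S', 0, 0, 'P'};      // <preds> -> <predstring>
const Rule daClaim {'D', 0, 'P', 'C'};    // <predclaim> -> DA <preds>

std::string goodOrder(const std::string& s) { return reduce(s, {mergeA, toPreds, daClaim}); }
std::string badOrder(const std::string& s)  { return reduce(s, {toPreds, mergeA, daClaim}); }
```

With the input `DSAS` (DA <predstring> A <predstring>), `goodOrder` reduces everything to the single symbol `C`, while `badOrder` leaves the A stranded between two symbols that are no longer <predstring>, so the sentence never collapses to one symbol.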
In the implementation, a rule may be anchored on any of its symbols, with the remaining symbols treated as prefix or suffix. For example, <statement> can be derived by treating <predname> as the suffix of <predverb>, or equally by treating <predverb> as the prefix of <predname>.
The abbreviations used for the symbols in the enumeration type are as follows:
UN: unknown/error
PS: predstring
P: preds
PN: predname
SE: sentence
VP: verbpred
PV: predverb
PC: predclaim
ST: statement
The other symbol names are the same as those given in the problem.
Solution
#include <cctype>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Enumeration of all symbols; see the abbreviation list above.
enum SYMBOL {A, MOD, LA, BA, DA, PREDA, NAM, SE, PC, P, PN, PS, ST, VP, PV, UN};

// aVowel[c - 'a'] is true when the lowercase letter c is a vowel.
bool aVowel[] = {1,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0};

// Conversion table, one rule per row, applied in order.
// Columns: 1: symbol to match, 2: required prefix, 3: required suffix,
// 4: derived symbol. UN in column 2 or 3 means "no neighbour required".
const static SYMBOL aConvTbl[14][4] = {
    {PREDA, UN, PREDA, PREDA}, {PREDA, UN, UN, PS}, {NAM, UN, UN, PN},
    {LA, UN, PS, PN}, {MOD, UN, PS, VP}, {A, PS, PS, PS}, {PS, UN, UN, P},
    {DA, UN, P, PC}, {BA, PN, P, PC}, {VP, PN, UN, PV}, {PV, UN, PN, ST},
    {PV, UN, UN, ST}, {PC, UN, UN, SE}, {ST, UN, UN, SE},
};

// Convert an input word into its symbol.
SYMBOL Token2Status(const string &str) {
    int nNum = str.length(), cLast = str[nNum - 1];
    if (!islower(cLast) || !aVowel[cLast - 'a']) {
        return NAM;  // does not end with a vowel: a name
    }
    switch (nNum) {
    case 1: return A;  // a single vowel can only be A
    case 5:
        // Pack the vowel flags of the five letters into bits 4..0 and
        // compare with CCVCV (00101 = 5) or CVCCV (01001 = 9).
        nNum = aVowel[str[4] - 'a'];
        nNum |= ((aVowel[str[0] - 'a'] << 4) | (aVowel[str[1] - 'a'] << 3));
        nNum |= ((aVowel[str[2] - 'a'] << 2) | (aVowel[str[3] - 'a'] << 1));
        return (nNum == 5 || nNum == 9) ? PREDA : UN;
    case 2:
        switch (str[0]) {  // the first letter decides the group
        case 'g': return MOD;
        case 'b': return BA;
        case 'd': return DA;
        case 'l': return LA;
        }
    }
    return UN;  // unrecognized symbol
}

int main(void) {
    vector<SYMBOL> Set;
    for (string str; cin >> str && str != "#";) {  // read word by word
        string::size_type nDot = str.find('.');
        if (nDot == string::npos) {  // no period: the sentence continues
            Set.push_back(Token2Status(str));
            continue;
        }
        // A period was found: the sentence ends with this token.
        str.erase(str.length() - 1);  // delete the period
        if (!str.empty()) {
            Set.push_back(Token2Status(str));
        }
        // Syntax analysis: apply each rule of the table in order.
        for (int i = 0; i < 14; ++i) {
            const SYMBOL *pTbl = aConvTbl[i];
            for (vector<SYMBOL>::iterator j = Set.begin(); j != Set.end();) {
                vector<SYMBOL>::iterator iBeg = Set.begin(), iEnd = Set.end();
                if (*j != pTbl[0]) {  // not the symbol of this rule
                    ++j;
                    continue;
                }
                // If a prefix or suffix is required, check that it is present.
                if (pTbl[1] != UN && (j == iBeg || *(j - 1) != pTbl[1])) {
                    ++j;
                    continue;
                }
                if (pTbl[2] != UN && (j == iEnd - 1 || *(j + 1) != pTbl[2])) {
                    ++j;
                    continue;
                }
                // Delete the consumed neighbours, keeping j on the current symbol.
                j = pTbl[1] != UN ? Set.erase(j - 1) : j;
                j = pTbl[2] != UN ? Set.erase(j + 1) - 1 : j;
                *j = pTbl[3];  // rewrite the current symbol to the derived one
            }
        }
        cout << (Set.size() == 1 && Set[0] == SE ? "Good" : "Bad!") << endl;
        Set.clear();  // prepare for the next sentence
    }
    return 0;
}