As mentioned in (1), the lexer is the tool that cuts an SQL string apart and converts the statement into an array of tokens.
The parser completes the later stage of SQL parsing: it uses a lexer object to cut out the tokens, interprets their meaning, and binds the result to the relevant system interface.
Here we should first review the SQL syntax that simpledb supports, because it determines how the parser processes the string:
<Query> := SELECT <SelectList> FROM <TableList> [ WHERE <Predicate> ]
<SelectList> := <Field> [ , <SelectList> ]
<TableList> := TableName [ , <TableList> ]
<Predicate> := <Term> [ AND <Predicate> ]
<Term> := <Expression> = <Expression>
<Expression> := <Field> | <Constant>
<Field> := FieldName
<Constant> := String | Integer
<Modify> := <Insert> | <Delete> | <Update>
<Insert> := INSERT INTO TableName ( <FieldList> ) VALUES ( <ConstList> )
<FieldList> := <Field> [ , <FieldList> ]
<ConstList> := <Constant> [ , <ConstList> ]
<Delete> := DELETE FROM TableName [ WHERE <Predicate> ]
<Update> := UPDATE TableName SET <Field> = <Expression> [ WHERE <Predicate> ]
<Create> := <CreateTable> | <CreateView> | <CreateIndex>
<CreateTable> := CREATE TABLE TableName ( <FieldDefs> )
<FieldDefs> := <FieldDef> [ , <FieldDefs> ]
<FieldDef> := FieldName <TypeDef>
<TypeDef> := INT | VARCHAR ( Integer )
<CreateView> := CREATE VIEW ViewName AS <Query>
<CreateIndex> := CREATE INDEX IndexName ON TableName ( <Field> )
The parser follows the "single-pass compilation" approach from compiler technology: it scans the token sequence once through lex and recognizes each token as it goes, building results in the order
Constant/Field -> Expression -> Term -> Predicate
to complete the parameter packaging and output a data class associated with the statement, as shown in Class 1 in (1).
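The order Constant/Field -> Expression -> Term -> Predicate can be illustrated with a small stand-alone sketch. It is not simpledb's code: the class name ChainSketch, the pre-split Queue<string> that stands in for the lexer, and the plain-string results are all assumptions made to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: a Queue<string> of pre-split tokens
// stands in for simpledb's lexer, and parse results are plain strings.
public static class ChainSketch
{
    // <Expression> := <Field> | <Constant>
    // A token that starts with a single quote or parses as an integer
    // is treated as a constant; anything else is treated as a field name.
    public static string Expression(Queue<string> tokens)
    {
        string tok = tokens.Dequeue();
        bool isConstant = tok.StartsWith("'", StringComparison.Ordinal)
                          || int.TryParse(tok, out _);
        return (isConstant ? "const:" : "field:") + tok;
    }

    // <Term> := <Expression> = <Expression>
    public static string Term(Queue<string> tokens)
    {
        string lhs = Expression(tokens);
        string eq = tokens.Dequeue();
        if (eq != "=")
            throw new FormatException("expected '=' but found " + eq);
        string rhs = Expression(tokens);
        return "(" + lhs + " = " + rhs + ")";
    }

    // <Predicate> := <Term> [ AND <Predicate> ]
    public static List<string> Predicate(Queue<string> tokens)
    {
        var terms = new List<string> { Term(tokens) };
        if (tokens.Count > 0 &&
            string.Equals(tokens.Peek(), "and", StringComparison.OrdinalIgnoreCase))
        {
            tokens.Dequeue();                    // consume AND
            terms.AddRange(Predicate(tokens));   // parse the rest recursively
        }
        return terms;
    }
}
```

Given the tokens sid, =, '000000', the Predicate method returns a list with the single term (field:sid = const:'000000'). Each grammar rule becomes one method, and the smaller units are recognized first and then wrapped by the larger ones.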
First, take the example "select sid, sname from students where sid = '000000'" from (1), and look at the syntax tree that is constructed while the parser processes it:
Figure 1: Example syntax tree
The code for the query() method is as follows:
public QueryData query()
{
    // SELECT <SelectList>
    lex.eatKeyword("select");
    List<string> fields = selectList();
    // FROM <TableList>
    lex.eatKeyword("from");
    List<string> tables = tableList();
    // Optional WHERE <Predicate>; the default Predicate is used when there is no WHERE
    Predicate pred = new Predicate();
    if (lex.matchKeyword("where"))
    {
        lex.eatKeyword("where");
        pred = predicate();
    }
    return new QueryData(fields, tables, pred);
}
As the code and the preceding syntax tree show, simpledb parses an SQL statement strictly according to the grammar, using the fixed keywords as anchor points: the fields go into the field list fieldlist, the table names into tablelist, and the conditions into predicates. The data that the query actually uses is then packed into the corresponding object, which completes the parsing of the statement. In the preceding example, once the QueryData object has been packaged, it is passed to the query processing module, whose methods read the data according to fieldlist, tablelist, and predicates.
In the parser, the create(), delete(), insert(), query(), and modify() methods correspond to the grammar rules above.
The updateCmd method, the entry point for parsing statements other than queries, is shown below. It branches on the first token of the SQL statement:
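To make the packaging step concrete, here is a minimal sketch of what such a data class might look like. The name QueryDataSketch, its members, and the use of a plain string for the predicate are assumptions for illustration; the actual QueryData class is the one described in (1).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for QueryData: it only stores what the parser
// extracted and performs no query processing itself.
public sealed class QueryDataSketch
{
    public IReadOnlyList<string> Fields { get; }
    public IReadOnlyList<string> Tables { get; }
    public string Predicate { get; }   // empty string means "no WHERE clause"

    public QueryDataSketch(List<string> fields, List<string> tables, string predicate)
    {
        Fields = fields.AsReadOnly();
        Tables = tables.AsReadOnly();
        Predicate = predicate ?? "";
    }

    // Rebuilds an SQL string from the stored parts, which is handy for debugging.
    public override string ToString()
    {
        string sql = "select " + string.Join(", ", Fields)
                   + " from " + string.Join(", ", Tables);
        if (Predicate.Length > 0)
            sql += " where " + Predicate;
        return sql;
    }
}
```

The object is a passive container: the parser fills it once, and a later module can read Fields, Tables, and Predicate without knowing anything about tokens.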
public object updateCmd()
{
    // Dispatch on the first token; anything else is handed to create()
    if (lex.matchKeyword("insert"))
        return insert();
    else if (lex.matchKeyword("delete"))
        return delete();
    else if (lex.matchKeyword("update"))
        return modify();
    else
        return create();
}
Also note how the preceding query() code uses the lexer: the match*() methods check whether the next token meets a condition, and the eat*() methods consume a token that meets it:
public void eatDelim(char d);
public string eatId();
public int eatIntConstant();
public void eatKeyword(string w);
public string eatStringConstant();
The methods that carry data (eatId, eatIntConstant, and eatStringConstant) return the value they have consumed. The others (eatDelim and eatKeyword) return nothing, but they still move the position pointer forward through nextToken.
The table list and the field list are built recursively:
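The division of labor between match*() and eat*() can be sketched with a miniature lexer. This is not simpledb's lexer: the class MiniLexer, its regular-expression tokenizer, and the PascalCase method names are assumptions chosen to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical miniature lexer; not simpledb's implementation.
public sealed class MiniLexer
{
    private readonly List<string> tokens = new List<string>();
    private int pos = 0;   // index of the current (not yet consumed) token

    public MiniLexer(string sql)
    {
        // Tokens: quoted strings, words or numbers, or single delimiter characters.
        foreach (Match m in Regex.Matches(sql, @"'[^']*'|\w+|[^\s\w]"))
            tokens.Add(m.Value);
    }

    private string Current => pos < tokens.Count ? tokens[pos] : null;

    // Match*: inspect the current token without consuming it.
    public bool MatchKeyword(string w) =>
        string.Equals(Current, w, StringComparison.OrdinalIgnoreCase);

    public bool MatchDelim(char d) => Current == d.ToString();

    public bool MatchStringConstant() =>
        Current != null && Current.Length >= 2 && Current[0] == '\'';

    // Eat*: verify the current token, then advance the position.
    public void EatKeyword(string w)
    {
        if (!MatchKeyword(w)) throw new FormatException("expected keyword " + w);
        NextToken();
    }

    public void EatDelim(char d)
    {
        if (!MatchDelim(d)) throw new FormatException("expected delimiter " + d);
        NextToken();
    }

    // A data-carrying token returns its value; here the quotes are stripped.
    public string EatStringConstant()
    {
        if (!MatchStringConstant()) throw new FormatException("expected string constant");
        string value = Current.Substring(1, Current.Length - 2);
        NextToken();
        return value;
    }

    private void NextToken() => pos++;
}
```

Because Match* never moves the position, a caller can test several alternatives in a row and only commit to one of them with the matching Eat* call.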
private List<string> fieldList()
{
    List<string> l = new List<string>();
    l.Add(field());
    if (lex.matchDelim(','))
    {
        lex.eatDelim(',');
        l.AddRange(fieldList());
    }
    return l;
}
The predicate is parsed recursively as well:
private Predicate predicate()
{
    Predicate pred = new Predicate(term());
    if (lex.matchKeyword("and"))
    {
        lex.eatKeyword("and");
        pred.conjoinWith(predicate());
    }
    return pred;
}
The difference is that predicates are joined with the conjoinWith method of the Predicate class. A Predicate object maintains a list of conditions, terms, and this method adds the terms of the other predicate to that list so that all of them are stored together.
This article and the previous one briefly describe the ideas and the process behind SQL statement parsing. The result of parsing is a data object for each kind of statement; it is passed to the query module, which uses that data to carry out the query.
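A minimal sketch of this idea follows. The class PredicateSketch and its use of strings as terms are assumptions for illustration; the actual Predicate class works on term objects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Predicate class; a term is just a string here.
public sealed class PredicateSketch
{
    private readonly List<string> terms = new List<string>();

    public PredicateSketch() { }                               // empty predicate: no condition
    public PredicateSketch(string term) { terms.Add(term); }   // predicate with a single term

    // Appends every term of the other predicate to this one, so the
    // combined predicate is the conjunction (AND) of all stored terms.
    public void ConjoinWith(PredicateSketch other)
    {
        terms.AddRange(other.terms);
    }

    public int TermCount => terms.Count;

    public override string ToString() => string.Join(" and ", terms);
}
```

Storing a flat list instead of a nested tree works here because the grammar only allows terms to be combined with AND, so the order and grouping of the terms do not change the meaning.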