Execution
The previous article described how to use Scala's built-in DSL support to implement a parser that can parse SQL. This article describes what happens once we have the parse result (the AST): how to operate on the data and get the results we want. Scala was chosen to implement this engine for two reasons. First, Scala provides convenient support for implementing DSLs. Second, as a functional programming language, Scala provides a rich set of functions for collection operations. In addition, a function is a value with its own type in Scala, so existing functions can be combined into more powerful ones (just as, in the previous article, parser combinators combined existing parsers into more powerful parsers).
First, we need to work out the order in which an ordinary SQL statement is executed. In general, a SQL statement is written in this order:
- SELECT [DISTINCT]
- FROM
- WHERE
- GROUP BY
- HAVING
- UNION
- ORDER BY
However, the order of execution is:
- FROM
- WHERE
- GROUP BY
- HAVING
- SELECT
- DISTINCT
- UNION
- ORDER BY
This implementation of SQL execution does not support joins, and UNION is also omitted, but the overall order is otherwise the same. If you think of a List<Map> as a database table, the order of execution can be shown as in the figure. The green arrows in the figure mark the steps that can be executed concurrently. The execution of an aggregate function is not itself concurrent, but because the data has already been grouped, concurrency is possible at a higher level, across groups.
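To make the execution order concrete, here is a minimal sketch that runs one query by hand over a List of Maps. It is not the engine's code: the object name, the sample table, its column names, and the query itself are all assumptions made for illustration.

```scala
object ExecutionOrderSketch {
  type Row = Map[String, Any]
  type Table = List[Row]

  // Hypothetical sample table; the column names are made up for illustration.
  val employees: Table = List(
    Map("name" -> "Ann", "dept" -> "dev", "salary" -> 120),
    Map("name" -> "Bob", "dept" -> "dev", "salary" -> 100),
    Map("name" -> "Cid", "dept" -> "ops", "salary" -> 90),
    Map("name" -> "Dan", "dept" -> "ops", "salary" -> 40),
    Map("name" -> "Eve", "dept" -> "hr", "salary" -> 70)
  )

  // Hand-written equivalent of:
  //   SELECT dept, SUM(salary) AS total FROM employees
  //   WHERE salary > 50 GROUP BY dept HAVING SUM(salary) > 80 ORDER BY total
  def run(table: Table): List[Row] = {
    def salary(r: Row): Int = r("salary").asInstanceOf[Int]

    table                                        // FROM
      .filter(r => salary(r) > 50)               // WHERE
      .groupBy(r => r("dept"))                   // GROUP BY
      .toList
      .map { case (dept, rows) => (dept, rows.map(salary).sum) } // aggregate per group
      .filter { case (_, total) => total > 80 }  // HAVING
      .map { case (dept, total) =>               // SELECT
        Map[String, Any]("dept" -> dept, "total" -> total)
      }
      .sortBy(r => r("total").asInstanceOf[Int]) // ORDER BY
  }
}
```

Each clause maps onto one collection operation, applied in execution order rather than in the order the statement is written.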
With the overall execution process understood, here are the functions that carry out each step.
- WHERE clause
The WHERE clause is executed with the filter function, which filters out the data that does not satisfy the condition. For example, the following code filters the odd numbers out of a list.
scala> val l = List(1, 2, 3, 4, 5, 6)
l: List[Int] = List(1, 2, 3, 4, 5, 6)
scala> l filter (_ % 2 == 0)
res0: List[Int] = List(2, 4, 6)
The test on each element is independent of the others, so it can be performed concurrently. You only need to write the following to get a safe parallel operation.
scala> l.par.filter(_ % 2 == 0)
res1: scala.collection.parallel.immutable.ParSeq[Int] = ParVector(2, 4, 6)
In the Scala_sql engine, the function that implements WHERE is
def where(where: Option[SqlExpr]): Table = {
  where match {
    case None => table
    case Some(x: SqlExpr) => table filter (evalWhereEachRow(_, x))
  }
}
Here evalWhereEachRow(_, x) calls another function. Its first parameter is a row of the table, a Map object, and its second parameter is the part of the AST produced by parsing the SQL that corresponds to the WHERE clause.
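The source does not show the body of evalWhereEachRow, so the following is only a sketch of what such a row-level evaluator could look like. The AST here is a hypothetical, much-reduced one (GreaterThan, EqualTo, And, IntLiteral are assumed names, as is the evalValue helper), and where takes the table as an explicit parameter to keep the example self-contained.

```scala
object WhereSketch {
  type Row = Map[String, Any]
  type Table = List[Row]

  // Hypothetical, much-reduced AST for a WHERE expression.
  sealed trait SqlExpr
  case class FieldIdent(name: String) extends SqlExpr
  case class IntLiteral(value: Int) extends SqlExpr
  case class GreaterThan(left: SqlExpr, right: SqlExpr) extends SqlExpr
  case class EqualTo(left: SqlExpr, right: SqlExpr) extends SqlExpr
  case class And(left: SqlExpr, right: SqlExpr) extends SqlExpr

  // Evaluate an operand to a plain value for the given row.
  def evalValue(row: Row, expr: SqlExpr): Any = expr match {
    case FieldIdent(name) => row(name)   // look the column up in the row
    case IntLiteral(v)    => v
    case other            => evalWhereEachRow(row, other)
  }

  // Decide whether a single row satisfies the WHERE expression.
  def evalWhereEachRow(row: Row, expr: SqlExpr): Boolean = expr match {
    case And(l, r)     => evalWhereEachRow(row, l) && evalWhereEachRow(row, r)
    case EqualTo(l, r) => evalValue(row, l) == evalValue(row, r)
    case GreaterThan(l, r) =>
      (evalValue(row, l), evalValue(row, r)) match {
        case (a: Int, b: Int) => a > b
        case _                => false   // only Int comparison in this sketch
      }
    case _ => false                      // a bare field or literal is not a predicate here
  }

  // Same shape as the engine's where, with the table passed in explicitly.
  def where(table: Table, cond: Option[SqlExpr]): Table = cond match {
    case None    => table
    case Some(x) => table filter (evalWhereEachRow(_, x))
  }
}
```

Because the evaluator looks only at one row at a time, the filter step remains safe to run over rows in parallel.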
- GROUP BY clause
Scala also provides a groupBy operation for collections. Continuing the previous example:
scala> l groupBy (_ % 2)
res2: scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(1, 3, 5), 0 -> List(2, 4, 6))
This call groups the list l by parity.
In the Scala_sql engine, the function that implements GROUP BY is
def evalGroupBy(table: Table, groupby: SqlGroupBy): Seq[Table] = {
  val keys = groupby.keys map {
    case x: FieldIdent => x.name
  }
  table.groupBy(row => keys.map(row(_))).map(_._2).toSeq
}
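As a rough illustration of grouping rows by key columns, the sketch below works on plain column names instead of the engine's AST types. The names groupRows and countPerGroup are assumptions for this example, and the COUNT(*) step is added only to show how an aggregate can run once per group.

```scala
object GroupBySketch {
  type Row = Map[String, Any]
  type Table = List[Row]

  // Group rows by the values of the key columns, as GROUP BY k1, k2 would.
  // The order of the resulting groups is unspecified.
  def groupRows(table: Table, keys: Seq[String]): Seq[Table] =
    table.groupBy(row => keys.map(row(_))).map(_._2).toSeq

  // COUNT(*) per group, keeping the key columns in the output row.
  def countPerGroup(table: Table, keys: Seq[String]): Seq[Row] =
    groupRows(table, keys).map { group =>
      val keyCols: Row = keys.map(k => k -> group.head(k)).toMap
      keyCols + ("count" -> group.size)
    }
}
```

Each group is itself a small table, so an aggregate function can be evaluated on every group independently of the others.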
Conclusion
There are two main kinds of DSL: internal and external. An external DSL needs a parser that reads the DSL script and produces a data structure the program can process, which is usually an AST. There are basically two ways to implement the parser you need:
- Write one by hand, such as the SQL parsing module in Druid.
- Use a parser generator such as ANTLR, which generates a parser from the grammar rules you write.
In recent years, as functional programming has gradually gained the attention of the industry, writing a parser with parser combinators to implement a DSL has also come into public view.
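The idea behind parser combinators can be sketched in a few lines without any library. This is a hand-rolled illustration, not the combinator library used in the previous article: the Parser type alias and the names token, keyword, both, and mapResult are all assumptions for this example.

```scala
object CombinatorSketch {
  // A parser is just a function: input => optional (result, remaining input).
  type Parser[A] = String => Option[(A, String)]

  // Primitive parser: skip leading whitespace, then match a regex prefix.
  def token(regex: String): Parser[String] = { in =>
    val s = in.dropWhile(_.isWhitespace)
    regex.r.findPrefixOf(s).map(m => (m, s.drop(m.length)))
  }

  // Case-insensitive keyword that must end at a word boundary.
  def keyword(k: String): Parser[String] = token("(?i)" + k + "\\b")
  val ident: Parser[String] = token("[A-Za-z_][A-Za-z0-9_]*")

  // Combinator: run p, then q on what is left, and pair the results.
  def both[A, B](p: Parser[A], q: Parser[B]): Parser[(A, B)] = in =>
    p(in).flatMap { case (a, rest) =>
      q(rest).map { case (b, rest2) => ((a, b), rest2) }
    }

  // Combinator: transform the result of a successful parse.
  def mapResult[A, B](p: Parser[A])(f: A => B): Parser[B] = in =>
    p(in).map { case (a, rest) => (f(a), rest) }

  // Tiny AST and a parser for "SELECT <column> FROM <table>".
  case class Select(column: String, table: String)

  val select: Parser[Select] =
    mapResult(both(both(keyword("select"), ident), both(keyword("from"), ident))) {
      case ((_, col), (_, tbl)) => Select(col, tbl)
    }
}
```

The select parser is built only by combining smaller parsers, which is the same property that lets existing functions be combined into more powerful ones.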
Implementing a SQL execution engine in Scala (Part 2)