1. Stream computing SQL principles and architecture
Stream computing SQL is typically a SQL-like declarative language, used mainly for continuous queries over streaming data (streams). It is built on top of the underlying APIs of common stream computing platforms and frameworks such as Storm, Spark Streaming, Flink, and Beam.
Building a SQL abstraction layer on the simple, widely known SQL language lowers the barrier to real-time development.
The principle of stream computing SQL is simple: it is a bridge between SQL and the underlying stream computing engine. The user submits stream computing SQL, the SQL engine layer translates it into calls to the underlying API, and the result executes on the underlying stream computing engine. On Storm, for example, the SQL is automatically translated into a Storm task topology and runs on the Storm cluster.
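As a minimal sketch of this bridging idea, the following Python translates a deliberately tiny SQL-like query into a chain of generator-based operators, which stand in for the engine's own API. The query grammar, the `translate` helper, and the event fields are all hypothetical; a real engine would emit, for example, a Storm topology rather than Python generators.

```python
import re
from typing import Callable, Dict, Iterable, Iterator

Event = Dict[str, int]

# Hypothetical, deliberately tiny grammar:
#   SELECT <col> FROM <stream> WHERE <col> > <int>
QUERY_PATTERN = re.compile(
    r"SELECT\s+(\w+)\s+FROM\s+(\w+)\s+WHERE\s+(\w+)\s*>\s*(\d+)",
    re.IGNORECASE,
)


def translate(sql: str) -> Callable[[Iterable[Event]], Iterator[int]]:
    """Translate the SQL text into a pipeline of generator-based operators."""
    match = QUERY_PATTERN.fullmatch(sql.strip())
    if match is None:
        raise ValueError(f"unsupported query: {sql!r}")
    select_col, _stream, filter_col, threshold = match.groups()
    limit = int(threshold)

    def pipeline(events: Iterable[Event]) -> Iterator[int]:
        # Filter operator (WHERE) followed by a projection operator (SELECT).
        filtered = (e for e in events if e[filter_col] > limit)
        return (e[select_col] for e in filtered)

    return pipeline


run = translate("SELECT user_id FROM clicks WHERE latency_ms > 100")
events = [
    {"user_id": 1, "latency_ms": 50},
    {"user_id": 2, "latency_ms": 150},
    {"user_id": 3, "latency_ms": 300},
]
result = list(run(events))  # user_ids of events slower than 100 ms
```

The user only writes the query string; everything inside `translate` is the part that a stream computing SQL engine hides.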
The stream computing SQL engine is the core of stream computing SQL. It is mainly responsible for syntax analysis, semantic analysis, logical plan generation, logical plan execution, and physical execution plan generation for the SQL that the user submits. The underlying stream computing platform is what actually performs the computation.
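The stages named above can be illustrated with a toy pipeline. This is only a sketch under stated assumptions: the `CATALOG`, the one-statement grammar, and the `SourceTask`/`MapTask` component names are invented for illustration and do not belong to any real engine.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical catalog: stream name -> column names known to the engine.
CATALOG: Dict[str, List[str]] = {"clicks": ["user_id", "latency_ms"]}


@dataclass
class ParsedQuery:
    column: str
    stream: str


@dataclass
class LogicalNode:
    op: str
    arg: str


def parse(sql: str) -> ParsedQuery:
    """Syntax analysis: only 'SELECT <col> FROM <stream>' is accepted."""
    tokens = sql.split()
    if len(tokens) != 4 or tokens[0].upper() != "SELECT" or tokens[2].upper() != "FROM":
        raise SyntaxError(f"cannot parse: {sql!r}")
    return ParsedQuery(column=tokens[1], stream=tokens[3])


def validate(query: ParsedQuery) -> ParsedQuery:
    """Semantic analysis: the stream and column must exist in the catalog."""
    if query.stream not in CATALOG:
        raise LookupError(f"unknown stream: {query.stream}")
    if query.column not in CATALOG[query.stream]:
        raise LookupError(f"unknown column: {query.column}")
    return query


def to_logical_plan(query: ParsedQuery) -> List[LogicalNode]:
    """Logical plan: engine-independent operators, source first."""
    return [LogicalNode("scan", query.stream), LogicalNode("project", query.column)]


def to_physical_plan(plan: List[LogicalNode]) -> List[str]:
    """Physical plan: map each logical operator to a made-up engine component."""
    mapping = {"scan": "SourceTask", "project": "MapTask"}
    return [f"{mapping[node.op]}({node.arg})" for node in plan]


physical = to_physical_plan(
    to_logical_plan(validate(parse("SELECT user_id FROM clicks")))
)
```

Keeping the logical plan free of engine-specific names is what would let the same SQL target different underlying platforms by swapping only the last step.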
Unlike data in offline tasks, real-time data flows in continuously. To abstract stream processing with SQL, stream computing SQL therefore also introduces the concept of a "table", but the table here is a dynamic table, one whose contents keep changing as new data arrives.
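One way to picture a dynamic table is as a query result that is revised every time an event arrives. The sketch below keeps a per-user count up to date, roughly what a continuous `SELECT user, COUNT(*) ... GROUP BY user` would maintain. The function name and the event values are hypothetical, and a plain Python list stands in for an unbounded stream.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def continuous_count(events: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Yield a snapshot of the result table after every incoming event."""
    table: Counter = Counter()
    for user in events:
        table[user] += 1   # the new event changes the table in place
        yield dict(table)  # copy, so earlier snapshots stay unchanged


snapshots = list(continuous_count(["alice", "bob", "alice"]))
# Each element of `snapshots` is the table as it looked after one more event.
```

A static table would be queried once for a final answer; here the answer itself is a sequence of table states with no final one, as long as events keep arriving.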
The architecture of stream computing SQL is as follows:
SQL layer: The interface that stream computing SQL exposes to the user. It provides functions such as filtering, transformation, join, aggregation, window, select, union, split, and so on.
SQL engine layer: Responsible for SQL parsing and validation, logical plan generation and optimization, and physical plan execution.
Stream computing engine layer: Carries out the execution plan generated by the SQL engine layer.
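Of the SQL layer functions listed above, the window is probably the least familiar to readers coming from offline SQL. The following is a minimal sketch of a tumbling (fixed-size, non-overlapping) window sum. The function name, the `(timestamp, value)` event shape, and the sample data are assumptions for illustration. It processes a finite list in one pass, whereas a real engine would emit each window's result as the window closes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_sum(
    events: Iterable[Tuple[int, int]], size: int
) -> Dict[int, int]:
    """Sum values per fixed-size window, keyed by the window's start time."""
    sums: Dict[int, int] = defaultdict(int)
    for timestamp, value in events:
        # Every timestamp belongs to exactly one window: [start, start + size).
        window_start = (timestamp // size) * size
        sums[window_start] += value
    return dict(sums)


# Hypothetical (timestamp_in_seconds, value) events.
events = [(1, 10), (4, 5), (11, 7), (19, 3), (20, 1)]
windows = tumbling_window_sum(events, size=10)
```

Windows are what make aggregation meaningful on an unbounded stream: instead of a sum that never finishes, each window yields a bounded result.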
Big Data Development in Practice: Real-Time Development with Stream SQL