To give enterprise users a faster way to query Hadoop data, the Apache Software Foundation has launched an open source project called Drill. Apache Drill is an open source implementation modeled on Google's Dremel.
Apache Drill brings the JSON document model to SQL-based data analysis and business intelligence (BI), which lets users query fixed-schema, evolving-schema, and schema-free data in a variety of formats and data stores. Existing relational query engines and databases, by contrast, are built on the assumption that all data has a simple, static schema.
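As a rough illustration of querying schema-free JSON with plain SQL, the sketch below builds a request for Drill's REST query endpoint. It assumes a Drillbit running locally on the default web port 8047 and the `dfs` storage plugin being enabled; the file path `/tmp/users.json` and the field names are hypothetical, and the helper `build_query_request` is not part of Drill. The request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: a local Drillbit exposing its REST API on the default port 8047.
DRILL_URL = "http://localhost:8047/query.json"


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) an HTTP POST request carrying one SQL statement."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical JSON file and fields; no table definition is declared up front,
# and the nested field address.city is reached through the table alias t.
sql = "SELECT t.name, t.address.city FROM dfs.`/tmp/users.json` t WHERE t.age > 30"
request = build_query_request(sql)

# To run it against a live Drillbit:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

The point of the example is that the query names a raw file and nested fields directly, with no schema registered beforehand.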
Apache Drill's architecture is unique. It is the only columnar execution engine that supports complex and schema-free data, and the only execution engine that performs data-driven query compilation (and recompilation, also known as schema discovery) during query execution. These capabilities enable Apache Drill to deliver record-breaking performance while keeping the flexibility of the JSON document model.
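The idea of discovering a schema while the data is being read, and holding the result in columns rather than rows, can be sketched in a few lines. This is a toy model under stated assumptions, not Drill's actual execution engine: the function names `to_columns` and `discovered_schema` and the sample records are invented for illustration.

```python
from collections import OrderedDict


def to_columns(records):
    """Turn schema-free records into aligned column vectors, adding fields as they appear."""
    columns = OrderedDict()  # field name -> list of values, one slot per record
    for row_index, record in enumerate(records):
        for name, value in record.items():
            if name not in columns:
                # New field discovered mid-stream: backfill earlier rows with None.
                columns[name] = [None] * row_index
            columns[name].append(value)
        # Pad fields missing from this record so every column stays aligned.
        for values in columns.values():
            if len(values) <= row_index:
                values.append(None)
    return columns


def discovered_schema(columns):
    """Report the type names observed in each column, ignoring missing values."""
    return {
        name: sorted({type(v).__name__ for v in values if v is not None})
        for name, values in columns.items()
    }


# Hypothetical records: fields come and go, and "id" changes type partway through.
records = [
    {"id": 1, "name": "a"},
    {"id": 2, "tags": ["x", "y"]},
    {"id": "3", "name": "c"},
]
columns = to_columns(records)
schema = discovered_schema(columns)
```

Here `schema["id"]` ends up listing two types, which is the kind of mid-query change that would force an engine to adapt its compiled plan instead of relying on a schema declared in advance.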
The project aims to create an open source version of Google's Dremel, the tool Google uses to speed up interactive analysis of very large datasets. "Drill" will help Hadoop users query massive datasets faster.
Data:
Compatible with existing SQL environments and Apache Hive:
The "Drill" project is inspired by Google's Dremel project, which helps Google analyze massive datasets: analyzing crawled web documents, tracking installation data for applications on Android Market, analyzing spam, analyzing test results from Google's distributed build system, and more.
By developing "Drill" as an Apache open source project, organizations are expected to build Drill's own APIs and a flexible, powerful architecture that supports a wide range of data sources, data formats, and query languages.
Drill query:
Drillbit Core Model:
Drill Compiler:
Apache Open Source Project: Apache Drill