For the big data query engine selection, I drew several architecture diagrams and did some comparative analysis:
I. Presto
II. Impala
III. HAWQ
IV. Overall comparison:
1) All three use an MPP architecture, and there is no significant performance gap among them.
2) HAWQ has more comprehensive functions and features than Presto and Impala, but this brings the risk of complicated system configuration and high maintenance costs.
3) Presto and Impala each have their own clear advantages:
1. Presto can access multiple data sources through connectors, which makes it highly flexible, while Impala supports only a limited set of data source types.
2. Impala natively supports high availability for the coordinator. The Presto coordinator is a single point of failure and must be recovered manually.
3. Impala is easier to deploy and integrate in an existing CDH environment.
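To illustrate the connector point above, here is a minimal sketch of a cross-source query. Presto addresses tables as catalog.schema.table, so one statement can join tables that live in different systems. The catalog names (hive, mysql), the schemas, the tables, and the helper functions below are all hypothetical and chosen only for illustration. The sketch only builds the SQL text and does not connect to a cluster.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_join_sql() -> str:
    """Build a query that joins tables from two different Presto catalogs."""
    # Assumption: "hive" and "mysql" are configured as catalogs on the
    # Presto cluster; the schema and table names are made up.
    orders = qualified("hive", "warehouse", "orders")
    users = qualified("mysql", "crm", "users")
    return (
        "SELECT u.name, count(*) AS order_count\n"
        f"FROM {orders} AS o\n"
        f"JOIN {users} AS u ON o.user_id = u.id\n"
        "GROUP BY u.name"
    )


if __name__ == "__main__":
    # Print the statement; it could then be submitted through any Presto client.
    print(federated_join_sql())
```

Each catalog corresponds to one configured connector, so adding a data source is a configuration change on the cluster rather than a change to the query syntax.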
Currently, Hive is slow.
Provided the alternative solution meets the speed requirement, its stability, ease of use, and maintainability are the priorities.
Presto vs. Impala vs. HAWQ Query Engine