Greenplum (GPDB) is open source!
Greenplum Database (GPDB) is a shared-nothing, massively parallel processing (MPP) database. It is mainly used for large-scale data analysis workloads, including data warehousing, business intelligence (OLAP), and data mining. GPDB is designed for analytics on massive data sets: it uses an advanced cost-based query optimizer, ranks among the most advanced open-source databases, and delivers fast, efficient querying and analysis of petabyte-scale data.
Greenplum, formerly a commercial database based on PostgreSQL, is now officially open source. Its source code is available on GitHub.
Greenplum Database server software is an advanced, full-featured open-source data warehouse. It provides powerful and fast analytics on petabyte-scale data. For big data analytics in particular, Greenplum Database relies on what the project describes as the world's most advanced cost-based query optimizer to deliver high query and analysis performance.
The Greenplum open-source project is released under the Apache 2 license. The Greenplum team also thanks community contributors and other enthusiasts for their contributions to the product. For the Greenplum community, every contribution is meaningful, and contributions of all kinds are appreciated and encouraged.
"Open-source large-scale parallel data warehouse"
About Greenplum Database
Greenplum is built on PostgreSQL and adds a number of important innovations for data warehouse workloads:
- Massively parallel processing architecture: Greenplum Database automatically parallelizes all data and queries.
- Petabyte-scale loading: MPP technology keeps load performance high under heavy load, and each rack can load up to 10 TB of data per hour.
- Innovative query optimizer: Greenplum's query optimizer is described as the industry's first cost-based optimizer designed for big data workloads. It can analyze petabyte-scale data in interactive or batch mode without degrading query performance or throughput.
- Polymorphic data storage and execution: storage, execution, and compression settings can be configured per table or partition to match the way the data is accessed. You can choose row-oriented or column-oriented storage and processing as your needs require.
- Advanced machine learning: the Apache MADlib library extends the in-database analytics of Greenplum Database through user-defined functions.
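To make the per-table storage choice above concrete, here is a minimal sketch in Python that assembles Greenplum-style CREATE TABLE statements for a column-oriented, compressed table and a row-oriented one. The helper `build_create_table` and the table and column names are hypothetical, introduced only for illustration. The storage options (`appendonly`, `orientation`, `compresstype`, `compresslevel`) and the `DISTRIBUTED BY` clause follow Greenplum's CREATE TABLE syntax as commonly documented; option names can differ between releases, so check them against the documentation for your version.

```python
def build_create_table(name, columns, distributed_by, orientation="row",
                       compresstype=None, compresslevel=None):
    """Return a Greenplum-style CREATE TABLE statement as a string.

    Hypothetical helper for illustration; it only formats text and does
    not connect to or validate against a real database.
    """
    if orientation not in ("row", "column"):
        raise ValueError("orientation must be 'row' or 'column'")

    # Column definitions, e.g. "sale_id bigint, amount numeric".
    col_defs = ", ".join(f"{col} {typ}" for col, typ in columns)

    # Append-optimized storage is assumed here because column orientation
    # and compression are configured as options on such tables.
    options = ["appendonly=true", f"orientation={orientation}"]
    if compresstype is not None:
        options.append(f"compresstype={compresstype}")
    if compresslevel is not None:
        options.append(f"compresslevel={compresslevel}")

    return (
        f"CREATE TABLE {name} ({col_defs}) "
        f"WITH ({', '.join(options)}) "
        f"DISTRIBUTED BY ({distributed_by});"
    )


# Hypothetical fact table: scanned by column, so stored column-oriented
# and compressed.
facts_ddl = build_create_table(
    "sales_facts",
    [("sale_id", "bigint"), ("amount", "numeric")],
    distributed_by="sale_id",
    orientation="column",
    compresstype="zlib",
    compresslevel=5,
)

# Hypothetical lookup table: read a whole row at a time, so kept
# row-oriented without compression.
lookup_ddl = build_create_table(
    "customers",
    [("customer_id", "bigint"), ("name", "text")],
    distributed_by="customer_id",
)

print(facts_ddl)
print(lookup_ddl)
```

The `DISTRIBUTED BY` column determines how rows are spread across the segments of the cluster, which is what lets each segment work on its own share of the data in parallel.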
1. Greenplum source code, documentation, and related information: http://greenplum.org/
2. Greenplum source code: https://github.com/greenplum-db
3. Pivotal, the company that contributed Greenplum: https://pivotal.io/big-data/pivotal-greenplum