Implementing incremental data synchronization from Oracle to Greenplum

Source: Internet
Author: User

Brief introduction:

Greenplum is a data warehouse built on PostgreSQL that uses an MPP (massively parallel processing) architecture. It is suited to OLAP systems and supports storing and processing massive data sets at the 50 PB level (1 PB = 1000 TB).

Background:

A current business requirement is to synchronize the base data in an Oracle database to a Greenplum data warehouse for analysis and processing.

Scale:

About 60 GB of new data is produced per day, and the largest table grows by hundreds of billions of records per day.

Solution:

1) Historical data is initialized by extracting it from Oracle and importing it into Greenplum.

2) Incremental updates:

GoldenGate parses the Oracle logs and delivers the parsed records to the node where Greenplum resides.

On the Greenplum node, a program incrementally applies the GoldenGate-parsed log records to the Greenplum data warehouse.
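The source does not describe the program that applies the parsed records, so the following is only a minimal sketch of one common approach: collapse each batch of changes to one net effect per primary key, then apply it as a set-based delete followed by a set-based insert. The record shape `(operation, key, row)` is a hypothetical, already-parsed format and is not GoldenGate's actual output; update records are assumed to carry the full row.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical, already-parsed change record: (operation, primary key, row).
# Operation is "I" (insert), "U" (update, full row assumed) or "D" (delete).
# This is NOT GoldenGate's real output format.
Change = Tuple[str, int, Optional[dict]]


def compact_changes(changes: Iterable[Change]) -> Tuple[List[int], List[dict]]:
    """Collapse an ordered batch of changes to one net effect per key.

    Returns (keys_to_delete, rows_to_insert). Deleting every touched key and
    then inserting the surviving rows brings the target in line with the
    source for all keys in the batch.
    """
    latest: Dict[int, Change] = {}
    for op, key, row in changes:
        latest[key] = (op, key, row)  # a later record overrides earlier ones

    keys_to_delete = sorted(latest)  # every touched key is removed first
    rows_to_insert = [
        row
        for op, _key, row in (latest[k] for k in keys_to_delete)
        if op in ("I", "U") and row is not None
    ]
    return keys_to_delete, rows_to_insert


batch = [
    ("I", 1, {"id": 1, "amount": 10}),
    ("U", 1, {"id": 1, "amount": 15}),
    ("I", 2, {"id": 2, "amount": 7}),
    ("D", 2, None),
    ("U", 3, {"id": 3, "amount": 9}),
]
deletes, inserts = compact_changes(batch)
print(deletes)  # keys to delete from the target table
print(inserts)  # rows to load after the delete
```

Batching in this way turns many row-level changes into two bulk statements per table, which generally suits an MPP database better than applying changes one row at a time.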

Final Result:

1. A full initialization takes about three days and loads about 5 TB of data.

2. Incremental synchronization lags by no more than 3 hours.

3. After tuning, queries on Greenplum run 10 to 100 times faster than on the Oracle database, even though the Greenplum machines have considerably lower specifications.

4. Some large tables are compressed, which reduces storage space and I/O overhead.

5. Column-oriented storage is not used. The large tables have too many columns, which makes them a poor fit for column-oriented storage, so these tables are only compressed.

6. The distribution keys of some tables were adjusted, which greatly improved the efficiency of data analysis.
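Points 4 to 6 can be illustrated with a short sketch. It generates the DDL for a row-oriented, compressed, append-optimized table with an explicit distribution key, and computes a simple skew ratio for judging a candidate key. The table name, column names, and the choice of zlib at level 5 are assumptions for illustration; the source does not say which settings were used. The per-segment row counts passed to `skew_ratio` are made-up numbers; in practice they could come from grouping a table by Greenplum's `gp_segment_id` system column.

```python
from typing import Dict, List


def build_table_ddl(
    table: str,
    columns: Dict[str, str],
    distribution_key: List[str],
    compress_level: int = 5,  # assumed value, not taken from the source
) -> str:
    """Return CREATE TABLE DDL for a compressed, row-oriented,
    append-optimized Greenplum table with an explicit distribution key."""
    missing = [c for c in distribution_key if c not in columns]
    if missing:
        raise ValueError(f"distribution key columns not in table: {missing}")
    column_sql = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE TABLE {table} (\n    {column_sql}\n)\n"
        f"WITH (appendonly=true, orientation=row, "
        f"compresstype=zlib, compresslevel={compress_level})\n"
        f"DISTRIBUTED BY ({', '.join(distribution_key)});"
    )


def skew_ratio(rows_per_segment: List[int]) -> float:
    """Largest segment row count divided by the mean; 1.0 means perfectly even."""
    mean = sum(rows_per_segment) / len(rows_per_segment)
    return max(rows_per_segment) / mean


# Hypothetical table used only for illustration.
print(build_table_ddl(
    "sales_fact",
    {"order_id": "bigint", "customer_id": "bigint", "amount": "numeric(12,2)"},
    ["order_id"],
))
print(skew_ratio([100, 100, 100, 300]))  # one segment holds twice the average
```

A ratio well above 1.0 suggests that one segment does a disproportionate share of the work for every query on that table, which is the kind of imbalance a better distribution key is meant to remove.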
