InfoSphere CDC: Real-time synchronization of local data to BigInsights in the cloud

Source: Internet
Author: User

IBM InfoSphere CDC is a powerful real-time data replication product. It is widely used to integrate heterogeneous platforms behind traditional ODS, data warehouse, data mart, and BI systems, and it also fully supports the cloud. Across cloud scenarios, CDC provides low-impact, near-real-time replication of large data volumes while preserving the integrity and security of data in transit.

Bluemix, IBM's flagship public cloud platform, is a platform-as-a-service (PaaS) offering based on the Cloud Foundry open source project. It enables organizations and developers to quickly and easily create, deploy, and manage applications in the cloud. Bluemix offers a wide range of applications and services to customers worldwide, including BigInsights, IBM's Hadoop product in the cloud.

With IBM InfoSphere CDC, you can synchronize data from a local (on-premises) database to BigInsights in the Bluemix cloud in real time, which addresses several big data analytics challenges:

    • Processing massive amounts of data
    • Diversity of data sources
    • Agility of data analysis
    • Persistence of data analysis

Next, we'll show how to use CDC to build a real-time synchronization scenario from a local database (for example, DB2) to BigInsights in the cloud.

On-premises system configuration

1. Configure the DB2 database and confirm that it is functioning correctly.

2. Install InfoSphere CDC for DB2 (the source-side CDC engine, which captures incremental data changes by reading the DB2 logs in real time).

3. Install InfoSphere CDC for DataStage (the target-side CDC engine, which applies real-time incremental data from the source to the target Hadoop platform/HDFS file system).

4. Configure the network connection (firewall) so that the CDC server on the internal network can reach Bluemix.

5. Install the CDC configuration, management, and monitoring platform (Management Console and Access Server).
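
Before moving on, it can help to confirm that the firewall change in step 4 actually lets the CDC server open an outbound connection. The following is a minimal sketch, not part of the CDC product: it only tests whether a TCP connection can be opened. The function name `can_reach` and the hostname in the usage note are hypothetical; port 8443 is the gateway port mentioned later in this article.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_reach("biginsights.example.com", 8443)` (a placeholder hostname) would return False if the firewall still blocks the connection. A successful TCP connection does not prove that credentials or TLS settings are correct; it only rules out a basic network block.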

Create a BigInsights for Apache Hadoop service

1. Sign in to the Bluemix platform (a Bluemix ID is required).

https://console.ng.bluemix.net/

2. Click "Catalog" at the top of the page, select "Data and Analytics" in the "Services" section on the left side of the page, and then select "BigInsights for Apache Hadoop".

3. Go to the BigInsights for Apache Hadoop page, specify the relevant properties and create the service.

Check the BigInsights for Apache Hadoop service

1. From the user dashboard of Bluemix, click on the newly created "BigInsights for Apache Hadoop" service.

2. Check the validity period of the current service; it is usually free of charge for one month.

3. Check the credentials and configuration information for the current service, such as user name and password.

Start the BigInsights for Apache Hadoop service

1. Click "Launch" on the BigInsights for Apache Hadoop page to start the service.

2. Note the hostname, port (8443), and URL prefix (/gateway/default/) of the BigInsights for Apache Hadoop service; they are needed for the CDC configuration.
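
To make the role of these three values concrete, here is a minimal sketch of how a WebHDFS request URL could be assembled from them. It assumes the service exposes the standard WebHDFS REST layout (`/webhdfs/v1/<path>?op=...`) behind the URL prefix; the function name `webhdfs_url`, the hostname, and the HDFS path in the usage note are hypothetical and not taken from the CDC or BigInsights documentation.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, path: str, op: str,
                port: int = 8443, prefix: str = "/gateway/default/") -> str:
    """Build a WebHDFS REST URL from host, port, and gateway URL prefix."""
    # Assumption: the gateway serves WebHDFS under <prefix>/webhdfs/v1.
    base = f"https://{host}:{port}/{prefix.strip('/')}/webhdfs/v1"
    # HDFS paths are absolute; percent-encode everything except the slashes.
    hdfs_path = "/" + quote(path.strip("/"))
    return f"{base}{hdfs_path}?{urlencode({'op': op})}"
```

With placeholder values, `webhdfs_url("bi.example.com", "/user/cdc/db2", "LISTSTATUS")` yields `https://bi.example.com:8443/gateway/default/webhdfs/v1/user/cdc/db2?op=LISTSTATUS`. The Management Console asks for the host, port, and prefix separately, so this URL is only useful for understanding or manually testing the connection.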

Create a CDC subscription and configure table mappings

1. In the CDC Management Console, create a subscription and run the Table Mapping Wizard.

2. Select "Apache Hadoop - WebHDFS" as the target-side delivery method.

3. Select the DB2 source table that you want to replicate, and specify the WebHDFS directory path on the target-side BigInsights.

Configure Hadoop properties for CDC subscriptions

1. Right-click the subscription and select Hadoop Properties.

2. You can modify the batch size value (the threshold that triggers writing an output file) and enter the BigInsights connection information under the WebHDFS connection settings.

3. Start the subscription to begin real-time replication.
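
The effect of the batch size setting can be illustrated with a small model. The sketch below is not CDC's actual implementation; it assumes a simple row-count threshold, where buffered change records are handed off as one output file each time the count reaches the batch size. The function name `batch_records` and the record format are hypothetical.

```python
from typing import Iterable, Iterator, List


def batch_records(records: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most batch_size records, in arrival order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    buffer: List[str] = []
    for record in records:
        buffer.append(record)
        if len(buffer) == batch_size:
            # Threshold reached: hand off one "file" worth of changes.
            yield buffer
            buffer = []
    if buffer:
        # Remaining changes; a real engine decides when to flush these.
        yield buffer
```

In this model, a smaller batch size produces more, smaller files with lower latency, while a larger one produces fewer, larger files, which is the trade-off to keep in mind when tuning the value.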

Verify the real-time synchronization results

1. Run several transactions on the local DB2 database to make changes to the source table data that CDC is monitoring.

2. On the target side, click BigSheets on the BigInsights for Apache Hadoop home page to view the data replicated from the source. The DB2 data has been synchronized in real time, fully automatically, with low latency and without errors.
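
As an alternative to BigSheets, the target directory could also be inspected over WebHDFS. The following is a minimal sketch that parses the JSON returned by a WebHDFS `LISTSTATUS` request, assuming the standard response layout (`FileStatuses` / `FileStatus` with `pathSuffix`, `type`, and `length` fields). The function name `list_files` is hypothetical, and fetching the response (with the service credentials) is left out.

```python
import json
from typing import List, Tuple


def list_files(liststatus_json: str) -> List[Tuple[str, int]]:
    """Return (name, size in bytes) for each file in a LISTSTATUS response."""
    statuses = json.loads(liststatus_json)["FileStatuses"]["FileStatus"]
    # Keep regular files only; subdirectories have type "DIRECTORY".
    return [(s["pathSuffix"], s["length"]) for s in statuses if s["type"] == "FILE"]
```

If new files appear in the target directory, or existing ones grow, after the DB2 transactions in step 1, the subscription is delivering changes.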


