CDC different modes in ODI embody one of the series synchronous mode

Source: Internet
Author: User

CDC different modes in ODI embody one of the series synchronous mode

Oracle Database change Data Capture feature

Change data capture is a generic term that is used to describe capturing incremental changes applied to the data store. With the increasing of data volume and the Changing of data storage, data capture is an important function of data warehouse, especially the requirement of real-time or near-real-time data warehouse.

In the Oracle database concept, data acquisition changes database functionality to capture increments on an Oracle database. Traditionally you will have to modify the source code of the app to capture incremental changes.

Oracle's change data capture enables incremental changes to be captured without making any changes to the source program.

Oracle Database 10g version 2nd introduces the asynchronous distributed CDC, which is much more improved than the original 9i database version 2nd. This recipe describes how to set up such an asynchronous change data capture environment.

Pattern Classification Comparison:

Synchronous CDC Mode (synchronous mode) captures incremental data by establishing trigger on the source library, so you can extract incremental data in real time.

The asynchronous Hotlog pattern (asynchronous Hotlog mode) extracts delta data directly from the online redo log file of source database, and there is a time delay due to the need to parse the log files. The change table must also be generated in the source library. This mode, because it is parsing the log in the source database, also creates some pressure on the source database, but is less stressful than synchronizing the CDC mode.

The asynchronous Distributed Hotlog pattern (asynchronous distributed Hotlog mode) compares to the asynchronous Hotlog pattern by parsing the delta data in the current online log of more than one source database. Then pass a staging database processing to facilitate centralized data management.

The asynchronous AutoLog pattern (asynchronous AutoLog mode) is to pass the log file from the source database to the staging database, and then perform a log analysis on the staging database. This minimizes the pressure on the source database

Publishers and Subscribers (publishing and subscribing users)

Publisher is the database user who publishes the change data capture. So the ODI needs to create a publisher first.

Subscriber is a database user who uses this captured change data through so-called ordering, and can also be understood as subscribers.

For security reasons, publishers and subscribing users should not be the same database user. A Publisher can support many subscriber users.

CDC leverages change tables and user views. Data changes are written to the change table to provide a scalable infrastructure for using CDC. Users can obtain a view of a consistent set of data changes for data in a change table. Users can expand and clear subscribed windows and implicitly change datasets through database views. If no user subscribes to the change data, the data in the change table can be erased. The CDC is called through a PL/SQL package.

CDC has two ways to capture data synchronously with CDC and asynchronous CDC.

There are 3 modes of asynchronous CDC: Asynchronous Hotlog mode, asynchronous distributed Hotlog mode, asynchronous AutoLog mode.

How to create a synchronous CDC mode:

1. Create a Publisher user, create a change set in the SYS user, and then create the corresponding change table in the change set

(change table), a change set can correspond to multiple change table.

2. Create subscriber users, and Publisher users grant change table access to subscriber users.

3.Subscriber can create a subscription (subscription) and can subscribe to multiple source table in the change set by creating a subscribe

This table is the source table for change table.

4. Activate subscription Activate subscription

5. Expand the window extend_windows, the extension window will not see the data changes. If you want to see the data change every time, you must expand the window every time.

The 5 steps above are the steps implemented in the Oracle database, so how does it manifest in ODI? Let's take a concrete approach to implementation.

Because synchronous mode CDC is implemented by means of triggers, the database can be run in a non-archived mode.

1. In ODI, CDC is implemented through the Log knowledge module, so we first introduce the Knowledge module

Select the journal JKM in the ODI project and choose the right button to import the Knowledge module, because we are using the synchronous mode CDC, select "JKM Oracle simple" knowledge km, we can also see JKM Oracle consistent, and then the asynchronous CDC, We will quote in, hehe.

2.

Select the Logging tab in the Model properties, and then select Simple, select logging km, you can see the km,jkm Oracle SIMPLE.CDC We added just now, this km is described below, not suitable for frequent operation, does not bring value to the production system, Instead, it affects performance. Because this jkm is the form of a trigger.

At this point, we have already defined the JKM in the model, we can add it to CDC in the model's data store, we can also do CDC for the whole pattern, here we choose a separate data store.

Let's look at the existing triggers for the system:

Sql> Conn Odi_tmp/odi_tmp

is connected.

Sql> select * from User_triggers;

Row not selected

Sql>

You can see that there are no triggers under the current development user.

  

After selecting JKM, click Apply.

Later, we can confirm with the database that the ODI has established a trigger in the database.

3. Start adding data store to DataSource

When you add a data source to CDC, you see an orange icon for the table in the upper-left corner of the CDC that has been added to the CDC but not yet started.

4. Start the log

After the log is started, we need to define the subscribers to the ODI, which is the user who needs to capture this log data.

The startup log pops up the Execution dialog box, selects the appropriate context, and if you switch to the production library, switch the context.

5. Start the subscription log and select Subscribers

Subscribe, because we use Supervisor user login, so the system will be used by default

Sunopsis Users

When you click OK, you'll see a green clock icon in the upper left corner of the data store, indicating that CDC has been successfully started

6. View the subscription status of CDC

The green icon indicates that the subscription has been successful.

7. Verify that the corresponding trigger is also used in the database

This time, look again at the odi_tmp below the trigger, why do you want to see odi_tmp below:

Because when we define Oracle connections in Topology Manager, we have defined the users of the work architecture,

The default is the same as the schema user, so we can see the benefits of choosing a working schema user.

You can see that the trigger has been successfully created by the ODI:

Sql>select Trigger_name,trigger_type,table_name,status,action_type from User_triggers;

Trigger_na trigger_type Table_na STATUS action_type

---------- ---------------- -------- ---------------- ----------------------

T$customer after each ROW CUSTOMER ENABLED PL/SQL

8. Let's test the CDC sync mode data

After we right-click on the data model, tap the data to edit the data

Click on the + sign, we add a piece of data

After clicking on the app, we can see that the data has been added

Click Log data to see if CDC has successfully captured data from database tables

9. See if ODI captures data

We add the synchronous CDC to the interface

We clicked the box in front of the source data source with a "record only recorded data" and then clicked on to add the CDC to the interface successfully.

1.jpg

2.jpg

3.jpg

4.jpg

5.jpg

6.jpg

7.jpg

8.jpg

9.jpg

10.jpg

11.jpg

12.jpg

13.jpg

14.jpg

CDC different modes in ODI embody one of the series synchronous mode

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.