CDC different modes in ODI embody one of the series synchronous mode
Oracle Database change Data Capture feature
Change data capture is a generic term that is used to describe capturing incremental changes applied to the data store. With the increasing of data volume and the Changing of data storage, data capture is an important function of data warehouse, especially the requirement of real-time or near-real-time data warehouse.
In the Oracle database concept, data acquisition changes database functionality to capture increments on an Oracle database. Traditionally you will have to modify the source code of the app to capture incremental changes.
Oracle's change data capture enables incremental changes to be captured without making any changes to the source program.
Oracle Database 10g version 2nd introduces the asynchronous distributed CDC, which is much more improved than the original 9i database version 2nd. This recipe describes how to set up such an asynchronous change data capture environment.
Pattern Classification Comparison:
Synchronous CDC Mode (synchronous mode) captures incremental data by establishing trigger on the source library, so you can extract incremental data in real time.
The asynchronous Hotlog pattern (asynchronous Hotlog mode) extracts delta data directly from the online redo log file of source database, and there is a time delay due to the need to parse the log files. The change table must also be generated in the source library. This mode, because it is parsing the log in the source database, also creates some pressure on the source database, but is less stressful than synchronizing the CDC mode.
The asynchronous Distributed Hotlog pattern (asynchronous distributed Hotlog mode) compares to the asynchronous Hotlog pattern by parsing the delta data in the current online log of more than one source database. Then pass a staging database processing to facilitate centralized data management.
The asynchronous AutoLog pattern (asynchronous AutoLog mode) is to pass the log file from the source database to the staging database, and then perform a log analysis on the staging database. This minimizes the pressure on the source database
Publishers and Subscribers (publishing and subscribing users)
Publisher is the database user who publishes the change data capture. So the ODI needs to create a publisher first.
Subscriber is a database user who uses this captured change data through so-called ordering, and can also be understood as subscribers.
For security reasons, publishers and subscribing users should not be the same database user. A Publisher can support many subscriber users.
CDC leverages change tables and user views. Data changes are written to the change table to provide a scalable infrastructure for using CDC. Users can obtain a view of a consistent set of data changes for data in a change table. Users can expand and clear subscribed windows and implicitly change datasets through database views. If no user subscribes to the change data, the data in the change table can be erased. The CDC is called through a PL/SQL package.
CDC has two ways to capture data synchronously with CDC and asynchronous CDC.
There are 3 modes of asynchronous CDC: Asynchronous Hotlog mode, asynchronous distributed Hotlog mode, asynchronous AutoLog mode.
How to create a synchronous CDC mode:
1. Create a Publisher user, create a change set in the SYS user, and then create the corresponding change table in the change set
(change table), a change set can correspond to multiple change table.
2. Create subscriber users, and Publisher users grant change table access to subscriber users.
3.Subscriber can create a subscription (subscription) and can subscribe to multiple source table in the change set by creating a subscribe
This table is the source table for change table.
4. Activate subscription Activate subscription
5. Expand the window extend_windows, the extension window will not see the data changes. If you want to see the data change every time, you must expand the window every time.
The 5 steps above are the steps implemented in the Oracle database, so how does it manifest in ODI? Let's take a concrete approach to implementation.
Because synchronous mode CDC is implemented by means of triggers, the database can be run in a non-archived mode.
1. In ODI, CDC is implemented through the Log knowledge module, so we first introduce the Knowledge module
Select the journal JKM in the ODI project and choose the right button to import the Knowledge module, because we are using the synchronous mode CDC, select "JKM Oracle simple" knowledge km, we can also see JKM Oracle consistent, and then the asynchronous CDC, We will quote in, hehe.
2.
Select the Logging tab in the Model properties, and then select Simple, select logging km, you can see the km,jkm Oracle SIMPLE.CDC We added just now, this km is described below, not suitable for frequent operation, does not bring value to the production system, Instead, it affects performance. Because this jkm is the form of a trigger.
At this point, we have already defined the JKM in the model, we can add it to CDC in the model's data store, we can also do CDC for the whole pattern, here we choose a separate data store.
Let's look at the existing triggers for the system:
Sql> Conn Odi_tmp/odi_tmp
is connected.
Sql> select * from User_triggers;
Row not selected
Sql>
You can see that there are no triggers under the current development user.
After selecting JKM, click Apply.
Later, we can confirm with the database that the ODI has established a trigger in the database.
3. Start adding data store to DataSource
When you add a data source to CDC, you see an orange icon for the table in the upper-left corner of the CDC that has been added to the CDC but not yet started.
4. Start the log
After the log is started, we need to define the subscribers to the ODI, which is the user who needs to capture this log data.
The startup log pops up the Execution dialog box, selects the appropriate context, and if you switch to the production library, switch the context.
5. Start the subscription log and select Subscribers
Subscribe, because we use Supervisor user login, so the system will be used by default
Sunopsis Users
When you click OK, you'll see a green clock icon in the upper left corner of the data store, indicating that CDC has been successfully started
6. View the subscription status of CDC
The green icon indicates that the subscription has been successful.
7. Verify that the corresponding trigger is also used in the database
This time, look again at the odi_tmp below the trigger, why do you want to see odi_tmp below:
Because when we define Oracle connections in Topology Manager, we have defined the users of the work architecture,
The default is the same as the schema user, so we can see the benefits of choosing a working schema user.
You can see that the trigger has been successfully created by the ODI:
Sql>select Trigger_name,trigger_type,table_name,status,action_type from User_triggers;
Trigger_na trigger_type Table_na STATUS action_type
---------- ---------------- -------- ---------------- ----------------------
T$customer after each ROW CUSTOMER ENABLED PL/SQL
8. Let's test the CDC sync mode data
After we right-click on the data model, tap the data to edit the data
Click on the + sign, we add a piece of data
After clicking on the app, we can see that the data has been added
Click Log data to see if CDC has successfully captured data from database tables
9. See if ODI captures data
We add the synchronous CDC to the interface
We clicked the box in front of the source data source with a "record only recorded data" and then clicked on to add the CDC to the interface successfully.
1.jpg
2.jpg
3.jpg
4.jpg
5.jpg
6.jpg
7.jpg
8.jpg
9.jpg
10.jpg
11.jpg
12.jpg
13.jpg
14.jpg
CDC different modes in ODI embody one of the series synchronous mode