Data Extraction Technology Summary

Source: Internet
Author: User

Data Extraction Technology:

1. Static Data Capture
Used, first, for the initial load of a data warehouse and, second, for data that has to be completely replaced.

2. Incremental Data Capture

(1) Transaction logs or database logs, including Oracle Flashback Query. You can use a diff tool to analyze the differences. (Rarely used)
(2) Capture with database triggers: triggers write each change (including deletions) to an incremental change table, which other integration components read at regular intervals.
(3) Capture based on date/time stamps, or similarly on a strictly increasing auto-increment ID: this approach cannot detect deletions.
Like the rowversion type in MS SQL Server, Oracle 10g and later provide a row-level version marker (the ORA_ROWSCN pseudocolumn), which can also be used as a time mark.
If the source data lacks such fields, you can add them without disrupting the original table. Consult the developers of the source system first.
(4) For a database that supports set operations, such as Oracle with MINUS, you can run a set operation keyed on the primary keys and store the difference in a separate table for other programs to read.
(5) Capture by full table scan and comparison: compare two snapshots of the source data. When the data volume is very large, performance becomes a problem. This is the last resort, for sources that have none of the fields described in (3) and where no other method is available. Make full use of segmented scanning algorithms. (Rarely used)
(6) Capture from the source application: modify the source application's code. (Rarely used)

(7) Customize your own JDBC driver. Method 1: directly modify or rewrite the driver. Method 2: use AOP to weave into the existing driver interfaces, then analyze and process the captured SQL statements.
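
The trigger-based approach in (2) can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module rather than Oracle or SQL Server, and the table names (customer, customer_changes), the trigger names, and the read_changes helper are all hypothetical.

```python
import sqlite3

def build_source_with_triggers(conn):
    """Create a hypothetical source table plus triggers that log every change."""
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        -- Incremental change table, filled only by the triggers below.
        CREATE TABLE customer_changes (
            seq  INTEGER PRIMARY KEY AUTOINCREMENT,
            op   TEXT NOT NULL,       -- 'I', 'U' or 'D'
            id   INTEGER NOT NULL,
            name TEXT
        );
        CREATE TRIGGER customer_ai AFTER INSERT ON customer BEGIN
            INSERT INTO customer_changes (op, id, name) VALUES ('I', NEW.id, NEW.name);
        END;
        CREATE TRIGGER customer_au AFTER UPDATE ON customer BEGIN
            INSERT INTO customer_changes (op, id, name) VALUES ('U', NEW.id, NEW.name);
        END;
        CREATE TRIGGER customer_ad AFTER DELETE ON customer BEGIN
            INSERT INTO customer_changes (op, id, name) VALUES ('D', OLD.id, OLD.name);
        END;
    """)

def read_changes(conn, last_seq):
    """Return change rows with seq > last_seq, as a periodic integration job might."""
    return conn.execute(
        "SELECT seq, op, id, name FROM customer_changes WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
build_source_with_triggers(conn)
conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("UPDATE customer SET name = 'Anna' WHERE id = 1")
conn.execute("DELETE FROM customer WHERE id = 1")
changes = read_changes(conn, 0)   # one row per insert, update and delete
```

Because the delete trigger records the old row, the deletion shows up in the change table, which is the main advantage of this method over the timestamp approach in (3).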
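
For the timestamp-based capture in (3), a minimal sketch might look like the one below. It assumes a hypothetical orders table with an updated_at column that the source application maintains, and uses sqlite3 purely for demonstration; the extract_since helper is not part of any library.

```python
import sqlite3

def extract_since(conn, watermark):
    """Return rows changed after `watermark`, plus the new watermark to persist."""
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at, id",
        (watermark,),
    ).fetchall()
    # Keep the old watermark when nothing changed.
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 10.0, "2024-01-01T09:00:00"),
    (2, 20.0, "2024-01-01T10:00:00"),
])
first_batch, mark = extract_since(conn, "")   # initial run picks up both rows

# Later, one row is updated and another is physically deleted at the source.
conn.execute("UPDATE orders SET amount = 25.0, updated_at = '2024-01-02T08:00:00' WHERE id = 2")
conn.execute("DELETE FROM orders WHERE id = 1")
second_batch, mark = extract_since(conn, mark)  # sees the update, not the delete
```

The second run returns only the updated row; the deleted row simply vanishes, which illustrates why this method cannot solve the deletion problem on its own.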
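
The set-operation comparison in (4), which is also one way to compare the two snapshots mentioned in (5), could be sketched like this. SQLite's EXCEPT is used here as a stand-in for Oracle's MINUS, and the snapshot and result table names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE snap_old (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE snap_new (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO snap_old VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO snap_new VALUES (1, 'Ann'), (2, 'Bobby'), (4, 'Dee');

    -- New or changed rows: present in the new snapshot but not in the old one.
    CREATE TABLE diff_upserts AS
        SELECT id, name FROM snap_new EXCEPT SELECT id, name FROM snap_old;

    -- Deleted rows: primary keys that exist only in the old snapshot.
    CREATE TABLE diff_deletes AS
        SELECT id FROM snap_old EXCEPT SELECT id FROM snap_new;
""")

# Other programs would read the stored differences from these tables.
upserts = conn.execute("SELECT id, name FROM diff_upserts ORDER BY id").fetchall()
deletes = conn.execute("SELECT id FROM diff_deletes ORDER BY id").fetchall()
```

Both queries scan the full snapshots, so the performance concern raised in (5) applies here as the tables grow.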

 
