[Production database in practice] Using triggers for real-time synchronization of a billion-row table in a production database

Source: Internet
Author: User

Trigger-based migration works on the same principle as a materialized view (MV): create a trigger on the source table that records the table's DML operations in a log, then replay the affected rows through a stored procedure so that the target table stays consistent with the source. This method is a supplement to the prebuilt-MV approach. Materialized views and online redefinition have a weakness when handling large tables under heavy DML, mainly in the full initialization step. For example, the initial full scan should use direct path reads, which are very efficient, but blocks already cached in memory degrade parts of the scan to single-block reads; or, because DML is frequent, consistent-read versions of data blocks must be reconstructed from undo, which also shows up as single-block reads in the wait events. The visible symptom is that the initial synchronization starts out fast and gets slower and slower, until it either cannot complete in a reasonable time or fails with the classic "snapshot too old" error. Of course, creating a trigger on a DML-heavy table affects the DML performance of the source table, so the impact needs to be evaluated in advance. For synchronizing small tables, prebuilt MV is still recommended, simply because it is easy and convenient. The same idea also applies to other databases.
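The wait events described above can be observed while the full initialization runs. A minimal diagnostic query against the standard v$session view (the bind variable for the session SID is an assumption you would fill in):

```sql
-- Check whether the initializing session is doing efficient multi-block
-- direct path reads or has degraded to single-block reads.
SELECT sid, event, state, wait_time_micro
  FROM v$session
 WHERE sid = :init_session_sid;   -- SID of the session running the full sync

-- 'direct path read'        -> efficient full-scan I/O
-- 'db file sequential read' -> single-block reads (cached blocks or undo-driven)
```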


The goal of this example:

migrate the source-side table user01.user_table to the target-side table user02.user_table, keeping the target-side data consistent with the source side in real time. For the business cutover, the impact on the business can be kept within about 1 minute; by choosing the right point in time to switch, the downtime is essentially just the time needed to restart the application. In other words, the impact on the business depends on the application restart time.

Source database: Business Database 1

Target database: Business Database 2

Source-side table user01.user_table:
Table size: 60 GB+; number of records: 480 million+; daily DML volume: about 2 million rows. DML volume is low from 22:00 until 08:00 the next morning, so this idle window can be used for the full initialization.
Hash partitioned on user_id into 16 partitions; primary key: (user_id, serviceid); unique index: (uin, serviceid).

Target-side table user02.user_table:
Hash partitioned on uin into 64 partitions; primary key: (uin, serviceid); unique index: (user_id, serviceid) (according to the current business usage in the production database, queries do not use this index).
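The two table layouts above could be declared roughly as follows. The key columns come from the article; all data types, the constraint/index names, and the placeholder for the remaining columns are assumptions for illustration:

```sql
-- Source side: hash partitioned on user_id, 16 partitions.
CREATE TABLE user01.user_table (
    user_id    NUMBER NOT NULL,
    serviceid  NUMBER NOT NULL,
    uin        NUMBER NOT NULL,
    -- ... remaining business columns ...
    CONSTRAINT pk_user_table PRIMARY KEY (user_id, serviceid)
)
PARTITION BY HASH (user_id) PARTITIONS 16;

CREATE UNIQUE INDEX user01.uk_user_table
    ON user01.user_table (uin, serviceid);

-- Target side: hash partitioned on uin, 64 partitions, keys swapped.
CREATE TABLE user02.user_table (
    user_id    NUMBER NOT NULL,
    serviceid  NUMBER NOT NULL,
    uin        NUMBER NOT NULL,
    -- ... remaining business columns ...
    CONSTRAINT pk_user_table2 PRIMARY KEY (uin, serviceid)
)
PARTITION BY HASH (uin) PARTITIONS 64;

CREATE UNIQUE INDEX user02.uk_user_table2
    ON user02.user_table (user_id, serviceid);
```

Note that the target's unique index (user_id, serviceid) does not contain the partition key uin, so Oracle builds it as a global index by default.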

Limitations:

1. The source table must have a primary key or a non-null unique constraint;
2. DDL on the source table is not supported during synchronization (note: the code can be improved to support DDL);
3. Tables with very frequent DML are not a good fit, because the incremental synchronization may not keep up (note: there is no fixed threshold; it depends on database performance).
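Limitation 1 can be checked in advance against the data dictionary; a minimal sketch:

```sql
-- Verify that the source table has a primary key (constraint_type 'P')
-- or a unique constraint ('U'); for a 'U' constraint, its key columns
-- must additionally be declared NOT NULL.
SELECT c.constraint_name, c.constraint_type
  FROM dba_constraints c
 WHERE c.owner           = 'USER01'
   AND c.table_name      = 'USER_TABLE'
   AND c.constraint_type IN ('P', 'U');
```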

Brief description of the implementation steps:

Note: in the original article, target-side operations were marked in red and source-side operations in green; the colors do not survive here, so the side can be told from the schema names (user01 = source, user02 = target) in each step.

4. Create a synchronization log table (user01.user_table_rep_sync_log) on the source side, which records the DML operation information of the source table;
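The article does not show the log table's definition. A minimal sketch, assuming it only needs to capture the source row's primary-key values, the operation type, and a timestamp (column names and the sequence are assumptions):

```sql
-- Sequence to order the log entries for replay.
CREATE SEQUENCE user01.seq_user_table_rep_log;

-- Hypothetical layout: one row per DML operation on the source table.
CREATE TABLE user01.user_table_rep_sync_log (
    log_id     NUMBER       NOT NULL,
    user_id    NUMBER       NOT NULL,  -- primary-key columns of the source row
    serviceid  NUMBER       NOT NULL,
    dml_type   VARCHAR2(1)  NOT NULL,  -- 'I' insert / 'U' update / 'D' delete
    log_time   DATE         DEFAULT SYSDATE NOT NULL,
    CONSTRAINT pk_user_table_rep_log PRIMARY KEY (log_id)
);
```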
5. Create a trigger (user01.trigger_user_table_rep) on the source side that inserts the captured information into the synchronization log table (note: create the trigger during the business idle window, which reduces the volume of DML log records produced and the impact on the business, and be sure to create it before the full synchronization);
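A minimal sketch of such a trigger, matching the hypothetical log-table layout above. It assumes the primary-key columns are never updated, so :OLD values are logged for deletes and :NEW values otherwise:

```sql
CREATE OR REPLACE TRIGGER user01.trigger_user_table_rep
AFTER INSERT OR UPDATE OR DELETE ON user01.user_table
FOR EACH ROW
DECLARE
    v_type VARCHAR2(1);
BEGIN
    IF INSERTING THEN
        v_type := 'I';
    ELSIF UPDATING THEN
        v_type := 'U';
    ELSE
        v_type := 'D';
    END IF;

    INSERT INTO user01.user_table_rep_sync_log (log_id, user_id, serviceid, dml_type)
    VALUES (user01.seq_user_table_rep_log.NEXTVAL,
            NVL(:NEW.user_id,   :OLD.user_id),    -- :OLD for DELETE, :NEW otherwise
            NVL(:NEW.serviceid, :OLD.serviceid),
            v_type);
END;
/
```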
6. Create a database link (to_xxxxx.localdomain) on the target side for connecting from the target database to the source database, and synchronize the table user01.…
…
11. Monitor the synchronization log table (user01.user_table_rep_sync_log) and observe the amount of data still to be synchronized.
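The database link and the monitoring query from steps 6 and 11 might look like the following. The link name comes from the article; the credentials and connect string are placeholders:

```sql
-- Target side: link to the source database.
CREATE DATABASE LINK to_xxxxx.localdomain
    CONNECT TO user01 IDENTIFIED BY "********"
    USING 'source_db_tns_alias';

-- Step 11: how much backlog still has to be replayed to the target?
SELECT COUNT(*) AS pending_rows
  FROM user01.user_table_rep_sync_log@to_xxxxx.localdomain;
```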

When the business system is cut over, coordinate with the business side as follows: monitor the refresh intervals and pick the moment when the amount of data still to be synchronized is at its smallest, stop the scheduled refresh job, shut down the old application, manually run the synchronization refresh in the target database until no new records appear in the synchronization log table, and then start the new application. This keeps the impact on the business system to a minimum.
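The final manual flush could be driven by a small PL/SQL loop on the target side. Here sync_user_table stands in for the article's replay procedure (which is not shown) and is purely an assumption:

```sql
-- Target side, during cutover: keep replaying until the log is drained.
-- Safe to exit on zero because the old application has already been stopped,
-- so no new log rows can appear.
DECLARE
    v_pending NUMBER;
BEGIN
    LOOP
        user02.sync_user_table;   -- hypothetical replay procedure over the dblink

        SELECT COUNT(*) INTO v_pending
          FROM user01.user_table_rep_sync_log@to_xxxxx.localdomain;

        EXIT WHEN v_pending = 0;
        DBMS_SESSION.SLEEP(5);    -- 18c+; on older releases use DBMS_LOCK.SLEEP
    END LOOP;
END;
/
```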



