GoldenGate Extract Process Splitting and Merging Principles
1. Documentation Overview
1.1. Document Description
This document describes the basic principles and detailed steps for splitting and merging GoldenGate extract processes.
1.2. Reader scope
This document is intended for staff involved in data-level disaster-recovery applications, and must be strictly followed throughout the entire life cycle of GoldenGate implementation and operation.
1.3. Terminology description
No. | Abbreviation | Full term
1 | GG or OGG | Oracle GoldenGate
2. Splitting of the extraction process
2.1. Principle of splitting
Principle 1: Keep the number of extract processes minimal; in principle only one extract process should be configured. If a single extract process cannot meet performance requirements, split it according to the guidelines below.
Principle 2: Ensure data and business consistency by configuring processes along business-coupling lines: objects with business coupling must be configured in the same extract process.
Principle 3: Configure processes by redo log volume: estimated hourly, at business peak a single extract process should handle no more than 40 GB of redo per hour; otherwise split the process so that each extract process handles no more than 40 GB of redo per hour.
Principle 4: Principle 2 takes priority over Principle 3.
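The redo-volume rule in Principle 3 can be sketched as a small calculation. This is an illustrative example only, not a GoldenGate tool; the function name and the input value are invented for the illustration.

```python
# Illustrative sketch of Principle 3: given the redo volume observed at
# business peak, estimate how many extract processes are needed so that no
# single process handles more than 40 GB of redo per hour.
# extracts_needed() is a hypothetical helper, not part of GoldenGate.

import math

PEAK_REDO_LIMIT_GB = 40  # per extract process, per hour (Principle 3)

def extracts_needed(peak_redo_gb_per_hour: float) -> int:
    """Minimum number of extract processes for the observed peak redo rate."""
    if peak_redo_gb_per_hour <= 0:
        return 1
    return max(1, math.ceil(peak_redo_gb_per_hour / PEAK_REDO_LIMIT_GB))

# Example: 95 GB/hour of redo at peak suggests splitting into 3 extracts.
print(extracts_needed(95))   # -> 3
print(extracts_needed(30))   # -> 1
```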
2.2. Details of the split
1. If there is business coupling between schemas, configure the extract processes so that the coupled tables stay within one extract process.
2. If there is no business coupling between schemas, use the schema as the smallest unit when configuring extract processes.
3. If there is business coupling between tables inside a schema, ensure those tables stay within a single extract process.
4. Data-pump processes and extract processes must correspond one-to-one.
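The grouping rules above can be sketched as a simple union-find over coupling relationships. This is a hypothetical illustration, assuming coupling is supplied as pairs of schema names; the function and schema names are invented.

```python
# Hypothetical sketch of the split details above: schemas are the smallest
# split unit, and schemas with business coupling must share one extract
# process. Implemented as a simple union-find over coupling pairs.

def group_schemas(schemas, couplings):
    """Return groups of schemas; each group maps to one extract process."""
    parent = {s: s for s in schemas}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    for a, b in couplings:          # coupled schemas -> same extract
        parent[find(a)] = find(b)

    groups = {}
    for s in schemas:
        groups.setdefault(find(s), set()).add(s)
    return sorted(sorted(g) for g in groups.values())

# HR and FIN are coupled, so they land in one extract; SALES gets its own.
print(group_schemas(["HR", "FIN", "SALES"], [("HR", "FIN")]))
# -> [['FIN', 'HR'], ['SALES']]
```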
2.3. Splitting steps
1. Stop Extract Process
2. Stop DataPump Process
GGSCI> INFO datapump_name
EXTRACT    DPEF      Last Started 2011-01-28 12:34   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:05 ago)
Log Read Checkpoint  File ./dirdat/ef000010
                     2011-01-28 12:47:45.000000  RBA 148645
Repeat INFO until the RBA number no longer changes; only then may the process be stopped.
3. Stop Replicat Process
GGSCI> INFO replicat_name
REPLICAT   RPEF      Last Started 2011-01-28 12:30   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:05 ago)
Log Read Checkpoint  File ./dirdat/ef000006
                     2011-01-28 12:47:45.000000  RBA 149258
Repeat INFO until the RBA number no longer changes; only then may the process be stopped.
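The "wait until the RBA stops changing" check in steps 2 and 3 can be sketched as a small parsing helper. This is a hypothetical illustration, not a GoldenGate utility; the function names are invented and the INFO output line is taken from the sample above.

```python
# Hypothetical helper: parse the RBA from "INFO <process>" output and report
# whether two successive samples match. Per the steps above, a pump or
# replicat may be stopped only once its RBA no longer advances, i.e. it has
# fully processed its trail file.

import re

RBA_PATTERN = re.compile(r"RBA\s+(\d+)")

def read_rba(info_output: str) -> int:
    """Extract the checkpoint RBA number from INFO command output."""
    match = RBA_PATTERN.search(info_output)
    if match is None:
        raise ValueError("no RBA found in INFO output")
    return int(match.group(1))

def safe_to_stop(previous_output: str, current_output: str) -> bool:
    """True when the RBA is unchanged between two INFO samples."""
    return read_rba(previous_output) == read_rba(current_output)

sample = "2011-01-28 12:47:45.000000  RBA 148645"
print(read_rba(sample))               # -> 148645
print(safe_to_stop(sample, sample))   # -> True
```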
4. Record Extract checkpoint
Extract checkpoints include the Recovery Checkpoint and the Current Checkpoint:
GGSCI> INFO extract_name, SHOWCH
EXTRACT    EXEE      Last Started 2011-01-28 09:58   Status STOPPED
Checkpoint Lag       00:00:00 (updated 00:01:02 ago)
Log Read Checkpoint  Oracle Redo Logs
                     2011-01-28 10:02:16  Seqno 26, RBA 7090688
Current Checkpoint Detail:
Read Checkpoint #1
Oracle Redo Log
Startup Checkpoint (starting position in the data source):
Sequence #: 26
RBA: 289296
Timestamp: 2011-01-28 09:27:31.000000
Redo File: C:\oracle\product\10.2.0\oradata\orcl\REDO02.LOG
Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Sequence #: 26
RBA: 7088144
Timestamp: 2011-01-28 10:02:16.000000
Redo File: C:\oracle\product\10.2.0\oradata\orcl\REDO02.LOG
Current Checkpoint (position of last record read in the data source):
Sequence #: 26
RBA: 7090688
Timestamp: 2011-01-28 10:02:16.000000
Redo File: C:\oracle\product\10.2.0\oradata\orcl\REDO02.LOG
Write Checkpoint #1
GGS Log Trail
Current Checkpoint (current write position):
Sequence #: 11
RBA: 31609
Timestamp: 2011-01-28 10:02:19.072000
Extract Trail: ./dirdat/ee
Header:
Version = 2
Record Source = A
Type = 4
# Input checkpoints = 1
# Output checkpoints = 1
File Information:
Block Size = 2048
Max Blocks = 100
Record Length = 2048
Current Offset = 0
Configuration:
Data Source = 3
Transaction Integrity = 1
Task Type = 0
Status:
Start time = 2011-01-28 09:58:34
Last Update time = 2011-01-28 10:02:19
Stop Status = G
Last Result = 400
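Recording the two checkpoints from the SHOWCH output above can be sketched as a small parser. This is a hypothetical helper, not a GoldenGate utility; the function name is invented and the sample text is condensed from the output above.

```python
# Hypothetical sketch: record the Recovery and Current checkpoints from
# "INFO <extract>, SHOWCH" output so they can later be re-applied to the new
# extract with ALTER EXTRACT ... EXTSEQNO/EXTRBA and IOEXTSEQNO/IOEXTRBA.

import re

CHECKPOINT_RE = re.compile(
    r"(Recovery|Current) Checkpoint.*?Sequence #:\s*(\d+).*?RBA:\s*(\d+)",
    re.DOTALL | re.IGNORECASE,
)

def read_checkpoints(showch_output: str) -> dict:
    """Map checkpoint kind -> (sequence, rba)."""
    result = {}
    for kind, seq, rba in CHECKPOINT_RE.findall(showch_output):
        result.setdefault(kind.lower(), (int(seq), int(rba)))
    return result

sample = """
Recovery Checkpoint (position of oldest unprocessed transaction):
Sequence #: 26
RBA: 7088144
Current Checkpoint (position of last record read):
Sequence #: 26
RBA: 7090688
"""
print(read_checkpoints(sample))
# -> {'recovery': (26, 7088144), 'current': (26, 7090688)}
```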
5. Edit the original extract's parameter file and remove the tables that are being split out.
6. Add the new Extract, DataPump and Replicat
-- Source --------------------------------------------------
GGSCI (win2k364) 15> ADD EXTRACT EXEF, TRANLOG, BEGIN NOW
EXTRACT added.
GGSCI (win2k364) 16> ADD EXTTRAIL ./dirdat/ef, EXTRACT EXEF, MEGABYTES 50
EXTTRAIL added.
GGSCI (win2k364) 17> ADD EXTRACT DPEF, EXTTRAILSOURCE ./dirdat/ef
EXTRACT added.
GGSCI (win2k364) 18> ADD RMTTRAIL ./dirdat/ef, EXTRACT DPEF, MEGABYTES 50
RMTTRAIL added.
-- Target --------------------------------------------------
GGSCI (win2k364) 21> ADD REPLICAT RPEF, EXTTRAIL ./dirdat/ef
REPLICAT added.
7. Modify the checkpoints of the new extract process
Set the two checkpoints recorded above: the Current read checkpoint and the Recovery checkpoint.
-- Modify the Current read checkpoint
GGSCI (win2k364) 30> ALTER EXTRACT EXEF, EXTSEQNO 26, EXTRBA 7090688 [, THREAD n]
EXTRACT altered.
-- Modify the Recovery checkpoint
GGSCI (win2k364) 4> ALTER EXTRACT EXEF, IOEXTSEQNO 26, IOEXTRBA 7088144 [, THREAD n]
2011-01-28 10:46:18  INFO  OGG-00989  WARNING: Unsupported operation. This might cause transactional inconsistency. Modifying iocheckpoint: ioseq = 26 iorba = 7088144.
Are you sure you want to continue? Y
EXTRACT altered.
8. Confirm that all parameter files are correct, then start the processes.
3. Merging of extraction processes
3.1. Principles of merging
Principle 1: After merging, keep the number of extract processes as small as possible; in principle only one extract process should be configured. If a single extract process cannot meet performance requirements, apply the principles below.
Principle 2: Merge by business coupling: objects with business coupling must be configured in the same extract process.
Principle 3: Merge by redo log volume: after merging, a single extract process should handle no more than 40 GB of redo per hour at business peak.
Principle 4: Principle 2 takes priority over Principle 3.
3.2. Merge rules
When merging extract processes, follow these guidelines (sorted by priority):
1. If one schema corresponds to multiple extract processes and there is business coupling between the objects in them, the processes must be merged.
2. If the extract processes are divided by schema but objects across schemas have business coupling, the processes must be merged.
3. Even if there is no business coupling between extract processes, if there are too many extract processes and they have a noticeable performance impact on the production system (I/O, CPU, memory, etc.), merging is recommended. After merging, a single extract process should handle no more than 40 GB of redo per hour at business peak.
4. Even if there is no business coupling between extract processes, if a single extract process handles less than 10 GB of log per hour, merging is recommended, provided that after the merge a single extract process handles no more than 40 GB of redo per hour at business peak.
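The size-based merge rules (3 and 4) can be sketched as a simple greedy packing. This is an illustrative sketch only; the function and process names are invented, and the redo figures are sample values. Coupling-based merges (Principle 2) would be applied first, before this size rule.

```python
# Hypothetical sketch of merge rules 3 and 4: greedily pack extract
# processes (by peak redo GB/hour) into merged groups, keeping every merged
# group under the 40 GB/hour ceiling from Principle 3.

LIMIT_GB = 40

def plan_merges(redo_gb_per_hour: dict) -> list:
    """Group extract names so each group's combined peak redo stays <= LIMIT_GB."""
    groups = []
    # Largest first (first-fit-decreasing) keeps the group count small.
    for name, gb in sorted(redo_gb_per_hour.items(), key=lambda kv: -kv[1]):
        for group in groups:
            if group["gb"] + gb <= LIMIT_GB:
                group["names"].append(name)
                group["gb"] += gb
                break
        else:
            groups.append({"names": [name], "gb": gb})
    return [sorted(g["names"]) for g in groups]

# Four small extracts (30, 8, 6, 5 GB/h) can be merged into two processes.
print(plan_merges({"EXTA": 8, "EXTB": 6, "EXTC": 5, "EXTD": 30}))
# -> [['EXTA', 'EXTD'], ['EXTB', 'EXTC']]
```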
3.3. Merge steps
This method uses the event mechanism provided by GoldenGate to make all extract processes stop at the same position; the extracts can then be merged.
In each extract parameter file, add a TABLE table_name, EVENTACTIONS (STOP) entry to ensure that the Recovery checkpoints of all extract processes stop on the same SCN, and the Current checkpoints likewise stop on the same SCN. The event table must be created manually on the source side (its columns can be arbitrary). Once the extracts have been merged, the table can be dropped by hand with no effect on the existing application system.
1. Stop all extract processes
2. Stop all DataPump processes
GGSCI> INFO datapump_name
EXTRACT    DPEF      Last Started 2011-01-28 12:34   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:05 ago)
Log Read Checkpoint  File ./dirdat/ef000010
                     2011-01-28 12:47:45.000000  RBA 148645
Using the Logdump tool and the INFO output above, confirm that the last trail file has been fully processed: position Logdump after the RBA shown and verify that no records follow it. Only then may the process be stopped.
After confirming each DataPump this way, stop all DataPump processes.
3. Stop all Replicat processes
GGSCI> INFO replicat_name
REPLICAT   RPEF      Last Started 2011-01-28 12:30   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:05 ago)
Log Read Checkpoint  File ./dirdat/ef000006
                     2011-01-28 12:47:45.000000  RBA 149258
Using the Logdump tool and the INFO output above, confirm that the last trail file has been fully processed: position Logdump after the RBA shown and verify that no records follow it. Only then may the process be stopped.
After confirming each Replicat this way, stop all Replicat processes.
4. Create the ea_commit table under the gg_user user in the source database
SQL> conn gg_user/****
SQL> create table ea_commit (time date);
5. Add the following line to the parameter file of every extract process on the source database:
TABLE gg_user.ea_commit, EVENTACTIONS (STOP);
6. Start all Extract, DataPump and Replicat processes normally.
7. Use the event mechanism to make all extracts stop at the same position by executing the following SQL in SQL*Plus:
SQL> conn gg_user/****
SQL> insert into gg_user.ea_commit values (sysdate);
SQL> commit;
Note that the commit must be executed to ensure that all extract processes stop on the same checkpoint.
8. At this point, the INFO EXT* command should show that all extract processes have stopped.
If not, the EVENTACTIONS setting is incorrect; repeat steps 4-7 above.
If all extracts have stopped, use the INFO EXT*, SHOWCH command to check whether all Recovery checkpoints stop on the same Sequence # and RBA, and all Current checkpoints stop on the same Sequence # and RBA. If both do, continue with the steps below; otherwise repeat steps 4-7 above.
GGSCI> INFO EXT*, SHOWCH
... (part of the output omitted)
Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Sequence #: 26
RBA: 7088144
Timestamp: 2011-01-28 10:02:16.000000
Redo File: C:\oracle\product\10.2.0\oradata\orcl\REDO02.LOG
Current Checkpoint (position of last record read in the data source):
Sequence #: 26
RBA: 7090688
Timestamp: 2011-01-28 10:02:16.000000
Redo File: C:\oracle\product\10.2.0\oradata\orcl\REDO02.LOG
... (part of the output omitted)
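The alignment check in step 8 can be sketched as a small comparison over the recorded checkpoints. This is a hypothetical helper, not a GoldenGate utility; the function name is invented, and checkpoints are represented as plain (sequence, RBA) tuples.

```python
# Hypothetical helper for the step-8 check: the extracts are mergeable only
# when every Recovery checkpoint shares one (sequence, RBA) position and
# every Current checkpoint shares another.

def all_aligned(checkpoints: list) -> bool:
    """checkpoints: list of {'recovery': (seq, rba), 'current': (seq, rba)}."""
    recoveries = {cp["recovery"] for cp in checkpoints}
    currents = {cp["current"] for cp in checkpoints}
    return len(recoveries) == 1 and len(currents) == 1

aligned = [
    {"recovery": (26, 7088144), "current": (26, 7090688)},
    {"recovery": (26, 7088144), "current": (26, 7090688)},
]
drifted = aligned + [{"recovery": (26, 7088200), "current": (26, 7090688)}]
print(all_aligned(aligned))   # -> True
print(all_aligned(drifted))   # -> False
```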
9. Stop all DataPump processes, following the method in step 2 above.
10. Stop all Replicat processes, following the method in step 3 above.
11. Stop the MGR processes on both the source and target side.
12. Take a complete backup of the GoldenGate installation directory on both the source and target side, then empty the dirdat directory.
13. Copy and rename the original extract parameter file, then modify its contents as follows:
EXTRACT extract_newname
...
DISCARDFILE ./dirrpt/extract_newname.dsc, APPEND, MEGABYTES 1000
...
EXTTRAIL ./dirdat/xx
(where xx is the new trail file prefix)
TABLE ...
Merge the TABLE entries of all the extract processes together.
Alternatively, you can merge all the extract processes into one of the original processes, without changing the process name.
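Merging the TABLE lists of several parameter files, as step 13 requires, can be sketched as follows. This is an illustrative sketch only; the function name and the parameter-file text are invented for the example.

```python
# Hypothetical sketch for step 13: collect the TABLE entries of all original
# extract parameter files and emit one de-duplicated list for the merged
# extract's parameter file.

def merge_table_lists(param_files: list) -> list:
    """Union of TABLE entries across parameter files, order preserved."""
    seen = []
    for text in param_files:
        for line in text.splitlines():
            line = line.strip()
            if line.upper().startswith("TABLE ") and line not in seen:
                seen.append(line)
    return seen

exta = "EXTRACT EXTA\nEXTTRAIL ./dirdat/aa\nTABLE hr.emp;\nTABLE hr.dept;"
extb = "EXTRACT EXTB\nEXTTRAIL ./dirdat/bb\nTABLE fin.ledger;\nTABLE hr.emp;"
print(merge_table_lists([exta, extb]))
# -> ['TABLE hr.emp;', 'TABLE hr.dept;', 'TABLE fin.ledger;']
```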
14. Add the new Extract, DataPump and Replicat
-- Source --------------------------------------------------
GGSCI> ADD EXTRACT extract_newname, TRANLOG, BEGIN NOW [, THREADS n]
EXTRACT added.
GGSCI> ADD EXTTRAIL ./dirdat/xx, EXTRACT extract_newname, MEGABYTES 50
EXTTRAIL added.
(where xx is the new trail file prefix)
Add the DataPump process, specifying the new trail file prefix xx.
Edit its parameter file and modify the following:
EXTRACT datapump_newname
...
RMTTRAIL ./dirdat/xx
TABLE ...
(copy in the merged TABLE list of the extract process)
GGSCI> ADD EXTRACT datapump_newname, EXTTRAILSOURCE ./dirdat/xx
EXTRACT added.
GGSCI> ADD RMTTRAIL ./dirdat/xx, EXTRACT datapump_newname, MEGABYTES 50
RMTTRAIL added.
-- Target --------------------------------------------------
For each Replicat process, delete the original Replicat process, re-add it, and specify the new trail file prefix xx:
GGSCI> DELETE REPLICAT replicat_name
GGSCI> ADD REPLICAT replicat_name, EXTTRAIL ./dirdat/xx
REPLICAT added.
If there are multiple Replicat processes, modify each of them as described above.
15. Modify the checkpoints of the new extract process
If, in step 13, you merged into one of the original processes, its checkpoint is already correct (verify with the INFO EXTRACT ..., SHOWCH command) and no checkpoint modification is needed.
-- Modify the Current read checkpoint
GGSCI> ALTER EXTRACT extract_newname, EXTSEQNO 26, EXTRBA 7090688 [, THREAD n]
EXTRACT altered.
-- Modify the Recovery checkpoint
GGSCI> ALTER EXTRACT extract_newname, IOEXTSEQNO 26, IOEXTRBA 7088144 [, THREAD n]
2011-01-28 10:46:18  INFO  OGG-00989  WARNING: Unsupported operation. This might cause transactional inconsistency. Modifying iocheckpoint: ioseq = 26 iorba = 7088144.
Are you sure you want to continue? Y
EXTRACT altered.
16. Start the new Extract process.
17. Start the new DataPump process.
18. If everything works properly, delete all pre-merge Extract and DataPump processes:
GGSCI> DELETE EXTRACT extract_oldname
19. Start the new Replicat process.
20. If everything works properly, delete all pre-merge Replicat processes.
21. Delete the ea_commit table under gg_user by executing the following in SQL*Plus:
SQL> conn gg_user/****
SQL> drop table gg_user.ea_commit;
3.4. Precautions
Extract process merging is usually done via reinitialization. Given the large data volumes currently carried by the national-network applications, note the following points during the merge to save working time:
1. While the extract processes are being merged, deleting database archive log files is forbidden.
2. Before the merge, take a full backup of all files under the GoldenGate installation directory on both the source and target side.
3. Before the merge, comment out the AUTOSTART, AUTORESTART and PURGEOLDEXTRACTS parameters in the Manager parameter file.
4. Splitting of the replication process
4.1. Principle of splitting
Principle 1: Replicat and extract processes should correspond one-to-one as far as possible; one extract process may correspond to multiple Replicat processes.
Principle 2: If a Replicat has performance problems, consider database performance tuning first.
Principle 3: Keep the number of Replicat processes minimal; if a single Replicat process cannot meet performance requirements, split it according to the guidelines below. Note that Replicat performance depends not only on how the OGG processes are split but also on the database itself (for example, whether tables are indexed), so database optimization should be considered first and process splitting second.
Principle 4: Balance the load across all Replicat processes.
4.2. Split details (requires Oracle input)
4.3. Split step (requires Oracle input)
Points to consider when splitting:
1. Investigate large transactions.
2. Identify hot tables and large tables.
3. Use the GGSCI STATS REP* command to observe data change volumes.
4. Check database record counts and change volumes.
5. Match the number of split processes to the number of CPUs.