The best way to master a tool is to learn its principles and architecture first, and only then put it to use. Otherwise you follow the steps without knowing why each one is there, you cannot locate a problem quickly, and you certainly cannot solve it quickly.
OGG architecture and principles:
The principle of Oracle GoldenGate is quite simple: it extracts changes from the source-side redo logs or archive logs, sends them over TCP/IP to the target side, and then parses and applies them to the target database, which gives you source-to-target replication.
The diagram below shows the flow; the rest of this article explains it in detail.
[Figure: Oracle GoldenGate Architecture (http://s3.51cto.com/wyfs02/M02/4D/CE/wKiom1RaHxKDwQFCAAL3Su1uXr0696.jpg)]
The official documentation is a good place to learn more about OGG:
http://www.oracle.com/technetwork/middleware/goldengate/documentation/index.html
The official documents give the authoritative, detailed explanation, so I will not translate them here. Instead I explain the structure of OGG as I understand it; if anything is wrong, please point it out so that we can improve together.
Understanding the processes:
I think of an OGG instance as a football team: my own team, Real Madrid.
1) Manager process --- the midfield metronome (Luka Modric, Kroos, "Brother Long" [though he is no longer here])
As the name implies, the Manager process is the management process of an OGG instance. It runs on both the source and the target side and controls the other OGG processes: starting and restarting them, monitoring them, reporting errors and events, allocating space, and so on. Each OGG instance requires a Manager process. The Manager is like the midfield general on the pitch, setting the team's rhythm and organizing both defence and attack.
2) Extract process --- the defensive all-stars (the "Water Master", the "Monk", Carvajal, the "team pet", the "Golden Retriever")
The Extract process runs on the source side and is responsible for capturing data from the source tables or logs. What Extract does depends on the phase. In the initial-load phase, Extract reads data directly from the source tables to initialize the target, so that source and target hold the same data. In the change-synchronization phase, Extract captures the changes (DML and DDL) made in the source database so that they can be transmitted and applied to the target side. It is like the "Water Master" and the "Monk" in defence: they deal with whatever changes on the pitch, win the ball, and lay it off to a full-back or the midfield, who carry it forward to set up the attack. Sometimes the "Water Master" goes straight up front himself, just as, when no data pump is configured, the Extract process sends the captured data directly to the remote trail file on the target side.
Checkpoint mechanism --- the goalkeeper (Casillas)
Sooner or later the Extract process terminates abnormally, the server goes down unexpectedly, or the network is interrupted. When Extract restarts, how does GG know which data has already been synchronized and which has not? This is what checkpoints are for. The Extract process uses its built-in checkpoint mechanism to record its read and write positions periodically; these positions are kept in checkpoint files on disk, not in the trail itself. After a restart, Extract reads its last checkpoint and continues synchronizing from there, which prevents data loss. It is just like Saint Casillas, the team's last line of insurance: when the defensive all-stars lose concentration, he steps up to defend the team's honour, keeps the goal safe, and launches the counter-attack.
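To make the checkpoint idea concrete, here is a minimal Python sketch. It is not GoldenGate's implementation or file format; the class `ToyExtract`, the file name `checkpoint.json`, and the record layout are all made up for illustration. A list stands in for the redo log, another list for the trail, and the read position is saved after every batch so that a restarted process resumes where the previous one stopped.

```python
import json
import os
import tempfile


class ToyExtract:
    """Toy stand-in for a capture process (hypothetical, not GoldenGate's logic)."""

    def __init__(self, source_log, trail, checkpoint_path):
        self.source_log = source_log            # list standing in for the redo log
        self.trail = trail                      # list standing in for the trail
        self.checkpoint_path = checkpoint_path  # hypothetical checkpoint file

    def load_position(self):
        # Resume from the last saved read position, or from 0 on first start.
        if os.path.exists(self.checkpoint_path):
            with open(self.checkpoint_path) as f:
                return json.load(f)["read_pos"]
        return 0

    def save_position(self, read_pos):
        # Write to a temp file and rename it, so a crash never leaves half a file.
        tmp = self.checkpoint_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"read_pos": read_pos}, f)
        os.replace(tmp, self.checkpoint_path)

    def run(self, batch_size=2, crash_after=None):
        pos = self.load_position()
        batches = 0
        while pos < len(self.source_log):
            if crash_after is not None and batches == crash_after:
                raise RuntimeError("simulated crash")
            batch = self.source_log[pos:pos + batch_size]
            self.trail.extend(batch)
            pos += len(batch)
            self.save_position(pos)
            batches += 1
        return pos


def demo():
    log = ["op1", "op2", "op3", "op4", "op5"]
    trail = []
    with tempfile.TemporaryDirectory() as d:
        ckpt = os.path.join(d, "checkpoint.json")
        try:
            ToyExtract(log, trail, ckpt).run(crash_after=1)
        except RuntimeError:
            pass  # the first process died after one batch
        ToyExtract(log, trail, ckpt).run()  # the restart resumes at position 2
    return trail
```

Calling `demo()` simulates one crash and one restart; the trail ends up with every operation exactly once, because the second run starts from the saved position instead of from the beginning.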
3) Data pump process --- the midfielders (Modric, Isco)
The data pump runs on the source side. If the source uses a local trail file, the data pump reads that trail and sends its contents in blocks over TCP/IP to the target side; this is usually the safest approach. If no local trail is configured on the source, the Extract process sends the data straight to the target after extracting it. It is like Modric's outside-of-the-foot passes and Isco's glued-to-the-boot ball control: they make sure the ball gets through midfield and reaches the feet of the forwards.
Server Collector process --- the number 10s (James Rodríguez, Benzema)
The Server Collector process needs no configuration, so it is effectively transparent and needs no special attention. It runs on the target side as the counterpart of the source-side data pump: its job is to receive the data blocks sent by the Extract or data pump process and reassemble them into a trail file, the so-called remote trail file. James, a die-hard Cristiano Ronaldo fan, takes the ball and plays the final pass. Why include Benzema? I think Benzema is the striker who is currently best at setting up teammates, and arguably the one best suited to Real Madrid.
4) Replicat process --- the killer blow (Cristiano Ronaldo, the "Great Sage")
The Replicat process, also called the apply process, is responsible for reading the contents of the target-side trail file, or the content passed on directly by the Extract process, parsing it into DML/DDL operations, and applying them to the target database. It is like Cristiano Ronaldo and the "Great Sage": always the last link in the team's attack, putting the ball into the opponent's goal to land the final blow.
Like Extract, the Replicat process has its own checkpoint mechanism, which ensures that after a restart it resumes from the last recorded position without risk of data loss.
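The apply step can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the function `apply_trail`, the record fields `op`, `key` and `row`, and the dict standing in for the target table are all hypothetical, and real trail records look nothing like this. The function applies records in order from a given start position and returns the position that would be saved as the next checkpoint.

```python
def apply_trail(trail, target, start=0):
    """Apply trail[start:] in order to target; return the position to checkpoint."""
    for record in trail[start:]:
        op, key = record["op"], record["key"]
        if op in ("INSERT", "UPDATE"):
            target[key] = record["row"]   # write the new row image
        elif op == "DELETE":
            target.pop(key, None)         # remove the row if it is present
        else:
            raise ValueError(f"unknown operation: {op}")
    return len(trail)


# Hypothetical trail contents, for illustration only.
trail = [
    {"op": "INSERT", "key": 1, "row": {"name": "a"}},
    {"op": "UPDATE", "key": 1, "row": {"name": "b"}},
    {"op": "INSERT", "key": 2, "row": {"name": "c"}},
    {"op": "DELETE", "key": 2},
]
target = {}
pos = apply_trail(trail[:2], target)         # first run sees two records
pos = apply_trail(trail, target, start=pos)  # restart continues from position 2
```

Passing the saved position back in as `start` is what keeps the second run from applying the first two records again.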
5) GGSCI --- the coach (Ancelotti)
GGSCI is short for GoldenGate Software Command Interface, the GG command-line interface used to carry out all kinds of operations. It is like Ancelotti: there is not much to say, except that when things go wrong he reaches for the quick-acting heart pills.
Points to be aware of:
I have only collected a small part here, so some things are certainly missing and some may be wrong; additions and corrections are welcome.
1: Memory
A. The number of concurrent processes determines how much memory GG needs.
B. The source side requires at least one Extract process and one data pump process.
   The target side requires at least one Replicat process.
   Each GG instance requires a Manager process.
   On each GG instance, GGSCI supports up to 300 Extract and Replicat processes; each Extract or Replicat process needs roughly 25-55 MB of memory.
C. The GG cache manager uses virtual memory to keep GG running normally, so the system must provide a certain amount of swap space.
-- Determine the size of the required swap space as follows:
(1) Start up one Extract or Replicat.
(2) Run GGSCI.
(3) View the report file and find the line PROCESS VM AVAIL FROM OS (min).
(4) Round up the value to the next full gigabyte if needed. For example, round up 1.76 GB to 2 GB.
(5) Multiply that value by the number of Extract and Replicat processes that will be running. The result is the maximum amount of swap space that could be required. To determine the number of processes you will need, consult the configuration chapters in the Oracle GoldenGate Windows and UNIX Administrator's Guide.
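Steps (4) and (5) are simple arithmetic, sketched here in Python. The function name `estimate_max_swap_gb` is my own, and the process count of 4 in the usage line is just an example figure, not a recommendation.

```python
import math


def estimate_max_swap_gb(vm_avail_gb, process_count):
    """Round the per-process value up to a whole GB, then multiply by the process count."""
    if vm_avail_gb < 0 or process_count < 0:
        raise ValueError("inputs must be non-negative")
    per_process_gb = math.ceil(vm_avail_gb)   # step (4): 1.76 GB becomes 2 GB
    return per_process_gb * process_count     # step (5): multiply by the process count


# Example: the report shows 1.76 GB and 4 Extract/Replicat processes are planned.
swap_gb = estimate_max_swap_gb(1.76, 4)
```

With the 1.76 GB figure from step (4) and 4 processes, the estimate is 2 GB x 4 = 8 GB of swap at most.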
2: Hard disk space
A. The GG software itself requires 50-150 MB of space, depending on the database and the GG version.
B. The working directory of each GG instance needs about 80 MB of space.
C. If you deploy OGG on a cluster, the GG binaries and files must be placed on shared storage.
D. The data pump process may die while the Extract process keeps capturing data, so the trail files keep growing. Allow at least 1 GB of space to hold the trail files.
   For trail files on the target side, the space needed depends on the PURGEOLDEXTRACTS parameter.
3: Cluster environment
For a RAC cluster, GG must be installed on a shared device so that it can be started from any node; when one node fails, GG can be started directly from another node without any reconfiguration.
4: Network
A. By default, GG ports start from 7840.
B. When configuring the Manager process, you can specify a set of ports through parameters to meet OGG's needs.
   The official documentation describes this as a range of ports for local Oracle GoldenGate communications: either the default range starting at port 7840 or a customized range of other ports.
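Before settling on a port range for the Manager, it can help to see which ports are actually free on the host. The Python sketch below is a generic check that has nothing to do with GoldenGate's own tooling; the function name `free_ports` and the default of ten ports starting at 7840 are assumptions for illustration. It simply tries to bind each port and reports the ones that succeed.

```python
import socket


def free_ports(start=7840, count=10, host="127.0.0.1"):
    """Return the ports in [start, start + count) that can currently be bound on host."""
    available = []
    for port in range(start, start + count):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind((host, port))
            except OSError:
                continue  # already in use, or binding is not permitted
            available.append(port)
    return available
```

A port that is free at the moment of the check can still be taken a second later by another program, so treat the result as a hint, not a reservation.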
5: System permissions
It is best to create a dedicated GG user with read and write permissions on the GG installation directory; you can also use the oracle user.
6: Database configuration
A. For Oracle 10g, you must set BEQUEATH_DETACH=TRUE in sqlnet.ora.
7: Using ASM
If ASM is used, the Manager must be able to access the ASM instance.
8: Environment variables
A. To run GG on a 32-bit Linux system, you need to set LD_LIBRARY_PATH so that it includes the 32-bit Oracle libraries.
B. If there is a single database instance, you can set ORACLE_HOME and ORACLE_SID at the system level.
   If there are multiple instances, use SETENV in the Extract and Replicat parameter files to specify ORACLE_HOME and ORACLE_SID:
   SETENV (ORACLE_HOME = "<path to Oracle home location>")
   SETENV (ORACLE_SID = "<SID>")
   With multiple instances, every Extract and Replicat process needs its own SETENV settings.
   SETENV parameters override the system-level settings.
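As a small illustration of the multi-instance case, the Python sketch below renders the two SETENV lines for one process and models the override rule as a plain dictionary merge. The helper names `setenv_lines` and `effective_env`, the paths, and the SIDs are all invented for the example; only the SETENV line format comes from the text above.

```python
def setenv_lines(oracle_home, oracle_sid):
    """Render the two SETENV parameter lines for one Extract or Replicat."""
    return [
        f'SETENV (ORACLE_HOME = "{oracle_home}")',
        f'SETENV (ORACLE_SID = "{oracle_sid}")',
    ]


def effective_env(system_env, process_setenv):
    """Model the override rule: SETENV values win over system-level values."""
    merged = dict(system_env)       # start from the system-level settings
    merged.update(process_setenv)   # per-process SETENV values take precedence
    return merged


# Hypothetical system-level settings and one per-process override.
system_env = {"ORACLE_HOME": "/u01/app/oracle/db_a", "ORACLE_SID": "DBA"}
override = {"ORACLE_HOME": "/u01/app/oracle/db_b", "ORACLE_SID": "DBB"}

lines = setenv_lines(override["ORACLE_HOME"], override["ORACLE_SID"])
env = effective_env(system_env, override)
```

Here the process that carries the two rendered lines would see the `db_b` home and the `DBB` SID, while any process without SETENV keeps the system-level values.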
This article is from the "Dbguy" blog; please keep this source attribution: http://dbguy.blog.51cto.com/8921728/1572724
Oracle GoldenGate Real Madrid's first play: to master a tool, you must first master its principles