Oracle GoldenGate (OGG) diagnostics

Source: Internet
Author: User

This document briefly lists the problems most frequently seen when using Oracle GoldenGate (OGG), together with the diagnostic steps and tools for each. It describes the key files used to find the cause of failure of an OGG group. Understanding how OGG captures and transfers data helps you understand why problems occur and how to avoid them. A correct installation and configuration of OGG, combined with its logs, goes a long way toward solving a problem.

Note:
* Group is the official term for an Extract or Replicat process. From here on, this document uses "group" to mean either one.

Common OGG Problems
The following lists typical problems that may occur when using OGG. See the detailed Common OGG Problems section later in this document for an expanded description of each.
• Database errors and database problems
• Non-database errors
• OGG group hangs
• Performance problems
• Abnormal termination of an OGG group
• Hardware faults (disks and servers) and corruption

Investigating problems
The files and information listed below help when investigating an OGG problem. We recommend that you familiarize yourself with these files and their locations. The reference manual in the official documentation describes how to locate and interpret them. For example, the report file is almost always needed, and most of its content is self-explanatory.
• Report file: useful for nearly all OGG problems, because the errors and warnings in it are mostly self-explanatory. Check this file frequently.
• OGG error log (ggserr.log): a summary of errors, GGSCI command history, and timestamps of events such as process stops and starts.
• Trail files examined with Logdump: detailed analysis of the records captured by the OGG Extract process.
• Discard file: a text display of the records that caused problems.
• Database alert log: useful for database-related issues other than data errors such as ORA-1403 and ORA-0001.
• System logs: useful for performance and resource problems.
• Database performance tools, health checks, and AWR reports.
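
As a rough illustration of how the error log can be summarized, the sketch below counts WARNING and ERROR events per message code. It assumes each ggserr.log line starts with a timestamp, a severity, and an OGG message code; that layout and the sample lines are assumptions made for the example, so adjust the pattern to match the log on your system.

```python
import re
from collections import Counter

# Assumed line layout: "<date> <time>  <SEVERITY>  OGG-<code>  <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<severity>INFO|WARNING|ERROR)\s+"
    r"(?P<code>OGG-\d+)\s+(?P<message>.*)$"
)


def parse_ggserr(lines):
    """Yield a dict for each line that matches the assumed layout."""
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            yield match.groupdict()


def summarize(lines):
    """Count WARNING and ERROR events per (severity, message code)."""
    counts = Counter()
    for event in parse_ggserr(lines):
        if event["severity"] in ("WARNING", "ERROR"):
            counts[(event["severity"], event["code"])] += 1
    return counts


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    "2024-01-15 10:02:11  INFO    OGG-00987  GGSCI command (oracle): start extract ext1.",
    "2024-01-15 10:05:42  WARNING OGG-01004  Aborted grouped transaction on 'SCHEMA1.TABLE1'.",
    "2024-01-15 10:05:43  ERROR   OGG-01296  Error mapping from SCHEMA1.TABLE1 to SCHEMA1.TABLE1.",
]

if __name__ == "__main__":
    for (severity, code), count in sorted(summarize(SAMPLE).items()):
        print(severity, code, count)
```

To run it against a real log, pass an open file object to `summarize` instead of `SAMPLE`.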

Diagnostic tools and settings
The following list covers the main diagnostic and troubleshooting tools. Please familiarize yourself with them. Many of these tools, such as Logdump, are normally read-only, so they can be used in a production environment with confidence.
• REPORT parameters set with the correct options, such as DDL reporting. Use the REPORT and REPORTCOUNT parameters; avoid REPORTROLLOVER.
• Logdump
• Database tools: LogMiner, database health check reports, 10046 session tracing.
• pstack

The first step in diagnosing an OGG problem
When a problem occurs, the first step is to find the file or information that best describes it. For example, an OGG group always records an abnormal termination in its report file, which may contain multiple warnings and errors. Classifying these errors helps you decide which additional data to collect. For example:
• What is the error?
• Which group reported it?
• Where is the error message displayed?
• Which file, such as the report file, provides detailed information about the error?
• What type of error is it?
• Is the error documented in the error message manual or in a knowledge base document?
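
The classification step above can be sketched in a few lines. The example below is only a minimal sketch: it pulls ORA and OGG codes out of a message and sorts it into rough buckets. The category names and the `classify` helper are hypothetical, and the only codes treated specially are the two data errors this document mentions (ORA-1403 and ORA-0001).

```python
import re

ORA_RE = re.compile(r"\bORA-(\d{1,5})\b")
OGG_RE = re.compile(r"\bOGG-(\d{5})\b")

# ORA codes this document describes as data errors (target out of sync).
DATA_ERRORS = {1403: "no data found", 1: "unique constraint violated"}


def classify(message):
    """Return the codes found in a message and a rough category for it."""
    ora = [int(number) for number in ORA_RE.findall(message)]
    ogg = ["OGG-" + number for number in OGG_RE.findall(message)]
    if any(code in DATA_ERRORS for code in ora):
        category = "data mismatch (resync candidate)"
    elif ora:
        category = "other database error"
    elif ogg:
        category = "OGG error without ORA code"
    else:
        category = "unclassified"
    return {"category": category, "ora": ora, "ogg": ogg}


if __name__ == "__main__":
    print(classify("OGG-01296 Error mapping from SCHEMA1.TABLE1 to SCHEMA1.TABLE1"))
    print(classify("ORA-01403: no data found"))
```

The buckets only suggest where to look next; the report file remains the place to read the full context around each message.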

Common OGG Problems

Database errors and database problems
• Look in the report file for the database error and for any warnings that precede the error on which OGG terminated abnormally. This is usually the exact cause of the failure, such as ORA-01403 (no data found).
• Data-related database errors (ORA-1403, ORA-0001) typically mean that the target table is out of sync and/or its constraints (primary key, foreign key, and so on) differ from the source. It is usually impossible to trace why this happened, because the cause may lie far in the past. Resynchronizing the table and then monitoring it is a potential solution.
• Is the target database updated by anything other than OGG? Search earlier trail files for DML operations that may be missing, and determine why they were not applied.
• Errors not related to data include archive logs that cannot be found, permission problems, and incorrect file locations. Copy the actual paths from the report file and list them in the OS shell to check that they exist and are accessible.
• The report file usually contains the actual SQL statement executed by the OGG component. Run that statement manually in SQL*Plus.
• A common Replicat problem: ERROR OGG-01296 Error mapping from SCHEMA1.TABLE1 to SCHEMA1.TABLE1. Potential causes include mismatched column sizes, incorrect column mapping, unacceptable characters, incorrect data types, and incompatible character sets.

Non-database errors
• Search the knowledge base documents for the error message shown in the report file.
• Examples include errors reported on the checkpoint file, a report file that cannot be written, and common permission problems.

OGG group hangs
• Confirm that the group is hung rather than slow. Evidence of a hang is that several SEND commands time out and the trail file does not grow.
• A hang is usually caused by a table lock or by waiting for some resource (for example, a downstream Extract waiting for archive logs).
• Kill the OGG group with the following command:
GGSCI> KILL EXTRACT <group> (or KILL REPLICAT <group>)
or use
kill -9 <pid>
at the OS level to kill the process, then try to restart it. This clears locks and other held resources.
• After the kill, add the TRACE and TRACE2 parameters and restart. Check the last activity recorded in these trace files.
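
The "trail file does not grow" check can be automated with a sketch like the one below. It takes two snapshots of the trail file sizes in a directory and reports whether anything grew in between. The directory, the two-character prefix `et`, and the helper names are hypothetical. A trail that is not growing is only evidence of a hang when the source is known to be generating changes, so combine this with the SEND timeouts described above.

```python
import os
import time


def trail_sizes(trail_dir, prefix):
    """Map each trail file name starting with `prefix` to its size in bytes."""
    sizes = {}
    for name in os.listdir(trail_dir):
        path = os.path.join(trail_dir, name)
        if name.startswith(prefix) and os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes


def trail_is_growing(before, after):
    """True if a new trail file appeared or an existing one grew."""
    for name, size in after.items():
        if name not in before or size > before[name]:
            return True
    return False


def watch(trail_dir, prefix, interval_seconds=60):
    """Take two snapshots `interval_seconds` apart and compare them."""
    before = trail_sizes(trail_dir, prefix)
    time.sleep(interval_seconds)
    after = trail_sizes(trail_dir, prefix)
    return trail_is_growing(before, after)


if __name__ == "__main__":
    # Hypothetical location and prefix; replace with your own.
    print(watch("./dirdat", "et", interval_seconds=5))
```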

Performance problems
• Was a large batch job run before the OGG group started to lag?
• Is a single group doing too much work?
• Are there resource problems, such as memory or disk performance?
• If Replicat shows a large lag, either Replicat itself is slow or the trail files are not being produced fast enough.
• Use parameters such as REPORTCOUNT to record performance indicators.
• Are resource-intensive features in use, such as FETCH, token insertion, a large number of filters, conversion functions, or SQLEXEC calls?
• Is Extract slow to start?
• Integrated Extract and integrated Replicat require database performance analysis.

Abnormal termination of an OGG group
• Run the group from the OS command line, for example:
./extract paramfile dirprm/myext.prm
The last line of output shows why the process was terminated.
• To collect dump files, refer to Document 1458985.1, Oracle GoldenGate - How To Use GDB To Generate Stack Trace For Troubleshooting Purpose.

Hardware errors (disks, servers) and corruption
• When a physical component such as a server or disk fails, the OGG data files (the dir* directories) or the OGG trail files may be corrupted, and the OGG group may fail to start. In this case, recovery is more important than diagnosis. There are many knowledge base documents on this topic; you can find them by searching for "GoldenGate corruption".

Submitting a well-formed SR
When you open a service request (SR) with Oracle, accurate and conclusive information greatly speeds up analysis and resolution. Sometimes collecting that information gives you a better understanding of the problem, and you can solve it yourself with the help of the My Oracle Support knowledge documents. As mentioned above, the report file is very important, so always provide it in the SR. The information and files listed below are a good starting point when opening an SR.
• Accurate basic information: OGG version and database, platform, and OS versions.
• A precise and brief summary, for example: Extract terminated abnormally due to "Error".
• The report file and the parameter file, always attached.
• The events that led to the error, for example a server crash or a parameter change made to add new tables.
• Any changes that were made recently.
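
Gathering the attachments can be scripted. The following sketch bundles the error log plus the report and parameter files of one group into a tar.gz archive. It assumes the usual layout under the OGG home (ggserr.log at the top level, reports in dirrpt, parameter files in dirprm); the function name and the example path are hypothetical. File names are matched by group-name prefix, so a group named ext1 also picks up files belonging to a group named ext10. Review the file list before attaching the archive.

```python
import os
import tarfile


def collect_sr_files(ogg_home, group, out_path):
    """Bundle ggserr.log and one group's .rpt/.prm files into a tar.gz.

    Returns the archive member names that were added.
    """
    group = group.lower()
    candidates = [os.path.join(ogg_home, "ggserr.log")]
    for subdir, extension in (("dirrpt", ".rpt"), ("dirprm", ".prm")):
        folder = os.path.join(ogg_home, subdir)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            lower = name.lower()
            # Prefix match: aged reports carry a numeric suffix after the group name.
            if lower.startswith(group) and lower.endswith(extension):
                candidates.append(os.path.join(folder, name))

    added = []
    with tarfile.open(out_path, "w:gz") as archive:
        for path in candidates:
            if os.path.isfile(path):
                member = os.path.relpath(path, ogg_home)
                archive.add(path, arcname=member)
                added.append(member)
    return added


if __name__ == "__main__":
    # Hypothetical OGG home and group name; replace with your own.
    print(collect_sr_files("/u01/app/ogg", "ext1", "ext1_sr_files.tar.gz"))
```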

Reference
Knowing where to find the information is crucial for diagnosis. The following reference documents are the starting point for configuring and maintaining an OGG instance.
• Oracle GoldenGate documentation library
• Oracle GoldenGate Product portal
• Database Synchronization with Oracle GoldenGate tutorials
