1. Background:
Operations staff will sooner or later run into failure analysis, and when a particularly difficult problem comes up we turn to more senior experts for help. Whether you ask a second-line expert or an Oracle Global Customer Support engineer (hereafter referred to as a GCS engineer), you will often be asked to submit diagnostic data. Almost every time this happens, data is requested piece by piece, and the bulk of the time is spent extracting it.
2. What data to obtain:
Based on years of experience working with GCS and analyzing problems ourselves, we recommend collecting the following data:
- OSW
- nmon
- Log files
- AWR/ASH
- DBA_HIST_ACTIVE_SESS_HISTORY
- OPatch
3. How to obtain data:
Based on years of operational experience, we recommend providing data covering at least 1 hour before the problem occurred through 30 minutes after it was resolved; for example, if the problem occurred at 10:00 today, provide data from 9:00 today until 30 minutes after recovery.
- OSW
OSWatcher (OSW) is a system monitoring tool provided by Oracle; in essence it periodically runs OS utilities such as vmstat and top. Sometimes OSW has data when nmon does not, and sometimes the opposite, which is why we recommend deploying OSW even where nmon is already deployed. A collection interval of 20 seconds is recommended.
Get data script:
touch -t 201705091300 test1
touch -t 201705091400 test2
find . -type f -newer test1 ! -newer test2 | xargs tar cvf oswdata_ggdb2.tar
PS: 201705091300 represents 13:00 on May 9, 2017.
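For completeness, here is a typical way to start OSWatcher Black Box at the recommended 20-second interval; this is only a sketch, and the 48-hour archive retention and gzip compressor below are illustrative values:
$ ./startOSWbb.sh 20 48 gzip &    # interval in seconds, hours of archive to keep, compressor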
- nmon
nmon is a classic system monitoring tool with versions for both AIX and Linux. A collection interval of at most 60 seconds is recommended; at present quite a few sites do collect nmon data, but with an interval of 10-20 minutes, which makes the collected data effectively useless.
The data files can be retrieved with cat or FTP-related tools.
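As a deployment reference, a minimal sketch of starting nmon in capture mode at the recommended interval (-f writes snapshots to a file, -s is the interval in seconds, -c the snapshot count; 60 seconds x 1440 snapshots covers 24 hours):
$ nmon -f -s 60 -c 1440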
- Log files
TFA (Trace File Analyzer) is a tool introduced in release 11.2 for collecting diagnostic logs from a Grid Infrastructure/RAC environment; it lets users gather logs from all RAC nodes with a very simple command for further diagnosis. TFA is an Oracle cluster log collector similar to diagcollection, but its centralized, automated collection of diagnostic information is more powerful. TFA has the following features:
- TFA can package the logs of all nodes with a single command executed on one machine;
- TFA can "trim" the logs during collection to reduce the amount of data collected;
- TFA can collect data for diagnostic use "by time period";
- TFA can collect and bundle the logs of all nodes onto a single node for transfer and review;
- TFA can restrict log collection to a subset of cluster components, such as ASM, RDBMS, or Clusterware;
- TFA can perform real-time scanning of alert logs (DB alert logs, ASM alert logs, Clusterware alert logs, etc.) based on configured conditions;
- TFA can automatically collect diagnostic logs based on the results of real-time scanning;
- TFA can scan the alert logs for specified errors;
- TFA can collect diagnostic logs based on the results of such specified-error scans;
Script to get data:
Execute as root or grid:
# $TFA_HOME/bin/tfactl diagcollect -all -from "Jan/21/2013 00:00:00" -to "Jan/21/2013 13:00:00"
Before 11gR2, you can collect this data by running the scripts provided by the vendor support team, as sketched below.
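One such script is diagcollection.pl, shipped in the clusterware home. A typical invocation looks like the following (run as root; $CRS_HOME here stands for the clusterware home, and the exact path and flag spelling vary by version):
# $CRS_HOME/bin/diagcollection.pl --collect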
- AWR/ASH
AWR and ASH reports are standard Oracle diagnostic tools; for problem analysis you need to provide AWR and ASH reports for both a normal baseline period and the problem period. Generate one AWR report per hour, and one ASH report every 15-30 minutes.
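For reference, both reports can be generated interactively with the standard scripts shipped under $ORACLE_HOME/rdbms/admin (awrrpt.sql prompts for the snapshot range, ashrpt.sql for the time window):
$ sqlplus / as sysdba
SQL> @?/rdbms/admin/awrrpt.sql
SQL> @?/rdbms/admin/ashrpt.sql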
- DBA_HIST_ACTIVE_SESS_HISTORY
In many cases, when a database runs into a performance problem, we do not get the chance to collect enough diagnostic information, such as a system state dump or hang analysis; sometimes the DBA is not even present when the problem occurs. This makes diagnosing the problem very difficult. In such cases, can we gather information after the fact to analyze the cause? On Oracle 10g and later the answer is yes, and this section introduces a way to analyze problems using DBA_HIST_ACTIVE_SESS_HISTORY data.
How to get data:
On the database where the problem occurred, extract the required data and export it with exp:
SQL> conn user/passwd
SQL> create table m_ash as select * from dba_hist_active_sess_history where sample_time between to_timestamp('<time_begin>', 'yyyy-mm-dd hh24:mi:ss') and to_timestamp('<time_end>', 'yyyy-mm-dd hh24:mi:ss');
Then perform the export:
$ exp user/passwd file=m_ash.dmp tables=(m_ash) log=m_ash.exp.log
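To load the dump into another database for offline analysis, the matching import (user, file, and log names here simply mirror the export above) is:
$ imp user/passwd file=m_ash.dmp tables=(m_ash) log=m_ash.imp.log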
- OPatch
The OPatch output is collected mainly to confirm patch information, so as to determine whether known bugs may have been hit. Collect the data as the oracle user and the grid user respectively:
$ $ORACLE_HOME/OPatch/opatch lsinventory
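To capture the inventories for upload, the output can simply be redirected to files (the file names below are only examples; for the grid user, $ORACLE_HOME should point to the Grid home):
$ $ORACLE_HOME/OPatch/opatch lsinventory > opatch_oracle.log   # as the oracle user
$ $ORACLE_HOME/OPatch/opatch lsinventory > opatch_grid.log     # as the grid user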
4. Summary:
This article describes the data that generally needs to be collected for problem analysis. Standardizing data collection reduces the time consumed by repeated back-and-forth requests, and also reduces the risk that failing to collect data at the first opportunity leads to data loss that makes root cause analysis impossible.