[Oracle] exachk, the routine health check tool for Exadata
On Exadata, when you hit a problem that is not clearly tied to a particular database, it is best to start with an exachk health check. exachk collects a wide range of information automatically, which saves the hassle of gathering it by hand. Once collection completes, you can assess the overall health of the system. The report covers software, hardware, firmware versions, configuration, and other aspects; from it you can pick out suspicious points and narrow the scope of the next diagnostic step.
This article records the basic usage of exachk, which can be downloaded from MOS (My Oracle Support) document 1070954.1.
First, set two environment variables, RAT_ORACLE_HOME and RAT_EXADATA_VERSION. Otherwise exachk reports an error later on:
[oracle@dm02db01 dbhome_1]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0.4/dbhome_1
[oracle@dm02db01 dbhome_1]$ export RAT_ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1
[oracle@dm02db01 exachk]$ rpm -qa | grep exadata
exadata-oswatcher-11.2.3.3.0.131014.1-1
exadata-asr-11.2.3.3.0.131014.1-1
exadata-sun-computenode-11.2.3.3.0.131014.1-1
exadata-base-11.2.3.3.0.131014.1-1
exadata-applyconfig-11.2.3.3.0.131014.1-1
exadata-ibdiagtools-11.2.3.3.0.131014.1-1
exadata-exachk-11.2.3.3.0.131014.1-1
exadata-validations-compute-11.2.3.3.0.131014.1-1
exadata-ipconf-11.2.3.3.0.131014.1-1
exadata-commonnode-11.2.3.3.0.131014.1-1
exadata-firmware-compute-11.2.3.3.0.131014.1-1
exadata-sun-computenode-minimum-11.2.3.3.0.131014.1-1
[oracle@dm02db01 exachk]$ export RAT_EXADATA_VERSION=11.2.3.3.0
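The version above is read off the rpm listing by eye. As a rough sketch, the same step could be scripted as follows. The function names exadata_version_from_pkg and set_rat_env are made up for illustration, and the sketch assumes that the first five dot-separated numbers in the package name are the image version and that the exadata-base package is present, as in the listing above.

```shell
#!/usr/bin/env bash
# Sketch only: derive RAT_EXADATA_VERSION from an installed package name
# instead of typing it by hand. Both function names are hypothetical.

# Print the five-part version embedded in a package name such as
# exadata-base-11.2.3.3.0.131014.1-1 (prints nothing if the name does not match).
exadata_version_from_pkg() {
    printf '%s\n' "$1" |
        sed -n 's/^.*-\([0-9][0-9]*\(\.[0-9][0-9]*\)\{4\}\)\..*$/\1/p'
}

# Export both variables. Assumes ORACLE_HOME is already set and that
# an exadata-base package is installed (an assumption, not checked here).
set_rat_env() {
    local pkg
    pkg=$(rpm -qa | grep '^exadata-base-' | head -n 1)
    export RAT_ORACLE_HOME="$ORACLE_HOME"
    export RAT_EXADATA_VERSION="$(exadata_version_from_pkg "$pkg")"
}
```

After sourcing the file, running set_rat_env followed by echo $RAT_EXADATA_VERSION lets you confirm the value before starting exachk.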
Then run exachk:
[oracle@dm02db01 dbhome_1]$ cd /opt/oracle.SupportTools/
[oracle@dm02db01 oracle.SupportTools]$ cd exachk
[oracle@dm02db01 exachk]$ ./exachk
CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /u01/app/11.2.0.4/grid? [y/n] [y] -- confirm the CRS_HOME path
Checking ssh user equivalency settings on all nodes in cluster
Node dm02db02 is configured for ssh user equivalency for oracle user
Searching for running databases .....
..............
List of running databases registered in OCR
1. bdataedw
2. bdataetl
3. cata
4. edw
5. ETL
6. OMSSTD
7. portalstd
8. rdsdbstd
9. All
10. None
Select respective number to choose database for checking best practices. For multiple databases, select 9 for All or comma separated number like 1,2 etc [1-10] [9]. -- choose the databases to check: 1-8 are the eight databases that were found, 9 checks all of them, and 10 skips the database checks.
Searching out ORACLE_HOME for selected databases.
...................
ls: /u01/app/oracle/product/11.2.0.4/dbhome_1ORACLE_HOME_OLD/bin/oracle: No such file or directory
Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS
........................................ ........................................ ........................................
-------------------------------------------------------------------------------------------------------
Oracle Stack Status
-------------------------------------------------------------------------------------------------------
Host Name  CRS Installed  ASM HOME  RDBMS Installed  CRS UP  ASM UP  RDBMS UP  DB Instance Name
-------------------------------------------------------------------------------------------------------
dm02db01  Yes  bdataedw1 bdataetl1 cata1 edw1 ETL1 OMS3 portal1 rdsdb1
dm02db02  Yes  bdataedw2 bdataetl2 cata2 edw2 ETL2 OMS4 portal rdsdb2
-------------------------------------------------------------------------------------------------------
Root user equivalence is not setup between dm02db01 and storage server dm02cel01.
1. Enter 1 if you will enter root password for each storage server when prompted.
2. Enter 2 to exit and configure root user equivalence manually and re-run exachk.
3. Enter 3 to skip checking best practices on storage server.
Please indicate your selection from one of the above options [1-3] [1]:-
Is root password same on all storage server [y/n] [y]
Enter root password for storage server :- -- the root password shared by all cell nodes
Root password for 192.168.0.19 was incorrect. 2 retries remaining.
Enter root password for 192.168.0.19 :-
Root password for 192.168.0.19 was incorrect. 1 retries remaining.
Enter root password for 192.168.0.19 :-
Root password for 192.168.0.19 was incorrect. root privileged checks will not be executed on 192.168.0.19
-- If the root password of one node differs from the others, you are prompted for it separately. If you do not know it, exachk skips that node during the collection phase; the checks on the other nodes are not affected.
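Option 2 in the prompt above mentions configuring root user equivalence manually. Before a run, you could check which hosts already accept key-based root logins with something like the sketch below. The function name check_root_equivalence and the SSH override variable are made up for illustration, and any host name other than dm02cel01 is hypothetical. Note that a host that cannot be reached at all is reported the same way as one that still needs a password.

```shell
#!/usr/bin/env bash
# Sketch only: report which hosts accept passwordless root logins.
# SSH may be set to another command (for example a stub); it defaults to ssh.

check_root_equivalence() {
    local host
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 \
               "root@${host}" true </dev/null >/dev/null 2>&1; then
            echo "${host}: ok"
        else
            echo "${host}: no passwordless root access"
        fi
    done
}
```

For example, check_root_equivalence dm02cel01 dm02cel02 prints one status line per host, where dm02cel02 stands in for whatever your remaining cell nodes are called.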
expect: spawn id exp6 not open
while executing
"expect "*?assword:*""
expect: spawn id exp6 not open
while executing
"expect "*?assword:*""
expect: spawn id exp6 not open
while executing
"expect "*?assword:*""
expect: spawn id exp6 not open
while executing
"expect "*?assword:*""
120 of the included audit checks require root privileged data collection on database server. If sudo is not configured or the root password is not available, audit checks which require root privileged data collection can be skipped.
1. Enter 1 if you will enter root password for each on database server host when prompted
2. Enter 2 if you have sudo configured for oracle user to execute root_exachk.sh script on DATABASE SERVER
3. Enter 3 to skip the root privileged collections on DATABASE SERVER
4. Enter 4 to exit and work with the SA to configure sudo on database server or to arrange for root access and run the tool later.
Please indicate your selection from one of the above options [1-4] [1]:-
Is root password same on all compute nodes? [Y/n] [y]
Enter root password on database server :- -- the root password shared by all DB nodes
9 of the included audit checks require root privileged data collection on infiniband switch.
1. Enter 1 if you will enter root password for each infiniband switch when prompted
2. Enter 2 to exit and to arrange for root access and run the exachk later.
3. Enter 3 to skip checking best practices on INFINIBAND SWITCH
Please indicate your selection from one of the above options [1-3] [1]:-
Is root password same on all infiniband switch? [y/n] [y] -- root password of the InfiniBand switches
Enter root password for infiniband switch :-
Root passwords for following nodes are incorrect.
You can still continue but root privileged checks will not be executed on following nodes.
1. 192.168.0.19
Do you want to continue [y/n] [y]:-
*** Checking Best Practice Recommendations (PASS/WARNING/FAIL) ***
Log file for collections and audit checks are
/opt/oracle.SupportTools/exachk/exachk_2017114_162425/exachk.log
============================================================================================
Node name - dm02db01
============================================================================================
Collecting - ASM DIsk I/O stats
Collecting - ASM Disk Groups
Collecting - ASM Diskgroup Attributes
Collecting - ASM disk partnership imbalance
Collecting - ASM initialization parameters
Collecting - Active sessions load balance for bdataedw database
Collecting - Active sessions load balance for bdataetl database
Collecting - Active sessions load balance for cata database
Collecting - Active sessions load balance for edw database
..............
Collecting patch inventory on CRS HOME /u01/app/11.2.0.4/grid
Collecting patch inventory on ORACLE_HOME /u01/app/oracle/product/11.2.0.4/dbhome_1
Collecting patch inventory on ORACLE_HOME /u01/app2/oracle/product/11.2.0.2/dbhome_1
---------------------------------------------------------------------------------
Detailed report (html) - /opt/oracle.SupportTools/exachk/exachk_rdsdbstd_2017114_162425/exachk_rdsdbstd_2017114_162425.html
UPLOAD (if required) - /opt/oracle.SupportTools/exachk/exachk_rdsdbstd_2017114_162425.zip
exachk has now finished running. You can download the zip file listed under UPLOAD from /opt/oracle.SupportTools/exachk/ and open the HTML report it contains for viewing.
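In the transcript, the HTML report sits in a directory named after the zip file and carries the same base name. Assuming that naming holds for other runs too (an assumption drawn only from the two output lines above), the report path can be derived from the archive path with a small helper. The function name exachk_report_path is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: given the path of an exachk result archive, print the path
# where the HTML report is expected, based on the naming seen in the transcript.

exachk_report_path() {
    local zip=$1
    local dir=${zip%.zip}      # strip the .zip suffix to get the result directory
    local name=${dir##*/}      # last path component, reused as the report file name
    printf '%s/%s.html\n' "$dir" "$name"
}
```

For instance, passing the UPLOAD path from the transcript yields the path shown on the "Detailed report (html)" line. If you copy the zip to another machine first, check after extraction that the report really ends up at that relative location.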
The following figure shows the existing problems: