Oracle Version:
10.2.0.1.0
SQL*Plus - Version:
10.2.0.1.0
可能導致本地發起的所有串連ORACLE資料庫服務端的串連無響應。
現象 :
切換到oracle 後 執行sqlplus 沒有反映,也不報錯,就掛起在那裡
[root@localhost~]# su - oracle
[oracle@localhost ~]$ sqlplus #這裡就無響應了。。無論多久都無響應
oracle下其他的命令可以執行
通過sar 看到資訊
Linux 2.6.9-22.ELsmp (economy2)
09:43:13 PM CPU %user %nice %system %iowait %idle
09:43:15 PM all 46.77 0.00 53.23 0.00 0.00
09:43:17 PM all 47.00 0.00 53.00 0.00 0.00
09:43:19 PM all 43.20 0.00 54.37 2.43 0.00
Average: all 45.63 0.00 53.54 0.82 0.00
結束sqlplus
通過sar 2 4 看到系統狀況
09:44:15 PM CPU %user %nice %system %iowait %idle
09:44:17 PM all 0.00 0.00 0.00 0.00 0.00
09:44:19 PM all 0.00 0.00 0.00 0.00 0.00
09:44:21 PM all 0.00 0.00 0.00 100.00 0.00
09:44:23 PM all 0.00 0.00 0.00 0.00 0.00
Average: all 0.00 0.00 0.00 100.00 0.00
但只要UpTime200天左右就會出現該現象
[oracle@xxxxx ~]$uptime
16:48:26 up 200 days, 2:20, 2 users, load average: 0.00, 0.03, 0.10
問題基本定位為ORACLE用戶端軟體的BUG,
涉及的版本:ORACLE 10.2.0.1.0
現象就是UPTIME>50天,即有可能出現運行SQLPLUS後無反應的現象,主要原因是時間溢出錯誤。事實上只要Linux x86主機運行天數是24.8(本系統運行198天)的倍數都有可能引發該bug,因為time()函數值為null,造成無限死迴圈,從而耗盡cpu。
Running STRACE tool shows: $ strace /oracle/home/bin/sqlplus -V 2>&1 |less ...... old_mmap(NULL, 385024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x41794000 gettimeofday({1122996561, 411035}, NULL) = 0 access("/usr/local/UD/conf/sqlnet.ora", F_OK) = -1 ENOENT (No such file or directory) access("/usr/local/UD/lib/oracle/network/admin/sqlnet.ora", F_OK) = -1 ENOENT (No such file or directory) access("/usr/local/UD/conf/sqlnet.ora", F_OK) = -1 ENOENT (No such file or directory) access("/usr/local/UD/lib/oracle/network/admin/sqlnet.ora", F_OK) = -1 ENOENT (No such file or directory) fcntl64(-1218313656, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor) It is looping on the times() function.--死迴圈中 times(NULL) = -1825782405 times(NULL) = -1825782405 times(NULL) = -1825782405 times(NULL) = -1825782405 times(NULL) = -1825782405 times(NULL) = -1825782405 times(NULL) = -1825782405
解決辦法:
1、重啟系統。
2、對該bug單獨打臨時patch 4612267。
3、升級ORACLE用戶端到10.2.0.2.0(官網已經公布在這個版本解決了此問題)。
4、降級到9I,9I絕對不會出現這個問題;或降級到10.1.0.4版本(沒有經過全面測試)。
第一種方法沒有徹底解決問題,以後照舊;第三、四種方法,升級時間長,且要求停庫很久,當前生產環境暫不適合;
我採取的是第二種方法,打補丁包的方式。而且據oracle官方文檔說明,oracle11已經修複該問題。
下面是pache 4612267補丁包的安裝及驗證方法:
先停監聽、dbconsole和資料庫
$ lsnrctl stop $ emctl stop dbconsole $ sqlplus / as sysdba SQL> shutdown immediate
注意:dbconsole是在已經裝了Oracle EM的情況下要停止,如果未安裝則無需幹涉。
安裝patch
$ mkdir $ORACLE_BASE/patches $ cd $ORACLE_BASE/patches $ rz (SecureCRT裡上傳 p4612267_10201_LINUX.zip 檔案, 其它上傳方式也可以) $ unzip p4612267_10201_LINUX.zip $ cd 4612267/ $ $ORACLE_HOME/OPatch/opatch apply Invoking OPatch 10.2.0.1.0 ... Please shutdown Oracle instances running out of this ORACLE_HOME on the local system. (Oracle Home = '/u01/app/oracle/product/10.2.0/db_1') Is the local system ready for patching? Do you want to proceed? [y|n] y (此處輸入y) User Responded with: Y ... ApplySession adding interim patch '4612267' to inventory The local system has been patched and can be restarted. OPatch succeeded.
驗證patch
$ $ORACLE_HOME/OPatch/opatch lsinventory Invoking OPatch 10.2.0.1.0 Oracle interim Patch Installer version 10.2.0.1.0 Copyright (c) 2005, Oracle Corporation. All rights reserved.. Oracle Home : /u01/app/oracle/product/10.2.0/db_1 Central Inventory : /u01/app/oracle/oraInventory from : /u01/app/oracle/product/10.2.0/db_1/oraInst.loc OPatch version : 10.2.0.1.0 OUI version : 10.2.0.1.0 OUI location : /u01/app/oracle/product/10.2.0/db_1/oui Log file location : /u01/app/oracle/product/10.2.0/db_1/cfgtoollogs/opatch/opatch-2009_Jan_13_11-06-27-HKT_Tue.log Lsinventory Output file location : /u01/app/oracle/product/10.2.0/db_1/cfgtoollogs/opatch/lsinv/lsinventory-2009_Jan_13_11-06-27-HKT_Tue.txt -------------------------------------------------------------------------------- Installed Top-level Products (2): Oracle Database 10g 10.2.0.1.0 Oracle Database 10g Products 10.2.0.1.0 There are 2 products installed in this Oracle Home. Interim patches (1) : Patch 4612267 : applied on Tue Jan 13 11:05:10 HKT 2009 Created on 5 Oct 2005, 13:48:00 hrs US/Pacific Bugs fixed: 4612267 --------------------------------------------------------------- OPatch succeeded.
啟動資料庫、監聽和dbconsole
$ sqlplus / as sysdba SQL> startup $ lsnrctl start $ emctl start dbconsole
如果有需要,還可以刪除patch,刪除前先停庫
$ cd $ORACLE_BASE/patches/4612267 $ $ORACLE_HOME/OPatch/opatch rollback -id 4612267 Invoking OPatch 10.2.0.1.0 ... Please shutdown Oracle instances running out of this ORACLE_HOME on the local system. (Oracle Home = '/u01/app/oracle/product/10.2.0/db_1') Is the local system ready for patching? Do you want to proceed? [y|n] y (此處輸入y) User Responded with: Y ... RollbackSession removing interim patch '4612267' from inventory The local system has been patched and can be restarted. OPatch succeeded.
此時再執行上面的驗證patch命令就會發現該patch已經刪除了。