問題: 1、今天早上所有用戶端都無法登陸,提示應用伺服器配置錯誤。 分析及連線處理: 1、檢查用戶端日誌,探索資料庫異常。 Caused by: java.sql.SQLException: Listener refused the connection with the following error: ORA-12528, TNS:listener: all appropriate instances are blocking new connections The Connection descriptor used by the client was: //192.168.2.245:1521/eas54 at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:111) at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:260) at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:386) at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:438) at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:164) at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:34) at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:752) at java.sql.DriverManager.getConnection(DriverManager.java:525) at java.sql.DriverManager.getConnection(DriverManager.java:140) at com.kingdee.bos.sql.shell.KDDriver.connect(Unknown Source) ... 36 more 從錯誤資訊看,資料庫無法串連,提示執行個體不允許建立串連。 2、檢查監聽狀態,顯示EAS54執行個體處於阻塞狀態。 3、檢查alert日誌,發現有異常。 Thread 1 advanced to log sequence 1397 Current log# 1 seq# 1397 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\EAS54\REDO01.LOG Mon Nov 30 18:44:23 2009 Process startup failed, error stack: Mon Nov 30 18:44:23 2009 Errors in file d:\oracle\product\10.2.0\admin\eas54\bdump\eas54_psp0_3504.trc: ORA-27300: OS 緋葷粺鐩稿叧鎿嶄綔: spcdr:9261:4200 澶辮觸, 鐘舵€佷負: 997 ORA-27301: OS 鏁呴殰娑堟伅: 重疊 I/O 操作在進行中。 ORA-27302: 閿欒鍙戠敓鍦? skgpspawn Mon Nov 30 18:44:24 2009 Process J000 died, see its trace file Mon Nov 30 18:44:24 2009 kkjcre1p: unable to spawn jobq slave process Mon Nov 30 18:44:24 2009 Errors in file d:\oracle\product\10.2.0\admin\eas54\bdump\eas54_cjq0_3648.trc: Mon Nov 30 19:00:53 2009 Process startup failed, error stack: Mon Nov 30 19:00:53 2009 Errors in file d:\oracle\product\10.2.0\admin\eas54\bdump\eas54_psp0_3504.trc: ORA-27300: OS 緋葷粺鐩稿叧鎿嶄綔: spcdr:9261:4200 澶辮觸, 鐘舵€佷負: 997 ORA-27301: OS 鏁呴殰娑堟伅: 重疊 I/O 操作在進行中。 ORA-27302: 閿欒鍙戠敓鍦? skgpspawn 從昨天晚上18:44開始,資料庫開始報錯,由於亂碼不知具體原因,但可以看到OS以及I/O字樣,所以判斷應該是出現了IO錯誤。 4、今天早上有人嘗試重啟過資料庫,但啟動沒有成功。 Starting up ORACLE RDBMS Version: 10.2.0.1.0. System parameters with non-default values: processes = 200 sessions = 225 sga_max_size = 1367343104 __shared_pool_size = 318767104 shared_pool_size = 318767104 __large_pool_size = 8388608 large_pool_size = 8388608 __java_pool_size = 8388608 java_pool_size = 8388608 __streams_pool_size = 0 streams_pool_size = 0 spfile = D:\ORACLE\PRODUCT\10.2.0\DB_1\DBS\SPFILEEAS54.ORA nls_language = SIMPLIFIED CHINESE nls_territory = CHINA sga_target = 1367343104 control_files = D:\ORACLE\PRODUCT\10.2.0\ORADATA\EAS54\CONTROL01.CTL, D:\ORACLE\PRODUCT\10.2.0\ORADATA\EAS54\CONTROL02.CTL, D:\ORACLE\PRODUCT\10.2.0\ORADATA\EAS54\CONTROL03.CTL db_block_size = 8192 __db_cache_size = 1023410176 db_cache_size = 838860800 compatible = 10.2.0.1.0 db_file_multiblock_read_count= 16 db_recovery_file_dest = D:\oracle\product\10.2.0/flash_recovery_area db_recovery_file_dest_size= 2147483648 undo_management = AUTO undo_tablespace = UNDOTBS1 remote_login_passwordfile= EXCLUSIVE db_domain = dispatchers = (PROTOCOL=TCP) (SERVICE=eas54XDB) job_queue_processes = 10 audit_file_dest = D:\ORACLE\PRODUCT\10.2.0\ADMIN\EAS54\ADUMP background_dump_dest = D:\ORACLE\PRODUCT\10.2.0\ADMIN\EAS54\BDUMP user_dump_dest = D:\ORACLE\PRODUCT\10.2.0\ADMIN\EAS54\UDUMP core_dump_dest = D:\ORACLE\PRODUCT\10.2.0\ADMIN\EAS54\CDUMP db_name = eas54 open_cursors = 300 _b_tree_bitmap_plans = FALSE _no_or_expansion = TRUE optimizer_index_cost_adj = 50 optimizer_index_caching = 90 pga_aggregate_target = 203423744 MMAN started with pid=4, OS id=3952 DBW0 started with pid=5, OS id=3968 LGWR started with pid=6, OS id=3976 CKPT started with pid=7, OS id=3992 SMON started with pid=8, OS id=4004 RECO started with pid=9, OS id=4020 CJQ0 started with pid=10, OS id=4044 MMON started with pid=11, OS id=4056 MMNL started with pid=12, OS id=4064 Tue Dec 01 09:21:28 2009 starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'... starting up 1 shared server(s) ... PMON started with pid=2, OS id=3916 Tue Dec 01 09:21:29 2009 alter database mount exclusive ORA-214 signalled during: alter database mount exclusive... PSP0 started with pid=3, OS id=3948 5、sqlplus串連資料庫執行個體,並檢查執行個體狀態。 C:\Documents and Settings\Administrator>set ORACLE_SID=EAS54 C:\Documents and Settings\Administrator>sqlplus / as sysdba SQL*Plus: Release 10.2.0.1.0 - Production on 星期二 12月 1 10:39:46 2009 Copyright (c) 1982, 2005, Oracle. All rights reserved. 串連到: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production With the Partitioning, OLAP and Data Mining options SQL> select instance_name ,status from v$instance; INSTANCE_NAME STATUS -------------------------------- ------------------------ eas54 STARTED 執行個體狀態為STARTED,未啟動成功。 6、停止並啟動資料庫,發現有錯誤。 SQL> startup ORACLE instance started. Total System Global Area 1367343104 bytes Fixed Size 1250884 bytes Variable Size 335546812 bytes Database Buffers 1023410176 bytes Redo Buffers 7135232 bytes ORA-00214: control file 'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EAS54\CONTROL02.CTL' version 14018 inconsistent with file 'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EAS54\CONTROL01.CTL' version 14016 錯誤資訊表明控制檔案出現不一致的情況,估計是昨天晚上的IO故障導致控制檔案出現不一致。 CONTROL02.CTL的版本號碼14018 > CONTROL01.CTL版本號碼14016 7、將三個控制檔案先備份,然後用CONTROL02.CTL覆蓋CONTROL01.CTL和CONTROL03.CTL 8、重啟資料庫,可以正常啟動,系統復原正常。 9、後檢查作業系統事件檢視器的日誌,發現已經被清除,無法獲知是否有IO方面的錯誤資訊。 後續建議: 1、請密切關注伺服器的IO狀況,並檢查磁碟是否有異常,並定時檢查作業系統相關日誌。 2、確保每天的備份正常,並複製到另外的機器上,以防萬一。 轉載自:http://dev.kingdee.com/bbs/viewthread.php?tid=837 |