ORA-21561、ORA-15055、ORA-25253 導致DG備庫無法應用歸檔,ora-21561ora-15055

來源:互聯網
上載者:User

ORA-21561、ORA-15055、ORA-25253 導致DG備庫無法應用歸檔,ora-21561ora-15055
昨天去某客戶那裡做巡檢,順便看一下上次搭建的RAC-DG環境是否正常,不看不知道,一看嚇一跳,上次的DG是8月20日啟動並執行,而DG備庫從8月31日之後執行個體就沒有開啟過,後來詢問後才得知,原來那天斷過一次電,後來重啟了機器。直到今天我過去了,才把執行個體啟動起來。也就是說,從8月31日到今天快1個月的時間,備庫一致處於未使用狀態。
接著查看備庫歸檔,顯然已經缺失了很多了,tnread1 最後一個日誌為1661,tnread2 最後一個日誌為1324,而此時主庫中還保留的最早的日誌是9月8日的,thread 1 最早為2055,thread 2 最早為1555。主備之間歸檔足足差了有好幾百個(正常,都快一個月沒開執行個體了)
ASMCMD> ls2014_09_08/2014_09_09/2014_09_10/2014_09_11/2014_09_12/2014_09_13/2014_09_14/2014_09_15/2014_09_16/2014_09_17/2014_09_18/2014_09_19/2014_09_20/2014_09_21/2014_09_22/2014_09_23/2014_09_24/2014_09_25/ASMCMD> cd 2014_09_08ASMCMD> lsthread_1_seq_2055.500.857723297thread_1_seq_2056.494.857725223thread_1_seq_2057.493.857728031thread_1_seq_2058.490.857729849(略)……
thread_2_seq_1555.502.857723297thread_2_seq_1556.499.857723301thread_2_seq_1557.497.857723305thread_2_seq_1558.496.857725225(略)……ASMCMD>
儘管在指令碼中配置了備份完歸檔後用delete input來刪除歸檔,以減小歸檔佔用的磁碟空間,可以在RMAN指令碼的備份日誌中看到,從8月31日起,陸續有報RMAN-08137,提示由於備庫還未獲得歸檔,導致無法刪除:
歸檔記錄檔名=+DATA/sis/archivelog/2014_08_31/thread_1_seq_1661.1208.857041081 RECID=3930 STAMP=857041081RMAN-08137: 警告: 歸檔日誌未刪除, 因為備用或上遊捕獲進程需要它歸檔記錄檔名=+DATA/sis/archivelog/2014_08_31/thread_1_seq_1662.1212.857042297 線程=1 序列=1662RMAN-08137: 警告: 歸檔日誌未刪除, 因為備用或上遊捕獲進程需要它歸檔記錄檔名=+DATA/sis/archivelog/2014_09_01/thread_2_seq_1325.1204.857122335 線程=2 序列=1325RMAN-08137: 警告: 歸檔日誌未刪除, 因為備用或上遊捕獲進程需要它
但是,由於FRA磁碟空間是有限的,使用到一定的百分比(有參數可調整),Oracle會自動清空其中的內容,以釋放空間,因此在FRA中的歸檔日誌大約保留了18天,從9月8日到9月25日,而8月31日的歸檔肯定是沒有的了,最近的備份組只有到9月15日的。
於是決定重新搭建一下DG,關閉備庫執行個體,刪除全部資料庫檔案(資料檔案、控制檔案、記錄檔),只保留密碼檔案、參數檔案、tnsnames.ora、listener.ora即可,重建很方便,用11g的duplicate重新同步一下就可以了,命令如下:
rman target / auxiliary sys/oracle@sisdgRMAN> run{
allocate channel c1 device type disk;allocate auxiliary channel c2 device type disk;set newname for tempfile 1 to 'D:\app\administrator\oradata\sis\temp.269.852648395';duplicate target database for standby from active database dorecover;release channel c1;release channel c2;}
執行完以上操作後,備庫與主庫的歸檔就同步了
主庫:
SQL> archive log list資料庫記錄模式            存檔模式自動封存             啟用存檔終點            USE_DB_RECOVERY_FILE_DEST最早的聯機日誌序列     2617下一個存檔日誌序列   2619當前日誌序列           2619SQL> select thread#,max(sequence#) from v$archived_log group by thread#;
   THREAD# MAX(SEQUENCE#)---------- --------------         1           2619         2           2556
備庫:
SQL> archive log list資料庫記錄模式            存檔模式自動封存             啟用存檔終點            D:\archivelog最早的聯機日誌序列     0下一個存檔日誌序列   0當前日誌序列           0SQL> select thread#,max(sequence#) from v$archived_log group by thread#;
   THREAD# MAX(SEQUENCE#)---------- --------------         1           2619         2           2556
主備庫查看了一下v$archive_dest_status,兩邊都是的status列都是valid的,因此開啟備庫的redo apply,看到日誌也開始已經應用了
SQL> select thread#,sequence#,applied from v$archived_log;
   THREAD#  SEQUENCE# APPLIED---------- ---------- ------------------         1       2617 YES         1       2618 YES         2       2556 YES         1       2619 NO         1       2620 NO
由於採用的是LGWR ASYNC模式傳遞日誌,再重新建立一次standby redo logfile,主庫每個thread有3組日誌,所以備庫建立了7組日誌
此外,備庫的alertlog裡還報了個錯誤,因為是從主庫duplicate過來的,RMAN的配置資訊還保留著主庫的一些參數:
Starting control autobackupGot error: 19624********************  WARNING ***************************The errors during Server autobackup are not fatal, as itis attempted after sucessful completion of the command.However, it is recomended to take an RMAN control filebackup as soon as possible because the Autobackup failedwith the following error:ORA-19624: operation failed, retry possibleORA-19504: failed to create file "C:\ORABACKUP\BACKUPSETS\SIS1-C-3160648191-20140925-02.CTL"ORA-27040: file create error, unable to create fileOSD-04002: 無法開啟檔案O/S-Error: (OS 3) 系統找不到指定的路徑。********************  END OF WARNING *******************
RMAN> show all;

使用目標資料庫控制檔案替代恢複目錄db_unique_name 為 SISDG 的資料庫的 RMAN 配置參數為:CONFIGURE RETENTION POLICY TO REDUNDANCY 7;CONFIGURE BACKUP OPTIMIZATION OFF; # defaultCONFIGURE DEFAULT DEVICE TYPE TO DISK;CONFIGURE CONTROLFILE AUTOBACKUP ON;CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO 'c:\orabackup\backupsets\sis1-%F.ctl'; CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET;CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # defaultCONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # defaultCONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # defaultCONFIGURE ENCRYPTION ALGORITHM 'AES128'; # defaultCONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # defaultCONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # defaultCONFIGURE SNAPSHOT CONTROLFILE NAME TO 'D:\APP\ADMINISTRATOR\PRODUCT\11.2.0\DBHOME_1\DATABASE\SNCFSIS.ORA'; # default
備庫並沒有“ 'c:\orabackup\backupsets\ ”這個路徑,所以報錯,clear掉就可以了RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK CLEAR;

問題到這裡似乎是圓滿地解決了,但事實並非如此,當我在節點1連續切換了幾次日誌後發現,雖然歸檔都能順利傳到備庫,但後續的日誌始終應用不了,查看節點1和備庫的alertlog,也沒有發現有什麼異常,直到我意識到應該看看節點2的情況時,才發現了點端倪。
照理說,正常情況下,在v$archive_dest_status視圖中查詢到的遠程歸檔路徑的status必須是要valid的,並且error列的狀態為空白,而節點2的查詢結果是這樣的:
SQL> select dest_id,status,error from v$archive_dest where dest_id<3;
   DEST_ID STATUS    ERROR---------- --------- ------------------------------         1 VALID         2 ERROR     ORA-21561: 產生 OID 失敗
此時再去查看節點2的alertlog,發現除了報ORA-21561,伴隨著的還有ORA-15055、ORA-25253等
Thu Sep 25 15:05:06 2014WARNING: ASM communication error: op 0 state 0x0 (15055)ERROR: direct connection failure with ASMWARNING: ASM communication error: op 0 state 0x0 (15055)ERROR: direct connection failure with ASMWARNING: ASM communication error: op 0 state 0x0 (15055)ERROR: direct connection failure with ASMWARNING: ASM communication error: op 0 state 0x0 (15055)ERROR: direct connection failure with ASMWARNING: ASM communication error: op 0 state 0x0 (15055)ERROR: direct connection failure with ASMThu Sep 25 15:05:37 2014Error 21561 received logging on to the standbySuppressing further error logging of LOG_ARCHIVE_DEST_2.Thu Sep 25 15:05:46 2014ORA-25253 encountered when generating server alert SMG-3503Thu Sep 25 15:05:58 2014Thread 2 advanced to log sequence 2561 (LGWR switch)  Current log# 6 seq# 2561 mem# 0: +DATA/tc/onlinelog/group_6.1523.854793175  Current log# 6 seq# 2561 mem# 1: +DATA/tc/onlinelog/group_6.1524.854793175Thu Sep 25 15:05:58 2014Archived Log entry 6148 added for thread 2 sequence 2560 ID 0xbc63a5fc dest 1:Thu Sep 25 15:05:58 2014Error 21561 received logging on to the standbySuppressing further error logging of LOG_ARCHIVE_DEST_2.FAL[server, ARC3]: FAL archive failed, see trace file.ARCH: FAL archive failed. Archiver continuingThu Sep 25 15:07:49 2014ORA-25253 encountered when generating server alert SMG-3503Thu Sep 25 15:08:07 2014ARCH: Possible network disconnect with primary databaseThu Sep 25 15:11:38 2014Error 21561 received logging on to the standby
在節點2執行lsnrctl status查看監聽狀態以及用tnsping NET SERVICE NAME,都會hang在那邊,不顯示結果
於是到MOS上面查閱了一下關於ORA-21561和ORA-15055的文檔,有好幾篇相關文檔都有提到這一現象,主要描述如下:1. 通常在10.2.0.x升級後發生,登入SQLPLUS時直接提示:ORA - 21561 : OID GENERATION FAILED,需要在.../hosts中加入localhost的完整主機名稱,我可以正常登入SQLPLUS,似乎這個不太符合我碰到的情形。

2. 由bug引起,Bug 14324057 and Bug 12529945有潛在消耗desktop heap memory的可能,需要修改註冊表,來增加預設desktop heap size來解決。
文檔中還提到,bug針對的是11.2.0.2 patch 19——11.2.0.3 patch 7,但這個庫已經打到patch 27了,如果是bug也應該已經修複了。於是嘗試修改了註冊表,修改\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems\下面的SharedSection=1024,20480,1024,修改前先備份一下註冊表

修改前:
%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=1024,20480,768 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=winsrv:ConServerDllInitialization,2 ServerDll=sxssrv,4 ProfileControl=Off MaxRequestThreads=16
修改後:
%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=1024,20480,1024 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=winsrv:ConServerDllInitialization,2 ServerDll=sxssrv,4 ProfileControl=Off MaxRequestThreads=16
注意,改完註冊表後需要重啟該節點主機後才會生效。



相關文章

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.