oracle資料檔案的一次恢複經曆

來源:互聯網
上載者:User

環境介紹:

雙機作業系統:solaris 10資料庫版本:oracle 11g R1 64bit

      1、半夜接到電話,說資料庫報大量錯誤,起來查看資料庫,探索資料庫已崩潰,查看alert日誌,發現I/O報錯:

Wed Dec 18 00:36:57 2013KCF: write/open error block=0x98abe online=1file=89 /dev/raw/raw03error=27063 txt: 'SVR4 Error: 5: I/O errorAdditional information: -1Additional information: 8192'Wed Dec 18 00:36:57 2013KCF: write/open error block=0x9d70f online=1file=91 /dev/raw/raw05error=27063 txt: 'SVR4 Error: 5: I/O errorAdditional information: -1Additional information: 8192'Automatic datafile offline due to write error onAutomatic datafile offline due to write error on

      2、之前出過因為工程隊碰到線,導致現網問題,問機房人員今晚是否有工程,機房人員說,今晚有新裝置接入SAN網路,但經瞭解,無人碰到線,查看本機系統日誌,報錯如下:

Dec 17 23:33:10 fly-db01 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@fly6000c5d0008a0000006b131400440 (ssd28):Dec 17 23:33:10 fly-db01 SCSI transport failed: reason 'tran_err': retrying commandDec 17 23:33:10 fly-db01 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@fly6000c5d0008a0000006b131400930 (ssd52):

     3、查看其他使用到SAN儲存的伺服器資料庫運行情況,alert日誌,作業系統日誌是否報錯,以及掛載的為SAN儲存的檔案系統是否變成唯讀,發現部分資料庫也已崩潰,部分主機的檔案系統變成唯讀,作業系統日誌報lpfc的錯誤,之前也出現過這個錯誤,一般在出現lpfc的錯誤一段時間後,檔案系統就會出現唯讀

lpfc錯誤:fly008:/var/log # cat messages | grep lpfcDec 18 00:34:05 fly008 kernel: [10201542.768302] lpfc 0000:03:00.0: 0:(0):0203 Devloss timeout on WWPN 21:4g:00:0b:5e:6a:18:14 NPort x014400 Data: x40000 x1 x0Dec 18 00:34:07 fly008 kernel: [10201544.816750] lpfc 0000:03:00.0: 0:(0):0203 Devloss timeout on WWPN 21:4h:00:0b:5e:6a:18:14 NPort x014500 Data: x0 x7 x0Dec 18 00:34:07 fly008 kernel: [10201544.816802] lpfc 0000:03:00.0: 0:(0):0203 Devloss timeout on WWPN 21:4k:00:0b:5e:6a:18:14 NPort x014600 Data: x0 x7 x0檔案系統唯讀錯誤:fly008~ #df -hFilesystem            Size  Used Avail Use% Mounted on/dev/mapper/vg_fly008_app-lv_fly008_app99G   41G   53G  44% /home/fly008fly008~ # cd /home/fly008fly008/home/fly008 # touch 1.txttouch: cannot touch `1.txt': Read-only file system

     4、對今晚所做配置進行回退,問題消失,重新拉起資料庫,資料庫正常啟動,拉起應用,應用拉起失敗,在應用的日誌中,報如下錯誤:

SQLErrorCode: 376 ORA-00376: file 92 cannot be read at this timeORA-01110: data file 92: '/dev/raw/raw06'

     5、在資料庫的alert日誌,也報相關錯誤

DDE: Problem Key 'ORA 1110' was flood controlled (0x5) (no incident)ORA-01110: 資料檔案 92: '/dev/raw/raw06'*** 2013-12-18 05:04:16.284ORA-12012: 自動執行作業 226 出錯ORA-00372: 此時無法修改檔案 92ORA-06512: 在 "FLY.DELETE_FLY_EXCEPTION_INFO", line 8ORA-06512: 在 line 1

     6、查看資料檔案的狀態,標記為recover,需要進行恢複操作

SQL> SELECT file_name, file_id, tablespace_name, status, online_status FROM DBA_DATA_FILES ORDER BY TABLESPACE_NAME;FILE_NAME       FILE_ID  TABLESPACE_NAME   STATUS        ONLINE_STATUS/dev/raw/raw06     92        FLY           AVAILABLE      RECOVER

      7、資料庫開啟了歸檔,有資料庫的備份,對92的檔案進行恢複操作

# su - oracle$ sqlplus / as sysdbaSQL> archive log list;SQL> recover datafile 92;SQL> alter database datafile 92 online

     8、恢複後,應用拉起正常,業務測試正常。

本文出自 “斜陽悠悠寸草心” 部落格,請務必保留此出處http://fly1116.blog.51cto.com/8301004/1344552

相關文章

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.