Recovery cases for several sets of ASM RAC for Oracle databases

Source: Internet
Author: User
Tags dba error code rollback

Recently helped a customer recover 6 sets of Oracle RAC, both ASM and version 10.2.0.4. Stay up for several days, almost vomiting blood.
Here is a brief description of the recovery of one of the libraries, to share with you.
Several of these have basically encountered the following ORA-00600 error:


Thu Dec 31 11:55:46 2015
Success:diskgroup DG1 was mounted
Thu Dec 31 11:55:50 2015
Errors in FILE/ORACLE/ADMIN/XXX/UDUMP/XXX1_ORA_28803.TRC:
Ora-00600:internal error code, arguments: [Kccpb_sanity_check_2], [13715626], [13715623], [0x000000000], [], [], [], []
Success:diskgroup DG1 was dismounted
Thu Dec 31 11:55:51 2015
The error, in fact, is very simple, mainly because the control file corruption, by rebuilding the control file or using the backup control files for restore can be mount, even if we use the control file snapshots are able to mount the database, and then proceed to restore operations. The following errors were also encountered during the recovery process:


Errors in FILE/ORACLE/ADMIN/XXX/UDUMP/XXX1_ORA_6990.TRC:
Ora-00600:internal error code, arguments: [Kclchkblk_4], [3431], [18446744072948603858], [3431], [18446744072948586897 ], [], [], []
Tue 5 10:52:28 2016
Errors in FILE/ORACLE/ADMIN/XXX/BDUMP/XXX1_ARC0_8205.TRC:
ora-19504:failed to create file "+DG1/XXX/ARCHIVELOG/1_2_900069464.DBF"
Ora-17502:ksfdcre:4 Failed to create file +dg1/xxx/archivelog/1_2_900069464.dbf
Ora-00600:internal error code, arguments: [kffbAddBlk04], [], [], [], [], [], [], []
Tue 5 10:52:28 2016
Arc0:error 19504 Creating archive log file to ' +dg1/xxx/archivelog/1_2_900069464.dbf '
Arch:archival stopped, error occurred. would continue retrying
Tue 5 10:52:30 2016
ORACLE Instance Xxx1-arc
The above ORA-00600 error is actually very simple, mainly is the data block SCN problem. Here is a general description of the recovery of one set of libraries, because there is something very magical in the process of restoring the library.




sql> startup Mount Pfile= '/tmp/p.ora ';


ORACLE instance started.





Total System Global area 2.1475E+10 bytes


Fixed Size 2122368 bytes


Variable Size 2399145344 bytes


Database buffers 1.9059E+10 bytes


Redo buffers 14651392 bytes


Database mounted.


sql> ALTER DATABASE ADD LOGFILE THREAD 2


2 GROUP 3 (


3 ' +dg/xxxx/onlinelog/group_3.271.752099989 ',


4 ' +dg/xxxx/onlinelog/group_3.272.752099991 '


5 SIZE 100M Reuse,


6 GROUP 4 (


7 ' +dg/xxxx/onlinelog/group_4.273.752099991 ',


8 ' +dg/xxxx/onlinelog/group_4.274.752099993 '


9 SIZE 100M Reuse,


GROUP 6 (


One ' +dg/xxxx/onlinelog/group_6.275.752099993 ',


' +dg/xxxx/onlinelog/group_6.276.752099993 '


SIZE 100M reuse;


ALTER DATABASE ADD LOGFILE THREAD 2


*


ERROR at line 1:


Ora-01276:cannot Add File +dg/xxxx/onlinelog/group_3.271.752099989. File has


An Oracle Managed Files file name.


Since it is Oracle RAC, it is necessary to add redo logfile after rebuilding the control file, but add logfile found the above error. According to Oracle Metalink some of the methods are not successful, are reported above the error, it is really weird.


Some people look at the above error, may think is set the OMF parameter, in fact, here is not, I will all the relevant parameters modified, the error remains.


When actually adding logfile, write only the disk group name and do not need to write an absolute path.


Then after the recover for open resetlogs, the error ORA-01248, as follows:




sql> startup Mount Pfile= '/tmp/p.ora ';


ORACLE instance started.





Total System Global area 2.1475E+10 bytes


Fixed Size 2122368 bytes


Variable Size 2399145344 bytes


Database buffers 1.9059E+10 bytes


Redo buffers 14651392 bytes


Database mounted.


sql> Recover database using Backup controlfile until cancel;


Ora-00279:change 13300428179625 generated at 04/04/2013 12:51:35 for


Thread 1


Ora-00289:suggestion: +dg/archivelog/arch1_752099890_12809_1.log


Ora-00280:change 13300428179625 for thread 1 was in sequence #12809





Specify log: {<ret>=suggested | AUTO | CANCEL}


Auto


Ora-00308:cannot Open Archived Log


' +dg/archivelog/arch1_752099890_12809_1.log '


Ora-17503:ksfdopn:2 Failed to open file


+dg/archivelog/arch1_752099890_12809_1.log


Ora-15173:entry ' Arch1_752099890_12809_1.log ' does not exist in directory


' Archivelog '





Ora-00308:cannot Open Archived Log


' +dg/archivelog/arch1_752099890_12809_1.log '


Ora-17503:ksfdopn:2 Failed to open file


+dg/archivelog/arch1_752099890_12809_1.log


Ora-15173:entry ' Arch1_752099890_12809_1.log ' does not exist in directory


' Archivelog '





Ora-01547:warning:recover succeeded but OPEN Resetlogs would get error below


Ora-01194:file 1 needs more recovery to be consistent


Ora-01110:data file 1: ' +dg/xxxx/datafile/system.256.752099833 '





sql> ALTER DATABASE open resetlogs;


ALTER DATABASE open Resetlogs


*


ERROR at line 1:


Ora-01248:file created in the future of incomplete recovery


Ora-01110:data file: ' +dg/xxxx/datafile/file_tab_xdidx03.ora '


This error is still relatively rare, and in fact, those claims on the web, as well as the solutions offered by Oracle MOS, I find impossible.


Helpless can only offline it first, and then to restore. Before I proceed to open I queried the current checkpoint SCN as follows:




Sql> select file#,checkpoint_change# from V$datafile;





file# checkpoint_change#


---------- -----------------------


1 14731601024328


2 14731601024328


3 14731601024328


4 13300428179625


5 14731601024328


6 14731601024328


7 14731601024328


.......


39 14731601024328


40 14731601024328


41 14731601024328


42 14731601024328


43 14731601024328





Selected rows.





Sql> C/datafile/datafile_header


1* Select file#,checkpoint_change# from V$datafile_header


Sql>/





file# checkpoint_change#


---------- -----------------------


1 14731601024328


2 14731601024328


3 14731601024328


4 13300428179625


5 14731601024328


6 14731601024328


7 14731601024328


......


40 14731601024328


41 14731601024328


42 14731601024328


43 14731601024328





Selected rows.


Because of open failure, here I think is not these 2 files have a problem, and with the previous snapshot control file for recover, and then again with the reconstructed control file database for recover, found that the magical things appear:




sql> Recover database using Backup controlfile until cancel;


Ora-00279:change 13305808683011 generated at 01/11/2016 21:09:02 for


Thread 1


Ora-00289:suggestion: +dg/archivelog/arch1_900882531_1_1.log


Ora-00280:change 13305808683011 for thread 1 was in sequence #1





Specify log: {<ret>=suggested | AUTO | CANCEL}


Auto


Ora-00308:cannot Open archived log ' +dg/archivelog/arch1_900882531_1_1.log '


Ora-17503:ksfdopn:2 Failed to open file +dg/archivelog/arch1_900882531_1_1.log


Ora-15173:entry ' Arch1_900882531_1_1.log ' does not exist in directory


' Archivelog '





Ora-00308:cannot Open archived log ' +dg/archivelog/arch1_900882531_1_1.log '


Ora-17503:ksfdopn:2 Failed to open file +dg/archivelog/arch1_900882531_1_1.log


Ora-15173:entry ' Arch1_900882531_1_1.log ' does not exist in directory


' Archivelog '





Ora-01547:warning:recover succeeded but OPEN Resetlogs would get error below


Ora-01194:file 1 needs more recovery to be consistent


Ora-01110:data file 1: ' +dg/xxxx/datafile/system.256.752099833 '





sql> ALTER DATABASE datafile offline;


ALTER DATABASE datafile offline;


Database altered.





Sql>





Database altered.





Sql>


sql> ALTER DATABASE open resetlogs;


ALTER DATABASE open Resetlogs


*


ERROR at line 1:


Ora-01092:oracle instance terminated. Disconnection forced


We can see that the open failed, and for the open failure, we first looked at alert log and then 10046 trace.




sql> startup Nomount pfile= '/tmp/p.ora ';


ORACLE instance started.





Total System Global area 2.1475E+10 bytes


Fixed Size 2122368 bytes


Variable Size 2399145344 bytes


Database buffers 1.9059E+10 bytes


Redo buffers 14651392 bytes


Sql> Oradebug Setmypid


Statement processed.





Sql> ALTER DATABASE Mount;





Database altered.





sql> Recover database;


Media recovery complete.


sql> ALTER DATABASE open;


ALTER DATABASE Open


*


ERROR at line 1:


Ora-01092:oracle instance terminated. Disconnection forced


Ora-00704:bootstrap Process Failure


Ora-00704:bootstrap Process Failure


Ora-00604:error occurred at recursive SQL level 1


Ora-01555:snapshot too old:rollback segment number 4 with name "_syssmu4$"


Too small


And here I'm shielding the undo-related parameters. Try again to find the error still. Start again, something magical happened, and the SCN actually regressed.

sql> startup Mount Pfile= '/tmp/p.ora ';


ORACLE instance started.





Total System Global area 2.1475E+10 bytes


Fixed Size 2122368 bytes


Variable Size 2399145344 bytes


Database buffers 1.9059E+10 bytes


Redo buffers 14651392 bytes


Database mounted.


sql> Recover database;


Media recovery complete.





Sql> select checkpoint_change#,file# from V$datafile;





checkpoint_change# file#


--------------------------- ----------


13314398637607 1


13314398637607 2


13314398637607 3


13314398637607 4


13314398637607 5


......


13314398637607 38


13314398637607 39


13314398637607 40


13314398637607 41


0 42


0 43





Selected rows.





Sql> select Checkpoint_change#,file#,checkpoint_time from V$datafile_header;





checkpoint_change# file# Checkpoin


--------------------------- ---------- ---------


13314398637607 1 11-jan-16


13314398637607 2 11-jan-16


13314398637607 3 11-jan-16


13314398637607 4 11-jan-16


13314398637607 5 11-jan-16


......


13314398637607 11-jan-16


13314398637607 11-jan-16


13314398637607 11-jan-16


14731601024328 30-dec-15


14731601024328 30-dec-15





Selected rows.


Obviously, this 133 SCN is back to the past 2 years ago, and time and space travel through .... Of course, open must still be an error:


sql> ALTER DATABASE open;
ALTER DATABASE Open
*
ERROR at line 1:
Ora-01092:oracle instance terminated. Disconnection forced
Ora-00704:bootstrap Process Failure
Ora-00704:bootstrap Process Failure
Ora-00604:error occurred at recursive SQL level 1
Ora-01555:snapshot too old:rollback segment number 4 with name "_syssmu4$"
Too small
No matter why the SCN in the data file header has been reversed (the SCN of the 2 files previously offline is OK). Through 10046 Trace, you get the following:




Parsing in CURSOR #5 len=52 dep=1 uid=0 oct=3 lid=0 tim=1418474357830663 hv=429618617 ad= ' 5db7ea50 '


Select CTime, Mtime, stime from obj$ where obj# =: 1


End of STMT


PARSE #5: c=0,e=531,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1418474357830659


Binds #5:


Kkscoacd


Bind#0


oacdty=02 mxl=22 mxlc=00 mal=00 scl=00 pre=00


oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0


kxsbbbfp=2ad7172aa020 bln=22 avl=02 flg=05


Value=20


EXEC #5: c=1000,e=673,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=1418474357831431


Wait #5: nam= ' db file sequential read ' ela= 9843 file#=1 block#=218 Blocks=1 obj#=-1


.....


FETCH #6: c=17997,e=64936,p=23,cr=566,cu=0,mis=0,r=1,dep=2,og=3,tim=1418474357907669


STAT #6 id=1 cnt=1 pid=0 pos=1 obj=15 op= ' TABLE ACCESS by INDEX ROWID undo$ (cr=566 pr=23 pw=0 us) '


STAT #6 id=2 cnt=1 pid=1 pos=1 obj=34 op= ' INDEX UNIQUE SCAN i_undo1 (cr=1 pr=1 pw=0 time=5769 us) '


Wait #5: nam= ' db file sequential read ' ela= 13031 file#=40 block#=167538 Blocks=1 obj#=-1


FETCH #5: c=19996,e=89548,p=25,cr=568,cu=0,mis=0,r=0,dep=1,og=4,tim=1418474357921006


As we can see here, the error SQL reads file 1 Block 218 and file Block 167538.


For file 1 Block 218, I found no active transactions on the dump, while file blocks 167538 was the undo chunk.


Sql> select name from V$datafile where file#=40;

NAME
--------------------------------------------------------------------------------
+dg/xxxx/datafile/undotbs4
At the same time dump the undo block and find that it does feel a bit unusual, as follows:




********************************************************************************


UNDO BLK:


xid:0x0009.01b.0014320a seq:0xa47 cnt:0x1 irb:0x1 icl:0x0 flg:0x0000





REC Offset rec Offset rec Offset rec Offset rec Offset


---------------------------------------------------------------------------


0x01 0x0014





*-----------------------------


* Rec #0x1 slt:0x1b objn:55417 (0x0000d879) objd:296039 tblspc:20 (0x00000014)


* LAYER:10 (Index) opc:21 RCI 0x00


Undo Type:regular Undo Last Buffer Split:no


Temp Object:no


Tablespace Undo:no


Rdba:0x0a028e71


*-----------------------------


Index general Undo (branch) operations


KTB Redo


op:0x05 ver:0x01


Op:r Itc:2


Itl Xid Uba Flag Lck SCN/FSC


0x01 0x0009.01b.0014320a 0x0a028e71.0a47.04----1 FSC 0x0000.00000000


0x02 0x0009.02b.001428b6 0x0a028e6f.0a47.06----112 FSC 0x0000.00000000


Dump kdige:block dba:0x05d630b3, Seghdr dba:0x06076e89


Restore block before image


Because all the file head SCN is backwards, the normal open error, can only advance SCN, and the SCN must be larger than the maximum SCN of the undo block to do, by adding the parameter *._minimum_giga_scn in the Pfile file.




Sql> Conn/as SYSDBA


Connected to a idle instance.


sql> startup Mount Pfile= '/tmp/p.ora ';


ORACLE instance started.





Total System Global area 2.1475E+10 bytes


Fixed Size 2122368 bytes


Variable Size 2399145344 bytes


Database buffers 1.9059E+10 bytes


Redo buffers 14651392 bytes


Database mounted.


Sql> Show Parameter Job





NAME TYPE VALUE


------------------------------------ ----------- ------------------------------


Job_queue_processes Integer 10


Sql> alter system set job_queue_processes=0;





System altered.





sql> ALTER DATABASE open;





Database altered.





Sql> drop tablespace undotbs3 including contents and datafiles;





Tablespace dropped.





Sql> drop tablespace undotbs4 including contents and datafiles;





Tablespace dropped.


Once you have successfully opened the database, drop and rebuild the existing undo table space immediately.


Although the database is open, however, there are 2 data files before we offline, and the middle of the resetlogs operation, so now can not be normal online.




Sql> select file#,checkpoint_change# from V$datafile;





file# checkpoint_change#


---------- ------------------------------


1 14759124431097


2 14759124431097


3 14759124431097


4 14759124431097


5 14759124431097


......


36 14759124431097


37 14759124431097


38 14759124431097


41 14759124431097


42 0


43 0


44 14759124431097


45 14759124431097





Selected rows.





sql> ALTER DATABASE datafile online;


ALTER DATABASE DataFile Online


*


ERROR at line 1:


Ora-01190:control file or data file is from before to last Resetlogs


Ora-01110:data file: ' +dg/xxxx/datafile/file_tab_xxx03.ora '





sql> ALTER DATABASE datafile online;


ALTER DATABASE DataFile Online


*


ERROR at line 1:


Ora-01190:control file or data file is from before to the last Resetlogs


Ora-01110:data file: ' +dg/xxxx/datafile/file_tab1_xxx05.ora '


Here with bbed the above 2 file header related information to modify, and then recover, you can smooth online file.




Sql> recover datafile 42;


Media recovery complete.





sql> ALTER DATABASE datafile online;





Database altered.





sql> recover datafile 43;


Media recovery complete.


sql> ALTER DATABASE datafile online;





Database altered.





Sql> Select file#,checkpoint_change#, status from V$datafile;





file# checkpoint_change# STATUS


---------- ------------------------------ -------


1 14759124821491 SYSTEM


2 14759124821491 SYSTEM


3 14759124821491 ONLINE


4 14759124821491 ONLINE


5 14759124821491 ONLINE


。。。。。。


14759124821491 ONLINE


Notoginseng 14759124821491 ONLINE


14759124821491 ONLINE


14759124821491 ONLINE


14759124831966 ONLINE


14759124832115 ONLINE


14759124821491 ONLINE


14759124821491 ONLINE





Selected rows.





Sql> alter system checkpoint;





System altered.





Sql> Select file#,checkpoint_change#, status from V$datafile;





file# checkpoint_change# STATUS


---------- ------------------------------ -------


1 14759124832224 SYSTEM


2 14759124832224 SYSTEM


3 14759124832224 ONLINE


4 14759124832224 ONLINE


5 14759124832224 ONLINE


......


14759124832224 ONLINE


14759124832224 ONLINE


14759124832224 ONLINE


14759124832224 ONLINE


14759124832224 ONLINE


14759124832224 ONLINE





Selected rows.


Finally, we recommend that you export and rebuild the database EXPDP. It's over!

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.