Restoration note for a database with 15 TB of Elasticsearch data

Source: Internet
Author: User
Tags rollback

The customer's database experienced a fault before the Spring Festival. After several people attempt to recover the database, the database was opened. The following was reported when the database was opened:

Wed Jan 13 17:03:25 2016
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, Query Duration = 1452675805 sec, SCN: 0x0d6a. 46c6524f ):
Wed Jan 13 17:03:25 2016
Select ctime, mtime, stime from obj $ where obj # =: 1
Wed Jan 13 17:03:25 2016
Errors in file/u01/app/oracle/admin/xxxx/udump/xxxx1_ora_18274.trc:
ORA-00704: bootstrap process failure.
ORA-00704: bootstrap process failure.
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 20 with name "_ SYSSMU20 $" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704.
Instance terminated by USER, pid = 18274
ORA-1092 signalled during: alter database open resetlogs...
Wed Jan 13 17:06:34 2016

This error is actually very common and has been restored too much. We will not describe it too much here. But here I am surprised that the SQL Query Duration is too large.

Based on experience, in this case, the SCN can be promoted directly. However, when we perform the following operations, we find that it does not work:
Alter session set events '10015 trace name adjust_scn level 13740 & prime ;;
We further found that the SQL statement accessed the following blocks through the 10046 trace:


Parsing in cursor #5 len = 52 dep = 1 uid = 0 oct = 3 lid = 0 tim = 1422682994194207 hv = 429618617 ad = '395fa870'
Select ctime, mtime, stime from obj $ where obj # =: 1
END OF STMT
PARSE #5: c = 0, e = 225, p = 0, cr = 0, cu = 0, mis = 1, r = 0, dep = 1, og = 4, tim = 1422682994194205
BINDS #5:
Kkscoacd
Bind #0
Oacdty = 02 mxl = 22 (22) mxlc = 00 mal = 00 scl = 00 pre = 00
Oacflg = 08 fl2 = 0001 frm = 00 csi = 00 siz = 24 off = 0
Kxsbbbfp = 2b7694a4aea8 bln = 22 avl = 02 flg = 05
Value = 20
EXEC #5: c = 0, e = 398, p = 0, cr = 0, cu = 0, mis = 1, r = 0, dep = 1, og = 4, tim = 1422682994194653
WAIT #5: nam = 'DB file sequential read 'ela = 20378 file # = 1 block # = 218 blocks = 1 obj #=-1 tim = 1422682994215120
WAIT #5: nam = 'DB file sequential read 'ela = 480 file # = 1 block # = 219 blocks = 1 obj #=-1 tim = 1422682994215712
WAIT #5: nam = 'DB file sequential read 'ela = 18990 file # = 1 block # = 122 blocks = 1 obj #=-1 tim = 1422682994234841
.......
EXEC #6: c = 0, e = 141, p = 0, cr = 0, cu = 0, mis = 0, r = 0, dep = 2, og = 3, tim = 1422682994267351
FETCH #6: c = 0, e = 34, p = 0, cr = 2, cu = 0, mis = 0, r = 1, dep = 2, og = 3, tim = 1422682994267413
STAT #6 id = 1 cnt = 1 pid = 0 pos = 1 obj = 15 op = 'Table access by index rowid undo $ (cr = 2 pr = 0 pw = 0 time = 28 us)'
STAT #6 id = 2 cnt = 1 pid = 1 pos = 1 obj = 34 op = 'index unique scan I _UNDO1 (cr = 1 pr = 0 pw = 0 time = 16 us)'
WAIT #5: nam = 'DB file sequential read 'ela = 12312 file # = 7 block # = 4993 blocks = 1 obj #=-1 tim = 1422682994279882
WAIT #5: nam = 'DB file sequential read 'ela = 18776 file # = 7 block # = 4965 blocks = 1 obj #=-1 tim = 1422682994298789
WAIT #5: nam = 'DB file sequential read 'ela = 13157 file # = 7 block # = 4801 blocks = 1 obj #=-1 tim = 1422682994312081
WAIT #5: nam = 'DB file sequential read 'ela = 12519 file # = 7 block # = 4954 blocks = 1 obj #=-1 tim = 1422682994324726
WAIT #5: nam = 'DB file sequential read 'ela = 410 file # = 7 block # = 4952 blocks = 1 obj #=-1 tim = 1422682994325259
WAIT #5: nam = 'DB file sequential read 'ela = 5447 file # = 7 block # = 4778 blocks = 1 obj #=-1 tim = 1422682994330830
WAIT #5: nam = 'DB file sequential read 'ela = 12349 file # = 7 block # = 5184 blocks = 1 obj #=-1 tim = 1422682994343291
WAIT #5: nam = 'DB file sequential read 'ela = 11874 file # = 5 block # = 8645 blocks = 1 obj #=-1 tim = 1422682994355283
WAIT #5: nam = 'DB file sequential read 'ela = 4925 file # = 5 block # = 8595 blocks = 1 obj #=-1 tim = 1422682994360323
FETCH #5: c = 4999, e = 165865, p = 15, cr = 18, cu = 0, mis = 0, r = 0, dep = 1, og = 4, tim = 1422682994360535
ORA-00704: bootstrap process failure.
ORA-00704: bootstrap process failure.
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 20 with name "_ SYSSMU20 $" too small

By dumping the above blocks separately, we find that file 1 block 122 has a small problem, as shown below:


Seg/obj: 0x12 csc: 0xd6c. 1abf4d18 itc: 1 flg:-typ: 1-DATA
Fsl: 0 fnx: 0x0 ver: 0x01
 
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x003f. 02e. 001cc47d 0x01c012f4. 8562.29 -- U-1 fsc 0x0000. 1abf4d19
 
Data_block_dump, data header at 0x8b3da44
====================
Tsiz: 0x1fb8
Hsiz: 0xea
Pbl: 0x08b3da44
Bdba: 0x0040007a
76543210
Flag = --------
Ntab = 1
Nrow= 108

Copy the block to the file system by script. After bbed is modified, copy it back to asm diskgroup.
Then, the scn is pushed forward again, and the database can be opened smoothly.
It should be noted that although the database is opened, there are still a lot of aftercare work in the future. For example, if dbv finds that undo has bad blocks, it is necessary to reconstruct undo; check whether alert log is accompanied by other errors.
Secondly, for a forced open database, we recommend that you use the mos script to check whether there is any exception in the data dictionary. If there is a significant exception in the data dictionary, it is usually necessary to re-build the database through logic export; otherwise, you do not need to recreate the database.
I think there is no final conclusion about whether to re-build the database. To ensure security, we usually recommend that you re-build the database. You can also consider re-building the database when the database is very small. Otherwise, you should check the database for warnings, or if the database runs for a period of time without any other exceptions, you do not need to recreate the database.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.