Emergency fault handling method for Oracle database system

An Oracle physical structure failure is any database failure caused by corruption of one of the physical files that make up the database. Such failures may result from a hardware fault or from human error, so the first task is to determine the cause. If it is a hardware fault, fix the hardware first. If the hardware is sound, follow the procedures below.

Control file corruption:
The control file records important configuration information about the Oracle database, such as the database name, the character set name, and the locations of the individual data files and log files. Control file corruption causes the database to shut down abnormally. Once the control file is lost, the database cannot start, which makes this a serious failure.
The failure can be confirmed by checking the database log file $ORACLE_BASE/admin/bdump/alert_orcl.ora.

A single control file is corrupted:
1. Make sure the database is shut down. If it is not, close it with the following command:
svrmgrl>shutdown immediate;
2. View the initialization file $ORACLE_BASE/admin/pfile/initorcl.ora and determine the paths of all control files.
3. Using an operating system command, overwrite the corrupted control file with one of the intact control files.
4. Restart the database with the following command:
svrmgrl>startup;
5. Take a full backup of the database using the appropriate method.
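
Step 2 above can be partly automated. The following is a minimal Python sketch, not part of the original procedure, that pulls the control_files entries out of the text of an init.ora file. The function name control_file_paths is hypothetical, and the sketch assumes the paths contain no spaces, commas, or '#' characters.

```python
import re


def control_file_paths(init_ora_text):
    """Return the paths listed in the control_files parameter of an init.ora text."""
    # Drop comments (everything after '#') so commented-out settings are ignored.
    lines = [line.split("#", 1)[0] for line in init_ora_text.splitlines()]
    text = "\n".join(lines)
    # The value is either a parenthesised list (which may span several lines)
    # or the rest of a single line.
    match = re.search(r"(?im)^\s*control_files\s*=\s*(\((?:.|\n)*?\)|[^\n]*)", text)
    if match is None:
        return []
    value = match.group(1).strip().strip("()")
    # Split on commas or whitespace, then strip surrounding quotes from each path.
    parts = re.split(r"[,\s]+", value)
    return [p.strip("'\"") for p in parts if p.strip("'\"")]
```

Reading the file with open(...).read() and passing the text to the function gives the list of control file copies, so the intact ones can be compared against the corrupted one before step 3.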

All control files are corrupted:
1. Make sure the database is shut down. If it is not, close it with the following command:
svrmgrl>shutdown immediate;
2. Restore the most recent control files from the corresponding backup set. Sites without a tape-library backup can restore the most recent control file backup directly from tape to the appropriate directory. Sites with a tape-library backup use the corresponding RMAN script to restore the most recent control file.
3. Use the following commands to generate a script that creates the database control file:
svrmgrl>startup mount;
svrmgrl>alter database backup controlfile to trace noresetlogs;
4. Edit the trace file generated in step 3: copy the statements that create the control file and modify them so that they reflect the latest database structure. Assume that the resulting SQL file is named createcontrol.sql.
Note:
The exact path of the trace file can be determined by viewing the $ORACLE_BASE/admin/bdump/alert_orcl.ora file after performing step 3.
5. Re-create the control file with the following commands:
svrmgrl>shutdown abort;
svrmgrl>startup nomount;
svrmgrl>@createcontrol.sql
6. Take a full backup of the database using the appropriate method.

Redo log file corruption:
Every insert, delete, and update in the database is recorded in the redo log. If the currently active redo log file is corrupted, the database shuts down abnormally. An inactive redo log eventually becomes the active redo log after log switches, so a corrupted inactive redo log will also eventually cause the database to terminate abnormally. Each redo log group in Ipas/mswitch has only one member, so the following analysis considers only the corruption of a whole redo log group, not the corruption of individual redo log members.

Determine the location of the corrupted redo log and its status:
1. If the database is in a usable state:
svrmgrl>select * from v$logfile;
svrmgrl>select * from v$log;
2. If the database has terminated abnormally:
svrmgrl>startup mount;
svrmgrl>select * from v$logfile;
svrmgrl>select * from v$log;
In v$logfile, a status of INVALID indicates that the log file has been corrupted. In v$log, a status of INACTIVE means the redo log group is inactive, ACTIVE means the redo log group is in the active state, and CURRENT means the redo log group is the one currently in use.

The corrupted log file is inactive:
1. Drop the corresponding log group:
svrmgrl>alter database drop logfile group group_number;
2. Re-create the corresponding log group:
svrmgrl>alter database add logfile group group_number ('log_file_description', ...) size log_file_size;

The corrupted log file is active but is not the current log:
1. Clear the corresponding log group:
svrmgrl>alter database clear unarchived logfile group group_number;

The corrupted log file is the current log file:
Clear the corresponding log group with the following command:
svrmgrl>alter database clear unarchived logfile group group_number;
If the clear operation fails, the only option is an incomplete, point-in-time recovery.
Open the database and take a full backup using the appropriate method:
svrmgrl>alter database open;
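
The three redo log cases above reduce to a lookup on the v$log status. The following minimal Python sketch encodes that decision as described in this article. The function name redo_log_recovery is hypothetical, the log file description and size remain placeholders to be filled in by the operator, and any status other than the three discussed here is rejected rather than guessed at.

```python
def redo_log_recovery(status, group_number):
    """Return the statements this article prescribes for a corrupted redo log group."""
    state = status.strip().upper()
    group = int(group_number)
    if state == "INACTIVE":
        # Inactive group: drop it, then re-create it (file names and size are placeholders).
        return [
            f"alter database drop logfile group {group};",
            f"alter database add logfile group {group} ('log_file_description', ...) size log_file_size;",
        ]
    if state in ("ACTIVE", "CURRENT"):
        # Active or current group: clear it; if that fails, fall back to
        # incomplete point-in-time recovery, which this sketch does not cover.
        return [f"alter database clear unarchived logfile group {group};"]
    raise ValueError(f"status not covered by this procedure: {status!r}")
```

For example, redo_log_recovery("INACTIVE", 2) yields the drop statement followed by the add statement template for group 2.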

Partial data file corruption:
If the corrupted data file belongs to a non-system tablespace, the database can remain open for operation, although the corrupted data file is inaccessible. In this case the corrupted data file can be recovered on its own while the database is open. If a data file in the system tablespace is corrupted, the database terminates abnormally and can only be started in mount mode until the data file has been recovered. View the database log file to determine whether the corrupted data file belongs to the system tablespace.

Corrupted data file in a non-system tablespace:
1. Determine the name of the corrupted file:
svrmgrl>select name from v$datafile where status='INVALID';
2. Take the corrupted data file offline:
svrmgrl>alter database datafile 'datafile_name' offline;
3. Restore the most recent backup of this data file from the corresponding backup set. Sites without a tape-library backup can restore it directly from tape. Sites with a tape-library backup use the corresponding RMAN script.
4. Recover the data file:
svrmgrl>alter database recover datafile 'datafile_name';
5. Bring the data file online:
svrmgrl>alter database datafile 'datafile_name' online;
6. Take a full backup of the database using the appropriate method.

Corrupted data file in the system tablespace:
1. Start the database in mount mode:
svrmgrl>startup mount;
2. Restore the most recent backup of this data file from the corresponding backup set. Sites without a tape-library backup can restore it directly from tape. Sites with a tape-library backup use the corresponding RMAN script.
3. Recover the system tablespace data file:
svrmgrl>alter database recover datafile 'datafile_name';
4. Open the database:
svrmgrl>alter database open;
5. Take a full backup of the database using the appropriate method.
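
The two data file procedures above differ only in whether the database stays open. The following minimal Python sketch lays the two sequences side by side. The function name datafile_recovery_steps is hypothetical, the restore step is emitted as a comment line because it depends on the site's backup method, and the final full backup is left out.

```python
def datafile_recovery_steps(datafile_name, in_system_tablespace):
    """Return the ordered steps this article describes for one corrupted data file."""
    # Double any single quote so the file name is a valid SQL string literal.
    quoted = "'" + datafile_name.replace("'", "''") + "'"
    restore = f"-- restore the most recent backup of {datafile_name} (tape or RMAN script)"
    recover = f"alter database recover datafile {quoted};"
    if in_system_tablespace:
        # System tablespace: the database can only be mounted until recovery is done.
        return ["startup mount;", restore, recover, "alter database open;"]
    # Non-system tablespace: the database stays open; only the file goes offline.
    return [
        f"alter database datafile {quoted} offline;",
        restore,
        recover,
        f"alter database datafile {quoted} online;",
    ]
```

Printing the returned list gives an operator a checklist to run by hand in Server Manager. The sketch does not execute anything against a database.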

Tablespace corruption:
If a non-system tablespace is corrupted, the database can remain open for operation, but the corrupted tablespace cannot be accessed. The corrupted tablespace can then be recovered on its own while the database is open. If the system tablespace is corrupted, the database terminates abnormally and can only be started in mount mode, after which the tablespace is recovered. View the database log file to determine whether the corrupted tablespace is the system tablespace.

A non-system tablespace is corrupted:
1. Take the corrupted tablespace offline:
svrmgrl>alter tablespace tablespace_name offline;
2. Restore the most recent backup of this tablespace from the corresponding backup set. Sites without a tape-library backup can restore it directly from tape. Sites with a tape-library backup use the corresponding RMAN script.
3. Recover the tablespace:
svrmgrl>alter database recover tablespace tablespace_name;
4. Bring the tablespace online:
svrmgrl>alter tablespace tablespace_name online;
5. Take a full backup of the database using the appropriate method.

The system tablespace is corrupted:
1. Start the database in mount mode:
svrmgrl>startup mount;
2. Restore the most recent backup of the system tablespace from the corresponding backup set. Sites without a tape-library backup can restore it directly from tape. Sites with a tape-library backup use the corresponding RMAN script.
3. Recover the system tablespace:
svrmgrl>alter database recover tablespace system;
4. Open the database:
svrmgrl>alter database open;
5. Take a full backup of the database using the appropriate method.

All files in the entire database are corrupted:
Corruption of every file in the database typically occurs when a shared disk array suffers an unrecoverable failure, in which case the only option is to restore the whole database. If the archive directory of the database has also been lost, complete recovery is not possible and some user data will be lost.

Sites without a tape-library backup:
1. Unpack the files of the most recent backup from tape into the appropriate directories.
2. Start the database in mount mode:
svrmgrl>startup mount;
3. Recover the database:
svrmgrl>recover database until cancel;
4. Open the database:
svrmgrl>alter database open resetlogs;
5. Take a full backup of the database using the appropriate method.

Sites with a tape-library backup:
1. Start the database in nomount mode:
svrmgrl>startup nomount;
2. Restore and recover the database with the corresponding RMAN script:
$ rman CMDFILE=HOT_DATABASE_RESTORE.RCV
3. Open the database:
svrmgrl>alter database open resetlogs;
4. Take a full backup of the database using the appropriate method.

The following are some classic emergency recovery scenarios for sites that have a recent full cold backup of the database:
Data files, archived redo logs, and control files are lost or corrupted at the same time:
Case with no new archived logs:
Conditions and assumptions: no new archived log(s) have been generated since the last mirrored backup; the database runs in ARCHIVELOG mode; mirrored (cold) copies of the data file(s) and control file(s) exist and are synchronized with each other.
Recovery steps:
1. Copy the mirrored copies of the data file(s) and control file(s) back to their original locations:
$ cp /backup/good_one.dbf /orig_loc/bad_one.dbf
$ cp /backup/control1.ctl /disk1/control1.ctl
2. Start the database with the mount option:
$ svrmgrl
svrmgrl> connect internal
svrmgrl> startup mount
3. Recover the database with the old control file:
svrmgrl> recover database using backup controlfile until cancel;
Media recovery complete.
(You must cancel immediately.)
4. Reset the log files (this step cannot be omitted, or the database will not open):
svrmgrl> alter database open resetlogs;
5. Shut down the database and take a full cold backup.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
