A rescue case: ORA-600 [4000] errors caused by forced recovery



I once wrote in a book:

Fearlessness is more terrible than ignorance.

An unexpected disaster can strike a database at any time. So if you are not sure what an operation will do to a database, be cautious, and avoid exposing yourself and your database to unknown danger. Most basic of all: before any risky maintenance operation, back up the data first.

In the face of data, the ignorant cannot afford to be fearless.

Disaster description

In 2011, a core database of a carrier customer failed. After an engineer's hasty intervention, the failure turned into a disaster: the database could no longer be started.

The sequence of events was as follows.

1. The customer accidentally deleted a data file, and the database reported an error.

2. A third-party service provider stepped in and tried to deal with the missing file.

3. The engineer chose to recreate the control file, without performing a complete recovery of the current database.

4. A hidden parameter was used to force the database open with resetlogs.

5. A series of ORA-600 internal errors followed, and the database could not be started.

6. The backups were checked, and it turned out that no backup had been taken recently.

7. The disaster was complete.

The case was not complicated to begin with: a single data file was lost. Even without a backup, the following options were available.

1. Before the database is shut down, recover the file through its still-open file descriptor, as in the case in the previous chapter.

2. If the database has already been shut down, the file cannot be recovered, and the loss is acceptable, take the file offline and discard it.

3. If the file is very important, try to recover it at the storage level.

Which measure to take depends on how important the data is, but any rash action eliminates some of these options. For example, once the database is shut down, option 1 is gone; if files are copied at the storage level and the storage contents are overwritten, option 3 is gone.
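Option 1 relies on a Linux behavior: the contents of a deleted file stay reachable for as long as some process still holds the file open, through that process's entries under /proc/<pid>/fd. The following is a minimal Python sketch of that idea, not the procedure used in this case. The function names are hypothetical, and it assumes a Linux host, enough privileges to read the target process's descriptors, and a process that is still running and still holds the file open. A copy taken this way from a running database is not consistent and would still need recovery before it could be used.

```python
import os
import shutil


def find_deleted_open_files(pid):
    """Return (fd_path, original_path) pairs for files that the process
    still holds open although they have been unlinked (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    suffix = " (deleted)"  # marker the kernel appends to unlinked targets
    found = []
    for name in os.listdir(fd_dir):
        fd_path = os.path.join(fd_dir, name)
        try:
            target = os.readlink(fd_path)
        except OSError:
            continue  # descriptor was closed while we were scanning
        if target.endswith(suffix):
            found.append((fd_path, target[: -len(suffix)]))
    return found


def copy_out(fd_path, dest):
    """Copy the contents of the still-open file to a new location."""
    with open(fd_path, "rb") as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

The sketch only works while the holding process is alive, which is why shutting down the database removes this option.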

Therefore, it is particularly important to judge each kind of data disaster correctly and to find appropriate technical support. In this case, all three options were closed off, and the database was taken further and further in the wrong direction.

Case warning

After analyzing the entire process of this case, we have summarized the following lessons.

1. Fearlessness is more terrible than ignorance

In this case, the technical staff clearly lacked sufficient knowledge of Oracle and rashly took the wrong measures and steps, which left the state of the database a world away from what the user originally intended.

The three options described above were all ruled out by a series of misoperations, and the database was left to face the catastrophic test of inconsistency.

The technical staff in this case should have remembered one important rule before starting to handle the fault: protect the site, that is, preserve the current state of the system, and at the very least do not make the situation worse.

Only after the site is protected should recovery steps that are destructive or uncertain be attempted. If a backup cannot be taken in time and recovery has to be attempted in the current environment, engineers need solid skills and accurate judgment, and must be clear about the possible consequences of every command and step and about how to handle what follows.

Every engineer should ask: if the database cannot be started after a recovery attempt, what else can we do? Thinking one step further is being responsible to users and to yourself. Ignorant fearlessness has no place in a production environment.

2. Do not go beyond your own capabilities

In technical work, if the means and methods you are about to use are beyond your capabilities and may have unpredictable consequences, it is best not to take the risk. Taking it is irresponsible to users and brings trouble on yourself. If you do decide to take the risk, at least run a similar test in your own test environment first to see what can happen. Such a test does not take much time.

In this case, an internal parameter of the Oracle database was used inappropriately, which made the problem worse. Using internal parameters requires a very clear understanding of what they mean and what they can cause, together with a plan for dealing with the consequences.

Users should also supervise third-party service providers properly and confirm their actions, so that unexpected consequences are not rashly introduced into the database.

3. Back up data before irreversible operations

Before performing an irreversible operation, back up the data and make sure that it can be rolled back to its previous state. This is also part of protecting the site.

Beyond the database backup itself, a backup should be taken before certain modification operations, for example of specific data blocks, file headers, ASM disk groups, and log files. These backups can help at critical moments.

Log files deserve special mention here, because a forced open with resetlogs clears and reinitializes the log files. Without a backup their contents are lost permanently, and sometimes parsing the logs is what allows the maximum amount of data to be retrieved. Backing up the log files is therefore also very important.
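As an illustration of taking such a copy before an irreversible step, here is a minimal Python sketch that copies a list of files to a backup directory and verifies each copy by checksum. It is not a procedure from the book: the function names are hypothetical, and the sketch assumes the files live on an ordinary file system (not in ASM) and are not being written to while they are copied, for example because the instance is shut down.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_files(paths, backup_dir):
    """Copy each file into backup_dir and verify the copy by checksum.

    Returns {source_path: backup_path}. Refuses to overwrite an
    existing backup and raises if a copy does not match its source.
    """
    os.makedirs(backup_dir, exist_ok=True)
    copies = {}
    for src in paths:
        dst = os.path.join(backup_dir, os.path.basename(src))
        if os.path.exists(dst):
            raise FileExistsError(f"refusing to overwrite {dst}")
        shutil.copy2(src, dst)  # copies data and file metadata
        if sha256_of(src) != sha256_of(dst):
            raise IOError(f"checksum mismatch for {src}")
        copies[src] = dst
    return copies
```

A call might look like backup_files(["/u01/oradata/ORCL/redo01.log"], "/backup/pre_resetlogs"), where both paths are made up for the example. Refusing to overwrite an existing copy is deliberate: an earlier backup may be the only intact version of a file.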

4. Managers need to participate in decision-making

For important data environments, managers should communicate with the technical staff before important operations, even if they do not know the technical details, and listen to the technical solution, the operation plan, the implementation steps, the measures for protecting the site, and the rollback plan.

Although managers may not understand the technical details, their view of the overall situation and their careful thinking act as a safeguard for technical decisions. Managers should also gauge whether the technical staff understand the details, have a clear grasp of the solution, and are confident in its implementation. Communication, inquiry, and questioning also push the technical staff to improve their solutions and think them through.

With this step in place, many risks can be avoided. The judgment and decisions of managers should therefore also be an important part of data management.

 

Protect the data and protect the site. Thinking carefully and deciding cautiously when handling faults is the responsibility of both users and engineers. Only through joint effort can the lasting security of the data environment be ensured.

 

This article is excerpted from Oracle DBA Manual 4: Data Security Alerts

Gaiguoqiang

Published by Electronic Industry Publishing House

Book details: http://blog.csdn.net/broadview2006/article/details/7744623
