Database downtime after replacing a disk in the storage array

Source: Internet
Author: User

Yesterday a damaged hard drive (disk 11) was replaced in our IBM disk array. After the vendor swapped the drive, data synchronization to the new disk failed (disk 12 => disk 11), and we noticed that the array did not show a synchronization failure for the hot spare. A few hours later, RAC1 of the Oracle database attached to this storage began logging a large number of errors, and node 2 went down outright because its DG could not be mounted. The RAC1 log showed:

Reread of RDBA:0X03801DBB (file XX, block xxxx) found same corrupted data
-- XX and xxxx are different in each error
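
To tally which blocks these messages complain about, the RDBA in each line can be decoded. The following is a minimal sketch, assuming the usual smallfile encoding in which the upper 10 bits of the RDBA hold the relative file number and the lower 22 bits hold the block number. The function names are illustrative and not part of any Oracle tool.

```python
import re

# Matches lines such as:
#   Reread of RDBA:0X03801DBB (file XX, block xxxx) found same corrupted data
RDBA_RE = re.compile(r"Reread of rdba:\s*(0x[0-9a-f]+)", re.IGNORECASE)


def decode_rdba(rdba):
    """Split an RDBA into (relative file number, block number).

    Assumption: smallfile layout, 10 bits of file number and 22 bits of block.
    """
    return rdba >> 22, rdba & 0x3FFFFF


def corrupted_blocks(log_lines):
    """Return the distinct (file, block) pairs reported in the lines, sorted."""
    found = set()
    for line in log_lines:
        match = RDBA_RE.search(line)
        if match:
            found.add(decode_rdba(int(match.group(1), 16)))
    return sorted(found)
```

Feeding the alert log through `corrupted_blocks` shows whether the errors are confined to a few blocks or spread across many files, which helps separate a single bad region from a general I/O path problem.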

After a restart, RAC2 could not read the control file. After restarting RAC1, the database would not mount and stayed stuck at:

smon:enabling TX Recovery

The following message also appeared:

MMNL absent for 1211 secs; Foregrounds taking over

The trace files showed the arch process timing out, which again pointed to a hard drive problem.


I checked the previous backup log: the backup is stored locally and had completed normally. A dd read test against the shared-storage disks sde and sdd did not work. I then simply rebooted the RAC1 host. After the reboot Oracle opened normally, and I hurriedly took an RMAN backup. This suggests a connectivity problem between the host and the storage.
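
The dd read test mentioned above can be approximated in a few lines. This is a rough sketch rather than the command that was actually run: it reads a device sequentially and reports how far it got before the first I/O error. The function name `read_test` and its parameters are hypothetical, and unlike `dd` with direct I/O it goes through the page cache, so a cached region may hide a failing disk.

```python
import os


def read_test(path, block_size=1024 * 1024, max_blocks=None):
    """Sequentially read `path` and return (bytes_read, error).

    `error` is None on success, otherwise the OSError raised when opening
    the device or on the first failed read.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return 0, exc
    bytes_read = 0
    blocks = 0
    try:
        while max_blocks is None or blocks < max_blocks:
            try:
                chunk = os.read(fd, block_size)
            except OSError as exc:
                # An EIO here usually means the storage path or the disk failed
                return bytes_read, exc
            if not chunk:  # end of device or file
                break
            bytes_read += len(chunk)
            blocks += 1
    finally:
        os.close(fd)
    return bytes_read, None
```

For example, `read_test("/dev/sdd", max_blocks=1024)` would read the first 1 GiB of the device (the `/dev/` path is an assumption; the text only names sdd and sde). A hang rather than an error is also a useful signal, since it points at the host-to-storage path instead of a single bad block.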

I strongly suspect that disk 12 has bad blocks. A RAID volume that has been in service for a long time can develop bad sectors in areas that are never read, or that were fine the last time they were read. Because those areas are not being read or written, the controller reports everything as healthy. The most direct danger of such latent bad sectors shows up during a rebuild. When a physical disk fails, the usual response is a rebuild, and a rebuild is a full synchronization, so those bad sectors will be read. At that point the rebuild cannot complete and the new disk cannot come online, because bad sectors were found on an old disk. You are stuck with no way forward and no way back.
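
To get a feel for why a full rebuild is the moment latent errors surface, the chance of hitting at least one unreadable sector grows with the amount of data that must be read. The sketch below is a back-of-the-envelope model only: it assumes independent errors at a constant per-bit rate (the `1e-14` default is a commonly quoted datasheet-style figure, not a number from this incident), and the disk count and size in the example are hypothetical.

```python
import math


def rebuild_failure_probability(bytes_to_read, ure_per_bit=1e-14):
    """Probability of at least one unrecoverable read error during a rebuild.

    Assumes independent errors at a constant per-bit rate, which is a
    simplification; real latent bad sectors tend to cluster.
    """
    bits = bytes_to_read * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # Hypothetical RAID 5 group: 11 surviving disks of 600 GB each,
    # all of which must be read in full to rebuild the replaced disk.
    surviving_bytes = 11 * 600 * 10**9
    print(f"{rebuild_failure_probability(surviving_bytes):.1%}")
```

The model understates the risk for an array that has never been scrubbed, since sectors that went bad long ago are already waiting to be read. That is the situation described above.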

(1) Taking a backup before replacing a disk is the most important step.

(2) Do not build a single RAID group out of all the disks. A split layout such as RAID 5 -> logs and RAID 10 -> datafiles would be better.

(3) If a disk array problem cannot be resolved, take a backup first and then try a restart.

(4) Do not take the hardware vendor's word entirely on faith. Form your own judgment from the symptoms you observe.

If you have identified which hard drive went offline, you can recover the data by forcing it back online (some controllers have no such option, in which case there is no way out).

In addition, the synchronization probably disturbed the connection between the host and the disk array. The application logs show that after the synchronization failed, I/O became slower and slower until reads and writes finally stopped working.

