RAID5 with two dropped drives: how the database inside was recovered

Source: Internet
Author: User

"RAID data recovery: failure description"

The storage is a Huawei S5300 with 16 FC drives. The entire storage space is a single RAID5 array of 450GB FC disks, including one hot spare. Disk 3 of the array went offline for unknown reasons, and the hot spare came online and started synchronizing data. When synchronization was about 50% complete, disk 8 also went offline for unknown reasons. Synchronization failed, the RAID array went down, and the upper-layer LUNs became unavailable, so the data had to be recovered from the array.

"RAID data recovery step one: check all disks in the array"

All disks in the RAID, both the healthy ones and the dropped ones, were checked for physical faults. The check found a physical fault on disk 3. None of the other disks, including disk 8, was physically faulty.

"RAID data recovery step two: back up all disks in the array"

After the physical check, every disk was imaged to a file with the dd command or a data recovery tool. All recovery work was then done on the images, which protects the user's source data.
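As a rough illustration of what such an imaging pass does, here is a minimal Python sketch. It is not the tool used in this case: the function name `image_disk` and the 1 MiB block size are assumptions made for the example. It copies a source block by block and zero-fills any block that cannot be read, so that later offsets in the image stay aligned with the source disk.

```python
import os

def image_disk(src_path, dst_path, block_size=1024 * 1024):
    """Copy src_path to dst_path block by block.

    Blocks that raise a read error are replaced with zeros so the image
    keeps the same offsets as the source.
    Returns (bytes_written, list_of_bad_block_offsets).
    """
    bad_offsets = []
    written = 0
    # buffering=0 gives raw reads, so an error is tied to the block just requested
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)  # total size of the file or device
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                src.seek(offset)
                chunk = src.read(want) or b""
            except OSError:
                chunk = b""
                bad_offsets.append(offset)
            if len(chunk) < want:
                # pad failed or short reads with zeros to preserve alignment
                chunk += b"\x00" * (want - len(chunk))
            dst.write(chunk)
            written += want
            offset += want
    return written, bad_offsets
```

A call such as `image_disk("/dev/sdb", "disk03.img")` (both paths hypothetical) would produce one image file per member disk. A smaller block size loses less data around a bad sector at the cost of a slower copy.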

"RAID data recovery step three: analyze the RAID structure"

The engineer analyzed all disks in the array and first identified the hot spare (in principle its contents differ clearly from those of the data disks, so it can be picked out directly). Because RAID is striped, all data in the array is stored according to a regular pattern. The engineer therefore analyzed how the database pages were distributed across the physical disks and derived the basic parameters of the RAID group: disk order, data direction, stripe size, and so on.
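To show why disk order, data direction, and stripe size are enough to locate any piece of data, here is a small Python sketch of the address mapping. It assumes a left-symmetric RAID5 layout, which is a common default but is not confirmed by the source for this storage. The function names and the chunk size used in the usage example are illustrative only.

```python
def locate_chunk(logical_chunk, n_disks):
    """Map a logical chunk number to (row, data_disk, parity_disk).

    Assumes a left-symmetric RAID5 layout: parity starts on the last disk
    and moves one disk to the left on each row, and data chunks start on
    the disk right after the parity disk, wrapping around.
    """
    data_per_row = n_disks - 1
    row, k = divmod(logical_chunk, data_per_row)
    parity_disk = (n_disks - 1) - (row % n_disks)
    data_disk = (parity_disk + 1 + k) % n_disks
    return row, data_disk, parity_disk


def locate_byte(logical_offset, n_disks, chunk_size):
    """Map a byte offset in the array to (disk_index, byte_offset_on_that_disk)."""
    chunk, within = divmod(logical_offset, chunk_size)
    row, data_disk, _parity = locate_chunk(chunk, n_disks)
    return data_disk, row * chunk_size + within
```

For example, with 4 disks, logical chunk 3 is the first data chunk of row 1, where parity sits on disk 2, so `locate_chunk(3, 4)` returns `(1, 3, 2)`. In practice the analysis runs the other way: the engineer tries candidate parameters until known structures, such as consecutive database pages, line up across disks.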

"RAID data recovery step four: find the disk that was damaged by synchronization"

Based on the RAID parameters from the analysis, an attempt was made to rebuild the original RAID group virtually with a RAID virtualization program. However, two disks had dropped out of the group, and the data on one disk had been damaged by the interrupted synchronization. Careful analysis of each disk showed that one disk held data that clearly differed from the other disks in the same stripe, so it was provisionally identified as the disk damaged by synchronization. Verifying the stripes with a RAID verification program then confirmed which disk that was.
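The stripe check relies on the RAID5 parity rule: within one stripe row, the XOR of all member chunks (data plus parity) is zero. The following Python sketch shows that general technique, not the verification program used in this case. The function names are made up for the example, and the disk images are assumed to be equal-length byte strings held in memory.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together."""
    length = len(blocks[0])
    acc = 0
    for block in blocks:
        if len(block) != length:
            raise ValueError("all blocks must have the same length")
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(length, "big")


def stripe_is_consistent(blocks):
    """True when data and parity chunks of one stripe row XOR to zero."""
    return not any(xor_blocks(blocks))


def rebuild_missing(surviving_blocks):
    """Recompute the chunk of one missing disk from all other chunks in the row."""
    return xor_blocks(surviving_blocks)


def first_inconsistent_row(disks, chunk_size):
    """Return the first stripe row whose chunks do not XOR to zero, or None."""
    rows = len(disks[0]) // chunk_size
    for row in range(rows):
        start = row * chunk_size
        row_chunks = [d[start:start + chunk_size] for d in disks]
        if not stripe_is_consistent(row_chunks):
            return row
    return None
```

Running `first_inconsistent_row` once per candidate member set, for example with the hot spare substituted for the failed disk, shows up to which row a partially synchronized disk still agrees with the others. Rows past that point would have to be rebuilt from the remaining disks with `rebuild_missing`.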

"RAID data recovery step five: analyze the LUN information in the array"

With the RAID array assembled virtually, the engineer analyzed how the LUNs were allocated within the array and which data blocks belonged to each LUN, then exported the data of each LUN according to that data map.
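Exporting a LUN from a data map can be pictured as copying a list of extents. The source does not describe the storage's on-disk metadata format, so the extent tuples below, `(lun_offset, array_offset, length)` in bytes, and the function name `export_lun` are purely hypothetical. This Python sketch only illustrates the copy step once such a map has been recovered.

```python
def export_lun(array_image, lun_image, extents):
    """Copy LUN data out of a virtual array image.

    array_image: seekable binary file object for the reassembled array
    lun_image:   seekable, writable binary file object for the output LUN
    extents:     iterable of (lun_offset, array_offset, length) in bytes
                 (hypothetical map format, for illustration only)
    Returns the number of bytes copied.
    """
    copied = 0
    for lun_off, arr_off, length in sorted(extents):
        array_image.seek(arr_off)
        data = array_image.read(length)
        if len(data) != length:
            raise ValueError(f"extent at array offset {arr_off} is truncated")
        lun_image.seek(lun_off)
        lun_image.write(data)
        copied += length
    return copied
```

Sorting by LUN offset makes the output file grow sequentially. Ranges of the LUN that no extent covers are left as holes, which read back as zeros.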

"File system data recovery: parse the EXT3 file system"

Because the virtual RAID structure was built with the hot spare, the EXT3 file system could not be mounted normally, so the only option was to extract the Oracle database files directly. A self-developed file system parser was used to parse the file system and export the Oracle database files, which were then handed to the database engineer for verification.
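The parser used here is the recovery team's own tool and is not shown in the source. As an indication of where any such parser starts, the Python sketch below reads the EXT superblock, which sits 1024 bytes into the file system and carries the magic number 0xEF53. The function name and the returned dictionary are assumptions made for the example, and the sketch reads only a few fields.

```python
import struct

EXT_SUPERBLOCK_OFFSET = 1024   # the superblock starts 1024 bytes into the file system
EXT_MAGIC = 0xEF53             # shared by ext2, ext3, and ext4


def read_ext_superblock(image):
    """Read basic geometry from the superblock of an EXT file system image.

    image: seekable binary file object positioned on the exported LUN data
    Raises ValueError if the superblock is truncated or the magic is wrong.
    """
    image.seek(EXT_SUPERBLOCK_OFFSET)
    sb = image.read(1024)
    if len(sb) < 1024:
        raise ValueError("image too small to contain a superblock")
    inodes_count, blocks_count = struct.unpack_from("<II", sb, 0)
    (log_block_size,) = struct.unpack_from("<I", sb, 24)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT_MAGIC:
        raise ValueError(f"bad magic 0x{magic:04X}, not an EXT file system")
    return {
        "inodes_count": inodes_count,
        "blocks_count": blocks_count,
        "block_size": 1024 << log_block_size,
    }
```

A valid magic number and plausible counts are a quick sanity check that the RAID parameters and LUN export are right before walking the inode tables to pull out individual files.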

"Database repair step one: check data file integrity"

An Oracle database file checking tool was used to check the integrity of each database file and find errors. With the stricter Oracle database checking tool, errors were found in some database files and log files: the SYSTEM and SYSAUX tablespaces had more than 100 bad blocks; all 3 control files were corrupted, with many bad blocks; 3 files of the Eschoolspace tablespace had even more bad blocks, up to 1000; and UNDOTBS02 was lost. The database engineer repaired these files.

"Database repair step two: repair the database"

We recreated the control file, created the undo tablespace, and started the database to the mount state. The bad blocks in the SYSTEM data file prevented the database from opening, and none of the implicit parameters could bypass them. A new database environment was therefore built and the database was restored from the DMP files. Imports of data from after March 9 all reported errors, and only about 10G of data could be imported.

"Data validation: data recovery succeeded"

Together with the user, we started the Oracle database and installed the OA client on a local virtual machine. The data records were verified through the OA client, and the user arranged for staff from different departments to validate remotely. Validation passed and the data recovery was successful.
