Part I: Data Recovery Plan
"Customer"
A pharmaceutical company.
"Fault description"
An IBM X3850 server with five 73 GB SAS hard drives: four form a RAID5 array and the fifth is a hot spare. Disk 3 had gone offline earlier, but the hot spare was not automatically activated to rebuild the array (cause unknown). When disk 2 later went offline as well, the RAID crashed.
The operating system is Red Hat Linux 5.3 and the application is an OA system built on Oracle. The data is important and time is very short. Because Oracle no longer provides follow-up support for this OA system, the user requires that both the data and the operating system be recovered as far as possible.
"Preliminary Examination conclusion"
The hot spare was never enabled. The hard disks show no obvious physical failure and no obvious signs of synchronization. In a case like this the data can usually be recovered.
"Recovery Plan"
1. Protect the original environment: shut down the server and make sure it is not powered on again during the recovery process.
2. Label each failed hard drive with its sequence number so that every drive can be returned to its original slot after removal.
3. Mount the failed hard drives in a read-only environment and make a complete image of every one of them (see "How to complete a full mirrored backup of the disk"). Once the backup is complete, return the original failed disks; from this point until the data is confirmed, no recovery operation touches the original disks.
4. Analyze the RAID structure on the backup disks to obtain the original RAID level, stripe rule, stripe size, parity direction, and metadata area.
5. Build a virtual RAID5 environment from the RAID information obtained.
6. Interpret the virtual disk and its file system.
7. Check whether the virtual structure is correct; if it is not, repeat steps 4 to 7.
8. Once the data is confirmed to be correct, migrate it back as the user requests. If the original disks are to be reused, first confirm that they have been fully backed up, then rebuild the RAID and migrate the data back. To migrate the operating system back, boot from a Linux LiveCD or WinPE (usually not supported), or install a temporary operating system on a different hard disk in the failed server, and then perform a sector-level copy.
9. After the data has been handed over, the North Asian Data Recovery Center keeps the data for a further 3 days in case anything was missed.
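Step 3 can be illustrated with a minimal imaging sketch. It assumes 512-byte sectors and a caller-supplied `read_at(offset, size)` function that raises `OSError` on an unreadable region. The names `image_disk` and `file_reader` and the chunk size are hypothetical, chosen only for illustration; this is not the tooling actually used in this case. Unreadable sectors are zero-filled in the image and their numbers are returned, so that they can be taken into account during later analysis.

```python
SECTOR = 512  # assumed sector size in bytes


def image_disk(read_at, out, total_sectors, chunk_sectors=2048):
    """Copy total_sectors sectors into the writable object `out`.

    read_at(offset, size) must return exactly `size` bytes or raise OSError.
    Unreadable sectors are zero-filled; their sector numbers are returned.
    """
    bad = []
    lba = 0
    while lba < total_sectors:
        count = min(chunk_sectors, total_sectors - lba)
        try:
            # Fast path: read the whole chunk in one request.
            data = read_at(lba * SECTOR, count * SECTOR)
        except OSError:
            # Slow path: retry the chunk one sector at a time.
            parts = []
            for s in range(lba, lba + count):
                try:
                    parts.append(read_at(s * SECTOR, SECTOR))
                except OSError:
                    parts.append(bytes(SECTOR))  # zero-fill the bad sector
                    bad.append(s)
            data = b"".join(parts)
        out.write(data)
        lba += count
    return bad


def file_reader(path):
    """Return a read_at function over a file or device opened read-only."""
    f = open(path, "rb", buffering=0)

    def read_at(offset, size):
        f.seek(offset)
        data = f.read(size)
        if len(data) != size:
            raise OSError("short read")  # treated like an unreadable region
        return data

    return read_at
```

Passing the reader in as a function keeps the source strictly read-only and makes the bad-sector path easy to exercise with a fake reader before touching a real drive.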
"Recovery Cycle"
Backup: about 2 hours.
Interpreting and exporting the data: about 4 hours.
Migrating the operating system back: about 4 hours.
"Recovery costs"
Omitted.
Part II: Data Recovery and System Recovery Process
1. A full image was made of each original hard disk. Imaging found 10 to 20 bad sectors on disk 2; the remaining disks have no bad sectors.
2. Structure analysis: the best-fitting structure is disk order 0, 1, 2, 3 with disk 3 missing, a block size of 512 sectors, and backward parity (Adaptec). The structure is shown in the following figure:
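The parameters found in step 2 can be turned into a small address-mapping sketch. It assumes that "backward parity" denotes the layout usually called left-asymmetric (parity starts on the last disk and moves one disk toward disk 0 with each stripe row, while data blocks fill the remaining disks in ascending order), and that member data starts at offset 0 of each image with no metadata area in front. Both are assumptions that would have to be verified against the real images. The names `locate`, `read_block`, and `xor_blocks` are illustrative. A missing member is rebuilt as the XOR of the surviving members in the same row.

```python
SECTOR = 512
BLOCK_BYTES = 512 * SECTOR   # block size from the analysis: 512 sectors
N_DISKS = 4                  # disk order 0, 1, 2, 3


def xor_blocks(blocks):
    """XOR equal-length byte strings together."""
    size = len(blocks[0])
    acc = 0
    for block in blocks:
        acc ^= int.from_bytes(block, "little")
    return acc.to_bytes(size, "little")


def locate(logical_block, n_disks=N_DISKS):
    """Map a logical data block to (disk, stripe row, parity disk).

    ASSUMPTION: left-asymmetric rotation, i.e. parity on the last disk
    in row 0, moving one disk toward disk 0 in each following row.
    """
    row, idx = divmod(logical_block, n_disks - 1)
    parity_disk = (n_disks - 1) - (row % n_disks)
    disk = idx if idx < parity_disk else idx + 1  # skip over the parity disk
    return disk, row, parity_disk


def read_block(images, logical_block, block_bytes=BLOCK_BYTES):
    """Read one logical block from the virtual array.

    images[i] is a seekable binary file object, or None for a missing member.
    ASSUMPTION: data begins at offset 0 of every image (no metadata area).
    """
    disk, row, _ = locate(logical_block, len(images))
    offset = row * block_bytes

    def raw(i):
        images[i].seek(offset)
        return images[i].read(block_bytes)

    if images[disk] is not None:
        return raw(disk)
    # Missing member: XOR of every surviving member in the same row.
    survivors = [raw(i) for i in range(len(images)) if images[i] is not None]
    return xor_blocks(survivors)
```

For this case the images would be opened read-only and passed as `[img0, img1, img2, None]`, so that every block that lived on disk 3 is recomputed from the other three members. If the interpreted file system does not look right, the rotation rule in `locate` is the first thing to vary.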