Data disaster recovery after a mutual exclusion failure on fibre-attached shared storage on the Sun platform

[Data recovery fault description]

Two SPARC Solaris servers share the same storage through a fibre switch. They were meant to run as a cluster, but the configuration was flawed and the two servers had no proper mutual exclusion over the storage. The intended design was active/standby: server A runs the service normally, and if A goes down, A is shut off and server B is brought up to take over the service.

By chance, an administrator powered on server B and found a large group of disks attached to it (in fact, the shared storage). Because server B had always been idle, the administrator assumed the disks were idle too and ran newfs on one of the partitions.

Server A quickly raised alarms and went down. After A was restarted, none of its file systems could be mounted. Running fsck repaired the data on most partitions, but the result for the file system that B had run newfs on was poor: the root directory contained only a lost+found folder, which held a large number of files named with numbers.

The failed file system held two Oracle instances. It was originally UFS, and roughly 200 to 400 data files needed to be recovered.

[Data recovery analysis]

Sharing conflicts between hosts on fibre-attached storage are common, a side effect of the flexibility of fibre switching. In this case, having servers A and B access a single-host file system such as UFS at the same time was very damaging. Each server assumed it managed the storage exclusively. The file system that A believed it was managing normally had already been reinitialized underneath it by B, and the data that A then flushed from its buffers to the file system in turn damaged the result of B's initialization.

B's newfs acted directly on the original file system, but this case differs somewhat from a plain newfs: before A went down, it wrote a small amount of data (including metadata) back to the file system. When newfs lays out the same structure as before, the data area is not destroyed, and because a small amount of the old metadata survived, partial data recovery was possible.

UFS is a traditional UNIX file system. It is divided into block groups (cylinder groups), and each group is allocated a fixed inode area. When newfs lays out the same structure as before, the inode areas, the most important part of the file system, are all initialized. The inodes held all the important attributes of the files, so from the file system's perspective alone, data recovery is very difficult.

Fortunately, Oracle data files are highly structured, and UFS still places data with a certain regularity. By reorganizing the Oracle data, the data files, control files, logs, and so on can be recovered directly. In addition, the Oracle data files themselves contain name descriptions (such as table names), from which the original on-disk file names can be inferred.
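The idea of regrouping blocks by the information in their own headers can be sketched roughly as follows. This is a minimal illustration, not the procedure the author used. It assumes an 8 KB block size, a 4-byte big-endian block address at offset 4 of every block, and a split of that address into a 10-bit file number and a 22-bit block number. These layout details, the function names, and the `looks_valid` check are all assumptions for illustration and would have to be confirmed against the actual image.

```python
import struct
from collections import defaultdict

BLOCK_SIZE = 8192   # assumed database block size
BLOCK_BITS = 22     # assumed: low 22 bits = block number, high 10 bits = file number


def parse_block_address(block: bytes):
    """Return (file_no, block_no) from the assumed 4-byte address at offset 4."""
    (addr,) = struct.unpack_from(">I", block, 4)  # big-endian, as on SPARC
    return addr >> BLOCK_BITS, addr & ((1 << BLOCK_BITS) - 1)


def group_blocks(image: bytes, looks_valid):
    """Map file_no -> {block_no: offset in image} for blocks accepted by looks_valid."""
    files = defaultdict(dict)
    for off in range(0, len(image) - BLOCK_SIZE + 1, BLOCK_SIZE):
        block = image[off:off + BLOCK_SIZE]
        if looks_valid(block):
            file_no, block_no = parse_block_address(block)
            # Keep the first copy seen if the same address appears twice.
            files[file_no].setdefault(block_no, off)
    return files


def rebuild_file(image: bytes, blocks: dict) -> bytes:
    """Place each block at block_no * BLOCK_SIZE; missing blocks stay zero-filled."""
    out = bytearray((max(blocks) + 1) * BLOCK_SIZE)
    for block_no, off in blocks.items():
        start = block_no * BLOCK_SIZE
        out[start:start + BLOCK_SIZE] = image[off:off + BLOCK_SIZE]
    return bytes(out)
```

The sketch holds the whole image in memory for brevity; a real image of this size would be read in pieces or memory-mapped. A caller would pass its own validity test, for example `group_blocks(image, lambda b: b[0] != 0)`, where the non-zero first byte is again only a placeholder heuristic.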

[Data recovery process]

Make a dd backup (a full image) of the failed file system.

Analyze and reorganize the Oracle data structures across the entire image file.

For files whose structure is too damaged to reorganize this way, use the structural characteristics of the UFS file system as an auxiliary analysis.

Recover the databases on an Oracle platform using the recovered data files and control files.
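The imaging step at the start of this process is what dd does. As a rough sketch of the same idea, the function below copies a source to an image file in fixed-size chunks and returns a checksum so the image can be verified later. The function name, the chunk size, and the choice of SHA-256 are assumptions for illustration, not part of the original procedure.

```python
import hashlib


def image_device(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src to dst in chunks; return (bytes_copied, sha256 hex digest)."""
    digest = hashlib.sha256()
    copied = 0
    # "rb" opens the source read-only; "xb" refuses to overwrite an existing image.
    with open(src_path, "rb") as src, open(dst_path, "xb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

A call might look like `image_device("/dev/rdsk/c1t0d0s6", "/backup/ufs.img")`, where both paths are hypothetical. All later analysis then runs against the image, never against the original disk.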

[Data recovery conclusion]

All databases were fully restored.

[Postscript]

fsck is a highly destructive operation; it is best to take a backup (dd) before running it.

Failed mutual exclusion on fibre-attached shared storage is a frequent cause of data disasters; such setups should be planned, deployed, and implemented with care.

This article is from the "Tommy (Data Recovery)" blog, please be sure to keep this source http://zhangyu.blog.51cto.com/197148/152186
