Environment: Dell R710 Series servers (VMware virtual hosts), Dell MD 3200 Series storage (holding the virtual machine files), and VMware ESXi version 5.5. The virtual machine runs Windows Server 2008 and hosts a SQL Server 2008 database server that manages two sets of application databases, Macro bridge and Sophie. The virtual machine's disk capacity is a 200G data disk (compact mode) plus a 160G snapshot data disk.
An accidental power outage prevented the virtual machine from starting properly. Inspection of the virtual machine's files showed that all of its configuration files were missing and only the disk files remained: the XXX-FLAT.VMDK disk file and the XXX-000001-DELTA.VMDK snapshot file still existed at this point. A VMware engineer was brought in to diagnose the problem and tried to resolve the failure by creating a new virtual machine, but found that the ESXi storage was running out of space. The XXX-FLAT.VMDK disk file of the failed virtual machine was therefore deleted, which left more than 200G of free space on the ESXi storage. The VMware engineer then built a new 40G virtual machine and assigned it a fixed-size virtual disk.
Fault analysis
1. Backup Data
The VMFS volume in the rd220i storage that is mounted on the VMware vSphere client is first unmounted in the normal manner. The storage holding the VMFS volume is then connected to the backup server with a network cable, and a professional tool mirrors the entire VMFS volume, sector by sector, to the prepared backup space. This keeps the customer's data safe: all subsequent analysis and recovery operations are performed on the backup copy.
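The sector-level mirroring step can be illustrated with a minimal Python sketch. It is not the professional tool used in this case; the function name `image_device`, the chunk size, and the use of SHA-256 to verify the copy are assumptions made for illustration.

```python
import hashlib

SECTOR = 512
CHUNK = 2048 * SECTOR  # read 1 MiB (2048 sectors) per iteration


def image_device(src_path, dst_path, chunk=CHUNK):
    """Copy src to dst byte for byte; return (bytes_copied, sha256 hex digest)."""
    digest = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(chunk)
            if not block:          # end of the source volume
                break
            dst.write(block)
            digest.update(block)   # hash of the source data, for later verification
            copied += len(block)
    return copied, digest.hexdigest()
```

Hashing the image again after the copy and comparing it with the returned digest is one way to confirm that the working copy matches the source before any analysis starts.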
2. Analyze the cause of the failure
Careful analysis of the underlying data of the VMFS volume shows that the sudden power outage of the ESXi host destroyed the directory entries under the failed virtual machine's directory. This damage affects only the file directory entries, not the virtual machine's important data, and can be repaired manually. When a file is deleted manually, the data-area index in its directory entry is cleared, but the actual data of the deleted file is not touched. In that situation, fragments in the free space of the VMFS volume can be matched and merged based on the file system inside the deleted virtual disk and the file types stored on it, so the deleted virtual disk file can eventually be recovered. In this case, however, a new virtual machine was created and a virtual disk was allocated after both events. Careful analysis shows that the allocated 40G virtual disk is completely zero-filled (this is determined by the disk type chosen when the virtual disk was created), so all the disk space occupied by the new virtual machine has been zeroed. Wherever the new virtual disk occupies space freed by the deleted virtual machine disk, that data is unrecoverable.
(Figure 1.jpg: the directory entry area of the failed virtual machine)
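To see which regions of the backup image were zero-filled by the newly created disk, the image can be scanned block by block. The following is a minimal sketch under assumptions: the function name `zero_ranges` and the 1 MiB block size are illustrative and are not taken from the tools used in this case.

```python
def zero_ranges(path, block_size=1 << 20):
    """Return a list of (start, end) byte ranges that contain only zero bytes."""
    ranges = []
    start = None          # start offset of the zero run currently being tracked
    offset = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            if block.count(0) == len(block):      # block is entirely zero
                if start is None:
                    start = offset
            elif start is not None:               # a zero run just ended
                ranges.append((start, offset))
                start = None
            offset += len(block)
    if start is not None:                         # file ended inside a zero run
        ranges.append((start, offset))
    return ranges
```

Zero ranges that fall inside the area formerly occupied by the deleted disk mark data that cannot be recovered from free space.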
Direction of implementation
1. Implementation direction one: recover the deleted VMDK file
Based on the file system inside the deleted virtual disk file and the file types stored in the virtual disk, match and merge fragments in the free space of the VMFS volume to recover the deleted virtual disk file. Then use a snapshot merge program to merge the snapshot file and the recovered virtual disk file into a complete virtual disk file, and use a professional file system parsing tool to extract all the files in that virtual disk.
2. Implementation direction two: restore the MSSQL database files
If direction one does not give satisfactory results, the data regions in the VMFS volume free space that conform to the SQL Server page structure can be counted, analyzed, and aggregated according to the structure of SQL Server database files, producing usable files in MDF format.
3. Implementation direction three: restore the MSSQL database backup files
The databases are backed up every day: an incremental backup daily and a full backup every 15 days. If some databases still cannot be recovered after the two approaches above, they can only be recovered from the backup files. Based on the structure of the .bak backup file, the data regions in the VMFS volume free space that conform to the structure of a SQL Server backup file are counted, analyzed, and aggregated, producing .bak files that can be imported normally into SQL Server.
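As a rough illustration of locating candidate backup files in free space, the sketch below searches for a start-of-file signature at sector-aligned offsets. It assumes that uncompressed SQL Server backup files begin with the ASCII bytes `TAPE` (the first block of the Microsoft Tape Format); that assumption, the 512-byte alignment, and the function name `find_signatures` are not taken from the original article and should be verified against known-good backup files.

```python
def find_signatures(blob, signature=b"TAPE", align=512):
    """Return the aligned offsets in blob where signature occurs."""
    hits = []
    pos = blob.find(signature)
    while pos != -1:
        if pos % align == 0:        # keep only sector-aligned candidates
            hits.append(pos)
        pos = blob.find(signature, pos + 1)
    return hits
```

Each hit is only a candidate starting point; the fragments that follow it still have to be checked against the backup file structure and stitched together.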
Recovery process
1. Direction one implementation process
Following the low-level analysis for direction one, the free space is scanned for regions belonging to the deleted virtual machine disk, using the structure of the VMFS volume and the file system information of the deleted virtual disk, and the number and size of the fragments found are tallied. The scanned fragments are then arranged using the file system information on the virtual disk. This shows that many fragments are missing in the middle; a second, careful scan for the missing fragments confirms that they really cannot be found. The scanned fragments are therefore reassembled in their original order, with the positions of the missing fragments left blank. Next, the virtual disk snapshot program merges the reassembled parent disk with the snapshot disk to generate a new virtual disk, and a professional tool parses the file system in that virtual disk. Because so much data is missing, the parsing process reports many errors and flags some files as corrupted.
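The reassembly step, placing found fragments at their original positions and leaving the unfound regions blank, can be sketched as follows. This is a simplified in-memory illustration: the `(offset, data)` fragment representation and the function name `reassemble` are assumptions, and a real 200G disk would be written to a file rather than held in memory.

```python
def reassemble(fragments, total_size):
    """Place (offset, data) fragments into a zero-filled image.

    Returns (image_bytes, gaps), where gaps lists the (start, end)
    byte ranges for which no fragment was found.
    """
    image = bytearray(total_size)   # zero-filled, so missing regions stay blank
    covered = []
    for offset, data in sorted(fragments, key=lambda frag: frag[0]):
        end = offset + len(data)
        if offset < 0 or end > total_size:
            raise ValueError("fragment lies outside the disk image")
        image[offset:end] = data
        covered.append((offset, end))
    gaps = []
    cursor = 0
    for start, end in covered:      # covered is sorted by start offset
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total_size:
        gaps.append((cursor, total_size))
    return bytes(image), gaps
```

The returned gap list corresponds to the blank regions that later cause file system parsing errors.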
(Figure 2.jpg: the parsed file system)
After the file system was parsed, the original database files could not be found, while the directory structures of both the Macro bridge backup and the Sophie backup were intact. However, the database import tool reported an error when attempting to import these backups into the database.
(Figures 3.jpg and 4.jpg: partial directory structure of the Macro bridge backup and the Sophie backup)
(Figure 5.jpg: the error message reported when importing the .BAK file)
2. Direction two implementation process
Direction one did not recover the original database files, and many of the backup files do not work properly, so the second approach is needed to recover the database files that are still missing. Based on the structure of a SQL Server database, the free space is searched for the starting position of each database. In the database structure, the 9th page records the database name, so this feature can be used to check whether a header page that is found belongs to the database being searched for. In addition, every database page records its page number and file number. A database scanner is written around these characteristics and used to scan for all data fragments that conform to the database page format. The scanned fragments are then reassembled in order into complete MDF files, and an MDF verification program checks the integrity of each whole file. During verification, only cl_system3.dbf and erp42_jck.dbf were found to have missing fragments; the remaining databases were verified successfully. The reassembled MDF files are as follows:
(Figure 6.jpg)
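The page scanner idea can be sketched in Python. This is not the scanner written for this case. The 8 KB page size is documented by Microsoft, but the header offsets used here (page number at byte 32, file number at byte 36, header version at byte 0, page type at byte 1) come from publicly available reverse-engineering descriptions and should be treated as assumptions, as should all function names.

```python
import struct

PAGE_SIZE = 8192       # SQL Server data pages are 8 KB
PAGE_ID_OFFSET = 32    # assumed: 4-byte page number, little-endian
FILE_ID_OFFSET = 36    # assumed: 2-byte file number, little-endian


def looks_like_page(page):
    """Cheap plausibility check: header version 1 and a known page type (assumed)."""
    return len(page) == PAGE_SIZE and page[0] == 1 and 1 <= page[1] <= 17


def parse_page_id(page):
    """Return (file_no, page_no) read from the assumed header offsets."""
    (page_no,) = struct.unpack_from("<I", page, PAGE_ID_OFFSET)
    (file_no,) = struct.unpack_from("<H", page, FILE_ID_OFFSET)
    return file_no, page_no


def scan_pages(blob, file_no, step=512):
    """Collect candidate pages of one database file, keyed by page number."""
    found = {}
    for off in range(0, len(blob) - PAGE_SIZE + 1, step):
        page = blob[off:off + PAGE_SIZE]
        if not looks_like_page(page):
            continue
        fno, pno = parse_page_id(page)
        if fno == file_no:
            found.setdefault(pno, page)   # keep the first copy seen
    return found


def build_mdf(found, page_count):
    """Lay pages out in page-number order; return (mdf_bytes, missing_page_numbers)."""
    out = bytearray()
    missing = []
    for pno in range(page_count):
        page = found.get(pno)
        if page is None:
            missing.append(pno)
            page = bytes(PAGE_SIZE)       # blank placeholder for an unfound page
        out += page
    return bytes(out), missing
```

A scanner like this produces false positives on real data, which is why a separate integrity check over the rebuilt MDF file is still needed.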
cl_system3.dbf and erp42_jck.dbf fail verification because many of their fragments cannot be found in the underlying data (the preliminary suspicion is that they were overwritten).
(Figure 7.jpg: a region of cl_system3.dbf where a fragment is missing)
3. Direction three implementation process
Even after the two directions above, not all of the database files have been restored: cl_system3.dbf and erp42_jck.dbf are missing some pages and therefore do not work properly. These two database files have to be recovered from backups. Checking the backups of the two files, however, shows that none of the March 30 backups of cl_system3.dbf were made because the backup mechanism failed, and that erp42_jck.dbf has no backups for March at all, only the incremental backups from April.
(Figure 8.jpg)
Because only a small number of pages are missing from the erp42_jck.dbf file, they can be looked up by page number in the incremental backups and the pages found can be written back into the erp42_jck.dbf file, which restores part of the missing database pages. Even after this patching, some pages are still missing, so the file cannot be used normally. However, a self-developed database parser can export the dozens of tables in erp42_jck.dbf that matter most to the user and import them successfully into a new database.
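Filling missing pages from an incremental backup by page number can be sketched as below. The sketch assumes that database pages appear unmodified inside the backup data and that the page header stores the page number at byte 32 and the file number at byte 36; both are assumptions for illustration, and the function names are hypothetical.

```python
import struct

PAGE_SIZE = 8192   # SQL Server page size


def page_key(buf, off):
    """Return (file_no, page_no) for the page starting at off (assumed header layout)."""
    page_no, file_no = struct.unpack_from("<IH", buf, off + 32)
    return file_no, page_no


def patch_missing(mdf, missing, backup, file_no, step=512):
    """Copy wanted pages found in backup into mdf; return (patched, still_missing)."""
    patched = bytearray(mdf)
    wanted = set(missing)
    for off in range(0, len(backup) - PAGE_SIZE + 1, step):
        if backup[off] != 1:                      # assumed header version byte
            continue
        fno, pno = page_key(backup, off)
        if fno != file_no or pno not in wanted:
            continue
        start = pno * PAGE_SIZE
        if start + PAGE_SIZE > len(patched):      # page lies beyond the rebuilt file
            continue
        patched[start:start + PAGE_SIZE] = backup[off:off + PAGE_SIZE]
        wanted.discard(pno)
    return bytes(patched), sorted(wanted)
```

The pages left in `still_missing` are the ones that keep the file from being attached normally, which is when a table-level export becomes the fallback.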
Validating data
Set up the same database environment as the original (SQL Server 2008) on a local server, connect to this verification server with the TeamViewer remote tool, and install the upper-level Macro bridge application software. The customer then arranges for engineers to verify that the database is complete. After careful verification, the recovered database shows essentially no problems: the upper-level application runs normally and essentially no data records are missing, so the data recovery is successful.
The database was mounted successfully, as shown in the figure (9.jpg).
Recovery summary
Thanks to knowledge of the underlying structure of SQL Server databases and experience with similar failures, the whole recovery process went fairly smoothly. The databases were restored and, after verification, the recovered data showed no problems.
Recovery method for VMware virtual machine VMDK files deleted by mistake