(OneFS) data recovery case: dealing with a hacker intrusion that deleted important data
[Fault Description]
Due to a hacker intrusion, important data in a university's "Teaching System" was deleted, including the system's MSSQL database and a large number of MP4, ASF, and TS teaching video files. The overall storage architecture is an EMC high-end network NAS (Isilon S200) with three nodes; each node is configured with twelve 3 TB SATA hard disks and no SSDs. The data is divided into two parts: one part is a VMware virtual machine (the web server), shared to the ESX host over the NFS protocol; the other part is the teaching video files, shared to the virtual machine (web server) over the CIFS protocol. The hackers deleted only the data shared over NFS (that is, all the virtual machines); the data shared over CIFS was not touched.
[Data Backup]
To keep the data safe and avoid secondary damage, all hard disks had to be backed up first. However, because there are so many disks (12 per node, 36 across the 3 nodes) and each disk is so large (3 TB each, 108 TB in total), the backup cycle would be long. The end customer decided to make a single backup of the existing data in the storage and have North Asia back up that data a second time, to ensure the security of the existing data.
[Data Analysis]
After all data was backed up, the Isilon cluster was shut down from its web management interface. All hard disks on all nodes were then labeled, pulled, and attached to the data recovery platform so that the data on every disk could be analyzed.
At this point, a brief introduction to the storage structure of Isilon, which uses the distributed file system OneFS. In an Isilon storage cluster, all nodes together form a single OneFS file system, so Isilon supports horizontal scaling without affecting data in use. While the cluster is working, all nodes provide the same functions; there is no master-slave relationship between nodes. When a user stores a file in the cluster, the OneFS layer splits the file into segments and distributes them to different nodes; at the node layer, each segment is further divided into 8 KB blocks and stored across that node's different hard disks. The inode information, directory entries, and Data MAP of a user file are stored on all nodes, which ensures that the user can access all data no matter which node is used. During initialization, Isilon lets the user choose a storage redundancy mode; different redundancy modes provide different levels of data protection (three nodes default to the N+2:1 mode).
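To make the two-level layout easier to picture, here is a minimal toy model in Python. The stripe unit size and the round-robin placement below are purely illustrative assumptions, not OneFS's real allocation algorithm; only the 8 KB block size and the 3-node / 12-disk geometry come from the case itself.

# Toy model of the two-level layout described above: a file is split into
# stripe units across nodes, and each node splits its stripe units into
# 8 KB blocks across its local disks. Placement is an illustrative
# assumption, not OneFS's real algorithm.
STRIPE_UNIT = 128 * 1024     # assumed per-node stripe unit (placeholder)
BLOCK_SIZE  = 8 * 1024       # 8 KB blocks at the node layer
NODES, DISKS_PER_NODE = 3, 12

def locate(file_offset: int):
    """Map a file offset to (node, disk, block index) under the toy model."""
    stripe_index = file_offset // STRIPE_UNIT
    node         = stripe_index % NODES
    within_unit  = file_offset % STRIPE_UNIT
    block_index  = within_unit // BLOCK_SIZE
    disk         = block_index % DISKS_PER_NODE
    return node, disk, block_index

print(locate(3 * 1024 * 1024))   # where the byte at 3 MiB would land in this toy model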
Because the customer's data was deleted (rather than lost to a disk or node failure), the storage redundancy level was not the main concern. What mattered was whether a file's inode and Data MAP change after the file is deleted. After communicating with the customer, we learned that the deleted virtual disk files were all 64 GB or larger and that there were no other large files of that size in the storage. A program was written to scan all inodes and pick out every inode with a recorded file size of 64 GB or more. Careful analysis of these inodes showed that the Data MAP positions they record are no longer valid, and the inodes are identical on all nodes. Further analysis showed that the Data MAP of a large file has multiple layers (a tree structure) and that the file's unique ID is recorded in each Data MAP, so we could try to find the bottom-level Data MAPs of the file directly. We were lucky: by traversing and tracking the Data MAPs, we found that the bottom-level Data MAPs were still there.
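The inode scan can be sketched roughly as follows. The signature, record length, and offset of the size field (INODE_MAGIC, INODE_RECORD_LEN, SIZE_FIELD_OFF) are placeholders; the real values had to be worked out from the on-disk OneFS structures during analysis, and the image path is hypothetical.

import struct

# Placeholder constants for illustration only; the real OneFS values were
# determined during the analysis of the disk images.
INODE_MAGIC      = b'\xde\xad\xbe\xef'
INODE_RECORD_LEN = 512
SIZE_FIELD_OFF   = 0x30
MIN_SIZE         = 64 * 1024 ** 3          # only files of 64 GB or more

def scan_inodes(image_path: str):
    """Yield (byte offset, file size) for every candidate inode in a disk image."""
    with open(image_path, 'rb') as img:
        pos = 0
        while True:
            rec = img.read(INODE_RECORD_LEN)
            if len(rec) < INODE_RECORD_LEN:
                break
            if rec.startswith(INODE_MAGIC):
                (size,) = struct.unpack_from('<Q', rec, SIZE_FIELD_OFF)
                if size >= MIN_SIZE:
                    yield pos, size
            pos += INODE_RECORD_LEN

for off, size in scan_inodes('/data/node1_disk01.img'):   # hypothetical image path
    print(f'candidate inode at {off:#x}, file size {size / 1024**3:.1f} GB')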
[Data Recovery]
A program was written to retrieve a file's unique ID from its inode and then gather all Data MAPs that match that ID. Sorting by the VCN number recorded in each Data MAP showed that the first 17088 Data MAPs of every file were missing, which means the first 17088 data regions of each file could not be recovered.
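A rough sketch of the grouping and sorting step, assuming each bottom-level Data MAP found by the scanner has already been reduced to a (file_id, vcn, disk, offset) tuple; the field names are illustrative.

from collections import defaultdict

def group_and_sort(maps):
    """Group scanned Data MAPs by file ID and sort each group by VCN."""
    per_file = defaultdict(list)
    for file_id, vcn, disk, offset in maps:
        per_file[file_id].append((vcn, disk, offset))
    for file_id in per_file:
        per_file[file_id].sort()              # ascending VCN order
    return per_file

def missing_leading_vcns(entries):
    """How many VCNs are missing from the front of a sorted MAP list."""
    return entries[0][0] if entries else 0    # first VCN present == count of missing ones

# In this case, every file turned out to be missing its first 17088 MAP entries.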
Careful calculation showed that the missing MAP entries cover less than 1 GB of data. The deleted files were all vmdk files of virtual machines, all formatted with NTFS, and the MFT of each NTFS file system sits at roughly 3 GB into the volume. In other words, it is enough to manually forge an MBR and DBR at the head of each vmdk file to make the data inside the vmdk parseable (I don't know whether this was a coincidence, but what a coincidence!). Code was quickly written to interpret the scanned Data MAPs and export the data in VCN order; wherever no MAP was available, the region was left as zeros.
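The first version of the export logic, as described above, can be sketched like this. CHUNK_SIZE (the amount of data one MAP entry covers) is an assumed placeholder, and images stands for the open disk image files keyed by disk identifier.

CHUNK_SIZE = 8 * 1024          # assumed size of the region each MAP entry covers

def export_naive(entries, images, out_path, total_vcns):
    """First attempt: walk VCNs in order, copy mapped data, zero-fill missing MAPs."""
    by_vcn = {vcn: (disk, off) for vcn, disk, off in entries}
    with open(out_path, 'wb') as out:
        for vcn in range(total_vcns):          # total_vcns derived from the inode's file size
            if vcn in by_vcn:
                disk, off = by_vcn[vcn]
                images[disk].seek(off)
                out.write(images[disk].read(CHUNK_SIZE))
            else:
                out.write(b'\x00' * CHUNK_SIZE)   # no MAP: keep the region as zeros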
After repeated testing the program was finished, and a first vmdk file was exported for trial. The result was surprising: the exported vmdk was smaller than it should be, and the MFT location inside the vmdk did not match its own description. Was it a bug in the program, or was the Data MAP itself damaged? Several MAP entries were verified by hand at random; they did point to the correct data areas, and the program's interpretation of the MAPs was fine. While puzzling over this, it suddenly occurred to me that high-end storage like Isilon could hardly lack sparse files, otherwise how much space would be wasted! Checking against the Data MAPs immediately confirmed that the files are indeed sparse.
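Once sparse files are taken into account, a missing VCN is not necessarily lost data but may simply be a hole. A corrected sketch therefore writes each mapped chunk at its own offset (vcn * CHUNK_SIZE) and sets the output file's length from the size recorded in the inode, letting unmapped gaps stay zero; it reuses the illustrative CHUNK_SIZE and entry format from the previous sketch.

def export_sparse_aware(entries, images, out_path, file_size):
    """Corrected export: treat missing VCNs as sparse holes; place each mapped
    chunk at vcn * CHUNK_SIZE and size the output file from the inode."""
    with open(out_path, 'wb') as out:
        out.truncate(file_size)                    # final size comes from the inode
        for vcn, disk, off in entries:
            images[disk].seek(off)
            out.seek(vcn * CHUNK_SIZE)
            out.write(images[disk].read(CHUNK_SIZE))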
The code was modified and the vmdk re-exported. This time the vmdk size matched the actual size, and the MFT was in the expected position. An MBR, partition table, and DBR were forged by hand, and the file system interpretation tool developed by North Asia was used to parse the file system and export the database and video files inside the vmdk.
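For reference, the MBR part of that forging step can be sketched as below. The MBR layout itself (partition table at offset 446, 0x55AA signature at offset 510, type 0x07 for NTFS) is standard; the start sector and partition size are per-vmdk values, and the NTFS DBR still has to be crafted by hand to match the real volume geometry and MFT location, as was done in this case.

import struct

def forge_mbr(start_lba: int, total_sectors: int) -> bytes:
    """Build a minimal 512-byte MBR with a single NTFS (type 0x07) partition.
    CHS fields are left as dummies; parsers use the LBA fields."""
    entry = struct.pack('<B3sB3sII',
                        0x80,              # bootable flag
                        b'\xfe\xff\xff',   # dummy CHS start
                        0x07,              # partition type: NTFS
                        b'\xfe\xff\xff',   # dummy CHS end
                        start_lba,         # first sector of the partition (LBA)
                        total_sectors)     # partition length in sectors
    mbr = bytearray(512)
    mbr[446:446 + 16] = entry
    mbr[510:512] = b'\x55\xaa'
    return bytes(mbr)

# Example: a partition starting at sector 2048; the sector count comes from the vmdk size.
header = forge_mbr(2048, 64 * 1024 ** 3 // 512)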
After verifying that the database and video files in this vmdk were correct, all important vmdk files were exported in batch, and each vmdk file was patched by hand, one by one.
[Data Acceptance]
After all the important data had been recovered, the customer arranged for engineers to test the completeness and accuracy of all the recovered data; the verification took a full day. The data was finally confirmed to be completely correct, and the recovery was a success.
[Data Recovery Summary]
Although the entire recovery process was tortuous, the results were satisfactory.