Recovery Strategies for Deleted Files in UNIX Systems

Last Update:2018-12-05 Source: Internet

Author: User


Author: Li Guilin Chen chaohui

Unlike in DOS/Windows, it is difficult to restore a UNIX file after it has been deleted, and the reason lies in the structure of the UNIX file system. In DOS/Windows, the directory entry still holds the file name, file length, and starting cluster number (that is, the first disk block occupied by the file) even after the file is deleted. In UNIX, by contrast, all information about a file is described by a data structure called the inode (I node), and the inode is cleared when the file is deleted, so restoring the deleted file directly is almost impossible. This article discusses several file recovery strategies, and the implementation of their key steps, for different practical situations.

I. UNIX File System Structure

UNIX stores a file system on a file volume. Different UNIX systems use different file volume formats, and even different versions of the same UNIX system may not have exactly the same file system. For example, the file systems of SCO UNIX 4.1 and 5.0 differ significantly. Nevertheless, on any UNIX system the basic structure of a file volume is the same, as analyzed below.

Regardless of the UNIX system or version, a file volume contains at least a boot block, a super block, an inode table, and a data area. Beyond these, individual UNIX versions add their own structures, for example the bitmap index blocks and bitmap blocks of SCO UNIX, or the logical volume tables of AIX. These system-specific features do not affect the recovery strategies described below, so they are not discussed here. Only the structure of a standard UNIX file volume is introduced.

1. Boot Block

The boot block occupies the first sector of the file volume. Its 512 bytes hold the boot code of the file system and are used only by the root file system; on other file systems these 512 bytes are empty.

2. Super Block

The super block occupies the second sector of the file system, immediately after the boot block, and describes the structure of the file system, for example the size of the inode area and the size of the file system. Its structure is defined in /usr/include/sys/filsys.h as follows:

struct filsys
{
    ushort  s_isize;            /* number of blocks occupied by the on-disk inode area */
    daddr_t s_fsize;            /* number of blocks in the entire file system */
    short   s_nfree;            /* number of free blocks currently listed in s_free */
    daddr_t s_free[NICFREE];    /* free block list */
    short   s_ninode;           /* number of free inodes currently listed in s_inode */
    ino_t   s_inode[NICINOD];   /* free inode list */
    char    s_flock;            /* lock flag for the free block list */
    char    s_ilock;            /* lock flag for the free inode list */
    char    s_fmod;             /* super block modified flag */
    char    s_ronly;            /* file system read-only flag */
    time_t  s_time;             /* time the super block was last modified */
    short   s_dinfo[4];         /* device information */
    daddr_t s_tfree;            /* total number of free blocks */
    ino_t   s_tinode;           /* total number of free inodes */
    char    s_fname[6];         /* file system name */
    char    s_fpack[6];         /* file system pack name */
    long    s_fill[13];         /* padding */
    long    s_magic;            /* magic number identifying the file system */
    long    s_type;             /* new file system type */
};

3. I node table

The inode table follows the super block, and its length is given by the s_isize field of the super block. Each inode describes the attributes, length, owner, group, and data block address table of one file. Its data structure is defined in /usr/include/sys/ino.h as follows:

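struct dinode
{
    ushort  di_mode;        /* file type and access mode */
    short   di_nlink;       /* number of links to the file */
    ushort  di_uid;         /* owner's user ID */
    ushort  di_gid;         /* owner's group ID */
    off_t   di_size;        /* file length in bytes */
    char    di_addr[40];    /* data block address table */
    time_t  di_atime;       /* time of last access */
    time_t  di_mtime;       /* time of last modification */
    time_t  di_ctime;       /* time of last inode change */
};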
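To make the layout concrete, the following is a minimal sketch of how the position of an inode on the volume could be computed. It assumes 2-byte shorts and 4-byte off_t and time_t values, which makes the structure above 64 bytes long, and it assumes that the boot block and the super block take one block each so that the inode table starts at block 2. The names DINODE_SIZE, inode_offset, and inode_count are hypothetical and chosen only for this illustration; real systems may lay the volume out differently.

```c
/* Size of one on-disk inode (struct dinode):
 * 2 + 2 + 2 + 2 + 4 + 40 + 4 + 4 + 4 = 64 bytes,
 * assuming 2-byte shorts and 4-byte off_t/time_t. */
#define DINODE_SIZE 64L

/* Byte offset of inode number `ino` (numbered from 1) on the volume.
 * Assumption: boot block and super block occupy one block each, so the
 * inode table begins at block 2. Returns -1 for invalid arguments. */
long inode_offset(long ino, long block_size)
{
    if (ino < 1 || block_size <= 0)
        return -1;
    return 2L * block_size + (ino - 1) * DINODE_SIZE;
}

/* Number of inodes that fit in an inode table of `isize` blocks,
 * where `isize` plays the role of s_isize in the super block. */
long inode_count(long isize, long block_size)
{
    return isize * (block_size / DINODE_SIZE);
}
```

Under these assumptions, a recovery tool would seek to inode_offset(n, block_size) on the raw device to read or rewrite inode n.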

4. Directory Structure

Every UNIX file is stored in a directory, and a directory is itself a file. Directories are stored as follows. First, a directory file occupies an inode just like an ordinary file. Second, the inode gives the location where the directory content is stored, and that content yields the file names and their corresponding inode numbers, through which a file is accessed. The directory structure is as follows:

Inode number (2 bytes)    .  (current directory) (14 bytes)

Inode number (2 bytes)    .. (parent directory) (14 bytes)

Inode number (2 bytes)    file name (14 bytes)

It can be seen that the file name is described by a directory, and the content and other information of the file are described by the index node.
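As an illustration of the 16-byte entry format, here is a minimal sketch that looks up a name in a raw directory block and returns its inode number. It assumes that the 2-byte inode number is stored little-endian and that an inode number of 0 marks an unused entry; both depend on the actual system. The names DIRSIZ, DIRENT_LEN, and dir_lookup are used here for illustration only.

```c
#include <stddef.h>
#include <string.h>

#define DIRSIZ     14   /* bytes reserved for the file name */
#define DIRENT_LEN 16   /* 2-byte inode number + 14-byte name */

/* Look up `name` in a raw directory block of `len` bytes.
 * Returns the inode number, or 0 if the name is not present.
 * Assumption: inode numbers are stored little-endian. */
unsigned dir_lookup(const unsigned char *block, size_t len, const char *name)
{
    size_t off;
    size_t nlen = strlen(name);

    if (nlen == 0 || nlen > DIRSIZ)
        return 0;
    for (off = 0; off + DIRENT_LEN <= len; off += DIRENT_LEN) {
        const unsigned char *e = block + off;
        unsigned ino = (unsigned)e[0] | ((unsigned)e[1] << 8);

        if (ino == 0)
            continue;   /* assumed: inode number 0 marks an unused entry */
        /* Names shorter than 14 bytes are NUL-padded; a 14-byte name
         * has no terminator, so compare with an explicit length. */
        if (memcmp(e + 2, name, nlen) == 0 &&
            (nlen == DIRSIZ || e[2 + nlen] == '\0'))
            return ino;
    }
    return 0;
}
```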

II. File Deletion Process

Deleting a file in UNIX is a simple process: the system releases the inode and the data blocks occupied by the file and clears the inode, but it does not erase the file content. However, deleting a file differs from deleting a directory, and different commands delete a file in different ways.

1. Deleting a File

UNIX deletes a file in the following steps: it releases the disk data blocks occupied by the file one by one according to the address table in the file's inode, then clears the content of that inode, and finally releases the inode itself.

2. Deleting a Directory

To delete a directory, the system first deletes all the files in the directory one by one and then deletes the directory itself. Because a directory is also a file, it is deleted in the same way as an ordinary file.

3. Several Different Delete Commands

. rm command

This is the ordinary delete command; its process is the one described above.

. mv command

Format: mv file1 file2

If file2 already exists, the process is to release the data blocks of file2, rename file1 to file2, and then release the inode that file2 previously occupied.

. > command

Format: > filename

If the file does not exist, the > command creates it by allocating only an inode, without writing any file content. If the file already exists, it is emptied: the data blocks it occupies are released and its length is set to zero.

III. Recovery Strategies for Deleted Files

A deleted file can only be restored from whatever remains after the deletion. What does remain? The analysis above shows two things: first, the content of the file; second, the "site", that is, the state in which the disk was left at the moment of deletion. Any recovery strategy has to start from one of these two. Several strategies are described below.

1. On-Site Recovery

If the site has not been disturbed since the deletion (that is, nothing has been written to the hard disk after the file was deleted), and only one file was deleted, the file can be restored by exploiting the system's allocation algorithm. When a file is created, the system decides which data blocks it occupies according to a specific allocation algorithm. When the file is deleted, those data blocks are released and returned to the system's allocation table. If a file is then created again, the data blocks handed out by the same allocation algorithm will coincide with the blocks the deleted file occupied. In addition, the unused bytes at the end of the last data block of a UNIX file are all set to 0. It is therefore enough to request data blocks from the system's allocation algorithm one at a time: as soon as an allocated block ends in zeros, it can be taken as the last block of the file, which fixes the file's length and content and allows the file to be restored. The method is as follows:

(1) Allocate an inode, that is, ask the system to create a new file without writing any content. For example: # > /tmp/xx

(2) Call the system's data block allocation routine getnextfreeblock() to obtain a data block number, and record it in an address table variable.

(3) Read the data block and check whether its tail consists entirely of consecutive zeros. If not, return to (2); if so, go to (4).

(4) Use the system function fstat to get the inode number of /tmp/xx, then write the address table built in step (2) into the address table of that inode (taking care with indirect addresses), calculate the file size from the number of data blocks and the length of valid data in the last block, and write it to the di_size field of the inode.

(5) Write the inode back to the system's inode table.
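The zero-tail test in step (3) and the size calculation in step (4) can be sketched as follows. This is only an illustration under the article's assumption that the unused tail of the last block is zero-filled; a file whose real data ends in zero bytes would be measured too short. The function names valid_length and file_size are hypothetical, and the system-specific parts (allocating blocks and rewriting the inode) are deliberately left out.

```c
#include <stddef.h>

/* Number of bytes in `blk` before its run of trailing zero bytes.
 * A result smaller than `bsize` means the block ends in zeros and may
 * be the last block of the file (the test in step (3)). */
size_t valid_length(const unsigned char *blk, size_t bsize)
{
    size_t n = bsize;

    while (n > 0 && blk[n - 1] == 0)
        n--;
    return n;
}

/* File size for step (4): every block but the last is full, and the
 * last block holds `last_valid` bytes of data. */
long file_size(long nblocks, long bsize, long last_valid)
{
    if (nblocks <= 0)
        return 0;
    return (nblocks - 1) * bsize + last_valid;
}
```

For example, three 1024-byte blocks whose last block holds 3 valid bytes give a di_size of 2 * 1024 + 3 = 2051 bytes.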

Two points should be noted. First, the algorithm used to allocate data blocks varies between UNIX versions. Second, some UNIX versions, such as SCO UNIX 5.0, manage the allocation and release of free data blocks with a dynamic linked list. Recovery is easier on these systems, because it is enough to search at the end of the free list; that case will be described separately.

2. Content-Based Recovery

If the site has been disturbed, that is, the hard disk has been written to since the deletion, the file has to be recovered from its content. Because UNIX is a multi-process, multi-user system that writes to the system logs, .sh_history, and similar files on every boot, shutdown, hardware fault, or communication fault, the site is very likely to have been disturbed. Content-based recovery is therefore of greater practical value. The authors offer the following four strategies for reference.

(1) Keyword Search

If you know a byte string that the deleted file contained, and the file was no longer than one disk block, you can search the entire file system for that byte string, find the data block that held the file, and enter its block number in an inode to restore the file. The algorithm for searching a file system is simple:

A. Run # df -k to determine the device file name of the file system (for example, /dev/root).

B. Use the following function to search. If the search succeeds, the number of the matching data block is returned; if it fails, -1 is returned. fsname is the device name of the file system, for example /dev/root, and comp is a function that tests one block against the search condition.

#include <stdio.h>

/* Scan the device block by block; comp() is given each block in turn. */
long searchfs(const char *fsname, int (*comp)(const char *buf))
{
    FILE *fp;
    char buf[1024];
    long i = 0;

    fp = fopen(fsname, "rb");
    if (fp == NULL)
        return -1;              /* the device could not be opened */
    while (fread(buf, sizeof buf, 1, fp) == 1)
    {
        if (comp(buf))          /* check whether the search criteria are met */
        {
            fclose(fp);
            return i;           /* success: return the block number */
        }
        i++;
    }
    fclose(fp);
    return -1;                  /* no matching block was found */
}
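The search condition itself is not spelled out in the article. One possible form of the keyword test that comp() would perform on each block is sketched below. The name block_contains and its parameters are hypothetical; the sketch only assumes that the block is available as a byte buffer. Because it looks at one block at a time, it cannot find a keyword that straddles a block boundary, which matches the article's restriction to files of at most one block.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the `klen`-byte string `key` occurs anywhere in the
 * `blen`-byte block `buf`, otherwise 0. memcmp is used instead of
 * strstr because a disk block may contain NUL bytes. */
int block_contains(const char *buf, size_t blen, const char *key, size_t klen)
{
    size_t i;

    if (klen == 0 || klen > blen)
        return 0;
    for (i = 0; i + klen <= blen; i++) {
        if (memcmp(buf + i, key, klen) == 0)
            return 1;
    }
    return 0;
}
```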

(2) Exact Length Search

If you know the exact length (in bytes) of the deleted file, you can calculate from the data block size exactly how many bytes of data the last block of the file holds; all remaining bytes in that block must be 0. Using this condition, you can search the entire file system for data blocks that satisfy it. If several data blocks qualify, they must be told apart by other conditions. Even so, exact length analysis remains a useful recovery strategy.
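The length condition can be sketched as a test on a single candidate block. The functions last_block_bytes and could_be_last_block are hypothetical names. The check that the final data byte is nonzero is an extra heuristic added here, not something the article states; it suits text files and filters out all-zero blocks, but it would reject a file whose data genuinely ends in a zero byte.

```c
#include <stddef.h>

/* Bytes of real data expected in the last block of a file of
 * `file_len` bytes stored in `bsize`-byte blocks. A file that fills
 * its last block exactly yields `bsize`. */
size_t last_block_bytes(size_t file_len, size_t bsize)
{
    size_t r;

    if (file_len == 0 || bsize == 0)
        return 0;
    r = file_len % bsize;
    return r == 0 ? bsize : r;
}

/* Return 1 if `blk` could be the last block of such a file: every
 * byte past the expected data must be zero and, as an added heuristic,
 * the final data byte must be nonzero. */
int could_be_last_block(const unsigned char *blk, size_t bsize, size_t file_len)
{
    size_t tail = last_block_bytes(file_len, bsize);
    size_t i;

    if (tail == 0 || blk[tail - 1] == 0)
        return 0;
    for (i = tail; i < bsize; i++) {
        if (blk[i] != 0)
            return 0;
    }
    return 1;
}
```

Note that when the file length is an exact multiple of the block size, no zero tail exists and the condition gives almost no discrimination, so other conditions are needed.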

(3) Content Correlation Method

If you know that the file content satisfies some relationship that can be checked, such as a file checksum or a dependency between one part of the content and another, you can also search the entire file system: by repeatedly searching for disk data blocks that satisfy the relationship, you can restore the file.
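As one illustration of such a relationship, the sketch below tests whether a candidate block reproduces a known checksum. The plain byte sum is a stand-in chosen only for this example, since the article does not say which checksum is meant; in practice the check would use whatever checksum the file format defines. The names byte_sum and checksum_matches are hypothetical.

```c
#include <stddef.h>

/* Simple additive checksum over `len` bytes. This is a stand-in for
 * illustration; substitute the checksum the file format really uses. */
unsigned long byte_sum(const unsigned char *data, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Return 1 if the first `len` bytes of the candidate block reproduce
 * the known checksum `expected`, otherwise 0. */
int checksum_matches(const unsigned char *blk, size_t len, unsigned long expected)
{
    return byte_sum(blk, len) == expected;
}
```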

(4) Environment Comparison Method

If you know how the file system that held the deleted file was installed, you can take an identical machine and install the same version of UNIX and the other software on it, following the original steps. The environment of the new machine should then be essentially the same as the original one. By comparing the content of the same file system on the two machines, you can infer the approximate location of the deleted file, or at least narrow the search range considerably. Once the range is small enough, you can inspect the candidates one by one and try them in combination with other conditions, which makes recovery easier and more reliable.
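The comparison step could look roughly like the sketch below, which compares two file system images block by block and records which block numbers differ. It assumes that both images are the same size and are already available in memory, which is a simplification; a real tool would read the two devices in chunks. The name diff_blocks and its parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Compare two images of `nblocks` blocks of `bsize` bytes each and
 * record the numbers of the blocks that differ. At most `max` block
 * numbers are stored in `out`; the return value is the total number
 * of differing blocks, which may exceed `max`. */
size_t diff_blocks(const unsigned char *a, const unsigned char *b,
                   size_t nblocks, size_t bsize, long *out, size_t max)
{
    size_t i, count = 0;

    for (i = 0; i < nblocks; i++) {
        if (memcmp(a + i * bsize, b + i * bsize, bsize) != 0) {
            if (count < max)
                out[count] = (long)i;
            count++;
        }
    }
    return count;
}
```

The differing blocks form the reduced search range to which the other content-based conditions can then be applied.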

How file recovery is actually implemented on a UNIX system depends on the file system structure and the disk block allocation algorithm of the particular operating system and version. This article has tried to summarize a general approach and a set of strategies; for reasons of space, the details cannot be discussed here.

From: ccpi.gov.cn

