Use lsof to handle file recovery, handles, and space-release issues

Source: Internet
Author: User

Problem description: after a file generated by updatedb was deleted, the disk space was not reclaimed.

du reports that /var occupies 8.8G, but df shows 18G in use on the disk with only 119M remaining. The cause turned out to be a file handle held by a program that had never been released.
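The same mismatch can be observed from code by comparing what a directory walk can account for with what the file system itself reports. The sketch below is only an illustration: the names du_bytes and df_used_bytes are invented for this example, and it assumes a Unix-like system where os.statvfs is available and st_blocks is counted in 512-byte units (as on Linux).

```python
import os


def du_bytes(root):
    """Approximate `du`: add up the blocks of everything reachable under root."""
    total = os.lstat(root).st_blocks * 512
    seen = set()
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in dirnames + filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # hard link to an inode that was already counted
            seen.add(key)
            total += st.st_blocks * 512  # assumption: 512-byte units, as on Linux
    return total


def df_used_bytes(path):
    """Approximate the "Used" column of `df` for the file system holding path."""
    vfs = os.statvfs(path)
    return (vfs.f_blocks - vfs.f_bfree) * vfs.f_frsize
```

A file that has been deleted but is still held open is counted by df_used_bytes and is invisible to du_bytes, so a large gap between the two numbers at a mount point is a hint to look for unreleased file handles.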


Cause analysis: suppose you write a program that opens a file:


fh = open('a.txt', 'r')   # opening the file allocates a file handle
fh.readlines()            # read the contents
fh.close()                # release the file handle

The file is opened and, at the end, explicitly closed. If the program does not close the file itself and this happens many times, problems are bound to follow.
Failing to release a file handle is a bug in the program itself, so the space stays occupied for as long as the program keeps running. In this case the usual fix is to restart the service; the unreleased file handles are then released and everything returns to normal.
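In Python, one common way to avoid forgetting the close is a with block, which closes the handle even when an exception is raised in between. This is a minimal sketch; read_lines is a hypothetical helper name, and the path is passed in as a parameter.

```python
def read_lines(path):
    """Read all lines; the with block closes the handle even if readlines() raises."""
    with open(path, "r") as fh:
        lines = fh.readlines()
    # Outside the with block the handle has already been closed.
    return lines, fh.closed
```

The second return value is only there to make the effect visible: it is True once the block has exited.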



Here is a solution found online:


I once ran into inconsistent results from df and du in production. To find out which process is holding the file handle and preventing the space from being released, remember that on Linux everything is a file, so the problem can be handled with the powerful lsof command. (It can also be used to investigate file handle leaks, where an application process does not close its file handles.)

1. File handles and space release issues
    • Note: a common problem in production is that a maintenance engineer or developer uses the tail command to watch a log in real time, and then someone else deletes the log with the rm command. The result is that the disk space is not actually released: the deleted file is still in use by a process (tail), so its file handle has not been released.

Simulation scenario 1:

Create a file named testfile

touch testfile

Then use the tail command to keep watching it

tail -f testfile

At this point another colleague deletes the file with the rm command

rm testfile
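The same situation can be reproduced inside a single process. The sketch below is illustrative (delete_while_open is a made-up name): it holds a descriptor open, removes the name, and shows that the data stays readable until the descriptor is closed.

```python
import os
import tempfile


def delete_while_open():
    """Unlink a file that is still open and show that its data survives."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"still here\n")
    os.unlink(path)                    # same effect as rm: the name is gone
    name_exists = os.path.exists(path)
    link_count = os.fstat(fd).st_nlink  # no directory entry points at the inode
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)            # the blocks are still allocated
    os.close(fd)                       # only now can the kernel free the space
    return name_exists, link_count, data
```

On Linux the function returns a name that no longer exists, a link count of 0, and the bytes that were written, which is exactly the state tail leaves the log file in.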

Now use the lsof command to troubleshoot

If you know the file name, you can use the following command directly

lsof | grep testfile

But if you don't know which file it is, or if many files are involved, use the following command

lsof | grep deleted

Note: "deleted" marks a file that has been deleted while its file handle has not been released. This command lists every process that still holds such a handle.

Note: on some systems the PATH environment variable does not include the directory that contains lsof, and running lsof directly fails with a "command not found" error. In that case call it by its full path, /usr/bin/lsof or /usr/sbin/lsof, depending on your system.
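When lsof is not installed at all, a similar list can be assembled by reading /proc directly. This is a rough sketch that assumes a Linux /proc layout; find_deleted_open_files is a hypothetical name, and unlike lsof it only looks at file descriptors (not memory-mapped files) and silently skips processes it is not allowed to inspect.

```python
import os


def find_deleted_open_files(proc_root="/proc"):
    """Return (pid, fd, target) for descriptors whose link target ends in ' (deleted)'."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed in the meantime
            if target.endswith(" (deleted)"):
                results.append((int(entry), int(fd), target))
    return results
```

Run it as root to see descriptors belonging to other users' processes.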

The command above produces output like the following

root 123 12244 0 14:47 pts/1 01:02:03 tail -f testfile

You can then use the kill command to terminate the process (PID 123 here), which releases the file handle and frees the space

kill 123

2. File recovery issues

Before explaining the problem, here are some basic concepts about files:

    • A file name is actually a link to an inode. The inode holds all of the file's attributes, such as permissions and owner, as well as the addresses of the data blocks (the disk blocks in which the file's contents are stored). When you delete (rm) a file, only the link to the inode is removed; the inode's contents are not deleted, and a process may still be using it. Only when every link to the inode has been removed and no process has it open can the data blocks be reused for new data.

    • The proc file system can help us recover the data. Every process on the system has a directory under /proc named after its process ID, which contains an fd (file descriptor) subdirectory holding links to all the files the process has open. If you delete a file from the file system, a reference to its inode still remains here:

/proc/<process ID>/fd/<file descriptor>
    • You need to know the process ID (PID) of the process that has the file open and the file descriptor (FD). Both are easy to obtain with the lsof tool; lsof stands for "list open files", that is, it lists the files that processes have open. You can then copy the data to be recovered out of /proc.
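The first point above, that a name is only a link to an inode, can be checked with hard links. The following sketch is illustrative (link_count_demo is a made-up name) and assumes the temporary directory lives on a file system that supports hard links.

```python
import os
import tempfile


def link_count_demo():
    """Show that removing one name leaves the inode and its data intact."""
    workdir = tempfile.mkdtemp()
    first = os.path.join(workdir, "a")
    second = os.path.join(workdir, "b")
    with open(first, "w") as fh:
        fh.write("data")
    os.link(first, second)                   # second name for the same inode
    links_before = os.stat(first).st_nlink
    same_inode = os.stat(first).st_ino == os.stat(second).st_ino
    os.unlink(first)                         # removes one link only
    links_after = os.stat(second).st_nlink
    with open(second) as fh:
        content = fh.read()                  # data is still reachable
    os.unlink(second)
    os.rmdir(workdir)
    return links_before, same_inode, links_after, content
```

The link count drops from 2 to 1 when the first name is removed, and the content is still readable through the remaining name. An open file descriptor keeps the inode alive in the same way after the last name is gone.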

1. Create a test file and back it up to make later verification easier
touch testfile
cp testfile testfile.backup.2014
2. View information about the file
stat testfile
  File: 'testfile'
  Size: 343545  Blocks: 241  IO Block: 4096  regular file
Device: fd00h/64768d  Inode: 361579  Links: 1
Access: (0664/-rw-rw-r--)  Uid: (505/zhaoke)  Gid: (505/zhaoke)
Access: 2014-11-09 15:00:38.000000000 +0800
Modify: 2014-11-09 15:00:34.000000000 +0800
Change: 2014-04-09 15:00:34.000000000 +0800

Everything looks fine, so continue with the next steps:

3. Delete the file
rm testfile
4. Check the file
ls -l testfile
ls: testfile: No such file or directory
stat testfile
stat: cannot stat 'testfile': No such file or directory

The testfile file has been deleted, but do not terminate the process that is still using it: once that process exits, the file becomes very difficult to recover.

Now the data recovery begins. First use the lsof command to take a look
lsof | grep testfile
tail 5317 root 4r REG 253,0 343545 361579 /root/testfile (deleted)
    • The first column is the process name (command name), the second column is the process ID (PID), and the fourth column is the file descriptor

    • This shows that process 5317 still has the file open, on file descriptor 4. We can therefore start copying the data out of /proc.

    • You might consider using cp -a, but that does not actually work: it copies a symbolic link that points to the deleted file:

ls -l /proc/5317/fd/4
lr-x------ 1 root root 15:00 /proc/5317/fd/4 -> /root/testfile (deleted)
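The column layout described above is enough to turn one line of lsof output into the matching /proc path. This is a small sketch, assuming the default whitespace-separated lsof layout and a command name without spaces; proc_path_from_lsof_line is a hypothetical helper name.

```python
import re


def proc_path_from_lsof_line(line):
    """Build /proc/<PID>/fd/<FD> from one line of default lsof output."""
    fields = line.split()
    pid = int(fields[1])                    # second column: PID
    match = re.match(r"\d+", fields[3])     # fourth column: FD plus mode, e.g. "4r"
    if match is None:
        raise ValueError("no numeric file descriptor in %r" % fields[3])
    return "/proc/%d/fd/%s" % (pid, match.group(0))
```

For the sample line shown above, the function yields the path /proc/5317/fd/4. Entries such as cwd or txt have no numeric descriptor and raise ValueError.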

Test the recovery with the cp -a command

cp -a /proc/5317/fd/4 testfile.backup

Use the ls command to check the result

ls -l testfile.backup
lrwxrwxrwx 1 root root 15:02 testfile.backup -> /root/testfile (deleted)

The output shows that cp -a restored only a symbolic link pointing to the deleted file

Examine the copied file and the file descriptor separately with the file command

    • 1. Examine the file

file testfile.backup
testfile.backup: broken symbolic link to '/root/testfile (deleted)'
    • 2. Examine the file descriptor

file /proc/5317/fd/4
/proc/5317/fd/4: broken symbolic link to '/root/testfile (deleted)'

Based on the results from file above, use plain cp (without -a) to copy the data from the file descriptor into a file, as follows:

cp /proc/5317/fd/4 testfile.new
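The difference between the two copies can be reproduced from Python on a Linux system. In this sketch, shutil.copyfile with follow_symlinks=False stands in for the no-dereference behaviour of cp -a, and the default call stands in for plain cp; the function name and the file names link_copy and data_copy are made up for the example, and the process inspects its own descriptor through /proc/self.

```python
import os
import shutil
import tempfile


def copy_link_vs_content():
    """Copy a deleted-but-open file via /proc, once as a link and once as data."""
    workdir = tempfile.mkdtemp()
    fd, path = tempfile.mkstemp(dir=workdir)
    os.write(fd, b"precious data\n")
    os.unlink(path)                               # the file is now "deleted"
    proc_path = "/proc/self/fd/%d" % fd
    link_copy = os.path.join(workdir, "link_copy")
    data_copy = os.path.join(workdir, "data_copy")
    # Not following the link recreates a symlink to "... (deleted)".
    shutil.copyfile(proc_path, link_copy, follow_symlinks=False)
    # Following the link opens the inode itself and copies its bytes.
    shutil.copyfile(proc_path, data_copy)
    os.close(fd)
    is_link = os.path.islink(link_copy)
    is_dangling = not os.path.exists(link_copy)
    with open(data_copy, "rb") as fh:
        recovered = fh.read()
    shutil.rmtree(workdir)
    return is_link, is_dangling, recovered
```

The first copy is a dangling symbolic link, while the second contains the original bytes, which matches what the cp -a and cp commands produce above.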

After recovering the file with the command above, confirm that the file was restored and that its contents are correct:

ls -l testfile.new

Then compare the recovered file with the backup

diff testfile.new testfile.backup.2014
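The same check can be scripted. A minimal sketch using the standard library's filecmp module follows; same_contents is a hypothetical helper name, and shallow=False forces a byte-by-byte comparison instead of trusting file size and modification time.

```python
import filecmp


def same_contents(recovered, backup):
    """Return True when both files hold identical bytes (diff would print nothing)."""
    return filecmp.cmp(recovered, backup, shallow=False)
```

Calling it with the recovered file and the backup made in step 1 returns True when the recovery was complete.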

Original link:http://segmentfault.com/blog/yexiaobai/1190000000461077
