Using lsof to find open files
You can learn a lot about a system by looking at its open files. Knowing which files an application has open, or which application has a particular file open, lets you make better decisions as a system administrator. For example, you should not unmount a file system that has open files on it. With lsof, you can check for open files and stop the processes involved before unmounting. Similarly, if you find an unknown file, you can find out exactly which application opened it.
In the UNIX® environment, files are everywhere, which gave rise to the maxim "Everything is a file." Files give access not only to regular data but often also to network connections and hardware. In some cases, the corresponding entry appears when you use ls to request a directory listing. In other cases, such as Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) sockets, there is no directory listing. Behind the scenes, however, a file descriptor is assigned to the application regardless of the nature of the file, which provides a common interface between the application and the underlying operating system.
Because the list of file descriptors an application has open reveals a great deal about the application itself, it is helpful to be able to view that list. The utility that does this is called lsof, which stands for "list open files." It is available for almost every UNIX version but, oddly enough, most vendors do not include it in the initial installation of the operating system. For more information about lsof, see the Resources section.
Introducing lsof
Simply typing lsof generates a lot of information, as shown in Listing 1. Because lsof needs to access kernel memory and various files, it must be run as the root user to be fully functional.
Listing 1. Example output for lsof
    bash-3.00# lsof
    COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    sched      0 root  cwd   VDIR  136,8     1024    2 /
    init       1 root  cwd   VDIR  136,8     1024    2 /
    init       1 root  txt   VREG  136,8    49016 1655 /sbin/init
    init       1 root  txt   VREG  136,8    51084 3185 /lib/libuutil.so.1
    vi           root    3u  VREG  136,8        0 8501 /var/tmp/exxdao7d
    ...
Each line shows one open file and, unless you specify otherwise, all files opened by all processes are displayed. The COMMAND, PID, and USER columns give the name of the process, the process identifier (PID), and the owner's name, respectively. The DEVICE, SIZE/OFF, NODE, and NAME columns refer to the file itself: the disk it resides on, the size of the file, its inode (the file's identity on disk), and the file's actual name. Depending on the UNIX version, the size may instead be reported as the current position (offset) at which the application is reading the file. Listing 1 comes from a Sun Solaris 10 computer, which can report this information; Linux® does not have this capability.
The FD and TYPE columns are the least self-explanatory, and they tell you more about how a file is being used. The FD column holds the file descriptor, which is how the application identifies the file. The TYPE column describes what kind of file it is. Look at the file descriptor column first: Listing 1 contains three different kinds of values. The value cwd represents the application's current working directory, which is the directory the application was launched from unless the application itself changes it. A txt entry is program code, such as the application binary itself or a shared library, like the init program shown in the listing. Finally, a numeric value is the application's file descriptor, an integer returned when the file is opened. In the last line of the Listing 1 output, you can see that the user is editing /var/tmp/exxdao7d with vi, using file descriptor 3. The u means that the file is open for both reading and writing, rather than read-only (r) or write-only (w). One more detail is not essential but is helpful to know: every application starts with three file descriptors, 0 through 2, for the standard input, output, and error streams. For this reason, the files an application opens itself usually have FDs starting at 3.
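The r, w, and u modes are easy to reproduce from a shell. The following is a minimal sketch, assuming a Bourne-style shell with mktemp available. The scratch file is purely illustrative, and descriptor 3 is chosen explicitly here, whereas a program calling open() would simply be handed the lowest free number.

```shell
#!/bin/sh
# Descriptors 0, 1, and 2 are already taken by stdin, stdout, and stderr,
# so 3 is the first number normally available for our own files.
scratch=$(mktemp)                 # hypothetical scratch file, illustration only

exec 3>"$scratch"                 # write-only: lsof would list this FD as 3w
echo "written through descriptor 3" >&3
exec 3>&-                         # close descriptor 3

exec 3<"$scratch"                 # read-only: lsof would list this FD as 3r
read -r line <&3
exec 3<&-

exec 3<>"$scratch"                # read/write: lsof would list this FD as 3u
exec 3>&-

echo "$line"
rm -f "$scratch"
```

While any of the three exec lines is in effect, running lsof against the shell's PID should show the scratch file on descriptor 3 with the matching mode letter.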
The TYPE column is more intuitive than the FD column. Depending on the operating system, you will find files and directories reported as REG and DIR (on Solaris, VREG and VDIR). Other possible values include CHR and BLK for character and block devices, and UNIX, FIFO, and IPv4 for UNIX domain sockets, first-in-first-out (FIFO) queues, and Internet Protocol (IP) sockets, respectively.
The /proc directory
Although it is not directly related to using lsof, a brief introduction to the /proc directory is in order. /proc is a directory containing various files that reflect the kernel and the process tree. These files and directories do not exist on disk, so when you read or write them, you are actually getting information from the operating system itself. Most of the information relevant to lsof is stored in directories named after the PIDs of processes, so /proc/1234 contains information about the process with PID 1234.
Each process directory under /proc contains various files that give applications a simple view of the process's memory space, its list of file descriptors, symbolic links to files on disk, and other system information. The lsof utility uses this information, along with other knowledge of the kernel's internal state, to produce its output. Later, I'll relate the lsof output to the information in the /proc directory.
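To make the relationship concrete, here is a small sketch that reads the same facts straight from /proc. It assumes a Linux-style /proc layout (Solaris and other systems organize the directory differently) and inspects the current shell rather than a real daemon.

```shell
#!/bin/sh
# Assumption: a Linux-style /proc, where cwd, exe, and fd/N are symbolic links.
pid=$$                                   # inspect the current shell

proc_cwd=$(readlink "/proc/$pid/cwd")    # what lsof reports as the cwd entry
proc_exe=$(readlink "/proc/$pid/exe")    # the binary lsof reports as a txt entry

echo "cwd: $proc_cwd"
echo "txt: $proc_exe"

# Numeric descriptors appear as symbolic links under fd/.
ls -l "/proc/$pid/fd"
```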
Common usage
Earlier, I showed how simply running lsof without any parameters displays information about the files opened by every process. The rest of this article focuses on how to use lsof to display the information you need and how to interpret it correctly.
Finding an application's open files
A common use of lsof is to find the names and the number of files an application has open. You might want to find out where a particular application writes its log data, or you might be tracking down a problem. For example, UNIX limits the number of files a process can open. The limit is usually large enough that it causes no trouble, and an application can request a higher value (up to a certain maximum) when needed. If you suspect that an application is running out of file descriptors, you can use lsof to count its open files and confirm the suspicion.
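One way to do that count is sketched below. The helper name count_open_fds is made up for this example. It relies on the lsof field-output option (-F) with the f field, and it falls back to a Linux-style /proc listing when lsof is not installed. The number is approximate, because the shell itself holds a temporary descriptor or two while it runs the count.

```shell
#!/bin/sh
# count_open_fds PID: count the numeric descriptors a process holds.
# (Hypothetical helper name, for illustration only.)
count_open_fds() {
    if command -v lsof >/dev/null 2>&1; then
        # -Ff prints one "f<descriptor>" line per open file; keeping only the
        # lines that start with a digit skips entries such as cwd and txt.
        lsof -p "$1" -Ff 2>/dev/null | grep -c '^f[0-9]' || true
    else
        # Assumption: Linux-style /proc as a fallback when lsof is missing.
        ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
    fi
}

open_count=$(count_open_fds "$$")   # the current shell stands in for the application
fd_limit=$(ulimit -n)               # per-process descriptor limit for this shell
echo "PID $$ has $open_count descriptors open (limit: $fd_limit)"
```

If the count sits close to the limit, the application is a likely candidate for descriptor exhaustion.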
To specify a single process, use the -p parameter followed by the PID of the process. Because this returns not only the files the application has opened but also shared libraries and code, you often need to filter the output. To do so, use the -d flag to filter on the FD column and the -a flag to indicate that both conditions must be met (AND). Without the -a flag, the default is to show files that match any one of the parameters (OR). Listing 2 shows the files opened by the sendmail process, with the txt entries filtered out.
Listing 2. lsof output filtered by PID, with txt file descriptors excluded
    sh-3.00# lsof -a -p 605 -d ^txt
    COMMAND  PID USER   FD   TYPE        DEVICE SIZE/OFF     NODE NAME
    sendmail 605 root  cwd   VDIR         136,8     1024    23554 /var/spool/mqueue
    sendmail 605 root    0r  VCHR          13,2           6815752 /devices/pseudo/mm@0:null
    sendmail 605 root    1w  VCHR          13,2           6815752 /devices/pseudo/mm@0:null
    sendmail 605 root    2w  VCHR          13,2           6815752 /devices/pseudo/mm@0:null
    sendmail 605 root    3r  DOOR                    0t0       58 /var/run/name_service_door (door to nscd[81]) (FA:->0x30002b156c0)
    sendmail 605 root    4w  VCHR          21,0          11010052 /devices/pseudo/log@0:conslog->log
    sendmail 605 root    5u  IPv4 0x300010ea640      0t0      TCP *:smtp (LISTEN)
    sendmail 605 root    6u  IPv6 0x3000431c180      0t0      TCP *:smtp (LISTEN)
    sendmail 605 root    7u  IPv4 0x300046d39c0      0t0      TCP *:submission (LISTEN)
    sendmail 605 root    8wW VREG         281,3       32  8778600 /var/run/sendmail.pid
The lsof command in Listing 2 specifies three parameters. The first, -a, means that a file is shown only when all of the conditions are true. The second, -p 605, restricts the output to the process with PID 605, which you can obtain with the ps command. The last, -d ^txt, filters out the txt records (the caret [^] denotes exclusion).
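The same three parameters can be wrapped in a small helper. This is only a sketch: the function name list_data_files is invented for illustration, and the current shell's PID stands in for a real daemon such as sendmail.

```shell
#!/bin/sh
# list_data_files PID: a process's open files, minus program text.
# The caret is quoted because some older shells treat a bare ^ as a pipe.
# (Hypothetical helper name, for illustration only.)
list_data_files() {
    lsof -a -p "$1" -d '^txt'
}

if command -v lsof >/dev/null 2>&1; then
    # The current shell stands in for the process of interest.
    list_data_files "$$" || echo "lsof had nothing to report" >&2
else
    echo "lsof is not installed" >&2
fi
```

Dropping the -a from the helper would turn the AND into an OR, and the output would then include every file that matches either condition.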
The output in Listing 2 tells you about the behavior of the process. As the cwd line shows, the application's working directory is /var/spool/mqueue. File descriptors 0, 1, and 2 are attached to /dev/null (Solaris makes heavy use of symbolic links, so the underlying pseudo-devices are shown here). FD 3 is a Solaris door (a high-speed remote procedure call (RPC) interface), opened read-only. FD 4 is more interesting: it is a write-only handle to a character device that is, in essence, /dev/log. From this you can gather that the application logs through the UNIX syslog daemon, so /etc/syslog.conf determines where the log files end up.
As a network application, sendmail listens on network ports. File descriptors 5, 6, and 7 tell you that the application is listening on the Simple Mail Transfer Protocol (SMTP) port over both IPv4 and IPv6, and on the submission port over IPv4. The last file descriptor is write-only and points to /var/run/sendmail.pid. The uppercase W in the FD column indicates that the application holds a write lock on the entire file. This file is used to ensure that only one instance of the application runs at a time.
Finding the application that has a file open
In other cases, you have a file or directory and need to know which application has it open. Listing 2 showed that the sendmail process has /var/run/sendmail.pid open. If you did not already know this, lsof could tell you when given the file name. Listing 3 shows the corresponding output.
Listing 3. Asking lsof for information about a file
    bash-3.00# lsof /var/run/sendmail.pid
    COMMAND  PID USER   FD   TYPE        DEVICE SIZE/OFF     NODE NAME
    sendmail 605 root    8wW VREG         281,3       32  8778600 /var/run/sendmail.pid
As the output shows, the sendmail process (PID 605) has /var/run/sendmail.pid open for writing and holds a lock on it. If for some reason you need to delete this file, it is good practice to stop the process first rather than deleting the file directly. Otherwise, the daemon might not start properly the next time, or another instance might start later and cause contention.
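When all you want is the PIDs, for example to feed them to another command, the -t option of lsof prints them without the table. The sketch below is illustrative only: pids_holding is an invented helper name, and a scratch file held open by the current shell stands in for a real PID file.

```shell
#!/bin/sh
# pids_holding FILE: print the PID of each process that has FILE open.
# (Hypothetical helper name, for illustration only.)
pids_holding() {
    lsof -t -- "$1"
}

scratch=$(mktemp)                 # hypothetical stand-in for a PID file
exec 3>"$scratch"                 # hold it open for writing, as a daemon would

if command -v lsof >/dev/null 2>&1; then
    holders=$(pids_holding "$scratch") || holders=""
    echo "processes holding $scratch: $holders"
else
    holders=""
    echo "lsof is not installed" >&2
fi

exec 3>&-                         # release the file again
rm -f "$scratch"
```

On a typical system the list should include the PID of the shell running the script, since that shell is the one holding the file open.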
Sometimes you know only that a file is open somewhere on a file system. An attempt to unmount a file system fails if any file on it is open. By specifying the name of the mount point, you can use lsof to display all open files on a file system. Listing 4 shows an attempt to unmount /export/home, followed by the use of lsof to find out who is using the file system.
Listing 4. Using lsof to find out who is using a file system
    bash-3.00# umount /export/home
    umount: /export/home busy
    bash-3.00# lsof /export/home
    COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    bash    1943 root  cwd   VDIR  136,7     1024    4 /export/home/sean
    bash    2970 sean  cwd   VDIR  136,7     1024    4 /export/home/sean
    ct      3030 sean  cwd   VDIR  136,7     1024    4 /export/home/sean
    ct      3030 sean    1w  VREG  136,7        0   25 /export/home/sean/output
In this example, the user sean is doing some work in his home directory. Two instances of bash (a shell) are running with their current directory set to sean's home directory. An application named ct is also running in the same directory, with its standard output (file descriptor 1) redirected to a file named output. To unmount /export/home successfully, you should stop these processes, after notifying the user to make sure that doing so is acceptable.
This example illustrates why an application's current working directory matters: it holds on to a file resource and can prevent the file system from being unmounted. This is why most daemons (background processes) change their directory to the root directory or to a service-specific directory (such as /var/spool/mqueue in the sendmail example), so that the daemon does not block the unmounting of an unrelated file system. If sendmail had been started from the /export/home/sean directory and had not changed its directory to /var/spool/mqueue, you would have to stop it before unmounting /export/home.
If you are interested in open files in a directory that is not a mount point, you must specify the directory name with +d or +D, depending on whether you need to recurse into subdirectories (+D) or not (+d). For example, to see all open files in /export/home/sean, you can use lsof +D /export/home/sean. In the earlier example, the directory was a mount point, which is a slightly different case that limits how much lsof has to interact with the kernel. It also creates a potential pitfall: lsof /export/home is not the same as lsof /export/home/ (note the trailing slash). The first form works because it refers to the mount point. The second produces no output because it refers to the directory. You may run into this problem if you use the Tab key to complete commands in the shell, because completion adds the trailing slash for you. In that case, you can either delete the slash or specify the directory with +D. The former is preferable because it runs faster than specifying an arbitrary directory.
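In a script, the trailing-slash pitfall can be avoided by normalizing the path before handing it to lsof. The helper below is a sketch with an invented name, strip_trailing_slash, and it uses only standard shell parameter expansion.

```shell
#!/bin/sh
# strip_trailing_slash PATH: remove trailing slashes, but leave "/" alone.
# (Hypothetical helper name, for illustration only.)
strip_trailing_slash() {
    path=$1
    while [ "$path" != "/" ] && [ "${path%/}" != "$path" ]; do
        path=${path%/}
    done
    printf '%s\n' "$path"
}

# The mount point from Listing 4, as tab completion would leave it.
mount_point=$(strip_trailing_slash "/export/home/")
echo "$mount_point"
```

The result can then be passed on as lsof "$mount_point", which refers to the mount point rather than the directory.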
Less common usage
In the previous section, we looked at the basic uses of lsof: showing the relationships between open files and the processes that hold them. This is helpful when you are about to do something disruptive on a system and do not want to damage other people's important files. You can also apply the same techniques to some of the trickier UNIX tasks.
Recovering deleted files
When a UNIX computer is compromised, log files are often deleted to cover the attacker's tracks. Administrative mistakes can also lead to the accidental deletion of important files, such as a database's active transaction log being removed while old logs are cleaned up. These files can sometimes be recovered, and lsof can help.
When a process opens a file, the file remains on disk for as long as the process keeps it open, even if it is deleted. The process does not know that the file has been deleted, and it can still read from and write to the file descriptor it was given when it opened the file. The file is invisible to every other process, because its directory entry has been removed.
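This behavior is easy to observe from a shell. In the sketch below, the current shell plays the part of the process that keeps the file open, and the file created by mktemp is a throwaway stand-in for a log file.

```shell
#!/bin/sh
logfile=$(mktemp)                  # hypothetical stand-in for a log file
exec 3>>"$logfile"                 # descriptor 3: opened for appending
exec 4<"$logfile"                  # descriptor 4: opened for reading
echo "first entry" >&3

rm -f "$logfile"                   # remove the directory entry
[ -e "$logfile" ] || echo "directory entry is gone"

echo "second entry" >&3            # the descriptor still accepts writes
survived=$(cat <&4)                # and the data can still be read back
echo "$survived"

exec 3>&-                          # closing the last descriptors is what
exec 4<&-                          # finally releases the file's disk space
```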
As mentioned earlier in the discussion of the /proc directory, you can reach a process's file descriptors by looking in the appropriate directory. And as you have seen, lsof can show you a process's file descriptors and the file names associated with them. You can see where this is going, can't you?
If only it were that simple! When you pass lsof a file name, as in lsof /file/I/deleted, it first uses the stat() system call to get information about the file, and unfortunately the file has been deleted. Depending on the operating system, lsof may be able to retrieve the file's name from kernel memory. Listing 5 shows a Linux system on which the Apache logs have been deleted; I use the grep tool to find out whether anyone still has the file open.
Listing 5. Using lsof to find deleted files on Linux
    # lsof | grep error_log
    httpd 2452 root  2w  REG 33,2 499 3090660 /var/log/httpd/error_log (deleted)
    httpd 2452 root  7w  REG 33,2 499 3090660 /var/log/httpd/error_log (deleted)
    ... more httpd processes ...
In this example, you can see that PID 2452 has the file open on file descriptors 2 (standard error) and 7. You can therefore look at the data in /proc/2452/fd/7, as shown in Listing 6.
Listing 6. Finding the deleted file through /proc
    # cat /proc/2452/fd/7
    [Sun Apr 30 04:02:48 2006] [notice] Digest: generating secret for digest authentication
    [Sun Apr 30 04:02:48 2006] [notice] Digest: done
    [Sun Apr 30 04:02:48 2006] [notice] LDAP: Built with OpenLDAP LDAP SDK
The advantage of Linux is that it keeps the file's name and even tells you that the file has been deleted. This is useful when examining a compromised system, because attackers typically delete logs to hide their tracks. Solaris does not provide this information. However, you know that the httpd daemon uses the error_log file, so you can use the ps command to find the PID and then look at all of the files the daemon has open.
Listing 7. Finding deleted files on Solaris
    # lsof -a -p 8663 -d ^txt
    COMMAND  PID   USER   FD   TYPE        DEVICE SIZE/OFF     NODE NAME
    httpd   8663 nobody  cwd   VDIR         136,8     1024        2 /
    httpd   8663 nobody    0r  VCHR          13,2           6815752 /devices/pseudo/mm@0:null
    httpd   8663 nobody    1w  VCHR          13,2           6815752 /devices/pseudo/mm@0:null
    httpd   8663 nobody    2w  VREG         136,8      185   145465 / (/dev/dsk/c0t0d0s0)
    httpd   8663 nobody    4r  DOOR                    0t0       58 /var/run/name_service_door (door to nscd[81]) (FA:->0x30002b156c0)
    httpd   8663 nobody   15w  VREG         136,8      185   145465 / (/dev/dsk/c0t0d0s0)
    httpd   8663 nobody   16u  IPv4 0x300046d27c0      0t0      TCP *:80 (LISTEN)
    httpd   8663 nobody   17w  VREG         136,8        0   145466 /var/apache/logs/access_log
    httpd   8663 nobody   18w  VREG         281,3        0  9518013 /var/run (swap)
I used the -a and -d parameters to filter the output and exclude the code segments, because I know which files I am looking for. The NAME column shows that two of the files (FDs 2 and 15) display the disk name instead of a file name, and that they are of type VREG (regular file). On Solaris, a deleted file is displayed with the name of the disk on which it resides. This clue tells you that the FD points to a deleted file. In fact, looking at /proc/8663/fd/15 gives you the data you are after.
If you can view the data through the file descriptor, you can use I/O redirection to copy it to a file, for example cat /proc/8663/fd/15 > /tmp/error_log. At that point, you can stop the daemon (which closes the FD and thereby removes the file for good), copy the temporary file to the desired location, and then restart the daemon.
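The whole recovery can be rehearsed safely on a throwaway file before you need it for real. The sketch below assumes a Linux-style /proc. The current shell plays the daemon, and every path is a temporary example rather than a real log location.

```shell
#!/bin/sh
# Assumption: a Linux-style /proc; all paths below are throwaway examples.
workdir=$(mktemp -d)
printf 'line one\nline two\n' > "$workdir/error_log"

exec 3<"$workdir/error_log"       # this shell plays the daemon holding the log
rm -f "$workdir/error_log"        # the "accident"

# Linux marks the link target as deleted, which is what lsof reports too.
link_target=$(readlink "/proc/$$/fd/3")
echo "fd 3 -> $link_target"

# Copy the data out through the descriptor, then put it back in place.
cat "/proc/$$/fd/3" > "$workdir/error_log.recovered"
exec 3<&-                         # the equivalent of stopping the daemon
mv "$workdir/error_log.recovered" "$workdir/error_log"

recovered=$(cat "$workdir/error_log")
echo "$recovered"
rm -rf "$workdir"
```

The order matters: the copy has to be made while the descriptor is still open, because closing it is what lets the system discard the data.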
This method of recovering deleted files is useful for many applications, especially log files and databases. As you have seen, some operating systems (and different versions of lsof) make it easier than others to find the data you need.
Finding network connections
Network connections are files too, which means you can use lsof to get information about them. You saw an example of this in Listing 2. That example assumed you already knew the PID, but sometimes you do not. If you know only the port, you can use the -i parameter to search by socket information. Listing 8 shows a search for TCP port 25.
Listing 8. Finding the process that listens on port 25
    # lsof -i :25
    COMMAND  PID USER   FD   TYPE        DEVICE SIZE/OFF NODE NAME
    sendmail 605 root    5u  IPv4 0x300010ea640      0t0  TCP *:smtp (LISTEN)
    sendmail 605 root    6u  IPv6 0x3000431c180      0t0  TCP *:smtp (LISTEN)
You pass the information to the lsof utility in the form [46][protocol][@host][:port], where 4 or 6 selects the IP version, protocol is TCP or UDP, host is a resolvable name or an IP address, and port is a number or a service name (from /etc/services). At least one of the elements (port, host, protocol, or IP version) is required. In Listing 8, :25 means port 25. The output shows that process 605 is listening on port 25 over both IPv6 and IPv4. If you are not interested in IPv4, you can change the filter to 6:25 to show only IPv6 sockets on port 25, or use just 6 to show all IPv6 connections.
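Because the filter is just a string assembled from optional parts, it can be built up in a script. The sketch below follows the order given in the lsof manual page (an optional 4 or 6, then the protocol, then @host, then :port). The function name build_inet_filter and its argument order are invented for this example.

```shell
#!/bin/sh
# build_inet_filter VERSION PROTOCOL HOST PORT
# Assemble an argument for lsof -i; pass "" for any part you do not need.
# (Hypothetical helper name and argument order, for illustration only.)
build_inet_filter() {
    filter="$1$2"                         # IP version (4 or 6), then TCP or UDP
    if [ -n "$3" ]; then
        filter="$filter@$3"               # host name or address
    fi
    if [ -n "$4" ]; then
        filter="$filter:$4"               # port number or service name
    fi
    printf '%s\n' "$filter"
}

build_inet_filter "" "" "" 25             # port 25, any protocol or version
build_inet_filter 6 "" "" 25              # IPv6 sockets on port 25
build_inet_filter "" TCP 192.168.1.10 ssh # TCP to or from one host, ssh port
```

Each printed string can be passed to lsof as the argument of -i, for example lsof -i "$(build_inet_filter 6 "" "" 25)". IPv6 addresses used as the host need to be enclosed in square brackets, which this sketch does not add for you.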
In addition to showing what the daemons are listening on, lsof can also find established connections, again using the -i parameter. Listing 9 shows a search for all connections to or from 192.168.1.10.
Listing 9. Search for active connections
    # lsof -i @192.168.1.10
    COMMAND  PID USER   FD   TYPE        DEVICE   SIZE/OFF NODE NAME
    sshd    1934 root    6u  IPv6 0x300046d21c0 0t1303608  TCP sun:ssh->linux:40379 (ESTABLISHED)
    sshd    1937 root    4u  IPv6 0x300046d21c0 0t1303608  TCP sun:ssh->linux:40379 (ESTABLISHED)
In this example, there are two IPv6 connections between sun and linux. A closer look shows that they belong to two different processes but are in fact the same connection, because the hosts and the ports are identical (ssh and 40379). This happens because the master process forks a handler for each incoming connection and passes the socket to it. You can also see that the computer named sun is using port 22 (ssh) and that linux is using port 40379. This means that sun is the recipient of the connection, because it is associated with the well-known port for the service. Port 40379 is the source, or ephemeral, port and is meaningful only for this connection.
Because sockets are just another kind of file, at least on UNIX, lsof can give you detailed information about these connections and tell you who is responsible for them.
Conclusion
UNIX uses files for a great many things. As a system administrator, you can use lsof to look into kernel memory and find out how the system is currently using those files. In its simplest use, lsof tells you which files a process has open and which processes have a given file open. This is especially valuable when you are gathering information about how an application works, or when you want to make sure a file is not in use before doing something that might corrupt data. More advanced uses help you find deleted files and get information about network connections. It is a powerful tool that can be used almost anywhere.
Original link: http://www.ibm.com/developerworks/cn/aix/library/au-lsof.html