Analyzing a 100% disk space alarm: why df shows 100% while du does not
Preface:
This morning a disk alarm fired, and I had only just cleared the tomcat and nginx logs with commands like echo "" > show_web-error.log or > show_web-debug.log, then removed a few tar.gz packages with rm -rf, which freed about 30 GB. I also turned off tomcat's debug output. Now the alarm has fired again and the disk is at 100%. What is going on?
1. Run df -h; it shows 100%, as below:
[root@localhost ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      113G  113G     0 100% /
/dev/sda1              99M   13M   82M  14% /boot
tmpfs                 8.8G     0  8.8G   0% /dev/shm
The root file system really is at 100%, so the next step is to look under /.
2. Go to the root directory / and run du -sh *
[root@localhost ~]# cd /
[root@localhost /]# du -sh *
7.8M    bin
6.9M    boot
131M    data
196K    dev
111M    etc
178M    home
131M    lib
23M     lib64
119M    logs
16K     lost+found
8.0K    media
0       misc
8.0K    mnt
0       net
0       nohup.out
3.8G    opt
15M     pcre-8.33
2.1M    pcre-8.33.zip
du: cannot access "proc/11575/task/11575/fd/1565": No such file or directory
du: cannot access "proc/15403/task/14464/fd/625": No such file or directory
0       proc
1.4G    product
153M    repo
143M    root
37M     sbin
8.0K    selinux
363M    soft
8.0K    srv
0       sys
20K     temp
100K    tftpboot
2.1G    tmp
8.6G    usr
184M    var
30M     varnish-3.0.3
56M     zabbix-2.0.8
[root@localhost /]#
Everything du can see adds up to less than 30 GB, yet df -h reports 100% of 113 GB used. Where does the difference come from?
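To put a number on that difference, the two views can be compared side by side. The following is only a sketch: df_du_gap is a made-up helper name, it assumes a df and du that support the -P, -k, -s and -x options, and it should be run as root so du can read every directory.

```shell
#!/bin/bash
# Compare what df and du report for one file system (sizes in KiB).
# df_du_gap is a hypothetical helper name, not something from the original post.
df_du_gap() {
    local mp=${1:-/} df_used du_used
    # -P forces one line per file system, so long device names do not wrap
    df_used=$(df -Pk "$mp" | awk 'NR==2 {print $3}')
    # -x keeps du on this one file system; unreadable entries are ignored
    du_used=$(du -skx "$mp" 2>/dev/null | awk '{print $1}')
    echo "df_used_kb=${df_used:-0} du_used_kb=${du_used:-0} gap_kb=$(( ${df_used:-0} - ${du_used:-0} ))"
}

# Example: df_du_gap /
```

A small gap is normal. A gap of tens of gigabytes, as in this case, suggests blocks that are allocated but not reachable through any directory entry.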
3. Searching Baidu and Google turned up http://www.chinaunix.net/old_jh/6/465673.html, which contains the following two passages:
(1):
When you open a file, you get a pointer. Subsequent writes to this file
reference this file pointer. The write call does not check to see if the file
is there or not. It just writes the specified number of characters starting
at a predetermined location. Regardless of whether the file exists or not, disk
blocks are used by the write operation.
The df command reports the number of disk blocks used while du goes through the
file structure and reports the number of blocks used by each directory. As
far as du is concerned, the file used by the process does not exist, so it does
not report blocks used by this phantom file. But df keeps track of disk blocks
used, and it reports the blocks used by this phantom file.
And the reply from forum user leolein:
Thank you for that.
I had deleted some expired files as soon as the disk filled up. The application was probably still holding handles to those files, which caused my problem.
After I stopped all the applications, du and df reported roughly the same numbers.
(2):
This section gives the technical explanation of why du and df sometimes report
different totals of disk space usage.
When a program that is running in the background writes to a file, and while the
process is running the file to which this process is writing is deleted,
running df and du shows a discrepancy in the amount of disk space usage.
The df command shows a higher value.
If a file has been deleted but leftover processes still reference it (I am not sure how best to put this), the usage reported by df does not drop by the size of the deleted file. Whether there is enough space to create or write a file is presumably decided by what df sees (I think), so when df shows 100% you cannot write to a file, although creating one still works; I tested this. lsof can be used to find the leftover processes:
# lsof /home | grep /home/oracle/osinfo | sort +8 | grep '^.*070920.*$'
sadc 17821 root 3w REG 253,1 326492112 926724 /home/oracle/osinfo/070920sar.data (deleted)
sadc 17861 root 3u REG 253,1 326492112 926724 /home/oracle/osinfo/070920sar.data (deleted)
sadc 17981 root 3u REG 253,1 326492112 926724 /home/oracle/osinfo/070920sar.data (deleted)
top  17858 root 1w REG 253,1 169919916 927111 /home/oracle/osinfo/070920top.data (deleted)
top  17977 root 1w REG 253,1 169919916 927111 /home/oracle/osinfo/070920top.data (deleted)
Note the (deleted) marker at the end of each line.
Killing all of these processes frees the space.
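If lsof is not installed, the same information can be read from /proc on Linux. The sketch below is an illustration under assumptions: list_deleted_open_files is a made-up function name, it relies on the Linux /proc layout and GNU stat, and without root it only sees your own processes.

```shell
#!/bin/bash
# List file descriptors that still point at deleted files by reading /proc.
# Prints one line per descriptor: PID FD SIZE_BYTES TARGET
# list_deleted_open_files is a hypothetical helper, not part of any package.
list_deleted_open_files() {
    local fd target pid num size
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for descriptors we may not inspect; skip those
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            /*" (deleted)") ;;   # a path whose name has been unlinked
            *) continue ;;
        esac
        pid=${fd#/proc/}; pid=${pid%%/*}
        num=${fd##*/}
        # stat -L follows the descriptor to the inode that is still allocated
        size=$(stat -L -c %s "$fd" 2>/dev/null) || size='?'
        printf '%s %s %s %s\n' "$pid" "$num" "$size" "$target"
    done
}

# Example: list_deleted_open_files | sort -k3,3 -rn | head
```

Sorting on the third column puts the largest deleted files first, which points at the processes worth restarting. Where lsof is available, lsof +L1 (open files with a link count below 1) should give a similar list.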
Then I remembered: when I ran commands like echo "" > shop_web.log this morning, I had not stopped the tomcat applications, so they kept their log files open and kept writing to them. Right after I ran the > redirections, both du -sh * and df -h showed the space as freed. My understanding is that the running tomcat processes were still holding on to the large files they had originally opened, so that space was never really returned to the file system; du cannot see such files, which is why it adds up to less than 30 GB while df is back at 100%. The only reason the alarm stayed quiet for most of the day was the space freed by deleting the tar.gz packages.
4. Restart the tomcat and nginx applications.
So I needed to restart tomcat and nginx so that the applications would let go of the old files they still held open. Because there are many tomcat instances, I wrote a script to do the restart:
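[root@localhost local]# cat /root/start_tomcat_port.sh
#!/bin/bash
# Usage: start_tomcat_port.sh <port>; restarts the tomcat instance installed for that port.
# Find the PID(s) of that instance, excluding the grep itself and this script.
PID=`ps -eaf | grep apache-tomcat-6.0.37_$1 | grep -v grep | grep -v start_tomcat_port | awk '{print $2}'`
echo $1
echo $PID
# Force-kill the old process so that every file handle it holds is closed.
kill -9 $PID
rm -rf /var/tomcat/$1.pid
/usr/local/apache-tomcat-6.0.37_$1/bin/startup.sh
[root@localhost local]#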
Restart tomcat:
sh /root/start_tomcat_port.sh 6100;
sh /root/start_tomcat_port.sh 6200;
sh /root/start_tomcat_port.sh 6300;
sh /root/start_tomcat_port.sh 6400;
sh /root/start_tomcat_port.sh 6500;
sh /root/start_tomcat_port.sh 6700;
sh /root/start_tomcat_port.sh 7100;
sh /root/start_tomcat_port.sh 7200;
sh /root/start_tomcat_port.sh 7300;
Restart nginx:
service nginx restart
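The nine tomcat invocations could also be written as a loop. This is just a sketch: restart_tomcats and the RESTART_SCRIPT variable are names invented here, with the script path defaulting to the one used in this post.

```shell
#!/bin/bash
# Run the per-port restart script for every port given on the command line.
# restart_tomcats and RESTART_SCRIPT are hypothetical names for illustration.
restart_tomcats() {
    local script=${RESTART_SCRIPT:-/root/start_tomcat_port.sh} port
    for port in "$@"; do
        # Report a failure but carry on with the remaining ports
        sh "$script" "$port" || echo "restart failed for port $port" >&2
    done
}

# Example: restart_tomcats 6100 6200 6300 6400 6500 6700 7100 7200 7300
```

Keeping the port list in one place makes it harder to forget an instance, which matters here because any instance left running keeps its old files open.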
5. Check the disk space again.
[root@localhost local]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      113G   18G   90G  17% /
/dev/sda1              99M   13M   82M  14% /boot
tmpfs                 8.8G     0  8.8G   0% /dev/shm
[root@localhost local]#
df -h is back to normal: 90 GB is available again, disk usage is only 17%, and the nagios alarm has cleared.
6. Summary of the underlying principles
How the two commands work:
du -s walks the directory tree and adds up the blocks used by every file, directory, and symbolic link under the specified path;
df reads the file system's own block allocation map to get the total number of blocks and the number still free.
du is a user-level program and does not count metadata (disk blocks the file system allocates for its own use).
PS: if a file has been deleted but the application's handle to it has not been closed, df still counts its blocks and therefore reports less free space; du does not see the file at all.
Example:
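#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    /* tempfile must already exist in the current directory */
    if (open("tempfile", O_RDWR) < 0) {
        fprintf(stderr, "open error\n");
        exit(-1);
    }
    /* Remove the name; the blocks stay allocated while the descriptor is open */
    if (unlink("tempfile") < 0) {
        fprintf(stderr, "unlink error\n");
        exit(-1);
    }
    printf("file unlinked\n");
    sleep(15);  /* in this window df still counts the file, du cannot see it */
    printf("done\n");
    exit(0);    /* the descriptor is closed at exit and the blocks are released */
}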
On Linux, the disk appears to have free space, yet usage still shows 100% and the business system misbehaves. How can this be solved?
Linux reserves about 5% of a partition for the root user, so that the system can still be managed when the disk is full:
df -h shows the overall situation.
dumpe2fs /dev/sda6 | grep -i "block coun" shows the total block count and the reserved block count.
For example:
Block count: 3755264
Reserved block count: 187763
So the reserved fraction is
187763 / 3755264 ≈ 0.05, that is, about 5%.
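The same arithmetic can be done directly on the dumpe2fs output. A minimal sketch, assuming the two field labels appear exactly as in the example above; reserved_pct is a made-up helper name.

```shell
#!/bin/bash
# Read dumpe2fs-style output on stdin and print the reserved share in percent.
# reserved_pct is a hypothetical helper; it only parses the two lines shown above.
reserved_pct() {
    awk -F: '
        /^Block count:/          { total = $2 }
        /^Reserved block count:/ { reserved = $2 }
        END {
            if (total > 0) printf "%.2f\n", reserved / total * 100
        }'
}

# Example (needs root): dumpe2fs -h /dev/sda6 2>/dev/null | reserved_pct
```

Fed the two numbers from the example, it prints 5.00.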
Solution:
cd into the directory where /dev/sda6 is mounted
du -h
See which directories are large, then keep descending into them with du -h until you find the large files that are no longer needed.
What can be done when df -i shows 100% inode usage for the / directory on Linux?
The only remedy is to delete files that have been written to the file system mounted at /.
Check whether /tmp contains files that are no longer needed and whether /var/log holds large or numerous log files.
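When inodes rather than blocks run out, the number of files matters more than their size. The sketch below counts entries per top-level subdirectory to show where they pile up; inode_usage_by_subdir is a made-up name, and the count treats every directory entry as one inode, which overstates things slightly when hard links are involved.

```shell
#!/bin/bash
# Count directory entries under each immediate subdirectory of a path,
# largest first. Each entry normally corresponds to one inode.
# inode_usage_by_subdir is a hypothetical helper for illustration.
inode_usage_by_subdir() {
    local root=${1:-/} d count
    for d in "$root"/*/; do
        [ -d "$d" ] || continue
        # -xdev stops find from crossing into other mounted file systems
        count=$(find "$d" -xdev 2>/dev/null | wc -l)
        printf '%d %s\n' "$((count))" "${d%/}"
    done | sort -rn
}

# Example: inode_usage_by_subdir /var | head
```

Running it again on whichever directory tops the list narrows the search one level at a time, much like the du -h approach described above for block usage.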