Recently I found that a website hosted on our company server had become slow to load, and commands typed on the server itself were slow to respond. The troubleshooting steps were as follows:
1. Use the top command to check the server's CPU, memory, I/O, and other resource usage.
CPU usage was mostly above 80%. Memory was fine, with some to spare. The load average was about 40.
2. Use vmstat and iostat to check the relevant metrics. They confirmed that CPU usage was high and the CPU was saturated. My first thought was that the server had simply run out of CPU, but it runs only a few applications, and two CPUs should be enough.
3. Next, I went slowly through the processes, service threads, port usage, and network traffic (w, procinfo, ps, uptime, netstat). All I could see was that the application's logging was consuming most of the CPU.
4. Later I searched Baidu and found a post describing a similar problem: "Fixing disk space not being released after deleting files on CentOS" (source: blog.51cto.com, author: cj397428869).
4.1. The post reads as follows:
Symptom:
Check the current disk space usage:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             981M  203M  729M  22% /
none                   16G     0   16G   0% /dev/shm
/dev/sda9             2.9G   37M  2.7G   2% /tmp
/dev/sda7             4.9G  1.9G  2.7G  42% /usr
/dev/sda8             2.9G  145M  2.6G   6% /var
/dev/mapper/vghome-lvhome
                       20G   19G   11M 100% /home
/dev/mapper/vgoradata-lvoradata
                      144G   48G   90G  35% /u01/oradata
/dev/mapper/vgbackup-lvbackup
                      193G  7.8G  175G   5% /u01/backup
Use the following command to locate the files that are no longer needed and delete them:
# find /home/oracle/admin/dbticb/udump/ -name "dbticb_*.trc" -mtime +50 | xargs rm -rf
Checking disk space usage again, however, shows no change for /home:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             981M  203M  729M  22% /
none                   16G     0   16G   0% /dev/shm
/dev/sda9             2.9G   37M  2.7G   2% /tmp
/dev/sda7             4.9G  1.9G  2.7G  42% /usr
/dev/sda8             2.9G  145M  2.6G   6% /var
/dev/mapper/vghome-lvhome
                       20G   19G   11M 100% /home
/dev/mapper/vgoradata-lvoradata
                      144G   48G   90G  35% /u01/oradata
/dev/mapper/vgbackup-lvbackup
                      193G  7.8G  175G   5% /u01/backup
This was frustrating. The files had clearly been deleted, so why had the space not been released? rm should delete files outright. So what is still taking up space under /home?
# du -h --max-depth=1 /home
16K     /home/lost+found
2.6G    /home/oracle
2.6G    /home
du, however, reports the space as already freed, so I turned to Google.
Why the disk space was not freed:
On Linux or Unix systems, deleting a file with rm or a file manager only unlinks it from the file system's directory structure. However, if the file is still open (a process is using it), that process can still read it and the disk space stays occupied. What I had deleted was Oracle's alert log file, which would have been in use at the time it was deleted.
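The unlink behaviour described above can be reproduced safely with a temporary file. This is a minimal sketch for illustration only, not part of the quoted post; demo_unlinked_file is a hypothetical function name, and it assumes a POSIX shell with mktemp available.

```shell
# demo_unlinked_file
# Show that a file removed with rm stays readable through an open descriptor.
demo_unlinked_file() {
    tmp=$(mktemp) || return 1
    echo "still here" > "$tmp"
    exec 3< "$tmp"        # open the file on descriptor 3 in this shell
    rm -f "$tmp"          # unlink the name; the inode survives while fd 3 is open
    if [ -e "$tmp" ]; then
        echo "name still exists"
    else
        echo "name is gone"
    fi
    cat <&3               # the data is still readable through the descriptor
    exec 3<&-             # closing the last descriptor lets the kernel free the blocks
}

demo_unlinked_file
```

The function prints "name is gone" followed by "still here": the directory entry has been removed, yet the contents remain on disk until the descriptor is closed, which is why du (which walks directory entries) and df (which counts allocated blocks) disagree.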
Solution:
First, get a list of the files that have been deleted but are still held open by an application:
# lsof | grep deleted
oracle 12639 oracle  5w  REG 253,0         648 215907 /home/oracle/admin/dbticb/udump/dbticb_ora_12637.trc (deleted)
oracle 12639 oracle  6w  REG 253,0 16749822091 215748 /home/oracle/admin/dbticb/bdump/alert_dbticb.log (deleted)
oracle 12639 oracle  7u  REG 253,0           0  36282 /home/oracle/oracle/product/10.2.0/db_1/dbs/lkinstdbticb (deleted)
oracle 12639 oracle  8w  REG 253,0 16749822091 215748 /home/oracle/admin/dbticb/bdump/alert_dbticb.log (deleted)
oracle 12641 oracle  5w  REG 253,0         648 215907 /home/oracle/admin/dbticb/udump/dbticb_ora_12637.trc (deleted)
oracle 12641 oracle  6w  REG 253,0 16749822091 215748 /home/oracle/admin/dbticb/bdump/alert_dbticb.log (deleted)
.
.
oracle 23492 oracle  6w  REG 253,0 16749822091 215748 /home/oracle/admin/dbticb/bdump/alert_dbticb.log (deleted)
oracle 23492 oracle  7u  REG 253,0           0  36282 /home/oracle/oracle/product/10.2.0/db_1/dbs/lkinstdbticb (deleted)
oracle 23492 oracle  8w  REG 253,0 16749822091 215748 /home/oracle/admin/dbticb/bdump/alert_dbticb.log (deleted)
oracle 23494 oracle 10u  REG 253,0           0  36307 /home/oracle/oracle/product/10.2.0/db_1/dbs/lkinstrmandb (deleted)
The output shows that /home/oracle/admin/dbticb/bdump/alert_dbticb.log is still in use, so its space has not been freed.
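To estimate how much space such files are holding, the lsof listing can be totalled. The sketch below is an illustration rather than something from the quoted post. sum_deleted_bytes is a hypothetical helper, and it assumes lsof's default nine-column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) with no spaces in the command name; builds that add a TID column would need different field numbers.

```shell
# sum_deleted_bytes
# Read lsof output on stdin and print the total size in bytes of regular
# files marked "(deleted)", counting each device/inode pair only once.
# Assumed columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sum_deleted_bytes() {
    awk '$5 == "REG" && /\(deleted\)$/ {
             key = $6 ":" $8            # DEVICE:NODE identifies one file
             if (!(key in seen)) { seen[key] = 1; total += $7 }
         }
         END { printf "%.0f\n", total }'
}

# Usage on a live system (needs lsof, usually run as root):
#   lsof | sum_deleted_bytes
```

Deduplicating by device and inode matters here: in the listing above the same alert log (inode 215748) appears under many PIDs and descriptors, so a naive sum would count its 16749822091 bytes (roughly 15.6 GiB) many times over. Counted once, it accounts for most of the gap between the 19G reported by df and the 2.6G reported by du.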
How do you get the process to release it?
One way is to kill the corresponding process, or stop the application that is using the file, and let the OS reclaim the disk space automatically.
In my environment many processes are using this file, so stopping them is troublesome and also carries considerable risk.
For every running process, the Linux kernel creates a directory under /proc named after its PID (/proc/nnnn, where nnnn is the PID). This directory holds information about the process, and its fd subdirectory (/proc/nnnn/fd/) holds the file descriptors (fd) of all the files the process has open.
Instead of killing the process, you can truncate the file through the proc file system, which forces the system to reclaim the space allocated to the file in use.
This is an advanced technique and should be used only when the administrator is sure the running process will not be affected. Applications do not handle this well, and truncating a file that is in use can cause unpredictable problems.
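For reference, the truncation technique described above might look like the sketch below. The post does not show a command, so this is an assumption about how it could be done, not the author's method. truncate_open_fd and the PROC_ROOT override are hypothetical names introduced for illustration; PROC_ROOT exists only so the function can be tried against a scratch directory before touching a real process.

```shell
# truncate_open_fd PID FD
# Truncate the file behind descriptor FD of process PID to zero length by
# opening its /proc entry for writing. PROC_ROOT is a hypothetical override
# (default /proc) for dry runs against a scratch directory.
truncate_open_fd() {
    target="${PROC_ROOT:-/proc}/$1/fd/$2"
    if [ ! -e "$target" ] && [ ! -L "$target" ]; then
        echo "no such descriptor: $target" >&2
        return 1
    fi
    : > "$target"     # the redirection opens the file with truncation
}

# Usage, with PID and FD taken from the lsof listing ("6w" means fd 6):
#   truncate_open_fd 12639 6
```

The same caution applies as in the text: the process keeps its descriptor and knows nothing about the truncation, so this is only appropriate when you are confident the application tolerates it.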
So I chose to solve it by stopping the application instead.
After restarting the Oracle database, the space held by /home/oracle/admin/dbticb/bdump/alert_dbticb.log was released.
Checking disk space usage again shows that the space has been reclaimed:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             981M  203M  729M  22% /
none                   16G     0   16G   0% /dev/shm
/dev/sda9             2.9G   37M  2.7G   2% /tmp
/dev/sda7             4.9G  1.9G  2.7G  42% /usr
/dev/sda8             2.9G  145M  2.6G   6% /var
/dev/mapper/vghome-lvhome
                       20G  2.6G   16G  15% /home
/dev/mapper/vgoradata-lvoradata
                      144G   48G   90G  35% /u01/oradata
/dev/mapper/vgbackup-lvbackup
                      193G  7.8G  175G   5% /u01/backup
OK, problem solved. Now on to the cleanup work.
-------------------------------------------------------------------------------------------------
4.2. I ran ll /proc/pid/fd to look at the files in that directory. Many of them were highlighted in red and marked as deleted.
4.3. I also ran lsof | grep deleted to find files that had been deleted but whose space had not yet been reclaimed.
4.4. After confirming that a process was no longer in use, I killed it directly with kill -9 pid. After killing some of these processes, top showed that CPU usage had dropped considerably, so I went on to clean up the other processes holding deleted files.
4.5. The CPU calmed down, dropping to 1%, and the load average fell from the original 40 to 0.x.
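To put the load figures in context (about 40 before and 0.x after, on a server described earlier as having two CPUs), the load average can be divided by the core count. This is a minimal sketch and was not part of the original troubleshooting. load_per_core is a hypothetical helper name, and the commented usage line assumes a Linux system with /proc/loadavg and nproc.

```shell
# load_per_core LOAD CORES
# Print a load average divided by the number of CPU cores, to one decimal.
# On Linux a value well above 1 suggests more runnable (or uninterruptibly
# waiting) tasks than there are cores to serve them.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.1f\n", load / cores }'
}

# Usage on a live Linux system:
#   load_per_core "$(cut -d " " -f 1 /proc/loadavg)" "$(nproc)"

# The figures from this article: load average about 40 on two CPUs.
load_per_core 40 2
```

With the article's numbers this prints 20.0, which means roughly twenty tasks were competing for each CPU before the cleanup.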
However, one problem remains. Two files are marked as deleted, but their symbolic links point to files currently used by a running application. I compared the application's process ID with that of the process holding the deleted files, and the two were different, so I killed the latter directly, only to find that the running application hung. After I restarted the application, lsof | grep deleted still listed that file, still marked as deleted, and this time the process IDs matched. When I stop the application the file disappears from the list, and when I start it the file appears again. The file is generated automatically by the application and cannot be deleted, so this part of the problem remains unsolved.
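When comparing process IDs as above, it can help to list exactly which descriptors a single process holds on deleted files, which is what ll /proc/pid/fd shows in red. The sketch below is one way to do that, offered as an assumption rather than the author's procedure. deleted_fds and the PROC_ROOT override are hypothetical names, and it relies on readlink, which is available on common Linux systems.

```shell
# deleted_fds PID
# Print "FD -> target" for every descriptor of PID whose link target ends
# in "(deleted)". PROC_ROOT is a hypothetical override (default /proc)
# so the function can be tried against a scratch directory.
deleted_fds() {
    for link in "${PROC_ROOT:-/proc}/$1"/fd/*; do
        target=$(readlink "$link") || continue   # skip unreadable or missing entries
        case $target in
            *"(deleted)") echo "${link##*/} -> $target" ;;
        esac
    done
}

# Usage (reading another user's /proc/PID/fd normally requires root);
# 1234 is a placeholder PID:
#   deleted_fds 1234
```

Running it before and after restarting the application would show whether the same path reappears under the new PID, as observed above.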
This article comes from the cloud blog. If you repost it, please keep this source: http://weimouren.blog.51cto.com/7299347/1846835
Handling excessive server CPU utilization