Solution to insufficient space in /var/spool/clientmqueue on Linux
Today, I received an alert email with the following content:
------------------------------------
Alert content: Free disk space is less than 15% on volume /var
------------------------------------
Alarm level: PROBLEM
------------------------------------
Monitoring item: Free disk space on /var (percentage): 10%
------------------------------------
Alert time:
The alert itself is clear: the /var file system is running out of space. Let's take a look at what is going on.
Running df -h on the server shows that only 9% of /var is free, down from the 10% reported in the alert, so the space is still shrinking.
# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda3   7.8G  908M  6.5G   13%   /
/dev/sda6   7.8G  6.7G  746M   91%   /var
/dev/sda5   7.8G  2.0G  5.5G   27%   /usr
/dev/sda1   122M  12M   104M   10%   /boot
tmpfs       48G   36K   48G    1%    /dev/shm
/dev/shm    48G   36K   48G    1%    /tmp
/dev/sda7   497G  391G  81G    83%   /home
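The same check the alert performs can be approximated from the command line. The following is a minimal sketch, assuming the POSIX output format of df -P; filter_low_space is a hypothetical helper name, and the 15% default simply mirrors the threshold in the alert above.

```shell
# filter_low_space (hypothetical helper): reads `df -P` output on stdin and
# prints every mount point whose free percentage is below the threshold.
# Mount points containing spaces are not handled by this sketch.
filter_low_space() {
    awk -v min="${1:-15}" 'NR > 1 {
        used = $5              # Capacity column, e.g. "91%"
        sub(/%/, "", used)     # strip the percent sign
        free = 100 - used
        if (free < min) print $6 ": " free "% free"
    }'
}

# Usage:
#   df -P | filter_low_space 15
```

Fed the /var line from the listing above, it reports /var at 9% free and stays silent for the other file systems.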
A closer look shows a huge number of files in /var/spool/clientmqueue, and they account for most of the used space.
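The article does not show how the directory was located, so here is one possible way to narrow it down: a minimal sketch, assuming GNU du and find as shipped on most Linux systems. Both largest_dirs and count_files are hypothetical helper names.

```shell
# largest_dirs (hypothetical helper): print the N largest entries among a
# directory and its immediate subdirectories, sizes in KB. The first line
# is normally the total for the directory itself.
largest_dirs() {
    dir=${1:-/var}
    count=${2:-5}
    # -k: sizes in kilobytes; -x: do not cross into other file systems
    du -kx --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# count_files (hypothetical helper): count the regular files directly inside
# a directory without sorting them, which matters when there are very many.
count_files() {
    find "$1" -maxdepth 1 -type f | wc -l
}

# Usage:
#   largest_dirs /var 5
#   largest_dirs /var/spool 5
#   count_files /var/spool/clientmqueue
```

Repeating largest_dirs one level deeper each time leads to the directory that is actually consuming the space.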
Looking at the content of one of these files shows that it is the output of a script that checks the listeners.
# more dfs32Ct1KE012443
Start: 20140402205501
Checking listener... OK
Checking listener listener_1525... OK
Checking listener listener_1528... OK
Checking listener listener_1523... OK
Checking listener listener_1522... OK
Last: 20140402205516
To verify this further, list the five most recent files and view the content of the latest one.
clientmqueue]# ls -lrt | tail -5
-rw ---- 1 oracle smmsp 228 Oct 7 dft97221lc0000015
-rw ---- 1 oracle smmsp 919 Oct 7 qft97231cW026036
-rw ---- 1 oracle smmsp 228 Oct 7 dft97231cW026036
-rw ---- 1 oracle smmsp 919 Oct 7 qft97241rm007778
-rw ---- 1 oracle smmsp 228 Oct 7 dft97241rm007778
clientmqueue]# more dft97241rm007778
Start: 20151007100401
Checking listener... OK
Checking listener listener_1525... OK
Checking listener listener_1528... OK
Checking listener listener_1523... OK
Checking listener listener_1522... OK
Last: 20151007100416
This essentially confirms the cause: the listener check script is generating a large number of these files.
These files are of little use to us, so they can simply be deleted. Deleting everything in one command still fails with an error, though (with this many files, typically because the argument list is too long), so delete them slowly in batches:
# ls | xargs -n 10 rm
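One caveat with piping ls into xargs is that file names containing whitespace are split into several arguments. A more defensive variant is sketched below, assuming GNU find and xargs; purge_queue is a hypothetical helper name, and the df*/qf* name filter is an assumption based on the file names shown in the listings here.

```shell
# purge_queue (hypothetical helper): delete sendmail queue files (df*/qf*)
# directly inside the given directory, in batches, leaving other files alone.
purge_queue() {
    queue_dir=$1
    # -print0 / -0 pass names NUL-terminated, so odd file names are safe.
    # -n 100 keeps each rm invocation well under the argument-length limit.
    # -r skips running rm altogether when nothing matches.
    find "$queue_dir" -maxdepth 1 -type f -name '[dq]f*' -print0 |
        xargs -0 -r -n 100 rm -f --
}

# Usage (only once you are sure the queued messages are not needed):
#   purge_queue /var/spool/clientmqueue
```

Because each rm call receives at most 100 names, the command line never grows with the number of files in the directory.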
The space is released as soon as the files are deleted; nearly 6 GB was freed.
# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda3   7.8G  908M  6.5G   13%   /
/dev/sda6   7.8G  1.1G  6.4G   15%   /var
/dev/sda5   7.8G  2.0G  5.5G   27%   /usr
/dev/sda1   122M  12M   104M   10%   /boot
tmpfs       48G   36K   48G    1%    /dev/shm
/dev/shm    48G   36K   48G    1%    /tmp
/dev/sda7   497G  391G  81G    83%   /home
With the immediate problem solved, let's look at the cause. If a job in crontab produces output, cron sends that output as mail to the user who owns the job. If sendmail is not running at that point, the messages cannot be delivered and pile up as files in this directory.
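Since every queued message belongs to the user whose cron job produced it, grouping the files by owner points to the crontab that needs attention. This is a minimal sketch, assuming GNU find (for -printf); queue_by_owner is a hypothetical helper name.

```shell
# queue_by_owner (hypothetical helper): print "count owner" lines for the
# regular files directly inside a directory, busiest owner first.
queue_by_owner() {
    find "$1" -maxdepth 1 -type f -printf '%u\n' | sort | uniq -c | sort -rn
}

# Usage (run as root or another account that can read the directory):
#   queue_by_owner /var/spool/clientmqueue
```

In the listings here every file is owned by oracle, so that user's crontab is the place to look.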
First, look at the most recent files. They are being created at a very high rate, with new files appearing almost every minute.
clientmqueue]# ll
total 64
-rw ---- 1 oracle smmsp 228 Oct 7 dft97281ag005351
-rw ---- 1 oracle smmsp 228 Oct 7 dft97292uq011660
-rw ---- 1 oracle smmsp 228 Oct 7 dft972c1x201725752
-rw ---- 1 oracle smmsp 228 Oct 7 dft972D11d025507
-rw ---- 1 oracle smmsp 228 Oct 7 dft972E1IS008404
-rw ---- 1 oracle smmsp 228 Oct 7 dft972F1Oi023669
-rw ---- 1 oracle smmsp 228 Oct 7 dft972G1Xr006590
-rw ---- 1 oracle smmsp 228 Oct 7 dft972H1I8022068
-rw ---- 1 oracle smmsp 919 Oct 7 qft97281ag005351
-rw ---- 1 oracle smmsp 919 Oct 7 qft97292uq011660
-rw ---- 1 oracle smmsp 919 Oct 7 qft972c1x201725752
-rw ---- 1 oracle smmsp 919 Oct 7 qft972D11d025507
-rw ---- 1 oracle smmsp 919 Oct 7 qft972E1IS008404
-rw ---- 1 oracle smmsp 919 Oct 7 qft972F1Oi023669
-rw ---- 1 oracle smmsp 919 Oct 7 qft972G1Xr006590
-rw ---- 1 oracle smmsp 919 Oct 7 qft972H1I8022068
Check crontab -l. The schedule runs the script in almost every minute of the hour:
2-9,12-29,31-59 * * * * . $HOME/.xxxxprofile; $HOME/dbadmin/scripts/lsnr_check.sh
In fact, the listener check does not need to run this often, and the frequency can be reduced as appropriate. On our machines the listener check is generally set to run twice an hour, for example:
9,39 * * * * . $HOME/.xxxxprofile; bash $HOME/dbadmin/scripts/lsnr_check.sh
Cleaning up the files and lowering the frequency slows the growth, but this only treats the symptom, not the root cause.
The checks can run as a background task without producing output on every run. One way is to discard the output entirely, for example:
9,39 * * * * . $HOME/.xxxxprofile; bash $HOME/dbadmin/scripts/lsnr_check.sh >/dev/null 2>&1
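Discarding all output also hides genuine failures. A possible middle ground is a small wrapper that stays silent on success and replays the captured output only when the command fails, so cron has something to mail only when there is a real problem. This is a sketch, not part of the original setup: run_quiet is a hypothetical name, and it assumes that lsnr_check.sh exits with a non-zero status when a listener check fails, which the article does not confirm.

```shell
# run_quiet (hypothetical wrapper): run a command, print nothing on success,
# and print the captured stdout/stderr only if the command fails.
run_quiet() {
    log=$(mktemp) || return 1
    status=0
    # Capture both streams; record the exit status without aborting.
    "$@" >"$log" 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
        cat "$log"
    fi
    rm -f "$log"
    return "$status"
}

# Usage inside a script called from cron:
#   run_quiet bash "$HOME/dbadmin/scripts/lsnr_check.sh"
```

Alternatively, cron implementations derived from Vixie cron accept a MAILTO="" line at the top of the crontab, which suppresses mail for every job in that file.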
That wraps up this problem. A seemingly small setting can grow into a major problem after years of accumulation, and a monitoring frequency that is set too high can itself create hidden problems.