Recently a few of our services have repeatedly run out of threads and exhausted server resources, so I wrote a Nagios script that monitors the thread count of a given process against the maximum allowed. On a UNIX server the default maximum number of threads is 1024, which is certainly not enough on a busy server. Of course, a production environment is usually tuned when it is first set up, for example by raising the maximum number of open file handles. Such changes normally go into /etc/security/limits.conf, but the maximum number of threads is changed in /etc/security/limits.d/90-nproc.conf. That file is edited in the same way as limits.conf, so I will not explain it in detail here. I generally raise the maximum number of threads for all users:
* soft nproc 65535
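To confirm that an entry like the one above is actually in place, a quick check can help. The following is a minimal sketch and not part of the original script: the helper name show_nproc_entries is hypothetical, and it assumes the file uses the usual four-column "domain type item value" layout.

```shell
#!/bin/bash
# Hypothetical helper (not from the original article): list the nproc
# entries of a limits-style file, skipping comment lines.
# Assumes the usual "<domain> <type> <item> <value>" column layout.
# Prints one "<domain> <type> <value>" line per entry.
show_nproc_entries() {
    awk '!/^[[:space:]]*#/ && $3 == "nproc" { print $1, $2, $4 }' "$1"
}
```

For example, show_nproc_entries /etc/security/limits.d/90-nproc.conf lists the configured values. Limits from these files are applied when a session is opened, so a new login is needed before ulimit -u in a shell reflects a change.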
Once the limit has been raised, the alert threshold can be adjusted to suit the actual situation. The script itself is very simple:
#!/bin/bash
# check_pstree.sh
# Used for pstree process monitoring
# Writer: jim
# History: 2017.07.01

# Nagios return values
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# Validate the arguments passed in
if [ $# -lt 1 ]; then
    echo "Please enter the process string"
    echo "Ex> $0 java"
    exit $STATE_UNKNOWN
fi
if [ $# -gt 1 ]; then
    echo "Too many input parameters"
    echo "Ex> $0 java"
    exit $STATE_UNKNOWN
fi

reg_name=$1
# Take the first matching PID, because pstree -p accepts only one.
# This script is excluded as well, since its own command line contains the search string.
process_pid=$(ps -ef | grep "$reg_name" | grep -v grep | grep -v "$(basename "$0")" | awk '{print $2}' | head -n 1)
if [ -z "$process_pid" ]; then
    echo "No process matching $reg_name was found"
    exit $STATE_UNKNOWN
fi

declare -i max_process_num=$(ulimit -a | grep "max user processes" | awk '{print $5}')
# The alert threshold is 50% of the maximum number of threads;
# adjust it to suit the actual production environment
declare -i warning_num=$max_process_num/2
pstree_num=$(pstree -p "$process_pid" | wc -l)

if [ "$pstree_num" -le "$warning_num" ]; then
    echo "$reg_name pstree number is: $pstree_num; warning_num is: $warning_num; max user processes is: $max_process_num, OK"
    exit $STATE_OK
else
    echo "Error!!! The number of pstree is too high. The number is: $pstree_num"
    exit $STATE_CRITICAL
fi
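The script above uses the number of lines printed by pstree -p as an approximation of the thread count. As an alternative sketch (a swapped-in technique, not the author's method), on Linux the kernel reports the thread count of a process in the Threads: field of /proc/<pid>/status. The function names thread_count and nagios_state below are hypothetical, and the 50% threshold simply mirrors check_pstree.sh.

```shell
#!/bin/bash
# Hypothetical helpers, assuming a Linux /proc filesystem.

# Print the number of threads of the PID given as $1, read from the
# "Threads:" field of /proc/<pid>/status.
thread_count() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Print the Nagios exit code for a thread count ($1) and a maximum ($2):
# 0 (OK) while the count is at most half of the maximum, otherwise
# 2 (CRITICAL), mirroring the 50% threshold of check_pstree.sh.
nagios_state() {
    if [ "$1" -le $(( $2 / 2 )) ]; then
        echo 0
    else
        echo 2
    fi
}
```

A check could then call thread_count with the PID it found and pass the result, together with the max user processes value, to nagios_state to decide the exit code.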
Of course, with some modification this script can also be run as a cron job for periodic checks. One problem remains: for reasons I have not identified, the thread count shown on the Nagios monitoring page is always abnormal, although running the script directly works without any problem.
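One thing that may be worth ruling out (an assumption, not a confirmed cause of the discrepancy): resource limits are inherited per process, so a check started by the Nagios daemon can run as a different user and see a different max user processes value than an interactive shell. The sketch below, with the hypothetical function name report_limits, prints both so the two runs can be compared. It assumes Linux, where /proc/self/limits lists the soft and hard limits of the reading process.

```shell
#!/bin/bash
# Hypothetical diagnostic, assuming Linux: print the effective user and the
# soft/hard "Max processes" limits of the calling environment on one line.
report_limits() {
    # Fall back to the numeric UID if the user name cannot be resolved.
    printf 'user=%s ' "$(id -un 2>/dev/null || id -u)"
    # Columns of /proc/self/limits: name (two words), soft, hard, units.
    awk '/^Max processes/ { print "soft_nproc=" $3, "hard_nproc=" $4 }' /proc/self/limits
}
```

Calling report_limits once from a terminal and once from inside the check, for example by appending its output to a log file, shows whether the two environments differ.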
This article is from the "Technical essay" blog; reprinting is not permitted.
Monitoring the maximum number of threads of server processes in Nagios