System: CentOS 6.4
Server: hostname: Nagios-Server, kernel: 2.6.32-358.el6.x86_64, IP: 1.1.1.26
Client: hostname: Nagios-Client, kernel: 2.6.32-358.el6.x86_64, IP: 1.1.1.27
Upgrade the kernel:
[root@Nagios-Server ~]# yum install ntpdate -y #install ntpdate for time synchronization
[root@Nagios-Server ~]# /usr/sbin/ntpdate time.nist.gov #synchronize the clock
[root@Nagios-Server ~]# yum install kernel kernel-devel gcc gcc-c++ wget vim -y #upgrade the kernel and install build tools
[root@Nagios-Client ~]# yum install ntpdate -y
[root@Nagios-Client ~]# /usr/sbin/ntpdate time.nist.gov
[root@Nagios-Client ~]# yum install kernel kernel-devel gcc gcc-c++ wget vim -y
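Kernel after upgrade:
Server: 2.6.32-504.1.3.el6.x86_64
Client: 2.6.32-504.1.3.el6.x86_64
Log analysis:
I. Services
①: check_users service check: the first two failures are logged as SOFT states, the third as a HARD state, and only then is the notification email sent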
[1417348396] Warning: Return code of 255 for check of service 'check_users' on host '1.1.1.27' was out of bounds.
[1417348396] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;SOFT;1;(Return code of 255 is out of bounds)
[1417348456] Warning: Return code of 255 for check of service 'check_users' on host '1.1.1.27' was out of bounds.
[1417348456] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;SOFT;2;(Return code of 255 is out of bounds)
[1417348516] Warning: Return code of 255 for check of service 'check_users' on host '1.1.1.27' was out of bounds.
[1417348516] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;HARD;3;(Return code of 255 is out of bounds)
[1417348516] SERVICE NOTIFICATION: nagiosadmin;1.1.1.27;check_users;CRITICAL;notify-service-by-email;(Return code of 255 is out of bounds)
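②: check_zombie_procs service check: the first two failures are logged as SOFT states, the third as a HARD state, and only then is the notification email sent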
[1417348426] Warning: Return code of 255 for check of service 'check_zombie_procs' on host '1.1.1.27' was out of bounds.
[1417348426] SERVICE ALERT: 1.1.1.27;check_zombie_procs;CRITICAL;SOFT;1;(Return code of 255 is out of bounds)
[1417348486] Warning: Return code of 255 for check of service 'check_zombie_procs' on host '1.1.1.27' was out of bounds.
[1417348486] SERVICE ALERT: 1.1.1.27;check_zombie_procs;CRITICAL;SOFT;2;(Return code of 255 is out of bounds)
[1417348546] Warning: Return code of 255 for check of service 'check_zombie_procs' on host '1.1.1.27' was out of bounds.
[1417348546] SERVICE ALERT: 1.1.1.27;check_zombie_procs;CRITICAL;HARD;3;(Return code of 255 is out of bounds)
[1417348546] SERVICE NOTIFICATION: nagiosadmin;1.1.1.27;check_zombie_procs;CRITICAL;notify-service-by-email;(Return code of 255 is out of bounds)
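③: check_total_procs service check: the first two failures are logged as SOFT states, the third as a HARD state, and only then is the notification email sent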
[1417348436] Warning: Return code of 255 for check of service 'check_total_procs' on host '1.1.1.27' was out of bounds.
[1417348436] SERVICE ALERT: 1.1.1.27;check_total_procs;CRITICAL;SOFT;1;(Return code of 255 is out of bounds)
[1417348496] Warning: Return code of 255 for check of service 'check_total_procs' on host '1.1.1.27' was out of bounds.
[1417348496] SERVICE ALERT: 1.1.1.27;check_total_procs;CRITICAL;SOFT;2;(Return code of 255 is out of bounds)
[1417348556] Warning: Return code of 255 for check of service 'check_total_procs' on host '1.1.1.27' was out of bounds.
[1417348556] SERVICE ALERT: 1.1.1.27;check_total_procs;CRITICAL;HARD;3;(Return code of 255 is out of bounds)
[1417348556] SERVICE NOTIFICATION: nagiosadmin;1.1.1.27;check_total_procs;CRITICAL;notify-service-by-email;(Return code of 255 is out of bounds)
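The SOFT, SOFT, HARD progression can be checked mechanically. The following is a minimal Python sketch, not part of the original setup: it parses SERVICE ALERT lines in the format shown above and reports the state types and the gaps between attempts. The function name parse_service_alerts and the embedded sample (the check_users lines copied from the log above) are illustrative assumptions.

```python
import re
from collections import defaultdict

# Sample lines copied from the check_users excerpt in the log above.
LOG = """\
[1417348396] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;SOFT;1;(Return code of 255 is out of bounds)
[1417348456] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;SOFT;2;(Return code of 255 is out of bounds)
[1417348516] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;HARD;3;(Return code of 255 is out of bounds)
[1417348516] SERVICE NOTIFICATION: nagiosadmin;1.1.1.27;check_users;CRITICAL;notify-service-by-email;(Return code of 255 is out of bounds)
"""

# [timestamp] SERVICE ALERT: host;service;state;state_type;attempt;plugin output
ALERT_RE = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([^;]+);(SOFT|HARD);(\d+);(.*)$"
)


def parse_service_alerts(text):
    """Return {(host, service): [(timestamp, state, state_type, attempt), ...]}."""
    alerts = defaultdict(list)
    for line in text.splitlines():
        m = ALERT_RE.match(line)
        if m:
            ts, host, service, state, state_type, attempt, _output = m.groups()
            alerts[(host, service)].append((int(ts), state, state_type, int(attempt)))
    return dict(alerts)


alerts = parse_service_alerts(LOG)
history = alerts[("1.1.1.27", "check_users")]
state_types = [entry[2] for entry in history]
gaps = [later[0] - earlier[0] for earlier, later in zip(history, history[1:])]
print(state_types)  # ['SOFT', 'SOFT', 'HARD']
print(gaps)         # [60, 60] seconds between attempts
```

The 60-second gaps match a retry interval of 1 minute, and the notification line carries the same timestamp as the HARD alert.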
II. Host
[1417349046] HOST ALERT: 1.1.1.27;DOWN;SOFT;1;CRITICAL - Host Unreachable (1.1.1.27)
[1417349116] HOST ALERT: 1.1.1.27;DOWN;SOFT;2;CRITICAL - Host Unreachable (1.1.1.27)
[1417349186] HOST ALERT: 1.1.1.27;DOWN;HARD;3;CRITICAL - Host Unreachable (1.1.1.27)
[1417349186] HOST NOTIFICATION: nagiosadmin;1.1.1.27;DOWN;notify-host-by-email;CRITICAL - Host Unreachable (1.1.1.27)
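To see how long it took from the first DOWN result to the email, the timestamps can be subtracted directly. This is a small Python sketch under the assumption that the log keeps the format shown above; the helper name seconds_until_notification is hypothetical.

```python
import re

# Host lines copied from the log excerpt above.
HOST_LOG = """\
[1417349046] HOST ALERT: 1.1.1.27;DOWN;SOFT;1;CRITICAL - Host Unreachable (1.1.1.27)
[1417349116] HOST ALERT: 1.1.1.27;DOWN;SOFT;2;CRITICAL - Host Unreachable (1.1.1.27)
[1417349186] HOST ALERT: 1.1.1.27;DOWN;HARD;3;CRITICAL - Host Unreachable (1.1.1.27)
[1417349186] HOST NOTIFICATION: nagiosadmin;1.1.1.27;DOWN;notify-host-by-email;CRITICAL - Host Unreachable (1.1.1.27)
"""

LINE_RE = re.compile(r"^\[(\d+)\] HOST (ALERT|NOTIFICATION): ")


def seconds_until_notification(text):
    """Seconds between the first HOST ALERT and the first HOST NOTIFICATION after it."""
    first_alert = None
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        ts, kind = int(m.group(1)), m.group(2)
        if kind == "ALERT" and first_alert is None:
            first_alert = ts
        elif kind == "NOTIFICATION" and first_alert is not None:
            return ts - first_alert
    return None  # no notification found after an alert


delay = seconds_until_notification(HOST_LOG)
print(delay)  # 140
```

In this excerpt the host attempts are 70 seconds apart, so the email goes out 140 seconds after the first failed check.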
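Configuration:
①: Configure the alert mailbox
sed -i 's#email nagios@localhost#email byrd_monitor@163.com#g' /usr/local/nagios/etc/objects/contacts.cfg #change the address that alert emails are sent to
②: Configure the host alert frequency (note: you can define your own template, or edit /usr/local/nagios/etc/objects/templates.cfg)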
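The sed pattern above matches only when exactly one space separates email from the address. If the file pads the directive with tabs or several spaces, a whitespace-tolerant substitution is safer. The Python sketch below shows one way to do that; the sample text is a hypothetical excerpt modeled on a typical contacts.cfg, and set_contact_email is an illustrative name, not a Nagios tool.

```python
import re

# Hypothetical excerpt; the real contacts.cfg may look different.
SAMPLE_CFG = """\
define contact{
        contact_name                    nagiosadmin
        use                             generic-contact
        alias                           Nagios Admin
        email                           nagios@localhost ; change this
        }
"""


def set_contact_email(cfg_text, old, new):
    """Replace `old` with `new` in email directives, keeping the original spacing."""
    pattern = re.compile(
        r"^([ \t]*email[ \t]+)" + re.escape(old) + r"(?=\s|;|$)",
        re.MULTILINE,
    )
    # group(1) holds the indentation, the keyword, and the padding before the value
    return pattern.sub(lambda m: m.group(1) + new, cfg_text)


updated = set_contact_email(SAMPLE_CFG, "nagios@localhost", "byrd_monitor@163.com")
print(updated)
```

Only the address changes; indentation, padding, and any trailing comment on the line are left as they were.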
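define host{
name linux-server ; name of the generic Linux host template
use generic-host ; inherit the remaining values from the generic-host template
check_period 24x7 ; check around the clock (24 hours a day, 7 days a week)
check_interval 2 ; check every 2 minutes under normal conditions
retry_interval 1 ; after a failure, retry every 1 minute
max_check_attempts 3 ; after a failure, check up to 3 times in total before alerting
check_command check-host-alive ; command used to check whether the host is alive
notification_period 24x7 ; notifications may be sent at any time
notification_interval 2 ; while the problem persists, re-notify every 2 minutes
notification_options d,u,r ; notify on down, unreachable, and recovery
contact_groups admins ; send notifications to the admins contact group
register 0 ; template only; do not register it as a real host
}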
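The three timing directives combine into a nominal delay before the first notification. The sketch below is a rough Python estimate, assuming the interval_length setting in nagios.cfg is at its usual value of 60 seconds per time unit; seconds_to_hard_state is a hypothetical helper, and real delays also include the time each check takes to run.

```python
def seconds_to_hard_state(max_check_attempts, retry_interval, interval_length=60):
    """Nominal delay from the first failed check to the HARD state.

    After the first failure, the check is repeated every `retry_interval`
    time units until `max_check_attempts` checks have failed in total.
    """
    return (max_check_attempts - 1) * retry_interval * interval_length


# Values from the linux-server template above.
host_delay = seconds_to_hard_state(max_check_attempts=3, retry_interval=1)
print(host_delay)  # 120

# notification_interval 2 means a reminder every 2 time units while the problem lasts.
renotify_seconds = 2 * 60
print(renotify_seconds)  # 120
```

The host log excerpt earlier shows 70-second gaps between attempts, so the computed 120 seconds is better read as a lower bound than as an exact figure.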
③: Configure the service alert frequency
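define service{
name generic-service ; name of the generic service template
active_checks_enabled 1 ; active service checks are enabled
passive_checks_enabled 1 ; passive service checks are enabled
parallelize_check 1 ; run active checks in parallel
obsess_over_service 1 ; used for distributed monitoring; 1 enables, 0 disables
check_freshness 0 ; do not check service freshness
notifications_enabled 1 ; service notifications are enabled
event_handler_enabled 1 ; the service event handler is enabled
flap_detection_enabled 1 ; flap detection is enabled
failure_prediction_enabled 1 ; failure prediction is enabled
process_perf_data 1 ; process performance data
retain_status_information 1 ; retain status information across restarts
retain_nonstatus_information 1 ; retain non-status information across restarts
is_volatile 0 ; the service is not volatile
check_period 24x7 ; check around the clock
max_check_attempts 3 ; check the service up to 3 times to confirm its real state
normal_check_interval 1 ; check every 1 minute under normal conditions
retry_check_interval 1 ; re-check every 1 minute until the real state is confirmed
contact_groups admins ; send notifications to the admins contact group
notification_options w,u,c,r ; notify on warning, unknown, critical, and recovery events
notification_interval 2 ; while the problem persists, re-notify every 2 minutes
notification_period 24x7 ; notifications may be sent at any time
register 0 ; template only; do not register it as a real service
}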
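When tuning these values it can help to read a template into a dictionary and inspect it. The following Python sketch is a simplified reader, not the parser Nagios itself uses: it assumes one directive per line, treats everything after a semicolon as an inline comment, and skips lines that start with #. The name parse_define_block and the shortened sample are illustrative.

```python
def parse_define_block(text):
    """Parse one `define <type>{ ... }` block into (type, {directive: value})."""
    obj_type = None
    directives = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
        if not line or line.startswith("#"):
            continue  # blank line or whole-line comment
        if line.startswith("define"):
            obj_type = line[len("define"):].strip().rstrip("{").strip()
            continue
        if line == "}":
            continue
        parts = line.split(None, 1)  # directive name, then the rest as its value
        directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return obj_type, directives


# Shortened sample based on the generic-service template above.
SAMPLE = """\
define service{
name generic-service ; template name
max_check_attempts 3
normal_check_interval 1
retry_check_interval 1
notification_options w,u,c,r
notification_interval 2
register 0 ; template only
}
"""

obj_type, directives = parse_define_block(SAMPLE)
print(obj_type)                            # service
print(directives["max_check_attempts"])    # 3
print(directives["notification_options"])  # w,u,c,r
```

Values stay as strings, so convert them with int() before doing arithmetic on intervals.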