System: CentOS 6.4
Server: hostname: Nagios-Server kernel: 2.6.32-358.el6.x86_64 IP: 1.1.1.26
Client: hostname: Nagios-Client kernel: 2.6.32-358.el6.x86_64 IP: 1.1.1.27
Upgrade the kernel:
[root@Nagios-Server ~]# yum install ntpdate -y    # install ntpdate for time synchronization
[root@Nagios-Server ~]# /usr/sbin/ntpdate time.nist.gov    # synchronize the clock
[root@Nagios-Server ~]# yum install kernel-devel gcc-c++ wget vim -y    # upgrade the kernel
[root@Nagios-Client ~]# yum install ntpdate -y
[root@Nagios-Client ~]# /usr/sbin/ntpdate time.nist.gov
[root@Nagios-Client ~]# yum install kernel-devel gcc-c++ wget vim -y
Kernel after upgrade:
Server: 2.6.32-504.1.3.el6.x86_64
Client: 2.6.32-504.1.3.el6.x86_64
Log analysis:
I. Service
①: check_users service: the first and second failed checks are SOFT, the third is HARD, and then an email notification is sent
[1417348396] Warning: Return code of 255 for check of service 'check_users' on host '1.1.1.27' was out of bounds.
[1417348396] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;SOFT;1;(Return code of 255 is out of bounds)
[1417348456] Warning: Return code of 255 for check of service 'check_users' on host '1.1.1.27' was out of bounds.
[1417348456] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;SOFT;2;(Return code of 255 is out of bounds)
[1417348516] Warning: Return code of 255 for check of service 'check_users' on host '1.1.1.27' was out of bounds.
[1417348516] SERVICE ALERT: 1.1.1.27;check_users;CRITICAL;HARD;3;(Return code of 255 is out of bounds)
[1417348516] SERVICE NOTIFICATION: nagiosadmin;1.1.1.27;check_users;CRITICAL;notify-service-by-email;(Return code of 255 is out of bounds)
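The bracketed number at the start of each log line is a Unix epoch timestamp. As a small sketch, assuming GNU date (as shipped with CentOS), a helper like the following converts it to a readable UTC time. The function name epoch_to_utc is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: convert a Nagios log timestamp (epoch seconds) to UTC.
# Relies on GNU date's "-d @SECONDS" syntax.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# First SOFT alert and the HARD alert for check_users from the log above.
epoch_to_utc 1417348396
epoch_to_utc 1417348516
```

The two timestamps are 120 seconds apart, which corresponds to two rechecks at one-minute intervals before the state becomes HARD.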
②: check_zombie_procs service: the first and second failed checks are SOFT, the third is HARD, and then an email notification is sent
[1417348426] Warning: Return code of 255 for check of service 'check_zombie_procs' on host '1.1.1.27' was out of bounds.
[1417348426] SERVICE ALERT: 1.1.1.27;check_zombie_procs;CRITICAL;SOFT;1;(Return code of 255 is out of bounds)
[1417348486] Warning: Return code of 255 for check of service 'check_zombie_procs' on host '1.1.1.27' was out of bounds.
[1417348486] SERVICE ALERT: 1.1.1.27;check_zombie_procs;CRITICAL;SOFT;2;(Return code of 255 is out of bounds)
[1417348546] Warning: Return code of 255 for check of service 'check_zombie_procs' on host '1.1.1.27' was out of bounds.
[1417348546] SERVICE ALERT: 1.1.1.27;check_zombie_procs;CRITICAL;HARD;3;(Return code of 255 is out of bounds)
[1417348546] SERVICE NOTIFICATION: nagiosadmin;1.1.1.27;check_zombie_procs;CRITICAL;notify-service-by-email;(Return code of 255 is out of bounds)
③: check_total_procs service: the first and second failed checks are SOFT, the third is HARD, and then an email notification is sent
[1417348436] Warning: Return code of 255 for check of service 'check_total_procs' on host '1.1.1.27' was out of bounds.
[1417348436] SERVICE ALERT: 1.1.1.27;check_total_procs;CRITICAL;SOFT;1;(Return code of 255 is out of bounds)
[1417348496] Warning: Return code of 255 for check of service 'check_total_procs' on host '1.1.1.27' was out of bounds.
[1417348496] SERVICE ALERT: 1.1.1.27;check_total_procs;CRITICAL;SOFT;2;(Return code of 255 is out of bounds)
[1417348556] Warning: Return code of 255 for check of service 'check_total_procs' on host '1.1.1.27' was out of bounds.
[1417348556] SERVICE ALERT: 1.1.1.27;check_total_procs;CRITICAL;HARD;3;(Return code of 255 is out of bounds)
[1417348556] SERVICE NOTIFICATION: nagiosadmin;1.1.1.27;check_total_procs;CRITICAL;notify-service-by-email;(Return code of 255 is out of bounds)
II. Host
[1417349046] HOST ALERT: 1.1.1.27;DOWN;SOFT;1;CRITICAL - Host Unreachable (1.1.1.27)
[1417349116] HOST ALERT: 1.1.1.27;DOWN;SOFT;2;CRITICAL - Host Unreachable (1.1.1.27)
[1417349186] HOST ALERT: 1.1.1.27;DOWN;HARD;3;CRITICAL - Host Unreachable (1.1.1.27)
[1417349186] HOST NOTIFICATION: nagiosadmin;1.1.1.27;DOWN;notify-host-by-email;CRITICAL - Host Unreachable (1.1.1.27)
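The SOFT, SOFT, HARD progression in the logs above can be summarized mechanically. The following is a minimal sketch, not part of Nagios itself: alert_gaps is a made-up helper that assumes alert lines in the standard semicolon-separated layout and reads them from standard input. The sample input is a cleaned-up copy of the host alerts.

```shell
#!/bin/bash
# Hypothetical helper: for each alert line on stdin, print the state type,
# the attempt number, and the seconds elapsed since the previous alert.
# Assumes the layout "[epoch] ... ALERT: field;...;TYPE;attempt;output"
# and that the plugin output contains no ';' or square brackets.
alert_gaps() {
    awk -F'[][;]' '
        tolower($0) ~ /alert:/ {
            gap = (prev == "") ? 0 : $2 - prev
            printf "%s %s +%ds\n", $(NF-2), $(NF-1), gap
            prev = $2
        }'
}

alert_gaps <<'EOF'
[1417349046] HOST ALERT: 1.1.1.27;DOWN;SOFT;1;CRITICAL - Host Unreachable (1.1.1.27)
[1417349116] HOST ALERT: 1.1.1.27;DOWN;SOFT;2;CRITICAL - Host Unreachable (1.1.1.27)
[1417349186] HOST ALERT: 1.1.1.27;DOWN;HARD;3;CRITICAL - Host Unreachable (1.1.1.27)
EOF
```

For the host alerts the gaps come out at 70 seconds each, while the service alerts are spaced exactly 60 seconds apart. Feed the helper one host or one service at a time, since it does not group lines by name.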
Configuration:
①: Configure the alarm mailbox
sed -i 's#email nagios@localhost#email byrd_monitor@163.com#g' /usr/local/nagios/etc/objects/contacts.cfg    # change the address that receives alert emails
②: Configure the host alarm frequency (note: to customize it, edit /usr/local/nagios/etc/objects/templates.cfg)
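One caveat with the substitution above: if contacts.cfg separates the email directive from its value with several spaces or tabs, a pattern containing a single space will not match. The sketch below is one way to tolerate that, tried on a scratch file rather than the real configuration. The helper name replace_email and the sample file content are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the file with the contact address replaced,
# tolerating any run of spaces or tabs after "email". \1 keeps the
# original spacing intact.
replace_email() {
    sed 's#\(email[[:space:]][[:space:]]*\)nagios@localhost#\1byrd_monitor@163.com#' "$1"
}

# Dry run on a scratch file with made-up sample content.
tmp=$(mktemp)
printf 'define contact{\n    contact_name    nagiosadmin\n    email           nagios@localhost\n    }\n' > "$tmp"
replace_email "$tmp"
rm -f "$tmp"
```

Once the printed result looks right, the same expression can be used with sed -i on the real file.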
define host{
    name                    linux-server        ; name of this Linux host template
    use                     generic-host        ; inherit other values from the generic-host template
    check_period            24x7                ; check around the clock
    check_interval          2                   ; check every 2 minutes
    retry_interval          1                   ; recheck after 1 minute when a check fails
    max_check_attempts      3                   ; raise the alarm after 3 consecutive failed checks
    check_command           check-host-alive    ; command used to check whether the host is alive
    notification_period     24x7                ; send notifications around the clock
    notification_interval   2                   ; resend the notification every 2 minutes while the problem persists
    notification_options    d,u,r               ; notify on down, unreachable, and recovery
    contact_groups          admins              ; notify the admins contact group
    register                0                   ; template only, not registered as a real host
}
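The interaction between max_check_attempts and retry_interval can be worked out with simple arithmetic. This is a rough sketch, assuming the default interval_length of 60 seconds so that one interval unit equals one minute; minutes_to_hard is a made-up helper name.

```shell
#!/bin/bash
# Hypothetical helper: minutes from the first failed check to the HARD state.
# The first failure counts as attempt 1, so (max_check_attempts - 1)
# rechecks remain, each scheduled retry_interval minutes apart.
minutes_to_hard() {
    local max_check_attempts=$1 retry_interval=$2
    echo $(( (max_check_attempts - 1) * retry_interval ))
}

minutes_to_hard 3 1   # values from the linux-server template; prints 2
```

The host log is roughly in line with this: the HARD alert arrives 140 seconds after the first SOFT alert, with the extra time presumably spent waiting for each check to time out.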
③: Configure the service alarm frequency
define service{
    name                            generic-service ; name of this generic service template
    active_checks_enabled           1               ; active checks enabled
    passive_checks_enabled          1               ; passive checks enabled
    parallelize_check               1               ; run checks in parallel
    obsess_over_service             1               ; distributed monitoring: 1 enabled, 0 disabled
    check_freshness                 0               ; do not check service freshness
    notifications_enabled           1               ; service notifications enabled
    event_handler_enabled           1               ; service event handler enabled
    flap_detection_enabled          1               ; flap detection enabled
    failure_prediction_enabled      1               ; failure prediction enabled
    process_perf_data               1               ; process performance data
    retain_status_information       1               ; retain status information across restarts
    retain_nonstatus_information    1               ; retain non-status information across restarts
    is_volatile                     0               ; the service is not volatile
    check_period                    24x7            ; check around the clock
    max_check_attempts              3               ; check up to 3 times to confirm the real state
    normal_check_interval           1               ; check every 1 minute under normal conditions
    retry_check_interval            1               ; recheck every 1 minute until the real state is determined
    contact_groups                  admins          ; notify the admins contact group
    notification_options            w,u,c,r         ; notify on warning, unknown, critical, and recovery
    notification_interval           2               ; resend the notification every 2 minutes while the problem persists
    notification_period             24x7            ; send notifications around the clock
    register                        0               ; template only, not registered as a real service
}