Nagios Introduction
Nagios is an open source system and network monitoring tool. It can monitor the state of Windows, Linux, and UNIX hosts, network devices such as switches and routers, printers, and so on. When a system or service enters an abnormal state, Nagios immediately notifies the operations staff by email or SMS, and it sends another notification once the state returns to normal.
Nagios was formerly known as NetSaint and has been developed and maintained by Ethan Galstad to date. The name is an acronym: "Nagios Ain't Gonna Insist On Sainthood"; "agios" is the Greek word for "saint". Nagios was developed on Linux, but it also works very well on other UNIX systems.
Main features
Network service monitoring (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH)
Host resource monitoring (CPU load, disk usage, system logs), including Windows hosts (using the NSClient++ plugin)
Custom plugins that collect data over the network to monitor almost anything (temperature, warnings, ...)
Remote plugin execution and remote scripts, with monitoring over SSH or SSL-encrypted tunnels
A simple plugin design that lets users easily develop the service checks they need, in many languages (shell scripts, C++, Perl, Ruby, Python, PHP, C#, etc.)
A number of graphing plugins (Nagiosgraph, Nagiosgrapher, PNP4Nagios, etc.)
Parallel service checks
A definable hierarchy of network hosts (parent hosts), so that checks proceed downward from the parent host
Notification by email, pager, SMS, or any user-defined plugin when a service or host has a problem
Customizable event handlers that can restart a problematic service or host
Automatic log rotation
Support for redundant monitoring
A web interface for viewing current network status, notifications, problem history, log files, etc.
SMS and email notification support
Nagios official website: http://www.nagios.org
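The plugin design mentioned in the feature list rests on a simple contract: a plugin prints one line of status text and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The following is a minimal sketch in Python, not an official plugin; the load-average check and its thresholds (5.0 and 10.0) are illustrative assumptions.

```python
"""Minimal Nagios-style plugin sketch (hypothetical load check)."""
import os

# Standard plugin exit codes understood by Nagios.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(load1, warn, crit):
    """Map a 1-minute load average to a plugin exit code."""
    if load1 >= crit:
        return CRITICAL
    if load1 >= warn:
        return WARNING
    return OK


def format_output(state, load1, warn, crit):
    """Status text, then performance data after the '|' separator."""
    return "LOAD %s - load1=%.2f | load1=%.2f;%.2f;%.2f;0" % (
        STATE_NAMES[state], load1, load1, warn, crit)


def main(warn=5.0, crit=10.0):  # thresholds are illustrative assumptions
    """Print the status line and return the exit code for it."""
    try:
        load1 = os.getloadavg()[0]
    except (OSError, AttributeError):
        print("LOAD UNKNOWN - cannot read load average")
        return UNKNOWN
    state = evaluate(load1, warn, crit)
    print(format_output(state, load1, warn, crit))
    return state
```

Saved as an executable script, it would finish by passing the return value of main() to sys.exit() so that Nagios sees the state as the process exit code.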
1. Nagios installation: server side (192.168.0.11)
The default yum repositories on CentOS 6 do not include the Nagios RPM packages, but we can install the EPEL repository:
yum install -y epel-release
Then install the Nagios-related packages:
yum install -y httpd nagios nagios-plugins nagios-plugins-all nrpe
Set the user and password for logging in to the Nagios web interface: htpasswd -c /etc/nagios/passwd nagiosadmin
nagios -v /etc/nagios/nagios.cfg    # check the configuration file
Start the services: service httpd start; service nagios start
Browser access: http://ip/nagios
vim /etc/nagios/nagios.cfg    # leave this file alone for now
2. Nagios installation: client side (192.168.0.12)
On the client machine:
yum install -y epel-release
yum install -y nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe
vim /etc/nagios/nrpe.cfg    # find "allowed_hosts=127.0.0.1" and change it to "allowed_hosts=127.0.0.1,192.168.0.11" (the second address is the server's IP)
Also find "dont_blame_nrpe=0" and change it to "dont_blame_nrpe=1"
Start the client: /etc/init.d/nrpe start
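The effect of the allowed_hosts setting can be modelled in a few lines. This is only a sketch of the idea in Python, not NRPE's own matching code: the helper name is_allowed is hypothetical, and hostname entries are simply skipped here.

```python
import ipaddress


def is_allowed(client_ip, allowed_hosts):
    """Return True if client_ip matches an entry of an allowed_hosts value."""
    addr = ipaddress.ip_address(client_ip)
    for entry in allowed_hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            # A bare address is treated as a single-host network.
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # hostnames are not resolved in this sketch
        if addr in network:
            return True
    return False
```

With the value set above, is_allowed("192.168.0.11", "127.0.0.1,192.168.0.11") is true, while the original value "127.0.0.1" would reject the monitoring server.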
3. On the monitoring center (192.168.0.11), add the monitored host (192.168.0.12)
vim /etc/nagios/conf.d/192.168.0.12.cfg
define host{
    use                     linux-server
    host_name               192.168.0.12
    alias                   0.12
    address                 192.168.0.12
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_ping
    check_command           check_ping!100.0,20%!200.0,50%    ; packet loss: 0% is OK, 20% is warning, 50% is critical
    max_check_attempts      5    ; number of check attempts
    normal_check_interval   1
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_ssh
    check_command           check_ssh
    max_check_attempts      5    ; when Nagios detects a problem, it checks 5 times in total before raising the alarm; with a value of 1, the alarm is raised as soon as a problem is detected
    normal_check_interval   1    ; time interval between checks, in minutes; the default is 3 minutes
    notification_interval   60   ; how long Nagios waits before notifying again while a failure remains unresolved, in minutes; if one notification per event is enough, set this to 0
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_http
    check_command           check_http
    max_check_attempts      5
    normal_check_interval   1
}
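The check_ping arguments in the service definition pair a round-trip-time limit in milliseconds with a packet-loss percentage: 100.0,20% for warning and 200.0,50% for critical. The Python sketch below models that classification; the function names are hypothetical and the exact comparison operators are an assumption rather than the plugin's actual source.

```python
def parse_threshold(spec):
    """Split a threshold such as '100.0,20%' into (rta_ms, loss_pct)."""
    rta, loss = spec.split(",")
    return float(rta), float(loss.rstrip("%"))


def classify_ping(rta_ms, loss_pct, warn="100.0,20%", crit="200.0,50%"):
    """Return 'OK', 'WARNING' or 'CRITICAL' for a measured RTA and packet loss."""
    warn_rta, warn_loss = parse_threshold(warn)
    crit_rta, crit_loss = parse_threshold(crit)
    # Either measurement crossing its limit is enough to change the state.
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return "WARNING"
    return "OK"
```

For example, a 150 ms round trip with no packet loss would be classified as WARNING, and 60% packet loss as CRITICAL regardless of the round-trip time.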
The services above do not depend on the NRPE service on the client. Just as we can use ping or telnet from our own computer to check whether a remote machine is alive or whether a port or service is open, Nagios runs these checks from the server. To check something local to the client, such as its load or disk usage, we need NRPE.
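The "telnet to a port" idea behind these agentless checks can be sketched in Python with a plain TCP connection attempt. The helper name port_is_open and the 3-second timeout are assumptions for illustration; the real check_ssh and check_http plugins also inspect the protocol response.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling port_is_open("192.168.0.12", 22) from the monitoring server would tell us whether the SSH port on the monitored host accepts connections, with nothing installed on that host.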
4. Adding more services
On the server, add the following to /etc/nagios/objects/commands.cfg:
define command{
    command_name    check_nrpe    ; fetches the service status from the remote host; the name is customizable
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
Continue editing:
vim /etc/nagios/conf.d/192.168.0.12.cfg
Add the following:
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_load
    check_command           check_nrpe!check_load
    max_check_attempts      5
    normal_check_interval   1
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_disk_hda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_disk_hda2
    check_command           check_nrpe!check_hda2    ; do not get this wrong: it must match the command name defined on the client
    max_check_attempts      5
    normal_check_interval   1
}
Note: in check_nrpe!check_load, check_nrpe is the command just defined in commands.cfg, and check_load is a check command defined on the remote host.
On the client, run vim /etc/nagios/nrpe.cfg and search for check_load. That line is the script the server asks the client to execute; we can also run the script manually.
Change check_hda1: replace /dev/hda1 with /dev/sda1.
Add another line (w = warning, c = critical):
command[check_hda2]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda2
The critical value must not be larger than the warning value.
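The check_disk thresholds refer to free space, which is why the critical value must not exceed the warning value. A small Python sketch of that rule follows; the function names are hypothetical, and the strict "less than" comparison is an assumption about how the plugin treats the boundary.

```python
import shutil


def free_percent(path):
    """Percentage of free space on the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def classify_free_space(free_pct, warn_pct=20.0, crit_pct=10.0):
    """Classify free space against -w/-c style percentage thresholds."""
    if crit_pct > warn_pct:
        raise ValueError("critical threshold must not exceed warning threshold")
    if free_pct < crit_pct:
        return "CRITICAL"
    if free_pct < warn_pct:
        return "WARNING"
    return "OK"
```

With the thresholds used here, 15% free space yields WARNING and 5% yields CRITICAL; classify_free_space(free_percent("/")) applies the same rule to the local root filesystem.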
Mechanism: the check_nrpe command is defined on the server first; the server then calls check_nrpe followed by the name of a command defined in the client's nrpe.cfg.
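The lookup half of this mechanism can be sketched in Python: the client keeps a table of command[name]=command line entries, and the name after the "!" in the server's check_command selects one of them. The helper names below are hypothetical, and the sample text is only an excerpt in the style of nrpe.cfg.

```python
import re

# Matches lines such as: command[check_load]=/path/to/plugin args
COMMAND_LINE = re.compile(r"^command\[(?P<name>[^\]]+)\]\s*=\s*(?P<cmd>.+)$")

SAMPLE_NRPE_CFG = """
# excerpt in the style of the client's nrpe.cfg
allowed_hosts=127.0.0.1,192.168.0.11
command[check_hda2]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda2
"""


def parse_nrpe_commands(text):
    """Collect command[name]=... entries from nrpe.cfg-style text."""
    commands = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = COMMAND_LINE.match(line)
        if match:
            commands[match.group("name")] = match.group("cmd").strip()
    return commands


def resolve_check(check_command, commands):
    """Resolve e.g. 'check_nrpe!check_hda2' to the client-side command line."""
    plugin, _, remote_name = check_command.partition("!")
    if plugin != "check_nrpe":
        raise ValueError("not an NRPE check: %s" % check_command)
    return commands[remote_name]  # KeyError if the client does not define it
```

A KeyError in this sketch corresponds to the situation the comment in the service definition warns about: the name used on the server has no matching command on the client.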
Restart the NRPE service on the client: service nrpe restart
Restart the Nagios service on the server as well: service nagios restart
5. Configure alerts
vim /etc/nagios/objects/contacts.cfg    # add:
define contact{
    contact_name
    use                 generic-contact
    alias               Aming
    email               @qq.com
}
define contact{
    contact_name
    use                 generic-contact
    alias               AAA
    email               wsw@.com
}
# define the contact group
define contactgroup{
    contactgroup_name   Common
    alias               Common
    members             ,
}
Then add the contact group to each service that needs alerts:
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_load
    check_command           check_nrpe!check_load
    max_check_attempts      5
    normal_check_interval   1
    contact_groups          Common    ; the contact group that receives the alert emails
    notifications_enabled   1    ; whether notifications are enabled: 1 is on, 0 is off. This option is typically also set in the main configuration file (nagios.cfg), with the same effect.
    notification_period     24x7    ; the time period during which notifications are sent. I define very important hosts (services) as 24x7 and ordinary hosts (services) as working hours. Outside the defined time period, no notifications are sent, no matter what problem occurs.
    notification_options    w,u,c,r    ; the service states that trigger notifications: w is warning, u is unknown, c is critical, r is recovery. Hosts have the corresponding states d,u,r (d = down, u = unreachable, r = recovered), which need to be set in the host definition.
}
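How these options combine with max_check_attempts can be summarised as a small decision function. This Python sketch is a deliberately simplified model, not Nagios's actual notification logic: it ignores notification_period and notification_interval, and it assumes a recovery always follows a problem that was already reported. The name should_notify is hypothetical.

```python
STATE_LETTERS = {"WARNING": "w", "UNKNOWN": "u", "CRITICAL": "c", "RECOVERY": "r"}


def should_notify(state, attempt, max_check_attempts=5,
                  notification_options="w,u,c,r", notifications_enabled=1):
    """Decide whether a service state would produce a notification."""
    if not notifications_enabled:
        return False
    letter = STATE_LETTERS.get(state)
    if letter is None or letter not in notification_options.split(","):
        return False
    if state == "RECOVERY":
        return True  # assumption: the preceding problem was already notified
    # A problem is only reported once it has failed max_check_attempts times.
    return attempt >= max_check_attempts
```

With the definition above, a CRITICAL result on the first attempt is not yet reported, while the same result on the fifth attempt is.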
6. Configure graphing with PNP4Nagios
(1) Installation
yum install pnp4nagios rrdtool
(2) Configure the main configuration file
vim /etc/nagios/nagios.cfg    # modify the following settings
Process_performance_data=
Host_perfdata_command=process-host-perfdata
Service_perfdata_command=process-service-perfdata
enable_environment_macros=
(3) Modify commands.cfg
vim /etc/nagios/objects/commands.cfg    # comment out the original definitions of process-host-perfdata and process-service-perfdata, then redefine them:
define command {
    command_name    process-service-perfdata
    command_line    /usr/bin/perl /usr/libexec/pnp4nagios/process_perfdata.pl
}
define command {
    command_name    process-host-perfdata
    command_line    /usr/bin/perl /usr/libexec/pnp4nagios/process_perfdata.pl -d HOSTPERFDATA
}
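The commands defined here hand each check's performance data to process_perfdata.pl. Plugins emit that data after a "|" in their output, as space-separated items of the form label=value[UOM];warn;crit;min;max. The Python sketch below shows how such a line splits apart; parse_perfdata is a hypothetical helper for illustration, not part of PNP4Nagios, and it covers only the common cases.

```python
import re

# One perfdata item: label=value[UOM];[warn];[crit];[min];[max]
PERF_ITEM = re.compile(
    r"(?P<label>'[^']+'|[^\s=']+)="
    r"(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[^;\s]*)"
    r"(?:;(?P<warn>[^;\s]*))?(?:;(?P<crit>[^;\s]*))?"
    r"(?:;(?P<min>[^;\s]*))?(?:;(?P<max>[^;\s]*))?"
)


def parse_perfdata(plugin_output):
    """Split plugin output into its text part and a list of perfdata items."""
    text, _, perf = plugin_output.partition("|")
    items = []
    for match in PERF_ITEM.finditer(perf):
        # Missing or empty fields become None.
        item = {key: (value or None) for key, value in match.groupdict().items()}
        item["label"] = item["label"].strip("'")
        item["value"] = float(item["value"])
        items.append(item)
    return text.strip(), items
```

Each parsed item corresponds to one data series that a grapher can store and plot, with the warning and critical thresholds available for drawing reference lines.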
(4) Modify the configuration file templates.cfg
vim /etc/nagios/objects/templates.cfg
define host {
    name                hosts-pnp
    register            0
    action_url          /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
    process_perf_data   1
}
define service {
    name                srv-pnp
    register            0
    action_url          /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data   1
}
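The action_url values in these templates contain the macros $HOSTNAME$ and $SERVICEDESC$, which are filled in per host and per service to produce the link to the graph page. A rough Python sketch of that substitution follows; expand_action_url is a hypothetical helper, and URL-encoding the macro values is an assumption made here for safety.

```python
from urllib.parse import quote


def expand_action_url(template, hostname, servicedesc=None):
    """Substitute $HOSTNAME$ and $SERVICEDESC$ in an action_url template."""
    macros = {"$HOSTNAME$": quote(hostname, safe="")}
    if servicedesc is not None:
        macros["$SERVICEDESC$"] = quote(servicedesc, safe="")
    url = template
    for macro, value in macros.items():
        url = url.replace(macro, value)
    return url
```

For the host 192.168.0.12 and the service check_disk_hda1, the service template expands to /pnp4nagios/index.php/graph?host=192.168.0.12&srv=check_disk_hda1.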
(5) Modify host and service configuration
vim /etc/nagios/conf.d/192.168.0.12.cfg
Change
define host{
    use                     linux-server
to:
define host{
    use                     linux-server,hosts-pnp
Modify the corresponding services in the same way. For example, change:
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_disk_hda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
}
to:
define service{
    use                     generic-service,srv-pnp
    host_name               192.168.0.12
    service_description     check_disk_hda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
}
(6) Restart or start each service:
service nagios restart
service httpd restart
service npcd start
(7) Access test
Two URLs to check:
ip/nagios/
ip/pnp4nagios/
That concludes this detailed tutorial on installing and using Nagios. I hope it helps.