Detailed tutorial on installation and use of Nagios

Last Update:2017-01-18 Source: Internet

Author: User

Tags time interval ssh disk usage

Nagios Introduction

Nagios is an open source system and network monitoring tool. It can monitor the state of Windows, Linux, and UNIX hosts, network devices such as switches and routers, printers, and more. When a system or service enters an abnormal state, Nagios immediately sends an e-mail or SMS alert to the site's operations staff; when the state returns to normal, it sends a recovery notification.

Nagios was formerly known as NetSaint and has been developed and maintained to date by Ethan Galstad. The name is a recursive acronym for "Nagios Ain't Gonna Insist On Sainthood"; "agios" is the Greek word for "saint". Nagios was developed on Linux but also runs well on other UNIX systems.

Main features

Monitoring of network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH).
Monitoring of host resources (CPU load, disk usage, system logs), including Windows hosts through the NSClient++ plugin.
Custom plugins that collect data over the network to monitor almost anything (temperature, warnings, ...).
Remote execution of plugins and scripts, with remote monitoring over SSH or an SSL-encrypted tunnel.
A simple plugin design that lets users develop the checks they need in many languages (shell scripts, C++, Perl, Ruby, Python, PHP, C#, and so on).
A number of graphing plugins (Nagiosgraph, Nagiosgrapher, PNP4Nagios, and so on).
Parallel service checks.
A definable network host hierarchy, so that checking proceeds downward from the parent host.
Notification by e-mail, pager, SMS, or any user-defined plugin when a service or host has a problem.
Custom event handlers that can restart a problematic service or host.
Automatic log rotation.
Support for redundant monitoring.
A web interface for viewing current network status, notifications, problem history, log files, and so on.
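
As a rough illustration of the plugin design mentioned above, the sketch below shows a minimal check written in Python. The exit codes 0, 1, 2, and 3 (OK, WARNING, CRITICAL, UNKNOWN) follow the standard Nagios plugin convention; the function name check_temperature, the thresholds, and the reading are hypothetical and not part of the original tutorial.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_temperature(celsius, warn=50.0, crit=70.0):
    """Return (exit_code, status_line) for a hypothetical temperature reading."""
    if celsius is None:
        return UNKNOWN, "TEMP UNKNOWN - no reading available"
    # Text after "|" is performance data: label=value;warn;crit
    perfdata = f"temp={celsius:.1f};{warn:.1f};{crit:.1f}"
    if celsius >= crit:
        return CRITICAL, f"TEMP CRITICAL - {celsius:.1f} C | {perfdata}"
    if celsius >= warn:
        return WARNING, f"TEMP WARNING - {celsius:.1f} C | {perfdata}"
    return OK, f"TEMP OK - {celsius:.1f} C | {perfdata}"


code, line = check_temperature(42.0)
print(line)  # prints: TEMP OK - 42.0 C | temp=42.0;50.0;70.0
```

A real plugin would print the status line and then call sys.exit(code), since Nagios reads the state from the process exit status.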

Official Nagios website: http://www.nagios.org

1. Nagios installation: server (192.168.0.11)

The default CentOS 6 yum repositories do not include the Nagios RPM packages, but they are available from the EPEL repository:

yum install -y epel-release

Then install the Nagios packages:

yum install -y httpd nagios nagios-plugins nagios-plugins-all nrpe

Set the user name and password for logging in to the Nagios web interface:

htpasswd -c /etc/nagios/passwd nagiosadmin

Check the configuration file:

nagios -v /etc/nagios/nagios.cfg

Start the services:

service httpd start; service nagios start

Then open http://ip/nagios in a browser.

The main configuration file is /etc/nagios/nagios.cfg; leave it unchanged for now:

vim /etc/nagios/nagios.cfg

2. Nagios installation: client (192.168.0.12)

On the client machine:

yum install -y epel-release
yum install -y nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe

Edit the NRPE configuration file (vim /etc/nagios/nrpe.cfg). Find "allowed_hosts=127.0.0.1" and change it to "allowed_hosts=127.0.0.1,192.168.0.11", where 192.168.0.11 is the server's IP address. Then find "dont_blame_nrpe=0" and change it to "dont_blame_nrpe=1", which allows the server to pass arguments to NRPE commands.

Start the client:

/etc/init.d/nrpe start
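
To make the allowed_hosts change concrete, here is a small sketch of the kind of test applied to an incoming connection. It is not NRPE's actual code: is_allowed is a hypothetical helper that understands only IP addresses and CIDR networks and skips host names.

```python
import ipaddress


def is_allowed(client_ip, allowed_hosts):
    """Return True if client_ip matches an entry of an allowed_hosts value."""
    addr = ipaddress.ip_address(client_ip)
    for entry in allowed_hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            # A bare address such as 127.0.0.1 becomes a single-host network.
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # host names are not resolved in this sketch
        if addr in network:
            return True
    return False


print(is_allowed("192.168.0.11", "127.0.0.1"))               # prints: False
print(is_allowed("192.168.0.11", "127.0.0.1,192.168.0.11"))  # prints: True
```

With the default value, the monitoring server at 192.168.0.11 is rejected, which is why the tutorial adds it to the list.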

3. Add the monitored host (192.168.0.12) on the monitoring server (192.168.0.11)

vim /etc/nagios/conf.d/192.168.0.12.cfg

define host{
    use                     linux-server
    host_name               192.168.0.12
    alias                   0.12
    address                 192.168.0.12
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_ping
    check_command           check_ping!100.0,20%!200.0,50%  ; warning at 100 ms round-trip time or 20% packet loss, critical at 200 ms or 50%
    max_check_attempts      5   ; number of check attempts (a count, not a time in seconds)
    normal_check_interval   1
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_ssh
    check_command           check_ssh
    max_check_attempts      5   ; when Nagios detects a problem, it checks up to 5 times in total before raising an alert; with a value of 1 it alerts as soon as a problem is detected
    normal_check_interval   1   ; interval between checks, in minutes; the default is 3 minutes
    notification_interval   60  ; how long Nagios waits before notifying again while a failure remains unresolved, in minutes; set it to 0 if one notification per event is enough
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_http
    check_command           check_http
    max_check_attempts      5
    normal_check_interval   1
}
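
The two arguments passed to check_ping above each combine a round-trip time in milliseconds with a packet-loss percentage. The sketch below shows one way to read them; parse_threshold and classify_ping are hypothetical helpers, and treating a value equal to the threshold as a breach is an assumption, not documented check_ping behavior.

```python
def parse_threshold(text):
    """Split 'RTA,LOSS%' (for example '100.0,20%') into (rta_ms, loss_percent)."""
    rta, loss = text.split(",")
    return float(rta), float(loss.rstrip("%"))


def classify_ping(rta_ms, loss_pct, warn="100.0,20%", crit="200.0,50%"):
    """Classify a ping result against warning and critical thresholds."""
    warn_rta, warn_loss = parse_threshold(warn)
    crit_rta, crit_loss = parse_threshold(crit)
    # Either the round-trip time or the packet loss can trigger a state.
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return "WARNING"
    return "OK"


print(classify_ping(0.8, 0))     # prints: OK
print(classify_ping(150.0, 0))   # prints: WARNING
print(classify_ping(10.0, 60))   # prints: CRITICAL
```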

The services above do not depend on the NRPE service on the client. In the same way, you can use ping or telnet from your own computer to check whether any remote machine is alive, or whether a given port or service is open. To check something local to the client, such as its load or disk usage, we need NRPE.
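
The idea of checking a remote port from the outside, as with telnet, can be sketched in a few lines of Python. This is only an illustration of the principle, not how check_ssh or check_http are implemented; port_is_open is a hypothetical helper.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example: port_is_open("192.168.0.12", 22) tests the SSH port of the client.
```

No agent is needed on the client for this kind of check, which is why the ping, SSH, and HTTP services work before NRPE is involved.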

4. Add more services

On the server, add the following command definition to /etc/nagios/objects/commands.cfg:

define command{
    command_name    check_nrpe      ; asks the remote host for the status of a service; the name can be customized
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
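
To see what this definition produces, the sketch below expands a check_command value such as check_nrpe!check_load into the final command line. It only imitates Nagios macro substitution in a simplified way; expand_check_command is a hypothetical helper, and the value used for $USER1$ is an assumption (on a real system it comes from the resource file).

```python
import re


def expand_check_command(check_command, command_line, macros):
    """Expand 'name!arg1!arg2' against a command_line containing $MACRO$ names."""
    _, *args = check_command.split("!")
    values = dict(macros)
    # Arguments after each "!" become $ARG1$, $ARG2$, ...
    for index, arg in enumerate(args, start=1):
        values[f"ARG{index}"] = arg
    # Replace every $NAME$ macro; unknown macros are left untouched.
    return re.sub(
        r"\$(\w+)\$",
        lambda match: values.get(match.group(1), match.group(0)),
        command_line,
    )


line = expand_check_command(
    "check_nrpe!check_load",
    "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$",
    {"USER1": "/usr/lib/nagios/plugins", "HOSTADDRESS": "192.168.0.12"},
)
print(line)  # prints: /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.12 -c check_load
```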

Continue editing the host's configuration file:

vim /etc/nagios/conf.d/192.168.0.12.cfg

Add the following:

define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_load
    check_command           check_nrpe!check_load
    max_check_attempts      5
    normal_check_interval   1
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_disk_hda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
}
define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_disk_hda2
    check_command           check_nrpe!check_hda2   ; must match the command name defined on the client
    max_check_attempts      5
    normal_check_interval   1
}

Note: in check_nrpe!check_load, check_nrpe is the command just defined in commands.cfg, and check_load is the name of a check command defined on the remote host.
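
Because the three NRPE services differ only in their description and remote command name, definitions like these can be generated rather than typed by hand. The following is a minimal sketch; SERVICE_TEMPLATE and render_services are hypothetical names, and the output would still need to be written to the host's configuration file and validated with nagios -v.

```python
SERVICE_TEMPLATE = """define service{{
    use                     generic-service
    host_name               {host}
    service_description     {description}
    check_command           check_nrpe!{nrpe_command}
    max_check_attempts      5
    normal_check_interval   1
}}
"""


def render_services(host, checks):
    """Render one service definition per (description, nrpe_command) pair."""
    return "\n".join(
        SERVICE_TEMPLATE.format(host=host, description=desc, nrpe_command=cmd)
        for desc, cmd in checks
    )


config_text = render_services(
    "192.168.0.12",
    [
        ("check_load", "check_load"),
        ("check_disk_hda1", "check_hda1"),
        ("check_disk_hda2", "check_hda2"),
    ],
)
print(config_text)
```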

On the client, open the NRPE configuration file (vim /etc/nagios/nrpe.cfg) and search for check_load. That line is the command the client runs when the server asks for check_load; we can also run it by hand on the client.
In the check_hda1 line, change /dev/hda1 to /dev/sda1.

Add another line:

command[check_hda2]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda2

Here -w is the warning threshold and -c is the critical threshold. The critical value must not be larger than the warning value.
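
The relationship between the two check_disk thresholds is easier to see in code. With -w 20% -c 10%, both values refer to free space, so the state gets worse as free space shrinks. This is a sketch of that logic only; classify_disk is a hypothetical helper, not the check_disk implementation.

```python
def classify_disk(free_pct, warn_free=20.0, crit_free=10.0):
    """Classify a partition by its percentage of free space."""
    if crit_free > warn_free:
        raise ValueError("critical threshold must not exceed the warning threshold")
    if free_pct < crit_free:
        return "CRITICAL"
    if free_pct < warn_free:
        return "WARNING"
    return "OK"


print(classify_disk(56.0))  # prints: OK
print(classify_disk(15.0))  # prints: WARNING
print(classify_disk(5.0))   # prints: CRITICAL
```

If the critical value were larger than the warning value, the WARNING state could never be reached, which is why the order matters.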

How it works: the check_nrpe command is defined on the server, and the name that follows check_nrpe selects a command defined in the client's nrpe.cfg.
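
Since the server only sends a command name and the client looks it up in nrpe.cfg, a mismatch between the two files is a common source of errors. The sketch below parses command[...] lines and reports names the server would request but the client does not define. parse_nrpe_commands and missing_commands are hypothetical helpers, and the sample check_load line is illustrative rather than taken from the tutorial.

```python
import re

COMMAND_RE = re.compile(r"^command\[(?P<name>[^\]]+)\]\s*=\s*(?P<line>.+)$")


def parse_nrpe_commands(text):
    """Return {name: command line} for every command[...] entry in nrpe.cfg text."""
    commands = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = COMMAND_RE.match(line)
        if match:
            commands[match.group("name")] = match.group("line").strip()
    return commands


def missing_commands(requested, nrpe_cfg_text):
    """List requested command names that the client configuration lacks."""
    defined = parse_nrpe_commands(nrpe_cfg_text)
    return [name for name in requested if name not in defined]


SAMPLE_CFG = """\
# illustrative excerpt of a client nrpe.cfg
allowed_hosts=127.0.0.1,192.168.0.11
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_hda2]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda2
"""

print(missing_commands(["check_load", "check_hda1", "check_hda2"], SAMPLE_CFG))
# prints: ['check_hda1']
```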

Restart the NRPE service on the client: service nrpe restart
Restart the Nagios service on the server as well: service nagios restart

5. Configure alerts

Edit the contacts file (vim /etc/nagios/objects/contacts.cfg) and add:

define contact{
    contact_name
    use                 generic-contact
    alias               Aming
    email               @qq.com
}
define contact{
    contact_name
    use                 generic-contact
    alias               AAA
    email               wsw@.com
}
# define the contact group
define contactgroup{
    contactgroup_name   common
    alias               common
    members             ,
}

Then add the contact group to each service that should send alerts:

define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_load
    check_command           check_nrpe!check_load
    max_check_attempts      5
    normal_check_interval   1
    contact_groups          common      ; contact group that receives the e-mail alerts
    notifications_enabled   1           ; whether notifications are enabled: 1 is on, 0 is off. This option is usually also set in the main configuration file (nagios.cfg), with the same effect
    notification_period     24x7        ; time period during which notifications are sent. I use 24x7 for very important hosts and services and working hours for ordinary ones. Outside the defined period no notification is sent, whatever the problem
    notification_options    w,u,c,r     ; service states that trigger a notification: w = warning, u = unknown, c = critical, r = recovery. The corresponding host states are d,u,r (d = down, u = unreachable, r = recovery) and belong in the host definition
}
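
The way notifications_enabled, notification_options, and notification_interval interact can be summarized in a short sketch. It is a simplification that ignores notification_period and the soft and hard state logic; should_notify and renotify_minutes are hypothetical helpers, not part of Nagios.

```python
STATE_LETTERS = {"WARNING": "w", "UNKNOWN": "u", "CRITICAL": "c", "RECOVERY": "r"}


def should_notify(state, notification_options, notifications_enabled=True):
    """Return True if a service state is selected by notification_options."""
    if not notifications_enabled:
        return False
    wanted = {option.strip() for option in notification_options.split(",")}
    return STATE_LETTERS.get(state) in wanted


def renotify_minutes(notification_interval, count):
    """Minutes after the first notification at which repeats would be sent."""
    if notification_interval == 0:
        return []  # 0 means notify only once
    return [notification_interval * i for i in range(1, count + 1)]


print(should_notify("CRITICAL", "w,u,c,r"))  # prints: True
print(should_notify("WARNING", "c,r"))       # prints: False
print(renotify_minutes(60, 3))               # prints: [60, 120, 180]
```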

6. Configure graphing with PNP4Nagios

(1) Installation

yum install pnp4nagios rrdtool

(2) Configure the main configuration file

Edit the main configuration file (vim /etc/nagios/nagios.cfg) and modify the following settings:

process_performance_data=
host_perfdata_command=process-host-perfdata
service_perfdata_command=process-service-perfdata
enable_environment_macros=

(3) Modify commands.cfg

Edit the commands file (vim /etc/nagios/objects/commands.cfg), comment out the existing definitions of process-host-perfdata and process-service-perfdata, and redefine them:

define command{
    command_name    process-service-perfdata
    command_line    /usr/bin/perl /usr/libexec/pnp4nagios/process_perfdata.pl
}
define command{
    command_name    process-host-perfdata
    command_line    /usr/bin/perl /usr/libexec/pnp4nagios/process_perfdata.pl -d HOSTPERFDATA
}
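
The data these commands hand to process_perfdata.pl is the performance data that plugins append to their output after a | character, in the form label=value[unit];warn;crit;min;max. The sketch below parses that format to show what ends up being graphed. parse_perfdata is a hypothetical helper that handles only unquoted labels, and the sample output line is illustrative.

```python
import re

PERF_RE = re.compile(
    r"(?P<label>[^=\s]+)=(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[^;\s\d]*)(?:;(?P<rest>\S*))?"
)


def parse_perfdata(plugin_output):
    """Return {label: fields} parsed from the part of the output after '|'."""
    _, _, perf = plugin_output.partition("|")
    metrics = {}
    for match in PERF_RE.finditer(perf):
        fields = (match.group("rest") or "").split(";")
        fields += [""] * (4 - len(fields))  # pad missing trailing fields
        warn, crit, minimum, maximum = fields[:4]
        metrics[match.group("label")] = {
            "value": float(match.group("value")),
            "uom": match.group("uom"),
            "warn": warn,
            "crit": crit,
            "min": minimum,
            "max": maximum,
        }
    return metrics


sample = "PING OK - Packet loss = 0%, RTA = 0.80 ms|rta=0.800000ms;100.000000;200.000000;0.000000 pl=0%;20;50;0"
metrics = parse_perfdata(sample)
print(metrics["rta"]["value"], metrics["rta"]["uom"])  # prints: 0.8 ms
print(metrics["pl"]["warn"], metrics["pl"]["crit"])    # prints: 20 50
```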

(4) Modify the configuration file templates.cfg

Edit the templates file (vim /etc/nagios/objects/templates.cfg) and add:

define host{
    name                hosts-pnp
    register            0
    action_url          /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
    process_perf_data   1
}
define service{
    name                srv-pnp
    register            0
    action_url          /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    process_perf_data   1
}

(5) Modify the host and service configuration

vim /etc/nagios/conf.d/192.168.0.12.cfg

Change

define host{
    use                     linux-server

to

define host{
    use                     linux-server,hosts-pnp

Modify the corresponding services in the same way. For example, change

define service{
    use                     generic-service
    host_name               192.168.0.12
    service_description     check_disk_hda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
}

to

define service{
    use                     generic-service,srv-pnp
    host_name               192.168.0.12
    service_description     check_disk_hda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
}
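
Listing two templates in the use directive means the object inherits from both. When both templates set the same directive, Nagios takes the value from the template listed first, and values set on the object itself override any template. The sketch below models that rule in a simplified form; resolve is a hypothetical helper, the template contents are made up for illustration, and nested templates are not handled.

```python
def resolve(obj, templates):
    """Merge directives: the object's own values win, then templates in 'use' order."""
    names = [name.strip() for name in obj.get("use", "").split(",") if name.strip()]
    merged = {}
    # Apply templates from last to first so earlier names override later ones.
    for name in reversed(names):
        merged.update(templates[name])
    merged.update({key: value for key, value in obj.items() if key != "use"})
    return merged


# Hypothetical template contents, chosen only to show the precedence.
TEMPLATES = {
    "generic-service": {"max_check_attempts": "3", "notes": "from generic-service"},
    "srv-pnp": {"notes": "from srv-pnp", "process_perf_data": "1"},
}
service = {
    "use": "generic-service,srv-pnp",
    "host_name": "192.168.0.12",
    "max_check_attempts": "5",
}

resolved = resolve(service, TEMPLATES)
print(resolved["notes"])               # prints: from generic-service
print(resolved["max_check_attempts"])  # prints: 5
print(resolved["process_perf_data"])   # prints: 1
```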

(6) Restart or start each service:

service nagios restart
service httpd restart
service npcd start

(7) Access test

There are two URLs to check:

http://ip/nagios/
http://ip/pnp4nagios/

That concludes this detailed tutorial on installing and using Nagios. I hope it helps.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
