Building a Nagios monitoring system on CentOS

Source: Internet
Author: User

Build Nagios monitoring under Linux

I. What is Nagios

1. Introduction to Nagios

Nagios is a monitoring system that tracks the operational status of systems and networks.

Nagios can monitor specified local or remote hosts and services, and it can send notifications when something goes wrong.

Nagios runs on Linux/Unix platforms and provides an optional browser-based web interface where system administrators can view network status, system problems, logs, and so on.

Nagios is a very popular, free and open source monitoring package for computers and networks.

The name Nagios stands for "Nagios Ain't Gonna Insist On Sainthood".

It was first released in 1999 under the name NetSaint. Nagios is primarily used for monitoring in Linux and Unix environments, but with plugins it can also monitor MS Windows hosts. In a poll at LinuxCon, Nagios was voted the most popular IT operations tool.

It was named best open source software by InfoWorld in 2009, and the SourceForge community chose it as the best systems management tool of the year.

Nagios is also used by many well-known companies, including AOL, DHL, AT&T, L'Oreal, Texas Instruments, Siemens COM CZ, Time Warner Cable, and Yahoo.


2. The main features of Nagios are:


- Monitoring of network services (SMTP, POP3, HTTP, NNTP, ping, etc.)

- Monitoring of host resources (processes, disks, etc.)

- A simple plugin design that makes it easy to extend Nagios's monitoring capabilities

- Parallel execution of service checks

- Problem notifications (via email, pager, or another user-defined method)

- Custom event handlers can be specified

- An optional browser-based web interface where system administrators can view network status, system problems, logs, etc.

- Monitoring information can be viewed from a mobile phone
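
The plugin design in the feature list rests on a simple convention: a plugin prints one line of status text and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The following is a minimal sketch of that convention in shell. The function name check_free_percent and its "at or below" threshold rule are made up for illustration and are not part of any Nagios package.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Nagios: classify a "percent free"
# value against warning/critical thresholds using the plugin exit-code
# convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
# Usage: check_free_percent FREE WARN CRIT   (all whole numbers)
check_free_percent() {
    free=$1 warn=$2 crit=$3
    # Anything that is not a non-negative integer is reported as UNKNOWN.
    case "$free" in
        ''|*[!0-9]*)
            echo "DISK UNKNOWN - invalid value '$free'"
            return 3 ;;
    esac
    if [ "$free" -le "$crit" ]; then
        echo "DISK CRITICAL - ${free}% free"
        return 2
    elif [ "$free" -le "$warn" ]; then
        echo "DISK WARNING - ${free}% free"
        return 1
    fi
    echo "DISK OK - ${free}% free"
    return 0
}
```

For example, check_free_percent 15 20 10 prints "DISK WARNING - 15% free" and returns 1, which Nagios would display as a WARNING state.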


II. Building the Nagios monitoring environment

1. Environment overview:


Role      Hostname     IP               System

Server    webserver    192.168.1.20     CentOS 6.6

Client    hpf-linux    192.168.1.110    CentOS 6.6


2. Basic server-side installation:

    # Skip this step if the EPEL repository is already installed on the machine
    yum install -y epel-release

    # Install the Nagios packages
    yum install -y httpd nagios nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe

    # Create the account and password for logging in to the Nagios web interface
    htpasswd -c /etc/nagios/passwd nagiosadmin
    # Prompts:
    #   New password:
    #   Re-type new password:
    #   Adding password for user nagiosadmin

    # Check the Nagios configuration file for errors
    nagios -v /etc/nagios/nagios.cfg
    # The output should end with:
    #   Total Warnings: 0
    #   Total Errors:   0
    #   Things look okay - No serious problems were detected during the pre-flight check



Start the web server and the Nagios service on the server:

    /etc/init.d/httpd start
    /etc/init.d/nagios start


Open http://ip/nagios in a browser (ip is the server's IP address) to check that Nagios was set up successfully.

Log in to the Nagios web interface with the account and password you just created.

Click Services to view the monitored services, and debug any service that is not in a normal state.

The HTTP service may show WARNING at first, with the message: HTTP WARNING: HTTP/1.1 403 Forbidden - 5152 bytes in 0.001 second response t

The reason is that when Nagios monitors HTTP, it requests the default page, which is the index.html file under /var/www/html/. If that file does not exist, the check reports a problem.

Create the file. Once it exists, the monitoring status changes to OK.
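
The fix described above can be sketched as a small shell function. The function name ensure_index and the placeholder page content are assumptions for illustration; only the document root /var/www/html comes from the text.

```shell
#!/bin/sh
# Hypothetical helper: create a placeholder index.html in a document root
# only if none exists, so the web server has a default page to serve.
# Usage: ensure_index [DOCROOT]   (defaults to /var/www/html)
ensure_index() {
    docroot=${1:-/var/www/html}
    if [ -e "$docroot/index.html" ]; then
        echo "exists: $docroot/index.html"
    else
        echo "Nagios HTTP check placeholder" > "$docroot/index.html" || return 1
        echo "created: $docroot/index.html"
    fi
}
```

Because an existing file is left untouched, the function is safe to run on a server that already has a real start page.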


3. Add a monitored host (adding a monitoring client)


Install the Nagios monitoring packages on the client and edit the configuration:

    # Skip this step if the EPEL repository is already installed on the client
    yum install -y epel-release

    # Install the Nagios monitoring packages
    yum install -y nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe

    vi /etc/nagios/nrpe.cfg

In nrpe.cfg, find "allowed_hosts=127.0.0.1" and change it to "allowed_hosts=127.0.0.1,192.168.1.20"; the added IP is the server's IP. Change "dont_blame_nrpe=0" to "dont_blame_nrpe=1". The two command entries below replace the default check_hda1 entry so that they match the monitoring services added on the Nagios server:

    command[check_sda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1
    command[check_sda2]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda2



Note: the command[check_sda...] entries must be present on both the monitoring side and the monitored side. After NRPE and Nagios are restarted, it takes a while before the disk checks that the Nagios web page marked as CRITICAL return to normal.
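
The nrpe.cfg edits for the client can also be scripted. The sketch below assumes the stock file contains the lines exactly as "allowed_hosts=127.0.0.1" and "dont_blame_nrpe=0"; the function name patch_nrpe_cfg is hypothetical, and it is worth trying it on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper: apply the nrpe.cfg edits from this section.
# Usage: patch_nrpe_cfg CONFIG_FILE NAGIOS_SERVER_IP
patch_nrpe_cfg() {
    cfg=$1 server_ip=$2
    tmp="$cfg.tmp.$$"
    # Allow the Nagios server and enable command arguments.
    sed -e "s/^allowed_hosts=127\.0\.0\.1\$/allowed_hosts=127.0.0.1,$server_ip/" \
        -e 's/^dont_blame_nrpe=0$/dont_blame_nrpe=1/' \
        "$cfg" > "$tmp" || return 1
    # Append the two disk checks unless they are already defined.
    for part in sda1 sda2; do
        grep -q "^command\[check_$part\]=" "$tmp" ||
            echo "command[check_$part]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/$part" >> "$tmp"
    done
    mv "$tmp" "$cfg"
}
```

On the client this would be called as patch_nrpe_cfg /etc/nagios/nrpe.cfg 192.168.1.20. Running it a second time changes nothing, because the patterns only match the unmodified lines and existing command entries are not duplicated.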



Configure Nagios on the server:

    vi /etc/nagios/objects/commands.cfg

Add the following to this configuration file:

    define command{
            command_name    check_nrpe
            command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
            }

Then create a configuration file for the client:

    cd /etc/nagios/conf.d/
    vi 192.168.1.110.cfg

    define host{
            use                     linux-server
            host_name               192.168.1.110
            alias                   1.110
            address                 192.168.1.110
            }

    define service{
            use                     generic-service
            host_name               192.168.1.110
            service_description     check_ping
            check_command           check_ping!100.0,20%!200.0,50%
            max_check_attempts      5
            normal_check_interval   1
            }

    define service{
            use                     generic-service
            host_name               192.168.1.110
            service_description     check_ssh
            check_command           check_ssh
            # When Nagios detects a problem, it checks 5 times in total before
            # raising an alarm; with a value of 1 it alarms as soon as the
            # problem is detected
            max_check_attempts      5
            # Interval between checks, in minutes; the default is 3 minutes
            normal_check_interval   1
            # If a service problem has not been resolved, the time after which
            # Nagios notifies the user again, in minutes. Set this to 0 if
            # every event needs only one notification.
            notification_interval   60
            }

    define service{
            use                     generic-service
            host_name               192.168.1.110
            service_description     check_http
            check_command           check_http
            max_check_attempts      5
            normal_check_interval   1
            }

    define service{
            use                     generic-service
            host_name               192.168.1.110
            service_description     check_load
            check_command           check_nrpe!check_load
            max_check_attempts      5
            normal_check_interval   1
            }

    define service{
            use                     generic-service
            host_name               192.168.1.110
            service_description     check_disk_sda1
            check_command           check_nrpe!check_sda1
            max_check_attempts      5
            normal_check_interval   1
            }

    define service{
            use                     generic-service
            host_name               192.168.1.110
            service_description     check_disk_sda2
            check_command           check_nrpe!check_sda2
            max_check_attempts      5
            normal_check_interval   1
            }

Check that the configuration is correct:

    nagios -v /etc/nagios/nagios.cfg
    # The output should end with:
    #   Total Warnings: 0
    #   Total Errors:   0
    #   Things look okay - No serious problems were detected during the pre-flight check
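
The NRPE-based service definitions for the client differ only in their description and the remote command name, so they can be generated instead of typed by hand. This is a rough sketch; the function name nrpe_service is hypothetical, and the fixed values (generic-service, 5 attempts, 1-minute interval) simply mirror the definitions in this section.

```shell
#!/bin/sh
# Hypothetical generator: print one NRPE-backed service definition in the
# same shape as the hand-written ones in this section.
# Usage: nrpe_service HOST DESCRIPTION REMOTE_COMMAND
nrpe_service() {
    cat <<EOF
define service{
        use                     generic-service
        host_name               $1
        service_description     $2
        check_command           check_nrpe!$3
        max_check_attempts      5
        normal_check_interval   1
        }
EOF
}
```

For example, nrpe_service 192.168.1.110 check_disk_sda1 check_sda1 >> /etc/nagios/conf.d/192.168.1.110.cfg appends one definition; run nagios -v /etc/nagios/nagios.cfg afterwards to validate the result.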


Start the NRPE service on the client:

    /etc/init.d/nrpe start


Restart the Nagios service on the server:

    /etc/init.d/nagios restart


Check in the browser that the services of the new host now appear in Nagios.



4. Configure Email Alerts:

    vim /etc/nagios/objects/contacts.cfg

    define contact{
            contact_name            nagios1
            use                     generic-contact
            alias                   mail1
            email                   [email protected]
            }

    define contact{
            contact_name            nagios2
            use                     generic-contact
            alias                   mail2
            email                   [email protected]
            }

    define contactgroup{
            contactgroup_name       common
            alias                   common
            members                 nagios1,nagios2
            }

Then edit the client's configuration file again:

    vi /etc/nagios/conf.d/192.168.1.110.cfg

The 192.168.1.110.cfg file created earlier contains the following section:

    define service{
            use                     generic-service
            host_name               192.168.1.110
            service_description     check_load
            check_command           check_nrpe!check_load
            max_check_attempts      5
            normal_check_interval   1
            }

Add the following four statements at the end of that section, before the closing brace:

            contact_groups          common
            # Whether notifications are enabled: 1 is on, 0 is off. This is
            # usually also set in the main configuration file (nagios.cfg),
            # with the same effect.
            notifications_enabled   1
            # The time period during which notifications are sent. I use 24x7
            # for very important hosts (services) and working hours for
            # ordinary ones. Outside the defined period no notification is
            # sent, no matter what happens.
            notification_period     24x7
            # The service states to notify on: w for warning, u for unknown,
            # c for critical, r for recovery.
            notification_options    w,u,c,r

A similar option exists for hosts, with the states d,u,r: d means the host is down, u means it is unreachable, and r means it has recovered to OK. It has to be added to the host definition.

Check the configuration for errors:

    nagios -v /etc/nagios/nagios.cfg
    # The output should end with:
    #   Total Warnings: 0
    #   Total Errors:   0
    #   Things look okay - No serious problems were detected during the pre-flight check

5. Verify that email alerts work:

Start the mail service on the virtual machine:

    # Install the mail service package
    yum install -y sendmail

    # Start the mail service
    /etc/init.d/sendmail start

    # Check which port the mail service is listening on
    netstat -lnp | grep sendmail
    # tcp    0    0 127.0.0.1:25    0.0.0.0:*    LISTEN    1011/sendmail

In the 163 mailbox web interface, configure a whitelist so that alert messages are not treated as spam.

On the client, stop the NRPE service to see whether the server sends alert emails:

    /etc/init.d/nrpe stop
    # Shutting down nrpe: [OK]



Alert emails are sent with some delay, so wait patiently.
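
While waiting for the email, the Nagios log on the server shows whether a notification was issued at all. The sketch below assumes notification entries of the form "[epoch] SERVICE NOTIFICATION: contact;host;service;state;command;output" and that the log is at /var/log/nagios/nagios.log; both are assumptions about the installed package, and the function name service_notifications is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the service notifications logged for one host.
# Prints "contact service state" for every matching log line.
# Usage: service_notifications LOG_FILE HOST
service_notifications() {
    log=$1 host=$2
    awk -F';' -v host="$host" '
        / SERVICE NOTIFICATION: / && $2 == host {
            # Field 1 is "[epoch] SERVICE NOTIFICATION: contact";
            # strip everything up to the contact name.
            sub(/^.*SERVICE NOTIFICATION: /, "", $1)
            print $1, $3, $4
        }' "$log"
}
```

On the server this could be run as service_notifications /var/log/nagios/nagios.log 192.168.1.110. An empty result means Nagios has not yet sent any service notification for that host, so the delay is on the Nagios side and not in mail delivery.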



This article is from the "clear" blog, make sure to keep this source http://duanyexuanmu.blog.51cto.com/1010786/1750019
