Nagios + InfluxDB + Grafana: Monitoring Data Visualization Process

Source: Internet
Author: User
Tags: grafana, influxdb

Nagios Introduction

Nagios is an open source monitoring application that can monitor the logs, resources, availability (up or down), and more of local and remote hosts, using the SNMP and NRPE protocols.

The Nagios configuration is edited in NConf and then deployed to the server with a click. NConf offers various templates, which can be either self-defined or taken from those already available.

For how to install Nagios, search online.

Here is an example of a Nagios configuration file

Nagios configuration file directory structure:

# ll /usr/local/nagios/etc/
total 152
-rw-rw-r-- 1 nagios nagios 12999 Apr 29 08:08 cgi.cfg
drwxr-xr-x 2 apache apache 4096 May 08:45 default_collector    <- directory that defines the monitoring items
drwxr-xr-x 2 apache apache 4096 May 08:38 global    <- global parameters and check command definitions
-rw-r--r-- 1 nagios nagios Apr 08:12 htpasswd.users
-rw-rw-r-- 1 nagios nagios 46169 May 3 01:37 nagios.cfg    <- Nagios main configuration file
-rw-rw-r-- 1 root root 44816 Apr 08:58 nagios.cfg.29-04-18
-rw-r--r-- 1 root root 2358 2 10:48 nagiosconfig.tgz
drwxr-xr-x 2 nagios nagios 4096 May 8 11:59 nrpe
-rw-r--r-- 1 nagios nagios 7217 Apr 08:45 nrpe.cfg    <- NRPE communication configuration
-rwxr-xr-x 1 nagios nagios 7217 Apr 08:38 nrpe.cfg.save
drwxrwxr-x 2 nagios nagios 4096 Apr 02:19 objects
-rw-rw---- 1 nagios nagios 1312 Apr 29 08:08 resource.cfg

# vim /usr/local/nagios/etc/default_collect/services.cfg

define service {
    service_description      check_proc_nagios    <- name of the monitoring item as shown on the page
    check_command            check_remote_procs!nagios!8!1:8    <- the check command to execute; this is the core of the monitoring item, and "!" is the argument delimiter
    host_name                host name    <- which server to monitor; the Nagios server must be able to communicate with it
    check_period             24x7    <- time period to monitor
    contact_groups           +admins    <- contact group, that is, who is notified on errors
    event_handler_enabled    0
    use                      <- which existing template (configured in NConf) to use
}

# vim /usr/local/nagios/etc/default_collect/hosts.cfg

define host {
    host_name              **********    <- host name, mainly displayed on the monitoring page
    alias                  ***サーバ    <- alias
    address                ***-b-**-***-**.stage-***-org.fastretailing.cn    <- address used for communication with the host
    _graphiteprefix        graphite_host
    icon_image_alt         Linux
    icon_image             base/linux40.gif
    statusmap_image        base/linux40.gd2
    check_command          check-host-alive
    check_period           24x7
    notification_period    24x7
    contact_groups         +admins
    use                    generic-host,linux-server
}

# vim /usr/local/nagios/etc/global/checkcommands.cfg

define command {
    command_name    check_local_disk
    command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}
<- The "!"-separated values of check_command in services.cfg are passed in order as $ARG1$, $ARG2$, $ARG3$.

define command {
    command_name    check_remote_procs
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_procs -a "-C $ARG1$ -w $ARG2$ -c $ARG3$"    <- monitors processes on a remote host
}


With the above, the monitoring items are configured.
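
The way the "!"-separated values of check_command end up in $ARG1$, $ARG2$ and $ARG3$ can be sketched in Python. This is only an illustration of the substitution rule, not how Nagios implements it; the host address 192.0.2.10 and the value of $USER1$ are assumptions made for the example.

```python
import re

# Command definition as in checkcommands.cfg.
COMMANDS = {
    "check_remote_procs": (
        '$USER1$/check_nrpe -H $HOSTADDRESS$ -c check_procs '
        '-a "-C $ARG1$ -w $ARG2$ -c $ARG3$"'
    ),
}
# Assumed macro values, chosen only for this example.
MACROS = {"USER1": "/usr/local/nagios/libexec", "HOSTADDRESS": "192.0.2.10"}


def expand_check_command(check_command, commands, macros):
    """Split check_command on "!" and substitute $ARGn$ and other $MACRO$ names."""
    name, *args = check_command.split("!")
    values = dict(macros)
    for position, arg in enumerate(args, start=1):
        values[f"ARG{position}"] = arg
    # Unknown macros are left untouched so that mistakes stay visible.
    return re.sub(
        r"\$(\w+)\$",
        lambda m: values.get(m.group(1), m.group(0)),
        commands[name],
    )


print(expand_check_command("check_remote_procs!nagios!8!1:8", COMMANDS, MACROS))
```

Here "nagios" becomes the process name passed to check_procs with -C, "8" the warning threshold and "1:8" the critical range.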

InfluxDB Introduction

InfluxDB is a time series database.

Time series data is a series of data points indexed by time. Plotted against a time axis, these points form lines that can be turned into multi-dimensional reports revealing trends and regularities. Looking ahead, the data can also feed big data analysis and machine learning to enable forecasting and early warning.

A time series database is a database that stores time series data. It needs to support fast writes, persistence, and multi-dimensional aggregation queries.

A traditional database typically records only the current value of the data, whereas a time series database records all historical data. Queries on time series data also always use time as a filter condition.
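
As a rough illustration of what "time as a filter condition" and "aggregation" mean, the following Python sketch filters a handful of points by a time range and averages them per host. The sample points and host names are made up, and a real time series database does this with its own query language and storage engine rather than in application code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (unix timestamp, host, load1 value).
POINTS = [
    (1528256100, "web01", 0.50),
    (1528256160, "web01", 1.50),
    (1528256160, "db01", 3.00),
    (1528256220, "web01", 4.00),
    (1528256220, "db01", 1.00),
]


def mean_by_host(points, start, end):
    """Average the values per host for timestamps in [start, end)."""
    grouped = defaultdict(list)
    for timestamp, host, value in points:
        if start <= timestamp < end:  # time is always the filter condition
            grouped[host].append(value)
    return {host: mean(values) for host, values in grouped.items()}


print(mean_by_host(POINTS, 1528256100, 1528256220))
```

All historical points are kept, and each query selects the time window it needs.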

InfluxDB Main Configuration

[meta]
  # Where the metadata/raft database is stored
  dir = "/var/lib/influxdb/meta"

[data]
  # The directory where the TSM storage engine stores TSM files.    <- final data
  dir = "/var/lib/influxdb/data"

  # The directory where the TSM storage engine stores WAL files.    <- write-ahead log (WAL) data
  wal-dir = "/var/lib/influxdb/wal"

How Nagios stores data in InfluxDB

Nagios passes its monitoring data through Graphios to InfluxDB so that Grafana can visualize it later.

Process

1. Nagios writes the output of the commands it executes to a file. The file is then moved with mv into a spool directory.

The specific configuration is as follows:

# vim /usr/local/nagios/etc/global/misccommands.cfg
define command {
    command_name    graphios_perf_host
    command_line    /bin/mv /usr/local/nagios/var/host-perfdata /var/spool/nagios/graphios/host-perfdata.$TIMET$    <- the mv command that is executed
}
define command {
    command_name    graphios_perf_service
    command_line    /bin/mv /usr/local/nagios/var/service-perfdata /var/spool/nagios/graphios/service-perfdata.$TIMET$    <- the mv command that is executed
}

The configuration file above runs mv /usr/local/nagios/var/host-perfdata /var/spool/nagios/graphios/host-perfdata.$TIMET$, and the equivalent command for service-perfdata.
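
The effect of that mv, a rename into the spool directory with the timestamp appended, can be sketched in Python as follows. The function name rotate_perfdata is hypothetical, and the demonstration uses a temporary directory instead of the real Nagios paths.

```python
import shutil
import tempfile
import time
from pathlib import Path


def rotate_perfdata(source, spool_dir, timet=None):
    """Move source to spool_dir/<name>.<timet>, like the mv command line does."""
    source = Path(source)
    timet = int(time.time()) if timet is None else timet
    target = Path(spool_dir) / f"{source.name}.{timet}"
    shutil.move(str(source), str(target))
    return target


# Demonstration in a temporary directory rather than the real Nagios paths.
with tempfile.TemporaryDirectory() as tmp:
    var_dir, spool_dir = Path(tmp, "var"), Path(tmp, "spool")
    var_dir.mkdir()
    spool_dir.mkdir()
    (var_dir / "service-perfdata").write_text("DATATYPE::SERVICEPERFDATA\n")
    moved = rotate_perfdata(var_dir / "service-perfdata", spool_dir, timet=1528256154)
    print(moved.name)  # service-perfdata.1528256154
```

Because the source file is moved away each time, Nagios starts a fresh perfdata file afterwards, and every spooled file carries the time at which it was moved.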

This raises the following questions:

1. How is the source file of the mv generated?

2. What does the source file look like?

3. Where is it configured?

To answer these questions, look at the following configuration file.

First look at the main Nagios configuration file, nagios.cfg. It answers question 3: where the source file is configured.

# vim /usr/local/nagios/etc/nagios.cfg
process_performance_data=1

service_perfdata_file=/usr/local/nagios/var/service-perfdata
<- This line specifies the source file of the mv, that is, the file to which Nagios writes the results of the check commands it executes.

service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$\tGRAPHITEPREFIX::$_SERVICEGRAPHITEPREFIX$\tGRAPHITEPOSTFIX::$_SERVICEGRAPHITEPOSTFIX$\tMETRICTYPE::$_SERVICEMETRICTYPE$
<- This line sets the output format of the file, in preparation for Graphios storing the data in InfluxDB.

service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=graphios_perf_service

host_perfdata_file=/usr/local/nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tGRAPHITEPREFIX::$_HOSTGRAPHITEPREFIX$\tGRAPHITEPOSTFIX::$_HOSTGRAPHITEPOSTFIX$\tMETRICTYPE::$_HOSTMETRICTYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=graphios_perf_host
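
To make the template easier to read, here is a small Python sketch of how such a template turns into one line of the perfdata file. It uses a shortened template and made-up macro values (the host name web01 is hypothetical), and it only imitates the substitution; it is not Nagios code. Macros without a value are left as they are, which matches the literal $_SERVICEMETRICTYPE$ visible in the sample file in this article.

```python
import re

# Shortened version of service_perfdata_file_template.
# "\\t" is the literal backslash-t sequence as written in nagios.cfg.
TEMPLATE = (
    "DATATYPE::SERVICEPERFDATA\\tTIMET::$TIMET$\\tHOSTNAME::$HOSTNAME$"
    "\\tSERVICEDESC::$SERVICEDESC$\\tSERVICEPERFDATA::$SERVICEPERFDATA$"
    "\\tMETRICTYPE::$_SERVICEMETRICTYPE$"
)


def render(template, macros):
    """Turn the literal \\t into tabs and replace each $MACRO$ that has a value."""
    line = template.replace("\\t", "\t")
    return re.sub(r"\$(\w+)\$", lambda m: macros.get(m.group(1), m.group(0)), line)


# Assumed values for one check result.
values = {
    "TIMET": "1528256140",
    "HOSTNAME": "web01",
    "SERVICEDESC": "load",
    "SERVICEPERFDATA": "load1=0.000;2.000;4.000;0;",
}
for field in render(TEMPLATE, values).split("\t"):
    print(field)
```

Each printed field has the form KEY::value, which is the shape Graphios expects to read.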

With the basic configuration understood, the remaining questions are: 1. How is the source file generated? 2. What does the source file look like?

Generating the source file is simple. The monitoring data that Nagios shows on its page actually comes from commands that Nagios itself executes. The results are returned to the page so that we can see whether a service is OK or down.

# /usr/local/nagios/libexec/check_nrpe -H 10.**.58.*** -c check_disk -a "-w 20% -c 10%"
DISK OK - free space: /dev 1873 MB (99.99% inode=100%); /dev/shm 1882 MB (100.00% inode=100%); / 499479 MB (99.15% inode=100%);| /dev=0MB;1498;1685;0;1873 /dev/shm=0MB;1505;1693;0;1882 /=4259MB;403068;453452;0;503836

The above is the command output. It is written to /usr/local/nagios/var/service-perfdata, as configured in nagios.cfg above.

Looking closely, nagios.cfg above also contains the setting service_perfdata_file_template, which is a very long line. It defines the format in which the output above is written to the file.

Here is an example of the formatted result (in the file, the fields are separated by tabs):
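
The part of the plugin output after the "|" is the performance data, a list of label=value[unit];warn;crit;min;max items. A minimal Python sketch of how such a string can be split into metrics is shown below. It is an illustration under simplifying assumptions (for example, it does not handle quoted labels) and is not the parser that Nagios or Graphios actually uses.

```python
import re

OUTPUT = (
    "DISK OK - free space: /dev 1873 MB (99.99% inode=100%);"
    "| /dev=0MB;1498;1685;0;1873 /dev/shm=0MB;1505;1693;0;1882"
    " /=4259MB;403068;453452;0;503836"
)

# label=value[unit], optionally followed by ;warn;crit;min;max
PERF_ITEM = re.compile(
    r"(?P<label>[^=\s]+)=(?P<value>-?[\d.]+)(?P<unit>[^;\s]*)(?P<rest>\S*)"
)


def parse_perfdata(output):
    """Return {label: {...}} for the performance data after the "|" separator."""
    _, _, perfdata = output.partition("|")
    metrics = {}
    for match in PERF_ITEM.finditer(perfdata):
        # "rest" is empty or starts with ";", so drop that first separator.
        thresholds = match["rest"][1:].split(";")
        thresholds += [""] * (4 - len(thresholds))
        warn, crit, minimum, maximum = thresholds[:4]
        metrics[match["label"]] = {
            "value": float(match["value"]), "unit": match["unit"],
            "warn": warn, "crit": crit, "min": minimum, "max": maximum,
        }
    return metrics


for label, metric in parse_perfdata(OUTPUT).items():
    print(label, metric["value"], metric["unit"])
```

The text before the "|" is meant for humans; only the part after it carries the numbers that are later charted.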

# vim /var/spool/nagios/graphios/service-perfdata.1528256154
DATATYPE::SERVICEPERFDATA    TIMET::1528256140    HOSTNAME::mng01-a-bjn-grafana-ariake2spl-fr    SERVICEDESC::load    SERVICEPERFDATA::load1=0.000;2.000;4.000;0; load5=0.000;2.000;4.000;0; load15=0.000;2.000;4.000;0;    SERVICECHECKCOMMAND::check_remote_load!    HOSTSTATE::UP    HOSTSTATETYPE::HARD    SERVICESTATE::OK    SERVICESTATETYPE::HARD    GRAPHITEPREFIX::grafana    GRAPHITEPOSTFIX::load    METRICTYPE::$_SERVICEMETRICTYPE$
DATATYPE::SERVICEPERFDATA    TIMET::1528256145    HOSTNAME::mng01-a-bjn-logaggregator-ariake2spl-fr    SERVICEDESC::swap-usage    SERVICEPERFDATA::swap=2047MB;1433;1023;0;2047    SERVICECHECKCOMMAND::check_remote_swap!    HOSTSTATE::UP    HOSTSTATETYPE::HARD    SERVICESTATE::OK    SERVICESTATETYPE::HARD    GRAPHITEPREFIX::logaggregator    GRAPHITEPOSTFIX::swap    METRICTYPE::$_SERVICEMETRICTYPE$
DATATYPE::SERVICEPERFDATA    TIMET::1528256146    HOSTNAME::mng01-a-bjn-jenkins-ariake2spl-fr    SERVICEDESC::swap-usage    SERVICEPERFDATA::swap=2047MB;1433;1023;0;2047    SERVICECHECKCOMMAND::check_remote_swap!    HOSTSTATE::UP    HOSTSTATETYPE::HARD    SERVICESTATE::OK    SERVICESTATETYPE::HARD    GRAPHITEPREFIX::jenkins    GRAPHITEPOSTFIX::swap    METRICTYPE::$_SERVICEMETRICTYPE$

Seeing the file above raises a question: why convert the data to this format?

The reason is that getting Nagios data into InfluxDB requires Graphios to act as the carrier. Graphios is written in Python, and part of its code is designed to read data in exactly this format.

The Graphios code can be viewed at https://github.com/shawn-sterling/graphios/blob/master/graphios.py

The next step is simple: enable the following settings in the Graphios configuration.
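
To give an idea of why the KEY::value format is convenient to read, here is a minimal Python sketch that splits one line of the spool file into a dictionary. It is written for this article as an illustration and is not taken from graphios.py; the function name parse_perfdata_line is hypothetical.

```python
def parse_perfdata_line(line):
    """Split a tab-separated line of KEY::value fields into a dict."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        key, separator, value = field.partition("::")
        if not separator:
            continue  # skip anything that is not a KEY::value pair
        if value.startswith("$_"):
            continue  # custom macro left unexpanded, so no value was set
        record[key] = value
    return record


LINE = "\t".join([
    "DATATYPE::SERVICEPERFDATA", "TIMET::1528256140",
    "HOSTNAME::mng01-a-bjn-grafana-ariake2spl-fr", "SERVICEDESC::load",
    "SERVICEPERFDATA::load1=0.000;2.000;4.000;0; load5=0.000;2.000;4.000;0;",
    "GRAPHITEPREFIX::grafana", "GRAPHITEPOSTFIX::load",
    "METRICTYPE::$_SERVICEMETRICTYPE$",
])
record = parse_perfdata_line(LINE)
print(record["HOSTNAME"], record["GRAPHITEPREFIX"], "METRICTYPE" in record)
```

With the tab as field separator and "::" as key separator, the spaces inside SERVICEPERFDATA do not interfere with the parsing.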

enable_influxdb09 = true

# Extra tags to add to metrics, like data center location etc.
# only valid for 0.9
#influxdb_extra_tags = {"location": "la"}

# Comma separated list of server:ports
# defaults to 127.0.0.1:8086 (:8087 if using SSL).
influxdb_servers = 127.0.0.1:9096

# SSL, defaults to False
#influxdb_use_ssl = true

# Database-name, defaults to nagios
influxdb_db = nagios

# Credentials (required)
influxdb_user = Influxdb
influxdb_password = Influxdb

Once the configuration above is in place, start the services:

# /etc/init.d/graphios start

# /etc/init.d/nagios start

To summarize the configuration above, the overall flow is Nagios -> Graphios -> InfluxDB.

Finally, generate the charts in Grafana. For how to create charts, search online.
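
For the last hop of this flow, a metric has to be turned into something InfluxDB accepts. The Python sketch below formats one value in InfluxDB line protocol (measurement,tags fields timestamp). It is an assumption-laden illustration and not Graphios's own code: the measurement name built from prefix, postfix and label, the tag name hostname, and the field name value are all choices made for this example. The timestamp is in seconds, so a write request would have to declare second precision.

```python
def escape_tag(text):
    """Escape the characters that are special in tag keys and values."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(measurement, tags, value, timet):
    """Build one line: measurement,tag=value,... value=<float> <timestamp>."""
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(
        f",{escape_tag(key)}={escape_tag(val)}" for key, val in sorted(tags.items())
    )
    return f"{name}{tag_part} value={float(value)} {int(timet)}"


print(to_line_protocol(
    "grafana.load.load1",  # assumed naming: prefix.postfix.label
    {"hostname": "mng01-a-bjn-grafana-ariake2spl-fr"},
    "0.000",
    1528256140,
))
```

Tags such as the host name are what Grafana queries can later filter and group by.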
