Nagios Introduction
Nagios is an open source monitoring application that can monitor logs, resources, whether hosts are up or down, and much more on local and remote hosts, using the SNMP and NRPE protocols.
The Nagios configuration is edited in NConf and then deployed to the server with a click. NConf provides various templates, which can be custom-defined or used as shipped.
For how to install Nagios, search online (for example on Baidu).
Here is an example of a Nagios configuration file
Nagios configuration file directory structure:
# ll /usr/local/nagios/etc/
total 152
-rw-rw-r-- 1 nagios nagios 12999 Apr 29 08:08 cgi.cfg
drwxr-xr-x 2 apache apache 4096 May 08:45 default_collector    <- directory that defines the monitoring items
drwxr-xr-x 2 apache apache 4096 May 08:38 global    <- global parameters and check command definitions
-rw-r--r-- 1 nagios nagios Apr 08:12 htpasswd.users
-rw-rw-r-- 1 nagios nagios 46169 May 3 01:37 nagios.cfg    <- Nagios main configuration file
-rw-rw-r-- 1 root root 44816 Apr 08:58 nagios.cfg.29-04-18
-rw-r--r-- 1 root root 2358 2 10:48 nagiosconfig.tgz
drwxr-xr-x 2 nagios nagios 4096 May 8 11:59 nrpe
-rw-r--r-- 1 nagios nagios 7217 Apr 08:45 nrpe.cfg    <- NRPE communication configuration
-rwxr-xr-x 1 nagios nagios 7217 Apr 08:38 nrpe.cfg.save
drwxrwxr-x 2 nagios nagios 4096 Apr 02:19 objects
-rw-rw---- 1 nagios nagios 1312 Apr 29 08:08 resource.cfg
# vim /usr/local/nagios/etc/default_collect/services.cfg
define service {
    service_description      check_proc_nagios    <- name of the monitoring item as shown on the page, defined here
    check_command            check_remote_procs!nagios!8!1:8    <- the check command to execute; this is the core of the monitoring, since a single command decides whether the item is OK; "!" is the parameter delimiter
    host_name                host name    <- which server to monitor; the Nagios server must be able to communicate with it
    check_period             24x7    <- time period to monitor
    contact_groups           +admins    <- contact group, i.e. who errors are sent to
    event_handler_enabled    0
    use                      (template name)    <- which existing template (configured in NConf) to use
}
# vim /usr/local/nagios/etc/default_collect/hosts.cfg
define host {
    host_name              **********    <- host name, mainly what is displayed on the monitoring page
    alias                  ***サーバ    <- alias
    address                ***-b-**-***-**.stage-***-org.fastretailing.cn    <- address (IP address or hostname) used for communication
    _graphiteprefix        graphite_host
    icon_image_alt         Linux
    icon_image             base/linux40.gif
    statusmap_image        base/linux40.gd2
    check_command          check-host-alive
    check_period           24x7
    notification_period    24x7
    contact_groups         +admins
    use                    generic-host,linux-server
}
# vim /usr/local/nagios/etc/global/checkcommands.cfg
define command {
    command_name    check_local_disk
    command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$    <- the parameters given to check_command in services.cfg, separated by "!", fill $ARG1$, $ARG2$, ... in order
}
--------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------
define command {
    command_name    check_remote_procs
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_procs -a "-C $ARG1$ -w $ARG2$ -c $ARG3$"    <- this command monitors processes on a remote host
}
With the above in place, the monitoring item is fully configured.
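To make the "!" delimiter concrete, here is a minimal Python sketch of how a check_command value could be expanded into a full command line. It is only an illustration, not Nagios's own code: the function name expand_command, the example address 192.0.2.10, and the choice to leave unknown macros untouched are all assumptions.

```python
import re

def expand_command(check_command, command_line, macros):
    """Fill $ARGn$ and other $NAME$ macros in a Nagios-style command line."""
    # "check_remote_procs!nagios!8!1:8" -> command name plus ["nagios", "8", "1:8"]
    name, *args = check_command.split("!")
    values = dict(macros)
    for i, arg in enumerate(args, start=1):
        values[f"ARG{i}"] = arg
    # Replace each $NAME$ with its value; unknown macros are left as written (sketch choice).
    expanded = re.sub(r"\$(\w+)\$",
                      lambda m: values.get(m.group(1), m.group(0)),
                      command_line)
    return name, expanded

command_line = ('$USER1$/check_nrpe -H $HOSTADDRESS$ -c check_procs '
                '-a "-C $ARG1$ -w $ARG2$ -c $ARG3$"')
name, line = expand_command(
    "check_remote_procs!nagios!8!1:8",
    command_line,
    # Example values only; $USER1$ normally comes from resource.cfg.
    {"USER1": "/usr/local/nagios/libexec", "HOSTADDRESS": "192.0.2.10"},
)
print(name)
print(line)
```

With these inputs the three values after the command name land in $ARG1$, $ARG2$ and $ARG3$, so the remote check_procs call receives -C nagios -w 8 -c 1:8.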
InfluxDB Introduction
InfluxDB is a time series database.
Time series data is a sequence of data points indexed by time. Plotted against a time axis, the points form a line, and they can be turned into multi-dimensional reports that reveal trends and regularities. Looking ahead, the same data can feed big data analysis and machine learning to produce forecasts and early warnings.
A time series database is a database that stores time series data. It needs to support fast writes, persistence, and multi-dimensional aggregation queries.
A traditional database typically records only the current value of the data, whereas a time series database records all historical values. Queries on time series data also almost always use time as a filter condition.
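As a toy illustration of "keep every historical point and filter by time", the following Python sketch stores points in time order and aggregates over a time window. It is not how InfluxDB is implemented; the TimeSeries class and its methods are hypothetical names, and the sketch assumes points arrive in time order.

```python
from bisect import bisect_left, bisect_right

class TimeSeries:
    """Append-only series of (timestamp, value) points kept in time order."""

    def __init__(self):
        self.times = []
        self.values = []

    def write(self, timestamp, value):
        # Assumes non-decreasing timestamps, so appending keeps both lists sorted.
        self.times.append(timestamp)
        self.values.append(value)

    def mean(self, start, end):
        """Average of all values with start <= timestamp <= end, or None if empty."""
        lo = bisect_left(self.times, start)
        hi = bisect_right(self.times, end)
        window = self.values[lo:hi]
        return sum(window) / len(window) if window else None

load = TimeSeries()
for t, v in [(1528256140, 0.0), (1528256155, 0.5),
             (1528256170, 1.0), (1528256185, 1.5)]:
    load.write(t, v)

# Only the points at 1528256155 and 1528256170 fall inside this window.
print(load.mean(1528256150, 1528256180))
```

Because the timestamps are sorted, the time filter is a binary search rather than a full scan, which is the kind of access pattern a time series database is built around.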
InfluxDB Main Configuration
[meta]
  # Where the metadata/raft database is stored
  dir = "/var/lib/influxdb/meta"

[data]
  # The directory where the TSM storage engine stores TSM files.    <- final data
  dir = "/var/lib/influxdb/data"
  # The directory where the TSM storage engine stores WAL files.    <- write-ahead log data
  wal-dir = "/var/lib/influxdb/wal"
How Nagios stores data in InfluxDB
Nagios passes its monitoring data through Graphios into InfluxDB, so that Grafana can visualize it later.
Process
1. Nagios writes the output of the commands it executes to a file, then moves that file (mv) into a spool directory.
The specific configuration is as follows:
# vim /usr/local/nagios/etc/global/misccommands.cfg
define command {
    command_name    graphios_perf_host
    command_line    /bin/mv /usr/local/nagios/var/host-perfdata /var/spool/nagios/graphios/host-perfdata.$TIMET$    <- the mv command that is executed
}
define command {
    command_name    graphios_perf_service
    command_line    /bin/mv /usr/local/nagios/var/service-perfdata /var/spool/nagios/graphios/service-perfdata.$TIMET$    <- the mv command that is executed
}
The key part of the configuration above is the command mv /usr/local/nagios/var/host-perfdata /var/spool/nagios/graphios/host-perfdata.$TIMET$
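The effect of that mv can be sketched in Python: the perfdata file is moved into the spool directory under a name that ends in the current epoch time, which is what $TIMET$ expands to. The function name spool_perfdata is hypothetical, and the demo uses a temporary directory rather than the real Nagios paths.

```python
import os
import shutil
import tempfile
import time

def spool_perfdata(source, spool_dir, prefix, now=None):
    """Move `source` to `spool_dir/prefix.<epoch seconds>` and return the new path."""
    timet = int(time.time() if now is None else now)  # stands in for $TIMET$
    target = os.path.join(spool_dir, f"{prefix}.{timet}")
    shutil.move(source, target)
    return target

# Demo in a temporary directory instead of /usr/local/nagios/var and /var/spool.
base = tempfile.mkdtemp()
spool = os.path.join(base, "graphios")
os.mkdir(spool)
source = os.path.join(base, "service-perfdata")
with open(source, "w") as f:
    f.write("DATATYPE::SERVICEPERFDATA\tTIMET::1528256140\n")

moved = spool_perfdata(source, spool, "service-perfdata", now=1528256154)
print(os.path.basename(moved))   # file name carries the timestamp suffix
print(os.path.exists(source))    # the original file is gone after the move
```

Moving rather than copying means the original path is free for the next batch of results, and the timestamp suffix keeps successive spool files from overwriting each other.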
This raises a few questions:
1. How is the source file of the mv generated?
2. What does the source file look like?
3. Where is the source file configured?
The configuration files that follow answer these questions.
First look at Nagios's main configuration file, nagios.cfg, which answers question 3: where the source file is configured.
# vim /usr/local/nagios/etc/nagios.cfg
process_performance_data=1
service_perfdata_file=/usr/local/nagios/var/service-perfdata    <- the source file of the mv: the results of the check commands Nagios executes are written to the file specified here
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$\tGRAPHITEPREFIX::$_SERVICEGRAPHITEPREFIX$\tGRAPHITEPOSTFIX::$_SERVICEGRAPHITEPOSTFIX$\tMETRICTYPE::$_SERVICEMETRICTYPE$
    <- the line above defines the format of the output file, preparing it for Graphios to store in InfluxDB
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=graphios_perf_service
host_perfdata_file=/usr/local/nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tGRAPHITEPREFIX::$_HOSTGRAPHITEPREFIX$\tGRAPHITEPOSTFIX::$_HOSTGRAPHITEPOSTFIX$\tMETRICTYPE::$_HOSTMETRICTYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=graphios_perf_host
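To show what a perfdata file template does, here is a small Python sketch that renders a shortened version of the service template into one tab-separated record. It only imitates the idea; the render function, the host name web01, and the handling of unknown macros (left as written) are assumptions, not Nagios behavior taken from its source.

```python
import re

# Shortened version of service_perfdata_file_template, kept as a raw string
# so that \t stays a literal backslash-t, as it is written in nagios.cfg.
TEMPLATE = (
    r"DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$"
    r"\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$"
)

def render(template, macros):
    """Turn literal \\t sequences into tabs, then expand $NAME$ macros."""
    with_tabs = template.replace(r"\t", "\t")
    # Unknown macros are left as written in this sketch.
    return re.sub(r"\$(\w+)\$",
                  lambda m: macros.get(m.group(1), m.group(0)),
                  with_tabs)

line = render(TEMPLATE, {
    "TIMET": "1528256140",
    "HOSTNAME": "web01",  # hypothetical host name
    "SERVICEDESC": "load",
    "SERVICEPERFDATA": "load1=0.000;2.000;4.000;0;",
})
for field in line.split("\t"):
    print(field)
```

Each printed field has the KEY::value shape, and the tab separator matters because the performance data itself may contain spaces.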
With the basic configuration understood, the remaining questions are: 1. How is the source file generated? 2. What does the source file look like?
Generating the source file is simple. The data Nagios shows on its pages comes from commands that Nagios itself executes; the results are returned to the page, giving us a view of whether each item is OK or down. For example:
# /usr/local/nagios/libexec/check_nrpe -H 10.**.58.*** -c check_disk -a "-w 20% -c 10%"
DISK OK - free space: /dev 1873 MB (99.99% inode=100%); /dev/shm 1882 MB (100.00% inode=100%); / 499479 MB (99.15% inode=100%);| /dev=0MB;1498;1685;0;1873 /dev/shm=0MB;1505;1693;0;1882 /=4259MB;403068;453452;0;503836
The above is the command output. It is written to /usr/local/nagios/var/service-perfdata, as configured in nagios.cfg above.
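The part of the plugin output after "|" is the performance data, where each item has the shape label=value[unit];warn;crit;min;max. The following Python sketch splits such output apart. It is a simplified reader written for this note (the function name parse_plugin_output is hypothetical) and it does not handle quoted labels that contain spaces.

```python
import re

def parse_plugin_output(output):
    """Split plugin output at '|' and parse the performance data items."""
    text, _, perfdata = output.partition("|")
    metrics = {}
    for item in perfdata.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        # First field is the value with an optional unit, e.g. "4259MB".
        match = re.match(r"(-?\d+(?:\.\d+)?)(.*)$", fields[0])
        if match is None:
            continue  # skip values this sketch cannot read
        metrics[label] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            "warn_crit_min_max": fields[1:],  # trailing fields may be missing
        }
    return text.strip(), metrics

output = ("DISK OK - free space: / 499479 MB (99.15% inode=100%);"
          "| /dev=0MB;1498;1685;0;1873 /=4259MB;403068;453452;0;503836")
text, metrics = parse_plugin_output(output)
print(text)
print(metrics["/"])
```

Only the text before "|" is meant for people; the labelled numbers after it are what ends up as metrics further down the pipeline.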
Look closely, though, and you will find the service_perfdata_file_template setting in nagios.cfg above, a very long line. This setting formats the output shown above.
Here is an example of the formatted result:
# vim /var/spool/nagios/graphios/service-perfdata.1528256154
DATATYPE::SERVICEPERFDATA TIMET::1528256140 HOSTNAME::mng01-a-bjn-grafana-ariake2spl-fr SERVICEDESC::load SERVICEPERFDATA::load1=0.000;2.000;4.000;0; load5=0.000;2.000;4.000;0; load15=0.000;2.000;4.000;0; SERVICECHECKCOMMAND::check_remote_load! HOSTSTATE::UP HOSTSTATETYPE::HARD SERVICESTATE::OK SERVICESTATETYPE::HARD GRAPHITEPREFIX::grafana GRAPHITEPOSTFIX::load METRICTYPE::$_SERVICEMETRICTYPE$
DATATYPE::SERVICEPERFDATA TIMET::1528256145 HOSTNAME::mng01-a-bjn-logaggregator-ariake2spl-fr SERVICEDESC::swap-usage SERVICEPERFDATA::swap=2047MB;1433;1023;0;2047 SERVICECHECKCOMMAND::check_remote_swap! HOSTSTATE::UP HOSTSTATETYPE::HARD SERVICESTATE::OK SERVICESTATETYPE::HARD GRAPHITEPREFIX::logaggregator GRAPHITEPOSTFIX::swap METRICTYPE::$_SERVICEMETRICTYPE$
DATATYPE::SERVICEPERFDATA TIMET::1528256146 HOSTNAME::mng01-a-bjn-jenkins-ariake2spl-fr SERVICEDESC::swap-usage SERVICEPERFDATA::swap=2047MB;1433;1023;0;2047 SERVICECHECKCOMMAND::check_remote_swap! HOSTSTATE::UP HOSTSTATETYPE::HARD SERVICESTATE::OK SERVICESTATETYPE::HARD GRAPHITEPREFIX::jenkins GRAPHITEPOSTFIX::swap METRICTYPE::$_SERVICEMETRICTYPE$
Seeing the file above raises a question: why put the data into this format?
The reason is that getting Nagios data into InfluxDB requires Graphios to act as the carrier. Graphios is written in Python, and part of its code is designed to read data in exactly this format.
The Graphios code can be viewed at https://github.com/shawn-sterling/graphios/blob/master/graphios.py
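To give a feel for why the KEY::value layout is convenient to consume, here is a minimal Python sketch that turns one spooled record into a dict and derives a metric name from it. This is not the code in graphios.py: parse_spool_line and metric_name are hypothetical names, web01 is a made-up host, and the prefix.hostname.postfix.label naming is an assumption made for this illustration.

```python
def parse_spool_line(line):
    """Turn one tab-separated KEY::value record into a dict."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        key, sep, value = field.partition("::")
        if sep:  # ignore anything that is not KEY::value
            record[key] = value
    return record

def metric_name(record, label):
    """Join prefix, host name, postfix and perfdata label (assumed scheme)."""
    parts = [record.get("GRAPHITEPREFIX"), record["HOSTNAME"],
             record.get("GRAPHITEPOSTFIX"), label]
    return ".".join(p for p in parts if p)

sample = "\t".join([
    "DATATYPE::SERVICEPERFDATA",
    "TIMET::1528256145",
    "HOSTNAME::web01",  # hypothetical host name
    "SERVICEDESC::swap-usage",
    "SERVICEPERFDATA::swap=2047MB;1433;1023;0;2047",
    "GRAPHITEPREFIX::logaggregator",
    "GRAPHITEPOSTFIX::swap",
])
record = parse_spool_line(sample)
print(record["TIMET"], record["SERVICEPERFDATA"])
print(metric_name(record, "swap"))
```

Because every field names itself, the reader does not depend on field order, and optional fields such as the prefix and postfix can simply be absent.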
The rest is simple: enable the following settings in the Graphios configuration.
enable_influxdb09 = True
# Extra tags to add to metrics, like data center location etc.
# only valid for 0.9
#influxdb_extra_tags = {"location": "la"}
# Comma separated list of server:ports
# defaults to 127.0.0.1:8086 (:8087 if using SSL).
influxdb_servers = 127.0.0.1:9096
# SSL, defaults to False
#influxdb_use_ssl = True
# Database-name, defaults to nagios
influxdb_db = nagios
# Credentials (required)
influxdb_user = influxdb
influxdb_password = influxdb
Once the configuration above is in place, start the services:
# /etc/init.d/graphios start
# /etc/init.d/nagios start
The configuration above is recorded here as a note. The approximate flow is Nagios -> Graphios -> InfluxDB.
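For the last hop of that flow, the sketch below formats one metric in InfluxDB's line protocol, the text format accepted by the /write endpoint. It is only an illustration of the target format: Graphios's own backend may build its payload differently, and the function names, the tag names hostname and servicedesc, and the host web01 are assumptions. The timestamp here is in seconds, which would require precision=s on the write request.

```python
def escape(text, chars):
    """Backslash-escape the given characters."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text

def to_line_protocol(measurement, tags, value, timestamp):
    """Format one point as: measurement,tag=val value=<float> <timestamp>."""
    # Commas and spaces are escaped in measurements; tags also escape "=".
    head = escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + escape(key, ",= ") + "=" + escape(tags[key], ",= ")
    return f"{head} value={float(value)} {int(timestamp)}"

point = to_line_protocol(
    "swap",
    {"hostname": "web01", "servicedesc": "swap-usage"},  # hypothetical tags
    2047,
    1528256145,
)
print(point)
```

Each such line carries the time of the check, which is what later lets Grafana plot the values along a time axis.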
Finally, generate the charts in Grafana. For how to create charts, search online (for example on Baidu).
Nagios + InfluxDB + Grafana: monitoring data visualization process