Zabbix: monitoring the number of errors in a program's log file

Source: Internet
Author: User

Recently the developers raised a new monitoring requirement: trigger an alarm when the number of errors in a log file increases (that is, when the count of an error keyword in the log file goes up).

I find this a tedious problem, because the requirement itself is limited (the developers have dug their own hole). First, a log file cannot grow forever, so any change made to the file during maintenance will trigger an alarm.

Second, detecting errors by keyword alone or by error code alone can be unreliable. The log may contain a number that is not an error code but happens to equal one, so the error keyword and the error code need to be checked together.
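One way to reduce that kind of false match is to count only the lines that carry both the keyword and the code, with the code matched as a whole word. This is a minimal sketch, not the method used later in this post: the function name count_error_lines and the keyword ERROR are assumptions for illustration, and the real log format may need a different pattern.

```shell
#!/bin/bash
# Hypothetical helper: count lines that contain BOTH a keyword and a code.
# Assumption: the keyword (for example "ERROR") and the code share one line.
count_error_lines() {
    local logfile=$1 keyword=$2 code=$3
    # The first grep keeps lines with the keyword. The second counts those
    # that also contain the code as a whole word (-w), so 130031 or 213003
    # do not match 13003. -F treats both arguments as fixed strings.
    # grep exits 1 when the count is 0; that is not an error here.
    grep -F -- "$keyword" "$logfile" | grep -c -w -F -- "$code" || true
}
```

It would be called as count_error_lines /path/to/log ERROR 13003 and prints a single number.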

Third, a restart or hang of the process that writes the log, log rotation (logrotate), and similar events all change the stored count, and in many cases they trigger false alarms. So this problem really is tedious!
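For reference, one common way to cope with rotation is to remember the file's inode and size between polls and to count only what was appended since the last poll. The sketch below is an assumption-laden illustration, not part of the two methods in this post: count_new_errors and the state file are hypothetical, and stat -c is GNU coreutils syntax.

```shell
#!/bin/bash
# Hypothetical helper: print how many NEW lines matching a keyword appeared
# since the previous call, starting over if the file was rotated or truncated.
count_new_errors() {
    local logfile=$1 keyword=$2 statefile=$3
    local inode size old_inode=0 old_size=0
    # stat -c is GNU coreutils syntax (an assumption about the platform).
    inode=$(stat -c %i "$logfile") || return 1
    size=$(stat -c %s "$logfile") || return 1
    [[ -f $statefile ]] && read -r old_inode old_size < "$statefile"
    # A different inode or a smaller file means rotation or truncation,
    # so the new file is read from its first byte.
    if [[ $inode != "$old_inode" || $size -lt $old_size ]]; then
        old_size=0
    fi
    # Read only the bytes appended since the previous call; head caps the
    # read at the size recorded above in case the file grows meanwhile.
    tail -c "+$((old_size + 1))" "$logfile" \
        | head -c "$((size - old_size))" \
        | grep -c -F -- "$keyword" || true
    echo "$inode $size" > "$statefile"
}
```

This still misses a file that is truncated and then grows past its old size between two polls, so it narrows the false-alarm window rather than closing it.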

This is not to say that counting errors is especially complicated (although it is not simple either). But if the goal is to detect whether something is wrong, there is really no need to do it this way. For a program, anything that prevents correct execution is an exception; as long as exceptions are caught and handled correctly, it is clear where the problem is and how to fix it. If this is not planned for in the early stages of design, you can only fall back on tedious workarounds later.

Enough talk. There are two methods that are easy to implement; both are posted here.

Method 1: Write two scripts, one that runs continuously and one that the monitoring software runs. The continuously running script can also be driven by crontab. Any command line or script run by the monitoring software, no matter how complex its logic, must handle termination so that it always exits.

Method 2: Use the diff or change function built into the monitoring software and let the monitoring software determine the trend (Zabbix is used as the example). Zabbix trigger expressions can easily compute the data the user wants from the collected values, which saves users from writing programs or scripts for text comparison, numerical calculation, and trend calculation.

Method 1: getdata.sh runs in the background and provides the data; checkdata.sh is run by Zabbix to query the data.

#!/bin/bash
# name: getdata.sh
javalogfile=/data/tomcat/tomcat-cstest/logs/catalina.out
pathtojavalogfile=$(dirname "$javalogfile")
zabbixstatusfile=$pathtojavalogfile/.zabbixstatus.catalina.out
errorkeyword=13003
previoustime=$(grep "$errorkeyword" "$javalogfile" | wc -l)
currenttime=$(grep "$errorkeyword" "$javalogfile" | wc -l)
if [[ ! $previoustime -eq $currenttime ]]; then
    echo 0
    exit 1
fi
while [[ $previoustime -eq $currenttime ]]; do
    # Actually, it's like crontab: while + sleep = crontab
    sleep 2
    currenttime=$(grep "$errorkeyword" "$javalogfile" | wc -l)
    if [[ $currenttime -gt $previoustime ]]; then
        # The error count went up: record 0 (problem) for checkdata.sh
        previoustime=$currenttime
        echo 0 >> "$zabbixstatusfile"
    elif [[ $currenttime -le $previoustime ]]; then
        # No new errors: record 1 (OK)
        echo 1 >> "$zabbixstatusfile"
    fi
done

#!/bin/bash
# name: checkdata.sh
javalogfile=/data/tomcat/tomcat-cstest/logs/catalina.out
pathtojavalogfile=$(dirname "$javalogfile")
zabbixstatusfile=$pathtojavalogfile/.zabbixstatus.catalina.out
# -q keeps grep's matching lines out of the output that Zabbix reads
grep -q "0" "$zabbixstatusfile"
if [[ $? -eq 0 ]]; then
    # At least one 0 was recorded: report the problem and reset the status file
    echo 0
    true > "$zabbixstatusfile"
    exit 1
else
    echo 1
    exit 0
fi
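As noted for Method 1, whatever the monitoring software runs has to terminate on its own. One way to guarantee that, sketched here under assumptions, is to wrap the check in the timeout command from GNU coreutils. The name run_bounded and the fallback value 0 are hypothetical choices for illustration and are not part of the scripts above.

```shell
#!/bin/bash
# Hypothetical wrapper: run a check command, but never wait longer than
# the given number of seconds for it.
run_bounded() {
    local limit=$1
    shift
    local out
    # timeout (GNU coreutils) kills the command after $limit seconds and
    # then exits with status 124.
    if out=$(timeout "$limit" "$@"); then
        echo "$out"
    else
        # Assumption: print 0 as a fallback value when the command fails
        # or times out, and signal the failure through the exit status.
        echo 0
        return 1
    fi
}
```

For example, run_bounded 2 /path/to/checkdata.sh would return within about two seconds even if the script hangs. The Zabbix agent also enforces its own Timeout setting, so the limit chosen here should stay below it.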

Method 2: Both generating the data and querying it are left to Zabbix.

# single line for zabbix
# itemname: cs connection error
# templatename: template app javalogmonitor
# applicationname: javaerrorcodetextfound
# triggername: cs connection error is occur
#
# /etc/zabbix/zabbix_agentd.conf.d/userparameter_csconnerr.conf
# /etc/zabbix/zabbix_agentd.conf.d/userparameter_cs.conf
# {template app javalogmonitor:csprocess.cs.csconnerr[*].diff(0)}>0
# for /bin/bash, such as CentOS
# UserParameter=csprocess.cs.csconnerr[*],javalogfile=/data/tomcat/tomcat-cstest/logs/catalina.out;errorkeyword=13003;if [[ -f $javalogfile ]]; then echo $(grep "$errorkeyword" $javalogfile | wc -l); exit 0; else echo 0; exit 1; fi
# for /bin/sh, such as Ubuntu
UserParameter=csprocess.cs.csconnerr[*],javalogfile=/data/tomcat/tomcat-cstest/logs/catalina.out;errorkeyword=13003;if test -f $javalogfile ; then echo $(grep "$errorkeyword" $javalogfile | wc -l); exit 0; else echo 0; exit 1; fi
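The trigger expression above uses diff(0), which is true whenever the last two collected values differ, so a count that drops after logrotate would raise the alarm as well. The sketch below only illustrates the difference between an increase and a decrease of the count; classify_change is a hypothetical name, and this is not how Zabbix evaluates triggers internally.

```shell
#!/bin/bash
# Hypothetical illustration: classify two successive samples of the
# error count, as collected by the item above.
classify_change() {
    local previous=$1 last=$2
    if (( last > previous )); then
        echo increased      # new errors were logged
    elif (( last < previous )); then
        echo decreased      # e.g. the log was rotated or truncated
    else
        echo unchanged
    fi
}
```

A trigger that should fire only on an increase would need a function that returns the signed difference, such as the change function mentioned under Method 2, rather than a changed/unchanged flag.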

--end--

This article is from the "Communication, My Favorites" blog; please keep this source: http://dgd2010.blog.51cto.com/1539422/1678879
