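In server operations we usually rely on Zabbix for monitoring. Here we install and configure monit so that a process is started again automatically when monit detects that it has stopped.
Installing monit from the rpmforge repository is recommended, because the monit version there is newer.
Installing monit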
# cd /tmp
# rpm -ivh http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
# yum install --enablerepo=rpmforge monit
# rpm -q monit
monit-5.6-1.el7.x86_64
# rpm -ql monit
/etc/logrotate.d/monit
/etc/monit.d
/etc/monit.d/logging
/etc/monitrc
/usr/bin/monit
/usr/lib/systemd/system/monit.service
/usr/share/doc/monit-5.6
/usr/share/doc/monit-5.6/CHANGES
/usr/share/doc/monit-5.6/COPYING
/usr/share/doc/monit-5.6/PLATFORMS
/usr/share/doc/monit-5.6/README
/usr/share/man/man1/monit.1.gz
/var/log/monit.log
# monit -V
This is Monit version 5.6
Copyright (C) 2001-2013 Tildeslash Ltd. All Rights Reserved.
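As of 9 September 2015, the latest version available from rpmforge is 5.6.
Configuring monit
View the contents of the default configuration files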
# grep -v '^#' /etc/monitrc
set daemon 60 # check services at 1-minute intervals
set httpd port 2812 and
use address localhost # only accept connection from localhost
allow localhost # allow localhost to connect to the server and
allow admin:monit # require user 'admin' with password 'monit'
allow @monit # allow users of group 'monit' to connect (rw)
allow @users readonly # allow users of group 'users' to connect readonly
include /etc/monit.d/*
# cat /etc/logrotate.d/monit
/var/log/monit.log {
missingok
notifempty
size 100k
create 0644 root root
postrotate
/bin/systemctl reload monit.service > /dev/null 2>&1 || :
endscript
}
# cat /etc/monit.d/logging
# log to monit.log
set logfile /var/log/monit.log
The check interval is 60 seconds, and log output and log rotation are already configured.
Configuring monitored services
sshd
check process sshd with pidfile /var/run/sshd.pid
start program "/usr/bin/systemctl start sshd.service"
stop program "/usr/bin/systemctl stop sshd.service"
if failed port 22 protocol ssh then restart
if 5 restarts within 5 cycles then timeout
Apache
This is the monit configuration for Apache on CentOS 6.5. Compared with CentOS 7, only the start/stop commands differ.
check process apache with pidfile /var/run/httpd/httpd.pid
start program = "/etc/init.d/httpd start" with timeout 60 seconds
stop program = "/etc/init.d/httpd stop"
if failed host www.zabbix.cc port 80 protocol http
and request "/readme.html"
then restart
if 3 restarts within 5 cycles then timeout
group apache
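Because only the start/stop commands change between the two releases, a CentOS 7 style stanza can be derived mechanically. The following is a minimal sketch and not part of the original setup: the helper name `systemd_check` is hypothetical, and the pid file path and unit name passed to it are assumptions that must match the actual system.

```shell
#!/bin/sh
# Hypothetical helper: print a monit "check process" stanza that starts and
# stops the service through systemctl (CentOS 7 style).
# Usage: systemd_check NAME PIDFILE UNIT
systemd_check() {
    name=$1
    pidfile=$2
    unit=$3
    cat <<EOF
check process $name with pidfile $pidfile
  start program = "/usr/bin/systemctl start $unit"
  stop program = "/usr/bin/systemctl stop $unit"
EOF
}
```

For example, `systemd_check apache /var/run/httpd/httpd.pid httpd.service` prints a three-line stanza, to which the `if failed ...` and `if 3 restarts ...` tests from the CentOS 6.5 version can be appended unchanged.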
Nginx
check process nginx with pidfile /var/run/nginx.pid
start program = "/usr/bin/systemctl start nginx.service"
stop program = "/usr/bin/systemctl stop nginx.service"
MySQL
This is the monit configuration for MySQL on CentOS 6.5. Compared with CentOS 7, only the start/stop commands differ.
check process mysqld with pidfile "/var/run/mysqld/mysqld.pid"
start = "/etc/init.d/mysqld start"
stop = "/etc/init.d/mysqld stop"
if failed unixsocket /var/lib/mysql/mysql.sock with timeout 60 seconds then restart
if 5 restarts within 5 cycles then timeout
MariaDB
check process mariadb with pidfile "/var/run/mariadb/mariadb.pid"
start = "/usr/bin/systemctl start mariadb.service"
stop = "/usr/bin/systemctl stop mariadb.service"
if failed host 127.0.0.1 port 3306 protocol mysql then restart
if 5 restarts within 5 cycles then timeout
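The stanzas above take effect only after monit re-reads its configuration. The source does not say where they are stored; assuming each one is saved as a file under /etc/monit.d/ (which the default `include /etc/monit.d/*` line picks up), the following sketch runs monit's syntax check and reloads the daemon only if the check passes. The function name `check_and_reload` and the `MONIT` override variable are hypothetical.

```shell
#!/bin/sh
# Path to the monit binary; override MONIT to try the function with a stub.
MONIT="${MONIT:-/usr/bin/monit}"

# Run monit's control file syntax check (-t) and reload the daemon
# only when the check succeeds.
check_and_reload() {
    if "$MONIT" -t; then
        "$MONIT" reload
    else
        echo "monit syntax check failed; not reloading" >&2
        return 1
    fi
}
```

Calling `check_and_reload` as root after editing a stanza avoids reloading a configuration that monit would reject.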
Checking monit status
# monit status
The Monit daemon 5.6 uptime: 8m
Process 'sshd'
status Running
monitoring status Monitored
pid 884
parent pid 1
uptime 19d 11h 57m
children 4
memory kilobytes 3016
memory kilobytes total 19420
memory percent 0.0%
memory percent total 0.5%
cpu percent 0.0%
cpu percent total 0.0%
port response time 0.008s to localhost:22 [SSH via TCP]
data collected Sun, 05 Apr 2015 21:41:18
Process 'nginx'
status Running
monitoring status Monitored
pid 13963
parent pid 1
uptime 6m
children 3
memory kilobytes 2428
memory kilobytes total 67520
memory percent 0.0%
memory percent total 1.8%
cpu percent 0.0%
cpu percent total 0.0%
data collected Sun, 05 Apr 2015 21:41:18
Process 'mariadb'
status Running
monitoring status Monitored
pid 24790
parent pid 24354
uptime 10d 4h 36m
children 0
memory kilobytes 216168
memory kilobytes total 216168
memory percent 5.9%
memory percent total 5.9%
cpu percent 0.0%
cpu percent total 0.0%
port response time 0.000s to 127.0.0.1:3306 [MYSQL via TCP]
data collected Sun, 05 Apr 2015 21:41:18
System 'zabbix.cc'
status Running
monitoring status Monitored
load average [0.00] [0.01] [0.05]
cpu 0.8%us 0.1%sy 0.1%wa
memory usage 1524496 kB [42.1%]
swap usage 0 kB [0.0%]
data collected Sun, 05 Apr 2015 21:41:18
Verifying that monit restarts processes automatically
After stopping the nginx process, check the monit.log file.
# systemctl stop nginx.service
# tailf /var/log/monit.log
[CST Apr 5 21:35:18] error : 'nginx' process is not running
[CST Apr 5 21:35:18] info : 'nginx' trying to restart
[CST Apr 5 21:35:18] info : 'nginx' start: /usr/bin/systemctl
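On a host that monitors several services, the log mixes events for all of them. As a small sketch, the hypothetical helper `monit_events` below prints only the lines for one service, relying on the fact, visible in the log lines above, that monit wraps the service name in single quotes. The default log path is taken from the package listing earlier.

```shell
#!/bin/sh
# Hypothetical helper: print the monit log lines that mention one service.
# Usage: monit_events SERVICE [LOGFILE]
monit_events() {
    service=$1
    logfile=${2:-/var/log/monit.log}
    # monit quotes the service name in its messages, e.g. 'nginx',
    # so match the quoted name as a fixed string.
    grep -F "'$service'" "$logfile"
}
```

For example, `monit_events nginx` run after the test above would list the "process is not running", "trying to restart" and "start" entries in order.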
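Enabling monit at boot
Configure monit to start automatically when the OS boots. The command differs by system and version; this section covers the method for CentOS 7.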
# systemctl list-unit-files | grep monit.service
monit.service disabled
# systemctl enable monit.service
ln -s '/usr/lib/systemd/system/monit.service' '/etc/systemd/system/multi-user.target.wants/monit.service'
# systemctl list-unit-files | grep monit.service
monit.service enabled
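Enabling a unit only registers it for the next boot; it does not start the daemon. If monit is not already running, both steps can be combined as in the sketch below. The function name `enable_and_start` and the `SYSTEMCTL` override variable are hypothetical and exist only so the commands can be dry-run.

```shell
#!/bin/sh
# systemctl binary; override SYSTEMCTL (for example with echo) to dry-run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Enable a unit at boot and, if that succeeds, start it right away.
enable_and_start() {
    unit=$1
    "$SYSTEMCTL" enable "$unit" && "$SYSTEMCTL" start "$unit"
}
```

Running `enable_and_start monit.service` as root performs both steps; setting `SYSTEMCTL=echo` first prints the two commands without executing them.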
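Monitoring monit with Zabbix
We now have an environment in which a stopped process is detected and started again automatically, but if monit itself stops, nothing will notice. Here we use Zabbix to monitor monit.
Monitoring the monit process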
Item
| Field | Value |
| Name | Process monit daemon running |
| Type | Zabbix agent |
| Key | proc.num[monit] |
| Type of information | Numeric (unsigned) |
Trigger
| Field | Value |
| Name | Process monit daemon down |
| Expression | {Zabbix server:proc.num[monit].last()}=0 |
| Severity | Warning |
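Before relying on the trigger, it can help to check by hand what a process count for monit looks like on the host. The sketch below is a rough approximation of what `proc.num[monit]` reports, not the agent's actual implementation: the hypothetical helper `count_procs` counts processes whose command name in /proc equals the given name (Linux only).

```shell
#!/bin/sh
# Hypothetical helper: count running processes whose command name equals NAME
# by reading /proc/<pid>/comm (Linux only).
# Usage: count_procs NAME
count_procs() {
    name=$1
    count=0
    for f in /proc/[0-9]*/comm; do
        # A process may exit between the glob and the read; ignore errors.
        if [ "$(cat "$f" 2>/dev/null)" = "$name" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

`count_procs monit` should print a value greater than 0 while the daemon is running and 0 after it stops, which is the condition the trigger fires on. If the `zabbix_get` utility is installed, `zabbix_get -s 127.0.0.1 -k 'proc.num[monit]'` queries the value from the agent itself.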
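Monitoring the monit log file
Item
| Field | Value |
| Name | Process monit daemon running |
| Type | Zabbix agent (active) |
| Key | log[/var/log/monit.log,error] |
| Type of information | Log |
Log items work only as active checks, so the item type must be Zabbix agent (active). The key points at /var/log/monit.log, the log file configured in /etc/monit.d/logging.
Trigger
| Field | Value |
| Name | Process monit daemon error |
| Expression | (({Zabbix server:log[/var/log/monit.log,error].regexp(error)})#0)&({Zabbix server:log[/var/log/monit.log,error].nodata(300)}=0) |
| Severity | Warning |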