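I have recently been learning check_mk, a monitoring tool built on Nagios that is easier to use than Nagios itself. What interests me most is, as always, how well it supports custom monitoring scripts. On Windows, the client-side scripts or plugins it supports can be exe, bat, vbs, ps1 and so on; I write my monitoring scripts in PowerShell.
This article is intended for readers who already have some familiarity with check_mk. For an introduction to check_mk and how to set it up, see: "Building the new open-source monitoring tool check_mk, step by step" (手把手打造開源新監控利器check_mk)
If this article is not clear enough, you can also refer to the official documentation: http://mathias-kettner.de/checkmk_devel_agentbased.html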
=====================================================================================================
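1. First, create the custom monitoring script process_top5.ps1 on the client:

    # Take the first 5 processes returned by Get-Process
    $dp = Get-Process | Select-Object -First 5
    # This header line is essential: it tells the check_mk server which check the following lines belong to
    Write-Output "<<<process_top5>>>"
    foreach ($p in $dp) {
        Write-Host $p.Name $p.WorkingSet
    }

The script is simple: it takes the first five processes on the machine (the first five in the list, not the five largest) and prints each process name with its memory usage (working set, in bytes). Copy this file to the check_mk plugin directory on the client (C:\Program Files (x86)\check_mk\plugins).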
2. On the monitoring server, run "check_mk -d hostname". The returned output contains a <<<process_top5>>> section with five process entries:

    [[[Windows PowerShell]]]
    <<<logwatch>>>
    <<<process_top5>>>
    agent 19939328
    AlipaySecSvc 20549632
    aspnet_state 1511424
    check_mk_agent 9351168
    cmd 6447104
    <<<local>>>
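To make the link between this output and the `info` argument used by the check script more concrete, here is a minimal sketch of how one section could be cut out of the agent dump and split into whitespace-separated fields. The helper name `parse_section` and the `SAMPLE_OUTPUT` string are my own assumptions for illustration; this is not check_mk's actual parser, only an approximation of the shape of the data the check function receives.

```python
def parse_section(agent_output, section_name):
    """Return the lines of one <<<section>>> as lists of whitespace-split fields."""
    header = "<<<%s>>>" % section_name
    rows = []
    in_section = False
    for raw in agent_output.splitlines():
        line = raw.strip()
        # Any <<<...>>> line starts a new section and ends the previous one
        if line.startswith("<<<") and line.endswith(">>>"):
            in_section = (line == header)
            continue
        if in_section and line:
            rows.append(line.split())
    return rows


# Sample text copied from the agent output shown above (hypothetical variable name)
SAMPLE_OUTPUT = """[[[Windows PowerShell]]]
<<<logwatch>>>
<<<process_top5>>>
agent 19939328
AlipaySecSvc 20549632
aspnet_state 1511424
check_mk_agent 9351168
cmd 6447104
<<<local>>>
"""

info = parse_section(SAMPLE_OUTPUT, "process_top5")
for name, working_set in info:
    print("%s %s" % (name, working_set))
```

Each element of `info` is a list such as `["agent", "19939328"]`, which is why the check script reads the process name from `line[0]` and the memory value from `line[1]`.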
Next, write the check script on the monitoring server; it parses the values returned by the client. Check scripts live in the "/usr/share/check_mk/checks" directory. (They are written in Python, but what you need to write is not complicated, so readers who do not know Python need not worry.) Note that the file name must be the same as the name of the new check (that is, process_top5). Its content is as follows:

    # Alert thresholds in bytes, in the order (warn, crit)
    process_top5_default_values = (10000000, 15000000)

    # Inventory function: info holds the lines of the <<<process_top5>>> section
    # returned by the client. Each item returned here is later passed to
    # check_process_top5 as its item argument.
    def inventory_process_top5(info):
        inventory = []
        for line in info:
            name = line[0]
            inventory.append((name, "process_top5_default_values"))
        return inventory

    # Main check function
    def check_process_top5(item, params, info):
        # Unpack the thresholds from process_top5_default_values.
        # The order matters: warn = 10000000, crit = 15000000.
        warn, crit = params
        for line in info:
            if line[0] == item:
                mem = int(line[1])
                if mem > crit:
                    return (2, "mem is %d" % mem)
                elif mem > warn:
                    return (1, "mem is %d" % mem)
                else:
                    return (0, "mem is %d" % mem)
        return (3, "%s not found in agent output" % item)

    # Register the check with check_mk
    check_info["process_top5"] = {
        'check_function': check_process_top5,
        'inventory_function': inventory_process_top5,
        'service_description': '%s',
    }
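As a quick sanity check of the threshold logic, the sketch below applies the same comparison to the five sample values from the agent output in step 2. It is a standalone illustration that runs outside check_mk: the names `state_for`, `STATE_NAMES` and `SAMPLE_INFO` are hypothetical, and only the thresholds and the sample numbers come from this article.

```python
# Same values as process_top5_default_values in the check script
WARN, CRIT = 10000000, 15000000

# Return codes used by the check function: 0 = OK, 1 = WARN, 2 = CRIT
STATE_NAMES = {0: "OK", 1: "WARN", 2: "CRIT"}


def state_for(value, warn, crit):
    """Map a memory value to a state code using the same comparisons as the check."""
    if value > crit:
        return 2
    if value > warn:
        return 1
    return 0


# Fields as they appear in the <<<process_top5>>> section from step 2
SAMPLE_INFO = [
    ["agent", "19939328"],
    ["AlipaySecSvc", "20549632"],
    ["aspnet_state", "1511424"],
    ["check_mk_agent", "9351168"],
    ["cmd", "6447104"],
]

results = {}
for name, working_set in SAMPLE_INFO:
    mem = int(working_set)
    results[name] = state_for(mem, WARN, CRIT)
    print("%-16s %-4s mem is %d" % (name, STATE_NAMES[results[name]], mem))
```

With these thresholds, agent and AlipaySecSvc exceed the critical level, while the other three processes stay below the warning level. State 3 (unknown) is not covered here; in the check script it is returned only when the item is missing from the agent output.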
3. Running "check_mk -L | grep process_top5" shows that the monitoring server now knows about this check. Next, add process_top5 to the corresponding host:

    [[email protected] ~]# check_mk --checks=process_top5 -I [hostname]
    process_top5 5 new checks

"5 new checks" means that process_top5 added five new services, one for each of the five processes. The same five entries have also been added to the autochecks file of that host, as shown below:

    [[email protected] ~]# cat /var/lib/check_mk/autochecks/3.81.mk
    [
      ("3.81", "process_top5", 'AlipaySecSvc', process_top5_default_values),
      ("3.81", "process_top5", 'agent', process_top5_default_values),
      ("3.81", "process_top5", 'aspnet_state', process_top5_default_values),
      ("3.81", "process_top5", 'check_mk_agent', process_top5_default_values),
      ("3.81", "process_top5", 'cmd', process_top5_default_values),
      ('3.81', 'df', 'C:/', {}),
      ('3.81', 'df', 'D:/', {}),
      ('3.81', 'df', 'E:/', {}),
      ('3.81', 'df', 'F:/', {}),
      ('3.81', 'df', 'G:/', {}),
      ('3.81', 'logwatch', 'HardwareEvents', ""),
      ('3.81', 'logwatch', 'Windows PowerShell', ""),
      ('3.81', 'mem.win', None, {}),
      ('3.81', 'uptime', None, {}),
      ('3.81', 'winperf_if', '01', {'state': ['1'], 'speed': 1000000000}),
      ('3.81', 'winperf_if', '02', {'state': ['1'], 'speed': 1000000000}),
      ('3.81', 'winperf_if', '03', {'state': ['1'], 'speed': 1000000000}),
      ('3.81', 'winperf_if', '04', {'state': ['1'], 'speed': 1000000000}),
      ('3.81', 'winperf_if', '05', {'state': ['1'], 'speed': 100000}),
      ('3.81', 'winperf_if', '06', {'state': ['1'], 'speed': 100000}),
      ('3.81', 'winperf_if', '07', {'state': ['1'], 'speed': 100000}),
      ('3.81', 'winperf_if', '08', {'state': ['1'], 'speed': 100000}),
      ('3.81', 'winperf_if', '09', {'state': ['1'], 'speed': 1410065408}),
      ('3.81', 'winperf_if', '10', {'state': ['1'], 'speed': 100000}),
      ('3.81', 'winperf_phydisk', 'SUMMARY', diskstat_default_levels),
      ('3.81', 'winperf_processor.util', None, winperf_cpu_default_levels),
    ]
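The process_top5 lines in this file line up with what the inventory function returns: one (item, parameter name) pair per process. The sketch below rebuilds those five lines from such a list to show the correspondence. It is only an illustration under my own assumptions (the helper `render_autochecks` is hypothetical, and sorting by item is a guess based on the order seen in the file); it is not how check_mk itself writes autochecks.

```python
def render_autochecks(hostname, check_type, inventory):
    """Format inventory results as autochecks-style lines, sorted by item (assumed order)."""
    lines = []
    for item, params_name in sorted(inventory):
        # The parameter name is written without quotes, as it appears in the file above
        lines.append('("%s", "%s", %r, %s),' % (hostname, check_type, item, params_name))
    return lines


# What inventory_process_top5 would return for the five sample processes
inventory = [
    ("agent", "process_top5_default_values"),
    ("AlipaySecSvc", "process_top5_default_values"),
    ("aspnet_state", "process_top5_default_values"),
    ("check_mk_agent", "process_top5_default_values"),
    ("cmd", "process_top5_default_values"),
]

lines = render_autochecks("3.81", "process_top5", inventory)
for line in lines:
    print(line)
```

The point to take away is that the item in each entry (for example 'cmd') is exactly the value later handed to check_process_top5 as `item`, and the last field names the variable that supplies `params`.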
4. Open the check_mk monitoring web page and look at the services of the corresponding host; the five new entries are now listed there.