I. Symptoms of the fault
This week two disks could no longer be read or written. Looking back through the system log, each failed disk had logged entries containing the keyword "EXT4-fs error", so I used Zabbix to pick that keyword out of the system log and raise an alarm.
II. Steps
1. There are too many machines to configure by hand, so use an Ansible playbook to push everything in one go
2. Define the key in /etc/zabbix/zabbix_agentd.conf.d/agentd.conf
### kernel_error of disk from /var/log/messages
UserParameter=disk_health,awk -v kernel_error=`sudo tail /var/log/messages | grep "EXT4-fs error" | wc -l` 'BEGIN{if (kernel_error > 0) {print 1} else {print 0}}'
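The check above can be tried by hand before it is wired into the agent. The following is only a minimal sketch of the same logic as a shell function; the function name disk_health and the optional log-file argument are assumptions added for illustration and are not part of the original configuration.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): prints 1 if the last
# lines of the log contain an EXT4 error, otherwise prints 0.
# The optional first argument is an assumption so that the check can be
# pointed at a test file instead of /var/log/messages.
disk_health() {
    log_file="${1:-/var/log/messages}"
    # tail reads the last 10 lines by default; grep -c counts matching lines
    count=$(tail "$log_file" 2>/dev/null | grep -c "EXT4-fs error")
    if [ "$count" -gt 0 ]; then
        echo 1
    else
        echo 0
    fi
}
```

Like the original key, this only looks at the last ten lines of the log, so an error that has already scrolled out of that window is not reported. Reading /var/log/messages usually needs root, which is why the agent key goes through sudo.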
3. sudo permissions for the zabbix user
vim /etc/sudoers.d/zabbix
zabbix ALL=(root) NOPASSWD: /bin/bash,/bin/netstat,/usr/bin/nmap,/bin/grep,/bin/awk,/usr/local/mysql/bin/mysql,/usr/bin/tail,/bin/cat
Playbook
---
- hosts: "{{ hosts }}"
  gather_facts: false
  tasks:
    # Make the agent load the extra config directory and stop sudo from requiring a tty
    - name: Add include path
      lineinfile:
        dest: "{{ item.dest }}"
        regexp: "{{ item.regexp }}"
        line: "{{ item.line }}"
      with_items:
        - { dest: "/etc/zabbix/zabbix_agentd.conf", regexp: "^Include", line: "\n\n###Add include\nInclude=/etc/zabbix/zabbix_agentd.conf.d/*.conf" }
        - { dest: "/etc/sudoers", regexp: '^Defaults\s+requiretty', line: "# Defaults requiretty" }
    # Push the sudoers rule and the UserParameter file to every host
    - name: copy configuration file
      copy:
        src: "{{ item.src }}"
        dest: "{{ item.dest }}"
      with_items:
        - { src: "/etc/sudoers.d/zabbix", dest: "/etc/sudoers.d/" }
        - { src: "/etc/zabbix/zabbix_agentd.conf.d/agentd.conf", dest: "/etc/zabbix/zabbix_agentd.conf.d/" }
    - name: restart zabbix service
      service: name=zabbix_agentd state=restarted
4. Run it
ansible-playbook copyfile.yml -e "hosts=all"
This article is from the "Scattered People" blog; please keep this source link: http://zouqingyun.blog.51cto.com/782246/1740998
Monitor disk health with system error logs