Recently the Zabbix server has been reporting a large number of alarms. Every host raises the agent.ping alarm "zabbix agent on XXXX is unreachable for 5 minutes", recovers, and then raises it again, over and over. This went on for a long time. The values of all other monitoring items are normal; only the agent.ping value is intermittent.
Running the check manually on the Zabbix server:
[email protected]:~# time zabbix_get -s 10.10.20.201 -k "agent.ping"
1
real 0m0.002s
user 0m0.000s
sys 0m0.000s
The agent answers immediately, so there is no latency problem.
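Because the alarm is intermittent, a single manual run can easily fall into a healthy window. The following is a minimal sketch that repeats the same check and records how long each run takes. It assumes zabbix_get is on the PATH; the function name probe and its parameters are made up for illustration.

```python
import subprocess
import time


def probe(cmd, runs=5, pause=1.0, timeout=30.0):
    """Run cmd repeatedly; return a list of (ok, seconds, output) tuples.

    A run counts as ok when the command exits with 0 and prints "1",
    which is what agent.ping returns when the agent is reachable.
    """
    results = []
    for i in range(runs):
        start = time.monotonic()
        try:
            proc = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
            out = proc.stdout.strip()
            ok = proc.returncode == 0 and out == "1"
        except subprocess.TimeoutExpired:
            out, ok = "timeout", False
        except OSError as exc:  # e.g. the binary is not installed
            out, ok = str(exc), False
        results.append((ok, time.monotonic() - start, out))
        if i < runs - 1:
            time.sleep(pause)
    return results
```

Calling probe(["zabbix_get", "-s", "10.10.20.201", "-k", "agent.ping"], runs=60, pause=5.0) would repeat the manual check above for about five minutes; any tuple whose first field is False marks a run where the agent did not answer.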
Timeout had already been set to 30 in every zabbix_agentd.conf and in zabbix_server.conf, and restarting the agentd and server processes made no difference.
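To confirm that the Timeout value really is active in each file (and not just present in a commented-out line), a small helper like the one below could be used. It is only a sketch: conf_values is a hypothetical name, and it assumes the plain Key=Value format with # comments that the Zabbix configuration files use.

```python
def conf_values(conf_text, key):
    """Return every active (uncommented) value assigned to key, in file order."""
    values = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not Key=Value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            values.append(value.strip())
    return values
```

For example, conf_values(open("/etc/zabbix/zabbix_agentd.conf").read(), "Timeout") should return ["30"] on a host configured as described; the path is an assumption and depends on how the agent was installed.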
I then suspected the MySQL parameters, but the problem remained the same after changing them.
By accident I found that when only one host was monitored, the alarm "zabbix agent on XXXX is unreachable for 5 minutes" did not occur. So I added the hosts back to monitoring one at a time, and after one particular host was added, the alarm appeared again.
This host is used to monitor Oracle, and a low-level discovery rule had recently been added to it for inefficient SQL statements (those that run for more than 20 seconds, taking the first 20 rows). The host is a CDB database with many users, so there are many SQL_ID items, and the number of discovered SQL_ID items grew steadily by 3000. The cause is that the discovery rule was set to "Keep lost resources period: 30d", so items that are no longer discovered are only deleted after 30 days.
I manually changed the setting to "Keep lost resources period: 2d". Once the unused items had been deleted, the number of items on the host stayed between 700 and 800, and the alarm "zabbix agent on XXXX is unreachable for 5 minutes" disappeared.
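Since the symptom depends on how many items a host carries, it can help to watch that number directly. Below is a rough sketch that asks the Zabbix API for the item count of one host through item.get with countOutput. The function names, the API URL, and the token are placeholders, and the Bearer header is an assumption that holds for recent Zabbix versions; older versions expect the token in an "auth" field of the request body instead.

```python
import json
import urllib.request


def item_count_request(host_id, request_id=1):
    """Build the JSON-RPC body for an item.get call that returns only a count."""
    return {
        "jsonrpc": "2.0",
        "method": "item.get",
        "params": {"hostids": host_id, "countOutput": True},
        "id": request_id,
    }


def count_items(api_url, token, host_id):
    """POST the request and return the item count as an int.

    Assumption: the token is accepted as a Bearer header.
    """
    body = json.dumps(item_count_request(host_id)).encode()
    req = urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Content-Type": "application/json-rpc",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return int(json.load(resp)["result"])
```

Running count_items against the Oracle host before and after shortening the retention period would show whether the item count has actually dropped back into the range where the alarm stops.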
This is very likely a bug: the problem appears once the number of monitoring items on a single host exceeds a certain threshold. I am writing it down so that others can avoid the same trap.