Monitoring automation scripts in Linux systems

Source: Internet
Author: User

Problem Summary
Question 1: escalating alerts. When a problem occurs, notify person XX first; if it is still unresolved after a certain time, notify person XXX.

Question 2: when a host raises an alarm, we want to fetch other related monitoring values, such as load and CPU, and may also need to find out which other hosts are affected.

Solutions
Question 1

Many open source monitoring products have an escalation feature, including the commonly used Zabbix and Nagios (a feature I had not really paid attention to before).

In Zabbix, alerts can be sent to different people depending on how long the problem has lasted.

For example:

1-5 min: mail to user_a
6-10 min: mail to user_b
11-15 min: SMS to phone_a
16-20 min: SMS to phone_b
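
The schedule above is a lookup from problem duration to recipient. A minimal Python sketch of that lookup follows; it is not Zabbix code, and the names `SCHEDULE` and `target_for` are assumptions made up for illustration.

```python
from typing import Optional, Tuple

# (start_minute, end_minute, channel, recipient), copied from the schedule above.
SCHEDULE = [
    (1, 5, "mail", "user_a"),
    (6, 10, "mail", "user_b"),
    (11, 15, "sms", "phone_a"),
    (16, 20, "sms", "phone_b"),
]


def target_for(elapsed_minutes: int) -> Optional[Tuple[str, str]]:
    """Return (channel, recipient) for a problem that has lasted elapsed_minutes."""
    for start, end, channel, recipient in SCHEDULE:
        if start <= elapsed_minutes <= end:
            return channel, recipient
    return None  # outside the schedule: nobody is configured


if __name__ == "__main__":
    for minutes in (3, 12, 25):
        print(minutes, target_for(minutes))
```

A duration that falls outside every range returns `None`, which makes the gap in the schedule explicit instead of silently picking a recipient.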

After checking the Zabbix documentation, I found that escalations can implement this. Configure them as follows:

Zabbix web interface: Configuration > Actions
Set Period (seconds)
Turn on Enable escalations
Set the steps under Action operations
Use the From and To step values together with the period to control the alert cycle, choose the recipient under "Send message to", choose the media type under "Send only to", and finally save.

According to the official documentation, this feature is available as of Zabbix 1.8.
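
To see how From, To and the period interact, here is a small Python sketch. It assumes that step 1 runs as soon as the problem is detected, that each later step runs one period after the previous one, and that a To value of 0 means "no upper bound"; check these against the documentation for your Zabbix version. The period of 300 seconds, the `OPERATIONS` table and the function names are illustrative assumptions, not Zabbix APIs.

```python
from typing import List

PERIOD_SECONDS = 300  # assumed period: one escalation step every 5 minutes

# (from_step, to_step, action); to_step == 0 is treated as "no upper bound".
OPERATIONS = [
    (1, 1, "mail user_a"),
    (2, 2, "mail user_b"),
    (3, 3, "sms phone_a"),
    (4, 0, "sms phone_b"),
]


def current_step(elapsed_seconds: int, period: int = PERIOD_SECONDS) -> int:
    """Escalation step in effect after elapsed_seconds (step 1 starts at 0)."""
    return elapsed_seconds // period + 1


def operations_for(step: int) -> List[str]:
    """Actions whose From/To range contains the given step."""
    return [
        action
        for first, last, action in OPERATIONS
        if first <= step and (last == 0 or step <= last)
    ]


if __name__ == "__main__":
    for seconds in (0, 400, 650, 3000):
        step = current_step(seconds)
        print(seconds, step, operations_for(step))
```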

Nagios offers a similar feature.

Define escalating notifications for a host:

# vim ${nagios_home}/etc/servers/host-escalation.cfg
-------------------------------------------------------------------
define hostescalation{
    hostgroup_name          linux-servers
    first_notification      2
    last_notification       3
    notification_interval   1440
    contact_groups          361way1
}
define hostescalation{
    hostgroup_name          linux-servers
    first_notification      4
    last_notification       0
    notification_interval   0
    contact_groups          admins
}
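
The two host escalation definitions above select contact groups by notification number. The Python sketch below models only that range check, with `last_notification` 0 treated as "no upper limit"; it is an illustration rather than Nagios code, and `HOST_ESCALATIONS` and `escalated_groups` are hypothetical names.

```python
from typing import Dict, List, Union

# Values mirror the two hostescalation definitions above.
HOST_ESCALATIONS: List[Dict[str, Union[int, str]]] = [
    {"first_notification": 2, "last_notification": 3, "contact_groups": "361way1"},
    {"first_notification": 4, "last_notification": 0, "contact_groups": "admins"},
]


def escalated_groups(notification_number: int) -> List[str]:
    """Contact groups whose escalation range covers this notification number."""
    groups = []
    for esc in HOST_ESCALATIONS:
        first = esc["first_notification"]
        last = esc["last_notification"]
        # last == 0 means the escalation never expires.
        if notification_number >= first and (last == 0 or notification_number <= last):
            groups.append(esc["contact_groups"])
    return groups


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 10):
        print(n, escalated_groups(n))
```

An empty result means no escalation matches, in which case the host's normal contacts receive the notification.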

Define escalating notifications for a service:

# vi ${nagios_home}/etc/servers/service-escalation.cfg
-------------------------------------------------------------------
define serviceescalation{
    servicegroup_name       host-basic,host-info,host-perf,oracle-basic,oracle-self,mysql-basic
    first_notification      3       ; starting from the 3rd notification, send to this group
    last_notification       5       ; 0 means no upper limit; 5 means this escalation applies only up to the 5th notification, after which notifications go to the previous contacts again
    notification_interval   480     ; re-notify every 480 minutes; a value of 0 does not mean that no notification is sent
    contact_groups          admins
}
define serviceescalation{
    servicegroup_name       host-basic,host-info,host-perf,oracle-basic,oracle-self,mysql-basic
    first_notification      6
    last_notification       7
    notification_interval   1440
    contact_groups          admins
}
define serviceescalation{
    servicegroup_name       host-basic,host-info,host-perf,oracle-basic,oracle-self,mysql-basic
    first_notification      8
    last_notification       0
    notification_interval   0
    contact_groups          admins
    escalation_options      c       ; applies to CRITICAL states only; other states are still sent to the previous group
}
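
When reviewing escalation files like the one above, it can help to load the definitions into plain dictionaries. The following Python sketch is a simplified reader for `define <type>{ ... }` blocks with `;` inline comments; it is not the Nagios parser and ignores features such as templates and includes. `SAMPLE`, `BLOCK_RE` and `parse_definitions` are assumed names used only for illustration.

```python
import re
from typing import Dict, List

SAMPLE = """
define serviceescalation{
    servicegroup_name      host-basic,host-info
    first_notification     8
    last_notification      0      ; 0 = no upper limit
    escalation_options     c
}
"""

# Matches "define <type> { <body> }"; the body may span several lines.
BLOCK_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def parse_definitions(text: str) -> List[Dict[str, str]]:
    """Return one dict per define block, mapping directive name to value."""
    objects = []
    for kind, body in BLOCK_RE.findall(text):
        obj = {"type": kind}
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop an inline ';' comment
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            obj[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append(obj)
    return objects


if __name__ == "__main__":
    for definition in parse_definitions(SAMPLE):
        print(definition)
```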

You can also refer to the configuration file method described on the official documentation web page.

Question 2

For this problem, my first idea was to send the related information along with the alarm notification. That is obviously unreasonable and clumsy, and would even feel like an SMS bomb. Later, on the bus ride home, I remembered an approach I had seen Taobao use: a chat robot. The principle is to expose many on-demand queries as an API on the monitoring system, and then integrate that API with XMPP (or even the currently popular Weixin). When you need the current resource status, you post a data request from a client such as Spark, Skype or Weixin; the monitoring platform collects the data from the host and returns it to the client.
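
As a rough sketch of the monitoring side of this idea, the Python snippet below collects a few instant values on the local host and answers a single named query. It uses only the standard library; the chat transport (XMPP, Skype, Weixin) is left out entirely, and `collect_status` and `handle_request` are hypothetical names, not part of any monitoring product.

```python
import os
from typing import Dict, Union

Value = Union[int, float, None]


def collect_status() -> Dict[str, Value]:
    """Collect a few instant values on the local host using only the stdlib."""
    try:
        load1, load5, load15 = os.getloadavg()  # available on Unix-like systems
    except (AttributeError, OSError):
        load1 = load5 = load15 = None
    return {
        "cpu_count": os.cpu_count(),
        "load1": load1,
        "load5": load5,
        "load15": load15,
    }


def handle_request(query: str) -> str:
    """Text the monitoring side might return to the chat client for one query."""
    status = collect_status()
    if query not in status:
        return "unknown query: " + query
    return f"{query}={status[query]}"


if __name__ == "__main__":
    print(handle_request("load1"))
    print(handle_request("cpu_count"))
```

In a real deployment the collection would run on the target host (or be fetched from the monitoring platform) rather than on the machine that talks to the chat client.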

I had no good drawing tool installed on my computer, so I drew a rough sketch in Word; the image is not available in this copy.

For a similar implementation, there is the Skype Sevabot project on GitHub, but it has no integration with a monitoring platform. There are three options for interacting with the client:

1, Zabbix and Nagios both provide notification to Jabber clients; with some custom development on top of that, the data request described above can be implemented. However, the domestic client applications based on the Jabber protocol are MO and Sina Weibo, and obviously neither of these is suitable;

2, SMS, which requires implementing an SMS interface, and getting it right is also rather troublesome;

3, Integration with Weixin. Because Weixin officially provides a development API and there are plenty of development tutorials online, this should be the best option to implement.

Once the interface is complete, the effect would be as follows: sending A-system returns all CPU, memory and other system information for host A; sending A-tomcat returns the Tomcat information for host A. As for the Weixin integration, I personally think it is relatively difficult to combine with Zabbix or Nagios, and easier to combine with something like SaltStack.
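
The `A-system` and `A-tomcat` messages suggest a simple `<host>-<topic>` command format. The Python sketch below parses such a message and dispatches it to a handler. The handlers return placeholder text only; in practice they would query the monitoring platform or a tool such as SaltStack. All names here (`parse_command`, `HANDLERS`, `reply_to`) are assumptions for illustration.

```python
from typing import Callable, Dict, Tuple


def parse_command(message: str) -> Tuple[str, str]:
    """Split '<host>-<topic>' at the last hyphen so host names may contain hyphens."""
    host, sep, topic = message.strip().rpartition("-")
    if not sep or not host or not topic:
        raise ValueError(f"expected '<host>-<topic>', got {message!r}")
    return host, topic.lower()


# Placeholder handlers: each returns canned text instead of real monitoring data.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "system": lambda host: f"{host}: CPU, memory and other system information",
    "tomcat": lambda host: f"{host}: Tomcat information",
}


def reply_to(message: str) -> str:
    """Build the reply the bot would send back for one incoming message."""
    host, topic = parse_command(message)
    handler = HANDLERS.get(topic)
    if handler is None:
        return f"unknown topic {topic!r}"
    return handler(host)


if __name__ == "__main__":
    print(reply_to("A-system"))
    print(reply_to("A-tomcat"))
```

Splitting at the last hyphen is a design choice: it keeps host names such as `web-01` intact, at the cost of not allowing hyphens in topic names.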
