How to configure a server for automatic monitoring and alerting

Source: Internet
Author: User
Tags: datadog, statsd

A technology-savvy operation
Links: https://www.zhihu.com/question/21073555/answer/106131463
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.

At a start-up, the number of machines is small and the workflows are not very complicated, so operations monitoring and alerting can be handled entirely with Zabbix plus an alarm aggregation service.

First, here is how our company uses Zabbix for monitoring and alerting.

Configuring alarms in Zabbix

Many tutorials are already available online: Zabbix installation tutorial.

Below is a summary of my own experience adding server monitoring and alarms in Zabbix.

The default language of Zabbix is English. If you are not comfortable with that, you can switch the interface to Chinese; solutions are easy to find online, so I will not cover them here. Next comes the process of adding a host to monitoring.

1. Configuration ---> Hosts ---> Create host


2. Set the host name ---> host group ---> host IP address ---> linked template
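
The same fields can also be submitted through the Zabbix JSON-RPC API instead of the web form. The following is a minimal sketch, not the author's method: it only builds the request body for the host.create method. The host name, IP address, group ID and template ID are placeholders, and authentication is left out because it differs between Zabbix versions.

```python
import json


def build_host_create_request(host_name, ip, group_id, template_id, request_id=1):
    """Build a JSON-RPC 2.0 request body for the Zabbix API method host.create."""
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": host_name,
            "interfaces": [
                {
                    "type": 1,        # 1 = Zabbix agent interface
                    "main": 1,        # default interface for this host
                    "useip": 1,       # connect by IP rather than DNS name
                    "ip": ip,
                    "dns": "",
                    "port": "10050",  # default Zabbix agent port
                }
            ],
            "groups": [{"groupid": group_id}],            # host group
            "templates": [{"templateid": template_id}],   # linked template
        },
        "id": request_id,
    }


# Placeholder values for illustration only.
request_body = build_host_create_request("web-01", "192.0.2.10", "2", "10001")
payload = json.dumps(request_body)
```

The resulting payload would be sent as an HTTP POST to the api_jsonrpc.php endpoint of the Zabbix frontend, together with whatever authentication your version requires.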

At this point, a new server has been added to Zabbix. The following describes how to configure alarms in Zabbix.

For alerting, Zabbix essentially supports only email out of the box. SMS alarms require integrating with an SMS gateway, which is fairly complex. I also do not think SMS is a good alarm channel: even when the alarm is sent, messages can be missed, and that does not happen rarely. So the example here uses email.


Step 1: Add the alarm media type
1. Open Administration ---> Media types ---> Create media type



For convenience, a script is used to send the alarm email; the script is named mail.py. Pay attention to where the script is located: mine is in the /usr/local/zabbix/bin/ directory. I took a shortcut and did not write the absolute path in the media type, because the script directory is set in the Zabbix server configuration file, zabbix_server.conf: AlertScriptsPath=/usr/local/zabbix/bin/
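
The original answer does not show the contents of mail.py, so the following is only a sketch of what such a script might look like. It assumes the media type is configured to pass the recipient, subject and message as three command-line arguments, and that an SMTP relay is reachable at localhost:25. The relay address and sender address are placeholders.

```python
#!/usr/bin/env python3
import smtplib
import sys
from email.message import EmailMessage

SMTP_HOST = "localhost"          # assumption: a local SMTP relay
SMTP_PORT = 25
SENDER = "zabbix@example.com"    # placeholder sender address


def build_message(recipient, subject, body):
    """Build a plain-text email from the three values passed by Zabbix."""
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg):
    """Deliver the message through the configured SMTP relay."""
    with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=10) as smtp:
        smtp.send_message(msg)


# Only send when called as: mail.py <recipient> <subject> <message>
if __name__ == "__main__" and len(sys.argv) == 4:
    send(build_message(sys.argv[1], sys.argv[2], sys.argv[3]))
```

The script must be executable by the user that runs the Zabbix server, and it must sit in the directory named by AlertScriptsPath.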


Step 2: Add Zabbix users and groups, and set their email addresses and other information
1. Open Administration ---> Users ---> select "User groups" in the drop-down ---> Create user group:

The main things here are to enter the group name, set the permissions you need, and save.


2. Open Administration ---> Users ---> select "Users" in the drop-down ---> Create user:


Once the group and user are set up, add alarm media to the user, that is, the way alarms are delivered. Because this media is for alarms, leave the "Information" severity unchecked. It generally fires when server information changes, which is rarely meaningful.

Step 3: Configure the action fired by a trigger

The point of this step is that when a trigger on a monitored item reaches the alarm threshold you set, an action such as sending a message must be performed. Here is how.
1. Open Configuration ---> Actions ---> select "Triggers" as the event source ---> Create action:


2. Configure the trigger conditions

3. Configure the operation details

This is the operation performed once the trigger conditions are met, usually sending an email. Set the users who will receive the mail. I recommend one user per group, which makes it easy to choose the recipients.

That completes the alarm setup.

Having covered Zabbix monitoring and alarm configuration, I would like to share some impressions from using Zabbix.

Zabbix strengths:
    • Automatic discovery of servers and network devices
    • Distributed monitoring with centralized management (agent and server are separate)
    • A rich set of metric templates
    • Flexible assignment of user permissions

Zabbix shortcomings:
    • Installation and configuration are relatively complex, and later maintenance costs are high
    • Data is read-only, and monitoring data cannot be aggregated
    • The alarm mechanism is not flexible enough: different metrics require different scripts, and there is only a single alarm channel
    • Different monitoring requirements need different scripts

Aggregating alarms with an alarm aggregation tool

Once the number of machines grows to a certain scale, you will find there is no time to handle all the alarms. The HR department is not much help either: it takes a long time to find people.

Fortunately, there are alarm aggregation services abroad, such as PagerDuty and BigPanda.

The main function of this kind of tool is to collect the alarms from all monitoring systems on one platform, so that ops staff can concentrate on IT events, avoid switching between platforms, and work more efficiently. Integrating alarms from mainstream monitoring platforms such as Nagios and Zabbix takes less than 15 minutes and needs no additional configuration. Taking BigPanda as an example, we can look at the monitoring platforms it integrates.

BigPanda also compresses large numbers of repetitive alarm events into a single meaningful alarm. It then merges related alarms using machine learning and other algorithms, providing analysis for ops staff and helping them pick out the most important alarms.
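
To make the idea of compressing repeated alarms concrete, here is a minimal sketch. It is not BigPanda's algorithm, which the answer does not describe. It simply groups alarms that share the same host and check into one incident with a count. The field names host, check and time are assumptions made for this example.

```python
def compress_alerts(alerts):
    """Collapse alerts that share (host, check) into one incident each."""
    incidents = {}
    for alert in alerts:
        key = (alert["host"], alert["check"])
        if key not in incidents:
            # First time this host/check pair is seen: open a new incident.
            incidents[key] = {
                "host": alert["host"],
                "check": alert["check"],
                "count": 0,
                "first_seen": alert["time"],
                "last_seen": alert["time"],
            }
        incident = incidents[key]
        incident["count"] += 1
        incident["first_seen"] = min(incident["first_seen"], alert["time"])
        incident["last_seen"] = max(incident["last_seen"], alert["time"])
    # Dicts preserve insertion order, so incidents come out in first-seen order.
    return list(incidents.values())


raw_alerts = [
    {"host": "web-01", "check": "cpu", "time": 1},
    {"host": "web-01", "check": "cpu", "time": 5},
    {"host": "db-01", "check": "disk", "time": 3},
]
incidents = compress_alerts(raw_alerts)  # two incidents instead of three alerts
```

Real aggregation services go further and correlate alarms across different hosts and checks, which a fixed key like this cannot do.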

From this you can see that monitoring abroad has moved from the 1.0 era represented by Zabbix into the 2.0 era of integrated monitoring solutions. Teams have started choosing monitoring tools or solutions built on StatsD, such as Datadog, Boundary, and other third-party monitoring service providers.
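
For readers who have not used StatsD: an application reports a metric by sending a short text line, usually over UDP, to a StatsD daemon. The sketch below formats such a line and sends it. The metric names are made up for illustration, and the address 127.0.0.1:8125 assumes a daemon listening on the conventional default port.

```python
import socket


def format_metric(name, value, metric_type, sample_rate=None):
    """Format one StatsD line: <name>:<value>|<type>[|@<sample_rate>]."""
    line = f"{name}:{value}|{metric_type}"
    if sample_rate is not None:
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line to a StatsD daemon over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


counter_line = format_metric("web.requests", 1, "c")          # counter
timer_line = format_metric("web.latency", 320, "ms", 0.5)     # sampled timer
gauge_line = format_metric("web.connections", 42, "g")        # gauge
```

Calling send_metric(counter_line) would report one request. Because the transport is UDP, the application does not block or fail if no daemon is listening.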

The idea of these companies is to provide an integrated solution: you do not have to worry about how to integrate monitoring across different operating systems, databases, and middleware.

Monitoring aggregation + alarm aggregation

Zabbix sets an alarm on one metric of one host at a time, and uses templates to create such alarms quickly. If you want to alert on the CPU utilization of a group of machines, or manage a cluster, you have to write your own scripts.
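
As an illustration of the kind of script this implies, here is a minimal sketch of a group-level CPU check. It assumes the per-host CPU utilization values have already been collected into a dictionary; how they are fetched from Zabbix is left out. The host names and the 75% threshold are made up.

```python
def group_cpu_alert(cpu_by_host, threshold):
    """Return (fired, average) for the average CPU utilization of a host group."""
    if not cpu_by_host:
        # No data for the group: do not fire, report an average of 0.
        return False, 0.0
    average = sum(cpu_by_host.values()) / len(cpu_by_host)
    return average > threshold, average


# Hypothetical readings, in percent, for a three-host web cluster.
readings = {"web-01": 90.0, "web-02": 70.0, "web-03": 80.0}
fired, average = group_cpu_alert(readings, threshold=75.0)
```

With these readings the group average is 80%, so the check fires even though web-02 on its own is below the threshold.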

Is there a solution that monitors aggregated data, so that its alarms are naturally aggregated too?

Among domestic products, I tried Cloudinsight and found it good. It uses StatsD and OpenTSDB to build an integrated monitoring solution. Because it stores data in a time series database, the data is no longer read-only: it can be aggregated and grouped, so data from different sources is aggregated and passed to the alarm processing engine. The engine checks the values in a fixed time window against the conditions set in the alarm policy. When the values in the window satisfy the condition, the engine generates an alarm event, which flows to different channels such as the Cloudinsight event stream, email, and OneAlert, and notifies the user.
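
The fixed-time-window check described above can be sketched as follows. This is not Cloudinsight's implementation, which the answer does not show. It assumes data points are (timestamp, value) pairs and that the policy is "average over the window exceeds a threshold".

```python
def window_values(points, now, window_seconds):
    """Return the values whose timestamps fall in (now - window_seconds, now]."""
    return [value for ts, value in points if now - window_seconds < ts <= now]


def check_alarm(points, now, window_seconds, threshold):
    """Return an alarm event if the window average exceeds the threshold, else None."""
    values = window_values(points, now, window_seconds)
    if not values:
        return None  # no data in the window: nothing to evaluate
    average = sum(values) / len(values)
    if average > threshold:
        return {"time": now, "average": average, "threshold": threshold}
    return None


# Hypothetical CPU samples: (unix-style timestamp in seconds, percent).
samples = [(0, 10.0), (100, 95.0), (160, 85.0)]
event = check_alarm(samples, now=180, window_seconds=120, threshold=80.0)
```

Here only the samples at 100 and 160 fall in the window, their average is 90, and an event is produced. A real engine would then route that event to the configured notification channels.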

The original poster who asked this question is probably just getting started with operations monitoring, so Cloudinsight is worth a try. Installation is easy and takes only two steps: install the agent and view the dashboard. That allows quick trial and error. Cloudinsight also integrates monitoring for dozens of infrastructure components popular with internet companies, and needs only minimal configuration, without the complexity of traditional component monitoring. This lets you see whether newer technology meets your needs better.
I like trying new things, so I gave it a trial. Its visualizations are good; here are a few screenshots.

To summarize: as cloud computing keeps developing, I think delivering software as a service will be the trend, just as cloud hosts offered as IaaS have taken part of the market share of traditional physical hosts and IDCs. In that setting, SaaS monitoring services such as Datadog, Boundary, and Cloudinsight will be a bellwether for the future of monitoring.
