For open source, I recommend three tools:
First: Zabbix
Advantages:
1. Enterprise-grade, distributed, open-source monitoring software that supports multiple platforms;
2. Simple to install and deploy, and convenient to manage;
3. Powerful and flexible monitoring, including complex alerts that combine multiple conditions;
4. Many data collection plug-ins, which makes integration flexible;
5. Built-in graphing, so collected data can be plotted directly;
6. Supports calling custom scripts, which is very convenient;
7. Provides several API interfaces for building customized monitoring on top of it;
8. Can automatically execute commands remotely when a problem occurs (execute permissions must be set on the agent);
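As a rough illustration of the API point above: the Zabbix API speaks JSON-RPC 2.0, so a request is just a small JSON document. The sketch below only builds the request body and sends nothing. The helper name `build_zabbix_request` and the group ID "2" are placeholders made up for this example, and how the token is passed depends on the Zabbix version, so treat this as an assumption-laden sketch and check the API documentation for your release.

```python
import json
from itertools import count

# Simple counter so every request gets a distinct JSON-RPC id.
_request_ids = count(1)


def build_zabbix_request(method, params, auth_token=None):
    """Return a JSON-RPC 2.0 request body (hypothetical helper)."""
    body = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    }
    if auth_token is not None:
        # Assumption: older Zabbix releases accept the token in the body;
        # newer ones expect an HTTP Authorization header instead.
        body["auth"] = auth_token
    return json.dumps(body)


# Ask for the hosts in one host group; "2" is a placeholder group ID.
payload = build_zabbix_request(
    "host.get",
    {"output": ["hostid", "host"], "groupids": ["2"]},
)
print(payload)
```

The resulting string would be sent with an HTTP POST to the server's JSON-RPC endpoint using whatever HTTP client you prefer.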
Disadvantages:
1. Modifying items in bulk is inconvenient;
2. Although the community is mature, Chinese-language documentation is relatively scarce and service support is limited;
3. It is easy to get started and to set up basic monitoring, but advanced requirements demand deep familiarity with Zabbix and a lot of custom development, which is difficult;
4. There are many system-level alarm settings, so without filtering you will receive a flood of alarm emails; alarms for custom items have to be configured by hand, which is tedious;
5. No built-in data aggregation: viewing, for example, the average across a group of servers requires custom development;
6. Data reports also require dedicated custom development;
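To give an idea of the custom development that the aggregation point implies, here is a minimal sketch that averages one metric across a group of servers. It assumes the records look like what the Zabbix API method `item.get` returns, with the latest value stored as a string in `lastvalue`. The function name `group_average` and the sample data are invented for illustration.

```python
from statistics import mean


def group_average(items):
    """Average the numeric lastvalue fields of item.get-style records."""
    values = []
    for item in items:
        try:
            values.append(float(item["lastvalue"]))
        except (KeyError, TypeError, ValueError):
            # Skip items with a missing or non-numeric latest value.
            continue
    return mean(values) if values else None


# Made-up sample: CPU load reported by three hosts, one without data.
sample_items = [
    {"hostid": "10101", "key_": "system.cpu.load", "lastvalue": "0.50"},
    {"hostid": "10102", "key_": "system.cpu.load", "lastvalue": "1.50"},
    {"hostid": "10103", "key_": "system.cpu.load", "lastvalue": ""},
]
print(group_average(sample_items))
```

In practice the list would come from an API call filtered by host group and item key rather than from hard-coded data.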
Second: Nagios
Advantages:
1. Automated operations: failed servers, applications and devices can be restarted automatically;
2. Flexible configuration with many monitoring items; checks can be written as custom shell scripts, and distributed monitoring makes it suitable for large networks;
3. Automatic log rotation;
4. Supports redundant monitoring hosts;
5. Good correlation between service events and host events;
6. The configuration file can be reloaded with a command without interrupting Nagios;
7. A wide variety of alarm settings;
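The custom scripts mentioned above follow a simple convention: a plug-in prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below shows a disk usage check written in Python instead of shell. The thresholds, the function names and the performance-data label `used_pct` are choices made for this example, not something prescribed by Nagios.

```python
import shutil

# Exit codes defined by the Nagios plug-in convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to a plug-in status code."""
    if percent_used >= crit:
        return CRITICAL
    if percent_used >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (status code, status line) for the filesystem holding path."""
    try:
        usage = shutil.disk_usage(path)
        percent = usage.used / usage.total * 100
    except (OSError, ZeroDivisionError) as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    status = evaluate(percent, warn, crit)
    # Text after "|" is performance data: label=value;warn;crit
    line = (f"DISK {LABELS[status]} - {percent:.1f}% used on {path}"
            f" | used_pct={percent:.1f}%;{warn};{crit}")
    return status, line


def main():
    status, line = check_disk("/")
    print(line)
    return status  # a real plug-in would end with sys.exit(main())
```

Keeping the threshold logic in `evaluate` separate from the measurement makes the check easy to test without touching a real disk.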
Disadvantages:
1. The event console is weak;
2. Performance, traffic and similar metrics are handled poorly;
3. No historical data is available, only alarm events, which makes it hard to trace the cause of a failure;
4. Configuration is complicated, and beginners have to invest a lot of time and effort;
5. Plug-in usability is poor;
Third: Ganglia
Advantages:
1. Well suited to monitoring system performance: the graphs make it easy to see the state of each node, which helps when adjusting and allocating system resources and improving overall performance;
2. Supports browser access, but cannot monitor hardware-level metrics of the nodes;
3. Suitable for large cluster environments;
4. Easy to deploy, with no need to add configuration for each machine;
5. A single server can manage tens of thousands of machines through a layered hierarchy;
6. Monitoring items can be customized. Data can be displayed as tables or graphs, and a mobile version is supported.
Disadvantages:
1. No built-in notification system;
2. No alarm mechanism, so problems cannot be reported promptly;
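Since Ganglia has no alarms of its own, one common workaround is an external script that reads the metrics and applies thresholds. The sketch below assumes the cluster state is available as XML with `HOST` and `METRIC` elements, which is the shape gmond normally serves over TCP. The sample document, the function name `find_breaches` and the threshold are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of gmond's XML output (assumption).
SAMPLE_XML = """<GANGLIA_XML VERSION="3.7.2" SOURCE="gmond">
  <CLUSTER NAME="demo">
    <HOST NAME="node1" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float"/>
    </HOST>
    <HOST NAME="node2" IP="10.0.0.2">
      <METRIC NAME="load_one" VAL="7.90" TYPE="float"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def find_breaches(xml_text, metric, threshold):
    """Return (host, value) pairs whose metric exceeds the threshold."""
    root = ET.fromstring(xml_text)
    breaches = []
    for host in root.iter("HOST"):
        for node in host.iter("METRIC"):
            if node.get("NAME") != metric:
                continue
            try:
                value = float(node.get("VAL"))
            except (TypeError, ValueError):
                continue  # skip non-numeric metrics
            if value > threshold:
                breaches.append((host.get("NAME"), value))
    return breaches


for name, value in find_breaches(SAMPLE_XML, "load_one", 5.0):
    print(f"ALERT {name}: load_one={value}")
```

A script like this could be run from cron and hand its results to mail or to another tool such as Nagios, which is how the two are often combined.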