Nagios Learning Note II: Nagios overview

Source: Internet
Author: User

1. Introduction

Nagios is built around plug-ins. The core itself has no monitoring functions; all monitoring is performed by plug-ins, which makes it highly modular and flexible. Nagios divides monitored objects into two categories: hosts and services. Hosts are usually physical devices such as servers, routers, workstations, and printers, but they can also be virtual, such as a Linux guest running under Xen. Services usually refer to a specific function, such as the httpd process that provides HTTP. For easier management, hosts and services can be organized into host groups and service groups.

Nagios itself does not track specific numeric indicators, such as the number of processes on an operating system. It describes the state of a monitored object using only four abstract states: OK, WARNING, CRITICAL, and UNKNOWN. As a result, administrators only need to define the thresholds for the WARNING and CRITICAL states of each monitored object. Nagios passes these thresholds to the plug-in, and the plug-in is responsible for checking the object and analyzing the result. Its output consists of the state (OK, WARNING, CRITICAL, or UNKNOWN) plus additional detail.
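The division of labor described above can be sketched in Python. This is a minimal illustration, not an actual Nagios plug-in: the function names, the "higher is worse" comparison, and the default thresholds are assumptions made for the example. The numeric codes 0 to 3 are the exit codes that Nagios plug-ins conventionally use to report the four states.

```python
# Exit codes conventionally used by Nagios plug-ins to report state.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a state, assuming higher values are worse."""
    if value is None:
        # The measurement could not be taken, so the state is unknown.
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def check_process_count(count, warn=150, crit=200):
    """Hypothetical check: return (exit_code, one-line status output).

    A real plug-in would print the line and exit with the code.
    """
    state = evaluate(count, warn, crit)
    return state, f"PROCS {STATE_NAMES[state]}: {count} processes"
```

The point of the sketch is that the thresholds arrive as parameters and the plug-in reduces a raw number to one of four states; the core never has to understand what a "process count" is.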

2. Characteristics

As described above, Nagios is highly flexible, and its monitoring can be made to behave exactly as the administrator expects. It also provides automatic responses to problems and a powerful notification system. All of these features rest on a well-structured object definition system and a small number of object types.

1) Commands

Commands define how Nagios performs a specific monitoring task. A command is a layer of abstraction defined on top of a particular Nagios plug-in, and it typically contains the set of actions to be performed.

2) Time periods

A time period defines a span of dates and times during which an action may or may not be performed, for example 8:00-18:00 on each working day.
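As a rough illustration of the idea, the following Python sketch models a time period as a mapping from weekday to a start and end time and tests whether a given moment falls inside it. The `WORKHOURS` name and the data layout are assumptions for this example and do not reflect how Nagios stores time periods internally.

```python
from datetime import datetime, time

# Hypothetical time period: Monday (0) to Friday (4), 08:00-18:00.
WORKHOURS = {day: (time(8, 0), time(18, 0)) for day in range(5)}


def in_period(period, moment):
    """Return True if `moment` falls inside the period (end time exclusive)."""
    window = period.get(moment.weekday())
    if window is None:
        # No window is defined for this weekday.
        return False
    start, end = window
    return start <= moment.time() < end
```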

3) Contacts and contact groups

A contact defines who is notified about a monitoring event, what information they are sent, and when and how they receive notifications. One or more contacts can be combined into a contact group, and a contact can belong to more than one group.

4) Hosts and host groups

A "host" typically refers to a physical host. Its definition includes the recipients (that is, the contacts) of notifications about the host, and how and when the host is monitored. Hosts can also be grouped into host groups, and a host can belong to several groups at the same time.

5) Services

A "service" typically refers to a specific function or resource on a host that can be monitored. Its definition includes the recipients of notifications about the service, how and when it is monitored, and so on. Services can also be grouped into service groups, and a service can belong to several service groups at the same time.

3. Dependencies

The power of Nagios also shows in its mature dependency system. For example, when a router fails, the hosts behind it inevitably become unreachable. If the dependencies between these devices cannot be described, the monitoring system will report a flood of device failures. Nagios uses dependencies to describe the topology of network devices, so when a device fails it can skip checking the devices that depend on it. This avoids unnecessary failure reports and helps administrators locate and fix the real fault quickly. Dependencies can also be defined at the service level: if a service relies on other services, it is handled in the same way as a host dependency.
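The effect of dependencies on alerting can be sketched as follows. This is a simplified model and not Nagios code: it assumes each host has at most one parent, and the `alertable_hosts` function and the host names are invented for the example.

```python
def alertable_hosts(parents, down):
    """Return the down hosts that have no down ancestor.

    parents: dict mapping host name -> parent host name (or None).
    down:    set of host names currently failing their checks.
    """
    def has_down_ancestor(host):
        seen = set()  # guards against accidental cycles in `parents`
        parent = parents.get(host)
        while parent is not None and parent not in seen:
            if parent in down:
                return True
            seen.add(parent)
            parent = parents.get(parent)
        return False

    return sorted(h for h in down if not has_down_ancestor(h))


# Hypothetical topology: two servers reachable only through one router.
topology = {"router1": None, "web1": "router1", "db1": "router1"}
```

With this topology, a failure of `router1` together with `web1` and `db1` yields a single alert for `router1`, while a failure of `web1` alone is still reported.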

4. Macros

Nagios also supports macros, which are defined and used consistently throughout the system. A macro is a variable that can be used in an object definition, and its value usually depends on context. A macro used in a command takes a different value depending on the host, service, or other parameters involved. For example, one command can monitor different hosts because a different IP address is passed to it each time.
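Macro expansion can be illustrated with a small Python sketch. Nagios writes macros as names wrapped in dollar signs, such as `$HOSTADDRESS$` and `$ARG1$`; the `expand` function, the regular expression, and the sample values below are assumptions for this example and are not how Nagios implements substitution.

```python
import re

# Matches macro references of the form $NAME$.
MACRO = re.compile(r"\$([A-Z0-9_]+)\$")


def expand(command_line, context):
    """Replace each $NAME$ with its value from `context`.

    Macros missing from `context` are left untouched.
    """
    return MACRO.sub(
        lambda m: str(context.get(m.group(1), m.group(0))), command_line
    )


# One command template, reusable for any host.
template = "check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$"
```

Calling `expand` on the same template with a different `HOSTADDRESS` value for each host produces a different command line each time, which is how one command definition can serve many hosts.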

5. Planned Downtime

Nagios also provides a scheduled downtime mechanism, with which administrators can mark a host or service as unavailable for a planned period. Nagios sends no problem notifications for that host or service during the scheduled downtime. It also lets Nagios automatically inform administrators that the host or service is under maintenance.
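The suppression rule can be sketched in a few lines of Python. The `should_notify` function and the representation of downtime as a list of start and end pairs are assumptions for illustration only.

```python
from datetime import datetime


def should_notify(moment, downtimes):
    """Return True if `moment` lies outside every scheduled downtime window.

    downtimes: list of (start, end) datetime pairs, end exclusive.
    """
    return not any(start <= moment < end for start, end in downtimes)


# Hypothetical maintenance window: 2024-01-08, 22:00-23:00.
maintenance = [(datetime(2024, 1, 8, 22, 0), datetime(2024, 1, 8, 23, 0))]
```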

6. Soft and hard states

As mentioned above, the primary task of Nagios is to detect and store the state of hosts and services. At any moment a host or service can be in only one of the four available states, so it is critical that the recorded state correctly reflects the actual state. To filter out occasional, temporary, or random problems, Nagios distinguishes between soft and hard states. In practice, once Nagios finds that the state of a host or service is UNKNOWN or differs from the previous check, it rechecks the host or service several times to make sure the change is not a fluke. The number of rechecks is configurable, and while they are in progress Nagios treats the changed state as a soft state. If the object is still in the new state once the rechecks are complete, that state becomes a hard state.
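The soft-to-hard transition can be modeled with a small state tracker. This is a simplified sketch under stated assumptions, not the actual Nagios logic: the `StateTracker` class is invented for the example, any OK result is treated as an immediate hard recovery, and `max_check_attempts` is used here in the sense of the Nagios directive of the same name, meaning the number of consecutive checks needed before a problem state is considered hard.

```python
class StateTracker:
    """Track soft and hard state for one host or service (simplified)."""

    def __init__(self, max_check_attempts=3):
        self.max_check_attempts = max_check_attempts
        self.state = "OK"
        self.state_type = "HARD"
        self.attempt = 1

    def record(self, result):
        """Record one check result and return (state, state_type)."""
        if result == "OK":
            # Simplification: any OK result is a direct return to hard OK.
            self.state, self.state_type, self.attempt = "OK", "HARD", 1
            return self.state, self.state_type
        if self.state == "OK":
            self.attempt = 1          # first non-OK result after OK
        elif self.state_type == "SOFT":
            self.attempt += 1         # a recheck while still in a soft state
        self.state = result
        if self.attempt >= self.max_check_attempts:
            self.state_type = "HARD"
        else:
            self.state_type = "SOFT"
        return self.state, self.state_type
```

With the default of three attempts, the first two CRITICAL results are recorded as soft and the third makes the state hard; a single transient failure followed by OK never reaches a hard problem state.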
