OpenStack Ceilometer alarm operating mechanism

Source: Internet
Author: User

1 Evaluation

The alarm list is checked on a timer, and an alert is raised when a monitored value fails to satisfy its configured limit.

There are three types of evaluation service: the default service, the single-process evaluation service, and the distributed evaluation service. Which one runs depends on the configuration; unless configured otherwise, the default service is used.

Alarm status

Name      Database code        Corresponding database action field
UNKNOWN   insufficient data    insufficient_data_actions
OK        ok                   ok_actions
ALARM     alarm                alarm_actions
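
The state names and action fields listed above can be written down as a small lookup. This is only an illustrative sketch: `STATE_ACTION_FIELD` and `actions_for` are hypothetical names, the alarm is modeled as a plain dict rather than Ceilometer's own alarm object, and `log://` is just an example action.

```python
# Hypothetical lookup: stored state value -> name of the action field.
STATE_ACTION_FIELD = {
    "insufficient data": "insufficient_data_actions",
    "ok": "ok_actions",
    "alarm": "alarm_actions",
}


def actions_for(alarm, state):
    """Return the actions configured on `alarm` for the given state."""
    return alarm.get(STATE_ACTION_FIELD[state], [])


# Example alarm with one action for the alarm state and none for ok.
example_alarm = {"alarm_actions": ["log://"], "ok_actions": []}
print(actions_for(example_alarm, "alarm"))
```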

1.1 Service

1.1.0 AlarmService

Base class. The other services inherit from it, and it implements the basic evaluation function.

1.1.1 AlarmEvaluationService (default service)
    1. Start the evaluation timer based on the alarm list (the currently enabled alarms)
    2. Start the load-balancing service and start the heartbeat timer
1.1.2 SingletonAlarmService (single-process evaluation service)

Single-process evaluation with limited processing capacity. Under a high data volume, evaluation is delayed or the service shuts down, so it is not recommended.

    1. Start the evaluation timer based on the alarm list (the currently enabled alarms)
1.1.3 PartitionedAlarmService (distributed evaluation service)

PartitionedAlarmService

It implements a coordination protocol (PartitionCoordinator) between multiple evaluator processes over RPC. This lets the alarm service gain processing capacity through horizontal scaling and provides simple load balancing and high availability.

PartitionCoordinator

Multiple ceilometer-alarm-evaluator processes can be started, and they cooperate with each other. The process that started earliest is selected as the master, and the master's main job is to assign alarms to the other processes. Each process performs three tasks on a recurring schedule:

    • Publish a presence message: broadcast its own status to the other processes over RPC to tell them it is alive. Each process records the last active time of every other process.
    • Check whether it can become master: each process keeps updating its list of the other processes' statuses and uses that list to decide whether it should be the master. There is only one criterion for being master: which process started earliest.
    • Evaluate the data: check the alarms this process is responsible for, call the ceilometerclient interface to fetch the monitoring data for each alarm's meter, then evaluate it, raise the alarm, and so on.
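
The three recurring tasks can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Ceilometer's actual code: the `Evaluator` class, the 10-second liveness timeout, and the round-robin assignment in `assign_round_robin` are all hypothetical, and RPC is replaced by direct method calls.

```python
class Evaluator:
    """Toy model of one evaluator process (not Ceilometer's real class)."""

    def __init__(self, name, start_time, timeout=10.0):
        self.name = name
        self.start_time = start_time
        self.timeout = timeout  # assumed liveness timeout in seconds
        # peer name -> (peer start time, time its last presence arrived)
        self.peers = {}

    def receive_presence(self, peer_name, peer_start_time, now):
        """Record a presence message broadcast by a peer."""
        self.peers[peer_name] = (peer_start_time, now)

    def live_peers(self, now):
        """Start times of peers whose last presence is recent enough."""
        return {
            name: started
            for name, (started, seen) in self.peers.items()
            if now - seen <= self.timeout
        }

    def is_master(self, now):
        """Master if no live peer started earlier than this process."""
        return all(self.start_time <= s for s in self.live_peers(now).values())


def assign_round_robin(alarm_ids, evaluator_names):
    """Assumed assignment policy: the master deals alarms out in turn."""
    assignment = {name: [] for name in evaluator_names}
    for i, alarm_id in enumerate(sorted(alarm_ids)):
        assignment[evaluator_names[i % len(evaluator_names)]].append(alarm_id)
    return assignment


a = Evaluator("a", start_time=100.0)
b = Evaluator("b", start_time=105.0)
a.receive_presence("b", 105.0, now=110.0)
b.receive_presence("a", 100.0, now=110.0)
print(a.is_master(110.0), b.is_master(110.0))
print(assign_round_robin(["alarm-1", "alarm-2", "alarm-3"], ["a", "b"]))
```

In this model, if "a" stops publishing presence messages, "b" stops counting it as alive once the timeout passes and takes over as master, which is where the high availability comes from.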

1.2 Alarm

1.2.1 Combination

A combination alarm combines the evaluation results of several alarms and acts on the combined result.
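
A combination can be sketched as follows. The sketch assumes that each input is the state of another alarm, that the states are joined with an "and" or "or" operator, and that any unknown input makes the combined result unknown. `combine_states` is a hypothetical helper, not a Ceilometer function.

```python
def combine_states(states, operator="and"):
    """Combine several alarm states into one (assumed semantics)."""
    # Any missing or unknown input makes the combined state unknown.
    if not states or "insufficient data" in states:
        return "insufficient data"
    in_alarm = [state == "alarm" for state in states]
    triggered = all(in_alarm) if operator == "and" else any(in_alarm)
    return "alarm" if triggered else "ok"


print(combine_states(["alarm", "ok"], "and"))
print(combine_states(["alarm", "ok"], "or"))
```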

1.2.2 Threshold

A threshold alarm monitors one or more meters. When the monitored value is greater than, less than, or equal to the threshold (or meets another configured comparison), the action for the resulting alarm state is triggered.
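
The comparison step can be sketched with Python's `operator` module. This is a simplification under assumptions: `evaluate_threshold` is a hypothetical helper, its input is one aggregated value per period, and it reports "alarm" only when every period meets the condition.

```python
import operator

# Comparison operator names mapped to Python comparison functions.
COMPARATORS = {
    "gt": operator.gt, "ge": operator.ge,
    "lt": operator.lt, "le": operator.le,
    "eq": operator.eq, "ne": operator.ne,
}


def evaluate_threshold(values, comparison_operator, threshold):
    """Return the state implied by per-period values (simplified)."""
    if not values:
        return "insufficient data"
    compare = COMPARATORS[comparison_operator]
    # Simplification: alarm only if every period meets the condition.
    if all(compare(value, threshold) for value in values):
        return "alarm"
    return "ok"


print(evaluate_threshold([1000.0, 995.5], "gt", 990))
print(evaluate_threshold([1000.0, 900.0], "gt", 990))
```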

2 Alarms

The alarm function checks meter data against the rules defined in the alarm object and raises the alarm when the data meets the conditions. The initial alarm state is OK. If the state changes to UNKNOWN or ALARM, a state-update record is written to the alarm_history table and the action for the new state is triggered. If the state is ALARM before evaluation and is still ALARM afterwards, the corresponding action is not triggered again.
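
The rule that actions fire only on a state change can be sketched as below. This is a minimal illustration, not Ceilometer's implementation: `update_state` is hypothetical, the alarm is a plain dict, and a Python list stands in for the history table.

```python
def update_state(alarm, new_state, history, notify):
    """Apply an evaluation result; return True if the state changed."""
    old_state = alarm["state"]
    if new_state == old_state:
        # Unchanged state: no history record, no action.
        return False
    alarm["state"] = new_state
    history.append({"alarm_id": alarm["alarm_id"],
                    "from": old_state, "to": new_state})
    # "insufficient data" -> "insufficient_data_actions", and so on.
    action_field = new_state.replace(" ", "_") + "_actions"
    for action in alarm.get(action_field, []):
        notify(action, alarm["alarm_id"], old_state, new_state)
    return True


sent, history = [], []


def record(*args):
    """Stand-in notifier that only remembers what it was asked to send."""
    sent.append(args)


alarm = {"alarm_id": "a1", "state": "ok", "alarm_actions": ["log://"]}
print(update_state(alarm, "alarm", history, record))  # state changes
print(update_state(alarm, "alarm", history, record))  # still alarm
print(len(sent), len(history))
```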

2.1 Log

Writes a log entry at level INFO.

2.2 Rest

The action configured for the state is invoked over HTTP, usually as a call to a specified address that reports the status.
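
Such an HTTP call could look roughly like the sketch below, which only builds the request and does not send it. The URL, the helper name `build_rest_notification`, and the JSON field names are assumptions for illustration; the payload Ceilometer actually sends may differ.

```python
import json
import urllib.request


def build_rest_notification(url, alarm_id, previous, current, reason):
    """Build (but do not send) a POST request reporting a state change."""
    # Field names below are illustrative assumptions.
    body = json.dumps({
        "alarm_id": alarm_id,
        "previous": previous,
        "current": current,
        "reason": reason,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_rest_notification(
    "http://example.com/alarm-hook",  # hypothetical receiver address
    "alarm-1", "ok", "alarm", "threshold exceeded",
)
print(request.get_method(), request.full_url)
# Passing `request` to urllib.request.urlopen() would actually send it.
```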

2.3 Test

For testing only; it has no practical use.

2.4 Trust

Calls the Keystone interface, then sends the notification using the same method as Rest.

3 Issues that may be encountered
    1. The wrong time period is selected when creating an alarm. If the alarm should be evaluated all the time rather than only during certain periods, this field can simply be left unset.
    2. Think carefully about how the conditions combine when creating an alarm: the evaluation interval (period/evaluation_periods), the time range (time_constraints), the alarm type, and the actions for each state (xx_actions).
    3. Give the initial state as OK when creating an alarm.
    4. Alarm rule settings. The general form is: for the records of [meter_name] that match [field], compute the [avg/max/min/...] over [evaluation_periods] periods; when that value is [greater than (gt), less than (lt), equal to (eq), ...] the threshold, the condition is met and the alarm state needs to be updated. Rule example:

    "threshold_rule": {
        "comparison_operator": "gt",    # greater than
        "evaluation_periods": 2,        # together with period, determines the evaluation time window
        "exclude_outliers": False,
        "meter_name": "disk.device.read.requests",
        "period": 10,
        "query": [                      # query rules
            {
                "field": "resource_id",
                "op": "eq",
                "type": "string",
                "value": "fc0e5394-0276-413e-8d81-e3324df35a12-vda"
            }
        ],
        "statistic": "avg",             # how the meter's volume is aggregated, such as average, maximum, minimum, etc.
        "threshold": 990                # threshold value
    }
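
To make the rule example concrete, the sketch below writes the same rule as a Python dict with lowercase keys and derives two things from it. It assumes that `period` is in seconds and that the evaluation window is simply period × evaluation_periods; `evaluation_window_seconds` and `describe` are hypothetical helpers.

```python
threshold_rule = {
    "comparison_operator": "gt",
    "evaluation_periods": 2,
    "exclude_outliers": False,
    "meter_name": "disk.device.read.requests",
    "period": 10,
    "query": [{
        "field": "resource_id",
        "op": "eq",
        "type": "string",
        "value": "fc0e5394-0276-413e-8d81-e3324df35a12-vda",
    }],
    "statistic": "avg",
    "threshold": 990,
}


def evaluation_window_seconds(rule):
    """Assumed window: one period per evaluation, back to back."""
    return rule["period"] * rule["evaluation_periods"]


def describe(rule):
    """Render the rule as a one-line, human-readable condition."""
    return ("{statistic} of {meter_name} {comparison_operator} {threshold} "
            "over {evaluation_periods} periods of {period}s").format(**rule)


print(evaluation_window_seconds(threshold_rule))
print(describe(threshold_rule))
```

With the values above, the rule reads as: the average of disk.device.read.requests for the given resource must exceed 990 in each of 2 consecutive 10-second periods.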
