Server Status Monitoring with SNMP and IPMI

Source: Internet
Author: User
Tags: snmp

Part 1: IPMI

1. Introduction

IPMI (Intelligent Platform Management Interface) is a common interface standard for hardware management that makes platform management "intelligent".

It is an open, free standard that works across different operating systems.

It monitors the physical health of a server, such as temperature, voltage, fan status, power supply status, and chassis intrusion.

Core component: the BMC (Baseboard Management Controller), an embedded microcontroller that acts as the brain of platform management.

All IPMI functions are carried out by sending commands to the BMC. The BMC receives event messages and records them in the System Event Log (SEL), maintains the sensor data records that describe the sensors in the system, and supports remote access.

BMC has the following features:

1. Access through the system's serial port

2. Fault logging and SNMP alert sending

3. Access to the System Event Log (SEL) and sensor status

4. Power control, including power-on and shutdown

5. Operation independent of the system's power or working state

6. Text console redirection for system setup, text-based utilities, and operating system consoles

The main benefit of a BMC-based design is that it is independent of the CPU, BIOS, and OS, so the server can be monitored whether it is powered on or off (as long as standby power is available).

2. Prerequisites for using IPMI

(1) The server hardware itself provides IPMI support

Most vendors, such as HP, Dell, and NEC, currently support IPMI 2.0, but not every server does. First check the product manual or the BIOS to confirm that the server supports IPMI, that is, that the motherboard carries an embedded management microcontroller such as a BMC.

(2) The operating system provides the appropriate IPMI driver

The kernel must provide support for reading the server's IPMI information from the operating system. On Linux, the OpenIPMI driver in the kernel exposes a system interface to the BMC. Load the driver modules before using it; ipmi_devintf and ipmi_si provide the /dev/ipmi0 device that the local interface relies on, and ipmi_watchdog adds the watchdog timer:

modprobe ipmi_msghandler
modprobe ipmi_devintf
modprobe ipmi_si
modprobe ipmi_watchdog

(3) IPMI management tools

On Linux, the command-line tool ipmitool is used here as the IPMI platform management tool. Other open-source tools exist as well, such as ipmiutil.

ipmitool can manage a server in two ways: (1) locally, accessing the BMC through the OpenIPMI interface of the OS, and (2) remotely, over the network.

Local server management: system architecture

Command format for local monitoring: ipmitool -I open command, where -I open selects the OpenIPMI interface.

command is one of the following:

a) raw: send a raw IPMI request and print the response.
b) lan: configure the network (LAN) channel.
c) chassis: view chassis status and control power.
d) event: send a predefined event to the BMC; useful for testing whether the SNMP alert configuration works.
e) mc: view the status of the MC (Management Controller) and its enabled options.
f) sdr: print the entries in the Sensor Data Repository (SDR) and the values read from the sensors.
g) sensor: print detailed sensor information.
h) fru: print the built-in Field Replaceable Unit (FRU) information.
i) sel: print the System Event Log (SEL).
j) pef: configure Platform Event Filtering (PEF). When the monitoring system detects an event, the event is matched against the PEF filters to decide whether an alert is required.
k) sol/isol: configure Serial over LAN (SOL).
l) user: configure user information in the BMC.
m) channel: configure the management controller channels.
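
To process sensor readings in a program rather than by eye, the output of the sensor command can be parsed line by line. The following C++ sketch assumes the common pipe-separated layout of that output (name | value | unit | status | thresholds); the struct and function names are hypothetical and not part of ipmitool.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of `ipmitool sensor` output, reduced to its first four columns.
struct SensorReading {
    std::string name;
    std::string value;   // kept as text because unreadable sensors report "na"
    std::string unit;
    std::string status;
};

// Remove leading and trailing whitespace from one field.
static std::string trim_field(const std::string& s) {
    const std::string ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Split a '|'-separated line into trimmed fields.
static std::vector<std::string> split_fields(const std::string& line) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t bar = line.find('|', start);
        if (bar == std::string::npos) {
            fields.push_back(trim_field(line.substr(start)));
            break;
        }
        fields.push_back(trim_field(line.substr(start, bar - start)));
        start = bar + 1;
    }
    return fields;
}

// Returns true and fills `out` when the line has at least four columns.
bool parse_sensor_line(const std::string& line, SensorReading& out) {
    const std::vector<std::string> f = split_fields(line);
    if (f.size() < 4) return false;
    out.name = f[0];
    out.value = f[1];
    out.unit = f[2];
    out.status = f[3];
    return true;
}
```

For a made-up line such as "CPU Temp | 45.000 | degrees C | ok | na", parse_sensor_line fills name with "CPU Temp" and status with "ok". Real output varies between BMC vendors, so the column count should be checked as the function does.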

Monitoring remote servers

System architecture

ipmitool -H 10.6.77.249 -U root -P changeme -I lan command

Before remote access works, the BMC's LAN channel must be configured with an IP address, netmask, and gateway.

Part 2: SNMP

1. Introduction

SNMP (Simple Network Management Protocol) is a network management protocol defined by the Internet Engineering Task Force (IETF).

It is an application-layer protocol in the TCP/IP protocol suite.

It is used to monitor network status, modify network device configuration, receive network event alarms, and more.

2. Working principle

SNMP uses a client/server model, here called the agent/management station model. The network is managed and maintained through the interaction between the management station and the SNMP agents.

The SNMP agent answers the management station's queries for the information defined in the agent's MIB.

Application scenarios

The management station and the agent use the MIB as a common interface; the MIB defines the managed objects in a device. Both sides implement the corresponding MIB objects, so each can interpret the other's data. The management station requests data defined in the MIB from the agent. The agent recognizes the request, converts the relevant state or parameters of the managed device into the format the MIB defines, and returns the result to the management station, which completes one management operation.

A complete SNMP system consists mainly of the Management Information Base (MIB), the Structure of Management Information (SMI), and the SNMP message protocol.

(1) Management Information Base MIB

Every managed resource (CPU, memory, and so on) is represented as an object. A MIB is a collection of managed objects and defines a set of properties for each one: its name, its access rights, and its data type. Each SNMP device (agent) has its own MIB. The MIB can be regarded as the bridge between the NMS (network management system) and the agent.

NMS, agent, and MIB relationships

The MIB is organized as a hierarchical tree with three nodes at the first level: ccitt (0), iso (1), and joint-iso-ccitt (2). Object IDs at lower levels are assigned by the responsible organizations. The identifier of a particular object is given by the path from the root to that object. Network devices generally use objects under the iso node. For example, under the ip node, the MIB variable named ipInReceives is assigned the number 3, so the full name of the variable is

iso.org.dod.internet.mgmt.mib.ip.ipInReceives

The corresponding numeric representation (the object identifier OID, which uniquely identifies a MIB object) is:

1.3.6.1.2.1.4.3
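
The mapping from the name path to the numeric OID can be reproduced with a small lookup table. The C++ sketch below hard-codes only the one branch of the tree that the text walks through; the function names are hypothetical, and a real tool would load the numbers from MIB files instead.

```cpp
#include <cassert>
#include <map>
#include <string>

// Number assigned to the last node of each name path (only the branch from the text).
const std::map<std::string, unsigned>& arc_table() {
    static const std::map<std::string, unsigned> table = {
        {"iso", 1},
        {"iso.org", 3},
        {"iso.org.dod", 6},
        {"iso.org.dod.internet", 1},
        {"iso.org.dod.internet.mgmt", 2},
        {"iso.org.dod.internet.mgmt.mib", 1},
        {"iso.org.dod.internet.mgmt.mib.ip", 4},
        {"iso.org.dod.internet.mgmt.mib.ip.ipInReceives", 3},
    };
    return table;
}

// Translate a dotted name path into a dotted numeric OID.
// Returns an empty string if any prefix of the path is unknown.
std::string name_to_oid(const std::string& name) {
    std::string oid;
    std::size_t end = 0;
    while (end != std::string::npos) {
        end = name.find('.', end + 1);
        const std::string prefix = name.substr(0, end);  // whole name when end is npos
        const std::map<std::string, unsigned>::const_iterator it = arc_table().find(prefix);
        if (it == arc_table().end()) return "";
        if (!oid.empty()) oid += '.';
        oid += std::to_string(it->second);
    }
    return oid;
}
```

Calling name_to_oid("iso.org.dod.internet.mgmt.mib.ip.ipInReceives") walks the path one node at a time and collects the number of each node, which yields the numeric form given in the text.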

(2) Structure of Management Information (SMI)

SMI defines the common structures and notation used to describe MIB objects.

(3) SNMP message protocol

SNMPv1 defines five message types: get-request, get-response, get-next-request, set-request, and trap.

(1) get-request, get-next-request, and get-response

The SNMP management station uses a get-request message to retrieve information from a network device that runs an SNMP agent, and the agent answers with a get-response message. get-next-request is used together with get-request to walk the column entries of a table object.

(2) set-request

With set-request, the management station can configure a network device remotely (for example set the device name or device properties, delete a device entry, or enable or disable a device property).

(3) trap

The SNMP agent uses traps to send unsolicited messages to the management station, typically to report an event such as an interface going up or down or an IP address change.
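
How get-request and get-next-request differ is easiest to see on an in-memory table. The following C++ sketch models an agent's MIB as a sorted map from OID to value; it is an illustration only, with hypothetical names and no network code. An exact lookup stands in for get-request, and "first entry after this OID" stands in for get-next-request, which is what lets a manager walk a table column without knowing its row indexes in advance.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Oid = std::vector<unsigned>;
// std::vector compares lexicographically, which matches OID ordering.
using MibTable = std::map<Oid, std::string>;

// get-request: exact match only.
bool agent_get(const MibTable& mib, const Oid& oid, std::string& value) {
    const MibTable::const_iterator it = mib.find(oid);
    if (it == mib.end()) return false;
    value = it->second;
    return true;
}

// get-next-request: the first object whose OID sorts strictly after `oid`.
bool agent_get_next(const MibTable& mib, const Oid& oid, Oid& next_oid, std::string& value) {
    const MibTable::const_iterator it = mib.upper_bound(oid);
    if (it == mib.end()) return false;
    next_oid = it->first;
    value = it->second;
    return true;
}

// Walk every object below `root` by repeating get-next until the subtree is left.
std::vector<std::string> walk(const MibTable& mib, const Oid& root) {
    std::vector<std::string> values;
    Oid current = root;
    Oid next;
    std::string value;
    while (agent_get_next(mib, current, next, value)) {
        const bool inside = next.size() >= root.size() &&
                            std::equal(root.begin(), root.end(), next.begin());
        if (!inside) break;
        values.push_back(value);
        current = next;
    }
    return values;
}
```

With entries stored under 1.3.6.1.2.1.2.2.1.2.1 and 1.3.6.1.2.1.2.2.1.2.2, walking from 1.3.6.1.2.1.2.2.1.2 returns both values in order and stops at the first OID outside that column.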

Of the five messages above, get-request, get-next-request, and set-request are sent by the management station to UDP port 161 on the agent, while get-response and trap are sent by the agent process to the management process. Trap messages go to port 162 on the management station. All messages are carried in UDP datagrams.
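
The direction and destination port of each message type can be summarized in a few lines of C++. The sketch below uses the SNMPv1 PDU tag values (0xA0 to 0xA4) for the enumeration; the function names are hypothetical, and the requester_port parameter stands for whatever source port the manager happened to send its request from.

```cpp
#include <cassert>
#include <cstdint>

// SNMPv1 PDU types with their BER tag values.
enum class PduType : std::uint8_t {
    GetRequest     = 0xA0,
    GetNextRequest = 0xA1,
    GetResponse    = 0xA2,
    SetRequest     = 0xA3,
    Trap           = 0xA4
};

constexpr std::uint16_t kAgentPort = 161;  // agent listens here for requests
constexpr std::uint16_t kTrapPort  = 162;  // management station listens here for traps

// True for the three request types that the management station originates.
bool sent_by_manager(PduType type) {
    return type == PduType::GetRequest ||
           type == PduType::GetNextRequest ||
           type == PduType::SetRequest;
}

// Destination UDP port of a message. A get-response simply goes back to the
// port the request came from, so that port is passed in.
std::uint16_t destination_port(PduType type, std::uint16_t requester_port) {
    if (sent_by_manager(type)) return kAgentPort;
    if (type == PduType::Trap) return kTrapPort;
    return requester_port;
}
```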

SNMP message format

The SNMP agent and the management station communicate through the standard messages of the SNMP protocol, each of which is a separate datagram. SNMP uses UDP (User Datagram Protocol) as its layer 4 (transport) protocol, so operation is connectionless. An SNMP message consists of two parts: the SNMP header and the protocol data unit (PDU).

On the wire, the length of an SNMP message depends on the encoding used. SNMP messages are encoded with BER (Basic Encoding Rules), and the SNMP specification itself is written in ASN.1 syntax, which defines a number of data types.

The SNMP message is encapsulated in a UDP datagram at the transport layer, and UDP in turn runs over IP, so a complete packet consists of the IP header, the UDP header, and the SNMP message.
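
As a concrete instance of BER, the C++ sketch below encodes an object identifier the way it appears inside an SNMP message: tag 0x06, a length byte, then the content, where the first two numbers are packed into one value (40 * first + second) and every value is written in base 128 with the high bit marking continuation. The function names are hypothetical, and the sketch only handles content of up to 127 bytes (short-form length).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append one sub-identifier in base 128; the high bit is set on every byte except the last.
static void append_base128(std::vector<std::uint8_t>& out, std::uint32_t value) {
    std::uint8_t digits[5];
    int n = 0;
    do {
        digits[n++] = static_cast<std::uint8_t>(value & 0x7F);
        value >>= 7;
    } while (value != 0);
    while (n > 0) {
        --n;
        out.push_back(static_cast<std::uint8_t>(digits[n] | (n > 0 ? 0x80 : 0x00)));
    }
}

// Encode an OID as BER tag + length + content.
// Returns an empty vector for input this sketch does not handle.
std::vector<std::uint8_t> ber_encode_oid(const std::vector<std::uint32_t>& arcs) {
    if (arcs.size() < 2 || arcs[0] > 2 || (arcs[0] < 2 && arcs[1] > 39)) return {};
    std::vector<std::uint8_t> content;
    append_base128(content, arcs[0] * 40 + arcs[1]);
    for (std::size_t i = 2; i < arcs.size(); ++i) append_base128(content, arcs[i]);
    if (content.size() > 127) return {};  // long-form length is out of scope here

    std::vector<std::uint8_t> out;
    out.push_back(0x06);  // universal tag for OBJECT IDENTIFIER
    out.push_back(static_cast<std::uint8_t>(content.size()));
    out.insert(out.end(), content.begin(), content.end());
    return out;
}
```

For 1.3.6.1.2.1.4.3 this produces the nine bytes 06 07 2B 06 01 02 01 04 03: the leading 1.3 collapses to 0x2B (43), and each remaining number fits in one byte. A number of 128 or more, such as 2021, takes two bytes (8F 65).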

SNMP Trap

An SNMP trap is a mechanism by which a managed device actively sends a message to the NMS.

Traps are part of SNMP. When a specific event occurs on the monitored side, such as a performance problem or a network interface outage, the agent sends an alarm event to the management station. Without such notifications, the NMS would have to poll the agents constantly, which wastes computing resources. It is the same reason a CPU is notified of incoming data by interrupts rather than by polling, and it makes trap notification the more reasonable choice.

Net-SNMP

Net-SNMP is an open-source implementation of the SNMP protocol, including everything needed to send and receive SNMP traps.

Practical Walkthrough

Agent

NMS

Implementation process

Get CPU Usage

// Idle CPU percentage
void get_cpu_idle(unsigned int clientreg, void *clientarg)
{
    char buffer[80] = {0};
    // %idle is the last column of mpstat output, so $NF is used instead of a
    // fixed column number that varies between sysstat versions and locales
    const char* cpu_cmd = "mpstat -u -P ALL | grep all | awk '{print $NF}'";
    executeCMD(cpu_cmd, buffer);
    float cpu_idle = atof(buffer);

    // Read the CPU threshold from the configuration
    std::string max_cpu_idle_per_str;
    int max_cpu_idle_per = -1;
    if (get_section_val("cpu", "max_cpu_idle_per", max_cpu_idle_per_str) == 0)
        max_cpu_idle_per = atoi(max_cpu_idle_per_str.c_str());

    // Despite its name, max_cpu_idle_per is compared against utilization (100 - idle)
    float cpu_util_rate = 100 - cpu_idle;
    if (cpu_util_rate > max_cpu_idle_per && max_cpu_idle_per > 0)
    {
        // Send the alert message
        String msg;
        msg.format("Warning: CPU utilization rate(%%) is %.2f%%", cpu_util_rate);
        send_msg(msg);
    }
}
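
The callback above relies on an executeCMD helper that the article does not show. A minimal sketch of how such a helper and the threshold check could look is given below; it assumes a POSIX system (popen), and run_command, last_field_as_double, and cpu_alert_needed are hypothetical names rather than the author's code.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <sstream>
#include <string>

// Run a shell command and return everything it wrote to stdout (POSIX popen).
std::string run_command(const std::string& cmd) {
    std::string output;
    FILE* pipe = popen(cmd.c_str(), "r");
    if (pipe == nullptr) return output;
    char chunk[256];
    while (fgets(chunk, sizeof(chunk), pipe) != nullptr) output += chunk;
    pclose(pipe);
    return output;
}

// Parse the last whitespace-separated field of a line as a number.
// Returns false if the line is empty or the field is not numeric.
bool last_field_as_double(const std::string& line, double& value) {
    std::istringstream in(line);
    std::string field;
    std::string last;
    while (in >> field) last = field;
    if (last.empty()) return false;
    char* end = nullptr;
    value = std::strtod(last.c_str(), &end);
    return end != last.c_str() && *end == '\0';
}

// Same rule as the callback: alert when utilization (100 - idle) exceeds a
// positive threshold; a threshold of zero or below disables the check.
bool cpu_alert_needed(double idle_percent, int max_util_percent) {
    const double utilization = 100.0 - idle_percent;
    return max_util_percent > 0 && utilization > max_util_percent;
}
```

Returning a std::string avoids the fixed 80-byte buffer, and keeping the threshold rule in its own function makes it testable without running mpstat.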

Register Timer

// Register the timer
snmp_alarm_register(SEND_WARNING_TIME, /* interval in seconds; adjust as needed */
                    SA_REPEAT,         /* repeat */
                    get_cpu_idle,      /* our callback */
                    NULL               /* no callback data needed */
                    );

Configuration file netsnmp.conf

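;; netsnmp configuration file
# session settings
[session]
# IP address of the network management station
#peername = 172.29.16.104
peername = 172.29.4.181
community = public
retries = 3
timeout = 2000
sessid = 0
# Interval between warning messages (s), default 10 minutes
send_trap_time = 600

# CPU settings
[cpu]
# CPU threshold in percent (the code alerts when utilization exceeds this value)
max_cpu_idle_per = 80

# Memory settings
[memory]
# Maximum memory usage ratio (as a decimal fraction)
max_memory_used_per = 1

# Disk settings
[disk]
# Whether to record disk information (1: yes, 0: no), default 0
is_record_disk_info = 0

# OID settings (do not change lightly)
[oid]
# Enterprise OID
oid_enterprise = 1,3,6,1,4,1,2021,251,1
# OID used to send messages
oid_send_msg = 1,3,6,1,2,1,1,6,0
# Message OID
oid_msg = .1.3.6.1.6.3.1.1.4.1.105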
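
The article does not show how this file is read. One way a reader for this format could be written is sketched below in C++, assuming only what the sample shows: [section] headers, key = value pairs, and comment lines that start with # or ;. The names Config, parse_config, and lookup_value are hypothetical and do not reproduce the author's get_section_val.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// section name -> (key -> value)
using Config = std::map<std::string, std::map<std::string, std::string>>;

// Remove leading and trailing whitespace.
static std::string trim_ws(const std::string& s) {
    const std::string ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse the whole file content into a Config.
Config parse_config(const std::string& text) {
    Config cfg;
    std::istringstream in(text);
    std::string line;
    std::string section;
    while (std::getline(in, line)) {
        line = trim_ws(line);
        if (line.empty() || line[0] == '#' || line[0] == ';') continue;  // blank or comment
        if (line[0] == '[' && line[line.size() - 1] == ']') {
            section = trim_ws(line.substr(1, line.size() - 2));
            continue;
        }
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;  // not a key = value line
        cfg[section][trim_ws(line.substr(0, eq))] = trim_ws(line.substr(eq + 1));
    }
    return cfg;
}

// Returns 0 and fills `value` when the key exists, -1 otherwise.
int lookup_value(const Config& cfg, const std::string& section,
                 const std::string& key, std::string& value) {
    const Config::const_iterator s = cfg.find(section);
    if (s == cfg.end()) return -1;
    const std::map<std::string, std::string>::const_iterator k = s->second.find(key);
    if (k == s->second.end()) return -1;
    value = k->second;
    return 0;
}
```

Commented-out settings such as the first peername line are skipped because the comment check runs before the key = value split.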

Send alert message: send_msg

int send_traps(oid* oid_msg_para, size_t oid_msg_para_len, const char msg_type, const char* msg)
{
    // Parse the comma-separated enterprise OID from the configuration
    String oid_enter = oid_enterprise;
    vector<String> oid_enter_vec;
    oid_enter.split(",", oid_enter_vec);
    const size_t enterprise_len = oid_enter_vec.size();
    oid* objid_enterprise = new oid[enterprise_len];
    int i = 0;
    for (vector<String>::iterator iter = oid_enter_vec.begin(); iter != oid_enter_vec.end(); ++iter, ++i)
    {
        String num = *iter;
        objid_enterprise[i] = atoi(num.getCStr());
    }
    printf("oid_enterprise_len: %d\n", (int)enterprise_len);

    // snmpTrapOID.0
    oid objid_snmptrap[] = { 1, 3, 6, 1, 6, 3, 1, 1, 4, 1, 0 };
    // const char * msg_oid_ = ".1.3.6.1.6.3.1.1.4.1.1";

    netsnmp_ds_set_int(NETSNMP_DS_LIBRARY_ID, NETSNMP_DS_LIB_DEFAULT_PORT, SNMP_TRAP_PORT);
    netsnmp_session* sess = snmp_open(&session);
    if (NULL == sess)
    {
        snmp_sess_perror("snmptraps", &session);
        delete[] objid_enterprise;
        return SNMP_FAILED;
    }

    // The application side should decide whether a value exceeds its threshold
    // before an alert is sent; this function only sends
    netsnmp_pdu* pdu = snmp_pdu_create(SNMP_MSG_TRAP2);
    // sizeof(objid_enterprise) is the size of a pointer, so the byte count is computed explicitly
    pdu->enterprise = (oid *) malloc(enterprise_len * sizeof(oid));
    memcpy(pdu->enterprise, objid_enterprise, enterprise_len * sizeof(oid));
    pdu->enterprise_length = enterprise_len;
    delete[] objid_enterprise;

    snmp_add_var(pdu, objid_snmptrap, sizeof(objid_snmptrap) / sizeof(oid), MSG_OID, oid_msg_.c_str());
    snmp_add_var(pdu, oid_msg_para, oid_msg_para_len, msg_type, msg);

    // snmp_send() returns 0 on failure, and the caller must then free the PDU
    int sent = snmp_send(sess, pdu);
    if (sent == 0)
    {
        snmp_sess_perror("snmptraps", sess);
        snmp_free_pdu(pdu);
    }
    snmp_close(sess);
    return sent != 0 ? SNMP_SUCESS : SNMP_FAILED;
}

// Send an alert message
void send_msg(String& msg)
{
    // Parse the comma-separated message OID from the configuration
    String oid_msg_local = oid_send_msg;
    vector<String> oid_msg_vec;
    oid_msg_local.split(",", oid_msg_vec);
    size_t oid_msg_len = oid_msg_vec.size();
    oid *oid_msg = new oid[oid_msg_len];
    int i = 0;
    for (vector<String>::iterator iter = oid_msg_vec.begin(); iter != oid_msg_vec.end(); ++iter, ++i)
    {
        String num = *iter;
        oid_msg[i] = atoi(num.getCStr());
    }
    printf("oid_msg_len: %d\n", (int)oid_msg_len);

    send_traps(oid_msg, oid_msg_len, MSG_STR, (char*)msg.getCStr());
    delete[] oid_msg;  // release the buffer allocated above
}
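
Both functions above turn a comma-separated configuration string into an array of OID numbers by hand. A possible alternative, sketched below with only the C++ standard library, parses the string into a std::vector so that no manual new[] and delete[] is needed and malformed input is rejected instead of silently becoming 0. The type oid_value is an assumed stand-in for net-snmp's oid type, and parse_oid_list is a hypothetical name.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

typedef unsigned long oid_value;  // assumption: stand-in for net-snmp's `oid`

// Parse text such as "1,3,6,1,4,1,2021,251,1" into a vector of numbers.
// Returns false (and leaves `out` empty) if any part is not a plain decimal number.
bool parse_oid_list(const std::string& text, std::vector<oid_value>& out) {
    out.clear();
    std::istringstream in(text);
    std::string part;
    while (std::getline(in, part, ',')) {
        // Require a leading digit so that signs and spaces are rejected.
        if (part.empty() || !std::isdigit(static_cast<unsigned char>(part[0]))) {
            out.clear();
            return false;
        }
        char* end = nullptr;
        const unsigned long value = std::strtoul(part.c_str(), &end, 10);
        if (*end != '\0') {  // trailing garbage such as "12x"
            out.clear();
            return false;
        }
        out.push_back(value);
    }
    return !out.empty();
}
```

The resulting vector can be handed to a C API through its data() pointer and size(), and it frees its memory automatically when it goes out of scope.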

Summary

Together, IPMI and SNMP can be used to implement a server alarm system.
