Basic concepts and classification of system monitoring:
A. Overview of system monitoring:
Monitoring the overall and detailed operation of an existing IT architecture scientifically, systematically, and efficiently is a very important task for today's business operations and management departments. As the number and variety of servers and applications in the enterprise IT environment grow, operations departments need to obtain the details of every server, every system, and even every application in the architecture in as detailed, real-time, and accurate a way as possible. They then analyze, chart, and compile statistics on the raw data to establish a reference basis for subsequent performance tuning, architecture adjustments, and troubleshooting.
Common monitoring targets cover essentially every aspect of the IT operating environment, including the computer room environment, hardware, and network, each of which involves a wide variety of monitoring items. Hardware monitoring, for example, includes indicators such as server operating temperature and fan speed. System monitoring covers the basic operating system environment, such as CPU, memory, I/O, storage space usage, network throughput, and the number and status of processes. For a specific application, monitoring may involve still more content, with many indicators specific to that application.
Besides making the monitored content as comprehensive as possible, we also want the monitoring solution to be flexible and extensible. For example, it should effectively support changes to and expansion of the IT architecture, use as few resources as possible as monitoring capacity grows, and provide a powerful event notification mechanism.
This article focuses on monitoring the operating system and software environment, especially on Linux. Although plenty of commercial software and solutions implement these functions, many open source solutions can play the same role and work very well. The following sections describe how to implement these solutions in detail.
B. Basic principles and types of system monitoring on Linux:
There are basically two ways to implement system monitoring on Linux systems:
First, through the SNMP protocol combined with data acquisition software:
The architecture in this approach generally consists of two parts: the monitored server and the network management workstation. To implement it, the SNMP (Simple Network Management Protocol) daemon, snmpd, is started on the Linux server to dynamically expose the server's operating parameters for software and even hardware, which turns the server into a monitored node. The client software on the network management workstation then needs two functions: capturing SNMP data and compiling statistics. In the vast majority of cases, the monitoring software on the workstation presents web-based graphs of system status that cover various operational indicators, and new status information is dynamically updated on the web page.
The data format of this type of monitoring is standardized and comprehensive, and configuration is simple, so it is a good solution when comprehensive monitoring is the goal.
Second, by writing scripts that invoke system status monitoring commands, combined with data acquisition software:
In some cases, configuring SNMP is relatively cumbersome, and collecting SNMP information often means choosing among different monitoring software. On the other hand, the Linux operating system itself provides many very useful status tools, such as sar (which can monitor multiple indicators), iostat (dedicated to I/O usage monitoring), vmstat (for CPU and memory usage monitoring), and the free command. These tools can be called periodically through system task scheduling and custom scripts, which makes monitoring considerably more convenient. Because the commands can be embedded in scripts, they can periodically generate the data that monitoring software needs, and drawing software can then turn that data into intuitive statistical charts. The information obtained this way is more flexible and accurate, and the approach is handy for users who are familiar with scripting.
Command-based monitoring can be extended without restriction, and users can obtain monitoring scripts through a variety of channels and customize them.
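As a minimal sketch of this script-based approach, the following shell fragment computes a used-memory percentage from /proc/meminfo and prints one timestamped CSV line that a drawing tool could plot later. The function names mem_used_pct and sample_line are hypothetical, and treating MemTotal minus MemFree as "used" is a simplifying assumption (it counts buffers and cache as used).

```shell
#!/bin/sh
# Hypothetical helper: read /proc/meminfo-style text on stdin and print the
# used-memory percentage. Assumes MemTotal and MemFree lines with kB values.
mem_used_pct() {
    awk '/^MemTotal:/ {total=$2} /^MemFree:/ {free=$2}
         END { if (total > 0) printf "%d\n", (total - free) * 100 / total }'
}

# Hypothetical helper: format one CSV sample.
# $1 = timestamp, $2 = used-memory percentage
sample_line() {
    printf '%s,%s\n' "$1" "$2"
}

# On a Linux host, take one sample from the live file.
# A cron entry (hypothetical path) could run this every five minutes:
#   */5 * * * * /usr/local/bin/memsample.sh >> /var/log/memsample.csv
if [ -r /proc/meminfo ]; then
    sample_line "$(date +%Y-%m-%dT%H:%M:%S)" "$(mem_used_pct < /proc/meminfo)"
fi
```

The same pattern applies to the output of sar, iostat, or vmstat: one small parsing function per indicator, with the scheduler supplying the periodic calls.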
In summary, each of these two monitoring approaches has its advantages. We will therefore provide examples and procedures for each of them, explained and demonstrated separately, from easy to difficult.
Methods of deployment and implementation of various system monitoring tools in the enterprise:
A. Configuration and testing of the SNMP protocol on Linux and Windows:
We begin with the first method: monitoring system operation through the SNMP protocol and data acquisition software. In a great many cases, enterprises choose SNMP to obtain information about running servers, because SNMP is an important industry standard for monitoring.
So let's take some time to introduce the basic concepts of SNMP and how it works.
Simple Network Management Protocol (SNMP) is a network protocol widely used to monitor network equipment (computers, routers) and even other devices (such as UPS units). It is also a standard designed for managing network nodes in an IP network (including servers, workstations, routers, switches, and hubs), and it is an application-layer protocol. SNMP enables network administrators to manage network performance, identify and resolve network problems, and plan for network growth. By receiving unsolicited messages (and event reports) via SNMP, the network management system learns of problems that occur on the network.
An SNMP-managed network has three main components: managed devices, agents, and network management systems (NMS), also called network management stations.
A managed device is a network node that contains an SNMP agent and resides on the managed network. Sometimes called a network element, it collects and stores management information, which is made available to the NMS through SNMP. Managed devices may be routers, access servers, switches and bridges, hubs, hosts, printers, and so on.
An SNMP agent is a network management software module on a managed device. The agent has local knowledge of management information and translates it into an SNMP-compatible format.
An NMS runs applications that monitor managed devices. In addition, the NMS provides the bulk of the processing and storage resources required for network management. Any managed network requires at least one NMS.
Currently there are three versions of SNMP: SNMPv1, SNMPv2, and SNMPv3. Versions 1 and 2 do not differ much, but SNMPv2 is an enhanced version that adds further protocol operations. The first two versions mainly rely on a community name (community) to authenticate the network management station's access to the agent. Compared with them, SNMPv3 adds more security mechanisms and remote configuration capabilities: the network management station authenticates to the agent with a user name and a passphrase protected by a keyed hash (MD5 or SHA), and traffic can additionally be encrypted (DES or AES). To address incompatibilities between the versions, RFC 3584 defines coexistence strategies for the three of them.
In addition, the SNMP protocol includes four basic operations:
Get:
The network management system performs a Get operation when it needs to obtain information from a monitored device.
GetNext:
When the item to be obtained is one of several items in a list, the network management system performs GetNext repeatedly to retrieve each following item in turn.
Set:
The network management system uses the Set operation to change the value of a managed item.
Trap:
When a managed device needs to notify the network management system of some event, it sends a Trap.
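To make the four operations concrete, here is a hedged sketch that maps each one to the corresponding net-snmp command-line tool. It runs in dry-run mode by default, so the commands are only printed. The wrapper names (run, op_get, and so on) are hypothetical, the addresses are the ones used in this article, and the read-write community "private" is an assumption, since Set needs write access that a read-only community does not grant.

```shell
#!/bin/sh
# Dry-run sketch: by default the commands are only printed, not executed.
AGENT=${AGENT:-192.168.1.10}       # monitored host used in this article
MANAGER=${MANAGER:-192.168.1.100}  # management station used in this article
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Get: read one object (system uptime).
op_get()     { run snmpget -v 1 -c public "$AGENT" SNMPv2-MIB::sysUpTime.0; }
# GetNext: read the object that follows sysDescr in the MIB tree.
op_getnext() { run snmpgetnext -v 1 -c public "$AGENT" SNMPv2-MIB::sysDescr; }
# Set: change a writable object; "private" is an assumed read-write community.
op_set()     { run snmpset -v 1 -c private "$AGENT" SNMPv2-MIB::sysContact.0 s admin@example.com; }
# Trap: send a generic coldStart (0) trap to the manager; the empty
# arguments ask snmptrap to fill in its defaults.
op_trap()    { run snmptrap -v 1 -c public "$MANAGER" '' '' 0 0 ''; }

op_get
op_getnext
op_set
op_trap
```

Setting DRY_RUN=0 would execute the commands for real, which only makes sense once an agent is configured and reachable.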
Finally, note that all SNMP-related packages are provided in Red Hat Enterprise Linux. These packages include the MIB files, which are stored in /usr/share/snmp/mibs. A MIB (Management Information Base) is a hierarchical database describing a device. Each value for the device is identified by a unique object identifier (OID), which can be written as a name, with a prefix, or as a number.
If the net-snmp-utils package is installed, the snmptranslate command can display the MIB and OID information in SNMP, including the entire MIB tree and its OIDs.
snmptranslate is primarily used to convert OIDs between their textual names and numeric IDs, and to list the SNMP MIB tree structure.
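A few typical snmptranslate calls are sketched below. The helper have_tool is hypothetical, and the block assumes the standard SNMPv2-MIB files are installed; it only prints a hint if the tool is missing.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named tool is on the PATH.
have_tool() { command -v "$1" >/dev/null 2>&1; }

if have_tool snmptranslate; then
    # Textual name -> numeric OID
    snmptranslate -On SNMPv2-MIB::sysDescr.0
    # Numeric OID -> textual name
    snmptranslate .1.3.6.1.2.1.1.1.0
    # Print the subtree below "system" as a tree diagram
    snmptranslate -Tp -IR system
else
    echo "snmptranslate not found; install net-snmp-utils first" >&2
fi
```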
We have spent a good deal of space introducing the basic principles and components of SNMP. Now we will demonstrate how to configure and run the SNMP service on Red Hat Enterprise Linux 5 Update 8 (RHEL 5u8), the latest enterprise release at the time of writing.
RHEL 5u8 provides an RPM package called net-snmp, a suite of programs that implement SNMP v1, v2c, and v3 over both IPv4 and IPv6.
Because most enterprise environments run a stable release of Red Hat Enterprise Linux, all subsequent operations also use RHEL. Users who simply want to try out the technology can use Fedora or another Linux distribution to perform the same steps.
In this example, the server 192.168.1.10 is the monitored system, on which we will configure and enable the SNMP service using v1 and v3 in turn. Another host, 192.168.1.100, will act as the management station and use SNMP commands to obtain detailed information about the monitored system.
The basic information of the server 192.168.1.10 is shown in the figure.
First, configure SNMP v1:
Mount the installation DVD and install the SNMP-related packages from it: lm_sensors, net-snmp, and net-snmp-utils. The role of the net-snmp package has just been introduced; net-snmp-utils provides a set of tools for managing the network over SNMP.
After installing the required packages, we can enable SNMPv1 directly by modifying the main SNMP configuration file, /etc/snmp/snmpd.conf, and restarting the service. The changes made are shown in the figure.
A key characteristic of SNMPv1 is that network management devices authenticate to the agent with a community string. Here the community is the default, public; you can of course change it to any string you need. When you are finished, save the file and run the following commands to restart the service:
# service snmpd restart [Enter]
# chkconfig snmpd on [Enter]
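As a rough illustration of what a community-based setup can look like, the sketch below writes a one-directive configuration to a scratch file. This is an assumption for illustration, not necessarily the change made in this article: it uses the rocommunity directive of snmpd.conf with the community and management-station address from the example, and the function name and output path are hypothetical.

```shell
#!/bin/sh
# Writes an example configuration to a scratch file (hypothetical path),
# so nothing under /etc/snmp is touched.
CONF=${CONF:-${TMPDIR:-/tmp}/snmpd.conf.example}

# $1 = output file
write_v1_conf() {
    cat > "$1" <<'EOF'
# Read-only access for community "public", only from the management station
rocommunity public 192.168.1.100
EOF
}

write_v1_conf "$CONF"
cat "$CONF"
```

Restricting the community to the management station's address limits who can query the agent, which matters because the community string itself travels in clear text.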
To check that the OID values of each MIB on the system can be obtained correctly, run the snmpwalk command and examine the response (Screenshot07.png). snmpwalk automatically retrieves the management information in the MIB tree through SNMP GetNext operations. In this example, running the following command returns all the MIB and OID information:
# snmpwalk -v 1 -c public 192.168.1.10 [Enter]
At this point, SNMP on the monitored host is fully configured. To illustrate the results, I looked for monitoring software on Windows that uses the SNMP protocol. Many programs on the Windows platform provide this feature, such as WhatsUp and SolarWinds. Taking WhatsUp as an example, my monitoring host runs Windows Server 2003 Enterprise Edition with the IP address 192.168.1.100. Install WhatsUp by following the steps in the figure; it is as simple as any Windows installation: just press Enter all the way through.
Since I am installing the 30-day free trial version, I need to select "Activate later" when launching the product.
Then select "IP Range Scan" in "Device Discovery Method".
Next, fill in the starting address with the address of the monitored device, 192.168.1.10.
Enter the community name "public", matching the contents of /etc/snmp/snmpd.conf, confirm what to scan, and start scanning. The scan time depends on the number of devices.
In the action policy selection, select "Do not apply an action policy" and end the scan.
Finally, select "Device Reports" on the "Report View" tab to see the health status of all devices.
Among the many system monitoring programs, WhatsUp is relatively powerful, convenient to set up, and has a friendly interface. It is a good choice for service monitoring in many enterprises, and it offers many other view modes and functions as well. Other similar software such as SolarWinds is configured in basically the same way, so we will not dwell on the details.
Having used SNMP v1, we now describe how to configure and use SNMP v3 to achieve the same effect. Unlike v1, the most important feature of v3 is stronger security. SNMP v1 is somewhat lacking in security because its community string is transmitted over the network in clear text. SNMP v3 therefore no longer authenticates with a clear-text community string; instead it authenticates each user with a keyed hash (MD5 or SHA) of a passphrase and can encrypt traffic with DES or AES. Security is naturally much better than with v1, but configuration is also noticeably more cumbersome. Fortunately, the net-snmp suite provides another powerful SNMP configuration tool, net-snmp-config, so ordinary users can still set up SNMP v3 with ease. The configuration method is as follows:
Switch to the installation disc first. Because the net-snmp-config tool is provided by the net-snmp-devel package, install its dependencies, including beecrypt, elfutils-devel, and elfutils-devel-static, and then install net-snmp-devel itself. Then stop the snmpd service, back up its main configuration file, and run the command:
# net-snmp-config --create-snmpv3-user -A 12345678 -X 12345678 -a MD5 -x DES admin [Enter]
The parameters used for this command are described below:
--create-snmpv3-user [-A authpass] [-X privpass] [-a MD5|SHA] [-x DES|AES] [username]
After the command runs, a new and very simple snmpd.conf is created automatically. It contains only the user name and access permissions; the authentication details are stored in /var/net-snmp/snmpd.conf.
Finally, restart the snmpd service and run snmpwalk again to confirm that the OID information in the MIB can be obtained with v3 authentication.
The command is:
# snmpwalk -v 3 -u admin -l authPriv -a MD5 -x DES -A 12345678 -X 12345678 192.168.1.10 [Enter]
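The -l option selects one of the three SNMPv3 security levels: noAuthNoPriv, authNoPriv, or authPriv. The sketch below shows how the level follows from the passphrases that are available and prints (without running) a matching snmpwalk command line. The helper names sec_level and v3_walk_cmd are hypothetical, MD5 and DES are simply the protocols chosen in this example, and the passphrases are assumed to contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: choose the SNMPv3 security level.
# $1 = auth passphrase (may be empty), $2 = privacy passphrase (may be empty)
sec_level() {
    if [ -n "$1" ] && [ -n "$2" ]; then
        echo authPriv        # authenticated and encrypted
    elif [ -n "$1" ]; then
        echo authNoPriv      # authenticated, not encrypted
    else
        echo noAuthNoPriv    # neither; privacy is not possible without auth
    fi
}

# Hypothetical helper: print (not run) a matching snmpwalk command line.
# $1 = user, $2 = auth passphrase, $3 = privacy passphrase, $4 = host
v3_walk_cmd() {
    level=$(sec_level "$2" "$3")
    cmd="snmpwalk -v 3 -u $1 -l $level"
    if [ "$level" != noAuthNoPriv ]; then cmd="$cmd -a MD5 -A $2"; fi
    if [ "$level" = authPriv ]; then cmd="$cmd -x DES -X $3"; fi
    echo "$cmd $4"
}

v3_walk_cmd admin 12345678 12345678 192.168.1.10
v3_walk_cmd admin 12345678 "" 192.168.1.10
```

Printing the command instead of running it keeps the sketch safe to try on a machine with no agent configured.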
To verify that the configuration works, you can also monitor the host through WhatsUp on Windows. The steps are basically the same as in the previous example, except that you change the SNMP version and fill in the appropriate authentication information, so they are not repeated here.